+1 Penghui
On Fri, Apr 3, 2026 at 10:38 AM Matteo Merli <[email protected]> wrote: > https://github.com/apache/pulsar/pull/25462 > > > # Background knowledge > > [PIP-89](https://github.com/apache/pulsar/blob/master/pip/pip-89.md) > introduced the concept of structured event > logging for Pulsar and was approved in 2021. It proposed replacing SLF4J > string-interpolated logging with structured > key-value logging, using a custom wrapper (`structured-event-log` module) > built on top of SLF4J's MDC. > > Since then, a standalone library called [slog]( > https://github.com/merlimat/slog) has been developed that provides a > more complete and performant implementation of structured logging. slog > builds on the same principles as PIP-89 but > with a significantly improved API, better performance characteristics, and > support for all standard log levels. > > ### Structured logging > > Traditional logging uses string interpolation to embed values into log > messages: > > ```java > log.info("[{}] Opening ledger {} with {} entries", name, ledgerId, > entries); > ``` > > Structured logging instead attaches context as typed key-value attributes: > > ```java > log.info().attr("ledgerId", ledgerId).attr("entries", > entries).log("Opening > ledger"); > ``` > > This makes log records machine-parseable, filterable, and queryable without > complex regex patterns. > > ### slog API > > slog provides a `Logger` interface with the following key features: > > * **All standard levels**: `trace()`, `debug()`, `info()`, `warn()`, > `error()` — each with four variants: > simple message, printf-formatted, fluent builder, and > lambda/consumer-based. > * **Fluent Event builder**: Chain `.attr(key, value)` calls to attach > structured attributes, with primitive overloads > (`long`, `int`, `double`, `float`, `boolean`) to avoid autoboxing. > * **Derived loggers with context**: `logger.with().attr("topic", > topic).build()` creates a child logger where > context attributes are automatically included in every log call. > * **Context composition**: `logger.with().ctx(otherLogger).build()` > inherits context from another logger, enabling > cross-component context propagation without coupling. > * **Zero overhead when disabled**: Level guards are checked before > constructing any objects. The fluent builders > return a no-op singleton `Event` when the level is disabled, making all > chained calls effectively free. > * **Timed events**: `log.info().timed()` automatically records elapsed > duration between the call and `.log()`. > * **Lazy evaluation**: `log.debug(e -> e.attr(...).log(...))` — the lambda > only executes when the level is enabled. > * **Exception support**: `.exception(t)` for full stack traces, > `.exceptionMessage(t)` for message-only. > * **Lombok integration**: `@CustomLog` annotation generates the static > logger field automatically. > > slog auto-discovers the backend at runtime: Log4j2 (preferred) or SLF4J > 1.x/2.x as fallback. > > # Motivation > > While PIP-89 was approved and the `structured-event-log` module was added > to the Pulsar codebase, adoption has been > very limited. Almost all Pulsar components still use traditional SLF4J > logging with string interpolation. The > `structured-event-log` module's API is verbose and lacks support for > standard log levels like `debug` and `trace`, > which has hindered adoption. > > Meanwhile, the slog library has evolved to address these shortcomings with > a cleaner, more complete API. This PIP > proposes to adopt slog as the standard logging approach across all Pulsar > components, replacing both direct SLF4J > usage and the internal `structured-event-log` module. > > ### Problems with the current state > > **Inconsistent log formatting**: Every developer formats log messages > differently. The same information appears in > different positions, with different delimiters, and different key naming: > ```java > // Various inconsistent patterns found across the codebase: > log.info("[{}] Opening managed ledger {}", name, ledgerId); > log.info("[{}] [{}] Cursor {} recovered to position {}", ledgerName, > cursorName, cursor, position); > log.warn("Error opening ledger {} for cursor {}: {}", ledgerId, cursorName, > errorMsg); > ``` > > **Repeated boilerplate context**: The managed ledger name, cursor name, > topic name, and other context identifiers > are repeated as positional arguments in nearly every log statement — often > as the first one or two `{}` placeholders. > This is verbose, error-prone, and frequently inconsistent. > > **isDebugEnabled guards**: Debug-level logging requires explicit level > guards to avoid the cost of string > construction and argument evaluation: > ```java > if (log.isDebugEnabled()) { > log.debug("[{}] Reading entry ledger {}: {}", name, > position.getLedgerId(), position.getEntryId()); > } > ``` > This adds significant boilerplate — in `ManagedLedgerImpl` alone, there > were 58 such guards. > > **Poor queryability**: Log analysis tools (Splunk, Elasticsearch, Loki) > cannot reliably extract values embedded in > interpolated strings. Regex-based parsing is fragile and varies by message > format. > > **Missing context in error scenarios**: When debugging production issues, > engineers frequently find that critical > context was omitted from log messages because adding another positional > argument was cumbersome. > > ### Why slog over `structured-event-log` > > * **Complete level support**: slog supports all five standard levels > (`trace`, `debug`, `info`, `warn`, `error`), > while `structured-event-log` only supports `info`, `warn`, and `error`. > * **Simpler API**: slog's `log.info().attr("k", v).log("msg")` is more > natural than > `log.newRootEvent().resource("k", v).log("msg")`. > * **Better performance**: slog uses a cached generation-counter scheme for > level checks with a single integer > comparison and no volatile fence. Primitive `attr()` overloads avoid > autoboxing. > * **Derived loggers**: `logger.with().attr(...)` creates an immutable child > logger with persistent context — cleaner > than creating `EventResources` objects. > * **Lazy evaluation**: Lambda-based logging (`log.debug(e -> ...)`) > eliminates `isDebugEnabled` guards entirely. > * **Lombok integration**: `@CustomLog` works with slog, reducing logger > declaration boilerplate. > * **Standalone library**: slog is maintained as an independent library, > decoupled from Pulsar's release cycle. > Being a separate library makes it easier to adopt across related projects > (e.g., BookKeeper, Oxia Java client > which already uses slog), enabling consistent structured logging and > cross-project context propagation. > > # Goals > > ## In Scope > > * **Adopt slog** as the standard structured logging library for all Pulsar > components. > * **Remove the internal `structured-event-log` module** from the Pulsar > codebase, since slog supersedes it. > * **Migrate all components** from SLF4J to slog, one module at a time. > Components can be migrated independently > and there is no concern with a coexistence period where some components > use SLF4J and others use slog. > * **Establish logging conventions**: standardized attribute names > (`managedLedger`, `cursor`, `ledgerId`, `topic`, > `position`, etc.) to ensure consistency across the codebase. > * **Use derived loggers for context propagation**: Classes that carry > identity (e.g., a managed ledger's name, a > cursor's name, a topic) should create derived loggers that automatically > attach this context to every log call. > > * **Cross-boundary context propagation with BookKeeper**: Since Pulsar > embeds BookKeeper, slog's context > composition (`ctx()`) enables sharing context across the Pulsar → > BookKeeper boundary. For example, a BookKeeper > write failure event log would automatically carry the Pulsar topic name > as an attribute, without BookKeeper needing > to know about Pulsar concepts. This dramatically improves debuggability > of storage-layer issues. > > ## Out of Scope > > * Defining centralized event enums (as discussed in PIP-89). slog uses > string messages, and adopting an enum-based > log event model would be highly impractical given the scale of the > codebase. > * Modifying the log output format or Log4j2 configuration. slog works > transparently with existing logging backends > and configurations. > * Introducing distributed tracing or trace ID propagation. This is > orthogonal and can be layered on top later. > > ### Pulsar client libraries > > The `pulsar-client` module is a special case because its logs are emitted > within users' applications rather than > on the server side. slog auto-discovers the logging backend at runtime > (Log4j2 or SLF4J), so it will integrate > transparently with whatever logging framework the user's application > already has configured. No action is required > by users — the client will simply emit structured log records through the > existing backend, and users who configure > a JSON appender will automatically benefit from the structured attributes. > > # High Level Design > > The migration follows a straightforward, incremental approach: > > 1. **Add slog as a dependency** in the Pulsar build system (Gradle version > catalog). > 2. **Configure Lombok** with `@CustomLog` support for slog. > 3. **Migrate modules one at a time**, starting with `managed-ledger` as the > first module (already done as a > proof of concept). Each module migration involves: > - Replacing SLF4J imports and logger declarations with slog. > - Converting log statements from parameterized format to structured > format. > - Introducing derived loggers for repeated context (e.g., ledger name, > cursor name). > - Removing `isDebugEnabled()` / `isTraceEnabled()` guards. > 4. **Delete the `structured-event-log` module** — it has no existing usages > and can be removed immediately. > > Since slog auto-discovers the logging backend (Log4j2 or SLF4J), no changes > to logging configuration files are > needed. The structured attributes are emitted via the backend's native > mechanism (Log4j2 key-value pairs or SLF4J > MDC), so existing log appenders and formats continue to work. Users who > want to leverage structured attributes in > their log output can configure a JSON appender. > > # Detailed Design > > ## Design & Implementation Details > > ### Logger declaration > > For simple classes, use Lombok's `@CustomLog` annotation: > > ```java > @CustomLog > public class OpAddEntry implements AddCallback { > // Lombok generates: private static final Logger log = > Logger.get(OpAddEntry.class); > > public void initiate() { > log.info().attr("ledgerId", ledger.getId()).log("Initiating add > entry"); > } > } > ``` > > For classes with persistent identity context (e.g., a managed ledger, a > cursor, a topic handler), create a > derived instance logger: > > ```java > public class ManagedLedgerImpl implements ManagedLedger { > private static final Logger slog = Logger.get(ManagedLedgerImpl.class); > protected final Logger log; > > public ManagedLedgerImpl(..., String name, ...) { > this.log = slog.with().attr("managedLedger", name).build(); > } > > // All subsequent log calls automatically include managedLedger=<name> > public void openLedger(long ledgerId) { > log.info().attr("ledgerId", ledgerId).log("Opening ledger"); > // Emits: msg="Opening ledger" managedLedger=my-topic > ledgerId=12345 > } > } > ``` > > ### Context composition across components > > Classes that operate on behalf of another component can inherit its > context: > > ```java > public class RangeEntryCacheImpl implements EntryCache { > private static final Logger slog = > Logger.get(RangeEntryCacheImpl.class); > private final Logger log; > > RangeEntryCacheImpl(ManagedLedgerImpl ml, ...) { > this.log = slog.with().ctx(ml.getLogger()).build(); > // Logger name is RangeEntryCacheImpl, but carries managedLedger > context from ml > } > } > ``` > > ### Cross-boundary context: Pulsar → BookKeeper > > One of the most powerful benefits of slog's context model is the ability to > propagate context across component > boundaries. Since Pulsar embeds BookKeeper, a BookKeeper ledger operation > can inherit context from the Pulsar > component that initiated it: > > ```java > // In Pulsar's ManagedLedgerImpl (already has managedLedger context) > public void createLedger() { > // Pass the Pulsar logger context to BookKeeper > bookKeeper.asyncCreateLedger(..., log, ...); > } > > // In BookKeeper's LedgerHandle > public class LedgerHandle { > private final Logger log; > > LedgerHandle(Logger pulsarLogger, ...) { > // BookKeeper's logger inherits Pulsar's context > this.log = slog.with().ctx(pulsarLogger).attr("ledgerId", > ledgerId).build(); > } > > void handleWriteError(int rc) { > // This log entry automatically includes managedLedger=my-topic > from Pulsar's context > log.error().attr("rc", BKException.getMessage(rc)).log("Write > failed"); > // Emits: msg="Write failed" managedLedger=my-topic ledgerId=12345 > rc=NotEnoughBookies > } > } > ``` > > This eliminates the long-standing problem of BookKeeper errors being logged > without any Pulsar-level context, > making it very difficult to correlate storage failures with the affected > topics. > > ### Conversion patterns > > | SLF4J pattern | slog equivalent | > |---|---| > | `log.info("msg")` | `log.info("msg")` | > | `log.info("[{}] msg {}", name, val)` | `log.info().attr("key", > val).log("msg")` | > | `log.error("msg", exception)` | `log.error().exception(e).log("msg")` | > | `log.warn("msg: {}", e.getMessage())` | > `log.warn().exceptionMessage(e).log("msg")` | > | `if (log.isDebugEnabled()) { log.debug(...) }` | > `log.debug().attr(...).log(...)` | > | Manual timing with `System.nanoTime()` | `log.info().timed() ... > event.log("done")` | > > ### Attribute naming conventions > > To ensure consistency across the codebase, the following attribute names > should be used: > > | Attribute | Description | Example value | > |---|---|---| > | `managedLedger` | Managed ledger name | `my-tenant/my-ns/my-topic` | > | `cursor` | Cursor name | `my-subscription` | > | `topic` | Topic name | `persistent://tenant/ns/topic` | > | `ledgerId` | BookKeeper ledger ID | `12345` | > | `entryId` | Entry ID within a ledger | `42` | > | `position` | Ledger position (ledgerId:entryId) | `12345:42` | > | `readPosition` | Current read position | `12345:42` | > | `markDeletePosition` | Mark-delete position | `12345:41` | > | `state` | Component state | `Open`, `Closed` | > | `rc` | BookKeeper return code | `OK`, `NoSuchLedger` | > | `subscription` | Subscription name | `my-sub` | > | `consumer` | Consumer name | `consumer-0` | > | `producer` | Producer name | `producer-0` | > > ## Public-facing Changes > > ### Configuration > > A new Lombok configuration is added to `lombok.config`: > > ```properties > lombok.log.custom.declaration = io.github.merlimat.slog.Logger > io.github.merlimat.slog.Logger.get(TYPE) > ``` > > ### Log output > > The log message format changes. With traditional SLF4J: > ``` > [ManagedLedgerImpl] [my-topic] Opening ledger 12345 with 100 entries > ``` > > With slog (pattern appender): > ``` > [ManagedLedgerImpl] Opening ledger {managedLedger=my-topic, ledgerId=12345, > entries=100} > ``` > > With slog (Log4j2 JSON appender): > ```json > { > "instant": {"epochSecond": 1775174210, "nanoOfSecond": 277000000}, > "thread": "pulsar-ordered-executor-0-0", > "level": "INFO", > "loggerName": "org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl", > "message": "Opening ledger", > "contextMap": { > "managedLedger": "my-topic", > "ledgerId": "12345", > "entries": "100" > }, > "endOfBatch": false, > "loggerFqcn": "io.github.merlimat.slog.impl.EventImpl", > "threadId": 42, > "threadPriority": 5 > } > ``` > > Users who rely on parsing specific log message patterns will need to update > their parsers. However, the structured > format makes parsing significantly easier and more reliable. > > # Monitoring > > No changes to monitoring. Structured logging improves the ability to create > log-based alerts and dashboards by > making attribute-based filtering possible (e.g., alert on all `ERROR` logs > where `managedLedger=X`). > > # Security Considerations > > No security implications. This is an internal change to how log messages > are formatted. No new endpoints, commands, > or permissions are introduced. > > # Backward & Forward Compatibility > > ## Upgrade > > * Log message formats will change as modules are migrated. Users who parse > log messages with regex patterns will > need to update their parsers. The structured format is strictly more > parseable than the previous format. > * To get the most out of structured logging, users should consider > switching to a JSON log appender if they haven't > already. However, this is optional — traditional pattern-based appenders > continue to work. > * No configuration changes are required for the upgrade. > > ## Downgrade / Rollback > > No special steps required. Reverting to a previous version restores the > original SLF4J logging. > > ## Pulsar Geo-Replication Upgrade & Downgrade/Rollback Considerations > > No impact on geo-replication. Logging is a local concern and does not > affect inter-cluster communication. > > # Alternatives > > ### Continue using `structured-event-log` > > The existing internal module could be extended. However, it lacks > debug/trace levels, has a more verbose API, and > maintaining a custom logging library inside Pulsar adds unnecessary burden. > slog is a superset of the > `structured-event-log` API with better performance and more features. > > ### Use SLF4J 2.0 Fluent API > > SLF4J 2.0 introduced a fluent logging API (`log.atInfo().addKeyValue("k", > v).log("msg")`). However: > * It does not support derived loggers with persistent context. > * It does not support context composition across components. > * No support for timed events or sampling. > * Attribute values are stored as strings only (no primitive overloads). > * Level guards still allocate objects in some implementations. > > ### Use Log4j2 directly > > Log4j2 does not expose a public structured logging API for applications. > While it is possible to pass structured > key-value data through low-level, non-public Log4j2 interfaces (which is > what slog does internally as its > backend), this is not something easily or reliably usable directly from > application code. It would also tightly > couple Pulsar to Log4j2 as the only backend, preventing users from using > other SLF4J-compatible backends. > slog abstracts over these internals and supports both Log4j2 and SLF4J > backends transparently. > > ### Use Flogger > > [Flogger](https://github.com/google/flogger) is Google's fluent logging > API > for Java. While it offers a fluent API > and some structured logging features, it has significant performance > drawbacks: > * Extremely slow on the enabled path (~0.7 ops/μs vs slog's 13–25 ops/μs — > roughly **18–35× slower**). > * Very high allocation rate (1624–1904 B/op vs slog's 0–80 B/op — **20–24× > more allocations**). > * No support for derived loggers with persistent context or context > composition. > * No support for timed events. > * Tightly coupled to Google's internal infrastructure; limited backend > flexibility. > > ### Performance comparison > > Benchmarks from the [slog repository](https://github.com/merlimat/slog) > (JMH, higher ops/μs is better, lower B/op > is better): > > **Disabled path** (log level is off — measures the cost of checking the > level): > > | Library | ops/μs | > |----------------------|---------| > | slog | 1005.4 | > | Log4j2 | 413.1 | > | SLF4J | 367.5 | > | Flogger | 360.3 | > > slog's disabled path is **2.4× faster** than Log4j2 and **2.8× faster** > than SLF4J/Flogger. This matters because > the vast majority of debug/trace log statements are disabled in production. > > **Enabled path** (log level is on — measures the cost of formatting and > emitting a log message with 2 key-value > attributes): > > | Library | ops/μs | > |----------------------|--------| > | slog (simple) | 24.6 | > | Log4j2 (simple) | 14.6 | > | slog (fluent) | 13.2 | > | slog (fluent+ctx) | 12.0 | > | SLF4J (simple) | 11.2 | > | SLF4J (fluent) | 4.2 | > | Flogger | 0.7 | > > slog is the fastest in both simple and fluent modes. Flogger is **18–35× > slower** than slog. > > **Allocation rate** (bytes allocated per log operation): > > | Library | B/op | > |----------------------|------| > | slog (simple) | 0 | > | Log4j2 (simple) | 24 | > | SLF4J (simple) | 24 | > | slog (fluent+ctx) | 40 | > | slog (fluent) | 80 | > | SLF4J (fluent) | 1104 | > | Flogger | 1624 | > > slog's fluent API allocates **6× less** than Log4j2's fluent API, **14× > less** than SLF4J's fluent API, and > **20× less** than Flogger. Since structured logging requires the fluent > API, this is the most relevant comparison > for this PIP's use case. > > # General Notes > > The `managed-ledger` module has already been fully migrated as a proof of > concept, converting all source and test files > (54 files, ~400 log statements). The migration reduced total lines of code > (1481 additions vs 1392 deletions) while > making every log statement more informative and consistent. > > # Links > > * Original PIP-89: > https://github.com/apache/pulsar/blob/master/pip/pip-89.md > * slog library: https://github.com/merlimat/slog > * PoC: managed-ledger migration: > https://github.com/merlimat/pulsar/pull/20 > > > -- > Matteo Merli > <[email protected]> >
