Re: PIP-89: Structured documented logging for Pulsar

Ivan Kelly Fri, 06 Aug 2021 09:16:14 -0700

Thanks for the detailed feedback Matteo,

Responses inline.


>  1. I like the idea of using enum to identify an event, but I fear
> that the enum name itself will not be, in many cases, as expressive of
> the context as a complete phrase.

I'm not convinced by this. From just eyeballing one of our production
cluster logs, when you remove what should be attributes and what could
be considered stop words, you end up with 3-4 words which could easily
be an enum.
There's also nothing to stop you adding more info in an attribute if
you so wish. It'll all end up in the log.
```
....newEvent()
    .attr("reason", "I want to say something more")
    .info(Events.LOOKUP)
```
I can add an override to add String as an override for the event name,
so that people can put something longer. I expect this will lead to
laziness though, as new events will not end up documented. However, as
I said earlier in the thread, bad logging doesn't take away from the
good logging.

>    For example, in the proposed scenario it's used "Events.LOOKUP".
> There is a lot of context that would be missing there. Eg: is this an
> HTTP or Protobuf lookup, did we log when the operation started or when
> it was completed, etc...
To avoid a proliferation of redundant logs, we should log at the end
unless it's a very long running operation. For timed events, we should
include the start timestamp as well as duration.
W.r.t. http vs binary, you have two options:
a) put it as a attributed .attr("protocol", "http"),
b) or have separate events LOOKUP_HTTP & LOOKUP_BINARY.

>  2. We should make it easy to automatically generate documentation in
> HTML for the event. For this, using annotation might be a better
> option than Javadoc.
Can you give an example of how you would use the annotation?
I like javadoc, because you can write long form free text. With
annotations it's a java strings. Until java has multiline strings (I
saw some talk of it, but it's not in 8), this becomes cumbersome.
Perhaps there can be an annotation so that all "Events" enums get
picked up to generate specialized javadoc.

>  3. We should have an easy way to create partial loggers with
> pre-filled context. This context will be used for all the events
> created through that logger and at the same time we could add new
> context specific to the particular event.
>      For example, a `ServerCnx` object should create a new Logger when
> it gets created, with context = {localAddress="xxx",
> remoteAddress="yyy"}. This context would be carried by every event
> that happens from the `ServerCnx`. The same would be super-useful for
> many other classes, where we want to be consistent in having the
> precise context in every event.
Ya, I was thinking along the same lines for ServerCnx. It's almost as
if, for each object like this we should have a set of persistent
attributes. That can be attached to the event, but also get attached
to each child event. Perhaps we could label this as "resource" rather
than "attribute" as is done in the OTel model:
https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/logs/data-model.md#field-resource

>  4. Thread Local context --  It might be difficult to have meaningful
> TLC between Pulsar & BK code when we do have most of the code being
> asynchronous with the results coming out of a callback that is
> triggered in a separate thread.
The TLS stuff is just to avoid having to modify the API to pass down
context. I had a quick attempt at implementing it, and I think rather
than automatically putting the context on TLS, we should have an
explicit "stash/unstash" call.

>  5. We should borrow some of the good parts of the APIs in SLF4J 2.0.
> For example, given that it now supports Java 8, we should be also
> taking `Supplier<Object>` for the arguments so that we can use lambdas
> to avoid object creations when the logger level is disabled.
>     Similarly, instead of doing :
>     e.attr("result", lookupResult)
>       .info(Events.LOOKUP);
>
>    we could change the order into:
>     e.atInfo(Events.LOOKUP)
>       .attr("result", lookupResult)
>       .log();
>    This would avoid doing any extra work if the level info is disabled.
I had been thinking along the same lines since I sent the PIP. This
would be especially useful for sampled events where you really do not
want to do anything if you're not going to log. It does make one
pattern harder though. If you defer the event name to the end, you can
accumulate attributes before knowing what event you log with. For
example, for LOOKUP, if an error occurs you log with LOOKUP_ERROR
rather than LOOKUP. Maybe we need an explicit attribute object for
collecting the attributes, rather than always having it as a fluent
api.

> 6. Even if we should make structured argument the preferred way to log
> event, I'd be wary of not permitting to have:
>     a) Custom string messages
I can add this. I think it will lead to laziness, but that's just my opinion.

>     b) String templating for cases in which adding to context map is
> not the best choice.
If they have a), they can always use String.format(). I really want to
encourage using the attributes rather than snowflake string messages.
The latter makes searching the logs so much harder (you're reduced to
regexes rather than proper jq/log aggregator queries).

-Ivan

Re: PIP-89: Structured documented logging for Pulsar

Reply via email to