I think endOfBatch flag is still falling short of addressing the main
issue: the appender is oblivious to the batching. Put another way, it
cannot have an assumption whether endOfBatch is employed or not. Consider
appender A honoring endOfBatch flag, though the preceding filter F never
takes advantage of it, hence A will keep on piling events until an OOM
exception. This is my main motivation for introducing batching explicitly
at the appender interface. For instance,
interface Appender {
default void append(LogEvent logEvent) {
append(Batch.of(logEvent)); } // for backward-compatibility
void append(Batch<LogEvent> batch);
}
interface Batch<E> {
void forEach(Consumer<E> item);
}
This way we can introduce reusable (and hence, mutable) Batch<E>
implementations, while the appender is free to perform its own batching
magic tuned for the underlying sink, e.g., JDBC connection, network socket,
etc.
Regarding your remark about "being sympathetic to the ring buffer design",
I did not fully understand this. Would you mind elaborating on this a
little bit more and maybe even sharing a comparison between this and the
aforementioned explicit batching at the interface level, please?
In conclusion, IMHO, making batching/aggregation explicit at the interface
level will help us to avoid quite some code repetition and make the life
easier for future appenders.
On Thu, Sep 24, 2020 at 5:56 PM Carter Kozak <[email protected]> wrote:
> The AsyncAppender more or less implements what you've described, except it
> forwards each event in the batch individually to a delegate appender and
> sets the end-of-batch marker on the last one so the API isn't quite as
> pretty. We could implement something similar that groups events up to some
> interval or maximum capacity before forwarding them, but we'd have to be
> careful to avoid creating garbage collection problems. Potentially holding
> otherwise short lived parameters in the background can have unintended
> impact on garbage collector performance :-)
>
> The LogEvent.isEndOfBatch method is designed to be sympathetic to the ring
> buffer design, where we never create an intermediate collection, but
> instead signal when an event is the last in a group that we're aware of.
>
> Batching comes up fairly frequently for network and database appenders, I
> think we've duplicated that code a few times. I'd be supportive of a
> canonical implementation (or reusable utility functionality) if we can
> extract it out.
>
> -ck
>
> On Thu, Sep 24, 2020, at 11:46, Ralph Goers wrote:
> > Well, Log4j only feeds events to Appenders one at a time. It has no
> place to aggregate them. However, one Appender could wrap others to create
> a batch at which point this would become useful. I suppose some other
> component could be invented to create a batch but I am not sure where or
> how that would fit into the process - does the batch apply only to a
> single appender or to multiple appenders? What filtering takes place
> before it is added to the batch and what, if any, happens after?
> >
> > Ralph
> >
> > > On Sep 24, 2020, at 8:00 AM, Volkan Yazıcı <[email protected]>
> wrote:
> > >
> > > If one would look close to existing appenders, almost everyone of them
> > > implements a custom batching solution. To the best of my knowledge,
> this is
> > > mainly due to the constraint in the Appender interface:
> > >
> > > void append(LogEvent event);
> > >
> > > If this would have been replaced (enhanced?) with
> > >
> > > void append(List<LogEvent> event);
> > >
> > > appenders wouldn't need to implement any batching at all. Further,
> Log4j
> > > could have also provided a batching filter replacing all the custom
> > > batching code internal to the appenders.
> > >
> > > What do you think? Am I missing something obvious here?
> >
> >
> >
>