[ 
https://issues.apache.org/jira/browse/LOG4J2-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15555298#comment-15555298
 ] 

Remko Popma commented on LOG4J2-1630:
-------------------------------------

For the moment, let's set aside the topic of how the application demarcates the 
beginning and the end of a unit of work, and let's take a look an example use 
case.

{quote}
An application receives a high volume of market data events. When processing 
these events the application calls into a third-party library to do some price 
model calculations. It also calls into another third-party library to persist 
some data. There is logging going on in the application, and in both these 
libraries.

Suppose that we define two units of work: the first unit of work is the 
processing of a single market data event from the beginning to the end. The 
second unit is nested inside the main task: calling the 3rd party library to 
calculate a price. If the price calculation result is not interesting we don't 
want to log the details of the calculation. If the market data processing did 
not result in any action we want to suppress both the logging done by the 
application during this processing as well as the price calculation logging 
(the nested unit). However, we never want to suppress logging done in the 3rd 
party library for persisting data.
{quote}

Note: when units of work are nested, the outermost unit determines what gets 
logged.

Example pseudo code (only the logging):

{code}
StartUnitOfWork "onMarketDataUpdate"
com.myapp.pkg1: log "A" with marker "MarketData"
com.myapp.pkg2: log "B" with marker "MarketData"

StartUnitOfWork "priceCalculation"
com.pricecalc.pkg1: log "C"
com.pricecalc.pkg2: log "D"
DiscardUnitOfWork "priceCalculation"

com.myapp.pkg3:   log "E" with marker "MarketData"
com.persist.pkg1: log "F"
com.myapp.pkg4:   log "G" with marker "MarketData"

FlushUnitOfWork "onMarketDataUpdate"
{code}

Then we expect the result to be
{code}
(com.myapp.pkg1)   A
(com.myapp.pkg2)   B
(com.myapp.pkg3)   E
(com.persist.pkg1) F
(com.myapp.pkg4)   G
{code}

Note that the 3rd party libraries are not under our control. They currently do 
not use markers, and we cannot force them to use markers. If we want to 
suppress (or include) them, we need to use something other than markers.

I'm thinking that a configuration for the above use case could look something 
like this:

{noformat}
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN">
  <Appenders>
    <Console name="Console" target="SYSTEM_OUT"><PatternLayout pattern="%d [%t] 
%p %c{36} - %m%n"/></Console>
    <File name="File" fileName="${sys:filename}"><PatternLayout pattern="%d 
[%t] %p %c{1.} - %m%n"/></File>

    <UnitOfWorkBuffer name="uow" size="8192">
      <AppenderRef ref="Console"/>
      <AppenderRef ref="File"/>

      <UnitOfWork name="onMarketDataUpdate">
        <Filters> <!-- filters accept log events that are part of this unit of 
work -->
          <MarkerFilter marker="MarketData" onMatch="ACCEPT" 
onMismatch="NEUTRAL"/> <!-- suppress if unit is discarded -->
          <LoggerName pattern="com.persist.pkg1" onMatch="DENY" 
onMismatch="NEUTRAL" /> <!-- never suppress (exclude from UOW) -->
        </Filters>
      </UnitOfWork>
      <UnitOfWork name="priceCalculation">
        <Filters>
          <LoggerName pattern="com.pricecalc.pkg?" onMatch="ACCEPT" 
onMismatch="NEUTRAL" /> <!-- suppress if unit is discarded -->
        </Filters>
      </UnitOfWork>
    </UnitOfWorkBuffer>
  </Appenders>
  <Loggers>
    <Root level="trace">
      <AppenderRef ref="uow"/>
    </Root>
  </Loggers>
</Configuration>
{noformat}

{{UnitOfWorkBuffer}} is the main plugin, it has a number of {{AppenderRef}} 
elements that the non-suppressed log events are forwarded to, and one or more 
{{UnitOfWork}} elements. 

A {{UnitOfWork}} element has a name and a set of filters that ACCEPT log events 
that are included in the unit of work, and DENY log events that are excluded 
from the unit of work.

When a log event is added to the UnitOfWorkBuffer, it is inspected to see if it 
marks the end of a unit of work. If it sees a "FlushUnitOfWork" command, and 
the name of the unit of work to be flushed is the same as the name at the start 
of the buffer, then the entire buffer contents can be sent to the referenced 
appenders. If the new log event says "DiscardUnitOfWork", UnitOfWorkBuffer 
loops backwards through the buffer, removing buffered log events matched by the 
unit of work filters until it encounters the log event that started the unit of 
work.

I'll take a stab at implementing this to see if/how this can be made to work.

> Unit of Work Logging
> --------------------
>
>                 Key: LOG4J2-1630
>                 URL: https://issues.apache.org/jira/browse/LOG4J2-1630
>             Project: Log4j 2
>          Issue Type: Story
>          Components: API, Core, Filters
>    Affects Versions: 2.7
>            Reporter: Remko Popma
>             Fix For: 2.8
>
>
> h3. Intent
> Provide a way to filter log events, where the decision on whether to discard 
> the message or actually log them cannot be made until after the application 
> has already logged the message.
> h3. Motivation
> In many systems, particularly event processing applications, log files 
> contain a lot of repetitive log messages. Suppose an application needs to do 
> some calculation to decide whether or not to react to some event, and a lot 
> of detail is logged during this calculation. Imagine that 99% of the time, 
> the application decides to take no action. Once the application arrived at 
> that conclusion it would be nice if we could go back and undo all the 
> detailed logging and print a summary instead. When the application _does_ 
> decide to take some action, however, we _do_ want the detailed log messages. 
> A Unit of Work for logging would allow us to group a set of log messages and 
> either discard them or log them together. (Inspired by Martin Fowler's [Unit 
> of Work|http://martinfowler.com/eaaCatalog/unitOfWork.html] pattern.)
> This should result in log files where a lot of the "uninteresting" logging is 
> filtered out, significantly reducing the amount of data logged.
> Some applications do this in an ad hoc manner, for example by passing a 
> Collection to its components, where these components can add log message 
> strings to this Collection. When the discard/retain decision is made, the 
> application then either clears the Collection or logs the contents of the 
> Collection. This works, but having to pass the Collection down the component 
> tree is clunky and the result often omits details like logger name, timestamp 
> and other details that come for free with normal logging. Log4j can provide a 
> reusable and less intrusive way to accomplish this.
> h3. How it works
> There would need to be some API for the application to mark the _start_ of 
> the unit of work, and some API to signal whether the log messages that are 
> part of that unit of work need to be _discarded_ or _logged_ (retained).
> Not all logging that occurs after a unit of work was started is part of that 
> unit of work. The application may want some messages to be logged regardless 
> of whether the unit of work was discarded or not. There needs to be a 
> flexible way (or multiple ways) to include or exclude logging statements from 
> the unit of work. 
> The application may also designate multiple units of work, which may be 
> sequential, nested or partially overlapping. Each unit of work may define its 
> own rules for which log messages are considered included in or excluded from 
> the unit of work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: log4j-dev-unsubscr...@logging.apache.org
For additional commands, e-mail: log4j-dev-h...@logging.apache.org

Reply via email to