Thanks,

The issue I’m worried about is a race condition where:
  1. Log4j writes log event E0.
  2. Log collector reads log file and sees E0.
  3. Log4j writes log event E1.
  4. Log4j rotates log file.
  5. Log4j writes log event E2.
  6. Log collector reads log file and sees E2 but misses E1.

Obviously I can make sure the log collector process detects the file was rotated at step 6, and instead of reading the new file, reads the rotated one first, sees E1, then the new file and sees E2.

This looks like an overly complex hack but still seems doable. It wouldn’t work if Log4j rotates the log file twice between two polls from the log collector, but with a poll interval of 1s and a 100MB rotation threshold, that would require writing more than 100MB of logs per second, and we’d be in trouble long before that happens. I’ll give this solution a try if I can’t manage to get the FIFO solution to work.
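
To make the catch-up logic above concrete, here is a rough sketch in Java, not a tested implementation. The class name RotationAwareTailer, the file names app.log and app.log.1, and the demo() helper are all hypothetical. It assumes rotation is a plain rename of the live file, that the file system exposes a file key (an inode on POSIX), and that at most one rotation happens between two polls.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical collector: polls the live log file and, after a rotation, drains the rotated file first. */
public class RotationAwareTailer {
    private final Path live;     // file the logger currently writes to
    private final Path rotated;  // assumed name of the most recently rotated file
    private Object fileKey;      // identity of the file read at the last poll (may be null on some platforms)
    private long offset;         // bytes already consumed from that file

    public RotationAwareTailer(Path live, Path rotated) {
        this.live = live;
        this.rotated = rotated;
    }

    /** Returns the complete lines written since the previous poll, in write order. */
    public List<String> poll() {
        List<String> lines = new ArrayList<>();
        try {
            if (!Files.exists(live)) return lines;
            Object currentKey = Files.readAttributes(live, BasicFileAttributes.class).fileKey();
            // Rotation is assumed if the file identity changed or the file shrank below our offset.
            boolean wasRotated = (fileKey != null && !fileKey.equals(currentKey))
                    || Files.size(live) < offset;
            if (wasRotated) {
                // Read the tail of the old file first (E1 in the scenario), then restart at 0.
                if (Files.exists(rotated)) readFrom(rotated, offset, lines);
                offset = 0;
            }
            offset = readFrom(live, offset, lines);
            fileKey = currentKey;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    /** Appends complete lines found after position 'from' to 'out' and returns the new offset. */
    private static long readFrom(Path file, long from, List<String> out) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long length = raf.length();
            if (length <= from) return from;
            byte[] buf = new byte[(int) (length - from)];
            raf.seek(from);
            raf.readFully(buf);
            int end = buf.length;
            while (end > 0 && buf[end - 1] != '\n') end--;  // ignore a partially written last line
            if (end == 0) return from;
            String text = new String(buf, 0, end - 1, StandardCharsets.UTF_8);
            for (String line : text.split("\n")) out.add(line);
            return from + end;
        }
    }

    /** Replays the six steps of the race in a temporary directory and returns the events seen. */
    public static List<String> demo() {
        try {
            Path dir = Files.createTempDirectory("tailer-demo");
            Path live = dir.resolve("app.log");
            Path rotated = dir.resolve("app.log.1");
            RotationAwareTailer tailer = new RotationAwareTailer(live, rotated);
            List<String> seen = new ArrayList<>();
            append(live, "E0\n");        // 1. logger writes E0
            seen.addAll(tailer.poll());  // 2. collector sees E0
            append(live, "E1\n");        // 3. logger writes E1
            Files.move(live, rotated);   // 4. rotation by rename
            append(live, "E2\n");        // 5. logger writes E2 to a fresh file
            seen.addAll(tailer.poll());  // 6. collector reads E1 from the rotated file, then E2
            return seen;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void append(Path file, String text) throws IOException {
        Files.write(file, text.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }
}
```

Comparing file keys rather than only sizes matters here: in the scenario the new file can already be as large as the old offset by the time of the next poll, so a size check alone would miss the rotation. A rotation that lands between the identity check and the read is still not handled by this sketch.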

Obviously, in the end, I don’t want the logs on the local file system. They are supposed to be sent to an external logging service. For now, the Log4j configuration uses the SyslogAppender to send them. This caused a production incident under high load, when the central syslog server couldn’t acknowledge TCP segments in a timely manner. Log4j then waited before sending more data, and our application froze waiting for the logger calls to return.

I know Log4j has an AsyncAppender I can use to wrap around the SyslogAppender, but it stores log events in RAM, causing them to be lost in case of a JVM shutdown/restart. Also, we already have everything in place to collect the standard outputs of all our daemon processes, buffer them on disk and forward them to the central syslog at a rate it can sustain. At first it seemed easier to reuse this.

Regards,

--
Étienne

