[ https://issues.apache.org/jira/browse/LOG4J2-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15886727#comment-15886727 ]
Charles Hache commented on LOG4J2-1784:
---------------------------------------

Hi Ralph,

Works like a charm. I compared the 2.5 release with 2.8.1-SNAPSHOT, which was commit 2cc580f00e290a01048fce9868d5ae9fe906c881 from Jan 24th.

I wasn't able to reproduce the issue on my workstation, but it is a much faster machine with a much faster storage device/interface (and also a different JRE, since it's Intel and the original problem was on ARM Pis). I set up replicated environments on more Raspberry Pis and was able to reproduce the issue and test the fix.

I ran the test using a config very similar to the above (a rough sketch of the relevant parts is below). With the 2.5 jars the unit runs fine for a few moments, then freezes for about 5 minutes (no activity on the TX/RX LEDs and so on), then continues on in this manner; the timestamps on the log.gz files in the log directory are all about 5 minutes apart. With the 2.8.1 jars there is no discontinuity in the operation of the unit: the TX/RX LEDs show constant activity, I can watch the log file roll over with no hesitation, and the timestamps in the log directory show several logs per minute. This was all with DefaultRolloverStrategy max="99999".

Seems to have made a big difference. Thanks for addressing the issue, although now I know better than to use ridiculous DefaultRolloverStrategy values.

Thanks again,
Charles
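P.S. For reference, a minimal sketch of the kind of configuration in question, assuming a size-triggered RollingRandomAccessFile with gzipped, counter-indexed archives. The actual config is not included in this message, so the file names, layout pattern, and root logger level are illustrative placeholders; the 256K size trigger and DefaultRolloverStrategy max="99999" are the values discussed in the report below.

    <Configuration status="warn">
      <Appenders>
        <!-- Illustrative names and paths; not the exact config from the report -->
        <RollingRandomAccessFile name="RollingFile"
                                 fileName="logs/app.log"
                                 filePattern="logs/app-%i.log.gz">
          <PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss,SSS} %-5level %msg%n"/>
          <Policies>
            <!-- Small size trigger so rollover happens every few seconds -->
            <SizeBasedTriggeringPolicy size="256 KB"/>
          </Policies>
          <!-- The very large max that made rollover slow on 2.5/2.7 -->
          <DefaultRolloverStrategy max="99999"/>
        </RollingRandomAccessFile>
      </Appenders>
      <Loggers>
        <Root level="trace">
          <AppenderRef ref="RollingFile"/>
        </Root>
      </Loggers>
    </Configuration>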
> Large values for max in DefaultRolloverStrategy blocks application
> ------------------------------------------------------------------
>
>                 Key: LOG4J2-1784
>                 URL: https://issues.apache.org/jira/browse/LOG4J2-1784
>             Project: Log4j 2
>          Issue Type: Improvement
>          Components: Appenders
>    Affects Versions: 2.7
>         Environment: Played around a lot in 2.5; once I realized the issue was related to log4j, I upgraded to 2.7 and confirmed it is still happening. Running on a Raspberry Pi.
>            Reporter: Charles Hache
>            Assignee: Ralph Goers
>
> Hi all,
> I'm trying to debug an application which is using log4j2 for logging. After discovering that by default the RollingRandomAccessFile only keeps the last 7 logs using the counter, I found the DefaultRolloverStrategy option and set the max to something like 99999, so that in theory I wouldn't have to worry about it.
> A couple of days later I started looking at everything again (log files from log4j, timestamped CSV files from the application, etc.) and I can see something is going terribly wrong. What I'm finding is that when it comes time to rotate the logs, the whole application appears to hang. I see the following:
> - All logging to the log file stops;
> - All my data going to the CSV files stops;
> - CPU usage goes to 100%;
> - I cannot stop my application cleanly.
> After a while (~4-5 minutes) everything goes back to normal (until the next log rotation), but then the first lines of the log have timestamps like this:
> 2017-01-14 07:43:00,001 INFO ....
> 2017-01-14 07:42:11,329 DEBUG ....
> 2017-01-14 07:47:26,648 TRACE ....
> 2017-01-14 07:47:26,695 INFO ....
> 2017-01-14 07:47:26,697 TRACE ....
> 2017-01-14 07:47:26,737 INFO ....
> 2017-01-14 07:47:26,739 TRACE ....
> 2017-01-14 07:47:26,741 TRACE ....
> 2017-01-14 07:47:26,744 TRACE ....
> 2017-01-14 07:47:26,768 TRACE ....
> Of note is that there is a gap of a few minutes in the first couple of lines of the file (I'm not too concerned about the out-of-order timestamps, as lots of different threads are logging). The application always writes many (dozens, maybe hundreds) of lines per second, so a gap of a couple of minutes is uncharacteristic.
> The rest of the log file looks normal (no gaps, log lines every few milliseconds). From my specific log lines and where they sit with respect to each other, it looks to me like calls to the logger are blocking (e.g. a couple of log lines that usually appear 1-2 milliseconds apart will be minutes apart when they occur at rollover time).
> I shrunk the file size rollover to something small (256K), which I hit in maybe 15 seconds or so, and it is clearly reproducible. When I reduced the DefaultRolloverStrategy max to 40, the problem went away and I can see it rolling over the file practically instantly.
> The documentation for the DefaultRolloverStrategy says: "When the ascending attribute is set to true (the default) then the counter will be incremented and the current log file will be renamed to include the counter value. If the counter hits the maximum value then the oldest file, which will have the smallest counter, will be deleted, all other files will be renamed to have their counter decremented and then the current file will be renamed to have the maximum counter value. Note that with this counting strategy specifying a large maximum value may entirely avoid renaming files."
> I took this to mean that having a value so large that it would never be hit would be more efficient, since nothing ever has to be renamed; it just keeps counting up. I'm wondering now if it's doing something like "check if log-1.log exists... check if log-2.log exists... check if log-99999.log exists", but I don't see any IO-bound activity in iotop or anything (although I can't say for sure whether filesystem calls that check for a file's existence would show up there).
> This might not necessarily be a bug in the logger per se, but it would at least be a documentation issue if there is some O(n) operation based on the max value of the DefaultRolloverStrategy.
> I'd usually dig deeper, but for now I still have to debug the original issues with the actual application. Let me know if there is any more info I should provide or if there is anything else I can do!
> Thanks,
> Charles
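Purely as an illustration of the speculation in the quoted report above (this is not Log4j's actual code, and the class, method, and file names are made up): a per-rollover scan of this shape performs up to max file-existence checks every time the log rolls over, so max="99999" means on the order of 99,999 stat calls per rollover. Each call is a small metadata lookup, cheap on its own but slow in aggregate on storage like an SD card, and not the kind of bulk I/O that iotop makes obvious.

    import java.io.File;

    // Hypothetical sketch only -- NOT Log4j's actual implementation.
    public class RolloverScanSketch {

        // Return the highest counter i in [1, max] for which an archived log exists.
        static int highestExistingIndex(File dir, String baseName, int max) {
            int highest = 0;
            for (int i = 1; i <= max; i++) {
                // One existence check (stat) per candidate name: O(max) per rollover.
                if (new File(dir, baseName + "-" + i + ".log.gz").exists()) {
                    highest = i;
                }
            }
            return highest;
        }

        public static void main(String[] args) {
            File logDir = new File("logs");   // illustrative path
            long start = System.nanoTime();
            int highest = highestExistingIndex(logDir, "app", 99_999);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println("highest existing index: " + highest
                    + " (scan took " + elapsedMs + " ms)");
        }
    }

Whether or not the real implementation looked like this, any per-rollover work that is linear in max would be consistent with the report: dropping max to 40 made rollover practically instant, while 99999 stalled it for minutes on the Pi.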