On 22/02/2011 19:34, Filip Hanik - Dev Lists wrote:
> Wouldn't you build this into the logging framework, instead of one
> specific component?

You could, if you can find an efficient way to spot duplicate log
messages and then differentiate correctly between when you want the
duplicates and when you don't. I'm not sure how easy that would be.
Suggestions welcome.

The other issue with the current code is that it is likely to chew up a
large chunk of CPU time. Fixing the log mechanism won't fix that.

Of course, the real solution is to fix the ulimit issue in the first
place, but Tomcat's behaviour in this error condition seems sufficiently
bad to justify some changes to handle it more gracefully.

Mark

>
> best
> Filip
>
> On 02/21/2011 08:21 AM, Mark Thomas wrote:
>> The ASF Sonar installation managed to generate 46GB of identical log
>> messages [1] today in the 8 hours it took to notice it was down.
>>
>> While better monitoring would/should have identified the problem sooner,
>> it does demonstrate a problem with the acceptor threads in all three
>> endpoints. If there is a system-level issue that causes the accept()
>> call to always fail (such as hitting the ulimit) then the endpoint
>> essentially enters a loop where it logs an error message for every
>> iteration of the loop. This will result in many log messages per second.
>>
>> I'd like to do something about this. I was thinking of something along
>> the lines of the following for each endpoint.
>>
>> Index: java/org/apache/tomcat/util/net/JIoEndpoint.java
>> ===================================================================
>> --- java/org/apache/tomcat/util/net/JIoEndpoint.java (revision 1072939)
>> +++ java/org/apache/tomcat/util/net/JIoEndpoint.java (working copy)
>> @@ -183,9 +183,19 @@
>>          @Override
>>          public void run() {
>>
>> +            int errorDelay = 0;
>> +
>>              // Loop until we receive a shutdown command
>>              while (running) {
>>
>> +                if (errorDelay > 0) {
>> +                    try {
>> +                        Thread.sleep(errorDelay);
>> +                    } catch (InterruptedException e) {
>> +                        // Ignore
>> +                    }
>> +                }
>> +
>>                  // Loop if endpoint is paused
>>                  while (paused && running) {
>>                      try {
>> @@ -225,9 +235,15 @@
>>                          // Ignore
>>                      }
>>                  }
>> +                errorDelay = 0;
>>              } catch (IOException x) {
>>                  if (running) {
>>                      log.error(sm.getString("endpoint.accept.fail"), x);
>> +                    if (errorDelay == 0) {
>> +                        errorDelay = 50;
>> +                    } else if (errorDelay < 1600) {
>> +                        errorDelay = errorDelay * 2;
>> +                    }
>>                  }
>>              } catch (NullPointerException npe) {
>>                  if (running) {
>>
>>
>>
>> Thoughts / comments?
>>
>> Mark
>>
>>
>> [1] http://pastebin.com/CrsujeW4
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
>> For additional commands, e-mail: dev-h...@tomcat.apache.org
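
To make the delay schedule in the proposed patch easier to reason about, here is a minimal standalone sketch of the same back-off rule (start at 50 ms, double on each consecutive failure while the delay is below 1600 ms, reset to 0 after a successful accept). It is not Tomcat code: the class name AcceptBackoff and its methods are hypothetical and exist only for illustration. The thresholds are taken directly from the diff.

```java
// Hypothetical helper, not part of Tomcat: isolates the errorDelay
// logic from the proposed JIoEndpoint patch so it can be checked alone.
class AcceptBackoff {

    // Values taken from the patch: first delay and the cap, in milliseconds.
    static final int INITIAL_DELAY_MS = 50;
    static final int MAX_DELAY_MS = 1600;

    // 0 means "no delay", i.e. the last accept() succeeded.
    private int errorDelay = 0;

    // Call after accept() throws; returns the delay to sleep before retrying.
    int onFailure() {
        if (errorDelay == 0) {
            errorDelay = INITIAL_DELAY_MS;
        } else if (errorDelay < MAX_DELAY_MS) {
            errorDelay = errorDelay * 2;
        }
        return errorDelay;
    }

    // Call after accept() succeeds; clears any accumulated delay.
    void onSuccess() {
        errorDelay = 0;
    }

    int currentDelay() {
        return errorDelay;
    }
}
```

With this rule, consecutive failures yield delays of 50, 100, 200, 400, 800 and 1600 ms, and the delay then stays at 1600 ms. In the persistent-failure case described above, that bounds the acceptor to fewer than one log message per second once the cap is reached, and the sleeps also address the CPU usage of the tight loop.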