[ http://issues.apache.org/jira/browse/NUTCH-258?page=comments#action_12414762 ]
Scott Ganyo commented on NUTCH-258:
-----------------------------------

For the record: I strongly object to closing this issue, for the following reasons:

1) Having the entire system stop processing as a *side-effect* of merely logging a message at a certain event level is a poor practice. In fact, I believe that this would make a fantastic anti-pattern. If this kind of behavior is *really* wanted (and I argue below that it should not be), it should be done through an explicit mechanism, not as a side-effect. For example, did you realize that, since Hadoop hijacks and reassigns all log formatters (also a bad practice!) in the org.apache.hadoop.util.LogFormatter static constructor, anyone who uses Nutch as a library and logs a SEVERE error will suffer by having Nutch stop fetching?

2) Moreover, having the system stop processing forevermore by use of a static(!) flag makes the use of the Nutch system as a library within a server or service environment impossible. Once this logging is done, no more Fetcher processing in this run *or any other* can take place. This is inappropriate. You might as well call System.exit() at this point! In fact, I could even argue that the current behavior is worse than a System.exit(), as it can actually obfuscate why the system has ceased being operational even though it is still ostensibly "running."

Thus, while there definitely *are* instances of inappropriate logging levels being used, and I could document them, I believe that this issue is more endemic to the system and its architecture than to the use of a particular logging level for a certain event.
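To make the "explicit mechanism" in point 1 concrete, here is a minimal sketch of one possible shape for it. It is not the actual Nutch Fetcher and not the attached patch: the class FetchRun and the methods abort(), isAborted(), getAbortReason() and fetch() are hypothetical names invented for illustration. The idea is only that the stop signal lives in per-run instance state and is set by an explicit call that records a reason, so nothing written to a log can halt fetching and a later run starts clean.

```java
import java.util.concurrent.atomic.AtomicReference;

/**
 * Hypothetical sketch: a per-run, explicit stop signal in place of a
 * static "has logged SEVERE" flag. All names here are illustrative.
 */
class FetchRun {
    // Instance state, not static: each new FetchRun starts un-aborted.
    // A non-null value means "aborted" and doubles as the recorded reason.
    private final AtomicReference<String> abortReason = new AtomicReference<>();

    /** Explicitly ask this run to stop; only the first reason is kept. */
    public void abort(String reason) {
        if (reason == null) {
            throw new IllegalArgumentException("reason must not be null");
        }
        abortReason.compareAndSet(null, reason);
    }

    public boolean isAborted() {
        return abortReason.get() != null;
    }

    /** Returns the recorded reason, or null if the run was not aborted. */
    public String getAbortReason() {
        return abortReason.get();
    }

    /** Stand-in for the fetch loop; returns how many items were fetched. */
    public int run(Iterable<String> urls) {
        int fetched = 0;
        for (String url : urls) {
            if (isAborted()) {
                break; // stop because someone asked, not because of a log level
            }
            try {
                fetch(url);
                fetched++;
            } catch (IllegalStateException e) {
                // The code that detects the fatal condition decides to stop.
                abort("unrecoverable error on '" + url + "': " + e.getMessage());
            }
        }
        return fetched;
    }

    /** Placeholder for real fetching; rejects an empty URL for the demo. */
    protected void fetch(String url) {
        if (url.isEmpty()) {
            throw new IllegalStateException("empty URL");
        }
    }
}
```

With something along these lines, a host application can log at any level without affecting the fetcher, and when a run does stop early, getAbortReason() says why instead of leaving a process that is running but silently idle.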
> Once Nutch logs a SEVERE log item, Nutch fails forevermore
> ----------------------------------------------------------
>
>          Key: NUTCH-258
>          URL: http://issues.apache.org/jira/browse/NUTCH-258
>      Project: Nutch
>         Type: Bug
>   Components: fetcher
>     Versions: 0.8-dev
>  Environment: All
>     Reporter: Scott Ganyo
>     Priority: Critical
>  Attachments: dumbfix.patch
>
> Once a SEVERE log item is written, Nutch shuts down any fetching forevermore.
> This is from the run() method in Fetcher.java:
>     public void run() {
>       synchronized (Fetcher.this) {activeThreads++;} // count threads
>
>       try {
>         UTF8 key = new UTF8();
>         CrawlDatum datum = new CrawlDatum();
>
>         while (true) {
>           if (LogFormatter.hasLoggedSevere())     // something bad happened
>             break;                                // exit
>
> Notice the last 2 lines. This will prevent Nutch from ever Fetching again
> once this is hit as LogFormatter is storing this data as a static.
> (Also note that "LogFormatter.hasLoggedSevere()" is also checked in
> org.apache.nutch.net.URLFilterChecker and will disable this class as well.)
> This must be fixed or Nutch cannot be run as any kind of long-running
> service. Furthermore, I believe it is a poor decision to rely on a logging
> event to determine the state of the application - this could have any number
> of side-effects that would be extremely difficult to track down. (As it has
> already for me.)

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers