[ 
http://issues.apache.org/jira/browse/NUTCH-258?page=comments#action_12414762 ] 

Scott Ganyo commented on NUTCH-258:
-----------------------------------

For the record:  I strongly object to closing this issue for the following 
reasons:

1) Having the entire system stop processing as a *side-effect* of merely 
logging a message at a certain level is poor practice.  In fact, I 
believe that this would make a fantastic anti-pattern.  If this kind of 
behavior is *really* wanted (and I argue below that it should not be), it 
should be done through an explicit mechanism, not as a side-effect.  For 
example, did you realize that, because Hadoop hijacks and reassigns all log 
formatters (also a bad practice!) in the org.apache.hadoop.util.LogFormatter 
static initializer, anyone who uses Nutch as a library and logs a SEVERE 
error will suffer by having Nutch stop fetching?
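
To illustrate what an explicit mechanism could look like, here is a minimal sketch. Everything in it is hypothetical: StopSignal, requestStop() and runFetch() are names invented for this example and do not exist in Nutch or Hadoop. The point is only that the stop condition is raised deliberately by the code that detects the failure, records a reason, and belongs to one run rather than to the logging subsystem.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: none of these names exist in Nutch or Hadoop.
class ExplicitStopDemo {

    /** Per-run stop signal that must be raised deliberately by the caller. */
    static final class StopSignal {
        private final AtomicReference<String> reason = new AtomicReference<>();

        /** Records the first reason given; later calls are ignored. */
        void requestStop(String why) {
            reason.compareAndSet(null, why);
        }

        boolean isStopRequested() {
            return reason.get() != null;
        }

        /** Returns the recorded reason, or null if no stop was requested. */
        String reason() {
            return reason.get();
        }
    }

    /** Stand-in for a fetcher loop: processes items until told to stop. */
    static int runFetch(StopSignal stop, int items, int failAt) {
        int processed = 0;
        for (int i = 0; i < items; i++) {
            if (stop.isStopRequested()) {
                break;                          // explicit, traceable exit
            }
            if (i == failAt) {
                // The failing code decides to stop; logging plays no part.
                stop.requestStop("unrecoverable error at item " + i);
                continue;
            }
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        StopSignal first = new StopSignal();
        System.out.println(runFetch(first, 10, 3) + " items, reason: " + first.reason());

        // A new run gets a new signal, so the earlier failure does not leak into it.
        StopSignal second = new StopSignal();
        System.out.println(runFetch(second, 10, -1) + " items, stopped: " + second.isStopRequested());
    }
}
```

Because the signal carries a reason, whoever operates the service can see why a run ended instead of having to infer it from the log.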

2) Moreover, having the system stop processing forevermore by means of a 
static(!) flag makes it impossible to use Nutch as a library within a server 
or service environment.  Once this logging is done, no more Fetcher 
processing can take place in this run *or any other*.  This is inappropriate.  
You might as well call System.exit() at this point!  In fact, I could even 
argue that the current behavior is worse than a System.exit(), as it can 
actually obfuscate why the system has ceased being operational even though it 
is still ostensibly "running."
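
The effect can be reproduced in isolation with plain java.util.logging. The sketch below is not the Hadoop LogFormatter code: SevereTrackingHandler and fetch() are hypothetical stand-ins that only mimic the reported behavior, namely a static flag set as a side-effect of logging and checked in the fetch loop.

```java
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical reproduction; this is not the actual Hadoop or Nutch code.
class StaticFlagDemo {

    /** Sets a JVM-wide flag whenever any record at SEVERE or above passes through. */
    static final class SevereTrackingHandler extends Handler {
        private static volatile boolean loggedSevere = false;

        static boolean hasLoggedSevere() {
            return loggedSevere;
        }

        @Override
        public void publish(LogRecord record) {
            if (record.getLevel().intValue() >= Level.SEVERE.intValue()) {
                loggedSevere = true;            // never reset anywhere
            }
        }

        @Override
        public void flush() { }

        @Override
        public void close() { }
    }

    /** Stand-in for the fetch loop: gives up as soon as the static flag is set. */
    static int fetch(int items) {
        int processed = 0;
        for (int i = 0; i < items; i++) {
            if (SevereTrackingHandler.hasLoggedSevere()) {
                break;
            }
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        // A logger owned by the embedding application, unrelated to fetching.
        Logger hostLog = Logger.getLogger("unrelated.host.application");
        hostLog.setUseParentHandlers(false);    // keep the demo output quiet
        hostLog.addHandler(new SevereTrackingHandler());

        System.out.println("before: " + fetch(5));   // prints "before: 5"
        hostLog.severe("problem that has nothing to do with fetching");
        System.out.println("after: " + fetch(5));    // prints "after: 0"
    }
}
```

Every later call to fetch() in the same JVM returns 0 with no error of its own, which is the "still ostensibly running" state described above.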

Thus, while there definitely *are* instances of inappropriate logging levels 
being used, and I could document them, I believe that this issue has more to 
do with the system and its architecture than with the use of a particular 
logging level for a certain event.

> Once Nutch logs a SEVERE log item, Nutch fails forevermore
> ----------------------------------------------------------
>
>          Key: NUTCH-258
>          URL: http://issues.apache.org/jira/browse/NUTCH-258
>      Project: Nutch
>         Type: Bug
>   Components: fetcher
>     Versions: 0.8-dev
>  Environment: All
>     Reporter: Scott Ganyo
>     Priority: Critical
>  Attachments: dumbfix.patch
>
> Once a SEVERE log item is written, Nutch shuts down any fetching forevermore. 
>  This is from the run() method in Fetcher.java:
>     public void run() {
>       synchronized (Fetcher.this) {activeThreads++;} // count threads
>       
>       try {
>         UTF8 key = new UTF8();
>         CrawlDatum datum = new CrawlDatum();
>         
>         while (true) {
>           if (LogFormatter.hasLoggedSevere())     // something bad happened
>             break;                                // exit
>           
> Notice the last 2 lines.  This will prevent Nutch from ever fetching again 
> once this is hit, as LogFormatter stores this data in a static.
> (Note that "LogFormatter.hasLoggedSevere()" is also checked in 
> org.apache.nutch.net.URLFilterChecker and will disable this class as well.)
> This must be fixed or Nutch cannot be run as any kind of long-running 
> service.  Furthermore, I believe it is a poor decision to rely on a logging 
> event to determine the state of the application - this could have any number 
> of side-effects that would be extremely difficult to track down.  (As it has 
> already for me.)



