[ http://issues.apache.org/jira/browse/NUTCH-258?page=all ]
     
Chris A. Mattmann reopened NUTCH-258:
-------------------------------------

     Assign To: Chris A. Mattmann

The issue turns out to be a real problem with the Fetcher. Here is the proposed 
solution:

* add a flag field (preferably keyed by a short public static final String) to 
the Configuration instance to signify whether or not a
SEVERE error has been logged within a task's context

* check this field within the fetcher to determine whether or not to stop
the fetcher, just for that fetching task identified by its Configuration
(and no others)
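
The two steps above could be sketched roughly as follows. This is only an
illustration under assumptions, not the actual patch: TaskConf is a
hypothetical stand-in for the per-task Configuration, and the key name
"fetcher.severe.logged" and the helpers markSevere(), hasLoggedSevere() and
runFetchLoop() are made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

public class SevereFlagSketch {

    /** Hypothetical stand-in for a per-task Configuration (string key/value pairs). */
    static class TaskConf {
        private final Map<String, String> props = new HashMap<>();

        synchronized void setBoolean(String key, boolean value) {
            props.put(key, Boolean.toString(value));
        }

        synchronized boolean getBoolean(String key, boolean defaultValue) {
            String v = props.get(key);
            return v == null ? defaultValue : Boolean.parseBoolean(v);
        }
    }

    /** Assumed key name; the real constant would be chosen in the patch. */
    public static final String SEVERE_LOGGED_KEY = "fetcher.severe.logged";

    /** Record that a SEVERE error occurred in the context of this task only. */
    static void markSevere(TaskConf conf) {
        conf.setBoolean(SEVERE_LOGGED_KEY, true);
    }

    /** Per-task replacement for a process-wide static check. */
    static boolean hasLoggedSevere(TaskConf conf) {
        return conf.getBoolean(SEVERE_LOGGED_KEY, false);
    }

    /**
     * Simulated fetch loop: processes up to 'items' entries and marks the task
     * as failed at index 'failAt' (use -1 for no failure).
     * Returns the number of entries processed before the loop stopped.
     */
    static int runFetchLoop(TaskConf conf, int items, int failAt) {
        int processed = 0;
        for (int i = 0; i < items; i++) {
            if (hasLoggedSevere(conf)) {   // something bad happened in THIS task
                break;                     // exit this task only
            }
            if (i == failAt) {
                markSevere(conf);          // simulate a SEVERE error
                continue;
            }
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        TaskConf failingTask = new TaskConf();
        TaskConf healthyTask = new TaskConf();
        System.out.println("failing task processed: " + runFetchLoop(failingTask, 5, 2));
        System.out.println("healthy task processed: " + runFetchLoop(healthyTask, 5, -1));
    }
}
```

Because the flag lives in the task's own configuration object rather than in a
static, a SEVERE error in one fetching task stops only that task; a later task
created with a fresh configuration starts with the flag unset.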



> Once Nutch logs a SEVERE log item, Nutch fails forevermore
> ----------------------------------------------------------
>
>          Key: NUTCH-258
>          URL: http://issues.apache.org/jira/browse/NUTCH-258
>      Project: Nutch
>         Type: Bug

>   Components: fetcher
>     Versions: 0.8-dev
>  Environment: All
>     Reporter: Scott Ganyo
>     Assignee: Chris A. Mattmann
>     Priority: Critical
>  Attachments: dumbfix.patch
>
> Once a SEVERE log item is written, Nutch shuts down any fetching forevermore. 
>  This is from the run() method in Fetcher.java:
>     public void run() {
>       synchronized (Fetcher.this) {activeThreads++;} // count threads
>       
>       try {
>         UTF8 key = new UTF8();
>         CrawlDatum datum = new CrawlDatum();
>         
>         while (true) {
>           if (LogFormatter.hasLoggedSevere())     // something bad happened
>             break;                                // exit
>           
> Notice the last 2 lines.  This will prevent Nutch from ever fetching again 
> once this is hit, because LogFormatter stores this flag in a static field.
> (Also note that "LogFormatter.hasLoggedSevere()" is checked in 
> org.apache.nutch.net.URLFilterChecker as well and will disable that class too.)
> This must be fixed or Nutch cannot be run as any kind of long-running 
> service.  Furthermore, I believe it is a poor decision to rely on a logging 
> event to determine the state of the application - this could have any number 
> of side-effects that would be extremely difficult to track down.  (As it has 
> already for me.)
