[ http://issues.apache.org/jira/browse/NUTCH-152?page=comments#action_12362043 ] 

Paul Baclace commented on NUTCH-152:
------------------------------------

>re 3: Why is a separate thread needed for stdout? 

A separate thread certainly makes the code easier to read.  Using the main 
thread to read the subprocess stdout is a clever deviation from the usual 
idiom of using a separate thread.  

Programming defensively, being able to setDaemon(true) and interrupt() a 
separate thread eliminates the possibility that external, unexpected problems 
(bugs) will cause a hang or resource leak.  
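
As a rough illustration of the idiom, here is a minimal sketch of a daemon 
reader thread for a subprocess stream.  The names StdoutPump and startPump are 
hypothetical and are not taken from TaskRunner or the attached patch.  Note 
that interrupt() is only checked between reads here; a read that is blocked in 
the OS may not notice it, which is where the daemon flag helps.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only (not the TaskRunner code).
public class StdoutPump {

    /** Starts a daemon thread that appends each line read from 'in' to 'sink'. */
    static Thread startPump(InputStream in, StringBuilder sink) {
        Thread t = new Thread(() -> {
            try (BufferedReader r = new BufferedReader(
                    new InputStreamReader(in, StandardCharsets.UTF_8))) {
                String line;
                // Stop on EOF, or when interrupted between reads.
                while (!Thread.currentThread().isInterrupted()
                        && (line = r.readLine()) != null) {
                    synchronized (sink) {
                        sink.append(line).append('\n');
                    }
                }
            } catch (IOException e) {
                // The stream was closed or broke; nothing more to read.
            }
        }, "stdout-pump");
        // A daemon thread cannot keep the JVM alive if it never sees EOF.
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

With a real subprocess, the stream would come from Process.getInputStream().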

>re 4: I'd expect the io pipes to get EOF when the process is killed. 

If the subprocess is hanging in a device driver, it might not be killed in a 
timely fashion, so the EOF might not arrive immediately.  Rare, but not 
impossible.  
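
One way to avoid depending on that EOF is to bound every wait.  The following 
is only a sketch under assumptions: BoundedKill and killAndWait are 
hypothetical names, the reader thread is whatever thread drains the pipe, and 
Process.waitFor(long, TimeUnit) and destroyForcibly() require Java 8 or later.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only (not the TaskRunner code).
public class BoundedKill {

    /**
     * Asks the process to exit, then waits at most timeoutMs for the process
     * and at most timeoutMs more for the reader thread.
     * Returns true only if both finished within those limits.
     */
    static boolean killAndWait(Process p, Thread reader, long timeoutMs)
            throws InterruptedException {
        p.destroy();
        boolean exited = p.waitFor(timeoutMs, TimeUnit.MILLISECONDS);
        if (!exited) {
            // Still running after the timeout: escalate, but do not block on it.
            p.destroyForcibly();
        }
        // Do not wait indefinitely for EOF on the pipe.
        reader.interrupt();
        reader.join(timeoutMs);
        return exited && !reader.isAlive();
    }
}
```

If the reader is also a daemon thread, a false return can be logged and the 
caller can move on without risking a hang at JVM exit.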


> TaskRunner io pipes are not setDaemon(true), cleanup and exception errors are 
> incomplete, max heap too small
> ------------------------------------------------------------------------------------------------------------
>
>          Key: NUTCH-152
>          URL: http://issues.apache.org/jira/browse/NUTCH-152
>      Project: Nutch
>         Type: Bug
>   Components: fetcher
>     Versions: 0.8-dev
>  Environment: all
>     Reporter: Paul Baclace
>  Attachments: TaskRunner.java.patch
>
> 1. io pipes should be setDaemon(true) so that process cannot hang.
> 2. error messages for Exceptions are incomplete since e.getMessage() is used 
> and it can be null (a NullPointerException usually has no message).  Change 
> this to e.toString(), which always includes the exception class name.
> 3. a separate thread is not used for the subprocess stdout pipe, but it must 
> be a separate thread if setDaemon(true).
> 4. TaskRunner.kill()  does not stop the io pipe threads, but it should.
> 5. If InterruptedException occurs, it was assumed to be for the current 
> (main) thread, but it should check this with Thread.interrupted(); otherwise, 
> spurious thread interrupts will be rethrown as IOException.
> 6. A recent run had some Tasktracker child processes that ran out of heap.  
> The default max heap size should be larger.
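
To illustrate item 2 above, a small sketch of the difference between 
getMessage() and toString().  The class name ExceptionText and the method 
describe are hypothetical, chosen only for this example.

```java
// Hypothetical demo class, for illustration only.
public class ExceptionText {

    /** Returns text suitable for a log line; never null for a non-null exception. */
    static String describe(Throwable e) {
        // Throwable.toString() is the class name, plus ": message" when one exists.
        return e.toString();
    }

    public static void main(String[] args) {
        Throwable e = new NullPointerException();
        System.out.println("getMessage(): " + e.getMessage()); // prints "getMessage(): null"
        System.out.println("toString():   " + describe(e));
    }
}
```

An exception constructed with a message, such as new IOException("disk full"), 
yields "java.io.IOException: disk full" from toString(), so nothing is lost by 
the change.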

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
