[ 
http://issues.apache.org/jira/browse/NUTCH-152?page=comments#action_12362043 ] 

Paul Baclace commented on NUTCH-152:
------------------------------------

>re 3: Why is a separate thread needed for stdout? 

A separate thread certainly makes the code easier to read.  Using the main 
thread to read the subprocess stdout is a clever deviation from the usual 
idiom of using a separate thread.  

Programming defensively, being able to setDaemon(true) and interrupt() a 
separate thread eliminates any possibility that external, unexpected problems 
(bugs) will cause a hang or resource leak.  
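
A minimal sketch of the separate-thread idiom, assuming Java 8 or later. The 
names PipePumpSketch and startPump are hypothetical and are not taken from 
TaskRunner.java.patch; an in-memory stream stands in for 
Process.getInputStream() so the example runs without a subprocess.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class PipePumpSketch {
    /**
     * Starts a daemon thread that copies lines from 'in' into 'sink'
     * until EOF or an I/O error. (Hypothetical helper for illustration.)
     */
    static Thread startPump(String name, InputStream in, List<String> sink) {
        Thread t = new Thread(() -> {
            try (BufferedReader r = new BufferedReader(
                    new InputStreamReader(in, StandardCharsets.UTF_8))) {
                String line;
                while ((line = r.readLine()) != null) {
                    sink.add(line);
                }
            } catch (IOException e) {
                // e.toString() keeps the exception class even if the message is null.
                sink.add("pump failed: " + e);
            }
        }, name);
        // Must be called before start(). A daemon thread cannot keep the JVM
        // alive, so a pump stuck in read() does not prevent exit.
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> lines = new CopyOnWriteArrayList<>();
        // Stand-in for process.getInputStream().
        InputStream fakeStdout = new ByteArrayInputStream(
                "line 1\nline 2\n".getBytes(StandardCharsets.UTF_8));
        Thread pump = startPump("stdout-pump", fakeStdout, lines);
        pump.join(1000); // bounded wait, never an unbounded join()
        System.out.println(lines);
    }
}
```

Note that interrupt() is not guaranteed to unblock a thread that is blocked in 
a stream read; in this sketch it is the daemon flag that keeps a stuck pump 
from holding the JVM open.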

>re 4: I'd expect the io pipes to get EOF when the process is killed. 

If the subprocess is hanging in a device driver, it might not be killed in a 
timely fashion, so the EOF might not arrive immediately.  Rare, but not 
impossible.  
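
Because the EOF may be late, a kill path can bound how long it waits for the 
pipe threads instead of joining them unconditionally. The following is only a 
sketch under that assumption; BoundedStopSketch and stopAndJoin are 
hypothetical names, and a sleeping thread stands in for a pipe reader.

```java
class BoundedStopSketch {
    /**
     * Interrupts 'worker' and waits at most 'timeoutMillis' for it to exit.
     * Returns true if the thread terminated within the timeout.
     * 'timeoutMillis' must be greater than 0, since join(0) waits forever.
     */
    static boolean stopAndJoin(Thread worker, long timeoutMillis) {
        worker.interrupt();
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            // Preserve the caller's interrupt status rather than swallowing it.
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        Thread sleeper = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // Interrupted: fall through and let the thread end.
            }
        }, "stand-in-pipe-reader");
        sleeper.setDaemon(true);
        sleeper.start();
        System.out.println("stopped in time: " + stopAndJoin(sleeper, 2_000));
    }
}
```

If stopAndJoin returns false, the caller can log the stuck thread and move on; 
as long as the thread is a daemon, it will not block JVM exit.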


> TaskRunner io pipes are not setDaemon(true), cleanup and exception errors are 
> incomplete, max heap too small
> ------------------------------------------------------------------------------------------------------------
>
>          Key: NUTCH-152
>          URL: http://issues.apache.org/jira/browse/NUTCH-152
>      Project: Nutch
>         Type: Bug
>   Components: fetcher
>     Versions: 0.8-dev
>  Environment: all
>     Reporter: Paul Baclace
>  Attachments: TaskRunner.java.patch
>
> 1. io pipes should be setDaemon(true) so that process cannot hang.
> 2. error messages for Exceptions are incomplete since e.getMessage() is used 
> and it can be null or empty (a NullPointerException typically has no 
> message).  Change this to e.toString(), which also includes the exception 
> class name.
> 3. a separate thread is not used for the subprocess stdout pipe, but it must 
> be a separate thread if setDaemon(true).
> 4. TaskRunner.kill()  does not stop the io pipe threads, but it should.
> 5. If InterruptedException occurs, it was assumed to be for the current 
> (main) thread, but it should check this with Thread.interrupted() otherwise 
> spurious thread interrupts will be rethrown as IOException.
> 6. A recent run had some Tasktracker child processes that ran out of heap.  
> The default max heap size should be larger.
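
Item 2 of the list above can be illustrated with a short sketch. The class 
ExceptionTextSketch and its describe helper are hypothetical and only show 
the difference between the two calls; they are not part of the patch.

```java
import java.io.IOException;

class ExceptionTextSketch {
    /** Returns text suitable for a log line: class name plus message, if any. */
    static String describe(Throwable e) {
        // Throwable.toString() yields "<class name>" or "<class name>: <message>".
        return e.toString();
    }

    public static void main(String[] args) {
        Throwable npe = new NullPointerException();
        // getMessage() is null here, so logging it alone says nothing useful.
        System.out.println("getMessage(): " + npe.getMessage());
        System.out.println("toString():   " + describe(npe));
        System.out.println("toString():   " + describe(new IOException("disk full")));
    }
}
```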



