Fwd: bug in Nutch wiki - FAQ

2005-12-26 Thread Stefan Groschupf
I'm sending this to you because you are active on the nutch-users list and I am too lazy to subscribe at this particular moment. Please pass on / act as you see fit. Wiki itself seems immutable at least to the likes of me. -Jeff = currently By default the file plugin is

[jira] Commented: (NUTCH-151) CommandRunner can hang after the main thread exec is finished and has inefficient busy loop

2005-12-26 Thread Paul Baclace (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-151?page=comments#action_12361242 ] Paul Baclace commented on NUTCH-151: Analysis: CommandRunner uses CyclicBarrier to synchronize the thread that does the exec (let's call it the main thread) with the io

[jira] Updated: (NUTCH-151) CommandRunner can hang after the main thread exec is finished and has inefficient busy loop

2005-12-26 Thread Paul Baclace (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-151?page=all ] Paul Baclace updated NUTCH-151: --- Attachment: CommandRunner.java Minimal required changes to fix bug NUTCH-151: 1. The pipe io threads should be daemons. 2. The main thread should always
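Item 1 above (pipe io threads as daemons) can be illustrated with a small standalone sketch. This is not the attached CommandRunner.java; the class name DaemonPipeSketch and the method startDaemonPump are hypothetical names chosen for illustration, and the only point shown is that a stream-copying thread marked as a daemon does not keep the JVM alive once the main thread is done.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical sketch, not the actual Nutch CommandRunner code.
final class DaemonPipeSketch {

    // Starts a thread that copies everything from 'in' to 'out'.
    // The thread is a daemon, so it cannot block JVM shutdown.
    static Thread startDaemonPump(final InputStream in, final OutputStream out) {
        Thread pump = new Thread(() -> {
            byte[] buf = new byte[4096];
            try {
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                out.flush();
            } catch (IOException e) {
                // Sketch only: stop copying when the pipe fails or is closed.
            }
        }, "pipe-pump");
        pump.setDaemon(true); // must be called before start()
        pump.start();
        return pump;
    }

    public static void main(String[] args) throws InterruptedException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        Thread pump = startDaemonPump(
                new ByteArrayInputStream("hello".getBytes()), sink);
        pump.join(); // wait for the copy to finish before reading the sink
        System.out.println("daemon=" + pump.isDaemon() + " copied=" + sink);
    }
}
```

In a real process runner the two streams would presumably be the child process's stdout/stderr and the caller's sinks; that wiring is an assumption and is left out here.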

[jira] Updated: (NUTCH-151) CommandRunner can hang after the main thread exec is finished and has inefficient busy loop

2005-12-26 Thread Paul Baclace (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-151?page=all ] Paul Baclace updated NUTCH-151: --- Attachment: CommandRunner.java.patch Here is the patch for CommandRunner (previously, I attached the actual file). CommandRunner can hang after the main thread

[jira] Updated: (NUTCH-152) TaskRunner io pipes are not setDaemon(true), cleanup and exception errors are incomplete, max heap too small

2005-12-26 Thread Paul Baclace (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-152?page=all ] Paul Baclace updated NUTCH-152: --- Attachment: TaskRunner.java.patch The patch addresses each issue listed in the detailed description of this bug. The detailed description is suitable as a

[jira] Created: (NUTCH-153) TextParser is only supposed to parse plain text, but if given postscript, it can take hours and then fail

2005-12-26 Thread Paul Baclace (JIRA)
TextParser is only supposed to parse plain text, but if given postscript, it can take hours and then fail - Key: NUTCH-153 URL: http://issues.apache.org/jira/browse/NUTCH-153

[jira] Updated: (NUTCH-153) TextParser is only supposed to parse plain text, but if given postscript, it can take hours and then fail

2005-12-26 Thread Paul Baclace (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-153?page=all ] Paul Baclace updated NUTCH-153: --- Attachment: TextParser.java.patch A patch to reject files with %!PS-Adobe in the first 40 characters of the file. TextParser is only supposed to parse plain
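The check described in this entry (reject content with %!PS-Adobe in the first 40 characters) might look roughly like the sketch below. It is not the attached TextParser.java.patch; the class name PostScriptSniffer, the method looksLikePostScript, and the choice of ISO-8859-1 for decoding the leading bytes are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the actual Nutch patch.
final class PostScriptSniffer {

    static final String PS_MAGIC = "%!PS-Adobe";
    static final int SNIFF_LIMIT = 40; // only the first 40 characters are inspected

    // Returns true if the PostScript marker lies entirely within
    // the first SNIFF_LIMIT bytes of the content.
    static boolean looksLikePostScript(byte[] content) {
        int len = Math.min(content.length, SNIFF_LIMIT);
        // ISO-8859-1 maps each byte to one char, so no decoding errors occur.
        String head = new String(content, 0, len, StandardCharsets.ISO_8859_1);
        return head.contains(PS_MAGIC);
    }

    public static void main(String[] args) {
        byte[] ps = "%!PS-Adobe-3.0\n%%Title: demo\n"
                .getBytes(StandardCharsets.ISO_8859_1);
        byte[] txt = "Just some plain text.\n"
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(looksLikePostScript(ps));
        System.out.println(looksLikePostScript(txt));
    }
}
```

A plain-text parser could call such a check before doing any work and return a failed parse status right away, instead of spending hours on a PostScript file.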

[jira] Commented: (NUTCH-128) second configuration nodes overwrites first node

2005-12-26 Thread Paul Baclace (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-128?page=comments#action_12361254 ] Paul Baclace commented on NUTCH-128: In general, it might be helpful to issue an INFO level log msg whenever a configuration attribute is overridden. If the override
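The suggestion in this comment (an INFO level log message whenever a configuration attribute is overridden) could be sketched as follows. This is not Nutch's configuration class; OverrideLoggingConf, its methods, and the 'source' label are hypothetical, and java.util.logging is used only to keep the example self-contained.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.logging.Logger;

// Hypothetical sketch, not the actual Nutch configuration code.
final class OverrideLoggingConf {

    private static final Logger LOG =
            Logger.getLogger(OverrideLoggingConf.class.getName());

    private final Map<String, String> values = new LinkedHashMap<>();
    private int overrides = 0;

    // Sets a property; logs at INFO level when an existing value is replaced
    // by a different one. 'source' names the file that supplied the value.
    void set(String name, String value, String source) {
        String previous = values.put(name, value);
        if (previous != null && !previous.equals(value)) {
            overrides++;
            LOG.info("Property '" + name + "' overridden by " + source
                    + ": '" + previous + "' -> '" + value + "'");
        }
    }

    String get(String name) {
        return values.get(name);
    }

    int overrideCount() {
        return overrides;
    }

    public static void main(String[] args) {
        OverrideLoggingConf conf = new OverrideLoggingConf();
        conf.set("example.threads", "10", "default.xml");
        conf.set("example.threads", "20", "site.xml"); // triggers the INFO message
        System.out.println(conf.get("example.threads"));
    }
}
```

The property name example.threads and the file names default.xml and site.xml are placeholders, not real Nutch settings.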