Hi, I have a problem with last Friday's nightly build. When I try to fetch my segment, the fetch process freezes with the message "Aborting with 10 hung threads". After failing, Nutch tries to run the same URLs on another tasktracker, but that fails again.

I have tried turning fetcher.parse off, and I have tried both protocol-httpclient and protocol-http.

My nutch-site.xml:

<property>
 <name>fs.default.name</name>
 <value>linux3:50000</value>
 <description>The name of the default file system.  Either the
 literal string "local" or a host:port for NDFS.</description>
</property>

<property>
 <name>mapred.job.tracker</name>
 <value>linux3:50020</value>
 <description>The host and port that the MapReduce job tracker runs
 at.  If "local", then jobs are run in-process as a single map
 and reduce task.
 </description>
</property>

<property>
 <name>plugin.includes</name>
 <value>protocol-httpclient|urlfilter-regex|parse-(text|html|js|pdf|msword)|index-basic|query-(basic|site|url)</value>
 <description>Regular expression naming plugin directory names to
 include.  Any plugin not matching this expression is excluded.
 In any case you need at least include the nutch-extensionpoints plugin. By
 default Nutch includes crawling just HTML and plain text via HTTP,
 and basic indexing and search plugins.
 </description>
</property>

<property>
 <name>http.content.limit</name>
 <value>-1</value>
 <description>The length limit for downloaded content, in bytes.
 If this value is nonnegative (>=0), content longer than it will be truncated;
 otherwise, no truncation at all.
 </description>
</property>

<property>
 <name>fetcher.parse</name>
 <value>false</value>
 <description>If true, fetcher will parse content.</description>
</property>


