Hi,

I can reproduce the problem with the latest version out of svn. :-( I played around with it for a while (most of the day, in fact :-) and after increasing the following parameters

        <property>
          <name>mapred.task.timeout</name>
          <value>6000000</value>
          <description>The number of milliseconds before a task will be
          terminated if it neither reads an input, writes an output, nor
          updates its status string.
          </description>
        </property>


        <property>
          <name>mapred.child.heap.size</name>
          <value>2000m</value>
          <description>The heap size (-Xmx) that will be used for task
          tracker child processes.</description>
        </property>

the error seems to disappear. But I don't understand why; it's just guessing in the dark so far.
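For what it's worth, this is what those values work out to (the defaults
below are what I believe ships in mapred-default.xml, so please verify
against your copy):

        mapred.task.timeout:     6,000,000 ms / 60,000 ms per minute = 100 minutes
                                 (shipped default 600,000 ms = 10 minutes)
        mapred.child.heap.size:  2000m (shipped default 200m)

Both limits went up by a factor of ten. One possible (unverified!)
explanation: the child JVM runs short on memory, spends its time in
garbage collection, stops reporting progress, and is then killed as
"hung". Raising the heap and the timeout would mask exactly that.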

        Michael



Håvard W. Kongsgård wrote:

Hi, I have a problem with last Friday's nightly build. When I try to fetch my segment, the fetch process freezes with "Aborting with 10 hung threads". After failing, Nutch tries to run the same URLs on another tasktracker, but that fails too.
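For context, "Aborting with N hung threads" is the fetcher's watchdog
giving up on download threads that have not finished in time. The pattern
boils down to something like the following generic sketch; this is not
the actual Nutch source, and all names in it are invented for
illustration:

        // Generic hung-thread watchdog sketch; names are illustrative only.
        public class FetcherWatchdogSketch {

          static void awaitOrAbort(Thread[] workers, long timeoutMs)
              throws InterruptedException {
            long deadline = System.currentTimeMillis() + timeoutMs;
            for (Thread t : workers) {
              long remaining = deadline - System.currentTimeMillis();
              if (remaining > 0) {
                t.join(remaining);       // wait, but only until the deadline
              }
            }
            int hung = 0;
            for (Thread t : workers) {
              if (t.isAlive()) hung++;   // still running past the deadline
            }
            if (hung > 0) {
              // The situation behind "Aborting with N hung threads":
              // give up instead of blocking the task forever.
              System.err.println("Aborting with " + hung + " hung threads.");
            }
          }
        }

A thread typically ends up like this when it is blocked in I/O or starved
by the JVM, which would be consistent with the heap theory above.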

I have tried turning fetcher.parse off, and switching between protocol-httpclient and protocol-http.

nutch-site.xml

<property>
 <name>fs.default.name</name>
 <value>linux3:50000</value>
 <description>The name of the default file system.  Either the
 literal string "local" or a host:port for NDFS.</description>
</property>

<property>
 <name>mapred.job.tracker</name>
 <value>linux3:50020</value>
 <description>The host and port that the MapReduce job tracker runs
 at.  If "local", then jobs are run in-process as a single map
 and reduce task.
 </description>
</property>

<property>
 <name>plugin.includes</name>
<value>protocol-httpclient|urlfilter-regex|parse-(text|html|js|pdf|msword)|index-basic|query-(basic|site|url)</value>
 <description>Regular expression naming plugin directory names to
 include.  Any plugin not matching this expression is excluded.
 In any case you need at least to include the nutch-extensionpoints plugin. By
 default Nutch includes crawling just HTML and plain text via HTTP,
 and basic indexing and search plugins.
 </description>
</property>

<property>
 <name>http.content.limit</name>
 <value>-1</value>
 <description>The length limit for downloaded content, in bytes.
 If this value is nonnegative (>= 0), content longer than this limit
 will be truncated; otherwise no truncation is applied.
 </description>
</property>

<property>
 <name>fetcher.parse</name>
 <value>false</value>
 <description>If true, fetcher will parse content.</description>
</property>
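One consequence of fetcher.parse=false worth spelling out: the fetched
segment then has to be parsed in a separate step before indexing. If I
remember the command line of this Nutch branch correctly, that step is
(the segment path below is a placeholder):

        bin/nutch parse <segment-dir>

with <segment-dir> being the segment that was just fetched.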


--
Michael Nebel
http://www.nebel.de/
http://www.netluchs.de/
