TIMEOUT parsing XXX with org.apache.nutch.parse.html.HtmlParser@424c414 - Unable to successfully parse content XXX of type text/html
Similar errors can be resolved by changing the *nutch-site.xml* file. You can paste the content of your nutch-site.xml here.

On Sun, May 29, 2011 at 3:20 AM, yuegary [via Lucene] <[email protected]> wrote:

> Hi
>
> I am running into the exact same problem using Nutch 1.2.
>
> 2011-05-28 13:48:42,019 WARN parse.ParseUtil - TIMEOUT parsing XXX with org.apache.nutch.parse.html.HtmlParser@424c414
> 2011-05-28 13:48:42,019 WARN parse.ParseUtil - Unable to successfully parse content XXX of type text/html
> 2011-05-28 13:48:42,352 WARN parse.ParseUtil - TIMEOUT parsing XXX with org.apache.nutch.parse.html.HtmlParser@424c414
> 2011-05-28 13:48:42,352 WARN parse.ParseUtil - Unable to successfully parse content XXX of type text/html
> 2011-05-28 13:48:42,492 INFO fetcher.Fetcher - -activeThreads=10, spinWaiting=6, fetchQueues.totalSize=500
> 2011-05-28 13:48:43,492 INFO fetcher.Fetcher - -activeThreads=10, spinWaiting=6, fetchQueues.totalSize=500
> 2011-05-28 13:48:44,492 INFO fetcher.Fetcher - -activeThreads=10, spinWaiting=6, fetchQueues.totalSize=500
> *2011-05-28 13:48:44,493 WARN fetcher.Fetcher - Aborting with 10 hung threads.*
>
> This happened around depth 2-3 of 8 (non-deterministically).
> Have you resolved this problem at your end?
>
> I've read through all the previous email threads related to this topic and
> also tried some of the suggested solutions. None of them work (those are for
> older versions of Nutch anyway, and the fixes have already been committed to
> the branch).
>
> Does anyone else have any clue? This is a blocker!
> Thx!
--
Kumar Anurag

View this message in context: http://lucene.472066.n3.nabble.com/Nutch-1-2-fetcher-aborting-with-N-hung-threads-tp2411724p2997594.html
Sent from the Nutch - User mailing list archive at Nabble.com.
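
The reply above does not say which settings in nutch-site.xml were changed, so the fragment below is only a guess at the kind of override that might be meant. It assumes the `parser.timeout` and `fetcher.parse` properties from nutch-default.xml are the relevant ones; the values are illustrative and were not given anywhere in the thread.

```xml
<?xml version="1.0"?>
<!-- Hypothetical nutch-site.xml overrides. The property names come from
     nutch-default.xml; the values are assumptions, not settings reported
     in this thread. -->
<configuration>
  <property>
    <name>parser.timeout</name>
    <!-- Seconds allowed for parsing one document before ParseUtil gives up
         and logs "TIMEOUT parsing ..."; -1 turns the limit off. -->
    <value>60</value>
  </property>
  <property>
    <name>fetcher.parse</name>
    <!-- false = do not parse inside the fetcher threads; parsing then has
         to be run as a separate step on the fetched segment. -->
    <value>false</value>
  </property>
</configuration>
```

Whether this removes the "Aborting with N hung threads" warning probably depends on why the threads hang. Taking parsing out of the fetcher only helps if slow parses are what keep the threads busy.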

