Hi Tejas,

Thanks, I know about this setting and have already increased it to 1 minute,
which gets a higher percentage of pages parsed successfully.

But more than 1 minute for every second page is tremendously slow,
and the page sizes themselves don't look like the issue, since with HBase
they are parsed an order of magnitude faster. The root cause is somewhere
else :)

What looks suspicious to me is that the map task was started on only one node
(actually, there were several attempts on different nodes, but always on only
one node at a time).



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Slow-parse-on-hadoop-tp4040215p4040911.html
Sent from the Nutch - User mailing list archive at Nabble.com.
