Hi,
NUTCH-1420 is now committed, so you can update your local copy of Nutch 2.x
if you are working from HEAD source.
There was also another issue here where the parse was running on only one
node in the cluster. Is this also the case for you?

On Tue, Feb 19, 2013 at 2:48 PM, t_gra <[email protected]> wrote:

> Another workaround that can improve the situation a bit (but not solve
> all problems) would be to ignore pages with content larger than some
> given size. I will try to see if that helps parse at least some pages :)
>
Yeah, for this exact reason this is activated by default.
Lewis
