Hi,

I haven't tried the 0.8 version yet; I might give it a
look if I can find the time.

I've investigated the problem a little more, and it
seems to be related to having a high value for
"http.content.limit" while parsing PDF files.  (The
site probably only exceeds the default limit for PDF
files, so it may just affect files above that size.)
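For anyone hitting the same thing: the relevant setting lives in Nutch's local config override. The property name and its 64 KB default are real; the values and comments below are just a sketch of the two knobs involved, assuming a conf/nutch-site.xml override file.

```xml
<!-- conf/nutch-site.xml: local override of http.content.limit.
     The default is 65536 bytes; fetched content beyond the limit
     is truncated, and a truncated PDF can fail in the parser.
     Raising it (or setting -1 for unlimited) pulls whole PDFs in,
     which is where the reported problem seems to appear. -->
<property>
  <name>http.content.limit</name>
  <value>65536</value>
</property>
```

Since the goal here is to skip PDFs entirely, an alternative sketch is to exclude them at fetch time with a line in conf/regex-urlfilter.txt (a real Nutch mechanism; exact file name varies by version):

```
-\.pdf$
```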

I'm hoping to avoid indexing the PDFs, so I'm not
going to worry about it for the moment.

thanks,


Julian.


--- sudhendra seshachala <[EMAIL PROTECTED]> wrote:

> Okay.
>   Have you tried the 0.8 version? It seems to be
> more stable than the 0.7.x you are using.
>   It is a bit different too, with Hadoop and Nutch
> being separate.
>   I had a few issues using 0.7.x, but with the
> nightly build (0.8) I was up to speed comparatively
> sooner.
>
>   I hope this helps. I am not trying to dodge the
> problem, just noting that the next release is more
> stable and, moreover, there is no backward
> compatibility with 0.8.x. (That is what I read in
> one of the mails in the archive.) You are better off
> using 0.8.
>
>   Thanks
>   Sudhi


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general