Hi,

I haven't tried the 0.8 version yet; I might give it a
look if I can find the time.

I've investigated the problem a little more, and it
seems to be related to having a high value for
"http.content.limit" while parsing PDF files.  (The
site probably only exceeds the default value for PDF
files, so it might just be files above that size.)

I'm hoping to avoid indexing the PDFs, so I'm not
going to worry about it for the moment.
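For what it's worth, one way to skip the PDFs entirely is to add the pdf suffix to the exclusion pattern in the regex URL filter (depending on the version, the file is conf/regex-urlfilter.txt or conf/crawl-urlfilter.txt); the pattern below is a sketch based on the stock filter file:

```
# regex-urlfilter.txt: skip file types we don't want to fetch.
# The stock file already excludes common binary suffixes; adding
# pdf/PDF keeps the fetcher from downloading them at all.
-\.(gif|GIF|jpg|JPG|ico|ICO|css|zip|ppt|mpg|xls|gz|exe|png|pdf|PDF)$
```

Alternatively, dropping parse-pdf from the plugin.includes property stops PDFs from being parsed even when they are fetched.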

Thanks,


Julian.


--- sudhendra seshachala <[EMAIL PROTECTED]> wrote:

> Okay.
>   Have you tried the 0.8 version? It seems to be
> more stable than the 0.7.x (the one you are using).
>   It is a bit different too, with Hadoop and Nutch
> being separate.
>   I had a few issues using 0.7.x, but with the
> nightly build (0.8) I was up to speed comparatively
> sooner.
>    
>   I hope this helps. I am not trying to dodge the
> problem; it is just that the next release is more
> stable and, moreover, there is no backward
> compatibility with 0.8.x. (That is what I read in
> one of the mails in the archive.) You are better
> off using 0.8.
>    
>   Thanks
>   Sudhi
