Hi, I haven't tried the 0.8 version yet; I might give it a look if I can find the time.
I've investigated the problem a little more, and it seems to be related to having a high value for "http.content.limit" while parsing PDF files. (The site probably only exceeds the default value for PDF files, so it may just be files above that size.) I'm hoping to avoid indexing the PDFs, so I'm not going to worry about it at the moment.

Thanks,
Julian

--- sudhendra seshachala <[EMAIL PROTECTED]> wrote:
> Okay.
> Have you tried the 0.8 version? It seems to be more stable than the
> 0.7.x you are using. It is a bit different too, with Hadoop and Nutch
> being separate.
> I had a few issues using 0.7.x, but with the nightly build (0.8) I was
> up to speed comparatively sooner.
>
> I hope this helps. I am not trying to dodge the problem, just noting
> that the next release is more stable and, moreover, there is no
> backward compatibility with 0.8.x. (That is what I read in one of the
> mails in the archive.) You are better off using 0.8.
>
> Thanks
> Sudhi

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
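For reference, here is a minimal sketch of the settings discussed above, assuming a stock Nutch 0.7 conf/ directory. The property name is the standard one; the value shown is only illustrative (it matches the shipped default in nutch-default.xml), and overrides normally go in nutch-site.xml:

```xml
<!-- conf/nutch-site.xml (override of nutch-default.xml) -->
<!-- Caps the number of bytes downloaded per document; content above
     this limit is truncated, which can break the PDF parser on large
     files. 65536 is the shipped default; raise or lower as needed. -->
<property>
  <name>http.content.limit</name>
  <value>65536</value>
</property>
```

To skip PDFs entirely rather than tune the limit, one option is a URL filter rule such as `-\.pdf$` in conf/regex-urlfilter.txt (or conf/crawl-urlfilter.txt for the crawl tool); another is to leave the parse-pdf plugin out of the plugin.includes property. Which file applies depends on how the crawl is run, so treat these as a sketch rather than a drop-in fix.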
