Re: Nutch 500 Error

2006-04-06 Thread TDLN
My guess is you have to override the searcher.dir property in nutch-site.xml and have it point to your crawl dir. Rgrds, Thomas On 4/5/06, Paul Stewart [EMAIL PROTECTED] wrote: Hi there... I was having a number of problems with my install, mainly because I'm not used to Tomcat and/or Nutch
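A minimal sketch of the searcher.dir override suggested above, assuming the crawl output lives in a hypothetical /path/to/crawl directory (the path is a placeholder, not taken from this thread). The property element goes inside the existing root element of nutch-site.xml:

```xml
<!-- Place inside the existing root element of nutch-site.xml. -->
<property>
  <name>searcher.dir</name>
  <!-- Hypothetical path: replace with your own crawl directory. -->
  <value>/path/to/crawl</value>
</property>
```

The web application likely needs to be restarted or reloaded before the change takes effect.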

RE: Nutch 500 Error

2006-04-06 Thread Paul Stewart
Thanks. Tried that... Same error: HTTP Status 500 - type Exception report message description The server encountered an internal error () that prevented it from fulfilling this request. exception

please help!! inverlinks not work properly with more than 5 input parts (0.8)

2006-04-06 Thread Olive g
Hi gurus, I posted questions a few days ago on how to do incremental crawls on 0.8; thank you all for your help. However, when I tried the workaround (see http://www.mail-archive.com/nutch-user%40lucene.apache.org/msg04111.html), invertlinks crashed when there were more than 5 input parts.

Re: please help!! inverlinks not work properly with more than 5 input parts (0.8)

2006-04-06 Thread Andrzej Bialecki
Olive g wrote: Hi gurus, I posted questions a few days ago on how to do incremental crawls on 0.8; thank you all for your help. However, when I tried the workaround (see http://www.mail-archive.com/nutch-user%40lucene.apache.org/msg04111.html), invertlinks crashed when there were more than

.classpath and .project for 0.8

2006-04-06 Thread TDLN
I am (finally) moving my installation to 0.8-dev. I was wondering if one of the developers could post their .classpath and .project Eclipse settings files. I have seen those files posted for 0.7, so I thought I might as well ask. Rgrds, Thomas

RuntimeException running Generator

2006-04-06 Thread TDLN
nutch-users - in both the whole-web and intranet scenarios, I am now getting: 060406 154710 Generator: Partitioning selected urls by host, for politeness. 060406 154710 parsing jar:file:/home/tdelnoij/dev/sandbox/nutch-0.8-dev/lib/hadoop-0.1.0.jar!/hadoop-default.xml 060406 154710 parsing

Re: RuntimeException running Generator

2006-04-06 Thread TDLN
Oops, this one seems to have been fixed already (see http://mail-archive.com/nutch-user%40lucene.apache.org/msg04130.html). I will give it a shot with the latest nightly build. Rgrds, Thomas On 4/6/06, TDLN [EMAIL PROTECTED] wrote: nutch-users - in both the whole-web and intranet scenarios, I am

RE: Nutch 500 Error

2006-04-06 Thread sudhendra seshachala
It should be java -version, I think. Paul Stewart [EMAIL PROTECTED] wrote: Thanks for the reply... I apologize as I'm very new to the Java world... :) I am running the following: Fedora Core 4, Apache Tomcat 5.5.16 (binary download from Tomcat site installed to /usr/local/tomcat5), jre1.5.0_06

Re: .classpath and .project for 0.8

2006-04-06 Thread TDLN
Thanks so much for your help, but the attachment did not arrive. I think you have to resend it, or paste the text of the files into your post. Rgrds, Thomas On 4/6/06, Dennis Kubes [EMAIL PROTECTED] wrote: Here are my project and classpath files. I set src/java in the classpath along

Re: .classpath and .project for 0.8

2006-04-06 Thread TDLN
That's it, thanks again! On 4/6/06, Dennis Kubes [EMAIL PROTECTED] wrote: Here they are zipped up. -Original Message- From: TDLN [mailto:[EMAIL PROTECTED] Sent: Thursday, April 06, 2006 11:44 AM To: nutch-user@lucene.apache.org Subject: Re: .classpath and .project for 0.8 Thanks

Re: latest build throws error - critical

2006-04-06 Thread Raghavendra Prabhu
Check out the code with svn and keep it updated, so that you are in sync with the main code. These bugs are usually fixed by the group in the blink of an eye, with good stability. Update using an svn client. Hope that helps. Rgds Prabhu On 4/6/06, TDLN [EMAIL PROTECTED] wrote: I am seeing the same

Re: latest build throws error - critical

2006-04-06 Thread Andrzej Bialecki
Raghavendra Prabhu wrote: This is an error which I did not face yesterday. When I ran the crawl with the updated build today, I got this error. Apply the attached patch (go to the trunk/ dir and execute 'patch -p0 < patch.txt'), and please report if it fixes the problem. -- Best regards,

Re: details: stackoverflow error

2006-04-06 Thread Piotr Kosiorowski
Which Java version do you use? Is it the same for all URLs or only for a specific one? If the URL you are trying to crawl is public, you can send it to me (off list if you wish) and I can check it on my machine. Regards Piotr Rajesh Munavalli wrote: I had earlier posted this message to the list but

Re: details: stackoverflow error

2006-04-06 Thread Rajesh Munavalli
Java version: JSDK 1.4.2_08. URL seed: http://www.math.psu.edu/MathLists/Contents.html I even tried allocating more stack memory using -Xss and more process memory using the -Xms option. However, if I run the individual tools (fetchlisttool, fetcher, updatedb, etc.) separately from the shell, it works fine. Thanks,

Re: details: stackoverflow error

2006-04-06 Thread Rajesh Munavalli
Forgot to mention one more parameter. Modify the crawl-urlfilter to accept any URL. On 4/6/06, Rajesh Munavalli [EMAIL PROTECTED] wrote: Java version: JSDK 1.4.2_08 URL Seed: http://www.math.psu.edu/MathLists/Contents.html I even tried allocating more stack memory using -Xss, process memory