My guess is you have to override the searcher.dir property in
nutch-site.xml and have it point to your crawl dir.
Rgrds, Thomas
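[Editor's note: for reference, a searcher.dir override in conf/nutch-site.xml might look like the sketch below; the crawl path is an example, not one from this thread.]

```xml
<?xml version="1.0"?>
<!-- conf/nutch-site.xml: local overrides of nutch-default.xml -->
<configuration>
  <property>
    <name>searcher.dir</name>
    <!-- Example path; point this at your own crawl directory. -->
    <value>/home/nutch/crawl</value>
  </property>
</configuration>
```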
On 4/5/06, Paul Stewart [EMAIL PROTECTED] wrote:
Hi there...
I was having a number of problems with my install, mainly because I'm
not used to Tomcat and/or Nutch
Thanks.. Tried that ... Same error
HTTP Status 500 -
type Exception report
message
description The server encountered an internal error () that prevented
it from fulfilling this request.
exception
Hi gurus,
I posted questions on how to do incremental crawls on 0.8 a few days ago and
thank you all for your help. However, when I tried to workaround (see
http://www.mail-archive.com/nutch-user%40lucene.apache.org/msg04111.html),
invertlinks crashed when there were more than 5 input parts.
Olive g wrote:
Hi gurus,
I posted questions on how to do incremental crawls on 0.8 a few days
ago and thank you all for your help. However, when I tried to
workaround (see
http://www.mail-archive.com/nutch-user%40lucene.apache.org/msg04111.html),
invertlinks crashed when there were more than 5 input parts.
I am (finally) moving my installation to 0.8-dev. Now I was wondering
if one of the developers
could post their .classpath and .project Eclipse settings files. I
have seen those files being posted for 0.7, so I thought I might as
well ask.
Rgrds, Thomas
nutch-users -
both in the whole web and intranet scenarios, I am now getting
060406 154710 Generator: Partitioning selected urls by host, for politeness.
060406 154710 parsing
jar:file:/home/tdelnoij/dev/sandbox/nutch-0.8-dev/lib/hadoop-0.1.0.jar!/hadoop-default.xml
060406 154710 parsing
Oops, this one seems to have been fixed already:
http://mail-archive.com/nutch-user%40lucene.apache.org/msg04130.html
I will give it a shot with the last nightly build.
Rgrds, Thomas
On 4/6/06, TDLN [EMAIL PROTECTED] wrote:
nutch-users -
both in the whole web and intranet scenarios, I am
It should be 'java -version', I think.
Paul Stewart [EMAIL PROTECTED] wrote: Thanks for the reply... I apologize,
as I'm very new to the Java world... :)
I am running the following:
Fedora Core 4
Apache Tomcat 5.5.16 (binary download from Tomcat site installed to
/usr/local/tomcat5)
jre1.5.0_06
Thanks so much for your help, but the attachment did not arrive. I
think you have to resend it or copy and paste the text in the files in
your post.
Rgrds, Thomas
On 4/6/06, Dennis Kubes [EMAIL PROTECTED] wrote:
Here are my project and classpath files. I set src/java in the classpath
along
That's it, thanks again!
On 4/6/06, Dennis Kubes [EMAIL PROTECTED] wrote:
Here they are zipped up.
-Original Message-
From: TDLN [mailto:[EMAIL PROTECTED]
Sent: Thursday, April 06, 2006 11:44 AM
To: nutch-user@lucene.apache.org
Subject: Re: .classpath and .project for 0.8
Thanks
Check out the code with svn and update it, so you are in sync with the main
codebase. Bugs like this are usually fixed by the group in the blink of an
eye, with good stability. Update using an svn client.
Hope that helps.
Rgds
Prabhu
On 4/6/06, TDLN [EMAIL PROTECTED] wrote:
I am seeing the same
Raghavendra Prabhu wrote:
This is an error which I did not face yesterday. When I ran the crawl with an
updated build today, I got this error:
Apply the attached patch (go to the trunk/ dir and execute 'patch -p0 <
patch.txt'), and please report if it fixes the problem.
--
Best regards,
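[Editor's note: the 'patch -p0' invocation above applies paths relative to the top of the tree, since -p0 strips no leading path components. A self-contained sketch of the mechanics, with made-up file names and contents rather than the actual attachment:]

```shell
# Work in a scratch directory and build a tiny "source tree".
cd "$(mktemp -d)"
mkdir -p trunk/src
printf 'old line\n' > trunk/src/File.txt
# A minimal unified diff whose paths are relative to the current
# directory, which is why it is applied with -p0 (strip nothing).
cat > patch.txt <<'EOF'
--- trunk/src/File.txt
+++ trunk/src/File.txt
@@ -1 +1 @@
-old line
+new line
EOF
patch -p0 < patch.txt
```

After this, trunk/src/File.txt contains the patched line.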
Which Java version do you use?
Is it the same for all URLs or only for a specific one?
If the URL you are trying to crawl is public, you can send it to me (off list
if you wish) and I can check it on my machine.
Regards
Piotr
Rajesh Munavalli wrote:
I had earlier posted this message to the list but
Java version: JSDK 1.4.2_08
URL Seed: http://www.math.psu.edu/MathLists/Contents.html
I even tried allocating more stack memory using the -Xss option and more
process memory using the -Xms option. However, if I run the individual tools
(fetchlisttool, fetcher, updatedb, etc.) separately from the shell, it works
fine.
Thanks,
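[Editor's note: -Xss and -Xms/-Xmx are standard JVM memory flags. A sketch of how such options might be passed to a crawl run; the variable name, class name, and sizes below are illustrative, not taken from this thread:]

```shell
# Illustrative JVM memory flags:
#   -Xss: per-thread stack size; -Xms/-Xmx: initial/maximum heap size.
JVM_MEM_OPTS="-Xss1m -Xms256m -Xmx512m"
# These would be handed to the JVM that runs the crawl, for example:
CRAWL_CMD="java $JVM_MEM_OPTS org.apache.nutch.crawl.Crawl urls -depth 3"
echo "$CRAWL_CMD"
```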
Forgot to mention one more parameter. Modify the crawl-urlfilter to accept
any URL.
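[Editor's note: as a sketch, an "accept everything" conf/crawl-urlfilter.txt could look like the fragment below; the suffix list is just an example, and the final '+.' line is what admits any URL.]

```
# skip some common binary/image suffixes
-\.(gif|jpg|png|css|zip|gz)$
# accept everything else
+.
```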
On 4/6/06, Rajesh Munavalli [EMAIL PROTECTED] wrote:
Java version: JSDK 1.4.2_08
URL Seed: http://www.math.psu.edu/MathLists/Contents.html
I even tried allocating more stack memory using -Xss, process memory