Re: [Nutch-general] Memory problem while running Nutch

2006-06-15 Thread Nutch Newbie
Hi, Have a look at the Java heap size setting in the bin/nutch script and adjust it to your environment. You should see a line like JAVA_HEAP_MAX=-Xmx1000m in the bin/nutch script. rgds On 6/15/06, Jayant Kumar Gandhi [EMAIL PROTECTED] wrote: Hi, I installed Tomcat using cPanel/WHM as root. It
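A quick way to confirm that a heap flag such as -Xmx1000m actually reaches the JVM is to print the maximum heap the runtime reports. This is only a minimal sketch using the standard Runtime API; the class name HeapCheck is made up for illustration and is not part of Nutch.

```java
public class HeapCheck {
    /** Maximum heap the running JVM will try to use, in MiB. */
    public static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Hypothetical helper for illustration; run with the same -Xmx flag as bin/nutch.
        System.out.println("Max heap available to this JVM: " + maxHeapMiB() + " MiB");
    }
}
```

Run it with the same flag, for example java -Xmx1000m HeapCheck. The reported figure may be somewhat below the requested size, depending on the JVM and garbage collector.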

Re: [Nutch-general] Memory problem while running Nutch

2006-06-15 Thread Jayant Kumar Gandhi
I tried that, but it didn't help. Thanks anyway. I believe the problem is in Java and not just with Nutch, because I get the same error when I run java -version, or java crashes. A log of the crash report is attached. I read on one of the forums that it may be due to jdk1.5.0_05 and that jdk1.5.0_07 should work

Re: [Nutch-general] Confused about searchable fields

2006-06-15 Thread Bogdan Kecman
Hi, Try deleting nutch*.war from /opt/tomcat/webapps/, or ROOT.war (if you renamed it). It helped me once with some stupid container problem. As for the plugins, look at /opt/tomcat/log/catalina.out. As for the fields, start Luke and look at your fields. Go to first tab

[Nutch-general] Nutch 0.8

2006-06-15 Thread Matthew Holt
Any estimate on the release date for nutch 0.8? Just wondering.. Thanks. Matt ___ Nutch-general mailing list Nutch-general@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nutch-general

[Nutch-general] managing content size in segments folder

2006-06-15 Thread Roberto Monge
I've been using nutch to index production log files from a client application. It's been a great tool because we do get a large volume of logs from the field and often have to go through complicated pattern searches. Lately we've been having some issues managing our disk space. I noticed that

Re: [Nutch-general] RSSParser

2006-06-15 Thread Carli Collins
Hello, I would like to use Nutch to search RSS documents from my Struts web app. I see that there is a class called RSSParser, however the JARS in the nightly builds do not seem to have this class. Should I be using something else? Is Nutch not the right tool for me? Thanks

Re: [Nutch-general] RSSParser

2006-06-15 Thread Chris Mattmann
Hi Carli, The RSSParser class has been part of the trunk since 0.7, but didn't get released with those official releases. Right now you can get the RSSParser capability by downloading 0.8-dev, which is available from the nutch trunk. Point your favorite web browser to:

Re: [Nutch-general] RSSParser

2006-06-15 Thread Sami Siren
Point your favorite web browser to: http://lucene.apache.org/nutch/version_control.html And then d/l the latest trunk and you should be all set. To use the There are also nightly builds available, see http://lucene.apache.org/nutch/nightly.html -- Sami Siren

Re: [Nutch-general] RSSParser

2006-06-15 Thread Carli Collins
I tried to use the nightly builds but I was unable to compile the code. -Original Message- From: Sami Siren [mailto:[EMAIL PROTECTED] Sent: Thursday, June 15, 2006 2:25 PM To: nutch-user@lucene.apache.org Subject: Re: RSSParser Point your favorite web browser to:

Re: [Nutch-general] managing content size in segments folder

2006-06-15 Thread TDLN
As far as I know, content in the segments is used to generate the summary in the search results and, of course, for the cache feature. If you don't need these, you can adjust the fetcher.store.content and http.content.limit config properties. Also, you might have to change search.jsp. Rgrds, Thomas

Re: [Nutch-general] RSSParser

2006-06-15 Thread Sami Siren
Carli Collins wrote: I tried to use the nightly builds but I was unable to compile the code. There is no need to compile the code unless you want to extend it (in which case I recommend checking out a version from the svn repository). Just use it ;) -- Sami Siren

Re: [Nutch-general] nutch .72 out-of-the-box build issue

2006-06-15 Thread TDLN
Yes, this is the wrong forum :) This has been discussed many times, please search the archives. Rgrds, thomas On 6/14/06, Dagum, Leo [EMAIL PROTECTED] wrote: Apologies if this is the wrong forum.. Just downloaded the nutch .72 release and tried building, using jdk1.5.0_03 and ant 1.6.5.

Re: [Nutch-general] Memory problem while running Nutch

2006-06-15 Thread TDLN
You can't run Nutch effectively on a VPS. You will either run into memory / disk problems or into your hosting party :) Rgrds, Thomas On 6/15/06, Jayant Kumar Gandhi [EMAIL PROTECTED] wrote: I tried that, it didn't help. Thanks anyway. I believe the problem is in Java and not just with Nutch.

Re: [Nutch-general] managing content size in segments folder

2006-06-15 Thread TDLN
I mean disable the cache link in search.jsp. On 6/15/06, TDLN [EMAIL PROTECTED] wrote: As far as I know, content in the segments is used to generate the summary in the search results and, of course, for the cache feature. If you don't need these, you can adjust the fetcher.store.content and

[Nutch-general] peaceable

2006-06-15 Thread Victor Hutchinson

Re: [Nutch-general] Too many open files

2006-06-15 Thread TDLN
It still seems that this will be an issue for other platforms, which have hard limits on the number of file descriptors. What platforms? You can use ulimit -n to increase the limit on Unix/Linux systems. Rgrds, Thomas On 6/13/06, Howie Wang [EMAIL PROTECTED] wrote: Hi, I think I remember

Re: [Nutch-general] Too many open files

2006-06-15 Thread Howie Wang
You're right. I guess I misunderstood the term hard limit when talking about file descriptor limits. Still, why is Nutch opening so many file descriptors during merge or reparse? 2000+ open file descriptors doesn't seem intentional. Plus, my DB is not that big (~1M pages). Has anyone seen this
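To see how close a Java process is to its descriptor limit, the JVM can report its own open and maximum file descriptor counts on Unix-like systems. This is only a rough sketch: it relies on the com.sun.management.UnixOperatingSystemMXBean extension, which not every JVM or platform provides, and the class name FdUsage is made up for illustration. It measures only the JVM it runs in, so it would have to be called from inside the process being diagnosed.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class FdUsage {
    /** Returns {open, max} descriptor counts, or null if this JVM does not expose them. */
    public static long[] descriptorCounts() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        // The Unix-specific bean is a JVM extension; fall back to null when it is absent.
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] counts = descriptorCounts();
        if (counts == null) {
            System.out.println("Descriptor counts are not available on this JVM/platform.");
        } else {
            System.out.println("Open descriptors: " + counts[0] + " of " + counts[1]);
        }
    }
}
```

The maximum reported here should reflect the limit the process inherited, which is what ulimit -n changes for the launching shell.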

Re: [Nutch-general] fetcher handling on 302 redirect

2006-06-15 Thread Yuzo Kanomata
I discovered the source of my problem. The web server reports back 302 with the header field as "location: url", but Nutch expects "Location: url". I fixed my problem by adding a few lines to Http.java. Specifics and patch: In package org.apache.nutch.protocol.http, class Http
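The patch itself is cut off in this listing, so the following is not the poster's fix. HTTP header field names are case-insensitive, and one general way to honor that in Java is to store headers in a map whose keys compare without regard to case. This is a minimal sketch under that assumption; the class HeaderLookup and the example URL are hypothetical and not taken from Nutch's Http.java.

```java
import java.util.Map;
import java.util.TreeMap;

public class HeaderLookup {
    /** Copies raw headers into a map whose keys compare case-insensitively. */
    public static Map<String, String> caseInsensitive(Map<String, String> raw) {
        Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        headers.putAll(raw);
        return headers;
    }

    public static void main(String[] args) {
        // Hypothetical response headers: the server sent a lowercase "location".
        Map<String, String> headers =
                caseInsensitive(Map.of("location", "http://example.com/new"));
        // The lookup succeeds regardless of how the caller capitalizes the name.
        System.out.println(headers.get("Location"));
    }
}
```

With a lookup like this, a redirect handler can ask for "Location" and still find a header the server sent as "location".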

[Nutch-general] Problems switching over from nutch 0.7.1 to nutch 0.8 (dev) -- zero search results problem with invertlinks

2006-06-15 Thread Bryan Woliner
Hi All, I have been using Nutch 0.7.1 for some time (although I am certainly not an expert) and am now in the process of switching over to Nutch 0.8. However, I have run into a couple of problems along the way and am hoping that those of you who have been using nutch 0.8 for a while will take a

Re: [Nutch-general] nutch .72 out-of-the-box build issue

2006-06-15 Thread TDLN
Please keep it on the list so others can benefit as well. Just add the missing dirs and start the build again. RGrds, Thomas On 6/15/06, Dagum, Leo [EMAIL PROTECTED] wrote: I did search the archives but that didn't turn up anything useful. If you can suggest some keywords that will turn up