Re: retry later

2006-03-08 Thread mos
When you get an error while fetching, and you get the org.apache.nutch.protocol.retrylater because the max retries have been reached, Nutch says it has given up and will retry later. When does that retry occur? That's an issue I reported some weeks ago and which is in my opinion an annoying

Re: project vitality?

2006-03-06 Thread mos
On 3/4/06, Stefan Groschupf: Just a general note, Jira has a voting functionality. This allows everybody to vote on an issue and can show in a very compressed style what the community is looking for. However, it is not used that often yet. It would be great if more users could use it. That's a

Re: Fetch timeouts

2006-02-16 Thread mos
Try to increase the value of the parameter <property><name>fetcher.threads.per.host</name><value>1</value></property>. This could help if you crawl pages from one host and if you run into time-outs. By the way: it's important to avoid time-outs because in Nutch 0.7.1 there is a bug that prevents

Re: crawler

2006-02-03 Thread mos
to generate it (e.g. use the Apache log file) - Enhance the Nutch HTML parser and make it able to interpret the JavaScript links. Greetings, mos, from Munich. On 2/3/06, [EMAIL PROTECTED] wrote: Hello, I have problems indexing a special internet site: http://www.gildemeister.com

Wrong 'Next Fetch' Date

2006-02-02 Thread mos
); tool.updateForSegment(fileSystem, lseg); tool.close(); Thanks mos