Re: nutch crawling file system SOLVED

2012-03-11 Thread Lewis John Mcgibbney
Hi Alessio, If you check out our official tutorial you will see no mention of crawl-urlfilter; it was deprecated after Nutch 1.2, IIRC. I can only suggest that any other tutorial you are using is in need of an update. http://wiki.apache.org/nutch/NutchTutorial On Sat, Mar 10, 2012 at 4:42 PM,

Re: nutch crawling file system SOLVED

2012-03-11 Thread Lewis John Mcgibbney
Please see below. On Sun, Mar 11, 2012 at 5:10 PM, alessio crisantemi alessio.crisant...@gmail.com wrote: [1]http://wiki.apache.org/nutch/FAQ#How_do_I_index_my_local_file_system.3F I've now updated this link, thanks for pointing this out. And now, I have another problem: I crawled my

Re: nutch crawling file system SOLVED

2012-03-11 Thread remi tassing
You're probably looking for the Highlighting feature http://wiki.apache.org/solr/HighlightingParameters Remi On Sun, Mar 11, 2012 at 6:10 PM, alessio crisantemi alessio.crisant...@gmail.com wrote: Thank you Lewis for your explanation: I suspected as much, and I posted on the mailing list my

Re: nutch crawling file system SOLVED

2012-03-11 Thread alessio crisantemi
Thank you, Remi, for your precious help. I will try again and write you the results. But I have another little question: how can I limit the crawling to only my selected root? Every time, Nutch also crawls the parent directories. I read that the code that is responsible for this is in

Re: nutch crawling file system SOLVED

2012-03-11 Thread remi tassing
Using crawl-urlfilter (or regex-urlfilter, depending on which one you're using), you should be able to solve this. Unless you're not clear on what folders to exclude...? On Sunday, March 11, 2012, alessio crisantemi alessio.crisant...@gmail.com wrote: Thank you, Remi, for your precious help. I try
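
For readers following the thread, here is a rough sketch of how an ordered list of +/- regex rules, in the style of regex-urlfilter.txt, can keep a file-system crawl from climbing into parent directories. It is a Python simulation, not Nutch code. It assumes the usual first-match-wins behaviour (the first rule whose pattern is found in the URL decides, and a URL matching no rule is dropped). The root file:///home/user/docs/ and the names RULES and accept are hypothetical, chosen only for illustration.

```python
import re

# Hypothetical rule list, written in the order the lines would appear in
# regex-urlfilter.txt: "+" accepts, "-" rejects.
RULES = [
    ("-", r"^(http|https|ftp|mailto):"),   # skip non-file protocols
    ("+", r"^file:///home/user/docs/"),    # assumed crawl root (illustrative)
    ("-", r"."),                           # reject everything else
]


def accept(url, rules=RULES):
    """Return True if the first rule whose pattern is found in url is a '+' rule."""
    for sign, pattern in rules:
        if re.search(pattern, url):
            return sign == "+"
    # No rule matched: treat the URL as rejected (assumed default).
    return False


if __name__ == "__main__":
    for candidate in (
        "file:///home/user/docs/report.pdf",
        "file:///home/user/",
        "http://example.com/",
    ):
        print(candidate, accept(candidate))
```

Because the parent directory file:///home/user/ does not start with the accepted root, it falls through to the final reject-all rule, which is the effect asked about in the thread.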