Check the URL filters in crawl-urlfilter.txt. See whether the rule there
allows the redirect target, that is, whether the link below matches one of
the URL patterns in that file:

http://punto-informatico.it

On 2/7/06, Enrico Triolo <[EMAIL PROTECTED]> wrote:
>
> I'm switching to nutch-0.8 but I'm facing a problem with url redirects.
> To let you understand better I'll explain my problem with a real example:
>
> I created an 'urls' directory and inside it I created an 'urls.txt' file
> containing only this line: "http://www.punto-informatico.it".
> If pointed to this url the webserver sends a 30x response redirecting to
> "http://punto-informatico.it".
>
> If I run nutch 0.8 with this command:
>
> nutch urls/ -dir pi -depth 2 -threads 1
>
> it can't retrieve any page...
>
> I tried the same command with nutch-0.7 and it retrieved 41 pages.
>
> Is it an issue or am I missing something?
>
> Thanks,
> Enrico
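
To make the filter check at the top of this reply concrete, here is a rough
Python sketch of how such a rule file could be evaluated by hand. It assumes
the file holds one regular expression per line, prefixed with '+' (accept) or
'-' (reject), that the first matching line decides, and that a URL matching
nothing is dropped. The two rule sets and the function names are made up for
illustration; they are not taken from Enrico's configuration or from Nutch's
code.

```python
import re

def parse_rules(text):
    """Parse '+regex' / '-regex' lines, skipping blanks and '#' comments."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError("rule must start with '+' or '-': " + line)
        rules.append((sign == "+", re.compile(pattern)))
    return rules

def is_allowed(url, rules):
    """First matching rule wins; a URL that matches no rule is rejected."""
    for allow, regex in rules:
        if regex.search(url):
            return allow
    return False

# Hypothetical rule set that only accepts the 'www' host.
WWW_ONLY = parse_rules(r"""
# skip URLs containing these characters
-[?*!@=]
+^http://www\.punto-informatico\.it/
""")

# Hypothetical rule set that accepts the bare domain and any subdomain.
ANY_HOST = parse_rules(r"""
-[?*!@=]
+^http://([a-z0-9-]*\.)*punto-informatico\.it/
""")

if __name__ == "__main__":
    for url in ("http://www.punto-informatico.it/",
                "http://punto-informatico.it/"):
        print(url, is_allowed(url, WWW_ONLY), is_allowed(url, ANY_HOST))
```

With the first rule set the seed URL passes but the redirect target is
rejected, which would leave nothing to fetch after the 30x response. The
second rule set accepts both.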
