yes

On Sat, Aug 4, 2012 at 6:11 PM, Lewis John Mcgibbney <[email protected]> wrote:
> http:// ?
>
> hth
>
> On Fri, Aug 3, 2012 at 9:53 AM, Alexei Korolev <[email protected]> wrote:
> > Hello,
> >
> > I have a small script:
> >
> > $NUTCH_PATH inject crawl/crawldb seed.txt
> > $NUTCH_PATH generate crawl/crawldb crawl/crawldb/segments -adddays 0
> >
> > s1=`ls -d crawl/crawldb/segments/* | tail -1`
> > $NUTCH_PATH fetch $s1
> > $NUTCH_PATH parse $s1
> > $NUTCH_PATH updatedb crawl/crawldb $s1
> >
> > In seed.txt I have just one site, for example "test.com". When I start
> > the script, it fails at the fetch phase.
> > If I change test.com to www.test.com it works fine. The reason seems to be
> > that the outgoing links on test.com all have the www. prefix.
> > What do I need to change in the Nutch config to make it work with test.com?
> >
> > Thank you in advance. I hope my explanation is clear :)
> >
> > --
> > Alexei A. Korolev
>
> --
> Lewis

--
Alexei A. Korolev
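A sketch of the kind of change the question asks about, not confirmed anywhere in this thread: with a bare host like test.com in seed.txt, two things commonly matter. The seed URL must carry the protocol (which the "http:// ?" reply appears to be checking), and the URL filters plus external-link handling must accept the www. variant of the host. Assuming a stock Nutch 1.x install, the filter in conf/regex-urlfilter.txt can be widened to accept any subdomain of test.com, and db.ignore.external.links in conf/nutch-site.xml should remain false so that outlinks from test.com to www.test.com are not dropped as external; test.com here is just the placeholder from the original question.

    # seed.txt -- use the full URL, including the protocol
    http://test.com/

    # conf/regex-urlfilter.txt -- accept test.com and any subdomain (www., etc.)
    +^https?://([a-z0-9-]+\.)*test\.com/

    # conf/nutch-site.xml -- keep external-link filtering off so the
    # www.test.com outlinks found on test.com are still followed
    <property>
      <name>db.ignore.external.links</name>
      <value>false</value>
    </property>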

