I can't tell for sure what happened. Check the "indexes" directory to see whether it is empty, or use an index inspection tool such as Luke to check whether the index files are corrupted.
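
As a rough sketch of the directory check, something like the following could be used. The function name index_dir_status is made up for illustration and is not part of Nutch; it only reports whether a directory is missing, empty, or has something in it, and says nothing about whether the index itself is valid.

```shell
#!/bin/bash
# Hypothetical helper (not a Nutch command): report the state of a directory.
# Prints exactly one word: "missing", "empty", or "ok".
index_dir_status() {
    local dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing"
        return 0
    fi
    # ls -A lists all entries except . and .. ; no output means the dir is empty
    if [ -z "$(ls -A "$dir")" ]; then
        echo "empty"
        return 0
    fi
    echo "ok"
}
```

Running index_dir_status crawl/indexes after the recrawl script finishes shows which case applies. One thing that may be worth checking in the quoted script: crawl/indexes is renamed to crawl/OLDindexes right before MERGEDindexes is moved to crawl/indexes/part-00000, and mv fails when the parent directory of the target does not exist.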

Tiger
2011-1-20

2011/1/18 Andrey Sapegin <[email protected]>

> Dear all.
>
> I have a problem with nutch Internet crawl/recrawl script (I'm wanted to
> understand how it works, so I wrote it by myself).
>
> After I merge indexes (merging segments seems to be fine), I search doesn't
> work for me:
> $ bin/nutch org.apache.nutch.searcher.NutchBean http
> Total hits: 0
>
> Before recrawling I was able to search (index was placed at crawl/indexes)
>
> My script:
> ---------------------------------------------
> #!/bin/bash
> export JAVA_HOME=/usr/lib/jvm/java-6-sun
>
> #Inject new urls
> bin/nutch inject crawl/crawldb dmoz/urls
> echo "new URLs injected (dmoz/urls)"
>
> #generate segments
> bin/nutch generate crawl/crawldb crawl/segments -topN $3
> echo "segments generated"
>
> #generate fetch-list
> s1=`ls -d crawl/segments/2* | tail -1`
> echo $s1
> echo "fetch-list generated"
>
> #fetch
> bin/nutch fetch $s1 -threads $2
> echo "fetching done"
>
> #update the database with results of fetch
> bin/nutch updatedb crawl/crawldb $s1
> echo "database updated"
>
> #merge segments
> bin/nutch mergesegs crawl/MERGEDsegments crawl/segments/*
> rm -r crawl/segments
> mv crawl/MERGEDsegments crawl/segments
> echo "segments merged"
>
> #inverting links
> bin/nutch invertlinks crawl/linkdb -dir crawl/segments
> echo "links inverted"
>
> #indexing
> bin/nutch index crawl/NEWindexes crawl/crawldb crawl/linkdb crawl/segments/*
> echo "indexing done"
>
> #dedup - delete duplicate documents in the index
> bin/nutch dedup crawl/NEWindexes
> echo "dedup done"
>
> #merging indexes
> bin/nutch merge crawl/MERGEDindexes crawl/NEWindexes
> echo "indexes merged"
>
> # replace indexes with indexes_merged
> mv --verbose crawl/indexes crawl/OLDindexes
> mv --verbose crawl/MERGEDindexes crawl/indexes/part-00000
>
> #clean up
> rm -rf crawl/NEWindexes
> rm -rf crawl/OLDindexes
> -------------------------------------------------
>
> What's wrong with the script?
>
> Thank You in advance,
> Kind Regards,
>
> --
>
> Andrey Sapegin,
> Software Developer,
>
> Unister GmbH
> [email protected]
> www.unister.de

