Hi All,

I am using Nutch 1.12 and have observed that indexing the same crawlDB
multiple times gives a different indexed document count each time.

We indexed from the crawlDB and noted the indexed document count, then wiped
the whole index from Solr and indexed again. This time the number of
documents indexed was lower than before.

I removed all our customized plugins, but the indexed document count still
varies, and this is reproducible almost every time.

Command I used for the crawl:
./crawl seedPath crawlDir -1

Command used for indexing to Solr:
./nutch solrindex $SOLR_URL $CRAWLDB_PATH $CRAWLDB_DIR/segments/* -filter -normalize -deleteGone
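
For reference, the check described above (index, note the count, wipe the
index, index again, compare) can be sketched roughly as below. This is only a
sketch under assumptions: the helper function names are hypothetical, and it
assumes $SOLR_URL points at the Solr core itself, so that the standard
/select and /update handlers are reachable under it.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical. SOLR_URL is assumed to be the
# core URL; CRAWLDB_PATH and CRAWLDB_DIR are the same variables as above.

# Pull the numFound value out of a Solr JSON response read from stdin.
extract_num_found() {
  sed -n 's/.*"numFound":[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Ask Solr how many documents the index currently holds.
solr_doc_count() {
  curl -s "$SOLR_URL/select?q=*:*&rows=0&wt=json" | extract_num_found
}

# Delete every document from the index and commit.
solr_wipe() {
  curl -s "$SOLR_URL/update?commit=true" \
    -H 'Content-Type: text/xml' \
    --data-binary '<delete><query>*:*</query></delete>' > /dev/null
}

# Index once with the same options as above and print the resulting count.
index_and_count() {
  ./nutch solrindex "$SOLR_URL" "$CRAWLDB_PATH" "$CRAWLDB_DIR"/segments/* \
    -filter -normalize -deleteGone > /dev/null 2>&1
  solr_doc_count
}

# Usage (nothing here runs automatically):
#   first=$(index_and_count); solr_wipe; second=$(index_and_count)
#   echo "first=$first second=$second"
```

Counting through Solr rather than from the indexer log keeps both runs
measured the same way.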

Please suggest.

Thanks,
Mark
