I was doing a crawl of a site, and around 10,000 documents in, it looks like it deleted the segments and started over from 0. I think it might have merged them into the db?
I see:

    050522 000938 Processing pagesByMD5: Merged to new DB containing 9324 records in 0.07 seconds
    050522 000938 Processing pagesByMD5: Merged 133200.0 records/second
    . . .
    050522 000953 Overall processing: Sorted 1.2441012441012442E-5 entries/second
    050522 000953 FetchListTool completed
    050522 000953 logging at INFO

Then it starts over:

    050522 003516 status: segment 20050522000951, 100 pages, 0 errors, 456114 bytes, 1523290 ms

So I lost all the 10,000 pages I had already fetched. Is this normal? How can I make it so that it never deletes the segments?

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
