All,

I've had moderate to good success with Nutch today. However, I'm having to kill my machine, which is unfortunately in the middle of a seven-hour crawl. Naturally, I don't want to just pitch the data I've accumulated, and I'm hoping there's a way around this without reindexing. Based on prior experience with smaller crawls, it appears that Nutch needs to collapse all of the segments it has built internally. In this case:
crawl.test/
20050809133539
20050809133548
20050809133735
20050809140754
20050809171316

I'm hoping that I can do this manually via "nutch merge" or some other tool. Any suggestions on how to pull this off would be most welcome.

Thanks for your help,
Cory Wilkerson

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
