I'm attempting to get a crawl working using the individual step scripts, but I keep getting "Skipping <url>; different batch id (null)" messages, and nothing new shows up in Solr. So I've reverted to trying the all-in-one "crawl" command of the nutch script:
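For reference, the step-by-step sequence I was running looked roughly like this (a sketch of my setup; the exact flags and paths are from my environment, so treat them as approximate). The "batch id (null)" skips showed up during the fetch/parse steps:

```shell
# Step-by-step Nutch 2.x workflow I was attempting before
# falling back to the all-in-one "crawl" command.
bin/nutch inject ../urls/              # seed the webtable from seed.txt
bin/nutch generate -topN 100           # mark a batch of URLs for fetching
bin/nutch fetch -all -threads 5        # fetch every generated batch
bin/nutch parse -all                   # parse the fetched content
bin/nutch updatedb                     # update the webtable with new links
bin/nutch solrindex http://localhost/nutchsolr -all   # push to Solr
```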
./nutch crawl ../urls/ -solr "http://localhost/nutchsolr" -threads 5 -depth 3 -topN 100

The urls directory contains a "seed.txt" file with some sites. The crawl is definitely able to fetch pages (I can see other hostnames in the lists scrolling past on screen), but it still skips everything it finds with the same "batch id (null)" message. Any guidance/advice would be appreciated. Thanks! -- Chris

