I'm attempting to get a crawl working using the individual scripts, but I keep
getting a "Skipping <url>; different batch id (null)" message and then nothing
new appears in Solr.  So I've reverted to trying the "crawl" command of the
nutch script:

./nutch crawl ../urls/ -solr http://localhost/nutchsolr -threads 5 -depth
3 -topN 100

The urls directory contains a "seed.txt" file with a few sites.  The crawl
is definitely fetching pages (I can see other hostnames in the output
scrolling by), but everything it finds is still being skipped with the same
"batch id (null)" message.

Any guidance/advice would be appreciated.

Thanks!

-- Chris
