Hi Rui,

The equivalent of the batchId in 1.x is a segment. The batchId is an identifier for a data structure that (initially) holds the generated URLs which are ready to be fetched.
hth

Lewis

On Fri, Dec 28, 2012 at 6:43 AM, 高睿 <[email protected]> wrote:

> Hi,
>
> I would like to do that. But I still don't understand the concept of
> 'batch id'. Besides, is it the right direction to capture 'batch' argument
> in command line?
>
> Thanks.
>
> At 2012-12-19 22:07:23, "Lewis John Mcgibbney" <[email protected]> wrote:
> >Hi,
> >
> >Currently the batchID is originally set by the GeneratorJob#run() method
> >@line 169 [0], you will see that this can also be overridden by the
> >generate.batch.id property in nutch-site.xml
> >
> >Currently if you look at line 117 in the crawl script [1] you will see that
> >there is a TODO to capture the batchID programmatically.
> >
> >1) I would advise you to use this crawl script
> >2) If you are able to create an issue on Jira, then submit a patch for the
> >issue it would be excellent.
> >
> >Lewis
> >
> >[0]
> >http://svn.apache.org/viewvc/nutch/branches/2.x/src/java/org/apache/nutch/crawl/GeneratorJob.java?view=markup
> >[1]
> >http://svn.apache.org/viewvc/nutch/branches/2.x/src/bin/crawl?view=markup
> >
> >On Fri, Dec 14, 2012 at 11:49 AM, 高睿 <[email protected]> wrote:
> >
> >> Hi,
> >>
> >> When I specify solr in command line, There will be an exception thrown.
> >> Command line: urls -solr http://localhost:8080/solr/ -depth 1 -topN 3
> >> I tried to add '-batch 3' parameter into command line, but it doesn't
> >> help. I looked into the code, and found the parameter is ignored somewhere.
> >> So, how do I fix this? Thanks.
> >>
> >> Skipping http://www.iguuu.com/thread-944-1-1.html; different batch id
> >> (null)
> >> Skipping http://www.iguuu.com/thread-987-1-1.html; different batch id
> >> (null)
> >> Exception in thread "main" java.lang.NullPointerException
> >>     at java.util.Hashtable.put(Unknown Source)
> >>     at java.util.Properties.setProperty(Unknown Source)
> >>     at org.apache.hadoop.conf.Configuration.set(Configuration.java:438)
> >>     at org.apache.nutch.indexer.IndexerJob.createIndexJob(IndexerJob.java:128)
> >>     at org.apache.nutch.indexer.solr.SolrIndexerJob.run(SolrIndexerJob.java:44)
> >>     at org.apache.nutch.crawl.Crawler.runTool(Crawler.java:68)
> >>     at org.apache.nutch.crawl.Crawler.run(Crawler.java:192)
> >>     at org.apache.nutch.crawl.Crawler.run(Crawler.java:250)
> >>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >>     at org.apache.nutch.crawl.Crawler.main(Crawler.java:257)
> >>
> >> Regards,
> >> Rui
> >
> >--
> >*Lewis*

--
*Lewis*
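
A note on the stack trace quoted above: the NullPointerException comes out of Hashtable.put via Properties.setProperty, which is what happens when a null value is stored. Together with the "different batch id (null)" messages, this suggests the batch id was still null when IndexerJob tried to put it into the configuration. The sketch below reproduces that behaviour with plain java.util.Properties only. The class name BatchIdNullDemo, its two helper methods, and the choice of generate.batch.id as the key being set are assumptions made for illustration; this is not Nutch code and not a confirmed fix.

```java
import java.util.Properties;

public class BatchIdNullDemo {
    // Property name mentioned in the thread; that IndexerJob sets this
    // exact key at the failing line is an assumption.
    static final String BATCH_ID_KEY = "generate.batch.id";

    /** Returns true if storing the value fails with a NullPointerException. */
    static boolean failsOnSet(Properties props, String batchId) {
        try {
            props.setProperty(BATCH_ID_KEY, batchId);
            return false;
        } catch (NullPointerException e) {
            // Properties does not accept null values.
            return true;
        }
    }

    /** Hypothetical guard: reject a missing batch id with a readable message. */
    static void setBatchId(Properties props, String batchId) {
        if (batchId == null) {
            throw new IllegalArgumentException(
                "No batch id given; pass one explicitly or set " + BATCH_ID_KEY);
        }
        props.setProperty(BATCH_ID_KEY, batchId);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        System.out.println(failsOnSet(props, null));      // prints true
        System.out.println(failsOnSet(props, "batch-1")); // prints false
    }
}
```

A guard like setBatchId would not make the batch id available, but it would turn the bare NullPointerException into a message that points at the missing value.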

