Hey Talat!! Is there any way I can specify the batchID as well in the following command?
bin/nutch solrindex <solr url> -all -crawlId <crawl id>

On Mon, Oct 28, 2013 at 11:51 AM, Talat UYARER <[email protected]> wrote:

> It is right Laxmi. We don't have SolrIndexerJob command :)
> you can use SolrIndexerJob with nutch shell script. Maybe you can use it
> like this:
>
> bin/nutch solrindex <solr url> -all -crawlId <crawl id>
>
> Talat
>
> On 28-10-2013 at 17:46, A Laxmi wrote:
>
>> It says SolrIndexerJob: command not found
>>
>> when I followed this syntax
>>
>> SolrIndexerJob <solr url> (<batchId> | -all | -reindex) [-crawlId <id>]
>>
>> On Mon, Oct 28, 2013 at 11:29 AM, feng lu <[email protected]> wrote:
>>
>>> Hi Laxmi
>>>
>>> I checked the code in the bin/crawl script
>>>
>>> echo "Indexing $CRAWL_ID on SOLR index -> $SOLRURL"
>>> $bin/nutch solrindex $commonOptions $SOLRURL -all -crawlId $CRAWL_ID
>>>
>>> if what you say is correct, then that script will also ignore the batchID
>>> and crawlID.
>>>
>>> you can try a small test db and run bin/nutch script step by step.
>>>
>>> On Mon, Oct 28, 2013 at 10:57 PM, A Laxmi <[email protected]> wrote:
>>>
>>>> Hi feng -
>>>>
>>>> I tried but it's ignoring the batch ID and crawlID for some reason.
>>>>
>>>> On Mon, Oct 28, 2013 at 10:00 AM, feng lu <[email protected]> wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> please check the usage of solrindex command
>>>>>
>>>>> $ bin/nutch solrindex
>>>>> Usage: SolrIndexerJob <solr url> (<batchId> | -all | -reindex) [-crawlId <id>]
>>>>>
>>>>> On Mon, Oct 28, 2013 at 9:10 PM, A Laxmi <[email protected]> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> For Nutch 2.2.1, I am aware of two crawl commands/scripts that came
>>>>>> out of the box with nutch -
>>>>>>
>>>>>> (1) bin/nutch (step by step),
>>>>>> (2) bin/crawl (all in one)
>>>>>>
>>>>>> I know how to specify a crawl ID for `bin/crawl` command. Similarly,
>>>>>> how to specify a crawl ID for `bin/nutch` command?
>>>>>>
>>>>>> The reason I am asking is, I ran a large crawl job using the
>>>>>> all-in-one crawl command `bin/crawl` specifying a crawl ID, and it
>>>>>> broke while indexing in Solr for the 9th crawl iteration. Now, I just
>>>>>> want to run the one-step `bin/nutch solrindex` command for just that
>>>>>> interrupted 9th iteration to complete the Solr indexing. How should I
>>>>>> specify crawlID in the `bin/nutch solrindex` command? What is the
>>>>>> syntax?
>>>>>>
>>>>>> I have all the crawl data stored in a HBase table "webpage_test"
>>>>>
>>>>> --
>>>>> Don't Grow Old, Grow Up... :-)
>>>
>>> --
>>> Don't Grow Old, Grow Up... :-)
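
Going only by the usage line quoted in this thread (`<solr url> (<batchId> | -all | -reindex) [-crawlId <id>]`), the batch ID appears to take the place of `-all` rather than being added next to it. The sketch below just assembles and prints such a command line; it does not run Nutch. The helper name `build_solrindex_cmd`, the Solr URL, the batch ID value and the crawl ID are all made-up placeholders, and whether Nutch 2.2.1 actually honours the batch ID is exactly what is in question above.

```shell
#!/bin/sh
# Sketch only: assembles a solrindex command line following the usage string
#   SolrIndexerJob <solr url> (<batchId> | -all | -reindex) [-crawlId <id>]
# The function name and every example value here are hypothetical.
build_solrindex_cmd() {
  solr_url=$1        # Solr endpoint
  selector=$2        # a batch ID, or -all, or -reindex
  crawl_id=${3:-}    # optional crawl ID
  cmd="bin/nutch solrindex $solr_url $selector"
  if [ -n "$crawl_id" ]; then
    cmd="$cmd -crawlId $crawl_id"
  fi
  printf '%s\n' "$cmd"
}

# Print (do not execute) the command for a single batch of one crawl.
build_solrindex_cmd "http://localhost:8983/solr" "1382972000-12345" "my_crawl"
```

Passing `-all` as the second argument reproduces the form Talat suggested, so the two variants can be compared side by side on a small test db before touching the large crawl.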

