That is right, Laxmi. We don't have a SolrIndexerJob command :)
You can run SolrIndexerJob through the nutch shell script. Maybe you can
use it like this:
bin/nutch solrindex <solr url> -all -crawlId <crawl id>
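As a concrete illustration, the sketch below assembles that command line
and prints it instead of running it. The Solr URL
http://localhost:8983/solr/ and the crawl ID "test" are placeholder
assumptions, not values from this thread; substitute your own.

```shell
#!/bin/sh
# Build the solrindex command line for a given Solr URL and crawl ID.
# SolrIndexerJob is the Java class behind the "solrindex" subcommand, so it
# is reached through bin/nutch rather than as a command of its own.
build_solrindex_cmd() {
    solr_url=$1   # placeholder, e.g. http://localhost:8983/solr/
    crawl_id=$2   # the same crawl ID that was passed to bin/crawl
    printf 'bin/nutch solrindex %s -all -crawlId %s\n' "$solr_url" "$crawl_id"
}

# Print the command so it can be reviewed before it is run by hand.
build_solrindex_cmd "http://localhost:8983/solr/" "test"
```

Printing first makes it easy to confirm that the crawl ID matches the one
used for the original crawl before anything is sent to Solr.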
Talat
On 28-10-2013 17:46, A Laxmi wrote:
It says "SolrIndexerJob: command not found"
when I followed this syntax:
SolrIndexerJob <solr url> (<batchId> | -all | -reindex) [-crawlId <id>]
On Mon, Oct 28, 2013 at 11:29 AM, feng lu <[email protected]> wrote:
Hi Laxmi
I checked the code in the bin/crawl script:
echo "Indexing $CRAWL_ID on SOLR index -> $SOLRURL"
$bin/nutch solrindex $commonOptions $SOLRURL -all -crawlId $CRAWL_ID
If what you say is correct, then that script will also ignore the batchId
and crawlId.
You can try a small test db and run the bin/nutch script step by step.
On Mon, Oct 28, 2013 at 10:57 PM, A Laxmi <[email protected]> wrote:
Hi feng -
I tried, but it's ignoring the batch ID and crawl ID for some reason.
On Mon, Oct 28, 2013 at 10:00 AM, feng lu <[email protected]> wrote:
Hi
Please check the usage of the solrindex command:
$ bin/nutch solrindex
Usage: SolrIndexerJob <solr url> (<batchId> | -all | -reindex) [-crawlId <id>]
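To make the usage line easier to read, here is a small sketch that prints
one invocation for each of the three selector forms it lists. The Solr
URL, crawl ID and batch ID are all hypothetical placeholders, and nothing
is executed against Nutch or Solr.

```shell
#!/bin/sh
# Print one solrindex invocation per selector form from the usage line.
# All three values below are placeholders, not values from this thread.
SOLR_URL="http://localhost:8983/solr/"
CRAWL_ID="test"
BATCH_ID="1382973000-12345"   # hypothetical batch ID

solrindex_cmd() {
    # $1 is the selector: a batch ID, -all, or -reindex
    printf 'bin/nutch solrindex %s %s -crawlId %s\n' "$SOLR_URL" "$1" "$CRAWL_ID"
}

solrindex_cmd "$BATCH_ID"   # the <batchId> form
solrindex_cmd -all          # the -all form
solrindex_cmd -reindex      # the -reindex form
```

The <batchId> form is the one that would target a single iteration,
provided the batch ID used by that iteration is known.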
On Mon, Oct 28, 2013 at 9:10 PM, A Laxmi <[email protected]> wrote:
Hi,
For Nutch 2.2.1, I am aware of two crawl commands/scripts that come out
of the box with Nutch:
(1) bin/nutch (step by step),
(2) bin/crawl (all in one)
I know how to specify a crawl ID for the `bin/crawl` command. Similarly,
how do I specify a crawl ID for the `bin/nutch` command?
The reason I am asking is that I ran a large crawl job using the
all-in-one crawl command `bin/crawl`, specifying a crawl ID, and it broke
while indexing in Solr during the 9th crawl iteration. Now I just want to
run the single step `bin/nutch solrindex` for that interrupted 9th
iteration to complete the Solr indexing. How should I specify the crawl ID
in the `bin/nutch solrindex` command? What is the syntax?
I have all the crawl data stored in an HBase table "webpage_test".
--
Don't Grow Old, Grow Up... :-)