[
https://issues.apache.org/jira/browse/NUTCH-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998333#comment-13998333
]
Lewis John McGibbney commented on NUTCH-1773:
---------------------------------------------
bq. hduser@bl4ck1c3:~/nutch-2.3/runtime/local$ bin/nutch solrindex TestCrawl18 -reindex
No $SOLR_URL is passed as an argument here!
Is anyone else seeing this issue? Can we reproduce it?
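
If solrindex expects the Solr URL as its first argument, as the comment above suggests, then TestCrawl18 would be taken as the URL, which would be consistent with the "Target host must not be null" error in the report. The following is only a sketch of a pre-flight check under that assumption. The helper names is_http_url and build_solrindex_cmd are hypothetical, and the address http://localhost:8983/solr is an assumed local Solr instance, not a value from the report.

```shell
#!/bin/sh
# Sketch only: refuse to build a `bin/nutch solrindex` command line
# unless the first argument looks like a Solr URL.
# Helper names and the example URL are assumptions for illustration.

# Return 0 when the argument looks like an http(s) URL, 1 otherwise.
is_http_url() {
  case "$1" in
    http://?*|https://?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the solrindex command line that would be run, or fail when the
# first argument is not a URL (as with "TestCrawl18" in the report).
build_solrindex_cmd() {
  if ! is_http_url "${1:-}"; then
    echo "error: first argument '${1:-}' is not a Solr URL" >&2
    return 1
  fi
  url="$1"
  shift
  echo "bin/nutch solrindex $url $*"
}

# Example with an assumed local Solr address.
build_solrindex_cmd "http://localhost:8983/solr" -reindex
```

The check only prints the command instead of running it, so it can be tried without a Nutch installation. Calling build_solrindex_cmd TestCrawl18 -reindex returns a non-zero status and prints an error to stderr.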
> Solr Indexer fails
> ------------------
>
> Key: NUTCH-1773
> URL: https://issues.apache.org/jira/browse/NUTCH-1773
> Project: Nutch
> Issue Type: Bug
> Components: indexer
> Affects Versions: 2.3
> Environment: Ubuntu 12.04 LTS, java version "1.7.0_55" - Hbase-0.90.6
> (pseudo dist), Hadoop 1.2.1, Solr 4.6
> Reporter: Ralf
> Priority: Critical
> Fix For: 2.3
>
>
> When using the crawl script or the Solr indexer by itself (bin/nutch solrindex) in
> local mode, it fails with:
> hduser@bl4ck1c3:~/nutch-2.3/runtime/local$ bin/nutch solrindex TestCrawl18 -reindex
> IndexingJob: starting
> Active IndexWriters :
> SOLRIndexWriter
> solr.server.url : URL of the SOLR instance (mandatory)
> solr.commit.size : buffer size when sending to SOLR (default 1000)
> solr.mapping.file : name of the mapping file for fields (default
> solrindex-mapping.xml)
> solr.auth : use authentication (default false)
> solr.auth.username : use authentication (default false)
> solr.auth : username for authentication
> solr.auth.password : password for authentication
> SolrIndexerJob: java.lang.IllegalStateException: Target host must not be
> null, or set in parameters.
> at
> org.apache.http.impl.client.DefaultRequestDirector.determineRoute(DefaultRequestDirector.java:787)
> at
> org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:414)
> at
> org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
> at
> org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
> at
> org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
> at
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:393)
> at
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:197)
> at
> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
> at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:168)
> at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:146)
> at
> org.apache.nutch.indexwriter.solr.SolrIndexWriter.commit(SolrIndexWriter.java:146)
> at org.apache.nutch.indexer.IndexWriters.commit(IndexWriters.java:127)
> at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:171)
> at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:187)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:196)
> When using the new index command, it finishes, but nothing is added to Solr:
> hduser@bl4ck1c3:~/nutch-2.3/runtime/local$ bin/nutch index TestCrawl18 -reindex
> IndexingJob: starting
> Active IndexWriters :
> SOLRIndexWriter
> solr.server.url : URL of the SOLR instance (mandatory)
> solr.commit.size : buffer size when sending to SOLR (default 1000)
> solr.mapping.file : name of the mapping file for fields (default
> solrindex-mapping.xml)
> solr.auth : use authentication (default false)
> solr.auth.username : use authentication (default false)
> solr.auth : username for authentication
> solr.auth.password : password for authentication
>
> Log shows:
> 2014-05-13 03:01:13,781 INFO indexer.IndexingJob - IndexingJob: starting
> 2014-05-13 03:01:14,108 INFO indexer.IndexingFilters - Adding
> org.apache.nutch.analysis.lang.LanguageIndexingFilter
> 2014-05-13 03:01:14,109 INFO basic.BasicIndexingFilter - Maximum title
> length for indexing set to: 100
> 2014-05-13 03:01:14,109 INFO indexer.IndexingFilters - Adding
> org.apache.nutch.indexer.basic.BasicIndexingFilter
> 2014-05-13 03:01:14,335 INFO indexer.IndexingFilters - Adding
> org.apache.nutch.indexer.more.MoreIndexingFilter
> 2014-05-13 03:01:14,336 INFO anchor.AnchorIndexingFilter - Anchor
> deduplication is: off
> 2014-05-13 03:01:14,336 INFO indexer.IndexingFilters - Adding
> org.apache.nutch.indexer.anchor.AnchorIndexingFilter
> 2014-05-13 03:01:14,620 WARN zookeeper.ClientCnxnSocket - Connected to an
> old server; r-o mode will be unavailable
> 2014-05-13 03:01:14,768 WARN zookeeper.ClientCnxnSocket - Connected to an
> old server; r-o mode will be unavailable
> 2014-05-13 03:01:14,968 WARN zookeeper.ClientCnxnSocket - Connected to an
> old server; r-o mode will be unavailable
> 2014-05-13 03:01:15,243 WARN zookeeper.ClientCnxnSocket - Connected to an
> old server; r-o mode will be unavailable
> 2014-05-13 03:01:15,276 WARN zookeeper.ClientCnxnSocket - Connected to an
> old server; r-o mode will be unavailable
> 2014-05-13 03:01:15,326 WARN zookeeper.ClientCnxnSocket - Connected to an
> old server; r-o mode will be unavailable
> 2014-05-13 03:01:15,386 INFO indexer.IndexWriters - Adding
> org.apache.nutch.indexwriter.solr.SolrIndexWriter
> 2014-05-13 03:01:15,403 INFO solr.SolrMappingReader - source: content dest:
> content
> 2014-05-13 03:01:15,403 INFO solr.SolrMappingReader - source: title dest:
> title
> 2014-05-13 03:01:15,403 INFO solr.SolrMappingReader - source: host dest: host
> 2014-05-13 03:01:15,404 INFO solr.SolrMappingReader - source: batchId dest:
> batchId
> 2014-05-13 03:01:15,404 INFO solr.SolrMappingReader - source: boost dest:
> boost
> 2014-05-13 03:01:15,404 INFO solr.SolrMappingReader - source: digest dest:
> digest
> 2014-05-13 03:01:15,404 INFO solr.SolrMappingReader - source: tstamp dest:
> tstamp
> 2014-05-13 03:01:15,405 INFO indexer.IndexingFilters - Adding
> org.apache.nutch.analysis.lang.LanguageIndexingFilter
> 2014-05-13 03:01:15,405 INFO basic.BasicIndexingFilter - Maximum title
> length for indexing set to: 100
> 2014-05-13 03:01:15,405 INFO indexer.IndexingFilters - Adding
> org.apache.nutch.indexer.basic.BasicIndexingFilter
> 2014-05-13 03:01:15,405 INFO indexer.IndexingFilters - Adding
> org.apache.nutch.indexer.more.MoreIndexingFilter
> 2014-05-13 03:01:15,405 INFO anchor.AnchorIndexingFilter - Anchor
> deduplication is: off
> 2014-05-13 03:01:15,405 INFO indexer.IndexingFilters - Adding
> org.apache.nutch.indexer.anchor.AnchorIndexingFilter
> 2014-05-13 03:01:15,426 WARN zookeeper.ClientCnxnSocket - Connected to an
> old server; r-o mode will be unavailable
> 2014-05-13 03:01:15,442 WARN mapred.FileOutputCommitter - Output path is
> null in cleanup
> 2014-05-13 03:01:16,144 INFO indexer.IndexWriters - Adding
> org.apache.nutch.indexwriter.solr.SolrIndexWriter
> 2014-05-13 03:01:16,144 INFO indexer.IndexingJob - Active IndexWriters :
> SOLRIndexWriter
> solr.server.url : URL of the SOLR instance (mandatory)
> solr.commit.size : buffer size when sending to SOLR (default 1000)
> solr.mapping.file : name of the mapping file for fields (default
> solrindex-mapping.xml)
> solr.auth : use authentication (default false)
> solr.auth.username : use authentication (default false)
> solr.auth : username for authentication
> solr.auth.password : password for authentication
> 2014-05-13 03:01:16,145 INFO solr.SolrMappingReader - source: content dest:
> content
> 2014-05-13 03:01:16,145 INFO solr.SolrMappingReader - source: title dest:
> title
> 2014-05-13 03:01:16,145 INFO solr.SolrMappingReader - source: host dest: host
> 2014-05-13 03:01:16,145 INFO solr.SolrMappingReader - source: batchId dest:
> batchId
> 2014-05-13 03:01:16,145 INFO solr.SolrMappingReader - source: boost dest:
> boost
> 2014-05-13 03:01:16,145 INFO solr.SolrMappingReader - source: digest dest:
> digest
> 2014-05-13 03:01:16,145 INFO solr.SolrMappingReader - source: tstamp dest:
> tstamp
> 2014-05-13 03:01:16,338 INFO solr.SolrIndexWriter - Total 0 document is
> added.
> 2014-05-13 03:01:16,338 INFO indexer.IndexingJob - IndexingJob: done.