Lewis John McGibbney created NUTCH-1864:
-------------------------------------------
Summary: Bug in indexchecker CLI parsing and invocation of
indexer-solr plugin by default
Key: NUTCH-1864
URL: https://issues.apache.org/jira/browse/NUTCH-1864
Project: Nutch
Issue Type: Bug
Components: indexer
Affects Versions: 1.10
Reporter: Lewis John McGibbney
Fix For: 1.10
I noticed that the indexchecker tool has two bugs:
* The command line parsing is buggy: IIRC it expects the last argument
(args[args.length - 1]) to be the URL.
* Even if indexer-solr is NOT activated, I get the following message:
lmcgibbn@LMC-032857 /usr/local/trunk/runtime/local $ ./bin/nutch indexchecker -dumpText http://nasa.gov
fetching: http://nasa.gov
Exception in thread "main" java.lang.RuntimeException: Missing SOLR URL. Should be set via -D solr.server.url
SOLRIndexWriter
solr.server.url : URL of the SOLR instance (mandatory)
solr.commit.size : buffer size when sending to SOLR (default 1000)
solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
solr.auth : use authentication (default false)
solr.auth.username : username for authentication
solr.auth.password : password for authentication
at org.apache.nutch.indexwriter.solr.SolrIndexWriter.setConf(SolrIndexWriter.java:192)
at org.apache.nutch.plugin.Extension.getExtensionInstance(Extension.java:159)
at org.apache.nutch.indexer.IndexWriters.<init>(IndexWriters.java:57)
at org.apache.nutch.indexer.IndexingFiltersChecker.run(IndexingFiltersChecker.java:98)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.indexer.IndexingFiltersChecker.main(IndexingFiltersChecker.java:178)
These issues should be rectified as this is an extremely useful tool which is
broken right now.
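For the first point, a minimal sketch of position-independent argument parsing might look like the following. This is not the actual IndexingFiltersChecker code: the class name IndexCheckerArgs and its fields are hypothetical, and the only option assumed is the -dumpText flag shown in the invocation above. The idea is to treat any non-option token as the URL rather than reading it from a fixed index.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of Nutch.
final class IndexCheckerArgs {
    final boolean dumpText;
    final String url;

    private IndexCheckerArgs(boolean dumpText, String url) {
        this.dumpText = dumpText;
        this.url = url;
    }

    // Accepts "-dumpText" anywhere on the command line and requires
    // exactly one non-option token, which is taken to be the URL.
    static IndexCheckerArgs parse(String[] args) {
        boolean dumpText = false;
        List<String> positional = new ArrayList<>();
        for (String arg : args) {
            if ("-dumpText".equals(arg)) {
                dumpText = true;
            } else if (arg.startsWith("-")) {
                throw new IllegalArgumentException("Unknown option: " + arg);
            } else {
                positional.add(arg);
            }
        }
        if (positional.size() != 1) {
            throw new IllegalArgumentException("Usage: indexchecker [-dumpText] <url>");
        }
        return new IndexCheckerArgs(dumpText, positional.get(0));
    }

    public static void main(String[] args) {
        IndexCheckerArgs parsed = parse(args);
        System.out.println("url=" + parsed.url + " dumpText=" + parsed.dumpText);
    }
}
```

With this approach both "indexchecker -dumpText http://nasa.gov" and "indexchecker http://nasa.gov -dumpText" resolve to the same URL, and a missing URL fails with a usage message instead of being misread from a flag.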