[ https://issues.apache.org/jira/browse/NUTCH-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153683#comment-14153683 ]
Sebastian Nagel commented on NUTCH-1864:
----------------------------------------
Hi Lewis,
* (indexer-solr invoked by default): +1, that's definitely a problem
** I could only reproduce it if indexer-solr is in plugin.includes
** easy to fix: instantiate IndexWriters only if the property doIndex is true
(see attached patch)
* (command-line parsing): 0, the last argument must indeed be the URL. Most
Nutch tools (and Hadoop Tool implementations in general) require options and
arguments in a certain order, e.g.
{code}
bin/nutch indexchecker -DdoIndex=true -dumpText http://www.nasa.gov/
{code}
Do we really need to change it?
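For illustration, a minimal sketch of both points above. It is not the attached patch and does not use the Nutch classes: IndexCheckerSketch, StubIndexWriter, ParsedArgs, parseArgs and createWriterIfNeeded are hypothetical names, and doIndex is passed as a plain boolean standing in for the configuration property.

```java
class IndexCheckerSketch {

    /** Hypothetical stand-in for a writer plugin that fails without mandatory config. */
    static class StubIndexWriter {
        final String serverUrl;

        StubIndexWriter(String serverUrl) {
            if (serverUrl == null) {
                throw new RuntimeException("Missing server URL");
            }
            this.serverUrl = serverUrl;
        }
    }

    /** Result of parsing: options come first, the URL is the last argument. */
    static class ParsedArgs {
        final boolean dumpText;
        final String url;

        ParsedArgs(boolean dumpText, String url) {
            this.dumpText = dumpText;
            this.url = url;
        }
    }

    /** Treats every argument except the last as an option; the last one is the URL. */
    static ParsedArgs parseArgs(String[] args) {
        if (args.length == 0) {
            throw new IllegalArgumentException("Usage: [-dumpText] <url>");
        }
        boolean dumpText = false;
        for (int i = 0; i < args.length - 1; i++) {
            if ("-dumpText".equals(args[i])) {
                dumpText = true;
            } else {
                throw new IllegalArgumentException("Unknown option: " + args[i]);
            }
        }
        return new ParsedArgs(dumpText, args[args.length - 1]);
    }

    /** Builds the writer only when indexing was requested; otherwise returns null. */
    static StubIndexWriter createWriterIfNeeded(boolean doIndex, String serverUrl) {
        return doIndex ? new StubIndexWriter(serverUrl) : null;
    }

    public static void main(String[] argv) {
        ParsedArgs parsed = parseArgs(new String[] {"-dumpText", "http://www.nasa.gov/"});
        // doIndex is false here, so the missing server URL never triggers an exception.
        StubIndexWriter writer = createWriterIfNeeded(false, null);
        System.out.println(parsed.url + " dumpText=" + parsed.dumpText + " writer=" + writer);
    }
}
```

In this sketch, generic options such as -DdoIndex=true are assumed to be consumed before parseArgs sees the arguments, as Hadoop's ToolRunner does for Tool implementations. That is why only -dumpText and the URL are handled here.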
> Bug in indexchecker CLI parsing and invocation of index-solr plugin by default
> ------------------------------------------------------------------------------
>
> Key: NUTCH-1864
> URL: https://issues.apache.org/jira/browse/NUTCH-1864
> Project: Nutch
> Issue Type: Bug
> Components: indexer
> Affects Versions: 1.10
> Reporter: Lewis John McGibbney
> Fix For: 1.10
>
>
> I noticed that we have a bug in the indexchecker tool where
> * the command-line parsing is buggy: it expects the last argument
> (index args.length - 1) to be the URL, IIRC.
> * Even if indexer-solr is NOT activated, I get the following message:
> lmcgibbn@LMC-032857 /usr/local/trunk/runtime/local $ ./bin/nutch indexchecker -dumpText http://nasa.gov
> fetching: http://nasa.gov
> Exception in thread "main" java.lang.RuntimeException: Missing SOLR URL. Should be set via -D solr.server.url
> SOLRIndexWriter
> solr.server.url : URL of the SOLR instance (mandatory)
> solr.commit.size : buffer size when sending to SOLR (default 1000)
> solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
> solr.auth : use authentication (default false)
> solr.auth.username : username for authentication
> solr.auth.password : password for authentication
> at org.apache.nutch.indexwriter.solr.SolrIndexWriter.setConf(SolrIndexWriter.java:192)
> at org.apache.nutch.plugin.Extension.getExtensionInstance(Extension.java:159)
> at org.apache.nutch.indexer.IndexWriters.<init>(IndexWriters.java:57)
> at org.apache.nutch.indexer.IndexingFiltersChecker.run(IndexingFiltersChecker.java:98)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at org.apache.nutch.indexer.IndexingFiltersChecker.main(IndexingFiltersChecker.java:178)
> These issues should be rectified as this is an extremely useful tool which is
> broken right now.