[ https://issues.apache.org/jira/browse/NUTCH-2700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16792805#comment-16792805 ]
ASF GitHub Bot commented on NUTCH-2700:
---------------------------------------
sebastian-nagel commented on pull request #446: NUTCH-2700 Indexchecker:
improve command-line help
URL: https://github.com/apache/nutch/pull/446
... and add the option `-doIndex` to pass the "checked" document to the index writers
(the property `doIndex` is kept to ensure backward compatibility):
```
% bin/nutch indexchecker
Usage:
  IndexingFiltersChecker [OPTIONS] <url>
    Fetch single URL and index it
  IndexingFiltersChecker [OPTIONS] -stdin
    Read URLs to be indexed from stdin
  IndexingFiltersChecker [OPTIONS] -listen <port> [-keepClientCnxOpen]
    Listen on <port> for URLs to be indexed
Options:
  -D<property>=<value>  set/overwrite Nutch/Hadoop properties
                        (a generic Hadoop option to be passed
                        before other command-specific options)
  -normalize            normalize URLs
  -followRedirects      follow redirects when fetching URL
  -dumpText             show the entire plain-text content,
                        not only the first 100 characters
  -doIndex              pass document to configured index writers
                        and let them index it
  -md <key>=<value>     metadata added to CrawlDatum before parsing
```
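To illustrate how the new `-doIndex` flag and the legacy `doIndex` property can coexist, here is a minimal sketch of an argument parser for the options listed above. It is not the code from the pull request: the class name `IndexCheckerArgsSketch`, its fields, and the rule that exactly one of `<url>`, `-stdin`, or `-listen <port>` must be given are assumptions made for illustration, and a plain `java.util.Properties` object stands in for the Hadoop configuration that `-D<property>=<value>` would populate.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch; not Nutch's actual IndexingFiltersChecker code.
class IndexCheckerArgsSketch {
  boolean normalize, followRedirects, dumpText, doIndex, stdin, keepClientCnxOpen;
  int listenPort = -1;  // -1 means "-listen" was not given
  String url;           // null unless a URL argument was given
  final Map<String, String> metadata = new LinkedHashMap<>();

  // conf stands in for properties already set via -D<property>=<value>.
  static IndexCheckerArgsSketch parse(String[] args, Properties conf) {
    IndexCheckerArgsSketch a = new IndexCheckerArgsSketch();
    // Backward compatibility: the property doIndex still enables indexing.
    a.doIndex = Boolean.parseBoolean(conf.getProperty("doIndex", "false"));
    for (int i = 0; i < args.length; i++) {
      String arg = args[i];
      switch (arg) {
        case "-normalize": a.normalize = true; break;
        case "-followRedirects": a.followRedirects = true; break;
        case "-dumpText": a.dumpText = true; break;
        case "-doIndex": a.doIndex = true; break;
        case "-stdin": a.stdin = true; break;
        case "-keepClientCnxOpen": a.keepClientCnxOpen = true; break;
        case "-listen":
          a.listenPort = Integer.parseInt(value(args, ++i, arg));
          break;
        case "-md": {
          String kv = value(args, ++i, arg);
          int eq = kv.indexOf('=');
          if (eq <= 0) {
            throw new IllegalArgumentException("-md expects <key>=<value>: " + kv);
          }
          a.metadata.put(kv.substring(0, eq), kv.substring(eq + 1));
          break;
        }
        default:
          if (arg.startsWith("-")) {
            throw new IllegalArgumentException("Unknown option: " + arg);
          }
          if (a.url != null) {
            throw new IllegalArgumentException("Only one URL argument is accepted");
          }
          a.url = arg;
      }
    }
    // Assumption: the three usage forms are mutually exclusive.
    int modes = (a.url != null ? 1 : 0) + (a.stdin ? 1 : 0) + (a.listenPort >= 0 ? 1 : 0);
    if (modes != 1) {
      throw new IllegalArgumentException("Give exactly one of <url>, -stdin, -listen <port>");
    }
    return a;
  }

  // Returns the value following an option, or fails if it is missing.
  private static String value(String[] args, int i, String option) {
    if (i >= args.length) {
      throw new IllegalArgumentException("Missing value for " + option);
    }
    return args[i];
  }

  public static void main(String[] args) {
    IndexCheckerArgsSketch a = parse(args, System.getProperties());
    System.out.println("url=" + a.url + " doIndex=" + a.doIndex + " md=" + a.metadata);
  }
}
```

In this sketch the property is read first and the flag can only switch indexing on, so older invocations that rely on `-DdoIndex=true` behave the same as ones using `-doIndex`.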
> Indexchecker: improve command-line help
> ---------------------------------------
>
> Key: NUTCH-2700
> URL: https://issues.apache.org/jira/browse/NUTCH-2700
> Project: Nutch
> Issue Type: Improvement
> Components: indexer
> Affects Versions: 1.15
> Reporter: Sebastian Nagel
> Priority: Minor
> Fix For: 1.16
>
>
> The command-line help of the indexchecker tool is incomplete:
> {noformat}
> Usage: IndexingFiltersChecker [-normalize] [-followRedirects] [-dumpText]
> [-md key=value] (-stdin | -listen <port> [-keepClientCnxOpen])
> {noformat}
> It does not
> - show that the URL can be passed as an argument
> - mention the property {{-DdoIndex=true}}, which makes it send the document to
> the index writers
> It should follow the help shown by parsechecker.