ASF GitHub Bot commented on NUTCH-2700:

sebastian-nagel commented on pull request #446: NUTCH-2700 Indexchecker: 
improve command-line help
URL: https://github.com/apache/nutch/pull/446
   ... and add the option `-doIndex` to pass the "checked" document to the index writers 
(the property `doIndex` is kept to ensure backward compatibility):
   % bin/nutch indexchecker
     IndexingFiltersChecker [OPTIONS] <url>
       Fetch single URL and index it
     IndexingFiltersChecker [OPTIONS] -stdin
       Read URLs to be indexed from stdin
     IndexingFiltersChecker [OPTIONS] -listen <port> [-keepClientCnxOpen]
       Listen on <port> for URLs to be indexed
     -D<property>=<value>  set/overwrite Nutch/Hadoop properties
                           (a generic Hadoop option to be passed
                            before other command-specific options)
     -normalize            normalize URLs
     -followRedirects      follow redirects when fetching URL
     -dumpText             show the entire plain-text content,
                           not only the first 100 characters
     -doIndex              pass document to configured index writers
                           and let them index it
     -md <key>=<value>     metadata added to CrawlDatum before parsing
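
The backward-compatibility note above (new `-doIndex` flag, legacy `doIndex` property still honoured) can be illustrated with a minimal sketch. This is not the actual `IndexingFiltersChecker` code: the class `IndexCheckerArgs`, its fields, and the plain `Map` standing in for the Nutch/Hadoop configuration are all assumptions made for illustration, and the `-stdin` / `-listen` modes are left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch of option handling for the help text above.
 * Names and structure are illustrative, not taken from Nutch.
 */
class IndexCheckerArgs {
  boolean normalize;
  boolean followRedirects;
  boolean dumpText;
  boolean doIndex;
  final Map<String, String> metadata = new HashMap<>();
  final List<String> urls = new ArrayList<>();

  /**
   * Parses command-specific options. `props` stands in for properties
   * set via -D<property>=<value> (an assumption; the real tool reads
   * them from its Hadoop Configuration).
   */
  static IndexCheckerArgs parse(String[] args, Map<String, String> props) {
    IndexCheckerArgs opts = new IndexCheckerArgs();
    // Legacy path: the property doIndex=true still enables indexing.
    opts.doIndex = Boolean.parseBoolean(props.getOrDefault("doIndex", "false"));

    for (int i = 0; i < args.length; i++) {
      String arg = args[i];
      switch (arg) {
        case "-normalize":
          opts.normalize = true;
          break;
        case "-followRedirects":
          opts.followRedirects = true;
          break;
        case "-dumpText":
          opts.dumpText = true;
          break;
        case "-doIndex":
          // New path: the command-line flag enables indexing as well.
          opts.doIndex = true;
          break;
        case "-md": {
          if (i + 1 >= args.length) {
            throw new IllegalArgumentException("-md requires <key>=<value>");
          }
          String kv = args[++i];
          int eq = kv.indexOf('=');
          if (eq < 1) {
            throw new IllegalArgumentException("Invalid metadata: " + kv);
          }
          opts.metadata.put(kv.substring(0, eq), kv.substring(eq + 1));
          break;
        }
        default:
          if (arg.startsWith("-")) {
            throw new IllegalArgumentException("Unknown option: " + arg);
          }
          // Anything that is not an option is treated as a URL argument.
          opts.urls.add(arg);
      }
    }
    return opts;
  }
}
```

In this sketch the flag and the property are OR-ed together, so existing scripts that pass `-DdoIndex=true` keep working while new ones can use `-doIndex`.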

> Indexchecker: improve command-line help
> ---------------------------------------
>                 Key: NUTCH-2700
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2700
>             Project: Nutch
>          Issue Type: Improvement
>          Components: indexer
>    Affects Versions: 1.15
>            Reporter: Sebastian Nagel
>            Priority: Minor
>             Fix For: 1.16
> The command-line help of the indexchecker tool is incomplete:
> {noformat}
> Usage: IndexingFiltersChecker [-normalize] [-followRedirects] [-dumpText] 
> [-md key=value] (-stdin | -listen <port> [-keepClientCnxOpen])
> {noformat}
> It does not
> - show that the URL can be passed as an argument
> - mention the property {{-DdoIndex=true}}, which makes it send the document to 
> the index writers
> It should follow the help shown by parsechecker.
