[ https://issues.apache.org/jira/browse/NUTCH-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Markus Jelsma updated NUTCH-2194:
---------------------------------
    Attachment: NUTCH-2194.patch

Updated patch. Signature is now also added to CrawlDatum, in case an indexing filter wants to do something with it. The checker now more closely resembles IndexerMapReduce.

> Run IndexingFilterChecker as simple Telnet server
> -------------------------------------------------
>
>                 Key: NUTCH-2194
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2194
>             Project: Nutch
>          Issue Type: New Feature
>            Reporter: Markus Jelsma
>            Assignee: Markus Jelsma
>            Priority: Minor
>             Fix For: 1.12
>
>         Attachments: NUTCH-2194.patch, NUTCH-2194.patch
>
>
> We have used a customized IndexingFilterChecker running as server to be able
> to quickly test/check pages from web applications. I'll add this feature back
> by letting IndexingFilterChecker run optionally as a simple server.
> Run it with:
> {code}
> export NUTCH_HEAPSIZE=25 ; bin/nutch indexchecker -normalize -dumpText -followRedirects -listen 1234
> {code}
> Then perform a request over TCP:
> {code}
> echo "http://apache.org/" | nc localhost 1234
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
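
The single request shown in the issue description could be wrapped in a small script to check several URLs in a row. This is only a sketch: the `check_url` helper is hypothetical and not part of Nutch. It assumes `nc` is installed, that the checker was started with `-listen 1234` as above, and that the server reads one newline-terminated URL per connection and closes the connection after replying. The issue does not state that last point.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch): send one URL to a checker that was
# started with "-listen <port>" and print whatever the server sends back.
# Assumption: the server handles one newline-terminated URL per connection.
check_url() {
  url=$1
  host=${2:-localhost}   # default host, as in the issue's example
  port=${3:-1234}        # default port, as in the issue's example
  printf '%s\n' "$url" | nc "$host" "$port"
}

# Check every URL given on the command line, one connection per URL.
for u in "$@"; do
  echo "== $u"
  check_url "$u"
done
```

Saved under a name of your choice (say `check_urls.sh`), it could be run as `sh check_urls.sh http://apache.org/ http://nutch.apache.org/`. Depending on the `nc` variant, an extra option may be needed so that `nc` exits once its input is exhausted.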