-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9119/
-----------------------------------------------------------

Review request for nutch and Julien Le Dem.


Description
-------

Will contain the patch for the SegmentContentDumperTool described in NUTCH-1526:

./bin/nutch org.apache.nutch.tools.SegmentContentDumper [options]
   -segmentRootDir full file path to the root segment directory, e.g., crawl/segments
   -regexUrlPattern a regex URL pattern to select URL keys to dump from the content DB in each segment
   -outputDir The output directory to write file names to.
   -metadata --key=value where key is a Content Metadata key and value is a value to check.
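
The selection rule implied by -regexUrlPattern and -metadata can be sketched as below. This is only an illustration, not the code in the patch: the class name DumpFilterSketch, the method shouldDump, the use of a plain Map for the content metadata, and the choice of find() (match anywhere in the URL) rather than a full-string match are all assumptions made for the example.

```java
import java.util.Collections;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch of the URL/metadata selection described by the options above.
// None of these names come from the NUTCH-1526 patch.
final class DumpFilterSketch {
    private final Pattern urlPattern;
    private final String metaKey;   // null when no -metadata option was given
    private final String metaValue;

    DumpFilterSketch(String regexUrlPattern, String metadataArg) {
        this.urlPattern = Pattern.compile(regexUrlPattern);
        if (metadataArg == null) {
            this.metaKey = null;
            this.metaValue = null;
            return;
        }
        // Assumed argument form, taken from the usage text: --key=value
        String arg = metadataArg.startsWith("--") ? metadataArg.substring(2) : metadataArg;
        int eq = arg.indexOf('=');
        if (eq <= 0) {
            throw new IllegalArgumentException("expected --key=value, got: " + metadataArg);
        }
        this.metaKey = arg.substring(0, eq);
        this.metaValue = arg.substring(eq + 1);
    }

    // True when the URL key matches the pattern and, if a metadata check was
    // requested, the record's metadata holds exactly the expected value.
    boolean shouldDump(String url, Map<String, String> metadata) {
        if (!urlPattern.matcher(url).find()) {
            return false;
        }
        if (metaKey == null) {
            return true;
        }
        return metaValue.equals(metadata.get(metaKey));
    }

    public static void main(String[] args) {
        DumpFilterSketch filter =
            new DumpFilterSketch("\\.pdf$", "--Content-Type=application/pdf");
        Map<String, String> meta =
            Collections.singletonMap("Content-Type", "application/pdf");
        System.out.println(filter.shouldDump("http://example.com/a.pdf", meta));
        System.out.println(filter.shouldDump("http://example.com/a.html", meta));
    }
}
```

In the real tool the URL keys and metadata would come from the content DB of each segment under -segmentRootDir; the sketch leaves that part out and only shows the per-record decision.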


This addresses bug NUTCH-1526.
    https://issues.apache.org/jira/browse/NUTCH-1526


Diffs
-----


Diff: https://reviews.apache.org/r/9119/diff/


Testing
-------

Testing it on DARPA XDATA XNET.


Thanks,

Chris Mattmann
