The following page has been changed by JerryRussell:
http://wiki.apache.org/nutch/bin/nutch_mergesegs

The comment on the change is:
fixed classpath to org.apache

------------------------------------------------------------------------------
- mergesegs is an alias for net.nutch.tools.!SegmentMergeTool
+ mergesegs is an alias for org.apache.nutch.tools.!SegmentMergeTool
  
  This class cleans up accumulated segment data and merges it into a single 
(or, optionally, multiple) segment(s) containing no duplicates.
  
  The only prerequisite for correct operation is a set of already fetched 
segments (they do not have to contain parsed content; only fetcher output is 
required). This tool does not use DeleteDuplicates, but creates its own 
"master" index of all pages in all segments. It then walks sequentially 
through this index and picks only the most recent version of each page for 
every unique value of url or hash.
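
The selection step described above can be sketched roughly as follows. This is
only an illustration, not the actual SegmentMergeTool code: the PageEntry
class, its fields, and the keepMostRecent method are hypothetical names, and
the sketch assumes that "most recent" means the largest fetch time and that a
page is dropped when either its url or its hash was already seen on a newer
page.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class SegmentDedupSketch {

    /** Hypothetical stand-in for one fetched page in the "master" index. */
    static final class PageEntry {
        final String url;
        final String contentHash;
        final long fetchTime;   // assumed: larger value means more recent
        final String segment;   // name of the segment the page came from

        PageEntry(String url, String contentHash, long fetchTime, String segment) {
            this.url = url;
            this.contentHash = contentHash;
            this.fetchTime = fetchTime;
            this.segment = segment;
        }
    }

    /**
     * Walks the index from newest to oldest and keeps a page only if neither
     * its url nor its content hash has already been kept.
     */
    static List<PageEntry> keepMostRecent(List<PageEntry> masterIndex) {
        List<PageEntry> sorted = new ArrayList<>(masterIndex);
        sorted.sort(Comparator.comparingLong((PageEntry e) -> e.fetchTime).reversed());

        Set<String> seenUrls = new HashSet<>();
        Set<String> seenHashes = new HashSet<>();
        List<PageEntry> kept = new ArrayList<>();
        for (PageEntry e : sorted) {
            if (seenUrls.contains(e.url) || seenHashes.contains(e.contentHash)) {
                continue; // a newer page with the same url or hash was kept
            }
            seenUrls.add(e.url);
            seenHashes.add(e.contentHash);
            kept.add(e);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<PageEntry> index = new ArrayList<>();
        index.add(new PageEntry("http://a/", "h1", 100L, "seg1"));
        index.add(new PageEntry("http://a/", "h2", 200L, "seg2"));
        for (PageEntry e : keepMostRecent(index)) {
            System.out.println(e.url + " from " + e.segment);
        }
    }
}
```

Sorting newest first means a single sequential pass is enough: the first page
seen for a given url or hash is by construction the one to keep.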
  
- If some of the input segments are corrupted, this tool will attempt to repair 
them, using net.nutch.segment.!SegmentReader.fixSegment(!NutchFileSystem, File, 
boolean, boolean, boolean, boolean) method.
+ If some of the input segments are corrupted, this tool will attempt to repair 
them, using 
org.apache.nutch.segment.!SegmentReader.fixSegment(!NutchFileSystem, File, 
boolean, boolean, boolean, boolean) method.
  
  The output segment can optionally be split on the fly into several segments 
of fixed length.
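
As a rough illustration of splitting on the fly, the sketch below starts a new
output segment each time the current one reaches a fixed size. It is not the
tool's own code: the class and method names are hypothetical, and it assumes
that the fixed length is a maximum number of entries per output segment, with
the last segment possibly shorter.

```java
import java.util.ArrayList;
import java.util.List;

final class SegmentSplitSketch {

    /**
     * Distributes entries, in order, over consecutive output segments holding
     * at most maxPerSegment entries each (assumed meaning of "fixed length").
     */
    static <T> List<List<T>> split(Iterable<T> entries, int maxPerSegment) {
        if (maxPerSegment <= 0) {
            throw new IllegalArgumentException("maxPerSegment must be positive");
        }
        List<List<T>> segments = new ArrayList<>();
        List<T> current = null;
        for (T entry : entries) {
            // Open a new output segment when none exists or the current one is full.
            if (current == null || current.size() == maxPerSegment) {
                current = new ArrayList<>();
                segments.add(current);
            }
            current.add(entry);
        }
        return segments;
    }

    public static void main(String[] args) {
        List<String> pages = List.of("p1", "p2", "p3", "p4", "p5");
        List<List<String>> out = split(pages, 2);
        for (int i = 0; i < out.size(); i++) {
            System.out.println("segment " + i + ": " + out.get(i));
        }
    }
}
```

Because a new segment is opened only when an entry actually arrives, an empty
input produces no output segments at all.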
  
@@ -16, +16 @@

  
  Instead of following the manual procedures, you may want to run 
SegmentMergeTool with all options turned on, i.e. to merge the segments into 
the output segment(s), index them, and then delete the original segment data.
  
- Usage: bin/nutch net.nutch.tools.!SegmentMergeTool (-local | -nfs ...) [[BR]]
+ Usage: bin/nutch org.apache.nutch.tools.!SegmentMergeTool (-local | -nfs ...) 
[[BR]]
  (-dir <input_segments_dir> | seg1 seg2 ...) [[BR]]
  [-o <output_segments_dir>] [-max count] [-i] [-ds] [[BR]]
  -dir <input_segments_dir> "path to directory containing input segments" [[BR]]
