Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The following page has been changed by JeffRitchie:
http://wiki.apache.org/nutch/nutch-0%2e8-dev/bin/nutch_segread

------------------------------------------------------------------------------
  = "segread" is an alias for "org.apache.nutch.segment.SegmentReader" =
  
- == Reads or Exports a Segments Data ==
+ == Reads and Exports a Segment's Data ==
  
  === Usage ===
   nutch-0.8-dev/bin/nutch org.apache.nutch.segment.!SegmentReader <segment>
@@ -19, +19 @@

   None.
  
  === Caveats and Notes ===
+  Creates a directory called segdump inside <segment>.  Several files are
created in that directory: a dump file called ''dump'' and several other
files, ''part-00000'' to ''part-00006''.  The dump file contains some readable
information about the fetched pages and their parsed content.  I believe that
the dump file is all of the part files consolidated together.  Do not 'cat'
this file in a terminal, because it contains some binary data that can corrupt
your terminal.
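
One way to look at the dump file without sending raw binary bytes to the
terminal is to filter it first. The script below is only a minimal sketch and
is not part of Nutch: the helper name printable_view and the script itself are
hypothetical, and the only thing taken from the note above is that the file
mixes readable text with binary data.

```python
import sys


def printable_view(data: bytes) -> str:
    """Replace every byte outside printable ASCII (except tab and newline) with '.'."""
    return "".join(
        chr(b) if 32 <= b < 127 or b in (9, 10) else "."
        for b in data
    )


def main(path: str) -> None:
    # Open in binary mode so non-text bytes never raise decoding errors.
    with open(path, "rb") as f:
        # Read in fixed-size chunks so a large dump is not loaded at once.
        for chunk in iter(lambda: f.read(65536), b""):
            sys.stdout.write(printable_view(chunk))


if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

Saved under a hypothetical name such as safe_view.py, it would be run as
"python safe_view.py <segment>/segdump/dump", with the output piped to a pager
if the dump is long.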
  
  DevelopmentCommandLineOptions
  
