Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The following page has been changed by JeffRitchie:
http://wiki.apache.org/nutch/nutch-0%2e8-dev/bin/nutch_segread

The comment on the change is:
number of part files = number of tasktrackers.... I'm silly.

------------------------------------------------------------------------------
   None.
  
  === Caveats and Notes ===
-  Creates a directory in <segment> called segdump.  Within that directory a 
number of files are created.  A dump file called ''dump'' and several other 
files ''part-00000'' to ''part-00006''.  The dump file contains some readable 
information about the pages fetched and their parsed information.  I beleive 
that the dump file is all the part files consolidated together.  Do not 'cat' 
this if in a term as it does contain some binary data that will corrupt your 
terminal.
+  Creates a directory in <segment> called segdump.  Within that directory a 
number of files are created: a dump file called ''dump'' and several other 
files prefixed ''part-''.  The dump file contains some readable information 
about the pages fetched and their parsed information.  The part files are 
consolidated together to form the dump file and can be deleted.  Do not 'cat' 
these files in a terminal, as they contain some binary data that will corrupt 
your terminal.
  
  DevelopmentCommandLineOptions
  
