[Lucene-hadoop Wiki] Trivial Update of "SequenceFile" by Arun C Murthy

Apache Wiki Wed, 16 Aug 2006 03:17:30 -0700

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for 
change notification.


The following page has been changed by Arun C Murthy:
http://wiki.apache.org/lucene-hadoop/SequenceFile

------------------------------------------------------------------------------
  
  Essentially there are 3 different file formats for !SequenceFiles depending 
on whether ''compression'' and ''block compression'' are active.
  
- 
+ [[BR]]
- However any of the above formats share a common ''header'' (which is used by 
the !SequenceFile.Reader to return the appropriate key/value pairs). The next 
section summarises the header:
+ However all of the above formats share a common ''header'' (which is used by 
the !SequenceFile.Reader to return the appropriate key/value pairs). The next 
section summarises the header:
+ [[Anchor(SeqFileHeader)]]
- [[Anchor(SeqFileHeader)]]===== SequenceFile Common Header =====
+ ===== SequenceFile Common Header =====
   * version - A byte array: SEQ<version no.>
   * keyClassName - String
   * valueClassName - String
@@ -30, +31 @@

   * blockCompression -  A boolean which specifies if ''block compression'' is 
turned on for keys/values in this file.
   * sync - A sync marker to denote end of the header.
  
- 
+ [[BR]]
  The formats for Uncompressed/!RecordCompressed Writers are very similar:
  ===== Uncompressed/RecordCompressed Writer Format =====
   * [#SeqFileHeader Header]
@@ -38, +39 @@

     * Key
     * (Compressed?) Value
   * A sync-marker every 100 bytes or so to help in seeking to a random point in 
the file and then seeking to the next ''record''.
- <br>
  
+ [[BR]]
  The format for the !BlockCompressedWriter is as follows:
  ===== BlockCompressed Writer Format =====
   * [#SeqFileHeader Header]
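
The header's version field and the role of the sync marker described above can be illustrated with a minimal sketch. It is not the Hadoop implementation: it assumes the version field is the three bytes `SEQ` followed by a single version byte, and it treats the sync marker as an opaque byte string found by a plain search. The names `read_version`, `next_sync` and the `SYNC` value are hypothetical, chosen only for this example; a real reader takes the marker from the file header and may frame it differently.

```python
import io

MAGIC = b"SEQ"  # leading bytes of the version field, per the header list

# Hypothetical sync marker, for illustration only.
# A real reader would take the marker from the file header.
SYNC = b"\x00SYNCMARK\x00"


def read_version(stream):
    """Return the version number from the 4-byte version field.

    Assumes the field is b"SEQ" followed by one version byte.
    """
    field = stream.read(4)
    if len(field) != 4 or field[:3] != MAGIC:
        raise ValueError("not a SequenceFile: missing SEQ magic")
    return field[3]


def next_sync(data, marker, offset):
    """Return the index of the first sync marker at or after offset.

    Returns None when no marker follows the offset.
    """
    pos = data.find(marker, offset)
    return None if pos == -1 else pos


# Made-up header prefix and record area, built in memory.
header = io.BytesIO(b"SEQ\x03" + b"rest-of-header")
print(read_version(header))  # 3

records = b"rec1" + SYNC + b"rec2" + SYNC + b"rec3"
print(next_sync(records, SYNC, 5))  # 18
```

Landing at an arbitrary offset (5 here, inside the first marker) and scanning forward to the next marker is what lets a reader resynchronise on a record boundary without parsing the file from the start.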
