Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.
The following page has been changed by Arun C Murthy:
http://wiki.apache.org/lucene-hadoop/SequenceFile

------------------------------------------------------------------------------
  Essentially there are 3 different file formats for !SequenceFiles depending on whether ''compression'' and ''block compression'' are active.
-
+ [[BR]]
- However any of the above formats share a common ''header'' (which is used by the !SequenceFile.Reader to return the appropriate key/value pairs). The next section summarises the header:
+ However all of the above formats share a common ''header'' (which is used by the !SequenceFile.Reader to return the appropriate key/value pairs). The next section summarises the header:
+ [[Anchor(SeqFileHeader)]]
- [[Anchor(SeqFileHeader)]]===== SequenceFile Common Header =====
+ ===== SequenceFile Common Header =====
   * version - A byte array: SEQ<version no.>
   * keyClassName - String
   * valueClassName - String
@@ -30, +31 @@
   * blockCompression - A boolean which specifies if ''block compression'' is turned on for keys/values in this file.
   * sync - A sync marker to denote end of the header.
-
+ [[BR]]
  The formats for Uncompressed/!RecordCompressed Writers are very similar:
  ===== Uncompressed/RecordCompressed Writer Format =====
   * [#SeqFileHeader Header]
@@ -38, +39 @@
     * Key
     * (Compressed?) Value
   * A sync-marker every 100bytes or so to help in seeking to a random point in the file and then seeking to next ''record''.
- <br>
+ [[BR]]
  The format for the !BlockCompressedWriter is as follows:
  ===== BlockCompressed Writer Format =====
   * [#SeqFileHeader Header]
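
The ''version'' field of the common header listed above can be checked with a few lines of code. The following is only a minimal sketch, not the !SequenceFile.Reader implementation: it assumes the field is exactly four bytes (the ASCII bytes SEQ followed by a single version byte), and the names `MAGIC` and `read_version` are hypothetical helpers introduced for illustration. The remaining header fields are not parsed, because their encoding is not shown in this change.

```python
import io

# Assumption: the version field is the three ASCII bytes "SEQ"
# followed by exactly one byte holding the version number.
MAGIC = b"SEQ"


def read_version(stream):
    """Read the version field from the start of a binary stream.

    Returns the version number as an int, or raises ValueError if the
    stream does not begin with the expected SEQ bytes.
    """
    field = stream.read(4)
    if len(field) != 4 or field[:3] != MAGIC:
        raise ValueError("missing SEQ bytes at start of header")
    # Indexing a bytes object yields the byte value as an int.
    return field[3]


# Usage with an in-memory stand-in for a file; the version byte 6 and
# the trailing bytes are made-up sample data.
sample = io.BytesIO(b"SEQ\x06" + b"remaining header bytes")
version = read_version(sample)
```

Checking the leading bytes first lets a reader reject a non-!SequenceFile input before attempting to interpret keyClassName or any later header field.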