[ https://issues.apache.org/jira/browse/HADOOP-2424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Douglas resolved HADOOP-2424.
-----------------------------------

    Resolution: Won't Fix

With the right header, the existing LzoCodec should be compatible with lzop. It would probably be better implemented as an InputFormat/OutputFormat, anyway.

> lzop compatible CompressionCodec
> --------------------------------
>
>                 Key: HADOOP-2424
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2424
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: io, native
>            Reporter: Chris Douglas
>
> LzoCodec currently outputs at most {{io.compression.codec.lzo.buffersize}}
> (default 64k)- less the compression overhead- bytes per write (HADOOP-2402)
> in the following format:
> {noformat}
> [uncompressed block length(32)]
> [compressed block length(32)]
> [compressed block]
> {noformat}
> lzop (lzo-backed command-line utility) writes blocks in the following format:
> {noformat}
> [uncompressed block length(32)]
> [compressed block length (32)]
> [Adler-32|CRC-32 checksum of uncompressed block (32)]
> [Adler-32|CRC-32 checksum of compressed block (32)]
> [compressed block]
> {noformat}
> There's an additional ~32 byte header to the file. I don't know of a
> standard, but the lzop source should suffice.
> Since we're using ".lzo" as the default extension, it's worth considering
> being compatible with lzop, but not necessarily for all lzo-compressed
> blocks. For example, SequenceFiles should use the existing LzoCodec format.
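
To make the difference between the two block layouts concrete, here is a minimal Python sketch of the framing only. It is an illustration under stated assumptions, not the LzoCodec or lzop implementation: the function names are hypothetical, the 32-bit fields are assumed to be big-endian, the compressed payload is passed in as opaque bytes rather than produced by an LZO library, and the ~32 byte lzop file header is omitted entirely. The lzop source remains the reference for the real format.

```python
import struct
import zlib


def frame_lzocodec_block(uncompressed_len: int, compressed: bytes) -> bytes:
    """Frame one block in the LzoCodec layout described in the issue.

    Layout: [uncompressed length (32)] [compressed length (32)] [compressed block]
    Assumption: both length fields are big-endian unsigned 32-bit integers.
    """
    return struct.pack(">II", uncompressed_len, len(compressed)) + compressed


def frame_lzop_block(uncompressed: bytes, compressed: bytes,
                     use_crc32: bool = False) -> bytes:
    """Frame one block in the lzop layout described in the issue.

    Layout: two 32-bit lengths, then a checksum of the uncompressed data,
    then a checksum of the compressed data, then the compressed block.
    Assumption: fields are big-endian, and one checksum type (Adler-32 or
    CRC-32) is used for both checksum fields of a block.
    """
    checksum = zlib.crc32 if use_crc32 else zlib.adler32
    header = struct.pack(
        ">IIII",
        len(uncompressed),
        len(compressed),
        checksum(uncompressed) & 0xFFFFFFFF,
        checksum(compressed) & 0xFFFFFFFF,
    )
    return header + compressed


# Stand-in data: `packed` is a placeholder, not real LZO output.
raw = b"hello hello hello hello"
packed = b"\x00placeholder-for-lzo-output"

codec_block = frame_lzocodec_block(len(raw), packed)
lzop_block = frame_lzop_block(raw, packed)
```

In this sketch the two framings differ only by the two extra 32-bit checksum fields, so an lzop-style block carries 8 more bytes of per-block overhead than an LzoCodec-style block with the same payload.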