Re: Snappy question related to last

Harsh J Sun, 15 Apr 2012 23:40:05 -0700

This depends on the container you're using. SequenceFiles with Snappy
can be detected easily, since the header of such a file carries the codec
class used, and readers instantiate the right codec to decompress
with.

However, since Snappy is just a compression codec and does not provide
a container format
(http://code.google.com/p/snappy/issues/detail?id=34), there is
currently no way to "detect" whether a file/stream is Snappy-encoded,
unless the full stream is available to test with (via python's
snappy.isValidCompressed, say).
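
To illustrate why detection is not possible from the first few bytes: a
raw Snappy block has no magic number; it simply begins with the
uncompressed length as a little-endian varint. The sketch below (the
function name is hypothetical, standard library only) decodes that
preamble and shows that ordinary text yields an equally plausible
length.

```python
def read_snappy_preamble(data):
    """Decode the varint at the start of a raw Snappy block.

    Returns (uncompressed_length, bytes_consumed).
    """
    result = 0
    for i, byte in enumerate(data[:5]):  # lengths fit in at most 5 bytes
        result |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:              # high bit clear marks the last byte
            return result, i + 1
    raise ValueError("truncated or oversized varint")


# A raw Snappy block for b"hello": length 5, then one 5-byte literal.
compressed = b"\x05\x10hello"
plain = b"plain text"

print(read_snappy_preamble(compressed))  # (5, 1)
print(read_snappy_preamble(plain))       # (112, 1), 'p' parses as a length
```

Both inputs parse cleanly, so only an attempt to decode the whole
stream can tell them apart.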

If you're using Snappy today, it's best used at the map intermediate
level, and within other container formats such as Hadoop
SequenceFiles and Avro data files.

On Sun, Apr 15, 2012 at 6:02 PM, JAX <jayunit...@gmail.com> wrote:
> Hi guys : related to the last snappy question - how does Hadoop detect Snappy 
> compression in the input dataset ( how does Hadoop
> Know when to decompress records via snappy ).
>
> Jay Vyas
> MMSB
> UCHC

-- 
Harsh J
