Oh, that really helps. My bad, I didn't read that clearly. In fact, I'm already reading .gz files. My concern was whether the job will run efficiently without unzipping the .gz files first, since decompressing them might itself take a while for my input size.
I have around 20K input files, each ~250KB, already in .gz format. Also, I'm not storing them in HDFS but reading directly from the local file system. To get the processing split across multiple files, should I decompress them and recompress with a Snappy utility before running the job, or just run them directly as .gz input?
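For what it's worth, here is a minimal sketch of the bulk-decompression half of that "decompress, then recompress" step, using only the Python standard library. The directory names and the `gunzip_all` helper are made up for illustration; the actual Snappy recompression would need an external tool (e.g. `snzip`, or letting Hadoop's SnappyCodec write the output), which is why it's only noted in a comment rather than implemented:

```python
import gzip
import shutil
from pathlib import Path

def gunzip_all(src_dir: str, dst_dir: str) -> int:
    """Decompress every .gz file in src_dir into dst_dir; return the count.

    The plain files produced here would then be recompressed with a
    Snappy tool (e.g. snzip, or Hadoop's SnappyCodec on write) -- that
    step needs a library outside the stdlib, so it is not shown.
    """
    out = Path(dst_dir)
    out.mkdir(parents=True, exist_ok=True)
    count = 0
    for gz_path in Path(src_dir).glob("*.gz"):
        target = out / gz_path.stem  # drop the trailing .gz suffix
        # Stream-copy so a file never has to fit in memory at once
        with gzip.open(gz_path, "rb") as fin, open(target, "wb") as fout:
            shutil.copyfileobj(fin, fout)
        count += 1
    return count
```

Note that with files of only ~250KB each (well under one HDFS block), each file is already its own split regardless of codec, so the recompression may matter less than consolidating the many small files.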
