Have you seen this thread? http://stackoverflow.com/questions/24402737/how-to-read-gz-files-in-spark-using-wholetextfiles
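
In case it helps, here is a minimal PySpark sketch of one possible workaround for gzip files that lack a ".gz" suffix: read each file as raw bytes with binaryFiles() and decompress it yourself, rather than relying on the extension to select a codec. The helper names gunzip_text and whole_gzip_files and the S3 path in the usage note are hypothetical, and I am assuming the files are UTF-8 text. I have not run this against your data.

```python
import gzip


def gunzip_text(raw, encoding="utf-8"):
    """Decompress gzip-compressed bytes and decode them to a string.

    The encoding is an assumption; change it if the files are not UTF-8.
    """
    return gzip.decompress(raw).decode(encoding)


def whole_gzip_files(sc, path):
    """Return an RDD of (file path, decompressed text) pairs.

    This mirrors the shape of wholeTextFiles(), but reads each file as
    bytes via binaryFiles() so the file name extension does not matter.
    `sc` is an existing SparkContext.
    """
    return sc.binaryFiles(path).mapValues(gunzip_text)
```

Usage would look something like whole_gzip_files(sc, "s3n://your-bucket/some/prefix/*"), where the bucket and prefix are placeholders. As with wholeTextFiles(), each file is held in memory as a whole, so this only suits files that fit comfortably in an executor's memory.
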
On Tue, Feb 16, 2016 at 2:17 AM, Deepak Gopalakrishnan <dgk...@gmail.com> wrote:
> Hello,
>
> I'm reading S3 files using wholeTextFiles(). My files are in gzip format, but
> the names of the files do not end with ".gz". I cannot force the names
> of these files to end with ".gz". Is there a way to specify the
> InputFormat as Gzip when using wholeTextFiles()?
>
> --
> Regards,
> *Deepak Gopalakrishnan*
> *Mobile*: +918891509774
> *Skype*: deepakgk87
> http://myexps.blogspot.com