Yes, it does work with fewer gzipped files. I am reading the files with
sc.textFile() and a glob pattern.

For example:

a = sc.textFile('s3n://bucket/2014-??-??/*.gz')
a.count()
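
If a corrupt or mislabeled file is the suspect, one way to narrow it down is to check each file outside Spark. The following is only a minimal sketch, assuming the .gz files have been copied from S3 to local disk first. The names is_valid_gzip and find_bad_gzip_files are hypothetical helpers for illustration, not part of Spark.

```python
import gzip
import io
import zlib

# First two bytes of every gzip stream (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def is_valid_gzip(data):
    """Return True if `data` (bytes) decompresses cleanly as gzip."""
    # A file that merely has a .gz extension fails this cheap check.
    if data[:2] != GZIP_MAGIC:
        return False
    try:
        # Read the whole stream so truncation and CRC errors surface too.
        with gzip.GzipFile(fileobj=io.BytesIO(data)) as f:
            while f.read(1024 * 1024):
                pass
    except (OSError, EOFError, zlib.error):
        return False
    return True


def find_bad_gzip_files(paths):
    """Return the subset of local `paths` that are not valid gzip files."""
    bad = []
    for path in paths:
        with open(path, "rb") as fh:
            if not is_valid_gzip(fh.read()):
                bad.append(path)
    return bad
```

Running find_bad_gzip_files over a local copy of the matched files (for example, a list produced by glob.glob) would list any file that fails to decompress. This reads each file fully into memory, so it is only suitable for files that fit in RAM.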

Nick


On Tue, May 20, 2014 at 10:09 PM, Madhu <ma...@madhu.com> wrote:

> I have read gzip files from S3 successfully.
>
> It sounds like a file is corrupt or not a valid gzip file.
>
> Does it work with fewer gzip files?
> How are you reading the files?
>
>
>
>
> -----
> Madhu
> https://www.linkedin.com/in/msiddalingaiah
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/count-ing-gz-files-gives-java-io-IOException-incorrect-header-check-tp5768p6149.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
