textFile used to work with .gz files; I haven't tested it on .bz2 files. If
it isn't decompressing them by default, use sc.wholeTextFiles and then
decompress each record (each record being one whole file) with the
corresponding codec.
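
To make the "decompress each record" step concrete, here is a rough sketch of a
per-record helper. Treat everything in it as an assumption rather than a tested
Spark recipe: RecordDecompressor and its methods are hypothetical names, and it
uses the JDK's gzip streams as a stand-in because the JDK ships no bzip2
decoder. For .bz2 you would swap in a bzip2 stream from a library such as
Apache Commons Compress (BZip2CompressorInputStream). The helper takes raw
bytes, so it fits more naturally with sc.binaryFiles (calling toArray() on each
PortableDataStream) than with wholeTextFiles, which hands back already-decoded
strings. A .tar.bz2 would also need a tar-unpacking step after decompression,
which is not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of Spark. gzip stands in for bzip2 here.
class RecordDecompressor {

    // Compresses bytes with gzip. Only present so the sketch can be
    // exercised without a real compressed file on disk.
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Decompresses one record (the full contents of one file) and decodes
    // the result as UTF-8 text. Unchecked exceptions keep it usable inside
    // a lambda passed to an RDD transformation.
    static String decompressToString(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

In a job, something along the lines of
sc.binaryFiles(path).mapValues(s -> RecordDecompressor.decompressToString(s.toArray()))
should give one decompressed string per input file, which you can then split
into lines. Each file is loaded whole into memory, so this only suits files
that fit comfortably on an executor.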

Thanks
Best Regards

On Tue, Sep 8, 2015 at 6:49 PM, Chris Teoh <chris.t...@gmail.com> wrote:

> Hi Folks,
>
> I tried using Spark v1.2 on bz2 files in Java but the behaviour is
> different to the same textFile API call in Python and Scala.
>
> That being said, how do I read .tar.bz2 files with Spark's Java API?
>
> Thanks in advance
> Chris
>
