Hi!
Quick questions:
- Which SDK are you using?
- Is this batch or streaming?

As JB mentioned, TextIO is able to work with compressed files that contain
text. Nothing currently handles the double decompression that I believe
you're looking for.
TextIO for Java is also able to "watch" a directory for new files. If you're
able to (outside of your pipeline) decompress your first zip file into a
directory that your pipeline is watching, you may be able to use that as
a workaround. Would that work for you?
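
To make the workaround above concrete, here is a minimal sketch of the
"outside of your pipeline" step, using only the Python standard library.
It is not Beam code. The function name, the `.tar.gz` suffix filter, and
the idea of flattening inner archives into one watched directory are my
assumptions, not anything Beam provides.

```python
import tarfile
import zipfile
from pathlib import Path


def extract_inner_archives(outer_path, watch_dir, suffix=".tar.gz"):
    """Copy inner archives ending in `suffix` from the outer archive into watch_dir.

    Hypothetical helper: the outer file is treated as a zip if its name ends
    in ".zip", otherwise as a (possibly compressed) tar.
    """
    out = Path(watch_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    if str(outer_path).endswith(".zip"):
        with zipfile.ZipFile(outer_path) as zf:
            for info in zf.infolist():
                if info.is_dir() or not info.filename.endswith(suffix):
                    continue
                # Keep only the base name, so nested folders are flattened.
                target = out / Path(info.filename).name
                target.write_bytes(zf.read(info))
                written.append(target)
    else:
        with tarfile.open(outer_path, "r:*") as tf:
            for member in tf:
                if not member.isfile() or not member.name.endswith(suffix):
                    continue
                target = out / Path(member.name).name
                with tf.extractfile(member) as src:
                    target.write_bytes(src.read())
                written.append(target)
    return written
```

Note that flattening by base name means two inner archives with the same
file name in different folders would overwrite each other, and each inner
archive is read fully into memory before being written out.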
Finally, if you want to implement a transform that does all of your logic,
that sounds like SplittableDoFn (SDF) material. In that case, someone who
knows SDF better can give you guidance (or correct me if my suggestions
are wrong).
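
For reference, the per-archive logic such a transform would need to wrap
might look roughly like the sketch below. This is plain Python, not a
SplittableDoFn and not a Beam API. The function name is hypothetical, and
it assumes the outer file is a tar (not a zip), that the inner members are
UTF-8 text, and that everything fits in memory.

```python
import io
import tarfile


def iter_nested_lines(outer_bytes):
    """Yield (inner_archive, member, line) for each text line in nested tar.gz files."""
    with tarfile.open(fileobj=io.BytesIO(outer_bytes), mode="r:*") as outer:
        for entry in outer:
            # Skip directories and anything that is not an inner tar.gz.
            if not entry.isfile() or not entry.name.endswith(".tar.gz"):
                continue
            inner_bytes = outer.extractfile(entry).read()
            # Second decompression: open the inner archive from memory.
            with tarfile.open(fileobj=io.BytesIO(inner_bytes), mode="r:gz") as inner:
                for member in inner:
                    if not member.isfile():
                        continue
                    text = inner.extractfile(member).read().decode("utf-8")
                    for line in text.splitlines():
                        yield entry.name, member.name, line
```

A generator like this could be called from a regular DoFn per input file.
It does none of the splitting or checkpointing that would make SDF
worthwhile for very large archives.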
Best
-P.

On Thu, Mar 15, 2018, 8:09 PM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:

> Hi
>
> TextIO supports compressed files. Do you want to read the files as text?
>
> Can you describe the use case in a bit more detail?
>
> Thanks
> Regards
> JB
> On March 15, 2018, at 18:28, Shirish Jamthe <sjam...@google.com> wrote:
>>
>> Hi,
>>
>> My input is a tar.gz or .zip file which contains thousands of tar.gz
>> files and other files.
>> I would like to extract the tar.gz files from the outer archive.
>>
>> Is there a transform that can do that? I couldn't find one.
>> If not, is it in the works? Any pointers for starting work on it?
>>
>> thanks
>>
> --
Got feedback? go/pabloem-feedback
