Re: .tar.gz codec class implementation

Todd Lipcon Tue, 21 Jul 2009 11:20:42 -0700

Hi Andraz,

First, thanks for the contribution. Could you create a JIRA ticket and
upload the code there? Due to ASF restrictions, all contributions must be
attached to a JIRA so you can officially grant permission to include the
code. The JIRA will also allow others to review and comment on the code.


When you attach it to the JIRA, please format it as a -p0 patch against the
Common repository if you can. This page has further info:

http://wiki.apache.org/hadoop/HowToContribute

Thanks
-Todd

On Tue, Jul 21, 2009 at 2:12 AM, Andraz Tori <[email protected]> wrote:

> If it is useful to anyone:
> here's a codec to support getting data from .tar.gz
>
> Basically the assumption is that instead of having just one gzipped
> text file, you have many text files tarred and gzipped. The codec
> therefore just concatenates all the files inside the .tar.gz archive.
>
> The source is based on GzipCodec.java.
> It also depends on JavaTar from
> http://gjt.org/pkgdoc/com/ice/tar/index.html which has been released
> into the public domain.
>
> It passes the unit tests for codecs and we've successfully used it in
> processing around a hundred gigabytes of data.
>
>
> --
> Andraz Tori, CTO
> Zemanta Ltd, New York, London, Ljubljana
> www.zemanta.com
> mail: [email protected]
> tel: +386 41 515 767
> twitter: andraz, skype: minmax_test
>
>
>
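
The behaviour described in the quoted message (treat a .tar.gz as if it were
one gzipped text file by concatenating every file inside it) can be sketched
in a few lines. This is not the submitted codec, which is Java and builds on
GzipCodec.java and JavaTar. It is only a minimal illustration in Python using
the standard library, and the function name concat_tar_gz is hypothetical.

```python
import io
import shutil
import tarfile


def concat_tar_gz(source, sink):
    """Write the contents of every regular file in a .tar.gz stream to sink.

    `source` is a binary file-like object holding the gzipped tar archive;
    `sink` is a binary file-like object that receives the concatenated bytes.
    Both names are illustrative, not part of any Hadoop API.
    """
    # "r|gz" reads the archive as a forward-only stream, which is closer to
    # how a decompression codec sees its input than random access would be.
    with tarfile.open(fileobj=source, mode="r|gz") as archive:
        for member in archive:
            # Skip directories, links and other non-file entries.
            if not member.isfile():
                continue
            contents = archive.extractfile(member)
            shutil.copyfileobj(contents, sink)


if __name__ == "__main__":
    # Build a small archive in memory, then read it back concatenated.
    packed = io.BytesIO()
    with tarfile.open(fileobj=packed, mode="w:gz") as archive:
        for name, data in [("part-1.txt", b"first\n"), ("part-2.txt", b"second\n")]:
            info = tarfile.TarInfo(name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    packed.seek(0)

    joined = io.BytesIO()
    concat_tar_gz(packed, joined)
    print(joined.getvalue().decode())
```

Files are emitted in archive order with nothing inserted between them, so a
file that lacks a trailing newline runs straight into the next one. Whether
the submitted codec handles that case differently is not stated in the thread.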
