I retract the suggestion :).
How would we do testing/building for it in piggybank? Not include it in the
compile and test targets, and set up separate compile-rcstore and
test-rcstore targets?
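
For what it's worth, the separate targets could look something like this
(a hypothetical sketch only -- every target name, path, and classpath id
below is an assumption, not actual piggybank build.xml contents):

```xml
<!-- Hypothetical sketch: target names, paths, and classpath ids are
     assumptions, not actual piggybank build.xml contents. -->
<target name="compile-rcstore" depends="compile">
  <javac srcdir="src/java/org/apache/pig/piggybank/storage/hiverc"
         destdir="${build.classes}" debug="on">
    <!-- hive jars pulled separately (e.g. via ivy), not bundled
         into the piggybank jar itself -->
    <classpath refid="rcstore.classpath"/>
  </javac>
</target>

<target name="test-rcstore" depends="compile-rcstore">
  <junit printsummary="yes" haltonfailure="no">
    <classpath refid="rcstore.test.classpath"/>
    <batchtest todir="${test.log.dir}">
      <fileset dir="${test.classes}"
               includes="**/TestHiveColumnarLoader*"/>
    </batchtest>
  </junit>
</target>
```

That keeps the default compile/test targets free of the hive dependency
while still giving the new loader its own build and test entry points.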

-D


On Mon, Nov 30, 2009 at 6:31 PM, Olga Natkovich <ol...@yahoo-inc.com> wrote:

> +1 on what Alan is saying. I think it would be overkill to have
> another contrib for this.
>
> Olga
>
> -----Original Message-----
> From: Alan Gates [mailto:ga...@yahoo-inc.com]
> Sent: Monday, November 30, 2009 2:42 PM
> To: pig-dev@hadoop.apache.org
> Subject: Re: Pig reading hive columnar rc tables
>
>
> On Nov 30, 2009, at 12:18 PM, Dmitriy Ryaboy wrote:
>
> > That's awesome, I've been itching to do that but never got around to
> > it. Gerrit, do you have any benchmarks on read speeds?
> >
> > I don't know about putting this in piggybank, as it carries pretty
> > significant dependencies, increasing the size of the jar and making it
> > difficult for users who don't need it to build piggybank in the first
> > place. We might want to consider some other contrib for it -- maybe a
> > "misc" contrib that would have individual ant targets for these kinds
> > of compatibility submissions?
>
> Does it have to increase the size of the piggybank jar?  Instead of
> including hive in our piggybank jar, which I agree would be bad, can
> we just say that if you want to use this function you need to provide
> the appropriate hive jar yourself?  This way we could use ivy to pull
> the jars and build piggybank.
>
> I'm not really wild about creating a new section of contrib just for
> functions that have heavier weight requirements.
>
> Alan.
>
> >
> > -D
> >
> >
> > On Mon, Nov 30, 2009 at 3:09 PM, Olga Natkovich <ol...@yahoo-
> > inc.com> wrote:
> >
> >> Hi Gerrit,
> >>
> >> It would be great if you could contribute the code. The process is
> >> pretty simple:
> >>
> >> - Open a JIRA that describes what the loader does and that you would
> >> like to contribute it to the Piggybank.
> >> - Submit the patch that contains the loader. Make sure it has unit
> >> tests
> >> and javadoc.
> >>
> >> Once this is done, one of the committers will review and commit the
> >> patch.
> >>
> >> More details on how to contribute are in
> >> http://wiki.apache.org/pig/PiggyBank.
> >>
> >> Olga
> >>
> >> -----Original Message-----
> >> From: Gerrit van Vuuren [mailto:gvanvuu...@specificmedia.com]
> >> Sent: Friday, November 27, 2009 2:42 AM
> >> To: pig-dev@hadoop.apache.org
> >> Subject: Pig reading hive columnar rc tables
> >>
> >> Hi,
> >>
> >>
> >>
> >> I've coded a LoadFunc implementation that can read from Hive Columnar
> >> RC tables. This is needed for a project I'm working on because all our
> >> data is stored using the Hive thrift-serialized Columnar RC format. I
> >> looked at the piggybank but did not find any implementation that could
> >> do this. We've been running it on our cluster for the last week and
> >> have worked out most bugs.
> >>
> >>
> >>
> >> There are still some improvements I would like to make, such as
> >> setting the number of mappers based on date partitioning. It has been
> >> optimized to read only specific columns, and with this improvement it
> >> can churn through a data set almost 8 times faster because not all
> >> column data is read.
> >>
> >>
> >>
> >> I would like to contribute the class to the piggybank. Can you guide
> >> me through what I need to do?
> >>
> >> I've used Hive-specific classes to implement this. Is it possible to
> >> add these dependencies to the piggybank ivy build so they are
> >> downloaded automatically?
> >>
> >>
> >>
> >> Thanks,
> >>
> >> Gerrit Jansen van Vuuren
> >>
> >>
>
>