Seems like something to put on the TODO list if it isn't already there. I might
look at the bug list to see what is in store for the future :)

The good news is that I just changed my script to use merge, so I'll
benchmark again and see how much faster it is... probably significantly
faster :)
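
For anyone following along, here is a rough sketch of why a merge join can
be cheaper when both inputs are already sorted on the join key: each side is
walked once, with no extra sort or hash table. This is only an illustration
in Python, not Pig's actual implementation, and the function name
merge_join and the sample tuples are made up for the example.

```python
def merge_join(left, right):
    """Inner-join two lists of (key, value) tuples.

    Assumption: both lists are ALREADY sorted by key. Nothing here sorts
    or verifies the order; unsorted input would silently drop matches.
    """
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1          # left key too small, advance left
        elif lk > rk:
            j += 1          # right key too small, advance right
        else:
            # Find the run of rows on the right that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Pair every left row with this key against that run.
            while i < len(left) and left[i][0] == lk:
                for _, rv in right[j:j_end]:
                    out.append((lk, left[i][1], rv))
                i += 1
            j = j_end
    return out


# Hypothetical sample data, sorted by key on both sides.
a = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
b = [(2, "x"), (2, "y"), (3, "z"), (4, "w")]
joined = merge_join(a, b)
```

The catch is the same one as with USING 'merge': the speedup only holds if
the inputs really are sorted on the join key, so it is worth checking the
output against the regular join before trusting the benchmark.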

On Fri, Aug 19, 2011 at 11:12 PM, Ashutosh Chauhan <[email protected]> wrote:

> Hey Kevin,
>
> No, Pig currently doesn't auto-detect that the data was sorted in
> previous steps of the script, so you need to tell it with USING 'merge'.
>
> Hope it helps,
> Ashutosh
>
> On Fri, Aug 19, 2011 at 22:51, Kevin Burton <[email protected]> wrote:
>
> > I was reading about USING 'merge' with JOIN when relations are already
> > sorted.
> >
> > I actually was just looking through some code and realized that one of my
> > JOINs was on two relations that were *already* sorted due to a DISTINCT
> > and GROUP operation.
> >
> > I just added USING 'merge' and the initial results look the same.
> >
> > I haven't benchmarked it though.
> >
> > Does/would the existing optimizer be able to detect this and just use
> > merge without manual intervention?
> >
> > --
> >
> > Founder/CEO Spinn3r.com
> >
> > Location: *San Francisco, CA*
> > Skype: *burtonator*
> >
> > Skype-in: *(415) 871-0687*
> >
>



-- 

Founder/CEO Spinn3r.com

Location: *San Francisco, CA*
Skype: *burtonator*

Skype-in: *(415) 871-0687*
