Well, that's a great point! The only reason I'm creating one large 60k-line
Pig script (with over half a dozen people adding/modifying scripts) is that
when Pig compiles a script, it optimizes it to give the best performance.

I mean, otherwise we'd have to manually create optimized pieces of script,
each with a clearly defined interface in the form of an HDFS file. But even
then, I don't believe they would run as fast as a single Pig script that is
the concatenation of all of the scripts.

(btw, can somebody confirm this fact for me?)


That's why, when I actually run the script on a full set of data, it is a
huge piece of script after concatenation, macro expansion, var substitution,
etc., etc., etc., which makes it even more like assembly language. I mean,
the alias names are like


temp_count_group_by_group_b_sales_by_metro_month_1

And then I freak out when I see something like the 2998 error. My eyes glaze
over, and I'm like, okay, it's time to email the pig-user group... And to
answer the other question, I eventually got to the duplicate alias after
rebuilding everything yet again and redeploying. My current hypothesis is
that maybe an scp failed, I didn't see the error message, and one of our own
libraries was corrupt?

It might have been a UDF that we're using that crashed it...

If only I had the stack trace...
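
For the duplicate alias mentioned above, a quick pre-flight check over the
concatenated script might catch it before Pig does. What follows is only a
minimal sketch in Python, not anything Pig ships: the function name
find_duplicate_aliases is hypothetical, and it assumes every alias definition
starts a line in the form `name = ...`. It flags every reassignment, including
intentional ones, and ignores macros, multi-line statements, and block
comments.

```python
import re
from collections import defaultdict

# Assumption: an alias definition looks like `name = ...` at the start of a line.
# The negative lookahead keeps `==` comparisons from being read as assignments.
ALIAS_DEF = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=(?!=)")


def find_duplicate_aliases(script_text):
    """Return {alias: [line numbers]} for aliases assigned more than once."""
    seen = defaultdict(list)
    for lineno, line in enumerate(script_text.splitlines(), start=1):
        # Skip single-line Pig comments, which start with `--`.
        if line.lstrip().startswith("--"):
            continue
        match = ALIAS_DEF.match(line)
        if match:
            seen[match.group(1)].append(lineno)
    # Keep only the aliases that were defined on more than one line.
    return {alias: lines for alias, lines in seen.items() if len(lines) > 1}
```

Running it on the fully expanded script (after concatenation and variable
substitution) gives line numbers that can be matched back to whichever
source script contributed the clashing alias.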




On Thu, Sep 30, 2010 at 2:02 PM, Dmitriy Ryaboy <dvrya...@gmail.com> wrote:

> Yeah, sure, as soon as we all quit our respective jobs and start a support
> company :)
>
> (on a more serious note, improving the error messages is a big item for Pig
> 0.9)
>
> (also, good god, 60k line Pig script. Look into workflow management tools
> that allow you to avoid creating such monoliths).
>
> On Wed, Sep 29, 2010 at 8:18 PM, hc busy <hc.b...@gmail.com> wrote:
>
> > "null" was the error. This 60k PigLatin script that I'm running hasn't
> > changed that much, but suddenly started erroring out. I've rebuilt pig
> > release 7 from scratch, checked java version, err... checked PiggyBank
> > and our own libraries, nothing there changed.
> >
> > You know, some commercial software that has professional support will
> > actually send emails when an "unhandled error" occurs. The email is
> > received by the developer/support and diagnosed. And that email would
> > contain all the gory details that the product doesn't want to display to
> > the user.
> >
> >
> > I wonder if you guys are up to doing something like that for pig?
> >
> >
> >
> > On Wed, Sep 29, 2010 at 8:13 PM, Jeff Zhang <zjf...@gmail.com> wrote:
> >
> > > No other stack trace ? And in what situation does this happen ?
> > >
> > >
> > >
> > > On Thu, Sep 30, 2010 at 11:09 AM, hc busy <hc.b...@gmail.com> wrote:
> > > > Guys, I'm seeing this one
> > > >
> > > > 2998
> > > >
> > > > Unexpected internal error.
> > > >
> > > >
> > > > Can we be more specific or dump a stack trace when this happens?
> > > >
> > >
> > >
> > >
> > > --
> > > Best Regards
> > >
> > > Jeff Zhang
> > >
> >
>
