Re: does anyone have idea on how to run multiple sequential jobs with bash script

Ted Dunning Wed, 11 Jun 2008 17:07:47 -0700

Pig is much more ambitious than Cascading.  Because of those ambitions,
simple things got overlooked.  For instance, something as simple as computing
a file name to load is not possible in Pig, nor is it possible to write
functions in Pig.  You can hook into Java functions (for some things), but
you can't really write programs in Pig.  On the other hand, Pig may
eventually provide really powerful capabilities, including program rewriting
and optimization, that would be very hard to write directly in Java.

The point of Cascading was simply to make life easier for a normal
Java/map-reduce programmer.  It provides an abstraction for gluing together
several map-reduce programs and for doing a few common things like joins.
Because you are still writing Java (or Groovy) code, you have all of the
functionality you always had.  But that same benefit has a cost: because your
logic lives in ordinary code, it limits which optimizations are ever likely
to be possible.
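
To make the "gluing together" idea concrete, here is a minimal sketch in plain
Java.  It is not Cascading's API and does not call Hadoop; JobChain, then,
runAll and dailyInput are hypothetical names made up for illustration, and
each step is a stand-in for a real map-reduce job.  It only shows the shape:
steps run in order, each one reads what the previous one wrote, and the input
file name is computed in ordinary code.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical helper, not part of Cascading or Hadoop: runs steps in order,
// handing each step the path that the previous step produced.
class JobChain {
    private final List<String> names = new ArrayList<>();
    private final List<UnaryOperator<String>> steps = new ArrayList<>();

    // Append a step; a step takes an input path and returns its output path.
    JobChain then(String name, UnaryOperator<String> step) {
        names.add(name);
        steps.add(step);
        return this;
    }

    // Run every step sequentially; an exception in any step stops the chain.
    String runAll(String inputPath) {
        String path = inputPath;
        for (int i = 0; i < steps.size(); i++) {
            System.out.println("running " + names.get(i) + " on " + path);
            path = steps.get(i).apply(path);
        }
        return path;
    }

    // The input name is computed in ordinary code, here from a date.
    static String dailyInput(LocalDate day) {
        return "/logs/" + day; // LocalDate prints as yyyy-MM-dd
    }

    public static void main(String[] args) {
        String result = new JobChain()
            .then("count", in -> in + "/counts") // stand-in for a real job
            .then("join", in -> in + "/joined")  // stand-in for a real job
            .runAll(dailyInput(LocalDate.of(2008, 6, 11)));
        System.out.println(result);
    }
}
```

Since the chain is just Java, anything else you need (loops, conditionals,
computed paths) is available, but nothing here can be rewritten or optimized
by a framework that only sees opaque steps.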

The summary for us (especially 4-6 months ago when we were deciding) is that
Cascading is good enough to use now and Pig will probably be more useful
later.

On Wed, Jun 11, 2008 at 4:19 PM, Haijun Cao <[EMAIL PROTECTED]> wrote:

> I find cascading very similar to pig, do you care to provide your comment
> here? If map reduce programmers are to go to the next level (scripting/query
> language), which way to go?
