Not that I am aware of; perhaps one of the others can chime in.

What are you trying to unblock though? The only action that I see happening
after your exec statement immediately requires out1 to have been generated.

Do you also have some D that you generate independently of out1, but
derive from B? If yes, put B = LOAD 'data2'; D = FILTER B (or whatever)
before the exec.
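
A rough sketch of that reordering, assuming a hypothetical relation D and
output path 'out3' (neither appears in your script) and a made-up filter
condition. B is loaded again after the exec, as in your original script:

```
STORE A INTO 'out1';
B = LOAD 'data2';
D = FILTER B BY $0 IS NOT NULL;   -- hypothetical filter, does not need out1
STORE D INTO 'out3';              -- hypothetical output path
EXEC;                             -- triggers execution of everything above
B = LOAD 'data2';                 -- loaded again for the part that needs out1
C = FOREACH B GENERATE MYUDF($0,'out1');
STORE C INTO 'out2';
```

That way only the MYUDF part should have to wait for out1.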

Better yet -- avoid such dependencies :-).

-D

On Tue, Feb 16, 2010 at 1:21 AM, prasenjit mukherjee <
[email protected]> wrote:

> Still confusing.  So the entire execution ( from B = LOAD .....
> onwards ) will be blocked till 'out1' is stored ?
>
> > STORE A INTO 'out1';
> > EXEC;
> > B = LOAD 'data2';
> > C = FOREACH B GENERATE MYUDF($0,'out1');
> > STORE C INTO 'out2';
>
> Any way to restrict it to a particular block ?
>
> -Prasen
>
>
> On Tue, Feb 16, 2010 at 2:35 PM, Dmitriy Ryaboy <[email protected]>
> wrote:
> > It's been in the docs since 0.3
> >
> > http://hadoop.apache.org/pig/docs/r0.3.0/piglatin.html
> > Implicit Dependencies
> >
> > If a script has dependencies on the execution order outside of what Pig
> > knows about, execution may fail. For instance, in this script MYUDF might
> > try to read from out1, a file that A was just stored into. However, Pig does
> > not know that MYUDF depends on the out1 file and might submit the jobs
> > producing the out2 and out1 files at the same time.
> >
> > ...
> > STORE A INTO 'out1';
> > B = LOAD 'data2';
> > C = FOREACH B GENERATE MYUDF($0,'out1');
> > STORE C INTO 'out2';
> >
> > To make the script work (to ensure that the right execution order is
> > enforced) add the exec statement. The exec statement will trigger the
> > execution of the statements that produce the out1 file.
> >
> > ...
> > STORE A INTO 'out1';
> > EXEC;
> > B = LOAD 'data2';
> > C = FOREACH B GENERATE MYUDF($0,'out1');
> > STORE C INTO 'out2';
> >
> >
> >
> > On Tue, Feb 16, 2010 at 12:46 AM, Mridul Muralidharan <[email protected]>
> > wrote:
> >>
> >> Is this documented behavior or current impl detail ?
> >> A lot of scripts broke when multi-query optimization was committed to trunk
> >> because of the implicit ordering assumption (based on STORE) in earlier Pig
> >> - which was, iirc, documented.
> >>
> >>
> >> Regards,
> >> Mridul
> >>
> >> On Thursday 11 February 2010 10:52 PM, Dmitriy Ryaboy wrote:
> >>>
> >>> EXEC will trigger execution of the code that precedes it.
> >>>
> >>>
> >>>
> >>> On Thu, Feb 11, 2010 at 9:12 AM, prasenjit mukherjee
> >>> <[email protected]>  wrote:
> >>>>
> >>>> Is there any way I can have a pig statement wait for a condition? This
> >>>> is what I am trying to do: I am first creating and storing a
> >>>> relation in pig, and then I want to upload that relation via the
> >>>> STREAM/DEFINE command. Here is the pig script I am trying to write:
> >>>>
> >>>> .........
> >>>> STORE r1 INTO 'myoutput.data'
> >>>> STREAM 'myfile_containing_output_dat.txt' THROUGH `upload.py`
> >>>>
> >>>> Any way I can achieve this?
> >>>>
> >>>> -Prasen
> >>>>
> >>
> >>
> >
>