Not that I am aware of; perhaps one of the others can chime in. What are you trying to unblock, though? The only action that I see happening after your EXEC statement immediately requires out1 to have been generated.
Do you also have some D that you generate independently of out1, but derive from B? If yes, put B = LOAD 'data2'; D = FILTER B (or whatever) before the EXEC. Better yet -- avoid such dependencies :-).

-D

On Tue, Feb 16, 2010 at 1:21 AM, prasenjit mukherjee <[email protected]> wrote:
> Still confusing. So the entire execution (from B = LOAD ..... onwards)
> will be blocked till 'out1' is stored?
>
> > STORE A INTO 'out1';
> > EXEC;
> > B = LOAD 'data2';
> > C = FOREACH B GENERATE MYUDF($0,'out1');
> > STORE C INTO 'out2';
>
> Any way to restrict it to a particular block?
>
> -Prasen
>
>
> On Tue, Feb 16, 2010 at 2:35 PM, Dmitriy Ryaboy <[email protected]> wrote:
> > It's been in the docs since 0.3
> >
> > http://hadoop.apache.org/pig/docs/r0.3.0/piglatin.html
> > Implicit Dependencies
> >
> > If a script has dependencies on the execution order outside of what Pig
> > knows about, execution may fail. For instance, in this script MYUDF might
> > try to read from out1, a file that A was just stored into. However, Pig does
> > not know that MYUDF depends on the out1 file and might submit the jobs
> > producing the out2 and out1 files at the same time.
> >
> > ...
> > STORE A INTO 'out1';
> > B = LOAD 'data2';
> > C = FOREACH B GENERATE MYUDF($0,'out1');
> > STORE C INTO 'out2';
> >
> > To make the script work (to ensure that the right execution order is
> > enforced) add the exec statement. The exec statement will trigger the
> > execution of the statements that produce the out1 file.
> >
> > ...
> > STORE A INTO 'out1';
> > EXEC;
> > B = LOAD 'data2';
> > C = FOREACH B GENERATE MYUDF($0,'out1');
> > STORE C INTO 'out2';
> >
> >
> >
> > On Tue, Feb 16, 2010 at 12:46 AM, Mridul Muralidharan <[email protected]>
> > wrote:
> >>
> >> Is this documented behavior or current impl detail ?
> >> A lot of scripts broke when multi-query optimization was committed to trunk
> >> because of the implicit ordering assumption (based on STORE) in earlier pig
> >> - which was, iirc, documented.
> >>
> >>
> >> Regards,
> >> Mridul
> >>
> >> On Thursday 11 February 2010 10:52 PM, Dmitriy Ryaboy wrote:
> >>>
> >>> EXEC will trigger execution of the code that precedes it.
> >>>
> >>>
> >>>
> >>> On Thu, Feb 11, 2010 at 9:12 AM, prasenjit mukherjee
> >>> <[email protected]> wrote:
> >>>>
> >>>> Is there any way I can have a pig statement wait for a condition? This
> >>>> is what I am trying to do: I am first creating and storing a
> >>>> relation in pig, and then I want to upload that relation via
> >>>> STREAM/DEFINE command. Here is the pig script I am trying to write:
> >>>>
> >>>> .........
> >>>> STORE r1 INTO 'myoutput.data'
> >>>> STREAM 'myfile_containing_output_dat.txt' THROUGH `upload.py`
> >>>>
> >>>> Any way I can achieve this?
> >>>>
> >>>> -Prasen
> >>>>
> >>
> >>
> >
>
