Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change 
notification.

The following page has been changed by GuntherHagleitner:
http://wiki.apache.org/pig/PigMultiQueryPerformanceSpecification

------------------------------------------------------------------------------
  
  attachment:mapreduce.png
  
+ [[Anchor(Store_load_bridge)]]
+ === Store-load sequences ===
+ 
+ If a script stores to a file and later loads from that same file, some special processing takes place
+ to ensure that the jobs are executed in the right sequence.
+ 
+ ==== Reversible LoadStoreFunc ====
+ 
+ If the store and load are processed using the same function and the LoadStoreFunc is reversible,
+ the store is processed, but the load is removed from the plan. Instead, the parent of the store is
+ used as input for the dependent processing nodes.
+ 
+ The script:
+ 
+ {{{
+ A = load 'page_views';
+ store A into 'tmp1' using PigStorage();
+ B = load 'tmp1' using PigStorage();
+ C = filter B by $0 is not null;
+ store C into 'tmp2';
+ }}}
+ 
+ Will result in the following logical plan:
+ 
+ attachment:load-store-rev.png
+ 
+ If, on the other hand, different load and store functions are used or the function is not reversible,
+ the store and load will be connected in the logical plan, which eventually results in two jobs running
+ in sequence.
+ 
+ The script:
+ 
+ {{{
+ A = load 'page_views';
+ store A into 'tmp1' using PigStorage();
+ B = load 'tmp1' using BinStorage();
+ C = filter B by $0 is not null;
+ store C into 'tmp2';
+ }}}
+ 
+ Will result in the following logical plan:
+ 
+ attachment:load-store-non.png
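
The two cases above can be sketched as a small plan rewrite. This is only an illustrative model in Python, not Pig's actual implementation: the `Node` class, the `REVERSIBLE` set, and the `bridge` and `build_script` helpers are all hypothetical names, and the sketch assumes each store appears before the matching load in the script.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)  # compare nodes by identity, like objects in a plan graph
class Node:
    name: str
    kind: str                 # "load", "store", or "op"
    path: str = ""
    func: str = ""
    inputs: list = field(default_factory=list)

# Assumption for this sketch: which functions count as reversible.
REVERSIBLE = {"PigStorage"}

def bridge(nodes):
    """Rewrite every load that reads a path written by an earlier store."""
    stores = {n.path: n for n in nodes if n.kind == "store"}
    removed = set()
    for load in [n for n in nodes if n.kind == "load"]:
        store = stores.get(load.path)
        if store is None:
            continue  # ordinary load, nothing to bridge
        if store.func == load.func and load.func in REVERSIBLE:
            # Reversible case: drop the load and feed its consumers
            # directly from the parent of the store.
            parent = store.inputs[0]
            for n in nodes:
                n.inputs = [parent if i is load else i for i in n.inputs]
            removed.add(load)
        else:
            # Non-reversible case: connect load to store so the job that
            # writes the file runs before the job that reads it.
            load.inputs = [store]
    return [n for n in nodes if n not in removed]

def build_script(load_func):
    """Model of the example scripts; only the function used for B varies."""
    a = Node("A", "load", "page_views", "PigStorage")
    s1 = Node("store tmp1", "store", "tmp1", "PigStorage", [a])
    b = Node("B", "load", "tmp1", load_func)
    c = Node("C", "op", inputs=[b])
    s2 = Node("store tmp2", "store", "tmp2", "PigStorage", [c])
    return [a, s1, b, c, s2]
```

With `build_script("PigStorage")` the load `B` disappears and `C` reads from `A`; with `build_script("BinStorage")` the load `B` stays and gains the store of `tmp1` as its input.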
+ 
+ [[Anchor(File_commands)]]
+ === File commands ===
+ 
+ Commands like rm, rmf, mv, copyToLocal and copy will trigger execution of all the stores that
+ were defined before the command. This is done to make sure that the targets of these
+ commands exist by the time the commands run.
+ 
+ For instance:
+ 
+ {{{
+ A = load 'foo';
+ store A into 'bar';
+ mv bar baz;
+ rm foo;
+ A = load 'baz';
+ store A into 'foo';
+ }}}
+ 
+ Will result in a job that produces bar; then the mv and rm are executed. Finally, another job
+ is run that generates foo.
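
The ordering described above can be modeled as a queue of pending stores that is flushed whenever a file command arrives. This is a minimal Python sketch under that assumption, not Pig's code: the `Session` class and its method names are hypothetical, and a "job" is reduced to a log entry.

```python
class Session:
    """Toy model: stores are deferred, file commands force them to run."""

    def __init__(self):
        self.pending = []  # stores defined but not yet executed
        self.log = []      # what actually ran, in order

    def store(self, alias, path):
        # A store is only recorded; it does not run immediately.
        self.pending.append((alias, path))

    def flush(self):
        # Run all pending stores together as one job.
        if self.pending:
            targets = ", ".join(path for _, path in self.pending)
            self.log.append(f"job -> {targets}")
            self.pending.clear()

    def file_command(self, command):
        # rm, mv, etc. first trigger every store defined before them.
        self.flush()
        self.log.append(command)

def run_example():
    s = Session()
    s.store("A", "bar")           # store A into 'bar';
    s.file_command("mv bar baz")  # forces the job that produces bar
    s.file_command("rm foo")      # nothing pending, runs directly
    s.store("A", "foo")           # store A into 'foo';
    s.flush()                     # end of script runs the remaining store
    return s.log
```

Calling `run_example()` yields the sequence from the example: one job for bar, the mv, the rm, and a final job for foo.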
+ 
  [[Anchor(Phases)]]
  == Phases ==
  
