The following page has been changed by OlgaN:
http://wiki.apache.org/pig/PigStreamingFunctionalSpec

------------------------------------------------------------------------------
  
  ==== 5.1 Load/Stream and Stream/Store optimizations ====
  
- In cases where the STREAM operator immediately follows the LOAD or 
directly precedes the STORE operator, and given that they have the '''same''' 
LoadFunc/StoreFunc specifications, Pig will try to optimize away the 
interpretation of data in the LoadFunc/StoreFunc (i.e. the need to break up raw 
input into ''Tuples'') by replacing the equivalent {Load|Store}Funcs with 
!BinaryStorage. For the LOAD/STREAM case, the caveat is that this is feasible 
only when individual tasks process all of the data in a given input 
file (i.e. the split by 'file' option is specified to the LOAD operator).
+ In cases where the STREAM operator immediately follows the LOAD or 
directly precedes the STORE operator, and given that they have the '''same''' 
LoadFunc/StoreFunc specifications, Pig will try to optimize away the 
interpretation of data in the LoadFunc/StoreFunc (i.e. the need to break up raw 
input into ''Tuples'') by replacing the equivalent {Load|Store}Funcs with 
!BinaryStorage. For the LOAD/STREAM case, the caveat is that this is feasible 
only when individual tasks process all of the data in a given input 
file (i.e. the split by 'file' option is specified to the LOAD operator). *As 
a result, for the optimization to take place, `split by 'file'` must be 
specified on the load statement.*
  
  E.g.
  Pig will optimize: