Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change 
notification.

The following page has been changed by OlgaN:
http://wiki.apache.org/pig/PigUserCookbook

------------------------------------------------------------------------------
  dump C; 
  }}}
  
- In pig 1.x, DISTINCT is just GROUP BY/PROJECT under the hood. In pig 0.2.0 it is not, and it is much faster and more efficient (depending on your key cardinality, up to 20x faster in pig team's tests). Therefore, the use of DISTINCT is recommended over GROUP BY - GENERATE.
+ In pig 0.1.x, DISTINCT is just GROUP BY/PROJECT under the hood. In pig 0.2.0 it is not, and it is much faster and more efficient (depending on your key cardinality, up to 20x faster in pig team's tests). Therefore, the use of DISTINCT is recommended over GROUP BY - GENERATE.
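
For illustration only (not part of the page change): a minimal sketch of the two forms the paragraph compares. The input file 'visits' and the field names user and url are hypothetical, and the relation names are chosen only for this example.

{{{
-- Hypothetical input: one record per visit, with fields user and url.
A = load 'visits' as (user, url);
B = foreach A generate user;

-- GROUP BY - GENERATE form: group on the key, then project the group key.
C1 = group B by user;
D1 = foreach C1 generate group;

-- DISTINCT form: yields the same set of users and is the form
-- the paragraph recommends for pig 0.2.0.
D2 = distinct B;
dump D2;
}}}

Both D1 and D2 should hold one tuple per unique user; only the DISTINCT form benefits from the speedup described above.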
  
