[ 
https://issues.apache.org/jira/browse/PIG-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591053#action_12591053
 ] 

Alan Gates commented on PIG-161:
--------------------------------

Shubham wrote - "I am assuming that the MapReduce compiler would convert it to 
a Cogroup and a Foreach statement for MapReduce job."

Alan - I don't think that's what we want.  Distinct can be done much more 
efficiently if we use the combiner, since each map task can then drop its own 
duplicates before any data is shuffled to the reducers.  So

a = load ...
b = distinct a;

should translate to:

map - identity
combiner - apply unique function
group all
reducer - apply unique function
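
As a rough illustration of the plan above (not Pig's actual operators or 
MapReduce compiler output), the stages can be simulated in a few lines of 
Python.  The function names, the in-memory "splits", and the sample records 
are all assumptions made up for this sketch.

```python
from itertools import chain


def identity_map(split):
    """Map phase: pass every record through unchanged."""
    return list(split)


def unique(records):
    """The 'unique function': drop duplicates, keeping first-seen order."""
    seen = set()
    out = []
    for rec in records:
        if rec not in seen:
            seen.add(rec)
            out.append(rec)
    return out


def distinct(splits):
    """Simulate map -> combiner -> shuffle -> reducer for DISTINCT.

    Returns the distinct records and the number of records that crossed
    the (simulated) shuffle after the combiner ran.
    """
    # Map plus combiner run once per input split (hypothetical map tasks).
    combined = [unique(identity_map(split)) for split in splits]
    # Shuffle: gather the combiner output of every map task in one place.
    shuffled = list(chain.from_iterable(combined))
    # Reducer: apply the same unique function to remove cross-split duplicates.
    return unique(shuffled), len(shuffled)


# Two hypothetical input splits holding six records in total.
splits = [
    [("a", 1), ("a", 1), ("b", 2)],
    [("b", 2), ("c", 3), ("c", 3)],
]
result, shuffled_count = distinct(splits)
print(result)
print(shuffled_count)
```

In this toy input the combiner cuts the shuffled data from six records to 
four, and the reducer removes the one duplicate that spans both splits.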

Alan.


> Rework physical plan
> --------------------
>
>                 Key: PIG-161
>                 URL: https://issues.apache.org/jira/browse/PIG-161
>             Project: Pig
>          Issue Type: Sub-task
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>         Attachments: arithmeticOperators.patch, incr2.patch, incr3.patch, 
> incr4.patch, incr5.patch, MRCompilerTests_PlansAndOutputs.txt, 
> Phy_AbsClass.patch, podistinct.patch, pogenerate.patch, pogenerate.patch, 
> pogenerate.patch, posort.patch
>
>
> This bug tracks work to rework all of the physical operators as described in 
> http://wiki.apache.org/pig/PigTypesFunctionalSpec
