[ 
https://issues.apache.org/jira/browse/CRUNCH-525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14555729#comment-14555729
 ] 

Gabriel Reid commented on CRUNCH-525:
-------------------------------------

The changes to CompositeMapFn and ExtractKeyFn look good to me, but I don't 
fully get the logic for the PairMapFn scale factor (max of the key and value 
scale factors). 

I assume it's impossible to do something that is really correct for this 
calculation, but my first guess would be something like the mean of the key 
and value MapFn scale factors (which is probably even less correct). I also 
realize that I'm totally bike-shedding by bringing this up. :-)
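
To make the comparison concrete, here is a minimal sketch of the two options 
in plain Java. It does not use the Crunch API or reflect the actual patch; the 
class name, method names, and input values are all hypothetical and only there 
to show how the two combinations differ.

```java
// Hypothetical illustration only; not code from Crunch or from the patch.
class PairScaleFactorSketch {

    // Option as I understand it from the patch: the larger of the two factors.
    static float maxCombine(float keyScale, float valueScale) {
        return Math.max(keyScale, valueScale);
    }

    // Alternative floated above: the arithmetic mean of the two factors.
    static float meanCombine(float keyScale, float valueScale) {
        return (keyScale + valueScale) / 2.0f;
    }

    public static void main(String[] args) {
        // Made-up inputs: the key fn keeps the size, the value fn triples it.
        float keyScale = 1.0f;
        float valueScale = 3.0f;
        System.out.println("max  = " + maxCombine(keyScale, valueScale));
        System.out.println("mean = " + meanCombine(keyScale, valueScale));
    }
}
```

With these made-up inputs the max gives 3.0 and the mean gives 2.0, so the max 
is the more conservative (larger) size estimate whenever the two factors 
differ.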

> The ExtractKeyFn has an incorrect scale factor
> ----------------------------------------------
>
>                 Key: CRUNCH-525
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-525
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.12.0
>            Reporter: Stephen Patel
>            Assignee: Josh Wills
>            Priority: Minor
>         Attachments: CRUNCH-525.patch
>
>
> The ExtractKeyFn[0] used by the by[1] method of the PCollectionImpl is using 
> the default scale factor for a MapFn (1.0).  It should be using 1.0 + the 
> scale factor of the wrapped MapFn, in order to be accurate.
> [0]: 
> https://github.com/apache/crunch/blob/master/crunch-core/src/main/java/org/apache/crunch/fn/ExtractKeyFn.java
> [1]: 
> https://github.com/apache/crunch/blob/master/crunch-core/src/main/java/org/apache/crunch/impl/dist/collect/PCollectionImpl.java#L270



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
