[ https://issues.apache.org/jira/browse/CRUNCH-525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14555729#comment-14555729 ]
Gabriel Reid commented on CRUNCH-525:
-------------------------------------

The changes to CompositeMapFn and ExtractKeyFn look good to me, but I don't fully follow the logic for the PairMapFn scale factor (the max of the key and value scale factors). I assume it's impossible to do something really correct for this calculation, but my first guess would be something like the mean of the key and value MapFn scale factors (which is probably even less correct). I also realize that I'm totally bike-shedding by bringing this up. :-)

> The ExtractKeyFn is has an incorrect scale factor
> -------------------------------------------------
>
> Key: CRUNCH-525
> URL: https://issues.apache.org/jira/browse/CRUNCH-525
> Project: Crunch
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.12.0
> Reporter: Stephen Patel
> Assignee: Josh Wills
> Priority: Minor
> Attachments: CRUNCH-525.patch
>
>
> The ExtractKeyFn[0] used by the by[1] method of the PCollectionImpl is using
> the default scale factor for a MapFn (1.0). It should be using 1.0 + the
> scale factor of the wrapped MapFn, in order to be accurate.
> [0]: https://github.com/apache/crunch/blob/master/crunch-core/src/main/java/org/apache/crunch/fn/ExtractKeyFn.java
> [1]: https://github.com/apache/crunch/blob/master/crunch-core/src/main/java/org/apache/crunch/impl/dist/collect/PCollectionImpl.java#L270
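
For reference, the three estimates mentioned in this thread can be compared with a small standalone Java sketch. It deliberately does not use any Crunch classes: the class name and method names below are hypothetical, and the sketch only illustrates the arithmetic (1.0 + the wrapped scale factor for ExtractKeyFn, and max versus mean for PairMapFn), not what the attached patch actually does.

```java
// Hypothetical, standalone illustration; not Crunch code and not the patch.
final class ScaleFactorSketch {

    // Estimate proposed in the issue description for ExtractKeyFn:
    // the output keeps the original value (1.0) and adds the extracted key,
    // whose relative size is the wrapped MapFn's scale factor.
    static float extractKeyScale(float wrappedScale) {
        return 1.0f + wrappedScale;
    }

    // PairMapFn estimate questioned in the comment: max of key and value factors.
    static float pairScaleMax(float keyScale, float valueScale) {
        return Math.max(keyScale, valueScale);
    }

    // Alternative floated in the comment: mean of key and value factors.
    static float pairScaleMean(float keyScale, float valueScale) {
        return (keyScale + valueScale) / 2.0f;
    }

    public static void main(String[] args) {
        float keyScale = 0.5f;   // assumed example: key fn halves its input
        float valueScale = 2.0f; // assumed example: value fn doubles its input

        System.out.println("ExtractKeyFn (wrapped = 1.0): " + extractKeyScale(1.0f));
        System.out.println("PairMapFn max:  " + pairScaleMax(keyScale, valueScale));
        System.out.println("PairMapFn mean: " + pairScaleMean(keyScale, valueScale));
    }
}
```

With these assumed inputs the max estimate is the more conservative (larger) of the two, which may matter if the planner uses the scale factor to size downstream work; neither is exact, as the comment notes.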