[ 
https://issues.apache.org/jira/browse/CRUNCH-71?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabriel Reid updated CRUNCH-71:
-------------------------------

    Attachment: CRUNCH-71.patch

Patch to resolve the issue attached.

I'm not 100% happy with this solution: I would prefer the PType to be 
supplied to the DoFn at runtime, rather than making the DoFn responsible for 
calling PType#initialize.

However, that approach could add a lot of extra work to job setup, as there 
is no 1-to-1 relationship between DoFns and PTypes.

As object reuse issues are pretty isolated in MR contexts (joins are the main 
place where I see them occurring), this fix feels OK to me for now. Any 
objections to this patch?
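
For anyone who wants the general shape of the approach without opening the 
patch, here is a rough sketch. It does not use the real Crunch classes: 
Holder, SketchPType and CollectingFn are hypothetical stand-ins, and the 
actual patch may well differ in its details. The sketch only shows a DoFn-like 
class initializing the PType it depends on, then using the detached copy to 
hold on to values while the framework reuses a single input object.

```java
import java.util.ArrayList;
import java.util.List;

public class DetachedValueSketch {

    /** Mutable record that the framework reuses between calls (hypothetical). */
    static class Holder {
        String value;
    }

    /** Hypothetical stand-in for a PType; not the real Crunch interface. */
    static class SketchPType {
        private boolean initialized;

        void initialize() {
            initialized = true;
        }

        /** Deep-copies the input so the caller can safely keep a reference to it. */
        Holder getDetachedValue(Holder in) {
            if (!initialized) {
                throw new IllegalStateException("PType not initialized");
            }
            Holder copy = new Holder();
            copy.value = in.value;
            return copy;
        }
    }

    /** Hypothetical stand-in for a DoFn that retains values, as a join function might. */
    static class CollectingFn {
        private final SketchPType ptype;
        final List<Holder> held = new ArrayList<>();

        CollectingFn(SketchPType ptype) {
            this.ptype = ptype;
        }

        /** The DoFn takes responsibility for initializing the PType it relies on. */
        void initialize() {
            ptype.initialize();
        }

        void process(Holder input) {
            held.add(ptype.getDetachedValue(input));
        }
    }

    /** Feeds every input through one reused Holder and returns the retained values. */
    static List<String> run(String... inputs) {
        CollectingFn fn = new CollectingFn(new SketchPType());
        fn.initialize();
        Holder reused = new Holder();
        for (String s : inputs) {
            reused.value = s; // the same object is overwritten in place
            fn.process(reused);
        }
        List<String> out = new ArrayList<>();
        for (Holder h : fn.held) {
            out.add(h.value);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run("a", "b", "c")); // prints [a, b, c]
    }
}
```

If process() stored the input directly instead of a detached copy, every 
retained element would end up with the last value written into the reused 
object.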
                
> PType mapping functions are not initialized before being used for deep copying
> ------------------------------------------------------------------------------
>
>                 Key: CRUNCH-71
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-71
>             Project: Crunch
>          Issue Type: Bug
>    Affects Versions: 0.3.0
>            Reporter: Gabriel Reid
>            Assignee: Gabriel Reid
>         Attachments: CRUNCH-71.patch
>
>
> The PType#getDetachedValue method performs a deep copy (if needed) in order 
> to allow DoFns to hold on to values that have been passed through them (for 
> example, in join functions).
> The WritablePType class uses the built-in input and output MapFns in the 
> PType to handle this deep copying, but the input and output MapFns don't get 
> initialized (i.e. initialize isn't called on them) after they are 
> deserialized along with the DoFn that is using them. In some rare cases (at 
> least for tuples), this can result in NullPointerExceptions or other 
> nastiness.
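
The failure mode described above can be reproduced outside Crunch with a 
minimal sketch. TupleCopyFn below is a hypothetical stand-in for a MapFn that 
builds its working state in initialize(); it is not the real Crunch class, 
and plain Java serialization is assumed here only as an approximation of how 
a DoFn is shipped to a task. Transient state is lost in the round trip, so 
map() fails until initialize() is called again.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class UninitializedMapFnSketch {

    /** Hypothetical stand-in for a MapFn whose working state is built in initialize(). */
    static class TupleCopyFn implements Serializable {
        private static final long serialVersionUID = 1L;

        // transient: not serialized, so this field is null after deserialization
        private transient List<String> buffer;

        void initialize() {
            buffer = new ArrayList<>();
        }

        List<String> map(List<String> input) {
            buffer.clear(); // throws NullPointerException if initialize() was never called
            buffer.addAll(input);
            return new ArrayList<>(buffer);
        }
    }

    /** Serializes and deserializes the function via standard Java serialization. */
    static TupleCopyFn roundTrip(TupleCopyFn fn) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(fn);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (TupleCopyFn) in.readObject();
        }
    }

    /** Returns true if map() fails with a NullPointerException. */
    static boolean failsWithNpe(TupleCopyFn fn) {
        try {
            fn.map(Arrays.asList("a", "b"));
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        TupleCopyFn fn = new TupleCopyFn();
        fn.initialize();

        TupleCopyFn shipped = roundTrip(fn);
        System.out.println("NPE before initialize: " + failsWithNpe(shipped)); // true

        shipped.initialize();
        System.out.println("NPE after initialize: " + failsWithNpe(shipped)); // false
    }
}
```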
