[
https://issues.apache.org/jira/browse/CRUNCH-183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gabriel Reid reopened CRUNCH-183:
---------------------------------
Reopened due to failing unit tests.
I'm not sure how that happened: I thought I had run the tests before
committing, but evidently I either didn't or missed a failing test. Sorry
about that.
> Reservoir sampling functions don't take object reuse into account
> -----------------------------------------------------------------
>
> Key: CRUNCH-183
> URL: https://issues.apache.org/jira/browse/CRUNCH-183
> Project: Crunch
> Issue Type: Bug
> Reporter: Gabriel Reid
> Assignee: Gabriel Reid
> Fix For: 0.6.0
>
> Attachments: CRUNCH-183.patch
>
>
> ReservoirSampleFn and WRSCombineFn in o.a.c.lib.SampleUtils both hold on to
> references to processed values without making deep copies of them. For
> complex objects such as Avro objects, which may be reused between calls,
> this leads to incorrect results, with the same value being returned for all
> samples.
> This can be resolved by calling PType#getDetachedValue before storing a
> reference to the object.
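
The failure mode described above can be reproduced without Crunch. The sketch below is only an illustration under stated assumptions, not the code from CRUNCH-183.patch: `Item`, `copy()`, and `sample()` are hypothetical names, the reservoir uses plain Algorithm R rather than the actual SampleUtils logic, and `copy()` merely stands in for the role the issue gives to PType#getDetachedValue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.UnaryOperator;

class ReservoirDetachSketch {

    /** Hypothetical mutable value, standing in for a reused Avro object. */
    static final class Item {
        int id;
        Item(int id) { this.id = id; }
        Item copy() { return new Item(id); } // stand-in for a detached (deep) copy
    }

    /**
     * Reservoir sampling (Algorithm R) of k items from ids 0..n-1. All inputs are
     * delivered through ONE Item instance that is overwritten in place, which is
     * the object-reuse pattern the issue refers to. `detach` is applied to the
     * value right before a reference to it is stored.
     */
    static List<Integer> sample(int n, int k, long seed, UnaryOperator<Item> detach) {
        Random random = new Random(seed);
        List<Item> reservoir = new ArrayList<>(k);
        Item reused = new Item(-1);
        for (int i = 0; i < n; i++) {
            reused.id = i; // the next input overwrites the previous one
            if (i < k) {
                reservoir.add(detach.apply(reused));
            } else {
                int slot = random.nextInt(i + 1);
                if (slot < k) {
                    reservoir.set(slot, detach.apply(reused));
                }
            }
        }
        List<Integer> ids = new ArrayList<>();
        for (Item item : reservoir) {
            ids.add(item.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        // Storing the reused reference directly: every slot aliases one object.
        System.out.println("no copy:  " + sample(100, 5, 42L, item -> item));
        // Copying before storing: each slot keeps its own value.
        System.out.println("detached: " + sample(100, 5, 42L, Item::copy));
    }
}
```

Without the copy, every slot in the reservoir points at the same instance, so all samples report the last value seen. With the copy, each slot keeps the value that was current when it was stored.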