[jira] [Commented] (FLINK-3291) Object reuse bug in MergeIterator.HeadStream.nextHead

Greg Hogan (JIRA) Fri, 12 Feb 2016 07:49:04 -0800

    [ 
https://issues.apache.org/jira/browse/FLINK-3291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15144732#comment-15144732
 ]


Greg Hogan commented on FLINK-3291:
-----------------------------------

I started looking at `MergeIterator` last year since I had object reuse enabled 
and was still seeing a flood of newly created objects. The complex code was 
already written, but tying it together and passing the configuration through 
resulted in ~5% reduction in overall runtime for the algorithms I was 
benchmarking. So clearly this is a performance critical section of the code, 
and as evidenced by this ticket a source of potential errors.

[~ggevay], at this point are the changes to `MergeIterator` fixing a bug? Do 
you want to fix up and clarify the documentation for `MutableObjectIterator` 
and verify the implementing classes?

My only question on [~StephanEwen]'s comments would be, isn't copying sometimes 
necessary in non-reusing mode but avoidable when reusing objects? Copies would 
be required any time an object is presented to the user multiple times, as in 
joins / cross / cogroup.

[~StephanEwen]'s comment on cheap vs expensive copies would be excellent for 
the user documentation. As I understand, if you're making expensive copies 
(with object reuse enabled) in a user defined function, you may be better off 
disabling object reuse and allowing the deserialization to use a fresh object 
instead.

> Object reuse bug in MergeIterator.HeadStream.nextHead
> -----------------------------------------------------
>
>                 Key: FLINK-3291
>                 URL: https://issues.apache.org/jira/browse/FLINK-3291
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Runtime
>    Affects Versions: 1.0.0
>            Reporter: Gabor Gevay
>            Assignee: Gabor Gevay
>            Priority: Critical
>
> MergeIterator.HeadStream.nextHead saves a reference into `this.head` of the 
> `reuse` object that it got as an argument. This object might be modified 
> later by the caller.
> This actually happens when ReduceDriver.run calls input.next (which will 
> actually be MergeIterator.next(E reuse)) in the inner while loop of the 
> objectReuseEnabled branch, and that calls top.nextHead with the reference 
> that it got from ReduceDriver, which erroneously saves the reference, and 
> then ReduceDriver later uses that same object for doing the reduce.
> Another way in which this fails is when MergeIterator.next(E reuse) gives 
> `reuse` to different `top`s in different calls, and then the heads end up 
> being the same object.
> You can observe the latter situation in action by running ReducePerformance 
> here:
> https://github.com/ggevay/flink/tree/merge-iterator-object-reuse-bug
> Set memory to -Xmx200m (so that the MergeIterator actually has merging to 
> do), put a breakpoint at the beginning of MergeIterator.next(reuse), and then 
> watch `reuse`, and the heads of the first two elements of `this.heap` in the 
> debugger. They will get to be the same object after hitting continue about 6 
> times.
> You can also look at the count that is printed at the end, which shouldn't be 
> larger than the key range. Also, if you look into the output file 
> /tmp/xxxobjectreusebug, for example the key 999977 appears twice.
> The good news is that I think I can see an easy fix that doesn't affect 
> performance: MergeIterator.HeadStream could have a reuse object of its own as 
> a member, and give that to iterator.next in nextHead(E reuse). And then we 
> wouldn't need the overload of nextHead that has the reuse parameter, and 
> MergeIterator.next(E reuse) could just call its other overload.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (FLINK-3291) Object reuse bug in MergeIterator.HeadStream.nextHead

Reply via email to