Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16777 )

Change subject: KUDU-3108: fix invalid memory accesses in merge iterator
......................................................................


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16777/6/src/kudu/common/generic_iterators.cc
File src/kudu/common/generic_iterators.cc:

http://gerrit.cloudera.org:8080/#/c/16777/6/src/kudu/common/generic_iterators.cc@556
PS6, Line 556:   typedef std::set<MergeIterState*, 
MergeIterStateSmallestMaxComparator> MergeStateOrderedSet;
> Ah, indeed: in this version (with the fix) there is an extra constraint, bu
I ran the tablet_scan tool (local replica scan, nothing transferred over the 
wire) on a 65GiB replica with 1667 rowsets and an average rowset height of 5. I 
tried with the two-heap variant, two-heap + std::set, and original three-heap 
variant, and averaged over 5 runs:

2 heaps: 1048.57s
2 heaps + set: 1076.45s
3 heaps: 1166.96s

I'm not sure (didn't profile anything), but I think the difference is that 
because the three-heap variant keeps the hot set optimally small, there is more 
movement of merge iterators to and from the cold heap (ie we might be hitting 
the L708 branch more, and pushing to cold_ might be more expensive than pushing 
to hot_ and hotmaxes_).

All told, I'm still leaning towards moving forward with the two-heap variant, 
given it seems to consistently outperform the others, even if it's suboptimal 
from a hot set sizing perspective.



--
To view, visit http://gerrit.cloudera.org:8080/16777
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8ec1cd3fd67ec4ea92a55b5b0ce555123748824d
Gerrit-Change-Number: 16777
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Lieber-Dembo <a...@apache.org>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Bankim Bhavsar <ban...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Comment-Date: Thu, 26 Nov 2020 19:42:59 +0000
Gerrit-HasComments: Yes

Reply via email to