Adar Lieber-Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/16777 )
Change subject: KUDU-3108: fix invalid memory accesses in merge iterator ...................................................................... Patch Set 3: (6 comments) Good find, thanks for this fix. Do we need to backport this to past releases and/or issue any sort of advisory regarding incremental backups? Could you rerun some of the microbenchmarks I listed in past commit messages to see whether changing hotmaxes from a heap to a set had any impact on perf? I don't think it has, (insert/remove are still O(log n), begin is still O(1)) but would be good to confirm. http://gerrit.cloudera.org:8080/#/c/16777/3/src/kudu/common/generic_iterators.cc File src/kudu/common/generic_iterators.cc: http://gerrit.cloudera.org:8080/#/c/16777/3/src/kudu/common/generic_iterators.cc@129 PS3, Line 129: if (read_block_) { Should we defensively consider fully_exhausted_ here and in last_row()? http://gerrit.cloudera.org:8080/#/c/16777/3/src/kudu/common/generic_iterators.cc@259 PS3, Line 259: We prefer : // setting this to true instead of, e.g. resetting 'read_block_' since : // 'read_block_' may be referenced in our heaps, and thus, may not be safely : // cleared if our heaps still refer to its memory. Not sure I understand this: isn't one of our invariants that if a block is exhausted, it's not in any heap? http://gerrit.cloudera.org:8080/#/c/16777/3/src/kudu/common/generic_iterators.cc@359 PS3, Line 359: // Three different heaps are used to optimize the merge process. To explain how You should update all references to "three heaps" to indicate the new use of hotmaxes. http://gerrit.cloudera.org:8080/#/c/16777/3/src/kudu/common/generic_iterators.cc@564 PS3, Line 564: typedef std::set<MergeIterState*, MergeIterStateSmallestMaxComparator> MergeStateMinSet; There was a short block comment here previously that explained what RowMinHeap did; could you write an equivalent comment for MergeStateMinSet? http://gerrit.cloudera.org:8080/#/c/16777/3/src/kudu/common/generic_iterators.cc@578 PS3, Line 578: // The three heaps as described in the algorithm above. Update here as well. http://gerrit.cloudera.org:8080/#/c/16777/3/src/kudu/common/generic_iterators.cc@633 PS3, Line 633: initted_ = true; This is effectively the same as the code before, right? RefillHotHeap() is a no-op when states_ is empty? -- To view, visit http://gerrit.cloudera.org:8080/16777 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8ec1cd3fd67ec4ea92a55b5b0ce555123748824d Gerrit-Change-Number: 16777 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Wong <[email protected]> Gerrit-Reviewer: Adar Lieber-Dembo <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Tue, 24 Nov 2020 20:59:48 +0000 Gerrit-HasComments: Yes
