Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16777 )
Change subject: KUDU-3108: fix invalid memory accesses in merge iterator ...................................................................... Patch Set 4: (8 comments) http://gerrit.cloudera.org:8080/#/c/16777/3/src/kudu/common/generic_iterators.cc File src/kudu/common/generic_iterators.cc: http://gerrit.cloudera.org:8080/#/c/16777/3/src/kudu/common/generic_iterators.cc@129 PS3, Line 129: DCHECK_LT(next_row_idx_, read_block_->nrows()); > Should we defensively consider fully_exhausted_ here and in last_row()? Good thought, though I've removed fully_exhausted_ entirely. http://gerrit.cloudera.org:8080/#/c/16777/3/src/kudu/common/generic_iterators.cc@259 PS3, Line 259: lock) { : DCHECK_GE(read_block_->nrows(), next_row_idx_ + num_rows); : : next_row_idx_ += num_rows; > Not sure I understand this: isn't one of our invariants that if a block is Good catch. I'd introduced 'fully_exhausted_' before it became evident that this was an invariant. This invariant's changed a bit in the latest revision: now, if a block is exhausted, we cannot perform any actions that leverage either next_row() or last_row(). http://gerrit.cloudera.org:8080/#/c/16777/3/src/kudu/common/generic_iterators.cc@359 PS3, Line 359: // > You should update all references to "three heaps" to indicate the new use o Done http://gerrit.cloudera.org:8080/#/c/16777/3/src/kudu/common/generic_iterators.cc@564 PS3, Line 564: return a->schema().Compare(a->next_row(), b->next_row()) > 0; > There was a short block comment here previously that explained what RowMinH Done http://gerrit.cloudera.org:8080/#/c/16777/3/src/kudu/common/generic_iterators.cc@578 PS3, Line 578: // Boost offers a variety of different heap data structures[1]. Perf testing > Update here as well. Done http://gerrit.cloudera.org:8080/#/c/16777/3/src/kudu/common/generic_iterators.cc@633 PS3, Line 633: if (opts_.include_deleted_rows && > This is effectively the same as the code before, right? RefillHotHeap() is Right, RefillHotHeap() would have no-oped since 'cold' would be empty. That said, it seems this optimization is unneeded for correctness so I'll remove it. http://gerrit.cloudera.org:8080/#/c/16777/3/src/kudu/common/generic_iterators.cc@932 PS3, Line 932: unique_ptr<RowwiseIterator> NewMergeIterator( : MergeIteratorOptions opts, vector<IterWithBounds> iters) { : return unique_ptr<RowwiseIterator>(new MergeIterator(opts, std::move(iters))); : } > With the short-cut introduced in MergeIterator::Init(), would this crash wh Good catch. I've found that the update to Init() isn't necessary for correctness, so I've reverted it. http://gerrit.cloudera.org:8080/#/c/16777/3/src/kudu/tablet/diff_scan-test.cc File src/kudu/tablet/diff_scan-test.cc: http://gerrit.cloudera.org:8080/#/c/16777/3/src/kudu/tablet/diff_scan-test.cc@103 PS3, Line 103: // Define our diff scan to st > Does it mean that before adding this option the scan wasn't a 'DiffScan' pe It was previously a diff scan, just from -Inf. I just found it odd we were defining snap1 but not using it, so I used it assuming that was the original intention. Done -- To view, visit http://gerrit.cloudera.org:8080/16777 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8ec1cd3fd67ec4ea92a55b5b0ce555123748824d Gerrit-Change-Number: 16777 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Adar Lieber-Dembo <a...@apache.org> Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com> Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Wed, 25 Nov 2020 02:54:04 +0000 Gerrit-HasComments: Yes