Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Adar Lieber-Dembo,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16777

to look at the new patch set (#6).

Change subject: KUDU-3108: fix invalid memory accesses in merge iterator
......................................................................

KUDU-3108: fix invalid memory accesses in merge iterator

The 'hotmaxes_' heap and 'hot_' heap in the MergeIterator are meant to
represent the same set of iterator states currently deemed "hot" in the
optimal merge algorithm[1]. They used different ordering constraints to
allow constant time access to the smallest last row and the smallest
next row across all hot states respectively. However, we were using
pop() on both heaps when iterator states were no longer deemed hot,
incorrectly expecting that the call would remove the same iterator state
from both heaps.

In the case that this pop() was followed by the destruction of the
sub-iterator (e.g. if the iterator was fully exhausted), this would
leave an iterator state in the heaps that pointed at destructed state:

F1030 21:16:58.411253 40800 schema.h:706] Check failed: 
KeyEquals(*lhs.schema()) && KeyEquals(*rhs.schema())
*** Check failure stack trace: ***
*** Aborted at 1604117818 (unix time) try "date -d @1604117818" if you are 
using GNU date ***
PC: @     0x7f701fcf11d7 __GI_raise
*** SIGABRT (@0x111700009efd) received by PID 40701 (TID 0x7f6ff0f47700) from 
PID 40701; stack trace: ***
    @     0x7f7026a70370 (unknown)
    @     0x7f701fcf11d7 __GI_raise
    @     0x7f701fcf28c8 __GI_abort
    @     0x7f70224377b9 google::logging_fail()
    @     0x7f7022438f8d google::LogMessage::Fail()
    @     0x7f702243aee3 google::LogMessage::SendToLog()
    @     0x7f7022438ae9 google::LogMessage::Flush()
    @     0x7f702243b86f google::LogMessageFatal::~LogMessageFatal()
    @     0x7f702cc99fbc kudu::Schema::Compare<>()
    @     0x7f7026167cfd kudu::MergeIterator::RefillHotHeap()
    @     0x7f7026167357 kudu::MergeIterator::AdvanceAndReheap()
    @     0x7f7026169617 kudu::MergeIterator::MaterializeOneRow()
    @     0x7f70261688e9 kudu::MergeIterator::NextBlock()
    @     0x7f702cbddd9b kudu::tablet::Tablet::Iterator::NextBlock()
    @     0x7f70317bcab3 
kudu::tserver::TabletServiceImpl::HandleContinueScanRequest()
    @     0x7f70317bb857 
kudu::tserver::TabletServiceImpl::HandleNewScanRequest()
    @     0x7f70317b464e kudu::tserver::TabletServiceImpl::Scan()
    @     0x7f702ddfd762 
_ZZN4kudu7tserver21TabletServerServiceIfC1ERK13scoped_refptrINS_12MetricEntityEERKS2_INS_3rpc13ResultTrackerEEENKUlPKN6google8protobuf7MessageEPSE_PNS7_10RpcContextEE4_clESG_SH_SJ_
    @     0x7f702de0064d 
_ZNSt17_Function_handlerIFvPKN6google8protobuf7MessageEPS2_PN4kudu3rpc10RpcContextEEZNS6_7tserver21TabletServerServiceIfC1ERK13scoped_refptrINS6_12MetricEntityEERKSD_INS7_13ResultTrackerEEEUlS4_S5_S9_E4_E9_M_invokeERKSt9_Any_dataS4_S5_S9_
    @     0x7f702b4ddcc2 std::function<>::operator()()
    @     0x7f702b4dd6ed kudu::rpc::GeneratedServiceIf::Handle()
    @     0x7f702b4dfff8 kudu::rpc::ServicePool::RunThread()
    @     0x7f702b4de8c5 _ZZN4kudu3rpc11ServicePool4InitEiENKUlvE_clEv
    @     0x7f702b4e0337 
_ZNSt17_Function_handlerIFvvEZN4kudu3rpc11ServicePool4InitEiEUlvE_E9_M_invokeERKSt9_Any_data
    @     0x7f7033524b9c std::function<>::operator()()
    @     0x7f70248227e0 kudu::Thread::SuperviseThread()
    @     0x7f7026a68dc5 start_thread
    @     0x7f701fdb376d __clone
Aborted

This patch switches the 'hotmaxes_' min heap to a set, using the same
ordering, but disambiguating between iterator states via pointer
comparison.

The switch to using a set means that extra care has to be taken when
removing elements. Whereas heap::pop() can be called safely without any
calls to the comparator, a value-based set::erase() uses the comparator
to find and then remove a specified element. This is problematic in our
calls to AdvanceAndReheap(), which call Advance() on the iterator state,
potentially invalidating the state's last_row() value and making
subsequent calls to erase() fail.

I experimented with using an eager call to a value-based erase()
_before_ the call to Advance(), which is admittedly sub-optimal in cases
where we erase() and then insert() the value back into the set
immediately. That approach performed fairly poorly, so I went another
route of using a lazy position-based set::erase() after the call to
Advance(). This still incurs a call to set::find(), but at least it can
correctly erase() with the underlying RowBlock destroyed.

Here are some performance numbers, running the same test used in
1567dec086 (generic_iterators-test with 10000 rows per list) to compare
the following, averaged over five runs:
- a: this patch, which uses position-based erase() after calling Advance()
- b: an iteration of this patch that used value-based erase() before calling
     Advance()
- c: the original version that uses heap::pop() after calling Advance()

Parameters                  | a        | b       | c
----------------------------+----------+---------+---------
overlapping, 10 lists       | 0.059s   | 0.0744s | 0.0472s
overlapping, 100 lists      | 0.6726s  | 0.8876s | 0.4938s
overlapping, 1000 lists     | 15.5588s | 18.87s  | 10.3554s
non-overlapping, 10 lists   | 0.011s   | 0.0114s | 0.0106s
non-overlapping, 100 lists  | 0.0786s  | 0.0794s | 0.083s
non-overlapping, 1000 lists | 0.7824s  | 0.7346s | 0.7174s

Unsurprisingly, with many overlapping iterators, the usage of a set instead of
a heap does hurt performance by a noticeable (1.5x) factor. That said, the
performance hit is justified for correctness.

To exercise these codepaths more rigorously, I bumped fuzz-itest's
default keyspace size to 5. Some bugs (this one included) can only be
created when there are a mix of overlapping and non-overlapping rowsets,
which is impossible to achieve with the current keyspace size of 2.

[1] 
https://docs.google.com/document/d/1uP0ubjM6ulnKVCRrXtwT_dqrTWjF9tlFSRk0JN2e_O0/edit#

Change-Id: I8ec1cd3fd67ec4ea92a55b5b0ce555123748824d
---
M src/kudu/common/generic_iterators.cc
M src/kudu/integration-tests/fuzz-itest.cc
M src/kudu/tablet/diff_scan-test.cc
3 files changed, 140 insertions(+), 52 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/77/16777/6
-- 
To view, visit http://gerrit.cloudera.org:8080/16777
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8ec1cd3fd67ec4ea92a55b5b0ce555123748824d
Gerrit-Change-Number: 16777
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Lieber-Dembo <a...@apache.org>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

Reply via email to