Adar Dembo created KUDU-2807:
--------------------------------
Summary: Possible crash when flush or compaction overlaps with
another compaction
Key: KUDU-2807
URL: https://issues.apache.org/jira/browse/KUDU-2807
Project: Kudu
Issue Type: Bug
Components: tablet
Affects Versions: 1.9.0
Reporter: Adar Dembo
Manuel Sopena reported a crash like this in Slack:
{noformat}
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
F0429 07:26:56.918041 34043 tablet.cc:2268] Check failed: lock.owns_lock()
RowSet(24130) unable to lock compact_flush_lock
{noformat}
It's hard to say exactly what's going on without more logging, but after
looking at the code in more detail, I think the culprit is [this
commit|https://github.com/apache/kudu/commit/d3684a7b2add8f06b7189adb9ce9222b8ae1eff5],
new in Kudu 1.9.0. To understand why it's problematic, we first need to
understand the locking invariant in play:
# A thread must acquire the tablet's compact_select_lock_ in order to select
rowsets to compact.
# Because of #1, it's safe to assume that, if a thread successfully acquired a
rowset's compact_flush_lock_ in the act of selecting it for compaction, it can
release and reacquire the lock without contention. More precisely, it can
release the compact_flush_lock_, then try-lock it, and the try-lock is
guaranteed to succeed. All compacting MM ops use a CHECK to enforce this
invariant.
With that in mind, here's the problem: at the time that the call to
{{RowSetInfo::ComputeCdfAndCollectOrdered}} is made from
{{Tablet::AtomicSwapRowSetsUnlocked}}, the tablet's compact_select_lock_ is not
held. {{ComputeCdfAndCollectOrdered}} calls
{{RowSet::IsAvailableForCompaction}}, which try-locks the per-rowset
compact_flush_lock_. As a result, it's possible for a racing MM operation to
also call {{IsAvailableForCompaction}}, successfully try-lock the
compact_flush_lock_, release it, try-lock it again (as per the invariant
above), fail, and crash in the aforementioned CHECK.
I don't think this can result in corruption as we crash rather than allowing
the MM op to proceed. But it's a bad race and a bad crash, so we should fix it.
Possibly producing a 1.9.1 release in the process.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)