Hello Zoltan Chovan, Ashwani Raina, Kudu Jenkins, Anonymous Coward (763),
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/21127
to look at the new patch set (#4).
Change subject: WIP [ARM] Concurrent binary tree memory fences fixed.
......................................................................
WIP [ARM] Concurrent binary tree memory fences fixed.
TestCBTree.TestRacyConcurrentInsert sometimes fails on ARM.
The concurrent inserts produce a tree with some incorrectly ordered nodes. Apart
from incorrect order, there are no other errors in the tree. All inserted
elements are present, but CBTree::GetCopy cannot find them due to the incorrect
ordering.
This is not a unit test error, but a real error in the CBTree implementation.
Modifying the test to only do the inserts first, and only start checking when
the tree is finalized does not prevent the bug. So calling only the tree->Insert
function from multiple threads is sufficient to reproduce the issue (only on
ARM).
WIP cause: This change fixes the problem, and can be the final solution.
However, I would also consider changing the version field to std::atomic and
keep using the same barriers as now (not the default memory_order_seq_cst).
Either way, we need to agree on the proposed changes of the memory barriers
then I can also make a version with using std::atomic.
Root cause:
Some memory order restrictions are not strict enough. It does not cause a
problem on x86 (There were no real changes for 11 years), but it causes a
problems on ARM. x86 guarantees a globally consistent order for stores (TSO).
ARM, in contrast, allows stores to different memory locations to be observed
differently from the program order.
More info:
https://www.sciencedirect.com/science/article/pii/S1383762124000390
The following 6 barriers need to be more strict:
1. When we set the Splitting / Inserting flag on a node, then it is not enough
to flush all previous changes. It is also very important not to reorder any
write before it. So instead of Release_Store (which is practically equivalent
to std::memory_order_release), we need a 2 way barrier.
2. When we search in SeekToLeaf/GetCopy/ContainsKey/TraverseToLeaf, and we
check if the version is unchanged, then a load_acquire (equivalent to
std::memory_order_acquire) is not sufficient. We have to also make sure no
read is reordered in the other direction.
Currently adding only the 2 barriers under (1.) is sufficient to stabilize the
test. However the constraints imposed by load_acquire under (2.) are also
insufficient in theory.
Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382
---
M src/kudu/tablet/concurrent_btree.h
1 file changed, 9 insertions(+), 0 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/27/21127/4
--
To view, visit http://gerrit.cloudera.org:8080/21127
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382
Gerrit-Change-Number: 21127
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Martonka <[email protected]>
Gerrit-Reviewer: Anonymous Coward (763)
Gerrit-Reviewer: Ashwani Raina <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Zoltan Chovan <[email protected]>
Gerrit-Reviewer: Zoltan Martonka <[email protected]>