Zoltan Martonka has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21127
Change subject: iWIP [ARM] Concurrent binary tree memory fences fixed. ...................................................................... iWIP [ARM] Concurrent binary tree memory fences fixed. TestCBTree.TestRacyConcurrentInsert sometime fails on ARM. The concurrent inserts produce a tree with some misordered nodes. Apart from misordering, there are no other errors in the tree. All inserted elements are present, all edges are valid. This is not an unit test error, but a real error in the CBTree implementation. Modifying the test to only do the inserts first, and only start checking when the tree is finalized does not prevent the bug. So calling only the tree->Insert function from multiple threads is sufficient to reproduce the issue. WIP cause: This change fixes the problem, and can be the final solution. However I would also consider changing it to the std::atomic library (Also using the correct memory orders like now, not the default memory_order_seq_cst everyhere). Either way, we need to agree on the proposed changes of the the memory barriers then I can also make a version with using std::atomic. Root cause: Some memory order restrictions are not strigt enough. It does not cause a problem on x86 (There were no real changes for 11 years), but it causes a problem on ARM. The following 6. barriest needs to be more strict: 1. When we set the Splittingi/Inserting flag on a node, then it is not enough to flush all or changes. It is also very important not to reorder any write before it. So instead of Release_Store we need a 2 way fence. 2. When we search in SeekToLeaf/GetCopy/ContainsKey/TraverseToLeaf, and we check if the version is unchanged, then a load_acquire is not sufficient. We have to also make sure no read is reordered in the other direction either. Currently adding only the 2 fences under (1.) is sufficient to stabilize the test. However the fences under (2.) are also wrong in theory. Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 --- M src/kudu/tablet/concurrent_btree.h 1 file changed, 9 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/27/21127/1 -- To view, visit http://gerrit.cloudera.org:8080/21127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 Gerrit-Change-Number: 21127 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Martonka <[email protected]>
