[kudu-CR] [ARM] Concurrent binary tree memory barriers fixed.
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/21127 ) Change subject: [ARM] Concurrent binary tree memory barriers fixed. .. Patch Set 15: I updated the performance results. Both in the commit message and in the sheets. I decided to go with a 8 core intel for testing (c6i.2xlarge). I run the new test with 3 params: + 4-4 read/write to fit into the cores. + 16-16 read/write to see how much they sabotage each other. + 16-4 read/write with boosted reader x20 = 5 * 16/4, because reading threads seemed to take 0.2-0.3x time than a writing thread doing the same number of operations. -- To view, visit http://gerrit.cloudera.org:8080/21127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 Gerrit-Change-Number: 21127 Gerrit-PatchSet: 15 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anonymous Coward (763) Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Zoltan Chovan Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Fri, 07 Jun 2024 11:32:27 + Gerrit-HasComments: No
[kudu-CR] [ARM] Concurrent binary tree memory barriers fixed.
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/21127 ) Change subject: [ARM] Concurrent binary tree memory barriers fixed. .. Patch Set 15: (7 comments) http://gerrit.cloudera.org:8080/#/c/21127/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21127/6//COMMIT_MSG@45 PS6, Line 45: > nit: these Done http://gerrit.cloudera.org:8080/#/c/21127/6//COMMIT_MSG@47 PS6, Line 47: eve > nit: C++ Done http://gerrit.cloudera.org:8080/#/c/21127/6//COMMIT_MSG@72 PS6, Line 72: should be supported by both Clang and GCC. > How so? What's the size of LeafNode and InternalNode after removing the pa Packed is no longer removed in newer versions. It was a mistake to do so, since unaligned other structures are added into the same arena http://gerrit.cloudera.org:8080/#/c/21127/14//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21127/14//COMMIT_MSG@43 PS14, Line 43: versi > nit: formatting Done http://gerrit.cloudera.org:8080/#/c/21127/14//COMMIT_MSG@55 PS14, Line 55: lut > nit: were Done http://gerrit.cloudera.org:8080/#/c/21127/6/src/kudu/tablet/concurrent_btree.h File src/kudu/tablet/concurrent_btree.h: http://gerrit.cloudera.org:8080/#/c/21127/6/src/kudu/tablet/concurrent_btree.h@139 PS6, Line 139: * > nit: add space Done http://gerrit.cloudera.org:8080/#/c/21127/6/src/kudu/tablet/concurrent_btree.h@378 PS6, Line 378: > Just for my understanding, what is the difference between StableVersionAcqu StableVersion waits until a node is not locked (Only nodes under modification are locked), reading never locks. -- To view, visit http://gerrit.cloudera.org:8080/21127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 Gerrit-Change-Number: 21127 Gerrit-PatchSet: 15 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anonymous Coward (763) Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Zoltan Chovan Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Fri, 07 Jun 2024 11:25:54 + Gerrit-HasComments: Yes
[kudu-CR] [ARM] Concurrent binary tree memory barriers fixed.
Hello Tidy Bot, Zoltan Chovan, Alexey Serbin, Ashwani Raina, Kudu Jenkins, Anonymous Coward (763), I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21127 to look at the new patch set (#15). Change subject: [ARM] Concurrent binary tree memory barriers fixed. .. [ARM] Concurrent binary tree memory barriers fixed. TestCBTree.TestRacyConcurrentInsert sometimes fails on ARM. The concurrent inserts produce a tree with some incorrectly ordered nodes. Apart from incorrect order, there are no other errors in the tree. All inserted elements are present, but CBTree::GetCopy cannot find some due to the incorrect ordering. This is not a unit test error, but a real error in the CBTree implementation. Modifying the test to only do the inserts first, and only start checking when the tree is finalized does not prevent the bug. So calling only the tree->Insert function from multiple threads is sufficient to reproduce the issue (only on ARM). Root cause: Some memory order restrictions are not strict enough. It does not cause a problem on x86 (There were no real changes for 11 years), but it causes problems on ARM. x86 guarantees a globally consistent order for stores (TSO). ARM, in contrast, allows stores to different memory locations to be observed differently from the program order. More info: https://www.sciencedirect.com/science/article/pii/S1383762124000390 Solution: The following barriers need to be more strict: 1. When we set the Splitting/Inserting flag on a node, then it is not enough to flush all previous changes. It is also very important not to reorder any write before it. So instead of Release_Store (which is practically equivalent to std::memory_order_release), we need a 2 way barrier. 2. When we call AcquireVersion or StableVersion to verify that the version was not changed during our (already completed) read, a 2-way std::memory_order_acquire is required. Putting the appropriate std::atomic_thread_fence(...) calls to these places would resolve the issue. However, replacing the current function from atomicops-internals-arm64.h to the C++ standard functions will make the code more consistent. Reason for changing to std::atomics: atomicops-internals-arm64.h was picked from chromium source and the missing functions were reimplemented. The header is gone since, and even chromium uses a solution based on std::atomic (see base/atomicops_internals_portable.h in chromium source). I see no reason to update the header from chromium and implement the missing functions, just to have one more abstraction layer, that is basically just "function aliases" at this point. Nodes are PACKED, which conflicts with std::atomic. However, when we allocate a node, it is allocated with sufficient alignment for atomic operations. Other unaligned structures (the values of the data) are stored between nodes. Removing PACKED would increase the memory footprint and actually slow things down slightly. Notes: + std::atomic:: usually blocks reordering in one direction. std::atomic_thread_fence(...) blocks reordering in both directions. - std::assume_aligned is from C++20. The used __builtin_assume_aligned should be supported by both Clang and GCC. - I renamed `IsLocked` to `IsLockedUnsafe` because it should only be used by the thread that actually holds the lock (as it is currently only used in `DCHECK` macros when we hold the lock). Performance Tests on an aws c6i.2xlarge : === master = Performance counter stats for 'bin/cbtree-test --gtest_filter=*TestRacyConcurrentInsert* --gtest_repeat=10': 1033864.95 msec task-clock#7.703 CPUs utilized 788348 context-switches #0.763 K/sec 142572 cpu-migrations#0.138 K/sec 3853 page-faults #0.004 K/sec cycles instructions branches branch-misses 134.221318612 seconds time elapsed 1036.566723000 seconds user 0.369948000 seconds sys Performance counter stats for 'bin/cbtree-test --gtest_filter=*ConcurrentReadWriteBenchmark* --gtest_repeat=10 --concurrent_rw_benchmark_num_writer_threads=4 --concurrent_rw_benchmark_num_reader_threads=4': 175292.55 msec task-clock#5.723 CPUs utilized 35287 context-switches #0.201 K/sec 1209 cpu-migrations#0.007 K/sec 1990806 page-faults #0.011 M/sec cycles instructions branches branch-misses 30.627830389 seconds time elapsed 172.712266000 seconds user 2.846389000 seconds sys Performance counter stats for 'bin/cbtree-test --gtest_filter=*ConcurrentReadWriteBenchmark* --gtest_repeat=10 --concurrent_rw_benchmark_num_writer_threads=16
[kudu-CR] [build] bootstrap-dev-env.sh fix for ubuntu 22.04+
Hello Marton Greber, Alexey Serbin, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20559 to look at the new patch set (#2). Change subject: [build] bootstrap-dev-env.sh fix for ubuntu 22.04+ .. [build] bootstrap-dev-env.sh fix for ubuntu 22.04+ On Ubuntu 22.04 python package is renamed to python2 On 23.10 and 24.04 python2 is no longer available. We should just use python3 on newer platforms. Tested on: 18.04, 20.04. 22.04, 23.10, 24.04 Change-Id: I275fa06214fde6ede1c6dd8e85f786fdc950cf8c --- M docker/bootstrap-dev-env.sh 1 file changed, 8 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/59/20559/2 -- To view, visit http://gerrit.cloudera.org:8080/20559 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I275fa06214fde6ede1c6dd8e85f786fdc950cf8c Gerrit-Change-Number: 20559 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Marton Greber
[kudu-CR] Add a benchmark for CBTree concurrent writes.
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/21447 ) Change subject: Add a benchmark for CBTree concurrent writes. .. Patch Set 8: (2 comments) I think it is cleaner to add it to benchmark suit in a separate commit. I checked the Jenkins runs. Only TSAN fails with tests other than this file. @Alexey, is this ok to override Jenkins results, or should I keep updating until it passes? http://gerrit.cloudera.org:8080/#/c/21447/4/src/kudu/tablet/cbtree-test.cc File src/kudu/tablet/cbtree-test.cc: http://gerrit.cloudera.org:8080/#/c/21447/4/src/kudu/tablet/cbtree-test.cc@57 PS4, Line 57: N > You could use the same logic to output of hardware_concurrency() as you did I would like to keep it this way. http://gerrit.cloudera.org:8080/#/c/21447/8/src/kudu/tablet/cbtree-test.cc File src/kudu/tablet/cbtree-test.cc: http://gerrit.cloudera.org:8080/#/c/21447/8/src/kudu/tablet/cbtree-test.cc@924 PS8, Line 924: ConcurrentReadWriteBenchmark > Don't want to delay your commit. I would prefer to add it in a separate change. I will push the change to gerrit, when this is merged. -- To view, visit http://gerrit.cloudera.org:8080/21447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b0b16e269c70716962fc5ebb4ddca1e2cbe68a4 Gerrit-Change-Number: 21447 Gerrit-PatchSet: 8 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Zoltan Chovan Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Tue, 04 Jun 2024 11:04:25 + Gerrit-HasComments: Yes
[kudu-CR] Add a benchmark for CBTree concurrent writes.
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/21447 ) Change subject: Add a benchmark for CBTree concurrent writes. .. Patch Set 8: (4 comments) http://gerrit.cloudera.org:8080/#/c/21447/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21447/7//COMMIT_MSG@29 PS7, Line 29: can't be assig > nit: can't be assigned Done http://gerrit.cloudera.org:8080/#/c/21447/7/src/kudu/tablet/cbtree-test.cc File src/kudu/tablet/cbtree-test.cc: http://gerrit.cloudera.org:8080/#/c/21447/7/src/kudu/tablet/cbtree-test.cc@933 PS7, Line 933: t int num_t > could this be a const? Done http://gerrit.cloudera.org:8080/#/c/21447/7/src/kudu/tablet/cbtree-test.cc@936 PS7, Line 936: t int num_inserts_first > could this be a const? Done http://gerrit.cloudera.org:8080/#/c/21447/7/src/kudu/tablet/cbtree-test.cc@1089 PS7, Line 1089: > nit: extra space Done -- To view, visit http://gerrit.cloudera.org:8080/21447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b0b16e269c70716962fc5ebb4ddca1e2cbe68a4 Gerrit-Change-Number: 21447 Gerrit-PatchSet: 8 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Zoltan Chovan Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Wed, 29 May 2024 13:25:42 + Gerrit-HasComments: Yes
[kudu-CR] Add a benchmark for CBTree concurrent writes.
Hello Zoltan Chovan, Alexey Serbin, Ashwani Raina, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21447 to look at the new patch set (#8). Change subject: Add a benchmark for CBTree concurrent writes. .. Add a benchmark for CBTree concurrent writes. Before updating CBTree for ARM (where it is misbehaving currently), we should have a proper test for two scenarios: + Writing on multiple threads. + Reading on multiple threads while there are also active writes. If read threads wait for values to be inserted, it defeats the purpose of benchmarking. Therefore, we should first populate a tree with values for the read threads. The read threads will then read values that are already in the tree, while the write threads continue to insert new values. Setting up the tree for the second scenario essentially involves performing the first scenario. This is why both scenarios are combined into a single test. The new test provides the following new features (compared to just running DoTestConcurrentInsert with higher parameters): + Different threads read the value that inserted it + Reader threads can't be assigned to a certain writer thread. + Keys are better distributed than the previous shuffle method. + Allows measuring read-heavy performance (with a flag). Reading threads start concurrently with writing threads, not at the end of each write thread (unlike DoTestConcurrentInsert). Note that running only concurrent reads should not differ from TestScanPerformance, since no locking takes place and they do not sabotage each other. So no new test is required for that scenario. Change-Id: I1b0b16e269c70716962fc5ebb4ddca1e2cbe68a4 --- M src/kudu/tablet/cbtree-test.cc 1 file changed, 197 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/47/21447/8 -- To view, visit http://gerrit.cloudera.org:8080/21447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1b0b16e269c70716962fc5ebb4ddca1e2cbe68a4 Gerrit-Change-Number: 21447 Gerrit-PatchSet: 8 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Zoltan Chovan Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] Add a benchmark for CBTree concurrent writes.
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/21447 ) Change subject: Add a benchmark for CBTree concurrent writes. .. Patch Set 7: (5 comments) http://gerrit.cloudera.org:8080/#/c/21447/4/src/kudu/tablet/cbtree-test.cc File src/kudu/tablet/cbtree-test.cc: http://gerrit.cloudera.org:8080/#/c/21447/4/src/kudu/tablet/cbtree-test.cc@57 PS4, Line 57: N > Any specific reason behind choosing these numbers for number of writes/read No specific reason, I just had 8 core. So I put it to +1. Also I wanted different number of reader threads than writer threads. Now even with the same number, they don't have 1-to-1 mapping. I thought about hardware_concurrency(), but should I set both reader and writer threads to hardware_concurrency() /2 ? Should I start hardware_concurrency() writer threads and change half of them to readers? Also, since the goal is to measure performance, not to do the workload run as fast as I can, I don't see any problem with having more or less threads than concurrently possible. http://gerrit.cloudera.org:8080/#/c/21447/4/src/kudu/tablet/cbtree-test.cc@56 PS4, Line 56: : DEFINE_int32(concurrent_rw_benchmark_num_writer_threads, 4, : "Number of writer threads in TestConcurrentReadWritePerformance"); : DEFINE_int32(concurrent_rw_benchmark_num_reader_threads, 4, : "Number of reader threads in TestConcurrentReadWritePerformance"); : DEFINE_int32(concurrent_rw_benchmark_num_inserts, 100, : "Number of inserts in TestConcurrentReadWritePerformance"); : // This might be needed, because reads are significantly faster than writes. : DEFINE_int32(concurrent_rw_benchmark_reader_boost, 1, : "Multiply the amount of values ea > style nit: could you please re-format this a bit -- take a look at correspo like this? http://gerrit.cloudera.org:8080/#/c/21447/4/src/kudu/tablet/cbtree-test.cc@933 PS4, Line 933: der > style nit for here and elsewhere: please add a space after the comma Done http://gerrit.cloudera.org:8080/#/c/21447/4/src/kudu/tablet/cbtree-test.cc@936 PS4, Line 936: t_phas > Add a one liner comment on why you chose this. As the comment say any prime is good (It is the largest below 1e6). I changed it to 10007. http://gerrit.cloudera.org:8080/#/c/21447/4/src/kudu/tablet/cbtree-test.cc@938 PS4, Line 938: e apply a (d > nit for here and elsewhere: would it work as intended with the 'std::' pref Thank you. -- To view, visit http://gerrit.cloudera.org:8080/21447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b0b16e269c70716962fc5ebb4ddca1e2cbe68a4 Gerrit-Change-Number: 21447 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Tue, 28 May 2024 21:12:31 + Gerrit-HasComments: Yes
[kudu-CR] Add a benchmark for CBTree concurrent writes.
Hello Alexey Serbin, Ashwani Raina, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21447 to look at the new patch set (#7). Change subject: Add a benchmark for CBTree concurrent writes. .. Add a benchmark for CBTree concurrent writes. Before updating CBTree for ARM (where it is misbehaving currently), we should have a proper test for two scenarios: + Writing on multiple threads. + Reading on multiple threads while there are also active writes. If read threads wait for values to be inserted, it defeats the purpose of benchmarking. Therefore, we should first populate a tree with values for the read threads. The read threads will then read values that are already in the tree, while the write threads continue to insert new values. Setting up the tree for the second scenario essentially involves performing the first scenario. This is why both scenarios are combined into a single test. The new test provides the following new features (compared to just running DoTestConcurrentInsert with higher parameters): + Different threads read the value that inserted it + Reader threads can't assigned to a certain writer thread. + Keys are better distributed than the previous shuffle method. + Allows measuring read-heavy performance (with a flag). Reading threads start concurrently with writing threads, not at the end of each write thread (unlike DoTestConcurrentInsert). Note that running only concurrent reads should not differ from TestScanPerformance, since no locking takes place and they do not sabotage each other. So no new test is required for that scenario. Change-Id: I1b0b16e269c70716962fc5ebb4ddca1e2cbe68a4 --- M src/kudu/tablet/cbtree-test.cc 1 file changed, 198 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/47/21447/7 -- To view, visit http://gerrit.cloudera.org:8080/21447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1b0b16e269c70716962fc5ebb4ddca1e2cbe68a4 Gerrit-Change-Number: 21447 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] Add a benchmark for CBTree concurrent writes.
Hello Alexey Serbin, Ashwani Raina, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21447 to look at the new patch set (#6). Change subject: Add a benchmark for CBTree concurrent writes. .. Add a benchmark for CBTree concurrent writes. Before updating CBTree for ARM (where it is misbehaving currently), we should have a proper test for two scenarios: + Writing on multiple threads. + Reading on multiple threads while there are also active writes. If read threads wait for values to be inserted, it defeats the purpose of benchmarking. Therefore, we should first populate a tree with values for the read threads. The read threads will then read values that are already in the tree, while the write threads continue to insert new values. Setting up the tree for the second scenario essentially involves performing the first scenario. This is why both scenarios are combined into a single test. The new test provides the following new features (compared to just running DoTestConcurrentInsert with higher parameters): + Different threads read the value that inserted it + Reader threads can't assigned to a certain writer thread. + Keys are better distributed than the previous shuffle method. + Allows measuring read-heavy performance (with a flag). Reading threads start concurrently with writing threads, not at the end of each write thread (unlike DoTestConcurrentInsert). Note that running only concurrent reads should not differ from TestScanPerformance, since no locking takes place and they do not sabotage each other. So no new test is required for that scenario. Change-Id: I1b0b16e269c70716962fc5ebb4ddca1e2cbe68a4 --- M src/kudu/tablet/cbtree-test.cc 1 file changed, 197 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/47/21447/6 -- To view, visit http://gerrit.cloudera.org:8080/21447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1b0b16e269c70716962fc5ebb4ddca1e2cbe68a4 Gerrit-Change-Number: 21447 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] Add a benchmark for CBTree concurrent writes.
Hello Alexey Serbin, Ashwani Raina, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21447 to look at the new patch set (#5). Change subject: Add a benchmark for CBTree concurrent writes. .. Add a benchmark for CBTree concurrent writes. Before updating CBTree for ARM (where it is misbehaving currently), we should have a proper test for two scenarios: + Writing on multiple threads. + Reading on multiple threads while there are also active writes. If read threads wait for values to be inserted, it defeats the purpose of benchmarking. Therefore, we should first populate a tree with values for the read threads. The read threads will then read values that are already in the tree, while the write threads continue to insert new values. Setting up the tree for the second scenario essentially involves performing the first scenario. This is why both scenarios are combined into a single test. The new test provides the following new features (compared to just running DoTestConcurrentInsert with higher parameters): + Different threads read the value that inserted it + Reader threads can't assigned to a certain writer thread. + Keys are better distributed than the previous shuffle method. + Allows measuring read-heavy performance (with a flag). Reading threads start concurrently with writing threads, not at the end of each write thread (unlike DoTestConcurrentInsert). Note that running only concurrent reads should not differ from TestScanPerformance, since no locking takes place and they do not sabotage each other. So no new test is required for that scenario. Change-Id: I1b0b16e269c70716962fc5ebb4ddca1e2cbe68a4 --- M src/kudu/tablet/cbtree-test.cc 1 file changed, 197 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/47/21447/5 -- To view, visit http://gerrit.cloudera.org:8080/21447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1b0b16e269c70716962fc5ebb4ddca1e2cbe68a4 Gerrit-Change-Number: 21447 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] Add a benchmark for CBTree concurrent writes.
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/21447 ) Change subject: Add a benchmark for CBTree concurrent writes. .. Patch Set 4: (17 comments) http://gerrit.cloudera.org:8080/#/c/21447/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21447/1//COMMIT_MSG@32 PS1, Line 32: > nit: writer ? ok http://gerrit.cloudera.org:8080/#/c/21447/1//COMMIT_MSG@37 PS1, Line 37: erfor > nit: each ok http://gerrit.cloudera.org:8080/#/c/21447/1/src/kudu/tablet/cbtree-test.cc File src/kudu/tablet/cbtree-test.cc: http://gerrit.cloudera.org:8080/#/c/21447/1/src/kudu/tablet/cbtree-test.cc@31 PS1, Line 31: #include > It seems you'll need to rebase this patch on top the HEAD revision in the m ok http://gerrit.cloudera.org:8080/#/c/21447/1/src/kudu/tablet/cbtree-test.cc@907 PS1, Line 907: > nit: consider ether converting this into a functor visible only in the scop Thanks. I will use std::array and make it a lambda http://gerrit.cloudera.org:8080/#/c/21447/1/src/kudu/tablet/cbtree-test.cc@924 PS1, Line 924: TEST_F(TestCBTree, TestConcurrentReadWritePerformance) { > style nit here and below: use kCamelName pattern for static/constexpr const ok http://gerrit.cloudera.org:8080/#/c/21447/1/src/kudu/tablet/cbtree-test.cc@925 PS1, Line 925: SKIP_IF_SLOW_NOT_ALLOWED(); : constexpr int kTrials = 10; > Do you want these to be hard-coded, or there might be some extra value in d I added 4 flags. Number of r/w threads, number of inserts (no longer required to be a prime or p % 4 == 3, It felt really user unfriendly), and a num_reader_boost to increase the workload of reader threads (they are much faster than writer thread). Actually num_reader_boost might be the most useful one. http://gerrit.cloudera.org:8080/#/c/21447/1/src/kudu/tablet/cbtree-test.cc@928 PS1, Line 928: her > ? *easy http://gerrit.cloudera.org:8080/#/c/21447/1/src/kudu/tablet/cbtree-test.cc@931 PS1, Line 931: > nit: does LINT and 'git check' approve using this symbol? Yes, they seem to be cool with it. I change it to p % 4 == 3. http://gerrit.cloudera.org:8080/#/c/21447/1/src/kudu/tablet/cbtree-test.cc@938 PS1, Line 938: tatic_c > nit: barriers? ok http://gerrit.cloudera.org:8080/#/c/21447/1/src/kudu/tablet/cbtree-test.cc@958 PS1, Line 958: nsert ra > style nit for here and below: is it possible to use 'while (true)' to match Yes. http://gerrit.cloudera.org:8080/#/c/21447/1/src/kudu/tablet/cbtree-test.cc@1024 PS1, Line 1024: > nit: threads ok http://gerrit.cloudera.org:8080/#/c/21447/1/src/kudu/tablet/cbtree-test.cc@1028 PS1, Line 1028: tree.reset(); > Does it make sense to check for failures before starting the second phase o yes, it does. http://gerrit.cloudera.org:8080/#/c/21447/1/src/kudu/tablet/cbtree-test.cc@1029 PS1, Line 1029: > nit: threads ok http://gerrit.cloudera.org:8080/#/c/21447/1/src/kudu/tablet/cbtree-test.cc@1029 PS1, Line 1029: > nit: $2 values ? ok http://gerrit.cloudera.org:8080/#/c/21447/3/src/kudu/tablet/cbtree-test.cc File src/kudu/tablet/cbtree-test.cc: http://gerrit.cloudera.org:8080/#/c/21447/3/src/kudu/tablet/cbtree-test.cc@494 PS3, Line 494: break; This is just because the rebase. Nothing changes here http://gerrit.cloudera.org:8080/#/c/21447/3/src/kudu/tablet/cbtree-test.cc@501 PS3, Line 501: for (thread& thr : threads) { This is just because the rebase. Nothing changes here http://gerrit.cloudera.org:8080/#/c/21447/3/src/kudu/tablet/cbtree-test.cc@759 PS3, Line 759: break; This is just because the rebase. Nothing changes here -- To view, visit http://gerrit.cloudera.org:8080/21447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b0b16e269c70716962fc5ebb4ddca1e2cbe68a4 Gerrit-Change-Number: 21447 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Thu, 23 May 2024 22:28:00 + Gerrit-HasComments: Yes
[kudu-CR] Add a benchmark for CBTree concurrent writes.
Hello Alexey Serbin, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21447 to look at the new patch set (#4). Change subject: Add a benchmark for CBTree concurrent writes. .. Add a benchmark for CBTree concurrent writes. Before updating CBTree for ARM (where it is misbehaving currently), we should have a proper test for two scenarios: + Writing on multiple threads. + Reading on multiple threads while there are also active writes. If read threads wait for values to be inserted, it defeats the purpose of benchmarking. Therefore, we should first populate a tree with values for the read threads. The read threads will then read values that are already in the tree, while the write threads continue to insert new values. Setting up the tree for the second scenario essentially involves performing the first scenario. This is why both scenarios are combined into a single test. The new test provides the following new features (compared to just running DoTestConcurrentInsert with higher parameters): + Different threads read the value that inserted it + Reader threads can't assigned to a certain writer thread. + Keys are better distributed than the previous shuffle method. + Allows measuring read-heavy performance (with a flag). Reading threads start concurrently with writing threads, not at the end of each write thread (unlike DoTestConcurrentInsert). Note that running only concurrent reads should not differ from TestScanPerformance, since no locking takes place and they do not sabotage each other. So no new test is required for that scenario. Change-Id: I1b0b16e269c70716962fc5ebb4ddca1e2cbe68a4 --- M src/kudu/tablet/cbtree-test.cc 1 file changed, 184 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/47/21447/4 -- To view, visit http://gerrit.cloudera.org:8080/21447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1b0b16e269c70716962fc5ebb4ddca1e2cbe68a4 Gerrit-Change-Number: 21447 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] Add a benchmark for CBTree concurrent writes.
Hello Alexey Serbin, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21447 to look at the new patch set (#3). Change subject: Add a benchmark for CBTree concurrent writes. .. Add a benchmark for CBTree concurrent writes. Before updating CBTree for ARM (where it is misbehaving currently), we should have a proper test for two scenarios: + Writing on multiple threads. + Reading on multiple threads while there are also active writes. If read threads wait for values to be inserted, it defeats the purpose of benchmarking. Therefore, we should first populate a tree with values for the read threads. The read threads will then read values that are already in the tree, while the write threads continue to insert new values. Setting up the tree for the second scenario essentially involves performing the first scenario. This is why both scenarios are combined into a single test. The new test provides the following new features (compared to just running DoTestConcurrentInsert with higher parameters): + Different threads read the value that inserted it + Reader threads can't assigned to a certain writer thread. + Keys are better distributed than the previous shuffle method. + Allows measuring read-heavy performance (with a flag). Reading threads start concurrently with writing threads, not at the end of each write thread (unlike DoTestConcurrentInsert). Note that running only concurrent reads should not differ from TestScanPerformance, since no locking takes place and they do not sabotage each other. So no new test is required for that scenario. Change-Id: I1b0b16e269c70716962fc5ebb4ddca1e2cbe68a4 --- M src/kudu/tablet/cbtree-test.cc 1 file changed, 184 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/47/21447/3 -- To view, visit http://gerrit.cloudera.org:8080/21447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1b0b16e269c70716962fc5ebb4ddca1e2cbe68a4 Gerrit-Change-Number: 21447 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] Add a benchmark for CBTree concurrent writes.
Hello Alexey Serbin, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21447 to look at the new patch set (#2). Change subject: Add a benchmark for CBTree concurrent writes. .. Add a benchmark for CBTree concurrent writes. Before updating CBTree for ARM (where it is misbehaving currently), we should have a proper test for two scenarios: + Writing on multiple threads. + Reading on multiple threads while there are also active writes. If read threads wait for values to be inserted, it defeats the purpose of benchmarking. Therefore, we should first populate a tree with values for the read threads. The read threads will then read values that are already in the tree, while the write threads continue to insert new values. Setting up the tree for the second scenario essentially involves performing the first scenario. This is why both scenarios are combined into a single test. The new test provides the following new features (compared to just running DoTestConcurrentInsert with higher parameters): + Different threads read the value that inserted it + Reader threads can't assigned to a certain writer thread. + Keys are better distributed than the previous shuffle method. + Allows measuring read-heavy performance (with a flag). Reading threads start concurrently with writing threads, not at the end of each write thread (unlike DoTestConcurrentInsert). Note that running only concurrent reads should not differ from TestScanPerformance, since no locking takes place and they do not sabotage eachi other. So no new test is required for that scenario. Change-Id: I1b0b16e269c70716962fc5ebb4ddca1e2cbe68a4 --- M src/kudu/tablet/cbtree-test.cc 1 file changed, 184 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/47/21447/2 -- To view, visit http://gerrit.cloudera.org:8080/21447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1b0b16e269c70716962fc5ebb4ddca1e2cbe68a4 Gerrit-Change-Number: 21447 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] Fix deadlock on fail for CBTree-test
Hello Ashwani Raina, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21446 to look at the new patch set (#4). Change subject: Fix deadlock on fail for CBTree-test .. Fix deadlock on fail for CBTree-test When TestConcurrentIterateAndInsert, TestConcurrentInsert, TestRacyConcurrentInsert fail while --gtest_repeat is used, they will keep running forever. Instead of just returning on fail, they should properly stop the other threads running, and then exit. To reproduce the problem, run this on ARM (where the test actually fails): ./bin/cbtree-test --gtest_repeat=100 --gtest_filter=*Racy* Change-Id: Ia10d05dfdc4a12cb034450f432693f054d138498 --- M src/kudu/tablet/cbtree-test.cc 1 file changed, 4 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/46/21446/4 -- To view, visit http://gerrit.cloudera.org:8080/21446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia10d05dfdc4a12cb034450f432693f054d138498 Gerrit-Change-Number: 21446 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] Fix deadlock on fail for CBTree-test
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/21446 ) Change subject: Fix deadlock on fail for CBTree-test .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/21446/3/src/kudu/tablet/cbtree-test.cc File src/kudu/tablet/cbtree-test.cc: http://gerrit.cloudera.org:8080/#/c/21446/3/src/kudu/tablet/cbtree-test.cc@451 PS3, Line 451: num_threads + 1 > just curious, why is count initialised to 1 extra than number of threads? Because there is a controlling thread (the unit test itself). Check out InsertAndVerify function, is is very short. when the unit test 21 lines below calls the go_barrier, num_threads are already waiting. Then it stand in done_barrier waiting for the threads to finish. http://gerrit.cloudera.org:8080/#/c/21446/3/src/kudu/tablet/cbtree-test.cc@483 PS3, Line 483: go_barrier.Wait() > Here and at line 748, make sure extra barrier wait is ok. We stop the threads by reseting the tree and the telling them to start. At this point the are standing in line 271 in the InsertAndVerify function, in this wait: while (true) { go_barrier->Wait(); if (tree->get() == nullptr) return; This is how we stop them normally. And in case a thread detects a problem, it still stays in the while(true). Even if the thread that caches an error would exit, others might already stand in the go_barrier().Wait(). Without the wait the writer threads would just wait forever for the next start. http://gerrit.cloudera.org:8080/#/c/21446/3/src/kudu/tablet/cbtree-test.cc@485 PS3, Line 485: thread & > thread & -> thread& It is, how it was, but I can change it. -- To view, visit http://gerrit.cloudera.org:8080/21446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia10d05dfdc4a12cb034450f432693f054d138498 Gerrit-Change-Number: 21446 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Wed, 22 May 2024 13:43:53 + Gerrit-HasComments: Yes
[kudu-CR] [ARM] Concurrent binary tree memory barriers fixed.
Hello Tidy Bot, Zoltan Chovan, Alexey Serbin, Ashwani Raina, Kudu Jenkins, Anonymous Coward (763), I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21127 to look at the new patch set (#14). Change subject: [ARM] Concurrent binary tree memory barriers fixed. .. [ARM] Concurrent binary tree memory barriers fixed. TestCBTree.TestRacyConcurrentInsert sometimes fails on ARM. The concurrent inserts produce a tree with some incorrectly ordered nodes. Apart from incorrect order, there are no other errors in the tree. All inserted elements are present, but CBTree::GetCopy cannot find some due to the incorrect ordering. This is not a unit test error, but a real error in the CBTree implementation. Modifying the test to only do the inserts first, and only start checking when the tree is finalized does not prevent the bug. So calling only the tree->Insert function from multiple threads is sufficient to reproduce the issue (only on ARM). Root cause: Some memory order restrictions are not strict enough. It does not cause a problem on x86 (There were no real changes for 11 years), but it causes problems on ARM. x86 guarantees a globally consistent order for stores (TSO). ARM, in contrast, allows stores to different memory locations to be observed differently from the program order. More info: https://www.sciencedirect.com/science/article/pii/S1383762124000390 Solution: The following barriers need to be more strict: 1. When we set the Splitting/Inserting flag on a node, then it is not enough to flush all previous changes. It is also very important not to reorder any write before it. So instead of Release_Store (which is practically equivalent to std::memory_order_release), we need a 2 way barrier. 2. When we call AcquireVersion or StableVersion to verify that the version was not changed during our (already completed) read, a 2-way std::memory_order_acquire is required. Putting the appropriate std::atomic_thread_fence(...) calls to these places would resolve the issue. However, replacing the current function from atomicops-internals-arm64.h to the c++ standard functions will make the code more consistent. Reason for changing to std::atomics: atomicops-internals-arm64.h was picked from chromium source and the missing functions was reimplemented. The header is gone since, and even chromium uses a solution based on std::atomic (see base/atomicops_internals_portable.h in chromium source). I see no reason to update the header from chromium and implement the missing functions, just to have one more abstraction layer, that is basically just "function aliases" at this point. Nodes are PACKED, which conflicts with std::atomic. However, when we allocate a node, it is allocated with sufficient alignment for atomic operations. Other unaligned structures (the values of the data) are stored between nodes. Removing PACKED would increase the memory footprint and actually slow things down slightly. Notes: + std::atomic:: usually blocks reordering in one direction. std::atomic_thread_fence(...) blocks reordering in both directions. - std::assume_aligned is from C++20. The used __builtin_assume_aligned should be supported by both Clang and GCC. - I renamed `IsLocked` to `IsLockedUnsafe` because it should only be used by the thread that actually holds the lock (as it is currently only used in `DCHECK` macros when we hold the lock). Performance Tests on an aws c6i.xlarge : === master = Performance counter stats for 'bin/cbtree-test --gtest_filter=*TestScanPerformance* --gtest_repeat=10': 45832.36 msec task-clock#1.000 CPUs utilized 165 context-switches #0.004 K/sec 0 cpu-migrations#0.000 K/sec 81868 page-faults #0.002 M/sec cycles instructions branches branch-misses 45.853664150 seconds time elapsed 45.736053000 seconds user 0.096008000 seconds sys Performance counter stats for 'bin/memrowset-test --gtest_filter=*TestMemRowSetUpdatePerformance* --gtest_repeat=10': 15873.74 msec task-clock#0.999 CPUs utilized 68 context-switches #0.004 K/sec 0 cpu-migrations#0.000 K/sec 73408 page-faults #0.005 M/sec cycles instructions branches branch-misses 15.893188554 seconds time elapsed 15.745611000 seconds user 0.128073000 seconds sys Performance counter stats for 'bin/deltamemstore-test --gtest_filter=*BenchmarkManyUpdatesToOneRow* --gtest_repeat=10': 471.49 msec task-clock#0.960 CPUs utilized 3 context-switches #0.006 K/sec
[kudu-CR] [ARM] Concurrent binary tree memory barriers fixed.
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/21127 ) Change subject: [ARM] Concurrent binary tree memory barriers fixed. .. Patch Set 13: (7 comments) I updated the review. No more undefined behaviour. I also updated https://docs.google.com/spreadsheets/d/1whl8h7S0DcjVYp0jksR87P9v4U8TQH9nS1C9N0D-lUo/edit?usp=sharing There is a description and a RuntimesSumary tab to understand what it actually contains. It uses "perf stat" now. If the following gerrit passes: https://gerrit.cloudera.org/#/c/21447/ I can add this to the commit message runtimes too when it is merged http://gerrit.cloudera.org:8080/#/c/21127/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21127/3//COMMIT_MSG@9 PS3, Line 9: fail > Maybe add a couple lines about how the failure manifested. Done http://gerrit.cloudera.org:8080/#/c/21127/3//COMMIT_MSG@16 PS3, Line 16: e inserts first, and : only start checking when the tree is finalized does not prevent the > Is this enough to repro the issue on x86 as well? Or only on ARM? Done http://gerrit.cloudera.org:8080/#/c/21127/3//COMMIT_MSG@20 PS3, Line 20: > please add some details on what exactly needs to be changed and what the im Done http://gerrit.cloudera.org:8080/#/c/21127/3//COMMIT_MSG@28 PS3, Line 28: : More info: > Do we know why it does not cause a problem on x86? Done http://gerrit.cloudera.org:8080/#/c/21127/3//COMMIT_MSG@32 PS3, Line 32: olution: > I think this part is a bit hard to understand, I think you are talking abou Done http://gerrit.cloudera.org:8080/#/c/21127/3//COMMIT_MSG@32 PS3, Line 32: > nit: remove Done http://gerrit.cloudera.org:8080/#/c/21127/3//COMMIT_MSG@43 PS3, Line 43: > did you mean "absence of fences"? Done -- To view, visit http://gerrit.cloudera.org:8080/21127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 Gerrit-Change-Number: 21127 Gerrit-PatchSet: 13 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anonymous Coward (763) Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Zoltan Chovan Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Wed, 22 May 2024 13:25:10 + Gerrit-HasComments: Yes
[kudu-CR] [ARM] Concurrent binary tree memory barriers fixed.
Hello Tidy Bot, Zoltan Chovan, Alexey Serbin, Ashwani Raina, Kudu Jenkins, Anonymous Coward (763), I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21127 to look at the new patch set (#13). Change subject: [ARM] Concurrent binary tree memory barriers fixed. .. [ARM] Concurrent binary tree memory barriers fixed. TestCBTree.TestRacyConcurrentInsert sometimes fails on ARM. The concurrent inserts produce a tree with some incorrectly ordered nodes. Apart from incorrect order, there are no other errors in the tree. All inserted elements are present, but CBTree::GetCopy cannot find some due to the incorrect ordering. This is not a unit test error, but a real error in the CBTree implementation. Modifying the test to only do the inserts first, and only start checking when the tree is finalized does not prevent the bug. So calling only the tree->Insert function from multiple threads is sufficient to reproduce the issue (only on ARM). Root cause: Some memory order restrictions are not strict enough. It does not cause a problem on x86 (There were no real changes for 11 years), but it causes problems on ARM. x86 guarantees a globally consistent order for stores (TSO). ARM, in contrast, allows stores to different memory locations to be observed differently from the program order. More info: https://www.sciencedirect.com/science/article/pii/S1383762124000390 Solution: The following barriers need to be more strict: 1. When we set the Splitting/Inserting flag on a node, then it is not enough to flush all previous changes. It is also very important not to reorder any write before it. So instead of Release_Store (which is practically equivalent to std::memory_order_release), we need a 2 way barrier. 2. When we call AcquireVersion or StableVersion to verify that the version was not changed during our (already completed) read, a 2-way std::memory_order_acquire is required. Putting the appropriate std::atomic_thread_fence(...) calls to these places would resolve the issue. However, replacing the current function from atomicops-internals-arm64.h to the c++ standard functions will make the code more consistent. Reason for changing to std::atomics: atomicops-internals-arm64.h was picked from chromium source and the missing functions was reimplemented. The header is gone since, and even chromium uses a solution based on std::atomic (see base/atomicops_internals_portable.h in chromium source). I see no reason to update the header from chromium and implement the missing functions, just to have one more abstraction layer, that is basically just "function aliases" at this point. Nodes are PACKED, which conflicts with std::atomic. However, when we allocate a node, it is allocated with sufficient alignment for atomic operations. Other unaligned structures (the values of the data) are stored between nodes. Removing PACKED would increase the memory footprint and actually slow things down slightly. Notes: + std::atomic:: usually blocks reordering in one direction. std::atomic_thread_fence(...) blocks reordering in both directions. - std::assume_aligned is from C++20. The used __builtin_assume_aligned should be supported by both Clang and GCC. - I renamed `IsLocked` to `IsLockedUnsafe` because it should only be used by the thread that actually holds the lock (as it is currently only used in `DCHECK` macros when we hold the lock). Performance Tests on an aws c6i.xlarge : === master = Performance counter stats for 'bin/cbtree-test --gtest_filter=*TestScanPerformance* --gtest_repeat=10': 45832.36 msec task-clock#1.000 CPUs utilized 165 context-switches #0.004 K/sec 0 cpu-migrations#0.000 K/sec 81868 page-faults #0.002 M/sec cycles instructions branches branch-misses 45.853664150 seconds time elapsed 45.736053000 seconds user 0.096008000 seconds sys Performance counter stats for 'bin/memrowset-test --gtest_filter=*TestMemRowSetUpdatePerformance* --gtest_repeat=10': 15873.74 msec task-clock#0.999 CPUs utilized 68 context-switches #0.004 K/sec 0 cpu-migrations#0.000 K/sec 73408 page-faults #0.005 M/sec cycles instructions branches branch-misses 15.893188554 seconds time elapsed 15.745611000 seconds user 0.128073000 seconds sys Performance counter stats for 'bin/deltamemstore-test --gtest_filter=*BenchmarkManyUpdatesToOneRow* --gtest_repeat=10': 471.49 msec task-clock#0.960 CPUs utilized 3 context-switches #0.006 K/sec
[kudu-CR] Fix deadlock on fail for CBTree-test
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21446 to look at the new patch set (#3). Change subject: Fix deadlock on fail for CBTree-test .. Fix deadlock on fail for CBTree-test When TestConcurrentIterateAndInsert, TestConcurrentInsert, TestRacyConcurrentInsert fail while --gtest_repeat is used, they will keep running forever. Instead of just returning on fail, they should properly stop the other threads running, and then exit. To reproduce the problem, run this on ARM (where the test actually fails): ./bin/cbtree-test --gtest_repeat=100 --gtest_filter=*Racy* Change-Id: Ia10d05dfdc4a12cb034450f432693f054d138498 --- M src/kudu/tablet/cbtree-test.cc 1 file changed, 3 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/46/21446/3 -- To view, visit http://gerrit.cloudera.org:8080/21446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia10d05dfdc4a12cb034450f432693f054d138498 Gerrit-Change-Number: 21446 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [ARM] Concurrent binary tree memory barriers fixed.
Hello Tidy Bot, Zoltan Chovan, Alexey Serbin, Ashwani Raina, Kudu Jenkins, Anonymous Coward (763), I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21127 to look at the new patch set (#12). Change subject: [ARM] Concurrent binary tree memory barriers fixed. .. [ARM] Concurrent binary tree memory barriers fixed. TestCBTree.TestRacyConcurrentInsert sometimes fails on ARM. The concurrent inserts produce a tree with some incorrectly ordered nodes. Apart from incorrect order, there are no other errors in the tree. All inserted elements are present, but CBTree::GetCopy cannot find some due to the incorrect ordering. This is not a unit test error, but a real error in the CBTree implementation. Modifying the test to only do the inserts first, and only start checking when the tree is finalized does not prevent the bug. So calling only the tree->Insert function from multiple threads is sufficient to reproduce the issue (only on ARM). Root cause: Some memory order restrictions are not strict enough. It does not cause a problem on x86 (There were no real changes for 11 years), but it causes problems on ARM. x86 guarantees a globally consistent order for stores (TSO). ARM, in contrast, allows stores to different memory locations to be observed differently from the program order. More info: https://www.sciencedirect.com/science/article/pii/S1383762124000390 Solution: The following barriers need to be more strict: 1. When we set the Splitting/Inserting flag on a node, then it is not enough to flush all previous changes. It is also very important not to reorder any write before it. So instead of Release_Store (which is practically equivalent to std::memory_order_release), we need a 2 way barrier. 2. When we call AcquireVersion or StableVersion to verify that the version was not changed during our (already completed) read, a 2-way std::memory_order_acquire is required. Putting the appropriate std::atomic_thread_fence(...) calls to these places would resolve the issue. However, replacing the current function from atomicops-internals-arm64.h to the c++ standard functions will make the code more consistent. Reason for changing to std::atomics: atomicops-internals-arm64.h was picked from chromium source and the missing functions was reimplemented. The header is gone since, and even chromium uses a solution based on std::atomic (see base/atomicops_internals_portable.h in chromium source). I see no reason to update the header from chromium and implement the missing functions, just to have one more abstraction layer, that is basically just "function aliases" at this point. Nodes are PACKED, which conflicts with std::atomic. However, when we allocate a node, it is allocated with sufficient alignment for atomic operations. Other unaligned structures (the values of the data) are stored between nodes. Removing PACKED would increase the memory footprint and actually slow things down slightly. Notes: + std::atomic:: usually blocks reordering in one direction. std::atomic_thread_fence(...) blocks reordering in both directions. - std::assume_aligned is from C++20. The used __builtin_assume_aligned should be supported by both Clang and GCC. - I renamed `IsLocked` to `IsLockedUnsafe` because it should only be used by the thread that actually holds the lock (as it is currently only used in `DCHECK` macros when we hold the lock). Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 --- M src/kudu/tablet/cbtree-test.cc M src/kudu/tablet/concurrent_btree.h 2 files changed, 110 insertions(+), 73 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/27/21127/12 -- To view, visit http://gerrit.cloudera.org:8080/21127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 Gerrit-Change-Number: 21127 Gerrit-PatchSet: 12 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anonymous Coward (763) Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Zoltan Chovan Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] Fix deadlock on fail for CBTree-test
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21446 to look at the new patch set (#2). Change subject: Fix deadlock on fail for CBTree-test .. Fix deadlock on fail for CBTree-test When TestConcurrentIterateAndInsert, TestConcurrentInsert, TestRacyConcurrentInsert fail while --gtest_repeat is used, they will keep running forever. Instead of just returning on fail, they should properly stop the other threads running, and then exit. To reproduce the problem, run this on ARM (where the test actually fails): ./bin/cbtree-test --gtest_repeat=100 --gtest_filter=*Racy* Change-Id: Ia10d05dfdc4a12cb034450f432693f054d138498 --- M src/kudu/tablet/cbtree-test.cc 1 file changed, 3 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/46/21446/2 -- To view, visit http://gerrit.cloudera.org:8080/21446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia10d05dfdc4a12cb034450f432693f054d138498 Gerrit-Change-Number: 21446 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [ARM] Concurrent binary tree memory barriers fixed.
Hello Tidy Bot, Zoltan Chovan, Alexey Serbin, Ashwani Raina, Kudu Jenkins, Anonymous Coward (763), I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21127 to look at the new patch set (#11). Change subject: [ARM] Concurrent binary tree memory barriers fixed. .. [ARM] Concurrent binary tree memory barriers fixed. TestCBTree.TestRacyConcurrentInsert sometimes fails on ARM. The concurrent inserts produce a tree with some incorrectly ordered nodes. Apart from incorrect order, there are no other errors in the tree. All inserted elements are present, but CBTree::GetCopy cannot find some due to the incorrect ordering. This is not a unit test error, but a real error in the CBTree implementation. Modifying the test to only do the inserts first, and only start checking when the tree is finalized does not prevent the bug. So calling only the tree->Insert function from multiple threads is sufficient to reproduce the issue (only on ARM). Root cause: Some memory order restrictions are not strict enough. It does not cause a problem on x86 (There were no real changes for 11 years), but it causes problems on ARM. x86 guarantees a globally consistent order for stores (TSO). ARM, in contrast, allows stores to different memory locations to be observed differently from the program order. More info: https://www.sciencedirect.com/science/article/pii/S1383762124000390 Solution: The following barriers need to be more strict: 1. When we set the Splitting/Inserting flag on a node, then it is not enough to flush all previous changes. It is also very important not to reorder any write before it. So instead of Release_Store (which is practically equivalent to std::memory_order_release), we need a 2 way barrier. 2. When we call AcquireVersion or StableVersion to verify that the version was not changed during our (already completed) read, a 2-way std::memory_order_acquire is required. Putting the appropriate std::atomic_thread_fence(...) calls to these places would resolve the issue. However, replacing the current function from atomicops-internals-arm64.h to the c++ standard functions will make the code more consistent. Reason for changing to std::atomics: atomicops-internals-arm64.h was picked from chromium source and the missing functions was reimplemented. The header is gone since, and even chromium uses a solution based on std::atomic (see base/atomicops_internals_portable.h in chromium source). I see no reason to update the header from chromium and implement the missing functions, just to have one more abstraction layer, that is basically just "function aliases" at this point. Nodes are PACKED, which conflicts with std::atomic. However, when we allocate a node, it is allocated with sufficient alignment for atomic operations. Other unaligned structures (the values of the data) are stored between nodes. Removing PACKED would increase the memory footprint and actually slow things down slightly. Notes: + std::atomic:: usually blocks reordering in one direction. std::atomic_thread_fence(...) blocks reordering in both directions. - std::assume_aligned is from C++20. The used __builtin_assume_aligned should be supported by both Clang and GCC. - I renamed `IsLocked` to `IsLockedUnsafe` because it should only be used by the thread that actually holds the lock (as it is currently only used in `DCHECK` macros when we hold the lock). Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 --- M src/kudu/tablet/concurrent_btree.h 1 file changed, 108 insertions(+), 72 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/27/21127/11 -- To view, visit http://gerrit.cloudera.org:8080/21127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 Gerrit-Change-Number: 21127 Gerrit-PatchSet: 11 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anonymous Coward (763) Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Zoltan Chovan Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] Add a benchmark for CBTree concurrent writes.
Zoltan Martonka has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21447 Change subject: Add a benchmark for CBTree concurrent writes. .. Add a benchmark for CBTree concurrent writes. Before updating CBTree for ARM (where it is misbehaving currently), we should have a proper test for two scenarios: + Writing on multiple threads. + Reading on multiple threads while there are also active writes. If read threads wait for values to be inserted, it defeats the purpose of benchmarking. Therefore, we should first populate a tree with values for the read threads. The read threads will then read values that are already in the tree, while the write threads continue to insert new values. Setting up the tree for the second scenario essentially involves performing the first scenario. This is why both scenarios are combined into a single test. The new test provides the following new features (compared to just running DoTestConcurrentInsert with higher parameters): + Different threads read the value that inserted it + Reader threads can't assigned to a certain writer thread. + Keys are better distributed than the previous shuffle method. Reader threads start concurrently with writing threads, not at the end of each write thread (unlike DoTestConcurrentInsert). Note that running only concurrent reads should not differ from TestScanPerformance, since no locking takes place and they do not sabotage eachi other. So no new test is required for that scenario. Change-Id: I1b0b16e269c70716962fc5ebb4ddca1e2cbe68a4 --- M src/kudu/tablet/cbtree-test.cc 1 file changed, 161 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/47/21447/1 -- To view, visit http://gerrit.cloudera.org:8080/21447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I1b0b16e269c70716962fc5ebb4ddca1e2cbe68a4 Gerrit-Change-Number: 21447 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Martonka
[kudu-CR] Fix deadlock on fail for CBTree-test
Zoltan Martonka has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21446 Change subject: Fix deadlock on fail for CBTree-test .. Fix deadlock on fail for CBTree-test When TestConcurrentIterateAndInsert, TestConcurrentInsert, TestRacyConcurrentInsert fail while --gtest_repeat is used, they will keep running forever. Instead of just returning on fail, they should properly stop the other threads running, and then exit. To reproduce the problem, run this on ARM (where the test actually fails): ./bin/cbtree-test --gtest_repeat=100 --gtest_filter=*Racy* Change-Id: Ia10d05dfdc4a12cb034450f432693f054d138498 --- M src/kudu/tablet/cbtree-test.cc 1 file changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/46/21446/1 -- To view, visit http://gerrit.cloudera.org:8080/21446 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ia10d05dfdc4a12cb034450f432693f054d138498 Gerrit-Change-Number: 21446 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Martonka
[kudu-CR] [glog] Fix minidump-test for cold symbols.
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/21385 ) Change subject: [glog] Fix minidump-test for cold symbols. .. Patch Set 3: OK, I abandon this patch and will try to update to Glog 0.7.0 -- To view, visit http://gerrit.cloudera.org:8080/21385 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I18d88790d09e9763223a767617794332859cbfad Gerrit-Change-Number: 21385 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Zoltan Chovan Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Tue, 14 May 2024 10:30:09 + Gerrit-HasComments: No
[kudu-CR] [glog] Fix minidump-test for cold symbols.
Zoltan Martonka has abandoned this change. ( http://gerrit.cloudera.org:8080/21385 ) Change subject: [glog] Fix minidump-test for cold symbols. .. Abandoned -- To view, visit http://gerrit.cloudera.org:8080/21385 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: I18d88790d09e9763223a767617794332859cbfad Gerrit-Change-Number: 21385 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Zoltan Chovan Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] [security-flags-itest] Fix missing command line flags
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/21399 ) Change subject: [security-flags-itest] Fix missing command line flags .. Patch Set 3: Please add a proper description of the problem and the solution. + Mention that the test fails because the flag is not registered properly. + The flag is defined in server_process.a. We do not use any function or variable from it. + It is a known C++ phenomenon that the INITIALIZATION ROUTINES of the library might not be included if no symbol is used from the library (despite the initialization routines having an effect on used functionality, such as registering the variable in the gflags ecosystem). Also, --whole-archive fixes the issue and works perfectly if you use it only for the problematic library. The first google result for "how to add --whole-archive in cmake" works. You are doing something wrong... -- To view, visit http://gerrit.cloudera.org:8080/21399 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec751e8761562612d97b886740c9b20cd134a0bc Gerrit-Change-Number: 21399 Gerrit-PatchSet: 3 Gerrit-Owner: Ádám Bakai Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anonymous Coward (763) Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Zoltan Martonka Gerrit-Reviewer: Ádám Bakai Gerrit-Comment-Date: Mon, 13 May 2024 15:16:16 + Gerrit-HasComments: No
[kudu-CR] [glog] Fix minidump-test for cold symbols.
Hello Alexey Serbin, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21385 to look at the new patch set (#2). Change subject: [glog] Fix minidump-test for cold symbols. .. [glog] Fix minidump-test for cold symbols. glog 0.6.0 cannot reliably resolve cold symbols. This was only fixed in recent versions: https://github.com/google/glog/issues/869 This might cause ASSERT_DEATH checks to fail in minidump-test. The long term solution would be to updata to 0.7.0: https://issues.apache.org/jira/browse/KUDU-3572 One affected platform is RHEL 9.3 with default g++. Change-Id: I18d88790d09e9763223a767617794332859cbfad --- M src/kudu/util/minidump-test.cc 1 file changed, 10 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/21385/2 -- To view, visit http://gerrit.cloudera.org:8080/21385 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I18d88790d09e9763223a767617794332859cbfad Gerrit-Change-Number: 21385 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [glog] Fix minidump-test for cold symbols.
Hello Alexey Serbin, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21385 to look at the new patch set (#3). Change subject: [glog] Fix minidump-test for cold symbols. .. [glog] Fix minidump-test for cold symbols. glog 0.6.0 cannot reliably resolve cold symbols. This was only fixed in recent versions: https://github.com/google/glog/issues/869 This might cause ASSERT_DEATH checks to fail in minidump-test. The long term solution would be to update to 0.7.0: https://issues.apache.org/jira/browse/KUDU-3572 One affected platform is RHEL 9.3 with default g++. Change-Id: I18d88790d09e9763223a767617794332859cbfad --- M src/kudu/util/minidump-test.cc 1 file changed, 10 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/21385/3 -- To view, visit http://gerrit.cloudera.org:8080/21385 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I18d88790d09e9763223a767617794332859cbfad Gerrit-Change-Number: 21385 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [ARM] Concurrent binary tree memory barriers fixed.
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/21127 ) Change subject: [ARM] Concurrent binary tree memory barriers fixed. .. Patch Set 9: > (1 comment) > > > (1 comment) > > > > I realized that, the only proper way to add a 2-way load barrier > is > > with > > std::atomic_thread_fence(std::memory_order_acquire), which is > more > > strict, than > > the 1-way std::atomic::load(std::memory_order_acquire); > > > > I made some performance test. I used TestScanPerformance from > > cbtree-test and > > TestMemRowSetUpdatePerformance from memrowset-test. I also added > 2 > > more > > performance tests, one using basic arena, one using > > ThreadSafeMemoryTrackingArena (we only use the 2nd in > production). > > I will add g++ tests too (as soon as it compiles). > > See the branches sheet for name resolution. > > For some reason, I assumed you were going to add those tests into > this patch or as a separate patch, but it seems by 'adding > > > https://docs.google.com/spreadsheets/d/1whl8h7S0DcjVYp0jksR87P9v4U8TQH9nS1C9N0D-lUo/edit?usp=sharing > > > (1 comment) > > > > I realized that, the only proper way to add a 2-way load barrier > is > > with > > std::atomic_thread_fence(std::memory_order_acquire), which is > more > > strict, than > > the 1-way std::atomic::load(std::memory_order_acquire); > > > > I made some performance test. I used TestScanPerformance from > > cbtree-test and > > TestMemRowSetUpdatePerformance from memrowset-test. I also added > 2 > > more > > performance tests, one using basic arena, one using > > ThreadSafeMemoryTrackingArena (we only use the 2nd in > production). > > I will add g++ tests too (as soon as it compiles). > > See the branches sheet for name resolution. > > For some reason, I assumed you were going to add your new tests as > a part of this changelist as well, but it seems you just meant > adding results from running those tests into the spreadsheet :) > > OK, maybe it makes sense to add your new tests into Kudu as well? > Feel free to post that as a separate changelist. > > Also, it would be great if you could address other feedback items > (nits, questions of other reviewers, etc.) and post a follow-up > revision, if necessary. > > Thanks! > > > https://docs.google.com/spreadsheets/d/1whl8h7S0DcjVYp0jksR87P9v4U8TQH9nS1C9N0D-lUo/edit?usp=sharing I run the arm tests on a c6g.2xlarge arm machine, and the x86 tests on a physical pc in the bp office (I can include the spec if you want). I thought only the changes matter (not the comparison of x86 and ARM), however I can try to rerun the tests on a similar x86 than the arm machine (c6i.2xlarge?)/ All times are in ms, the value that the google test prints (TestMemRowSetUpdatePerformance included). The (same platform) tests were run on the same machine. One at a time through ssh. No other task was taking place (I even closed VS Code to prevent it from doing any background task). x86_master_clang_2 and x86_master_clang are the same, only 3 hours apart (And I have no idea why the second run is a bit slower). My new tests are nearly the same as "TestConcurrentInsert", just using the ThreadSafeMemoryTrackingArena and the sizes that production actually uses. I could add a test with less inserts using this parameters. (If the tree works while it is small, it will work, when it is big, because threads will "collide" less often). I will post a separate gerrit for this test (Same as TestPerformanceThreadSafe, just runs <1s). -- To view, visit http://gerrit.cloudera.org:8080/21127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 Gerrit-Change-Number: 21127 Gerrit-PatchSet: 9 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anonymous Coward (763) Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Zoltan Chovan Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Tue, 07 May 2024 13:14:22 + Gerrit-HasComments: No
[kudu-CR] [g++11] Fix DecrementIntCell for g++10 and g++11
Hello Alexey Serbin, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21396 to look at the new patch set (#2). Change subject: [g++11] Fix DecrementIntCell for g++10 and g++11 .. [g++11] Fix DecrementIntCell for g++10 and g++11 There seems to be a compiler bug, that optimizes out the safety check for INT_MIN in the DecrementIntCell function. It appears on RHEL 9.2 with g++ 11.4.1. Only in Release build. For more infoi, see: https://stackoverflow.com/questions/78424303/g-optimizes-away-check-for-int-min-in-release-build The issue seems to be fixed in g++12 and not yet present in g++9. Solution: Slightly change the function to ensure it is compiled correctly. This modification should not alter the correct optimized code. Basically, any change where the compiler cannot perform the two optimization steps (in this order) should address the issue: + if (x == INT_MIN) x = INT_MAX; else x -= 1; > x -= 1 (this is equivalent on the x86 platform). + if (x - 1 < x) > if (true) (this equivalence holds only at the mathematical level). Change-Id: Ia3cea2849a88c4d7e2587ceb805cd3258652e3c5 --- M src/kudu/common/key_util.cc 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/96/21396/2 -- To view, visit http://gerrit.cloudera.org:8080/21396 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia3cea2849a88c4d7e2587ceb805cd3258652e3c5 Gerrit-Change-Number: 21396 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [g++11] Fix DecrementIntCell for g++10 and g++11
Zoltan Martonka has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21396 Change subject: [g++11] Fix DecrementIntCell for g++10 and g++11 .. [g++11] Fix DecrementIntCell for g++10 and g++11 There seems to be a compiler bug, that optimizes out the safety check for INT_MIN in the DecrementIntCell function. It appears on RHEL 9.2 with g++ 11.4.1. Only in Release build. For more infoi, see: https://stackoverflow.com/questions/78424303/g-optimizes-away-check-for-int-min-in-release-build The issue seems to be fixed in g++12 and not yet present in g++9. Solution: Slightly change the function to ensure it is compiled correctly. This modification should not alter the correct optimized code. Change-Id: Ia3cea2849a88c4d7e2587ceb805cd3258652e3c5 --- M src/kudu/common/key_util.cc 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/96/21396/1 -- To view, visit http://gerrit.cloudera.org:8080/21396 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ia3cea2849a88c4d7e2587ceb805cd3258652e3c5 Gerrit-Change-Number: 21396 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Martonka
[kudu-CR] [glog] Fix minidump-test for cold symbols.
Zoltan Martonka has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21385 Change subject: [glog] Fix minidump-test for cold symbols. .. [glog] Fix minidump-test for cold symbols. glog 0.6.0 cannot reliably resolve cold symbols. This was only fixed in recent versions: https://github.com/google/glog/issues/869 This might cause ASSERT_DEATH checks to fail in minidump-test. The long term solution would be to updatate to 0.7.0: https://issues.apache.org/jira/browse/KUDU-3572 Change-Id: I18d88790d09e9763223a767617794332859cbfad --- M src/kudu/util/minidump-test.cc 1 file changed, 10 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/85/21385/1 -- To view, visit http://gerrit.cloudera.org:8080/21385 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I18d88790d09e9763223a767617794332859cbfad Gerrit-Change-Number: 21385 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Martonka
[kudu-CR] [build] Fix RocksDB Snappy dependency.
Hello Yingchun Lai, Attila Bukor, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21370 to look at the new patch set (#3). Change subject: [build] Fix RocksDB Snappy dependency. .. [build] Fix RocksDB Snappy dependency. RocksDB uses Snappy. When statically linking Kudu, librocksdb.a contains multiple undefined symbols that are defined in libsnappy.a. In the case of static linking, the order of libraries passed to the linker matters, and libsnappy.a must come after librocksdb.a. This sometimes causes the release build to fail on RHEL 9.3. Change-Id: I3ce75f69d94436f732dbe9a0011546b1ae494824 --- M CMakeLists.txt 1 file changed, 2 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/70/21370/3 -- To view, visit http://gerrit.cloudera.org:8080/21370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3ce75f69d94436f732dbe9a0011546b1ae494824 Gerrit-Change-Number: 21370 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yingchun Lai
[kudu-CR] KUDU-3568 Fix budgeting constraint test by enabling preset factor
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/21360 ) Change subject: KUDU-3568 Fix budgeting constraint test by enabling preset factor .. Patch Set 1: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/21360 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9df218cd2d8ef3709793db267d5a0d651421dbb6 Gerrit-Change-Number: 21360 Gerrit-PatchSet: 1 Gerrit-Owner: Ashwani Raina Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Thu, 02 May 2024 07:48:22 + Gerrit-HasComments: No
[kudu-CR] [Python] Update dependencies and 3.10 support
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/21380 ) Change subject: [Python] Update dependencies and 3.10 support .. Patch Set 4: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/21380 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I930a15018158563996819fb33d32a3bb5641c396 Gerrit-Change-Number: 21380 Gerrit-PatchSet: 4 Gerrit-Owner: Marton Greber Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Marton Greber Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Thu, 02 May 2024 07:41:42 + Gerrit-HasComments: No
[kudu-CR] [build] Fix RocksDB Snappy dependency.
Hello Attila Bukor, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21370 to look at the new patch set (#2). Change subject: [build] Fix RocksDB Snappy dependency. .. [build] Fix RocksDB Snappy dependency. RocksDB uses Snappy. When statically linking Kudu, librocksdb.a contains multiple undefined symbols that are defined in libsnappy.a. In the case of static linking, the order of libraries passed to the linker matters, and libsnappy.a must come after librocksdb.a. This sometimes causes the release build to fail on RHEL 9.3. Change-Id: I3ce75f69d94436f732dbe9a0011546b1ae494824 --- M CMakeLists.txt 1 file changed, 2 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/70/21370/2 -- To view, visit http://gerrit.cloudera.org:8080/21370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3ce75f69d94436f732dbe9a0011546b1ae494824 Gerrit-Change-Number: 21370 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [build] Fix RocksDB Snappy dependency.
Zoltan Martonka has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21370 Change subject: [build] Fix RocksDB Snappy dependency. .. [build] Fix RocksDB Snappy dependency. RocksDB uses Snappy. When statically linking Kudu, librocksdb.a contains multiple undefined symbols that are defined in libsnappy.a. In the case of static linking, the order of libraries passed to the linker matters, and libsnappy.a must come after librocksdb.a. This causes the release build to fail on RHEL 9.3. Change-Id: I3ce75f69d94436f732dbe9a0011546b1ae494824 --- M CMakeLists.txt 1 file changed, 2 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/70/21370/1 -- To view, visit http://gerrit.cloudera.org:8080/21370 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I3ce75f69d94436f732dbe9a0011546b1ae494824 Gerrit-Change-Number: 21370 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Martonka
[kudu-CR] [ARM] Concurrent binary tree memory barriers fixed.
Hello Tidy Bot, Zoltan Chovan, Alexey Serbin, Ashwani Raina, Kudu Jenkins, Anonymous Coward (763), I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21127 to look at the new patch set (#10). Change subject: [ARM] Concurrent binary tree memory barriers fixed. .. [ARM] Concurrent binary tree memory barriers fixed. TestCBTree.TestRacyConcurrentInsert sometimes fails on ARM. The concurrent inserts produce a tree with some incorrectly ordered nodes. Apart from incorrect order, there are no other errors in the tree. All inserted elements are present, but CBTree::GetCopy cannot find some due to the incorrect ordering. This is not a unit test error, but a real error in the CBTree implementation. Modifying the test to only do the inserts first, and only start checking when the tree is finalized does not prevent the bug. So calling only the tree->Insert function from multiple threads is sufficient to reproduce the issue (only on ARM). Root cause: Some memory order restrictions are not strict enough. It does not cause a problem on x86 (There were no real changes for 11 years), but it causes problems on ARM. x86 guarantees a globally consistent order for stores (TSO). ARM, in contrast, allows stores to different memory locations to be observed differently from the program order. More info: https://www.sciencedirect.com/science/article/pii/S1383762124000390 Solution: The following 6 barriers need to be more strict: 1. When we set the Splitting/Inserting flag on a node, then it is not enough to flush all previous changes. It is also very important not to reorder any write before it. So instead of Release_Store (which is practically equivalent to std::memory_order_release), we need a 2 way barrier. 2. When we search in SeekToLeaf/GetCopy/ContainsKey/TraverseToLeaf, and we check if the version is unchanged, then a load_acquire (equivalent to std::memory_order_acquire) is not sufficient. We have to also make sure no read is reordered in the other direction. Putting the appropriate std::atomic_thread_fence(...) calls to this 6 places would resolve the issue. However, replacing the current function from atomicops-internals-arm64.h to the c++ standard functions will make the code more consistent. Reason for changing to std::atomics: atomicops-internals-arm64.h was picked from chromium source and the missing functions was reimplemented. The header is gone since, and even chromium uses a solution based on std::atomic (see base/atomicops_internals_portable.h in chromium source). I see no reason to update the header from chromium and implement the missing functions, just to have one more abstraction layer, that is basically just "function aliases" at this point. Reason for removing PACKED: Although the padding of the Nodes was turn of, the following happened (both on ARM and x86): + NodeBase was 16 byte, so there was no padding after the base class. + LeafNode and InternalNode was 249, 244 bytes (used in memrowset and deltamemstore) + When we reserved a new node, we used arena_->AllocateBytesAligned(...), with 8 bytes alignment. So the end result was exactly the same, as if there were no "PACKED" attribute. Also compiler kept warning about " might be unaligned", but it were never unaligned. By removing the packed PACKED we change nothing in the memory layout. (neither on ARM or x86). It just gets rid of the old compiler warning. (std::atomic disable packing anyway). Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 --- M src/kudu/tablet/cbtree-test.cc M src/kudu/tablet/concurrent_btree.h 2 files changed, 101 insertions(+), 74 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/27/21127/10 -- To view, visit http://gerrit.cloudera.org:8080/21127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 Gerrit-Change-Number: 21127 Gerrit-PatchSet: 10 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anonymous Coward (763) Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Zoltan Chovan Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] [ARM] Concurrent binary tree memory barriers fixed.
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/21127 ) Change subject: [ARM] Concurrent binary tree memory barriers fixed. .. Patch Set 6: (1 comment) I realized that, the only proper way to add a 2-way load barrier is with std::atomic_thread_fence(std::memory_order_acquire), which is more strict, than the 1-way std::atomic::load(std::memory_order_acquire); I made some performance test. I used TestScanPerformance from cbtree-test and TestMemRowSetUpdatePerformance from memrowset-test. I also added 2 more performance tests, one using basic arena, one using ThreadSafeMemoryTrackingArena (we only use the 2nd in production). I will add g++ tests too (as soon as it compiles). See the branches sheet for name resolution. https://docs.google.com/spreadsheets/d/1whl8h7S0DcjVYp0jksR87P9v4U8TQH9nS1C9N0D-lUo/edit?usp=sharing http://gerrit.cloudera.org:8080/#/c/21127/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21127/6//COMMIT_MSG@33 PS6, Line 33: The following 6 barriers need to be more strict: > Did you run the TestCBTree.TestScanPerformance scenario from the cbtree-tes See the google sheet. -- To view, visit http://gerrit.cloudera.org:8080/21127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 Gerrit-Change-Number: 21127 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anonymous Coward (763) Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Zoltan Chovan Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Tue, 23 Apr 2024 13:00:06 + Gerrit-HasComments: Yes
[kudu-CR] [ARM] Concurrent binary tree memory barriers fixed.
Hello Tidy Bot, Zoltan Chovan, Alexey Serbin, Ashwani Raina, Kudu Jenkins, Anonymous Coward (763), I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21127 to look at the new patch set (#9). Change subject: [ARM] Concurrent binary tree memory barriers fixed. .. [ARM] Concurrent binary tree memory barriers fixed. TestCBTree.TestRacyConcurrentInsert sometimes fails on ARM. The concurrent inserts produce a tree with some incorrectly ordered nodes. Apart from incorrect order, there are no other errors in the tree. All inserted elements are present, but CBTree::GetCopy cannot find some due to the incorrect ordering. This is not a unit test error, but a real error in the CBTree implementation. Modifying the test to only do the inserts first, and only start checking when the tree is finalized does not prevent the bug. So calling only the tree->Insert function from multiple threads is sufficient to reproduce the issue (only on ARM). Root cause: Some memory order restrictions are not strict enough. It does not cause a problem on x86 (There were no real changes for 11 years), but it causes problems on ARM. x86 guarantees a globally consistent order for stores (TSO). ARM, in contrast, allows stores to different memory locations to be observed differently from the program order. More info: https://www.sciencedirect.com/science/article/pii/S1383762124000390 Solution: The following 7 barriers need to be more strict: 1. When we set the Splitting/Inserting flag on a node, then it is not enough to flush all previous changes. It is also very important not to reorder any write before it. So instead of Release_Store (which is practically equivalent to std::memory_order_release), we need a 2 way barrier. 2. When we search in SeekToLeaf/GetCopy/ContainsKey/TraverseToLeaf/SeekNextLeaf, and we check if the version is unchanged, then a load_acquire (equivalent to std::memory_order_acquire) is not sufficient. We have to also make sure no read is reordered in the other direction. Putting the appropriate std::atomic_thread_fence(...) calls to these 7 places would resolve the issue. However, replacing the current function from atomicops-internals-arm64.h to the C++ standard functions will make the code more consistent. Reason for changing to std::atomics: atomicops-internals-arm64.h was picked from chromium source and the missing functions was reimplemented. The header is gone since, and even chromium uses a solution based on std::atomic (see base/atomicops_internals_portable.h in chromium source). I see no reason to update the header from chromium and implement the missing functions, just to have one more abstraction layer, that is basically just "function aliases" at this point. Reason for __builtin_assume_aligned and reinterpret_cast: Although the nodes are PACKED, We guarantee that they will be aligned to 8 bytes. They can have unaligned values put between them in the same arena, so packing does save memory. If you put std::atomic into a struct, g++ refuses to pack it, so we need a placeholder. Reason for renaming IsLocked to IsLockedUnsafe: In the original version, it was an unsafe memory access. Currently, it is only in places where we hold the lock (which means we have atomic access to the version before and after in the same thread). Also, it is only used in DCHECK() macros. However, I found it disturbing to have an unsafe function similarly named as all the other proper atomic functions around it. I also fixed TestRacyConcurrentInsert, so it won't freez after failing. Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 --- M src/kudu/tablet/cbtree-test.cc M src/kudu/tablet/concurrent_btree.h 2 files changed, 100 insertions(+), 74 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/27/21127/9 -- To view, visit http://gerrit.cloudera.org:8080/21127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 Gerrit-Change-Number: 21127 Gerrit-PatchSet: 9 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anonymous Coward (763) Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Zoltan Chovan Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] [ARM] Concurrent binary tree memory barriers fixed.
Hello Tidy Bot, Zoltan Chovan, Alexey Serbin, Ashwani Raina, Kudu Jenkins, Anonymous Coward (763), I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21127 to look at the new patch set (#8). Change subject: [ARM] Concurrent binary tree memory barriers fixed. .. [ARM] Concurrent binary tree memory barriers fixed. TestCBTree.TestRacyConcurrentInsert sometimes fails on ARM. The concurrent inserts produce a tree with some incorrectly ordered nodes. Apart from incorrect order, there are no other errors in the tree. All inserted elements are present, but CBTree::GetCopy cannot find some due to the incorrect ordering. This is not a unit test error, but a real error in the CBTree implementation. Modifying the test to only do the inserts first, and only start checking when the tree is finalized does not prevent the bug. So calling only the tree->Insert function from multiple threads is sufficient to reproduce the issue (only on ARM). Root cause: Some memory order restrictions are not strict enough. It does not cause a problem on x86 (There were no real changes for 11 years), but it causes problems on ARM. x86 guarantees a globally consistent order for stores (TSO). ARM, in contrast, allows stores to different memory locations to be observed differently from the program order. More info: https://www.sciencedirect.com/science/article/pii/S1383762124000390 Solution: The following 7 barriers need to be more strict: 1. When we set the Splitting/Inserting flag on a node, then it is not enough to flush all previous changes. It is also very important not to reorder any write before it. So instead of Release_Store (which is practically equivalent to std::memory_order_release), we need a 2 way barrier. 2. When we search in SeekToLeaf/GetCopy/ContainsKey/TraverseToLeaf/SeekNextLeaf, and we check if the version is unchanged, then a load_acquire (equivalent to std::memory_order_acquire) is not sufficient. We have to also make sure no read is reordered in the other direction. Putting the appropriate std::atomic_thread_fence(...) calls to this 7 places would resolve the issue. However, replacing the current function from atomicops-internals-arm64.h to the C++ standard functions will make the code more consistent. Reason for changing to std::atomics: atomicops-internals-arm64.h was picked from chromium source and the missing functions was reimplemented. The header is gone since, and even chromium uses a solution based on std::atomic (see base/atomicops_internals_portable.h in chromium source). I see no reason to update the header from chromium and implement the missing functions, just to have one more abstraction layer, that is basically just "function aliases" at this point. Reason for __builtin_assume_aligned and reinterpret_cast: Although the nodes are PACKED, We guarantee that they will be aligned to 8 bytes. They can have unaligned values put between them in the same arena, so packing does save memory. If you put std::atomic into a struct, g++ refuses to pack it, so we need a placeholder. Reason for renaming IsLocked to IsLockedUnsafe: In the original version, it was an unsafe memory access. Currently, it is only in places where we hold the lock (which means we have atomic access to the version before and after in the same thread). Also, it is only used in DCHECK() macros. However, I found it disturbing to have an unsafe function similarly named as all the other proper atomic functions around it. I also fixed TestRacyConcurrentInsert, so it won't freez after failing. Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 --- M src/kudu/tablet/cbtree-test.cc M src/kudu/tablet/concurrent_btree.h 2 files changed, 100 insertions(+), 74 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/27/21127/8 -- To view, visit http://gerrit.cloudera.org:8080/21127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 Gerrit-Change-Number: 21127 Gerrit-PatchSet: 8 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anonymous Coward (763) Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Zoltan Chovan Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] [ARM] Concurrent binary tree memory barriers fixed.
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/21127 ) Change subject: [ARM] Concurrent binary tree memory barriers fixed. .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/21127/6/src/kudu/tablet/concurrent_btree.h File src/kudu/tablet/concurrent_btree.h: http://gerrit.cloudera.org:8080/#/c/21127/6/src/kudu/tablet/concurrent_btree.h@390 PS6, Line 390: AtomicVersionValue VersionAcqRel() { : return version_.load(std::memory_order_acq_rel); : } > I'm curios how this make sense if by the specification the behavior of std: No, I missed it. There is actually no right flag for what we wan't to do here. We need to use std::atomic_thread_fence(std::memory_order_acquire); (because it is 2-way). To my best knowledge, g++ compiled the undefined behaviour as seq_cst, so it worked, but also slowed down the execution. -- To view, visit http://gerrit.cloudera.org:8080/21127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 Gerrit-Change-Number: 21127 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anonymous Coward (763) Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Zoltan Chovan Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Tue, 23 Apr 2024 12:57:02 + Gerrit-HasComments: Yes
[kudu-CR] [ARM] Concurrent binary tree memory barriers fixed.
Hello Tidy Bot, Zoltan Chovan, Alexey Serbin, Ashwani Raina, Kudu Jenkins, Anonymous Coward (763), I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21127 to look at the new patch set (#7). Change subject: [ARM] Concurrent binary tree memory barriers fixed. .. [ARM] Concurrent binary tree memory barriers fixed. TestCBTree.TestRacyConcurrentInsert sometimes fails on ARM. The concurrent inserts produce a tree with some incorrectly ordered nodes. Apart from incorrect order, there are no other errors in the tree. All inserted elements are present, but CBTree::GetCopy cannot find some due to the incorrect ordering. This is not a unit test error, but a real error in the CBTree implementation. Modifying the test to only do the inserts first, and only start checking when the tree is finalized does not prevent the bug. So calling only the tree->Insert function from multiple threads is sufficient to reproduce the issue (only on ARM). Root cause: Some memory order restrictions are not strict enough. It does not cause a problem on x86 (There were no real changes for 11 years), but it causes problems on ARM. x86 guarantees a globally consistent order for stores (TSO). ARM, in contrast, allows stores to different memory locations to be observed differently from the program order. More info: https://www.sciencedirect.com/science/article/pii/S1383762124000390 Solution: The following 7 barriers need to be more strict: 1. When we set the Splitting/Inserting flag on a node, then it is not enough to flush all previous changes. It is also very important not to reorder any write before it. So instead of Release_Store (which is practically equivalent to std::memory_order_release), we need a 2 way barrier. 2. When we search in SeekToLeaf/GetCopy/ContainsKey/TraverseToLeaf/SeekNextLeaf, and we check if the version is unchanged, then a load_acquire (equivalent to std::memory_order_acquire) is not sufficient. We have to also make sure no read is reordered in the other direction. Putting the appropriate std::atomic_thread_fence(...) calls to this 7 places would resolve the issue. However, replacing the current function from atomicops-internals-arm64.h to the c++ standard functions will make the code more consistent. Reason for changing to std::atomics: atomicops-internals-arm64.h was picked from chromium source and the missing functions was reimplemented. The header is gone since, and even chromium uses a solution based on std::atomic (see base/atomicops_internals_portable.h in chromium source). I see no reason to update the header from chromium and implement the missing functions, just to have one more abstraction layer, that is basically just "function aliases" at this point. Reason for __builtin_assume_aligned and reinterpret_cast: Although the nodes are PACKED, We guarantee that they will be aligned to 8 bytes. They can have unaligned values put between them in the same arena, so packing does save memory. If you put std::atomic into a struct, g++ refuses to pack it, so we need a placeholder. Reason for renaming IsLocked to IsLockedUnsafe: In the original version, it was an unsafe memory access. Currently, it is only in places where we hold the lock (which means we have atomic access to the version before and after in the same thread). Also, it is only used in DCHECK() macros. However, I found it disturbing to have an unsafe function similarly named as all the other proper atomic functions around it. I also fixed TestRacyConcurrentInsert, so it won't freez after failing. Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 --- M src/kudu/tablet/cbtree-test.cc M src/kudu/tablet/concurrent_btree.h 2 files changed, 100 insertions(+), 74 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/27/21127/7 -- To view, visit http://gerrit.cloudera.org:8080/21127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 Gerrit-Change-Number: 21127 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anonymous Coward (763) Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Zoltan Chovan Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] [CMakeLists] Make kudu test main static
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/21282 ) Change subject: [CMakeLists] Make kudu_test_main static .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/21282/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21282/1//COMMIT_MSG@14 PS1, Line 14: This happens in ubuntu 22.04 test runs > Thank you for the update. It was reproducible on a public aws arm container in August, so it can't be caused by any recent commits. Steps were if I remember correctly: + Start a new machine. + clone kudu. + Install python3 by hand and do a python->python3 symlink. + Run the 3 bootstrap script from the docker library (commenting out the python install) + ./thirdparty/build-if-necessary.sh + mkdir build/debug && cd build/debug + ../thirdparty/installed/common/bin/cmake -GNinja -DKUDU_LINK=dynamic -DCMAKE_BUILD_TYPE=Debug ../.. No other steps were needed. Back in august I even started a new instance, just to check if it is wrong an a clean environment. (I left the issue in august, when Cloudera provided an ubuntu20 arm image. I was testing ubuntu22 on aws using personal funds. (just wanted to take a look how bad it is)). The same steps result in the problematic behaviour if I start a clean ubuntu22.04 docker image on my desktop machine. Also impala found the issue much earlier (2022 october), and it seem they could reproduce it on ubuntu18 and 20: https://github.com/apache/impala/commit/97480c3e3259777c27e37b830fbf14e27b4f4d89 I don't understand all the factors that decide the "library search order", but after some rebuild, it started to put libkudu_util before glibc and the issue disappeared, and even clearing the ld cache did not help. Only starting a new aws instance / docker image helped. -- To view, visit http://gerrit.cloudera.org:8080/21282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0dfeb1fa04ed91e95fd1f8d789f020dd44289fea Gerrit-Change-Number: 21282 Gerrit-PatchSet: 1 Gerrit-Owner: Ádám Bakai Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Zoltan Martonka Gerrit-Reviewer: Ádám Bakai Gerrit-Comment-Date: Fri, 19 Apr 2024 07:28:55 + Gerrit-HasComments: Yes
[kudu-CR] [CMakeLists] Make kudu test main static
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/21282 ) Change subject: [CMakeLists] Make kudu_test_main static .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/21282/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21282/1//COMMIT_MSG@10 PS1, Line 10: loading chain Please use "library search order" not "loading chain". That's the correct terminology. You could also add a few sentences about what the original code does (wraps dlopen and dlclose to prevent a potential deadlock during unwind stack resolve???) Also mention that this is just a best effort fix for an already known issue, and we don't really care because release is statically linked (unwind_safeness.cc line 117: "given that dynamic linkage isn't used in production anyway") http://gerrit.cloudera.org:8080/#/c/21282/1/src/kudu/util/CMakeLists.txt File src/kudu/util/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/21282/1/src/kudu/util/CMakeLists.txt@469 PS1, Line 469: add_library(kudu_test_main STATIC The static keyword alone should solve the issue, or am I missing something? no need to reorder the libs below. Wont static linking tests kill the purpose of "-DKUDU_LINK=dynamic" ? -- To view, visit http://gerrit.cloudera.org:8080/21282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0dfeb1fa04ed91e95fd1f8d789f020dd44289fea Gerrit-Change-Number: 21282 Gerrit-PatchSet: 1 Gerrit-Owner: Ádám Bakai Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Thu, 11 Apr 2024 07:27:19 + Gerrit-HasComments: Yes
[kudu-CR] [CMakeLists] Make kudu test main static
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/21282 ) Change subject: [CMakeLists] Make kudu_test_main static .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/21282/1/src/kudu/util/CMakeLists.txt File src/kudu/util/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/21282/1/src/kudu/util/CMakeLists.txt@469 PS1, Line 469: add_library(kudu_test_main STATIC > The static keyword alone should solve the issue, or am I missing something? The real question is: How much will this affect TSAN/ASAN build time/disk space? -- To view, visit http://gerrit.cloudera.org:8080/21282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0dfeb1fa04ed91e95fd1f8d789f020dd44289fea Gerrit-Change-Number: 21282 Gerrit-PatchSet: 1 Gerrit-Owner: Ádám Bakai Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Thu, 11 Apr 2024 07:30:01 + Gerrit-HasComments: Yes
[kudu-CR] [tablet] one less memory allocation in CBTree::count()
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/21276 ) Change subject: [tablet] one less memory allocation in CBTree::count() .. Patch Set 2: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/21276 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I01a68e0427b399db92b33c910185654d195150a5 Gerrit-Change-Number: 21276 Gerrit-PatchSet: 2 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Yifan Zhang Gerrit-Reviewer: Yingchun Lai Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Thu, 11 Apr 2024 06:52:52 + Gerrit-HasComments: No
[kudu-CR] [ARM] Concurrent binary tree memory barriers fixed.
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/21127 ) Change subject: [ARM] Concurrent binary tree memory barriers fixed. .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/21127/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21127/6//COMMIT_MSG@46 PS6, Line 46: would resolve the issue > Have you actually tried that or this is just a hypothesis? I actually tried it out. Wanted to ship it originally. It is version 2 of this review (compare version 2 with base). -- To view, visit http://gerrit.cloudera.org:8080/21127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 Gerrit-Change-Number: 21127 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Anonymous Coward (763) Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Zoltan Chovan Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Wed, 10 Apr 2024 12:37:27 + Gerrit-HasComments: Yes
[kudu-CR] [ARM] Concurrent binary tree memory barriers fixed.
Hello Tidy Bot, Zoltan Chovan, Ashwani Raina, Kudu Jenkins, Anonymous Coward (763), I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21127 to look at the new patch set (#6). Change subject: [ARM] Concurrent binary tree memory barriers fixed. .. [ARM] Concurrent binary tree memory barriers fixed. TestCBTree.TestRacyConcurrentInsert sometimes fails on ARM. The concurrent inserts produce a tree with some incorrectly ordered nodes. Apart from incorrect order, there are no other errors in the tree. All inserted elements are present, but CBTree::GetCopy cannot find some due to the incorrect ordering. This is not a unit test error, but a real error in the CBTree implementation. Modifying the test to only do the inserts first, and only start checking when the tree is finalized does not prevent the bug. So calling only the tree->Insert function from multiple threads is sufficient to reproduce the issue (only on ARM). Root cause: Some memory order restrictions are not strict enough. It does not cause a problem on x86 (There were no real changes for 11 years), but it causes problems on ARM. x86 guarantees a globally consistent order for stores (TSO). ARM, in contrast, allows stores to different memory locations to be observed differently from the program order. More info: https://www.sciencedirect.com/science/article/pii/S1383762124000390 Solution: The following 6 barriers need to be more strict: 1. When we set the Splitting/Inserting flag on a node, then it is not enough to flush all previous changes. It is also very important not to reorder any write before it. So instead of Release_Store (which is practically equivalent to std::memory_order_release), we need a 2 way barrier. 2. When we search in SeekToLeaf/GetCopy/ContainsKey/TraverseToLeaf, and we check if the version is unchanged, then a load_acquire (equivalent to std::memory_order_acquire) is not sufficient. We have to also make sure no read is reordered in the other direction. Putting the appropriate std::atomic_thread_fence(...) calls to this 6 places would resolve the issue. However, replacing the current function from atomicops-internals-arm64.h to the c++ standard functions will make the code more consistent. Reason for changing to std::atomics: atomicops-internals-arm64.h was picked from chromium source and the missing functions was reimplemented. The header is gone since, and even chromium uses a solution based on std::atomic (see base/atomicops_internals_portable.h in chromium source). I see no reason to update the header from chromium and implement the missing functions, just to have one more abstraction layer, that is basically just "function aliases" at this point. Reason for removing PACKED: Although the padding of the Nodes was turn of, the following happened (both on ARM and x86): + NodeBase was 16 byte, so there was no padding after the base class. + LeafNode and InternalNode was 249, 244 bytes (used in memrowset and deltamemstore) + When we reserved a new node, we used arena_->AllocateBytesAligned(...), with 8 bytes alignment. So the end result was exactly the same, as if there were no "PACKED" attribute. Also compiler kept warning about " might be unaligned", but it were never unaligned. By removing the packed PACKED we change nothing in the memory layout. (neither on ARM or x86). It just gets rid of the old compiler warning. (std::atomic disable packing anyway). Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 --- M src/kudu/tablet/concurrent_btree.h 1 file changed, 76 insertions(+), 57 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/27/21127/6 -- To view, visit http://gerrit.cloudera.org:8080/21127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 Gerrit-Change-Number: 21127 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Anonymous Coward (763) Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Zoltan Chovan Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] [debug] Fallback to load from libc
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/21204 ) Change subject: [debug] Fallback to load from libc .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/21204/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21204/2//COMMIT_MSG@7 PS2, Line 7: [debug] Fallback to load from libc I think you should use some other tag. This logic avoid potential for deadlocks in stack trace collection. Production systems included. It can change a kudu server from crashing to deadlock (which is not good). When I see [debug], I automatically assume it touches something that will never happen on production until some1 starts to troubleshoot. -- To view, visit http://gerrit.cloudera.org:8080/21204 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I83ccb92a32bf431fc5bdae27264969e0e131a475 Gerrit-Change-Number: 21204 Gerrit-PatchSet: 2 Gerrit-Owner: Ádám Bakai Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Tue, 26 Mar 2024 13:53:31 + Gerrit-HasComments: Yes
[kudu-CR] [debug] Fallback to load from libc
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/21204 ) Change subject: [debug] Fallback to load from libc .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/21204/2/src/kudu/util/debug/unwind_safeness.cc File src/kudu/util/debug/unwind_safeness.cc: http://gerrit.cloudera.org:8080/#/c/21204/2/src/kudu/util/debug/unwind_safeness.cc@80 PS2, Line 80: if (error) { If libc is loaded after kudu_util, then the libc version of dlopen overrides this version. So you bring back the error that this code originally fixed. The state after symbol loading will be the same, as if you would just delete the whole thing from unwind_safenness.cc. -- To view, visit http://gerrit.cloudera.org:8080/21204 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I83ccb92a32bf431fc5bdae27264969e0e131a475 Gerrit-Change-Number: 21204 Gerrit-PatchSet: 2 Gerrit-Owner: Ádám Bakai Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Tue, 26 Mar 2024 13:45:44 + Gerrit-HasComments: Yes
[kudu-CR] [ARM] Concurrent binary tree memory barriers fixed.
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/21127 ) Change subject: [ARM] Concurrent binary tree memory barriers fixed. .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/21127/5/src/kudu/tablet/concurrent_btree.h File src/kudu/tablet/concurrent_btree.h: http://gerrit.cloudera.org:8080/#/c/21127/5/src/kudu/tablet/concurrent_btree.h@382 PS5, Line 382: AtomicVersionValue StableVersionAcqRel() { I can rename it (and the other 3) to AcqRelStableVersion. I find this word order better, but it is just my subjective opinion. -- To view, visit http://gerrit.cloudera.org:8080/21127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 Gerrit-Change-Number: 21127 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Anonymous Coward (763) Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Zoltan Chovan Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Tue, 26 Mar 2024 12:16:44 + Gerrit-HasComments: Yes
[kudu-CR] [ARM] Concurrent binary tree memory barriers fixed.
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/21127 ) Change subject: [ARM] Concurrent binary tree memory barriers fixed. .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/21127/5/src/kudu/tablet/concurrent_btree.h File src/kudu/tablet/concurrent_btree.h: http://gerrit.cloudera.org:8080/#/c/21127/5/src/kudu/tablet/concurrent_btree.h@125 PS5, Line 125: typedef std::uint64_t AtomicVersionValue; // Local value we loaded/will store This is the way std::atomic works (load returns T, store expects T, atomic is not copyable etc.). I think it is much cleaner than the old solution (not that we have a choice if we use std::atomic). -- To view, visit http://gerrit.cloudera.org:8080/21127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 Gerrit-Change-Number: 21127 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Anonymous Coward (763) Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Zoltan Chovan Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Tue, 26 Mar 2024 12:10:54 + Gerrit-HasComments: Yes
[kudu-CR] [ARM] Concurrent binary tree memory barriers fixed.
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/21127 ) Change subject: [ARM] Concurrent binary tree memory barriers fixed. .. Patch Set 5: I have nothing to do with the Tidy bot warning about the TODO-s -- To view, visit http://gerrit.cloudera.org:8080/21127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 Gerrit-Change-Number: 21127 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Anonymous Coward (763) Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Zoltan Chovan Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Tue, 26 Mar 2024 12:06:13 + Gerrit-HasComments: No
[kudu-CR] [ARM] Concurrent binary tree memory barriers fixed.
Hello Zoltan Chovan, Ashwani Raina, Kudu Jenkins, Anonymous Coward (763), I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21127 to look at the new patch set (#5). Change subject: [ARM] Concurrent binary tree memory barriers fixed. .. [ARM] Concurrent binary tree memory barriers fixed. TestCBTree.TestRacyConcurrentInsert sometimes fails on ARM. The concurrent inserts produce a tree with some incorrectly ordered nodes. Apart from incorrect order, there are no other errors in the tree. All inserted elements are present, but CBTree::GetCopy cannot find some due to the incorrect ordering. This is not a unit test error, but a real error in the CBTree implementation. Modifying the test to only do the inserts first, and only start checking when the tree is finalized does not prevent the bug. So calling only the tree->Insert function from multiple threads is sufficient to reproduce the issue (only on ARM). Root cause: Some memory order restrictions are not strict enough. It does not cause a problem on x86 (There were no real changes for 11 years), but it causes problems on ARM. x86 guarantees a globally consistent order for stores (TSO). ARM, in contrast, allows stores to different memory locations to be observed differently from the program order. More info: https://www.sciencedirect.com/science/article/pii/S1383762124000390 Solution: The following 6 barriers need to be more strict: 1. When we set the Splitting/Inserting flag on a node, then it is not enough to flush all previous changes. It is also very important not to reorder any write before it. So instead of Release_Store (which is practically equivalent to std::memory_order_release), we need a 2 way barrier. 2. When we search in SeekToLeaf/GetCopy/ContainsKey/TraverseToLeaf, and we check if the version is unchanged, then a load_acquire (equivalent to std::memory_order_acquire) is not sufficient. We have to also make sure no read is reordered in the other direction. Putting the appropriate std::atomic_thread_fence(...) calls to this 6 places would resolve the issue. However, replacing the current function from atomicops-internals-arm64.h to the c++ standard functions will make the code more consistent. Reason for changing to std::atomics: atomicops-internals-arm64.h was picked from chromium source and the missing functions was reimplemented. The header is gone since, and even chromium uses a solution based on std::atomic (see base/atomicops_internals_portable.h in chromium source). I see no reason to update the header from chromium and implement the missing functions, just to have one more abstraction layer, that is basically just "function aliases" at this point. Reason for removing PACKED: Although the padding of the Nodes was turn of, the following happened (both on ARM and x86): + NodeBase was 16 byte, so there was no padding after the base class. + LeafNode and InternalNode was 249, 244 bytes (used in memrowset and deltamemstore) + When we reserved a new node, we used arena_->AllocateBytesAligned(...), with 8 bytes alignment. So the end result was exactly the same, as if there were no "PACKED" attribute. Also compiler kept warning about " might be unaligned", but it were never unaligned. By removing the packed PACKED we change nothing in the memory layout. (neither on ARM or x86). It just gets rid of the old compiler warning. (std::atomic disable packing anyway). Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 --- M src/kudu/tablet/concurrent_btree.h 1 file changed, 76 insertions(+), 60 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/27/21127/5 -- To view, visit http://gerrit.cloudera.org:8080/21127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 Gerrit-Change-Number: 21127 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Anonymous Coward (763) Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Zoltan Chovan Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] WIP [ARM] Concurrent binary tree memory fences fixed.
Hello Zoltan Chovan, Ashwani Raina, Kudu Jenkins, Anonymous Coward (763), I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21127 to look at the new patch set (#4). Change subject: WIP [ARM] Concurrent binary tree memory fences fixed. .. WIP [ARM] Concurrent binary tree memory fences fixed. TestCBTree.TestRacyConcurrentInsert sometimes fails on ARM. The concurrent inserts produce a tree with some incorrectly ordered nodes. Apart from incorrect order, there are no other errors in the tree. All inserted elements are present, but CBTree::GetCopy cannot find them due to the incorrect ordering. This is not a unit test error, but a real error in the CBTree implementation. Modifying the test to only do the inserts first, and only start checking when the tree is finalized does not prevent the bug. So calling only the tree->Insert function from multiple threads is sufficient to reproduce the issue (only on ARM). WIP cause: This change fixes the problem, and can be the final solution. However, I would also consider changing the version field to std::atomic and keep using the same barriers as now (not the default memory_order_seq_cst). Either way, we need to agree on the proposed changes of the memory barriers then I can also make a version with using std::atomic. Root cause: Some memory order restrictions are not strict enough. It does not cause a problem on x86 (There were no real changes for 11 years), but it causes a problems on ARM. x86 guarantees a globally consistent order for stores (TSO). ARM, in contrast, allows stores to different memory locations to be observed differently from the program order. More info: https://www.sciencedirect.com/science/article/pii/S1383762124000390 The following 6 barriers need to be more strict: 1. When we set the Splitting / Inserting flag on a node, then it is not enough to flush all previous changes. It is also very important not to reorder any write before it. So instead of Release_Store (which is practically equivalent to std::memory_order_release), we need a 2 way barrier. 2. When we search in SeekToLeaf/GetCopy/ContainsKey/TraverseToLeaf, and we check if the version is unchanged, then a load_acquire (equivalent to std::memory_order_acquire) is not sufficient. We have to also make sure no read is reordered in the other direction. Currently adding only the 2 barriers under (1.) is sufficient to stabilize the test. However the constraints imposed by load_acquire under (2.) are also insufficient in theory. Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 --- M src/kudu/tablet/concurrent_btree.h 1 file changed, 9 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/27/21127/4 -- To view, visit http://gerrit.cloudera.org:8080/21127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 Gerrit-Change-Number: 21127 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Anonymous Coward (763) Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Zoltan Chovan Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] WIP [ARM] Concurrent binary tree memory fences fixed.
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/21127 ) Change subject: WIP [ARM] Concurrent binary tree memory fences fixed. .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/21127/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21127/3//COMMIT_MSG@43 PS3, Line 43: the fences > did you mean "absence of fences"? base::subtle::Release_Store(v, *v | BTREE_INSERTING_MASK); Is in practice equivalent to: Lazy_Store(v, *v | BTREE_INSERTING_MASK); std::atomic_thread_fence(std::memory_order_release) Or with std::atomic v, it would be equivalent to: v->fetch_or(BTREE_SPLITTING_MASK, std:: memory_order_release); I should have written "constraints are not sufficient". I will change it. -- To view, visit http://gerrit.cloudera.org:8080/21127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 Gerrit-Change-Number: 21127 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Anonymous Coward (763) Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Zoltan Chovan Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Mon, 11 Mar 2024 14:36:33 + Gerrit-HasComments: Yes
[kudu-CR] WIP [ARM] Concurrent binary tree memory fences fixed.
Hello Kudu Jenkins, Anonymous Coward (763), I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21127 to look at the new patch set (#3). Change subject: WIP [ARM] Concurrent binary tree memory fences fixed. .. WIP [ARM] Concurrent binary tree memory fences fixed. TestCBTree.TestRacyConcurrentInsert sometime fails on ARM. The concurrent inserts produce a tree with some incorrectly ordered nodes. Apart from incorrect order, there are no other errors in the tree. All inserted elements are present and all edges are valid. This is not an unit test error, but a real error in the CBTree implementation. Modifying the test to only do the inserts first, and only start checking when the tree is finalized does not prevent the bug. So calling only the tree->Insert function from multiple threads is sufficient to reproduce the issue. WIP cause: This change fixes the problem, and can be the final solution. However I would also consider changing it to the std::atomic library (Also using the correct memory orders like now, not the default memory_order_seq_cst everywhere). Either way, we need to agree on the proposed changes of the the memory barriers then I can also make a version with using std::atomic. Root cause: Some memory order restrictions are not strict enough. It does not cause a problem on x86 (There were no real changes for 11 years), but it causes a problem on ARM. The following 6. barrier needs to be more strict: 1. When we set the Splitting / Inserting flag on a node, then it is not enough to flush all or changes. It is also very important not to reorder any write before it. So instead of Release_Store we need a 2 way fence. 2. When we search in SeekToLeaf/GetCopy/ContainsKey/TraverseToLeaf, and we check if the version is unchanged, then a load_acquire is not sufficient. We have to also make sure no read is reordered in the other direction either. Currently adding only the 2 fences under (1.) is sufficient to stabilize the test. However the fences under (2.) are also wrong in theory. Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 --- M src/kudu/tablet/concurrent_btree.h 1 file changed, 9 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/27/21127/3 -- To view, visit http://gerrit.cloudera.org:8080/21127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 Gerrit-Change-Number: 21127 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Anonymous Coward (763) Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] WIP [ARM] Concurrent binary tree memory fences fixed.
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21127 to look at the new patch set (#2). Change subject: WIP [ARM] Concurrent binary tree memory fences fixed. .. WIP [ARM] Concurrent binary tree memory fences fixed. TestCBTree.TestRacyConcurrentInsert sometime fails on ARM. The concurrent inserts produce a tree with some misordered nodes. Apart from misordering, there are no other errors in the tree. All inserted elements are present, all edges are valid. This is not an unit test error, but a real error in the CBTree implementation. Modifying the test to only do the inserts first, and only start checking when the tree is finalized does not prevent the bug. So calling only the tree->Insert function from multiple threads is sufficient to reproduce the issue. WIP cause: This change fixes the problem, and can be the final solution. However I would also consider changing it to the std::atomic library (Also using the correct memory orders like now, not the default memory_order_seq_cst everyhere). Either way, we need to agree on the proposed changes of the the memory barriers then I can also make a version with using std::atomic. Root cause: Some memory order restrictions are not strigt enough. It does not cause a problem on x86 (There were no real changes for 11 years), but it causes a problem on ARM. The following 6. barriest needs to be more strict: 1. When we set the Splittingi/Inserting flag on a node, then it is not enough to flush all or changes. It is also very important not to reorder any write before it. So instead of Release_Store we need a 2 way fence. 2. When we search in SeekToLeaf/GetCopy/ContainsKey/TraverseToLeaf, and we check if the version is unchanged, then a load_acquire is not sufficient. We have to also make sure no read is reordered in the other direction either. Currently adding only the 2 fences under (1.) is sufficient to stabilize the test. However the fences under (2.) are also wrong in theory. Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 --- M src/kudu/tablet/concurrent_btree.h 1 file changed, 9 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/27/21127/2 -- To view, visit http://gerrit.cloudera.org:8080/21127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 Gerrit-Change-Number: 21127 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] iWIP [ARM] Concurrent binary tree memory fences fixed.
Zoltan Martonka has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21127 Change subject: iWIP [ARM] Concurrent binary tree memory fences fixed. .. iWIP [ARM] Concurrent binary tree memory fences fixed. TestCBTree.TestRacyConcurrentInsert sometime fails on ARM. The concurrent inserts produce a tree with some misordered nodes. Apart from misordering, there are no other errors in the tree. All inserted elements are present, all edges are valid. This is not an unit test error, but a real error in the CBTree implementation. Modifying the test to only do the inserts first, and only start checking when the tree is finalized does not prevent the bug. So calling only the tree->Insert function from multiple threads is sufficient to reproduce the issue. WIP cause: This change fixes the problem, and can be the final solution. However I would also consider changing it to the std::atomic library (Also using the correct memory orders like now, not the default memory_order_seq_cst everyhere). Either way, we need to agree on the proposed changes of the the memory barriers then I can also make a version with using std::atomic. Root cause: Some memory order restrictions are not strigt enough. It does not cause a problem on x86 (There were no real changes for 11 years), but it causes a problem on ARM. The following 6. barriest needs to be more strict: 1. When we set the Splittingi/Inserting flag on a node, then it is not enough to flush all or changes. It is also very important not to reorder any write before it. So instead of Release_Store we need a 2 way fence. 2. When we search in SeekToLeaf/GetCopy/ContainsKey/TraverseToLeaf, and we check if the version is unchanged, then a load_acquire is not sufficient. We have to also make sure no read is reordered in the other direction either. Currently adding only the 2 fences under (1.) is sufficient to stabilize the test. However the fences under (2.) are also wrong in theory. Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 --- M src/kudu/tablet/concurrent_btree.h 1 file changed, 9 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/27/21127/1 -- To view, visit http://gerrit.cloudera.org:8080/21127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382 Gerrit-Change-Number: 21127 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Martonka
[kudu-CR] [rpc] fix typo in branch prediction
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/21044 ) Change subject: [rpc] fix typo in branch prediction .. Patch Set 1: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/21044 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5af612277dc5b03bdee597212885d47c823a6025 Gerrit-Change-Number: 21044 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Wed, 21 Feb 2024 07:36:47 + Gerrit-HasComments: No
[kudu-CR] KUDU-3527 Fix block manager test when using 64k container block alignment
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/20725 ) Change subject: KUDU-3527 Fix block manager test when using 64k container block alignment .. Patch Set 14: (9 comments) http://gerrit.cloudera.org:8080/#/c/20725/14//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20725/14//COMMIT_MSG@21 PS14, Line 21: RHEL 8.8 ARM 64k core > Did you mean "RHEL 8.8 ARM with 64K page size"? Done http://gerrit.cloudera.org:8080/#/c/20725/14//COMMIT_MSG@36 PS14, Line 36: 64k core > nit: 64K page size ? Done http://gerrit.cloudera.org:8080/#/c/20725/14//COMMIT_MSG@51 PS14, Line 51: Added a jira to find a solution > nit: Filed a JIRA issue to track the issue Done http://gerrit.cloudera.org:8080/#/c/20725/14/src/kudu/fs/block_manager-test.cc File src/kudu/fs/block_manager-test.cc: http://gerrit.cloudera.org:8080/#/c/20725/14/src/kudu/fs/block_manager-test.cc@161 PS14, Line 161: CreateABlock > style nit: why not just CreateBlock() ? If you want to emphasize it's a si Thank you. I just used the original lambda name (create_a_block), did not gave it much thought. I will rename it to CreateBlock() http://gerrit.cloudera.org:8080/#/c/20725/14/src/kudu/fs/block_manager-test.cc@942 PS14, Line 942: is > are Done http://gerrit.cloudera.org:8080/#/c/20725/14/src/kudu/fs/block_manager-test.cc@949 PS14, Line 949: resistent > resistant Done http://gerrit.cloudera.org:8080/#/c/20725/14/src/kudu/fs/block_manager-test.cc@981 PS14, Line 981: // The chance that we delete the : // .metadata but fail to delete the .data file is (1-x) * x > I guess that's a bit simplistic given that the deletion of these two files Look at the last 2 lines of LogBlockContainerNativeMeta::~LogBlockContainerNativeMeta(). There is no error handling there. You can verify that the delete indeed happens when this smart pointers are reset: + Put a breakpoint to line 995 (CommitDeletedBlocks) in block_manager-test.cc and run the new test (use --gtest_also_run_disabled_tests) + When the breakpoint is hit, put a breakpoint into "Status DeleteFile(...)" in env_posix.cc. Resume the debugging. You will see a .data then a .metadata delete. you can check the debug callstack to see they are indeed called from ~LogBlockContainerNativeMeta() So the deletion error chances are independent as long as it is our artificial error and not a real disk error. http://gerrit.cloudera.org:8080/#/c/20725/14/src/kudu/fs/block_manager-test.cc@1022 PS14, Line 1022: , > nit: remove the comma? Done http://gerrit.cloudera.org:8080/#/c/20725/14/src/kudu/fs/block_manager-test.cc@1028 PS14, Line 1028: " > nit: remove the stray quote? Done -- To view, visit http://gerrit.cloudera.org:8080/20725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 Gerrit-Change-Number: 20725 Gerrit-PatchSet: 14 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Mon, 29 Jan 2024 16:26:25 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-3527 Fix block manager test when using 64k container block alignment
Hello Mahesh Reddy, Alexey Serbin, Ashwani Raina, Kudu Jenkins, Abhishek Chennaka, Wang Xixu, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20725 to look at the new patch set (#15). Change subject: KUDU-3527 Fix block manager test when using 64k container block alignment .. KUDU-3527 Fix block manager test when using 64k container block alignment BlockManagerTest.TestMetadataOkayDespiteFailure might fail on system where we use 64k alignment for data blocks. Root cause: Currently tablets fail to load if .metadata is missing but there is still a non-empty ".data" file. If FLAGS_env_inject_eio is set to greater than zero, then there is a chance that, when we delete the container file, we only delete the ".meta", but leave the ".data" file. So deleting containers with injected io errors is expected to sometime prevent the block manager from restarting properly. However container deletion almost never occurred in this test until we run it on the new RHEL 8.8 ARM with 64K page size. Why is it stable on x86_64: On x86_64 we usually use 4k block alignment. We write 6080 byte data into a block, which is padded to 8k. So in the current test settings we have 32 blocks in a container when it becomes full (FLAGS_log_container_max_size = 256k). Later we delete exactly half of the 500 blocks we wrote. The chance of deleting all 32 blocks in a container is very small, and even if it happens, it still has around 0.09 chance to become corrupted. It is a bit flaky, but it would fail less than once in a billion run. If you dramatically decrease the FLAGS_log_container_max_size flag, the test starts to occasionally fail on a traditional x86_64 machine too. Why is it unstable with 64k alignment: With 64k alignment (currently used on ARM RHEL 8.8 with 64k page size), there are 4 blocks in a full container file. We write 500 blocks, so we expect to have nearly 125 full files. If we delete exactly half of the blocks, we will make many (full) container file empty. Some of them might fail to be deleted properly leaving a lonely non-empty .data file without .metadata. On my RHEL machine the test fails 97-98% of the time for this exact reason. Solution: The test TestMetadataOkayDespiteFailure was supposed to test reloading the block manager with containers having deleted blocks, with some previous failed delete. It (probably) never tested the case when container deletion occurs. + Disabled container deletion, so the test scope remains the same as it was with smaller block alignments. + Add a new (currently disabled) test, to see how block manager handles the above described situation. Filed a JIRA issue to track the issue: KUDU-3528. The original issue is not ARM specific, and far from trivial to solve, and was always in the system. Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 --- M src/kudu/fs/block_manager-test.cc 1 file changed, 97 insertions(+), 16 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/25/20725/15 -- To view, visit http://gerrit.cloudera.org:8080/20725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 Gerrit-Change-Number: 20725 Gerrit-PatchSet: 15 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] KUDU-3527 Fix block manager test when using 64k container block alignment
Hello Mahesh Reddy, Alexey Serbin, Ashwani Raina, Kudu Jenkins, Abhishek Chennaka, Wang Xixu, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20725 to look at the new patch set (#14). Change subject: KUDU-3527 Fix block manager test when using 64k container block alignment .. KUDU-3527 Fix block manager test when using 64k container block alignment BlockManagerTest.TestMetadataOkayDespiteFailure might fail on system where we use 64k alignment for data blocks. Root cause: Currently tablets fail to load if .metadata is missing but there is still a non-empty ".data" file. If FLAGS_env_inject_eio is set to greater than zero, then there is a chance that, when we delete the container file, we only delete the ".meta", but leave the ".data" file. So deleting containers with injected io errors is expected to sometime prevent the block manager from restarting properly. However container deletion almost never occurred in this test until we run it on the new RHEL 8.8 ARM 64k core. Why is it stable on x86_64: On x86_64 we usually use 4k block alignment. We write 6080 byte data into a block, which is padded to 8k. So in the current test settings we have 32 blocks in a container when it becomes full (FLAGS_log_container_max_size = 256k). Later we delete exactly half of the 500 blocks we wrote. The chance of deleting all 32 blocks in a container is very small, and even if it happens, it still has around 0.09 chance to become corrupted. It is a bit flaky, but it would fail less than once in a billion run. If you dramatically decrease the FLAGS_log_container_max_size flag, the test starts to occasionally fail on a traditional x86_64 machine too. Why is it unstable with 64k alignment: With 64k alignment (currently used on ARM RHEL 8.8 64k core), there are 4 blocks in a full container file. We write 500 blocks, so we expect to have nearly 125 full files. If we delete exactly half of the blocks, we will make many (full) container file empty. Some of them might fail to be deleted properly leaving a lonely non-empty .data file without .metadata. On my RHEL machine the test fails 97-98% of the time for this exact reason. Solution: The test TestMetadataOkayDespiteFailure was supposed to test reloading the block manager with containers having deleted blocks, with some previous failed delete. It (probably) never tested the case when container deletion occurs. + Disabled container deletion, so the test scope remains the same as it was with smaller block alignments. + Add a new (currently disabled) test, to see how block manager handles the above described situation. Added a jira to find a solution (KUDU-3528). The original issue is not ARM specific, and far from trivial to solve, and was always in the system. Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 --- M src/kudu/fs/block_manager-test.cc 1 file changed, 95 insertions(+), 16 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/25/20725/14 -- To view, visit http://gerrit.cloudera.org:8080/20725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 Gerrit-Change-Number: 20725 Gerrit-PatchSet: 14 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] KUDU-3527 Fix block manager test when using 64k container block alignment
Hello Mahesh Reddy, Alexey Serbin, Ashwani Raina, Kudu Jenkins, Abhishek Chennaka, Wang Xixu, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20725 to look at the new patch set (#13). Change subject: KUDU-3527 Fix block manager test when using 64k container block alignment .. KUDU-3527 Fix block manager test when using 64k container block alignment BlockManagerTest.TestMetadataOkayDespiteFailure might fail on system where we use 64k alignment for data blocks. Root cause: Currently tablets fail to load if .metadata is missing but there is still a non-empty ".data" file. If FLAGS_env_inject_eio is set to greater than zero, then there is a chance that, when we delete the container file, we only delete the ".meta", but leave the ".data" file. So deleting containers with injected io errors is expected to sometime prevent the block manager from restarting properly. However container deletion almost never occurred in this test until we run it on the new RHEL 8.8 ARM 64k core. Why is it stable on x86_64: On x86_64 we usually use 4k block alignment. We write 6080 byte data into a block, which is padded to 8k. So in the current test settings we have 32 blocks in a container when it becomes full (FLAGS_log_container_max_size = 256k). Later we delete exactly half of the 500 blocks we wrote. The chance of deleting all 32 blocks in a container is very small, and even if it happens, it still has around 0.09 chance to become corrupted. It is a bit flaky, but it would fail less than once in a billion run. If you dramatically decrease the FLAGS_log_container_max_size flag, the test starts to occasionally fail on a traditional x86_64 machine too. Why is it unstable with 64k alignment: With 64k alignment (currently used on ARM RHEL 8.8 64k core), there are 4 blocks in a full container file. We write 500 blocks, so we expect to have nearly 125 full files. If we delete exactly half of the blocks, we will make many (full) container file empty. Some of them might fail to be deleted properly leaving a lonely non-empty .data file without .metadata. On my RHEL machine the test fails 97-98% of the time for this exact reason. Solution: The test TestMetadataOkayDespiteFailure was supposed to test reloading the block manager with containers having deleted blocks, with some previous failed delete. It (probably) never tested the case when container deletion occurs. + Disabled container deletion, so the test scope remains the same as it was with smaller block alignments. + Add a new (currently disabled) test, to see how block manager handles the above described situation. Added a jira to find a solution (KUDU-3528). The original issue is not ARM specific, and far from trivial to solve, and was always in the system. Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 --- M src/kudu/fs/block_manager-test.cc 1 file changed, 95 insertions(+), 16 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/25/20725/13 -- To view, visit http://gerrit.cloudera.org:8080/20725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 Gerrit-Change-Number: 20725 Gerrit-PatchSet: 13 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] KUDU-3527 Fix block manager test when using 64k container block alignment
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/20725 ) Change subject: KUDU-3527 Fix block manager test when using 64k container block alignment .. Patch Set 9: (5 comments) http://gerrit.cloudera.org:8080/#/c/20725/9//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20725/9//COMMIT_MSG@53 PS9, Line 53: ARM > Not related to this change. Just for my understanding. On rhel system, where I am testing default is xfs. (ext4 on ubuntu) The problem only occurs when using LogBlockManagerNativeMeta, both the original unmodified test and the newly added passes when using FileBlockManager. If you compile and run Kudu on the arm cloudcat images , it will use LogBlockManagerNativeMeta without problems (probably the same in public aws). http://gerrit.cloudera.org:8080/#/c/20725/9/src/kudu/fs/block_manager-test.cc File src/kudu/fs/block_manager-test.cc: http://gerrit.cloudera.org:8080/#/c/20725/9/src/kudu/fs/block_manager-test.cc@933 PS9, Line 933: Bolck > nit: Block thx. http://gerrit.cloudera.org:8080/#/c/20725/9/src/kudu/fs/block_manager-test.cc@935 PS9, Line 935: because it can't handle a .data file > Maybe we can write: thx http://gerrit.cloudera.org:8080/#/c/20725/9/src/kudu/fs/block_manager-test.cc@939 PS9, Line 939: smarted delete > Not sure what "smarted delete" means. If you are pointing to approach of "f I was pointing to that, but I will just write your second suggestion. http://gerrit.cloudera.org:8080/#/c/20725/9/src/kudu/fs/block_manager-test.cc@984 PS9, Line 984: 0.9994174 > Add a one-liner to show how you arrived with this calculation. Haven't veri ok -- To view, visit http://gerrit.cloudera.org:8080/20725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 Gerrit-Change-Number: 20725 Gerrit-PatchSet: 9 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Thu, 25 Jan 2024 17:40:57 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-3527 Fix block manager test when using 64k container block alignment
Hello Mahesh Reddy, Alexey Serbin, Ashwani Raina, Kudu Jenkins, Abhishek Chennaka, Wang Xixu, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20725 to look at the new patch set (#12). Change subject: KUDU-3527 Fix block manager test when using 64k container block alignment .. KUDU-3527 Fix block manager test when using 64k container block alignment BlockManagerTest.TestMetadataOkayDespiteFailure might fail on system where we use 64k alignment for data blocks. Root cause: Currently tablets fail to load if .metadata is missing but there is still a non-empty ".data" file. If FLAGS_env_inject_eio is set to greater than zero, then there is a chance that, when we delete the container file, we only delete the ".meta", but leave the ".data" file. So deleting containers with injected io errors is expected to sometime prevent the block manager from restarting properly. However container deletion almost never occurred in this test until we run it on the new RHEL 8.8 ARM 64k core. Why is it stable on x86_64: On x86_64 we usually use 4k block alignment. We write 6080 byte data into a block, which is padded to 8k. So in the current test settings we have 32 blocks in a container when it becomes full (FLAGS_log_container_max_size = 256k). Later we delete exactly half of the 500 blocks we wrote. The chance of deleting all 32 blocks in a container is very small, and even if it happens, it still has around 0.09 chance to become corrupted. It is a bit flaky, but it would fail less than once in a billion run. If you dramatically decrease the FLAGS_log_container_max_size flag, the test starts to occasionally fail on a traditional x86_64 machine too. Why is it unstable with 64k alignment: With 64k alignment (currently used on ARM RHEL 8.8 64k core), there are 4 blocks in a full container file. We write 500 blocks, so we expect to have nearly 125 full files. If we delete exactly half of the blocks, we will make many (full) container file empty. Some of them might fail to be deleted properly leaving a lonely non-empty .data file without .metadata. On my RHEL machine the test fails 97-98% of the time for this exact reason. Solution: The test TestMetadataOkayDespiteFailure was supposed to test reloading the block manager with containers having deleted blocks, with some previous failed delete. It (probably) never tested the case when container deletion occurs. + Disabled container deletion, so the test scope remains the same as it was with smaller block alignments. + Add a new (currently disabled) test, to see how block manager handles the above described situation. Added a jira to find a solution (KUDU-3528). The original issue is not ARM specific, and far from trivial to solve, and was always in the system. Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 --- M src/kudu/fs/block_manager-test.cc 1 file changed, 95 insertions(+), 16 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/25/20725/12 -- To view, visit http://gerrit.cloudera.org:8080/20725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 Gerrit-Change-Number: 20725 Gerrit-PatchSet: 12 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] KUDU-3527 Fix block manager test when using 64k container block alignment
Hello Mahesh Reddy, Alexey Serbin, Ashwani Raina, Kudu Jenkins, Abhishek Chennaka, Wang Xixu, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20725 to look at the new patch set (#11). Change subject: KUDU-3527 Fix block manager test when using 64k container block alignment .. KUDU-3527 Fix block manager test when using 64k container block alignment BlockManagerTest.TestMetadataOkayDespiteFailure might fail on system where we use 64k alignment for data blocks. Root cause: Currently tablets fail to load if .metadata is missing but there is still a non-empty ".data" file. If FLAGS_env_inject_eio is set to greater than zero, then there is a chance that, when we delete the container file, we only delete the ".meta", but leave the ".data" file. So deleting containers with injected io errors is expected to sometime prevent the block manager from restarting properly. However container deletion almost never occurred in this test until we run it on the new RHEL 8.8 ARM 64k core. Why is it stable on x86_64: On x86_64 we usually use 4k block alignment. We write 6080 byte data into a block, which is padded to 8k. So in the current test settings we have 32 blocks in a container when it becomes full (FLAGS_log_container_max_size = 256k). Later we delete exactly half of the 500 blocks we wrote. The chance of deleting all 32 blocks in a container is very small, and even if it happens, it still has around 0.09 chance to become corrupted. It is a bit flaky, but it would fail less than once in a billion run. If you dramatically decrease the FLAGS_log_container_max_size flag, the test starts to occasionally fail on a traditional x86_64 machine too. Why is it unstable with 64k alignment: With 64k alignment (currently used on ARM RHEL 8.8 64k core), there are 4 blocks in a full container file. We write 500 blocks, so we expect to have nearly 125 full files. If we delete exactly half of the blocks, we will make many (full) container file empty. Some of them might fail to be deleted properly leaving a lonely non-empty .data file without .metadata. On my RHEL machine the test fails 97-98% of the time for this exact reason. Solution: The test TestMetadataOkayDespiteFailure was supposed to test reloading the block manager with containers having deleted blocks, with some previous failed delete. It (probably) never tested the case when container deletion occurs. + Disabled container deletion, so the test scope remains the same as it was with smaller block alignments. + Add a new (currently disabled) test, to see how block manager handles the above described situation. Added a jira to find a solution (KUDU-3528). The original issue is not ARM specific, and far from trivial to solve, and was always in the system. Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 --- M src/kudu/fs/block_manager-test.cc 1 file changed, 90 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/25/20725/11 -- To view, visit http://gerrit.cloudera.org:8080/20725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 Gerrit-Change-Number: 20725 Gerrit-PatchSet: 11 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] KUDU-3527 Fix block manager test when using 64k container block alignment
Hello Mahesh Reddy, Alexey Serbin, Ashwani Raina, Kudu Jenkins, Abhishek Chennaka, Wang Xixu, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20725 to look at the new patch set (#10). Change subject: KUDU-3527 Fix block manager test when using 64k container block alignment .. KUDU-3527 Fix block manager test when using 64k container block alignment BlockManagerTest.TestMetadataOkayDespiteFailure might fail on system where we use 64k alignment for data blocks. Root cause: Currently tablets fail to load if .metadata is missing but there is still a non-empty ".data" file. If FLAGS_env_inject_eio is set to greater than zero, then there is a chance that, when we delete the container file, we only delete the ".meta", but leave the ".data" file. So deleting containers with injected io errors is expected to sometime prevent the block manager from restarting properly. However container deletion almost never occurred in this test until we run it on the new RHEL 8.8 ARM 64k core. Why is it stable on x86_64: On x86_64 we usually use 4k block alignment. We write 6080 byte data into a block, which is padded to 8k. So in the current test settings we have 32 blocks in a container when it becomes full (FLAGS_log_container_max_size = 256k). Later we delete exactly half of the 500 blocks we wrote. The chance of deleting all 32 blocks in a container is very small, and even if it happens, it still has around 0.09 chance to become corrupted. It is a bit flaky, but it would fail less than once in a billion run. If you dramatically decrease the FLAGS_log_container_max_size flag, the test starts to occasionally fail on a traditional x86_64 machine too. Why is it unstable with 64k alignment: With 64k alignment (currently used on ARM RHEL 8.8 64k core), there are 4 blocks in a full container file. We write 500 blocks, so we expect to have nearly 125 full files. If we delete exactly half of the blocks, we will make many (full) container file empty. Some of them might fail to be deleted properly leaving a lonely non-empty .data file without .metadata. On my RHEL machine the test fails 97-98% of the time for this exact reason. Solution: The test TestMetadataOkayDespiteFailure was supposed to test reloading the block manager with containers having deleted blocks, with some previous failed delete. It (probably) never tested the case when container deletion occurs. + Disabled container deletion, so the test scope remains the same as it was with smaller block alignments. + Add a new (currently disabled) test, to see how block manager handles the above described situation. Added a jira to find a solution (KUDU-3528). The original issue is not ARM specific, and far from trivial to solve, and was always in the system. Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 --- M src/kudu/fs/block_manager-test.cc 1 file changed, 91 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/25/20725/10 -- To view, visit http://gerrit.cloudera.org:8080/20725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 Gerrit-Change-Number: 20725 Gerrit-PatchSet: 10 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] KUDU-3527 Fix block manager test when using 64k container block alignment
Hello Mahesh Reddy, Alexey Serbin, Ashwani Raina, Kudu Jenkins, Abhishek Chennaka, Wang Xixu, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20725 to look at the new patch set (#9). Change subject: KUDU-3527 Fix block manager test when using 64k container block alignment .. KUDU-3527 Fix block manager test when using 64k container block alignment BlockManagerTest.TestMetadataOkayDespiteFailure might fail on system where we use 64k alignment for data blocks. Root cause: Currently tablets fail to load if .metadata is missing but there is still a non-empty ".data" file. If FLAGS_env_inject_eio is set to greater than zero, then there is a chance that, when we delete the container file, we only delete the ".meta", but leave the ".data" file. So deleting containers with injected io errors is expected to sometime prevent the block manager from restarting properly. However container deletion almost never occurred in this test until we run it on the new RHEL 8.8 ARM 64k core. Why is it stable on x86_64: On x86_64 we usually use 4k block alignment. We write 6080 byte data into a block, which is padded to 8k. So in the current test settings we have 32 blocks in a container when it becomes full (FLAGS_log_container_max_size = 256k). Later we delete exactly half of the 500 blocks we wrote. The chance of deleting all 32 blocks in a container is very small, and even if it happens, it still has around 0.09 chance to become corrupted. It is a bit flaky, but it would fail less than once in a billion run. If you dramatically decrease the FLAGS_log_container_max_size flag, the test starts to occasionally fail on a traditional x86_64 machine too. Why is it unstable with 64k alignment: With 64k alignment (currently used on ARM RHEL 8.8 64k core), there are 4 blocks in a full container file. We write 500 blocks, so we expect to have nearly 125 full files. If we delete exactly half of the blocks, we will make many (full) container file empty. Some of them might fail to be deleted properly leaving a lonely non-empty .data file without .metadata. On my RHEL machine the test fails 97-98% of the time for this exact reason. Solution: The test TestMetadataOkayDespiteFailure was supposed to test reloading the block manager with containers having deleted blocks, with some previous failed delete. It (probably) never tested the case when container deletion occurs. + Disabled container deletion, so the test scope remains the same as it was with smaller block alignments. + Add a new (currently disabled) test, to see how block manager handles the above described situation. Added a jira to find a solution (KUDU-3528). The original issue is not ARM specific, and far from trivial to solve, and was always in the system. Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 --- M src/kudu/fs/block_manager-test.cc 1 file changed, 91 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/25/20725/9 -- To view, visit http://gerrit.cloudera.org:8080/20725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 Gerrit-Change-Number: 20725 Gerrit-PatchSet: 9 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] KUDU-3527 Fix block manager test when using 64k container block alignment
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/20725 ) Change subject: KUDU-3527 Fix block manager test when using 64k container block alignment .. Patch Set 7: (8 comments) http://gerrit.cloudera.org:8080/#/c/20725/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20725/7//COMMIT_MSG@10 PS7, Line 10: fs_block_size > What's "fs_block_size"? Is it indeed the filesystem's block size (i.e. wha I will use "container block alignment". (it does not matter if it's st_blksize or f_bsize or some other value). http://gerrit.cloudera.org:8080/#/c/20725/7/src/kudu/fs/block_manager-test.cc File src/kudu/fs/block_manager-test.cc: http://gerrit.cloudera.org:8080/#/c/20725/7/src/kudu/fs/block_manager-test.cc@928 PS7, Line 928: const int kNumTries = 3; : const int kNumBlockTries = 500; : const int kNumAppends = 4; > nit: could these be constexpr? thanks. http://gerrit.cloudera.org:8080/#/c/20725/7/src/kudu/fs/block_manager-test.cc@939 PS7, Line 939: It is not the : // scope of this test to test this case. > This is looks quite confusing given what this test scenario is doing. Inst I changed/simplified the test. http://gerrit.cloudera.org:8080/#/c/20725/7/src/kudu/fs/block_manager-test.cc@942 PS7, Line 942: like the 64k option of rhel 8.8 > What are those exactly? I will change it to "If we use 64k block alignment (e.g RHEL ARM 8.8 64k core)" http://gerrit.cloudera.org:8080/#/c/20725/7/src/kudu/fs/block_manager-test.cc@951 PS7, Line 951: If you set a lower value, the test becomes flaky. With 0.2 it almost always fails. > This might become confusing later on when the root cause of KUDU-3528 is ad I try to give a better explanation. Also the root cause of the error is not that a container becomes full, but that we also delete all blocks. http://gerrit.cloudera.org:8080/#/c/20725/7/src/kudu/fs/block_manager-test.cc@982 PS7, Line 982: if (s.ok()) { : InsertOrDie(, id); : num_created++; : } > If this is just a way to overcome injected IO errors, why not to set FLAGS_ It was important in the original test. It is not important here (but also not a problem). I change the test to only have errors at deletion. Thank you for the suggestion. http://gerrit.cloudera.org:8080/#/c/20725/7/src/kudu/fs/block_manager-test.cc@999 PS7, Line 999: if (it != ids.end()) { : it++; : } > nit for here and elsewhere: it would be great if you preserved more of the Nothing. The original test tested what happens of we delete every 2nd block. So I copied it. It is better to delete all blocks. The problematic behaviour will occur more often. I change it. http://gerrit.cloudera.org:8080/#/c/20725/7/src/kudu/fs/block_manager-test.cc@1033 PS7, Line 1033: FLAGS_log_container_max_size = 1024 * 1024; This is not a good fix.I think there still is a chance around 1e-5 that we will be left with a .data file without .metadata with this setting. I obviously can't test it in a reasonable time, but I will revert to the flag solution. Please read the new commit message for more info. -- To view, visit http://gerrit.cloudera.org:8080/20725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 Gerrit-Change-Number: 20725 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Wed, 24 Jan 2024 18:47:29 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-3527 Fix block manager test when using 64k container block alignment
Hello Mahesh Reddy, Alexey Serbin, Ashwani Raina, Kudu Jenkins, Abhishek Chennaka, Wang Xixu, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20725 to look at the new patch set (#8). Change subject: KUDU-3527 Fix block manager test when using 64k container block alignment .. KUDU-3527 Fix block manager test when using 64k container block alignment BlockManagerTest.TestMetadataOkayDespiteFailure might fail on system where we use 64k alignment for data blocks. Root cause: Currently tablets fail to load if .metadata is missing but there is still a non-empty ".data" file. If FLAGS_env_inject_eio is set to greater than zero, then there is a chance that, when we delete the container file, we only delete the ".meta", but leave the ".data" file. So deleting containers with injected io errors is expected to sometime prevent the block manager from restarting properly. However container deletion almost never occurred in this test until we run it on the new RHEL 8.8 ARM 64k core. Why is it stable on x86_64: On x86_64 we usually use 4k block alignment. We write 6080 byte data into a block, which is padded to 8k. So in the current test settings we have 32 blocks in a container when it becomes full (FLAGS_log_container_max_size = 256k). Later we delete exactly half of the 500 blocks we wrote. The chance of deleting all 32 blocks in a container is very small, and even if it happens, it still has around 0.09 chance to become corrupted. It is a bit flaky, but it would fail less than once in a billion run. If you dramatically decrease the FLAGS_log_container_max_size flag, the test starts to occasionally fail on a traditional x86_64 machine too. Why is it unstable with 64k alignment: With 64k alignment (currently used on ARM RHEL 8.8 64k core), there are 4 blocks in a full container file. We write 500 blocks, so we expect to have nearly 125 full files. If we delete exactly half of the blocks, we will make many (full) container file empty. Some of them might fail to be deleted properly leaving a lonely non-empty .data file without .metadata. On my RHEL machine the test fails 97-98% of the time for this exact reason. Solution: The test TestMetadataOkayDespiteFailure was supposed to test reloading the block manager with containers having deleted blocks, with some previous failed delete. It (probably) never tested the case when container deletion occurs. + Disabled container deletion, so the test scope remains the same as it was with smaller block alignments. + Add a new (currently disabled) test, to see how block manager handles the above described situation. Added a jira to find a solution (KUDU-3528). The original issue is not ARM specific, and far from trivial to solve, and was always in the system. Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 --- M src/kudu/fs/block_manager-test.cc 1 file changed, 91 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/25/20725/8 -- To view, visit http://gerrit.cloudera.org:8080/20725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 Gerrit-Change-Number: 20725 Gerrit-PatchSet: 8 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] KUDU-3527 Fix block manager test on 64k filesystems
Hello Mahesh Reddy, Alexey Serbin, Ashwani Raina, Kudu Jenkins, Abhishek Chennaka, Wang Xixu, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20725 to look at the new patch set (#7). Change subject: KUDU-3527 Fix block manager test on 64k filesystems .. KUDU-3527 Fix block manager test on 64k filesystems BlockManagerTest.TestMetadataOkayDespiteFailure might fail on systems where fs_block_size=64k. Currently tablets fail to load if one metadata is missing but there is still a non-empty ".data" file. If FLAGS_env_inject_eio is not zero, then there is a chance that, when we delete the container file, we only delete the ".meta", but leave the ".data" file. In the current test on systems with fs_block_size=4k deletion never occurs. Changing to kNumAppends=64 will cause the test to randomly fail on x86 systems too, although only with a 2-3% chance (at least on my ubuntu20 machine). The short term solution is to make sure no container becomes full. A jira was created: KUDU-3528 and a test was added: DISABLED_TestReloadBlockManagerDespiteFailure for the original issue. Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 --- M src/kudu/fs/block_manager-test.cc 1 file changed, 98 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/25/20725/7 -- To view, visit http://gerrit.cloudera.org:8080/20725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 Gerrit-Change-Number: 20725 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] Fix CheckHolePunch for bigger than 4k blocks.
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/20894 ) Change subject: Fix CheckHolePunch for bigger than 4k blocks. .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/20894/2/src/kudu/fs/dir_util.cc File src/kudu/fs/dir_util.cc: http://gerrit.cloudera.org:8080/#/c/20894/2/src/kudu/fs/dir_util.cc@a68 PS2, Line 68: > Maybe I am missing something here. PreAllocate always calls allocate, you are right. If you call it with non-whole blocks it just rounds up the reservation. If you allocate 16k it will take a whole 64k block. If you would allocate 80k, it would take up 128k. If you holepunch a non-aligned interval, it will zero the bits out, but only free up blocks that are entirely in the interval. (e.g if you holepunch from [2k,10k] on a 4k filesystem, it will only free up the 2nd block. it will zero out the second half of the 1st block and the first half of the 3rd block). If you do multiple holepunches, that overlap, it won't free up blocks that are shared between. (e.g on a 4k filesystem you holepunch [2k, 6k] [6k, 10k], then the 2nd block (4k to 8k)will still be reserved by the file. Here is an easy to run c++ example (for xfs): https://github.com/martonka/kudu_random_stuff/blob/master/cpp/hole_punch_test.cpp -- To view, visit http://gerrit.cloudera.org:8080/20894 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib4af73f0aa25db674fe0a34355cecd27a0c68417 Gerrit-Change-Number: 20894 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Fri, 19 Jan 2024 13:31:29 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-3523 Use f bsize as Kudu block size instead of st blksize
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/20680 ) Change subject: KUDU-3523 Use f_bsize as Kudu block size instead of st_blksize .. Patch Set 4: > Do you have any real problem running kudu? > For me Kudu seems to kind of handle f_bsize != st_blksize kind of. > So it can run an such a disk (maybe not by optimal performance). > So backward compatibility is an issue, since there can be a Kudu > running on such a system. which means only this flag solution can > work. > Also some of these consistently tests fail on x86_64 to if you > make an ifs disk with 1k blocks. > > I think using f_bsize instead of st_blksize might actually be the > better way on 64k rhel if the disk is 4k. > > So I support the addition of these flags. > However I think we should extend/fix the tests to test both false > and true values of the flag. *xfs disk -- To view, visit http://gerrit.cloudera.org:8080/20680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I04343caf5fe377a4fa9b57e6e450307e6b995928 Gerrit-Change-Number: 20680 Gerrit-PatchSet: 4 Gerrit-Owner: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Yifan Zhang Gerrit-Reviewer: Yingchun Lai Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Tue, 16 Jan 2024 14:58:00 + Gerrit-HasComments: No
[kudu-CR] KUDU-3523 Use f bsize as Kudu block size instead of st blksize
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/20680 ) Change subject: KUDU-3523 Use f_bsize as Kudu block size instead of st_blksize .. Patch Set 4: Do you have any real problem running kudu? For me Kudu seems to kind of handle f_bsize != st_blksize kind of. So it can run an such a disk (maybe not by optimal performance). So backward compatibility is an issue, since there can be a Kudu running on such a system. which means only this flag solution can work. Also some of these consistently tests fail on x86_64 to if you make an ifs disk with 1k blocks. I think using f_bsize instead of st_blksize might actually be the better way on 64k rhel if the disk is 4k. So I support the addition of these flags. However I think we should extend/fix the tests to test both false and true values of the flag. -- To view, visit http://gerrit.cloudera.org:8080/20680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I04343caf5fe377a4fa9b57e6e450307e6b995928 Gerrit-Change-Number: 20680 Gerrit-PatchSet: 4 Gerrit-Owner: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Yifan Zhang Gerrit-Reviewer: Yingchun Lai Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Tue, 16 Jan 2024 14:56:41 + Gerrit-HasComments: No
[kudu-CR] Fix CheckHolePunch for bigger than 4k blocks.
Hello Ashwani Raina, Kudu Jenkins, Wang Xixu, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20894 to look at the new patch set (#5). Change subject: Fix CheckHolePunch for bigger than 4k blocks. .. Fix CheckHolePunch for bigger than 4k blocks. With the new 64k ARM core, you can use filesystems with up to 64k block size. Currently Kudu does not start if the data is on a filesystem with larger than 4k blocks. Problem: CheckHolePunch has hard-coded 4k block size instead of using the get env->GetBlockSize. Kudu seems to be running fine on a 64k fs after this fix. More info: https://kudu.apache.org/docs/troubleshooting.html#req_hole_punching Change-Id: Ib4af73f0aa25db674fe0a34355cecd27a0c68417 --- M src/kudu/fs/dir_util.cc 1 file changed, 8 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/94/20894/5 -- To view, visit http://gerrit.cloudera.org:8080/20894 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib4af73f0aa25db674fe0a34355cecd27a0c68417 Gerrit-Change-Number: 20894 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] Fix CheckHolePunch for bigger than 4k blocks.
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/20894 ) Change subject: Fix CheckHolePunch for bigger than 4k blocks. .. Patch Set 4: (3 comments) http://gerrit.cloudera.org:8080/#/c/20894/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20894/2//COMMIT_MSG@12 PS2, Line 12: > Maybe you could add more details on what fails in this test if underlying f I just wanted to give a test, so if some1 wants to try it out, he does not have to run a whole server. I changed the commit message, so It wont confuse anyone. It is not a unit test fix. This is required for kudu to start http://gerrit.cloudera.org:8080/#/c/20894/1/src/kudu/fs/dir_util.cc File src/kudu/fs/dir_util.cc: http://gerrit.cloudera.org:8080/#/c/20894/1/src/kudu/fs/dir_util.cc@78 PS1, Line 78: ETURN_NOT_OK(env->GetBlockSize(filename, > There are some failed tests, please check it. Thank you. http://gerrit.cloudera.org:8080/#/c/20894/2/src/kudu/fs/dir_util.cc File src/kudu/fs/dir_util.cc: http://gerrit.cloudera.org:8080/#/c/20894/2/src/kudu/fs/dir_util.cc@a68 PS2, Line 68: > Just wondering, if we keep offset at 4K which is unaligned for a 64K file-s We use ioctl in case of xfs. (see Status PunchHole in env_posix.cc) It does not report any error. Currently we fail with error message in line 96. (since file will be 64k). If we reserve a multiple of 64k, but make an unaligned or not long enough hole punch, then we fail with the message in line 105. -- To view, visit http://gerrit.cloudera.org:8080/20894 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib4af73f0aa25db674fe0a34355cecd27a0c68417 Gerrit-Change-Number: 20894 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Tue, 16 Jan 2024 11:00:51 + Gerrit-HasComments: Yes
[kudu-CR] Fix CheckHolePunch for bigger than 4k blocks.
Hello Ashwani Raina, Kudu Jenkins, Wang Xixu, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20894 to look at the new patch set (#4). Change subject: Fix CheckHolePunch for bigger than 4k blocks. .. Fix CheckHolePunch for bigger than 4k blocks. With the new 64k ARM core, you can use filesystems with up to 64k block size. Currently Kudu does not start if the data is on a filesystem with larger than 4k blocks. Problem: CheckHolePunch has hard-coded 4k block size instead of using the get env->GetBlockSize. Kudu seems to be running fine on a 64k fs after this fix. More info: https://kudu.apache.org/docs/troubleshooting.html#req_hole_punching Change-Id: Ib4af73f0aa25db674fe0a34355cecd27a0c68417 --- M src/kudu/fs/dir_util.cc 1 file changed, 8 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/94/20894/4 -- To view, visit http://gerrit.cloudera.org:8080/20894 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib4af73f0aa25db674fe0a34355cecd27a0c68417 Gerrit-Change-Number: 20894 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
[kudu-CR] Fix CheckHolePunch for bigger than 4k blocks.
Hello Ashwani Raina, Kudu Jenkins, Wang Xixu, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20894 to look at the new patch set (#3). Change subject: Fix CheckHolePunch for bigger than 4k blocks. .. Fix CheckHolePunch for bigger than 4k blocks. With the new 64k ARM core, you can use filesystems with up to 64k block size. Currently Kudu does not start if the data is on a filesystem with larger than 4k blocks. Problem: CheckHolePunch has hard-coded 4k block size instead of using the get env->GetBlockSize. Kudu seems to be running fine on a 64k fs after this fix. Change-Id: Ib4af73f0aa25db674fe0a34355cecd27a0c68417 --- M src/kudu/fs/dir_util.cc 1 file changed, 8 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/94/20894/3 -- To view, visit http://gerrit.cloudera.org:8080/20894 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib4af73f0aa25db674fe0a34355cecd27a0c68417 Gerrit-Change-Number: 20894 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
[kudu-CR] [log block manager-test] Improve random selection
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/20899 ) Change subject: [log_block_manager-test] Improve random selection .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/20899/4/src/kudu/fs/log_block_manager-test.cc File src/kudu/fs/log_block_manager-test.cc: http://gerrit.cloudera.org:8080/#/c/20899/4/src/kudu/fs/log_block_manager-test.cc@1296 PS4, Line 1296: auto rng = std::default_random_engine{}; Did you consider seting a random seed? Seed seems to be randomised in some of the other tests in this file (SeedRandom()). I don't see any reason not to. (It was probably just forgotten). Also you could delete (begin + (1-ratio) * count, end) to get an absolutely superfluous perf improvement :P http://gerrit.cloudera.org:8080/#/c/20899/4/src/kudu/fs/log_block_manager-test.cc@1298 PS4, Line 1298: ids.erase(ids.begin(), ids.begin() + static_cast(kLiveBlockRatio * kNumBlocks)); > Is it better to use another variable to store the to-delete ids, for exampl ids is a local variable in this 17 lines lambda. The current way seems kind of safe for me (and better performance). -- To view, visit http://gerrit.cloudera.org:8080/20899 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9f73f0572319784b07845b02d15a7d3e6f31abce Gerrit-Change-Number: 20899 Gerrit-PatchSet: 4 Gerrit-Owner: Ádám Bakai Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Tue, 16 Jan 2024 07:48:30 + Gerrit-HasComments: Yes
[kudu-CR] Fix CheckHolePunch for bigger than 4k blocks.
Hello Kudu Jenkins, Wang Xixu, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20894 to look at the new patch set (#2). Change subject: Fix CheckHolePunch for bigger than 4k blocks. .. Fix CheckHolePunch for bigger than 4k blocks. With the new 64k ARM core, you can use filesystems with up to 64k block size. One test it fixes on 64k xfs: log_block_manager-test/ContainerPreallocationTest Change-Id: Ib4af73f0aa25db674fe0a34355cecd27a0c68417 --- M src/kudu/fs/dir_util.cc 1 file changed, 8 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/94/20894/2 -- To view, visit http://gerrit.cloudera.org:8080/20894 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib4af73f0aa25db674fe0a34355cecd27a0c68417 Gerrit-Change-Number: 20894 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
[kudu-CR] Fix CheckHolePunch for bigger than 4k blocks.
Zoltan Martonka has uploaded this change for review. ( http://gerrit.cloudera.org:8080/20894 Change subject: Fix CheckHolePunch for bigger than 4k blocks. .. Fix CheckHolePunch for bigger than 4k blocks. With the new 64k ARM core, you can use filesystems with up to 64k block size. One test it fixes on 64k xfs: log_block_manager-test/ContainerPreallocationTest Change-Id: Ib4af73f0aa25db674fe0a34355cecd27a0c68417 --- M src/kudu/fs/dir_util.cc 1 file changed, 8 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/94/20894/1 -- To view, visit http://gerrit.cloudera.org:8080/20894 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ib4af73f0aa25db674fe0a34355cecd27a0c68417 Gerrit-Change-Number: 20894 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Martonka
[kudu-CR] KUDU-3523 Use f bsize as Kudu block size instead of st blksize
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/20680 ) Change subject: KUDU-3523 Use f_bsize as Kudu block size instead of st_blksize .. Patch Set 4: > 4k block, then 15*4k holes, and then 16*512 4k block reserved Correction: 4k block, then 15*4k holes, and then 16*511 4k block reserved -- To view, visit http://gerrit.cloudera.org:8080/20680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I04343caf5fe377a4fa9b57e6e450307e6b995928 Gerrit-Change-Number: 20680 Gerrit-PatchSet: 4 Gerrit-Owner: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Yifan Zhang Gerrit-Reviewer: Yingchun Lai Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Thu, 14 Dec 2023 14:22:37 + Gerrit-HasComments: No
[kudu-CR] KUDU-3523 Use f bsize as Kudu block size instead of st blksize
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/20680 ) Change subject: KUDU-3523 Use f_bsize as Kudu block size instead of st_blksize .. Patch Set 4: I looked into LogBlockManagerTest/ContainerPreallocationTest on our cloudera rhel (Xixu probably has a similar setup). 64k kernel with 4k disk. The following happens: + The encryption header is written without preallocating FLAGS_log_container_preallocate_bytes bytes. + When we try to write some data, kudu round up the current byte size to 64k and tries to allocate the next block from there until FLAGS_log_container_preallocate_bytes bytes. But it should round to 4k, because f_bsize is 4k and it is the one that allocate/hole punch uses. Since FLAGS_log_container_preallocate_bytes=33554432, Kudu expects to have 512 64k continuous block reserved. Instead it will have a 4k block, then 15*4k holes, and then 16*512 4k block reserved again. It is clearly kudu wrong and not the system api. If you wish to debug it yourself and make your own conclusion I would suggest: + on log_block_manager-test.cc line remove the false from INSTANTIATE_TEST_SUITE_P(...) + put a breakpoint in log_block_manager.cc line 1635: RETURN_NOT_OK_HANDLE_ERROR(data_file_->PreAllocate(off, len, RWFile::CHANGE_FILE_SIZE)); + Run ContainerPreallocationTest. When it stops in the breakpoint, you can check that the file is only 4k, and it tries to allocate after 64k. -- To view, visit http://gerrit.cloudera.org:8080/20680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I04343caf5fe377a4fa9b57e6e450307e6b995928 Gerrit-Change-Number: 20680 Gerrit-PatchSet: 4 Gerrit-Owner: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Yifan Zhang Gerrit-Reviewer: Yingchun Lai Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Thu, 14 Dec 2023 14:19:59 + Gerrit-HasComments: No
[kudu-CR] KUDU-3527 Fix block manager test on 64k filesystems
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/20725 ) Change subject: KUDU-3527 Fix block manager test on 64k filesystems .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/20725/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20725/5//COMMIT_MSG@7 PS5, Line 7: Fix block manager test on 64k filesystems > Just for the record and to make things simpler here, you could make use of I commented on other Gerrit ( https://gerrit.cloudera.org/#/c/20680/), please continue there. -- To view, visit http://gerrit.cloudera.org:8080/20725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 Gerrit-Change-Number: 20725 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Thu, 14 Dec 2023 14:04:19 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-3523 Use f bsize as Kudu block size instead of st blksize
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/20680 ) Change subject: KUDU-3523 Use f_bsize as Kudu block size instead of st_blksize .. Patch Set 4: > The reason behind 64K block size on XFS could be because the > filesystem was created with "-b size=64K" mkfs option. Could you > confirm that? > > Also, in such cases, even though user is able to create a > filesystem with block size greater than general page size on linux > i.e. 4K, default mount API may not work because kernel is compiled > with 4K default page size and it doesn't allow a filesystem with > more than 4K block size to be mounted. > This could be one of the possibilities. > > Could you please share the following information: > 1. Output of command -> "getconf PAGE_SIZE" on the node where xfs > filesystem is hosted > 2. mkfs command used to create xfs filesystem (i.e. /dev/vdb) > 3. Check if punch hole is supported on a filesystem with block size > more than 4K. Ashwani, You can easily set up the environment, just use the cloudera-systest-base-rhel-8.8-hvm-arm-grp image (Virginia/public) in cloudcat. This will run a 64k rhel kernel on a filesystem created by "-b size=4k". (on public aws I can only select 4k kernel for arm rhel, so it is trickier there). Use "grep -ir pagesize /proc/self/smaps" to check kernel size. There is the following 4 combinations: Case1. 4k kernel with "-b size=4k" disk: Kudu works fine. Case2. 64k kernel with "-b size=4k" disk: This is the case that Wang Xixu is trying to fix. To my best understanding kudu kind of works without encryption (maybe does not preallocated properly, or there are other hidden issues, but I could start a server and use it). Kudu breaks with encryption. Case3. 64k kernel with "-b size=64k" disk: Status CheckHolePunch(…)” in dir_util.cc fails. Kudu is absolutely broken. Case4. 4k kernel with "-b size=64k": system refuses to mount the disk (mount(2) system call failed: Function not implemented.). Do we need to support 64k kernel? I guess we have to. But then do we also want to support Case3 (it would be logical)? -- To view, visit http://gerrit.cloudera.org:8080/20680 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I04343caf5fe377a4fa9b57e6e450307e6b995928 Gerrit-Change-Number: 20680 Gerrit-PatchSet: 4 Gerrit-Owner: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Yifan Zhang Gerrit-Reviewer: Yingchun Lai Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Thu, 14 Dec 2023 13:58:42 + Gerrit-HasComments: No
[kudu-CR] KUDU-3527 Fix block manager test on 64k filesystems
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/20725 ) Change subject: KUDU-3527 Fix block manager test on 64k filesystems .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/20725/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20725/5//COMMIT_MSG@7 PS5, Line 7: Fix block manager test on 64k filesystems > Could you give more detail about the test on system where f_bsize=64k? For f_bsize=64k, I do the following. Start an rhel 8.8 with 64k kernel. Mount an additional disk and format it: mkfs.xfs -b /dsize=65536 -f /dev/ (to my best understanding you should prefer using such formating with 64k kernel or use the 4k kernel version of rhel arm). mount it and put kudu data dir on it (export TEST_TMPDIR=/mnt/disk2/kudu_test) you will have f_bsize=64k and st_blksize=64k. Kudu wont start because “Status CheckHolePunch(…)” in dir_util.cc has hardcoded values (also needs fixing). I took a deeper look at https://gerrit.cloudera.org/#/c/20680/. I think I finally start to understand the problem. Your fix is necessary (f_bsize is indeed what hole punch/ preallocation uses). However it is just one of multiple fixes that are needed. I suggest we continue the discussion in 20680 For the fix in this request: Fixing the “f_bsize > 4k || st_blksize > 4k” problem wont fix BlockManagerTest.TestMetadataOkayDespiteFailure. It will still fail for f_bsize=64k disk (as it does even on x86_64 with minor changes). I thing we should mere this fix, as it is just accidentally correlated with the f_bsize issues. However if you guys would like to keep it open until other issues are fixed, and we can test it on f_bsize=64k system, then we can leave it open. -- To view, visit http://gerrit.cloudera.org:8080/20725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 Gerrit-Change-Number: 20725 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Thu, 14 Dec 2023 12:52:32 + Gerrit-HasComments: Yes
[kudu-CR] [ARM] Fix CountOnes() function
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/20736 ) Change subject: [ARM] Fix CountOnes() function .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/20736/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20736/2//COMMIT_MSG@9 PS2, Line 9: arms > OK, it's just a nit anyway. Sorry, I have no idea how I managed to miss it. -- To view, visit http://gerrit.cloudera.org:8080/20736 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia4b78b5c1df9547548fc7980cec4b84349e868bc Gerrit-Change-Number: 20736 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Mon, 11 Dec 2023 09:56:57 + Gerrit-HasComments: Yes
[kudu-CR] [ARM] Fix CountOnes() function
Hello Tidy Bot, Alexey Serbin, Ashwani Raina, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20736 to look at the new patch set (#4). Change subject: [ARM] Fix CountOnes() function .. [ARM] Fix CountOnes() function This function is only used on arms. It does not work on numbers greater than 255: For some reason ULL suffix was added to 2 masks, that should be 32bit, and losing the higher bits are actually the desired behaviour. Change-Id: Ia4b78b5c1df9547548fc7980cec4b84349e868bc --- M src/kudu/gutil/bits.h 1 file changed, 6 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/20736/4 -- To view, visit http://gerrit.cloudera.org:8080/20736 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia4b78b5c1df9547548fc7980cec4b84349e868bc Gerrit-Change-Number: 20736 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] KUDU-3527 Fix block manager test on 64k filesystems
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/20725 ) Change subject: KUDU-3527 Fix block manager test on 64k filesystems .. Patch Set 6: I created a different ticket for the original issue: https://issues.apache.org/jira/browse/KUDU-3528 Also provided slight modifications to make the test fail on normal x86_64 machines. I would like to deliver this now, and fix the original issue later (because it is a long task, and has nothing to do with arm or the original purpose of this test). -- To view, visit http://gerrit.cloudera.org:8080/20725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 Gerrit-Change-Number: 20725 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Fri, 01 Dec 2023 10:00:09 + Gerrit-HasComments: No
[kudu-CR] KUDU-3527 Fix block manager test on 64k filesystems
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/20725 ) Change subject: KUDU-3527 Fix block manager test on 64k filesystems .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/20725/5/src/kudu/fs/block_manager-test.cc File src/kudu/fs/block_manager-test.cc: http://gerrit.cloudera.org:8080/#/c/20725/5/src/kudu/fs/block_manager-test.cc@936 PS5, Line 936: FLAGS_log_container_max_size = 512 > I am not quite clear on why we need to increase container max size. It woul I tried to give an other explanation. Also instead of increasing the container size, so deletion of the log block file itself never happens, I realised it is easier to disable the dead container deletion: FLAGS_log_block_manager_delete_dead_container = false; -- To view, visit http://gerrit.cloudera.org:8080/20725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 Gerrit-Change-Number: 20725 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Thu, 30 Nov 2023 16:21:48 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-3527 Fix block manager test on 64k filesystems
Hello Mahesh Reddy, Ashwani Raina, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20725 to look at the new patch set (#6). Change subject: KUDU-3527 Fix block manager test on 64k filesystems .. KUDU-3527 Fix block manager test on 64k filesystems BlockManagerTest.TestMetadataOkayDespiteFailure might fail on systems where fs_block_size=64k. Currently tablets fail to load if one metadata is missing but there is still a non-empty ".data" file. If FLAGS_env_inject_eio is not zero, then there is a chance that, when we delete the container file, we only delete the ".meta", but leave the ".data" file. In the current test on systems with fs_block_size=4k deletion never occurs. Changing to kNumAppends=64 will cause the test to randomly fail on x86 systems too, although only with a 2-3% chance (at least on my ubuntu20 machine). The short term solution is to make sure no container becomes full. Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 --- M src/kudu/fs/block_manager-test.cc 1 file changed, 14 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/25/20725/6 -- To view, visit http://gerrit.cloudera.org:8080/20725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 Gerrit-Change-Number: 20725 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] KUDU-3527 Fix block manager test on 64k filesystems
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/20725 ) Change subject: KUDU-3527 Fix block manager test on 64k filesystems .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/20725/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20725/5//COMMIT_MSG@7 PS5, Line 7: Fix block manager test on 64k filesystems > Have you looked at this patch -> https://gerrit.cloudera.org/#/c/20680/ Yes I did, but I don't fully understand the purpose of it. In our current cloudcat rhel 8.8 image f_bsize is 4k, so it would fix the test. But if in some system f_bsize=64k too, then the test would still fail. Basically if GetBlockSize() return >=64k, then the test has a chance to fail. -- To view, visit http://gerrit.cloudera.org:8080/20725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7e325bde502b7d7f39dd17fa84cb7eb42a3d7c20 Gerrit-Change-Number: 20725 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Thu, 30 Nov 2023 13:49:16 + Gerrit-HasComments: Yes
[kudu-CR] [ARM] Fix CountOnes() function
Hello Tidy Bot, Alexey Serbin, Ashwani Raina, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20736 to look at the new patch set (#3). Change subject: [ARM] Fix CountOnes() function .. [ARM] Fix CountOnes() function This function is only used on arms. It does not work on numbers greater than 255: For some reason ULL suffix was added to 2 masks, that should be 32bit, and losing the higher bits are actually the desired behaviour. Change-Id: Ia4b78b5c1df9547548fc7980cec4b84349e868bc --- M src/kudu/gutil/bits.h 1 file changed, 4 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/20736/3 -- To view, visit http://gerrit.cloudera.org:8080/20736 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia4b78b5c1df9547548fc7980cec4b84349e868bc Gerrit-Change-Number: 20736 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Zoltan Martonka
[kudu-CR] [arm64] Fix CountOnes() function
Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/20736 ) Change subject: [arm64] Fix CountOnes() function .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/20736/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20736/2//COMMIT_MSG@7 PS2, Line 7: Fix CountOnes() function > Out of curiosity - did this issue manifest into some failure? bitmap-test TestBitMap.TestIteration was failing. inline void ForEachBit(...) ub bitmap.h has int tot_count = Bits::CountOnes64withPopcount(w); in line 216. This returned 33686278 instead of 6. It also caused the test to fail really slow, because it prints this array (filled with "64") to std::cout. This function (ForEachSetBit) is also used in rowlock.cc, but perf test runs fine, I don't know what exactly uses the corresponding function. -- To view, visit http://gerrit.cloudera.org:8080/20736 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia4b78b5c1df9547548fc7980cec4b84349e868bc Gerrit-Change-Number: 20736 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Ashwani Raina Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Zoltan Martonka Gerrit-Comment-Date: Thu, 30 Nov 2023 09:28:21 + Gerrit-HasComments: Yes
[kudu-CR] [arm64] Fix CountOnes() function
Hello Tidy Bot, Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20736 to look at the new patch set (#2). Change subject: [arm64] Fix CountOnes() function .. [arm64] Fix CountOnes() function This function is only used on arms. It does not work on numbers greather than 255: For some reason ULL sufix was added to 2 masks, that should be 32bit, and losing the higher bits are actually the desired behaviour. Change-Id: Ia4b78b5c1df9547548fc7980cec4b84349e868bc --- M src/kudu/gutil/bits.h 1 file changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/36/20736/2 -- To view, visit http://gerrit.cloudera.org:8080/20736 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia4b78b5c1df9547548fc7980cec4b84349e868bc Gerrit-Change-Number: 20736 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Martonka Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Zoltan Martonka