Abhishek Chennaka has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21773 )
Change subject: Squash cherry-picked commits #10 This is a combination of 20 commits.
......................................................................

Squash cherry-picked commits #10 This is a combination of 20 commits.

This is the 1st commit message:

KUDU-2671: Update upstream docs

This patch updates the upstream docs to include range-specific hash schemas within the partitioning section. An example with the proper SQL syntax is also included in the Kudu Impala integration doc.

Reviewed-on: http://gerrit.cloudera.org:8080/21108
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin <[email protected]>
(cherry picked from commit cf550d6d7cdd61f6c65f9ef75a1706cb91839876)

This is the commit message #2:

Add a benchmark for CBTree concurrent writes.

Before updating CBTree for ARM (where it is misbehaving currently), we should have a proper test for two scenarios:
+ Writing on multiple threads.
+ Reading on multiple threads while there are also active writes.

If read threads wait for values to be inserted, it defeats the purpose of benchmarking. Therefore, we should first populate a tree with values for the read threads. The read threads will then read values that are already in the tree, while the write threads continue to insert new values.

Setting up the tree for the second scenario essentially involves performing the first scenario. This is why both scenarios are combined into a single test.

The new test provides the following new features (compared to just running DoTestConcurrentInsert with higher parameters):
+ A value is read by a different thread than the one that inserted it.
+ Reader threads can't be assigned to a certain writer thread.
+ Keys are better distributed than with the previous shuffle method.
+ Allows measuring read-heavy performance (with a flag).

Reading threads start concurrently with writing threads, not at the end of each write thread (unlike DoTestConcurrentInsert).
Note that running only concurrent reads should not differ from TestScanPerformance, since no locking takes place and they do not sabotage each other. So no new test is required for that scenario.

Reviewed-on: http://gerrit.cloudera.org:8080/21447
Reviewed-by: Zoltan Chovan <[email protected]>
Reviewed-by: Ashwani Raina <[email protected]>
Reviewed-by: Alexey Serbin <[email protected]>
Tested-by: Alexey Serbin <[email protected]>
(cherry picked from commit f4a47fe041b7f547fd4816706347523e06f94f6d)

This is the commit message #3:

[build] bootstrap-dev-env.sh fix for ubuntu 22.04+

On Ubuntu 22.04 the python package is renamed to python2. On 23.10 and 24.04 python2 is no longer available. We should just use python3 on newer platforms.

Tested on: 18.04, 20.04, 22.04, 23.10, 24.04

Reviewed-on: http://gerrit.cloudera.org:8080/20559
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin <[email protected]>
(cherry picked from commit 0ace3633fa22e04b8edc567669833c1615cd4ad9)

This is the commit message #4:

[metrics] Add metrics for tablet copy op time

Add server-level statistics to track the time consumption of copy tablet operations. This is effective both for the source tablet and destination tablet during the copy operation.

The addition of monitoring items will aid in historical issue tracking and analysis, as well as facilitate the configuration of monitoring alarms.
Reviewed-on: http://gerrit.cloudera.org:8080/21356
Reviewed-by: Alexey Serbin <[email protected]>
Tested-by: Alexey Serbin <[email protected]>
(cherry picked from commit d370e0e4511508790c065340a52242ee09ecfea3)

This is the commit message #5:

[util] remove last vestiges of chromium Atomics from metrics

Reviewed-on: http://gerrit.cloudera.org:8080/21505
Tested-by: Alexey Serbin <[email protected]>
Reviewed-by: Yingchun Lai <[email protected]>
(cherry picked from commit 6bdd0c3d8b747169f09150874cd8751debaf2ed1)

This is the commit message #6:

KUDU-3584 fix flakiness in TableKeyRangeTest

When running client-test in TSAN/ASAN builds, the TableKeyRangeTest.TestGetTableKeyRange scenario would sometimes fail on busy nodes. This patch addresses the issue by increasing the timeout for scanners and for the write session, and making the related code more robust overall.

I also took the liberty of cleaning up the related code.

Reviewed-on: http://gerrit.cloudera.org:8080/21506
Reviewed-by: KeDeng <[email protected]>
Reviewed-by: Marton Greber <[email protected]>
Tested-by: Marton Greber <[email protected]>
(cherry picked from commit 5063e80e1ca26c5f1b763a6ce3e4708cd8196a26)

This is the commit message #7:

KUDU-3567 Fix resource leak in AsyncKuduScanner

To avoid a resource leak in AsyncKuduScanner, we should reuse the HashedWheelTimer instance from the corresponding AsyncKuduClient object in AsyncKuduScanner.
Reviewed-on: http://gerrit.cloudera.org:8080/21512
Reviewed-by: Wang Xixu <[email protected]>
Tested-by: Alexey Serbin <[email protected]>
Reviewed-by: Alexey Serbin <[email protected]>
(cherry picked from commit 3e43ae9f02da5602d4d3dc50b83204e0bafd1942)

This is the commit message #8:

[rpc] remove last vestiges of chromium Atomics from RPC

Reviewed-on: http://gerrit.cloudera.org:8080/21513
Tested-by: Kudu Jenkins
Reviewed-by: Marton Greber <[email protected]>
(cherry picked from commit 5405d06eeccaf1b0eb559e4a531f48160141ed16)

This is the commit message #9:

[gitignore] ignore .qt, .qtc_clangd, .vscode dirs

Reviewed-on: http://gerrit.cloudera.org:8080/21522
Tested-by: Kudu Jenkins
Reviewed-by: Marton Greber <[email protected]>
Reviewed-by: Yifan Zhang <[email protected]>
(cherry picked from commit 7ce157ef2e9823ec8dc5dec03651660408f46f2e)

This is the commit message #10:

[fs] remove chromium Atomics from FS

Reviewed-on: http://gerrit.cloudera.org:8080/21521
Tested-by: Marton Greber <[email protected]>
Reviewed-by: Marton Greber <[email protected]>
Reviewed-by: Zoltan Chovan <[email protected]>
(cherry picked from commit 5977402e3701b52888f91b7bb1e351f957e3c562)

This is the commit message #11:

KUDU-3580 Fix the crash caused when binaries run on older CPU machines

Since Kudu started linking rocksdb, the Kudu binaries may crash with the error "Illegal instruction" when running on machines which don't support newer CPU instructions (e.g. AVX512) but were built on a machine which supports them.

This patch enables the PORTABLE [1] option when building librocksdb to fix the issue. It should be noted that portable libraries may cause a slight performance degradation; it's recommended to disable the portable option (by setting the PORTABLE environment variable to OFF when building Kudu thirdparties) if there are no portability requirements.
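To put a rough number on the "slight performance degradation" mentioned above, the ops/sec figures from the db_bench runs quoted later in this commit message can be averaged per configuration. The following is only a sketch: the dictionary layout and the function names (`relative_change`, `summarize`) are illustrative and not part of Kudu or RocksDB, and three runs per configuration are too few to say anything about statistical significance.

```python
from statistics import mean

# ops/sec values copied from the db_bench output quoted in this commit message
# (three runs each, PORTABLE vs NON-PORTABLE builds of librocksdb).
RESULTS = {
    "fillrandom": {
        "portable": [190954, 190981, 192960],
        "non_portable": [192676, 193945, 180940],
    },
    "readseq": {
        "portable": [2231382, 2252646, 2252317],
        "non_portable": [2560051, 2477956, 2458297],
    },
}


def relative_change(portable, non_portable):
    """Mean PORTABLE throughput relative to mean NON-PORTABLE throughput.

    A negative value means the PORTABLE build was slower on average.
    """
    return mean(portable) / mean(non_portable) - 1.0


def summarize(results=RESULTS):
    """Return {benchmark name: relative throughput change} for all benchmarks."""
    return {
        name: relative_change(runs["portable"], runs["non_portable"])
        for name, runs in results.items()
    }


if __name__ == "__main__":
    for name, change in summarize().items():
        print(f"{name}: {change:+.1%}")
```

On these numbers the PORTABLE build is about 10% slower on readseq, while fillrandom differs by about 1% (in favour of PORTABLE, which may well be run-to-run noise given the outlier in the third NON-PORTABLE run).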
The PORTABLE option only takes effect on librocksdb currently. The following shows a comparison of the benchmark results of RocksDB's 'db_bench' tool with the '-DPORTABLE' option enabled and disabled:
- The test is similar to the Kudu use case: random write and sequential read, with key and value sizes of about 40 bytes.
- The tests ran 3 times.
- The binaries are built and run on the same machine, which supports newer CPU instructions (e.g. AVX512).

PORTABLE:
$ ./db_bench -benchmarks=fillrandom,readseq -num=10000000 -key_size=40 -value_size=40
1.
fillrandom : 5.237 micros/op 190954 ops/sec 52.369 seconds 10000000 operations; 14.6 MB/s
readseq    : 0.448 micros/op 2231382 ops/sec 2.833 seconds 6322271 operations; 170.2 MB/s
2.
fillrandom : 5.236 micros/op 190981 ops/sec 52.361 seconds 10000000 operations; 14.6 MB/s
readseq    : 0.444 micros/op 2252646 ops/sec 2.806 seconds 6321658 operations; 171.9 MB/s
3.
fillrandom : 5.182 micros/op 192960 ops/sec 51.824 seconds 10000000 operations; 14.7 MB/s
readseq    : 0.444 micros/op 2252317 ops/sec 2.807 seconds 6323209 operations; 171.8 MB/s

NON-PORTABLE:
$ ./db_bench -benchmarks=fillrandom,readseq -num=10000000 -key_size=40 -value_size=40
1.
fillrandom : 5.190 micros/op 192676 ops/sec 51.900 seconds 10000000 operations; 14.7 MB/s
readseq    : 0.391 micros/op 2560051 ops/sec 2.470 seconds 6322786 operations; 195.3 MB/s
2.
fillrandom : 5.156 micros/op 193945 ops/sec 51.561 seconds 10000000 operations; 14.8 MB/s
readseq    : 0.404 micros/op 2477956 ops/sec 2.551 seconds 6320644 operations; 189.1 MB/s
3.
fillrandom : 5.527 micros/op 180940 ops/sec 55.267 seconds 10000000 operations; 13.8 MB/s
readseq    : 0.407 micros/op 2458297 ops/sec 2.571 seconds 6320885 operations; 187.6 MB/s

1. https://github.com/facebook/rocksdb/blob/v7.7.3/CMakeLists.txt#L248

Reviewed-on: http://gerrit.cloudera.org:8080/21287
Tested-by: Alexey Serbin <[email protected]>
Reviewed-by: Alexey Serbin <[email protected]>
(cherry picked from commit 1474380f5ccfd2f7e78756488d12eb52d2664132)

This is the commit message #12:

[metrics] Add metrics for tablet replica election

Add tablet-level statistics to track the time consumed for replica leader election. For any tablet that initiates a leader election operation, the time will be recorded regardless of the outcome.

The addition of monitoring items will aid in historical issue tracking and analysis, as well as facilitate the configuration of monitoring alarms.

Reviewed-on: http://gerrit.cloudera.org:8080/21490
Reviewed-by: Yingchun Lai <[email protected]>
Reviewed-by: Zoltan Chovan <[email protected]>
Tested-by: Yingchun Lai <[email protected]>
(cherry picked from commit 0299a254b3e832c94786b155c562cd5d5b46fcc3)

This is the commit message #13:

[metrics] Add metrics for tablet replication time

Add a tablet-level metric to track the time cost of replication between replicas. To verify the correctness of the new logic, I constructed a synchronization scenario based on write operations.

The addition of monitoring items will aid in historical issue tracking and analysis, as well as facilitate the configuration of monitoring alarms.

Reviewed-on: http://gerrit.cloudera.org:8080/21507
Reviewed-by: Zoltan Chovan <[email protected]>
Reviewed-by: Yingchun Lai <[email protected]>
Tested-by: Yingchun Lai <[email protected]>
(cherry picked from commit 1b99da532f52d143c46440c3903785d642fb45a3)

This is the commit message #14:

[java] Use MetastoreConf instead of HiveConf

Using the HiveConf class for Metastore configuration has been deprecated for a while. The actual configurations have been deprecated for about 6 years; this happened around the Hive 3.1 release [1].

This change replaces the HiveConf properties usage in TestKuduMetastorePlugin.java.
[1] https://github.com/apache/hive/commit/105cc6543051fe5697de754a0f093539bdac59ff#diff-b7bbe8545a21ec7d7e9cfe40ef66444789e332996aaa9e7f1430dbe4822a2c9c

Reviewed-on: http://gerrit.cloudera.org:8080/21548
Reviewed-by: Abhishek Chennaka <[email protected]>
Tested-by: Abhishek Chennaka <[email protected]>
Reviewed-by: Wang Xixu <[email protected]>
Tested-by: Marton Greber <[email protected]>
Reviewed-by: Marton Greber <[email protected]>
(cherry picked from commit be4d9a987d7de6dfdc234f83b8302f6c43bcd3bf)

This is the commit message #15:

[cfile] allocate CFileWriter field on the stack when possible

This patch updates the BloomFileWriter and DeltaFileWriter classes to avoid allocating their CFileWriter member field on the heap, and also contains other minor clean-up. It makes the code cleaner and results in fewer calls to new/tcmalloc.

This patch doesn't contain any functional modifications.

Reviewed-on: http://gerrit.cloudera.org:8080/21543
Reviewed-by: Yingchun Lai <[email protected]>
Tested-by: Marton Greber <[email protected]>
Reviewed-by: Marton Greber <[email protected]>
(cherry picked from commit eedce87cafbf9ecf17a213060595a39fcc79efba)

This is the commit message #16:

[tool] Add '--columns' param to 'table list'

Currently there is no easy way to get table UUIDs from the kudu CLI. This patch adds the optional '--columns' parameter to the 'kudu table list' command, which works similarly to 'kudu master/tserver list'. The available columns are: 'id', 'name', 'num_tablets', 'num_replicas', 'live_row_count'.

Reviewed-on: http://gerrit.cloudera.org:8080/21496
Reviewed-by: Wang Xixu <[email protected]>
Tested-by: Marton Greber <[email protected]>
Reviewed-by: Marton Greber <[email protected]>
(cherry picked from commit d91d5c95dab38770890cac6a30be63f80eb82fec)

This is the commit message #17:

[client] add ScanTokenStaleRaftMembershipTest

This patch adds a new test scenario TabletLeaderChange into the newly added ScanTokenStaleRaftMembershipTest fixture.
The motivation for this patch was a request to clarify the Kudu C++ client's behavior in particular scenarios, which in itself was in the context of a follow-up to KUDU-3349.

Reviewed-on: http://gerrit.cloudera.org:8080/21580
Reviewed-by: Ashwani Raina <[email protected]>
Reviewed-by: Abhishek Chennaka <[email protected]>
Tested-by: Alexey Serbin <[email protected]>
(cherry picked from commit e44e0d4892b0e2469a18aefb78062f5aa2e1799c)

This is the commit message #18:

KUDU-3590 Update expired test certificates

Certificates generated in 2d624f877 expired after 1 year, causing several test failures in security-itest and rpc-test.

This commit replaces the expired certificates with ones with a validity of 20 years.

Note: If you git blame this file in 2044 in your flying car, please make sure to update the keys and certificates to whatever is reasonable for your quantum computers instead of simply relying on the instructions in the comments.

Certificate chain:

Issuer: CN=IntermediateCA, ST=California, C=US, [email protected], O=Apache Software Foundation, OU=Intermediate CA
Validity
    Not Before: Jul 23 19:58:19 2024 GMT
    Not After : Apr 9 19:58:19 2044 GMT
Subject: CN=127.0.0.1, ST=California, C=US, [email protected], O=Apache Software Foundation, OU=Kudu

Issuer: C=US, ST=Some-State, O=Apache Software Foundation, CN=127.0.0.1, [email protected]
Validity
    Not Before: Jul 23 19:55:34 2024 GMT
    Not After : Jul 18 19:55:34 2044 GMT
Subject: CN=IntermediateCA, ST=California, C=US, [email protected], O=Apache Software Foundation, OU=Intermediate CA

Issuer: C=US, ST=Some-State, O=Apache Software Foundation, CN=127.0.0.1, [email protected]
Validity
    Not Before: Jul 23 17:50:11 2024 GMT
    Not After : Jun 29 17:50:11 2124 GMT
Subject: C=US, ST=Some-State, O=Apache Software Foundation, CN=127.0.0.1, [email protected]

Reviewed-on: http://gerrit.cloudera.org:8080/21607
Reviewed-by: Marton Greber <[email protected]>
Tested-by: Marton Greber <[email protected]>
Reviewed-by: Zoltan Chovan <[email protected]>
(cherry picked from commit 4decbbdf553ed54796ddbbb49e1142925657110f)

This is the commit message #19:

KUDU-3594 Fix scan_token-test on ASAN

scan_token-test was failing on ASAN builds since ScanTokenStaleRaftMembershipTest.TabletLeaderChange was introduced in e44e0d48. Unfortunately, this was not caught during code review, as builds were failing due to other reasons.

This commit fixes this issue by deallocating a tablet-server map at the end of the test.

Reviewed-on: http://gerrit.cloudera.org:8080/21610
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin <[email protected]>
(cherry picked from commit 2d9292eb9bb849bf6a359f0738e92d9e6248b8a6)

This is the commit message #20:

[build] KUDU-3551 Upgrade gradle to 7.6.4

The current version of gradle is 3+ years old, and the highest Java version supported by it is Java 13, which is ~5 years old. Upgrading Kudu's gradle version will make it possible to move from Java 8 to a more recently released stable Java version, which would future-proof the Kudu Java client. Gradle 7.6 makes it possible to use up to Java 19.
Additionally, Gradle has fixed multiple security vulnerabilities and issues since 6.8.3 [1]

* Gradle version has been upgraded to 7.6.4 from 6.8.3, both in the dependencies declaration and the actual wrapper code itself (/java/gradlew)
* Removed jcenter from list of repositories [2]
* Replaced compile and testCompile configurations with implementation and testImplementation respectively [3]
* Replaced extension methods with archiveExtension [4]
* Removed scopes.gradle and the propdeps plugin, as optional and provided scopes are no longer supported/working in Gradle 7; these have been replaced by either compileOnly
* Upgraded scalafmt version [5]
* Pinned the scalafmt config to version 1.5.1 to avoid the new formatting errors that popped up with the new scalafmt version
* Upgraded Spark3 version to 3.2.4 to solve Guava related compilation errors (some classes have been deprecated and removed, which led to compilation failure)
* Updated the shadow plugin version from 6.1.0 to 7.1.2
* Added LogCaptor to resolve issues with CapturingLogAppender, which stopped working in the Scala tests. Fixing it was taking too much time, so this was implemented as a workaround. It only affected 4 tests in TestKuduBackup.scala; if the CLA errors are fixed later, the dependency can be removed and the test changes reverted.
* As the legacy maven plugin was removed, the signing and publishing has been rewritten using the new maven-publish plugin. I tested the signing and publishing with a locally hosted Maven repository, however this needs further testing and confirmation from a committer to ensure it works as expected. The new plugin uses a different task to publish the artifacts (publish instead of uploadArchives); an alias was created to keep backwards compatibility with existing docs/scripts.
* The permissions for the assign-location.py script were updated, as previously the permissions of its symlink were copied, but that changed in the new gradle version and the original (non-executable) permissions led to test failures.
* Two tasks were updated in the shadow.gradle file as well; this needs another round of verification after the jars are built correctly
* Dependency scopes were only modified when necessary (compilation failure)

Additional manual testing:
* Attila Bukor has tested and confirmed that the java artifact signing and publishing works.

[1] https://docs.gradle.org/7.6.3/release-notes.html?_gl=1*n7ayww*_ga*MjQ4NTk5ODI5LjE2ODc0MDc0OTA.*_ga_7W7NC6YNPT*MTcxNzc2MzIzOS44NC4xLjE3MTc3NjQyMTUuMjYuMC4w
[2] https://docs.gradle.org/6.9.2/userguide/upgrading_version_6.html?_gl=1*1xt1kk*_ga*MjQ4NTk5ODI5LjE2ODc0MDc0OTA.*_ga_7W7NC6YNPT*MTcwNzEyMzc1Ni4zLjEuMTcwNzEyODU2MS41NC4wLjA.#jcenter_deprecation
[3] https://docs.gradle.org/6.9.2/userguide/upgrading_version_5.html?_gl=1*1ldn5ls*_ga*MjQ4NTk5ODI5LjE2ODc0MDc0OTA.*_ga_7W7NC6YNPT*MTcwNzEyMzc1Ni4zLjEuMTcwNzEyODY1My40NS4wLjA.#dependencies_should_no_longer_be_declared_using_the_compile_and_runtime_configurations
[4] https://docs.gradle.org/6.9.2/dsl/org.gradle.api.tasks.bundling.AbstractArchiveTask.html?_gl=1*1ldn5ls*_ga*MjQ4NTk5ODI5LjE2ODc0MDc0OTA.*_ga_7W7NC6YNPT*MTcwNzEyMzc1Ni4zLjEuMTcwNzEyODY1My40NS4wLjA.#org.gradle.api.tasks.bundling.AbstractArchiveTask:extension
[5] https://github.com/alenkacz/gradle-scalafmt/issues/39

Reviewed-on: http://gerrit.cloudera.org:8080/21030
Tested-by: Kudu Jenkins
Reviewed-by: Attila Bukor <[email protected]>
(cherry picked from commit e456bc775805d4555d99ce12d04e3ca0b8950760)

Change-Id: I96aac6a39f317eec6ff4715ef8367f4a3546d5e1
---
M .gitignore
M RELEASING.adoc
M docker/bootstrap-dev-env.sh
M docs/kudu_impala_integration.adoc
M docs/schema_design.adoc
M java/.scalafmt.conf
M java/build.gradle
M java/buildSrc/build.gradle
M java/buildSrc/src/main/groovy/org/apache/kudu/gradle/DistTestTask.java
M java/gradle/artifacts.gradle
M java/gradle/dependencies.gradle
M java/gradle/publishing.gradle
M java/gradle/quality.gradle
D java/gradle/scopes.gradle
M java/gradle/shadow.gradle
M java/gradle/wrapper/gradle-wrapper.properties
M java/gradlew
M java/kudu-backup-common/build.gradle
M java/kudu-backup-tools/build.gradle
M java/kudu-backup/build.gradle
M java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala
M java/kudu-client/build.gradle
M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduScanner.java
M java/kudu-hive/build.gradle
M java/kudu-hive/src/test/java/org/apache/kudu/hive/metastore/TestKuduMetastorePlugin.java
M java/kudu-jepsen/build.gradle
M java/kudu-proto/build.gradle
M java/kudu-spark-tools/build.gradle
M java/kudu-spark/build.gradle
M java/kudu-subprocess/build.gradle
M java/kudu-test-utils/build.gradle
M src/kudu/cfile/bloomfile.cc
M src/kudu/cfile/bloomfile.h
M src/kudu/cfile/cfile_util.cc
M src/kudu/cfile/cfile_util.h
M src/kudu/cfile/cfile_writer.cc
M src/kudu/cfile/cfile_writer.h
M src/kudu/client/client-internal.cc
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/scan_token-test.cc
M src/kudu/common/schema.h
M src/kudu/consensus/leader_election.cc
M src/kudu/consensus/leader_election.h
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/fs/block_manager-stress-test.cc
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/file_block_manager.h
M src/kudu/fs/log_block_manager-test.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/fs/log_block_manager.h
M src/kudu/hms/CMakeLists.txt
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/tablet_copy-itest.cc
M src/kudu/rpc/proxy.cc
M src/kudu/rpc/proxy.h
M src/kudu/rpc/rpc-bench.cc
M src/kudu/rpc/rpc_stub-test.cc
M src/kudu/scripts/assign-location.py
M src/kudu/security/test/test_certs.cc
M src/kudu/tablet/cbtree-test.cc
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/ops/op_driver.cc
M src/kudu/tablet/tablet_metrics.cc
M src/kudu/tablet/tablet_metrics.h
M src/kudu/tablet/tablet_replica-test.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_table.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_client.h
M src/kudu/tserver/tablet_copy_service.cc
M src/kudu/tserver/tablet_copy_source_session.cc
M src/kudu/tserver/tablet_copy_source_session.h
A src/kudu/util/atomic-utils.h
M src/kudu/util/bloom_filter.h
M src/kudu/util/metrics.h
M thirdparty/build-thirdparty.sh
79 files changed, 2,481 insertions(+), 1,296 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/73/21773/1
--
To view, visit http://gerrit.cloudera.org:8080/21773
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.17.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: I96aac6a39f317eec6ff4715ef8367f4a3546d5e1
Gerrit-Change-Number: 21773
Gerrit-PatchSet: 1
Gerrit-Owner: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Mahesh Reddy <[email protected]>
