Abhishek Chennaka has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21767 )
Change subject: Squash cherry-picked commits #4 This is a combination of 20 commits.
......................................................................

Squash cherry-picked commits #4
This is a combination of 20 commits.

This is the 1st commit message:

[thirdparty] upgrade gperftools to 2.13

With gperftools 2.13 come many updates since 2.8.1. At least, as a
cumulative update from 2.11rc2 (a.k.a. 2.10.80 pre-release) comes the
support for the generic_fp stack collection method on the aarch64
platform [1], so now there is parity between x86_64 and aarch64 builds
in terms of available options for stack trace collection in gperftools.
Also, the libgcc-based stacktrace capturing is more robust with the new
version of gperftools.

In addition, version 2.13 includes many other fixes since 2.8.1, for
details see [2].

With the new version, by default gperftools is built with the
'-fno-omit-frame-pointer -momit-leaf-frame-pointer' flag combination
if supported by the platform, so I removed the --enable-frame-pointers
flag for the configure script.

The gperftools-tcmalloc-osx-fix.patch patch is no longer needed since
it's automatically "included" into the gperftools source code.

[1] https://github.com/gperftools/gperftools/releases/tag/gperftools-2.10.80
[2] https://github.com/gperftools/gperftools/releases

Reviewed-on: http://gerrit.cloudera.org:8080/20545
Tested-by: Kudu Jenkins
Reviewed-by: Marton Greber <[email protected]>
Reviewed-by: Abhishek Chennaka <[email protected]>
(cherry picked from commit 8e790439d57e939ef3dc992b327ae3d1ac18b0fa)

This is the commit message #2:

[CLI] Set rpc_max_message_size to accommodate huge response payloads

It has been observed that for certain Kudu CLI commands the response
payload size is too big to fit in the current default RPC message size
limit of 50MiB. This patch adds logic to set the value of RPC message
max size for Kudu CLI based on maximum available memory or maximum
possible RPC message size limit of 2GiB.
This would help accommodate that extra heavy response payload.

Along with increasing the default value of rpc_max_message_size, the
default value of tablet_transaction_memory_limit_mb is also set to at
least the same value as rpc_max_message_size to pass the group
validation check. This change of default flag values is only applicable
to the Kudu CLI tool.

Reviewed-on: http://gerrit.cloudera.org:8080/20535
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin <[email protected]>
(cherry picked from commit 49e90d15eaaa38dcee2752d06e4f44d1f231fd94)

This is the commit message #3:

Remove unused parameter

Parameter is_leader is not used in the implementation of function
Status Tablet::DecodeWriteOperations, therefore remove it.

Reviewed-on: http://gerrit.cloudera.org:8080/20565
Reviewed-by: Alexey Serbin <[email protected]>
Tested-by: Alexey Serbin <[email protected]>
(cherry picked from commit fcd311964c12cfac42043b985f929c30976135eb)

This is the commit message #4:

[rpc] add local address into connection negotiation trace

I found it would be easier for RPC connection negotiation
troubleshooting to have the information on both the remote and the
local addresses of the connection in the negotiation trace. That would
help to remove ambiguity when there are multiple Kudu clients running
at the same node from which RPC connections are initiated, so now it's
easier to find corresponding pairs of negotiation traces in client- and
server-side logs.
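As an illustration of the pairing that commit message #4 describes, here is a minimal sketch that groups trace lines by their (client address, server address) pair. It assumes trace lines of the form quoted in that commit message; the regular expression and the helper names `endpoints` and `pair_traces` are hypothetical and are not part of Kudu.

```python
import re

# Sample lines in the form quoted in commit message #4.
CLIENT_LINE = ("1011 19:59:40.775119 (+ 0us) reactor.cc:618] Submitting "
               "negotiation task for client connection to "
               "127.206.101.1:46339 (local address 127.0.0.1:47450)")
SERVER_LINE = ("1011 19:59:40.775135 (+ 0us) reactor.cc:618] Submitting "
               "negotiation task for server connection from "
               "127.0.0.1:47450 (local address 127.206.101.1:46339)")

# Hypothetical pattern for the trace text above (an assumption, not a Kudu API).
TRACE_RE = re.compile(
    r"Submitting negotiation task for (client|server) connection "
    r"(?:to|from) (\S+) \(local address (\S+)\)"
)


def endpoints(line):
    """Return (role, client_addr, server_addr) for a trace line, or None."""
    m = TRACE_RE.search(line)
    if m is None:
        return None
    role, remote, local = m.groups()
    # Client side: local is the client, remote is the server.
    # Server side: remote is the client, local is the server.
    if role == "client":
        return role, local, remote
    return role, remote, local


def pair_traces(lines):
    """Group trace lines by (client_addr, server_addr)."""
    pairs = {}
    for line in lines:
        parsed = endpoints(line)
        if parsed is None:
            continue  # not a negotiation trace line
        role, client_addr, server_addr = parsed
        pairs.setdefault((client_addr, server_addr), {})[role] = line
    return pairs
```

Calling `pair_traces([CLIENT_LINE, SERVER_LINE])` puts both lines under one key, which is only possible because each side now logs its local address as well as the remote one.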
I didn't add any tests, but I manually verified that the information
on the local address is present in the connection negotiation traces,
for example:

  1011 19:59:40.775119 (+ 0us) reactor.cc:618] Submitting negotiation task for client connection to 127.206.101.1:46339 (local address 127.0.0.1:47450)

  1011 19:59:40.775135 (+ 0us) reactor.cc:618] Submitting negotiation task for server connection from 127.0.0.1:47450 (local address 127.206.101.1:46339)

Reviewed-on: http://gerrit.cloudera.org:8080/20564
Tested-by: Kudu Jenkins
Reviewed-by: Abhishek Chennaka <[email protected]>
(cherry picked from commit ad32f5c34734f88d11029dba0ced57979dfc29b9)

This is the commit message #5:

[Python] KUDU-3351 Add write op metrics

This is a follow-up patch for commit:
0ddcaaabc97c85a4715ae79ff5604feb9b342779, adding per-session write op
metrics to the Python client.

In the test function "test_insert_and_mutate_rows" I added verification
function calls, to check that the metrics are gathered properly.
Only "upsert_ignore_errors" is left out, as UPSERT IGNORE is not yet
supported by the Python client. I'm planning to address that in the
next patch.

Reviewed-on: http://gerrit.cloudera.org:8080/20526
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin <[email protected]>
(cherry picked from commit c02ad2fe699c7607d6fbe52d2c7e73ca2313d36e)

This is the commit message #6:

[Python] KUDU-3353 add support for UPSERT IGNORE

This patch is a follow-up to commit:
ec3a9f75b6924a70ecbf08e3805228ad9b92b9f0, it adds UPSERT IGNORE support
to the Python client.

Extended the already existing tests:
* added write op metrics verification for immutable column tests,
* extended immutable column tests with UPSERT IGNORE test,
* addressed an UPSERT IGNORE TODO in the auto-incrementing column
  tests.
Reviewed-on: http://gerrit.cloudera.org:8080/20527
Tested-by: Kudu Jenkins
Reviewed-by: Yingchun Lai <[email protected]>
(cherry picked from commit 1b015d7ee9cc8af262f398961cf471f76bb225ea)

This is the commit message #7:

[tablet] a small clean up on 'disable_compaction'

While changing the code in the Tablet class, I found the naming
of the methods related to the 'disable_compaction' property
(a) confusing and (b) non-compliant with the project's style guide.
This patch addresses the issue. I also extended the test coverage
for the related functionality a little bit.

Reviewed-on: http://gerrit.cloudera.org:8080/20577
Tested-by: Alexey Serbin <[email protected]>
Reviewed-by: KeDeng <[email protected]>
Reviewed-by: Yifan Zhang <[email protected]>
(cherry picked from commit 4814bb2be2809e56bb9eda58c727ce8970ac9f0f)

This is the commit message #8:

[thirdparty] upgrade bison up to 3.8.2

This patch updates the bison library in Kudu's 3rd-party from 3.5.4
to 3.8.2 version. The rationale behind the upgrade was getting
the following build error on macOS when building with clang 15.0.0
which comes with Xcode 15.3:

  kudu/thirdparty/src/bison-3.5.4/lib/obstack.c:351:31: error: incompatible function pointer types initializing 'void (*)(void) __attribute__((noreturn))' with an expression of type 'void (void)' [-Wincompatible-function-pointer-types]
  __attribute_noreturn__ void (*obstack_alloc_failed_handler) (void)
                                ^
  1 error generated.
  make[2]: *** [lib/libbison_a-obstack.o] Error 1

Reviewed-on: http://gerrit.cloudera.org:8080/21192
Tested-by: Alexey Serbin <[email protected]>
Reviewed-by: Abhishek Chennaka <[email protected]>
(cherry picked from commit 96a1770ad9f9be90fa9ce81eeb69561a18ca2da4)

This is the commit message #9:

[test] fix negotiation-test for krb5 version 1.19.2

TestNegotiation.TestPreflight was failing on Ubuntu 22.04.3 LTS because
of the error message mismatch:

  src/kudu/rpc/negotiation-test.cc:1533: Failure
  Value of: s.ToString()
  Expected: contains regular expression "error accessing keytab: Permission denied "
    Actual: "Runtime error: GSSAPI Error: No credentials were supplied, or the credentials were unavailable or inaccessible (Permission denied)"

This patch addresses the issue, making the reference error message
pattern more generic.

Reviewed-on: http://gerrit.cloudera.org:8080/20599
Tested-by: Alexey Serbin <[email protected]>
Reviewed-by: Abhishek Chennaka <[email protected]>
(cherry picked from commit 060bb39b3549a3ebac99ad98cbbe72a87e834bfb)

This is the commit message #10:

[server] cleanup on setting JWT verifier for messenger

This patch doesn't contain any functional modifications.

Reviewed-on: http://gerrit.cloudera.org:8080/20608
Reviewed-by: Abhishek Chennaka <[email protected]>
Tested-by: Alexey Serbin <[email protected]>
(cherry picked from commit 9c68353839efcbf639cffbd22f65b6d3855d3463)

This is the commit message #11:

[rpc] add hostname in Messenger to use it in ServerNegotiation

This patch introduces a new 'hostname' property for the Messenger class
to use it in ServerNegotiation. By doing so, a call to GetFQDN() is
avoided during RPC connection negotiation at the server side. A call to
GetFQDN() might be quite expensive since sometimes it turns into a
remote call, and if DNS resolver is slow or misconfigured, the
round-trip might take several seconds.
By avoiding calls to GetFQDN() as part of
ServerNegotiation::InitSaslServer(), the server-side negotiation is now
more robust at least when using Kerberos credentials for RPC
authentication.

The hostname (usually FQDN) is retrieved as a part of KuduServer::Init()
and stored in the messenger. It's used later on to set the 'hostname'
attribute for the server metrics and also when running server-side RPC
connection negotiation. Of course, this is done under the assumption
that the name of a node isn't changing under a running Kudu server.

This patch also contains a few test scenarios to cover the new
functionality. Existing {TabletServerTest,MasterTest}.ServerAttributes
test scenarios have been updated accordingly.

Reviewed-on: http://gerrit.cloudera.org:8080/20609
Tested-by: Kudu Jenkins
Reviewed-by: Abhishek Chennaka <[email protected]>
(cherry picked from commit 3932a7780b8005b7204b3058302600ea67cd1f2b)

This is the commit message #12:

[tablet] fix compilation warnings

This patch fixes the following compilation warnings on macOS:

-----
  src/kudu/tablet/compaction-test.cc:281:53: warning: format specifies type 'long' but the argument has type 'int64_t' (aka 'long long') [-Wformat]
      snprintf(keybuf, sizeof(keybuf), kRowKeyFormat, row_key);
-----
  src/kudu/tablet/ops/op_driver.cc:82:16: warning: 'OpCompleted' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]
    virtual void OpCompleted() {
                 ^
  src/kudu/tablet/ops/op.h:338:16: note: overridden virtual function is here
    virtual void OpCompleted();
                 ^

Reviewed-on: http://gerrit.cloudera.org:8080/20619
Tested-by: Kudu Jenkins
Reviewed-by: Yifan Zhang <[email protected]>
Reviewed-by: Mahesh Reddy <[email protected]>
(cherry picked from commit 4a1ce1b0225d3416dcd753b915c3f4b3d65056e5)

This is the commit message #13:

Fix the missing parameterized unit test

Reviewed-on: http://gerrit.cloudera.org:8080/20656
Tested-by: Yingchun Lai <[email protected]>
Reviewed-by: Yifan Zhang <[email protected]>
(cherry picked from commit 0ce3e14f4d9d2f70cfa87952fb83df1e2f619945)

This is the commit message #14:

KUDU-3519: List masters in /dump-entities

Tablet servers are already listed on the /dump-entities master
endpoint, but the masters are missing. To allow easily accessing master
addresses and UUIDs in a parsable format, this commit adds the master
info as well.

Reviewed-on: http://gerrit.cloudera.org:8080/20628
Reviewed-by: Marton Greber <[email protected]>
Tested-by: Kudu Jenkins
Reviewed-by: Wang Xixu <[email protected]>
Reviewed-by: Abhishek Chennaka <[email protected]>
(cherry picked from commit 7be030baee38ea971f5e3afddcff4c61cbe2acf3)

This is the commit message #15:

[compaction] Skip memory allocation for ancient undo deltas

Currently, while applying REDO mutations to the base row and creating
the corresponding UNDO delta for each REDO mutation, the compaction
code doesn't check whether a mutation is ancient, that is, one that is
going to be ignored and removed from the list of UNDO deltas at a later
stage of processing anyway.

This change checks each REDO mutation beforehand and doesn't allocate
any memory if the mutation is found to be ancient. This avoids
unnecessary memory usage.

Reviewed-on: http://gerrit.cloudera.org:8080/20546
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin <[email protected]>
(cherry picked from commit 3bdaf50b5565bb9f8b4a0652f94102c7f272ebb1)

This is the commit message #16:

[util] fix compilation warning

This patch fixes a compilation warning produced by GCC:

  src/kudu/util/env_posix.cc:484:22: warning: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘kudu::{anonymous}::EncryptionAlgorithm’ [-Wformat=]
      StringPrintf("no cipher for algorithm 0x%02x", eh->algorithm));

I also wrapped the results of GetEVPCipher() into the PREDICT_FALSE()
macro.
Reviewed-on: http://gerrit.cloudera.org:8080/20671
Reviewed-by: Yifan Zhang <[email protected]>
Tested-by: Kudu Jenkins
Reviewed-by: Mahesh Reddy <[email protected]>
(cherry picked from commit ee9e0f04b8e9163898c032edeac65d143cc5195f)

This is the commit message #17:

[master] fix typo in MasterAddrsToCsv()

I saw a funny log message, and it turned out to be a typo:

  Unable to fetch master addresses: OK

This patch addresses the issue.

Reviewed-on: http://gerrit.cloudera.org:8080/20670
Tested-by: Alexey Serbin <[email protected]>
Reviewed-by: Mahesh Reddy <[email protected]>
Reviewed-by: Abhishek Chennaka <[email protected]>
(cherry picked from commit aee98702cb8131f92ae18f9d6862ddc618145bdb)

This is the commit message #18:

[tools] KUDU-3337 Add unsafe_create_cmeta tool

We've seen some cases when a power outage on XFS led to empty cmeta
files, causing some tablets to fail to start (KUDU-2195). There is a
flag to force fsync, but it's disabled by default except for XFS.

Fortunately, it's possible to reconstruct what a cmeta should look like
based on the information found in ksck (peers) and WAL dumps (term and
config index). Still, the only way to actually create a cmeta file, even
if this information is available, was to copy an existing cmeta file and
run "kudu pbc edit" on it, which is very error-prone and hard to
automate.

This commit introduces a new unsafe_create_cmeta tool under
local_replica, which creates a new cmeta file based on the term, config
index and peers as provided in CLI arguments.

I manually tested this tool by using it to recover a tablet with three
empty cmeta files.

Reviewed-on: http://gerrit.cloudera.org:8080/18029
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin <[email protected]>
(cherry picked from commit 13a66ea9b088eec1de74249b738cc74333eefc4a)

This is the commit message #19:

KUDU-3498 Scanner keeps alive in periodically

Kudu caches the scanner in the tablet server for continuing reading.
The scanner expires if its idle time exceeds the configured scanner
TTL. Sometimes the client reads a batch of data and, if the batch is
very large, takes a long time to process it. When the client then reads
the next batch using the same scanner, the scanner may have already
expired even though the client has sent a keep-alive request.

This patch adds support for keeping a scanner alive periodically. It
uses a timer to send keep-alive requests in the background, so the
scanner never expires while it is in use.

Reviewed-on: http://gerrit.cloudera.org:8080/20282
Tested-by: Alexey Serbin <[email protected]>
Reviewed-by: Alexey Serbin <[email protected]>
(cherry picked from commit 8644d88dae6a76c5df3595b8a5aeb13df4d6ab5c)

This is the commit message #20:

Fix typo

Reviewed-on: http://gerrit.cloudera.org:8080/20704
Reviewed-by: Zoltan Martonka <[email protected]>
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin <[email protected]>
(cherry picked from commit fdaf70671ea87d2ba15dbaecf98a0a447a194834)

Change-Id: I4618dbb2240d9baece1e6f47f9c5629e469d56f9
---
M python/kudu/__init__.py
M python/kudu/client.pyx
M python/kudu/libkudu_client.pxd
M python/kudu/tests/common.py
M python/kudu/tests/test_client.py
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/scanner-internal.cc
M src/kudu/client/scanner-internal.h
M src/kudu/integration-tests/alter_table-test.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/master/master-test.cc
M src/kudu/master/master_path_handlers.cc
M src/kudu/rpc/messenger.cc
M src/kudu/rpc/messenger.h
M src/kudu/rpc/negotiation-test.cc
M src/kudu/rpc/negotiation.cc
M src/kudu/rpc/reactor.cc
M src/kudu/rpc/rpc-test-base.h
M src/kudu/security/tls_handshake.cc
M src/kudu/server/server_base.cc
M src/kudu/tablet/compaction-test.cc
M src/kudu/tablet/compaction.cc
M src/kudu/tablet/compaction.h
M src/kudu/tablet/delta_compaction.cc
M src/kudu/tablet/ops/op_driver.cc
M src/kudu/tablet/ops/write_op.cc
M src/kudu/tablet/tablet-test-base.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_history_gc-test.cc
M src/kudu/tablet/tablet_mm_ops-test.cc
M src/kudu/tablet/tablet_mm_ops.cc
M src/kudu/tablet/tablet_mm_ops.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_local_replica.cc
M src/kudu/tools/tool_main.cc
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/util/env_posix.cc
M src/kudu/util/process_memory.cc
M src/kudu/util/process_memory.h
M thirdparty/build-definitions.sh
M thirdparty/download-thirdparty.sh
M thirdparty/patches/gperftools-Replace-namespace-base-with-namespace-tcmalloc.patch
D thirdparty/patches/gperftools-tcmalloc-osx-fix.patch
M thirdparty/vars.sh
47 files changed, 1,261 insertions(+), 1,156 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/21767/1
--
To view, visit http://gerrit.cloudera.org:8080/21767
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.17.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: I4618dbb2240d9baece1e6f47f9c5629e469d56f9
Gerrit-Change-Number: 21767
Gerrit-PatchSet: 1
Gerrit-Owner: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
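For readers unfamiliar with the pattern in commit message #19, the following is a minimal sketch of a timer-driven background keep-alive. It is not the Kudu client code (the actual change is in the C++ client); the class name `PeriodicKeepAlive` and the `send_keep_alive` callback are hypothetical stand-ins for whatever sends the keep-alive request for an open scanner.

```python
import threading


class PeriodicKeepAlive:
    """Calls `send_keep_alive` every `interval_s` seconds until stopped.

    Hypothetical illustration only; not part of the Kudu client API.
    """

    def __init__(self, send_keep_alive, interval_s):
        self._send = send_keep_alive
        self._interval_s = interval_s
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        """Begin sending keep-alives in a background thread."""
        self._thread.start()

    def stop(self):
        """Stop the background thread and wait for it to exit."""
        self._stop.set()
        self._thread.join()

    def _run(self):
        # Event.wait() returns False on timeout and True once stop() is
        # called, so the loop sends one keep-alive per elapsed interval.
        while not self._stop.wait(self._interval_s):
            self._send()
```

In this sketch the interval would be chosen well below the scanner TTL, so that a client busy processing a large batch still refreshes the scanner before it goes idle for too long; `stop()` would be called when the scanner is closed.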
