[kudu-CR] [rpc] introduce rpc listened socket rx queue size metric
Alexey Serbin has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/20908 ) Change subject: [rpc] introduce rpc_listened_socket_rx_queue_size metric .. [rpc] introduce rpc_listened_socket_rx_queue_size metric This patch introduces a new 'rpc_listened_socket_rx_queue_size' histogram metric for AcceptorPool. The metric allows for tracking the size of the listened RPC socket's RX queue. The new metric shows meaningful numbers only on Linux since it's based on the DiagnosticSocket, where the latter is implemented only on Linux as of now. The new metric is sampled by each acceptor thread when accepting an RPC connection. It's possible to change the frequency of the sampling (completely disabling it, if necessary) by tuning the --rpc_listen_socket_stats_every_log2 flag. I added basic tests scenarios to cover the newly introduced functionality. In addition, an extra performance test scenario has been added into rpc-bench. The new scenario is to measure an extra latency introduced by capturing diagnostic snapshots on the listening RPC socket. Some results are below (3 measurement at each setting), and based on these I think that setting --rpc_listen_socket_stats_every_log2=3 by default makes sense: this about 1.2us of latency in average per RPC request. The numbers are in microseconds, so the overall latency of handling RPC requests and the sustainable RPC rate do not seem to be adversely affected by this patch. If anybody finds otherwise, they can always set --rpc_listen_socket_stats_every_log2=-1 for their Kudu cluster if they don't care about the listening socket's backlog stats. The results below have been captured when running the command below for N in the set of { -1, 0, 3, 5 }: ./rpc-bench --gtest_filter='*RpcAcceptorBench*' \ --client_threads=2 \ --rpc_listen_socket_stats_every_log2=N --- collecting diagnostics on the listening RPC socket ... is disabled Dispatched 99851 connection requests in 1 seconds Request dispatching time (us): min 0 max 48 average 2.4426495478262611 Dispatched 98651 connection requests in 1 seconds Request dispatching time (us): min 0 max 54 average 2.4904663916229941 Dispatched 99200 connection requests in 1 seconds Request dispatching time (us): min 0 max 53 average 2.4747076612903225 --- collecting diagnostics on the listening RPC socket ... every 1 connection(s) Dispatched 65383 connection requests in 1 seconds Request dispatching time (us): min 6 max 208 average 11.071424201639545 Dispatched 65162 connection requests in 1 seconds Request dispatching time (us): min 6 max 392 average 11.258256092320913 Dispatched 65428 connection requests in 1 seconds Request dispatching time (us): min 6 max 290 average 11.209445208619899 --- collecting diagnostics on the listening RPC socket ... every 8 connection(s) Dispatched 99902 connection requests in 1 seconds Request dispatching time (us): min 0 max 148 average 3.628295729815219 Dispatched 98139 connection requests in 1 seconds Request dispatching time (us): min 0 max 101 average 3.6546429044518489 Dispatched 101681 connection requests in 1 seconds Request dispatching time (us): min 0 max 98 average 3.549552030369489 --- collecting diagnostics on the listening RPC socket ... every 32 connection(s) Dispatched 100832 connection requests in 1 seconds Request dispatching time (us): min 0 max 114 average 2.727457553157727 Dispatched 100214 connection requests in 1 seconds Request dispatching time (us): min 0 max 71 average 2.8103658171512964 Dispatched 99106 connection requests in 1 seconds Request dispatching time (us): min 0 max 52 average 2.7968841442495913 Change-Id: I83580659bac39d9171f1ee0d0e88676ed0d50b99 Reviewed-on: http://gerrit.cloudera.org:8080/20908 Tested-by: Alexey Serbin Reviewed-by: Yingchun Lai --- M src/kudu/rpc/acceptor_pool.cc M src/kudu/rpc/acceptor_pool.h M src/kudu/rpc/negotiation.cc M src/kudu/rpc/rpc-bench.cc M src/kudu/rpc/rpc-test-base.h M src/kudu/rpc/rpc-test.cc M src/kudu/util/CMakeLists.txt M src/kudu/util/net/diagnostic_socket.cc M src/kudu/util/net/diagnostic_socket.h M src/kudu/util/net/socket.cc M src/kudu/util/net/socket.h 11 files changed, 500 insertions(+), 13 deletions(-) Approvals: Alexey Serbin: Verified Yingchun Lai: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/20908 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I83580659bac39d9171f1ee0d0e88676ed0d50b99 Gerrit-Change-Number: 20908
[kudu-CR] [rpc] introduce rpc listened socket rx queue size metric
Yingchun Lai has posted comments on this change. ( http://gerrit.cloudera.org:8080/20908 ) Change subject: [rpc] introduce rpc_listened_socket_rx_queue_size metric .. Patch Set 5: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/20908/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20908/3//COMMIT_MSG@28 PS3, Line 28: 3 > Yes, Abhishek is correct. I got it. 3 is OK to me. http://gerrit.cloudera.org:8080/#/c/20908/3/src/kudu/rpc/rpc-test.cc File src/kudu/rpc/rpc-test.cc: http://gerrit.cloudera.org:8080/#/c/20908/3/src/kudu/rpc/rpc-test.cc@1908 PS3, Line 1908: > Ah, indeed -- the idea was to skip testing Unix sockets, right. I correcte Remove ”getaddrinfo“ is OK to me. -- To view, visit http://gerrit.cloudera.org:8080/20908 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I83580659bac39d9171f1ee0d0e88676ed0d50b99 Gerrit-Change-Number: 20908 Gerrit-PatchSet: 5 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Yingchun Lai Gerrit-Comment-Date: Thu, 25 Jan 2024 06:21:22 + Gerrit-HasComments: Yes
[kudu-CR] [rpc] introduce rpc listened socket rx queue size metric
Alexey Serbin has removed a vote on this change. Change subject: [rpc] introduce rpc_listened_socket_rx_queue_size metric .. Removed Verified-1 by Kudu Jenkins (120) -- To view, visit http://gerrit.cloudera.org:8080/20908 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: I83580659bac39d9171f1ee0d0e88676ed0d50b99 Gerrit-Change-Number: 20908 Gerrit-PatchSet: 5 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Yingchun Lai
[kudu-CR] [rpc] introduce rpc listened socket rx queue size metric
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/20908 ) Change subject: [rpc] introduce rpc_listened_socket_rx_queue_size metric .. Patch Set 5: Verified+1 unrelated test failure in RaftConsensusITest.TestSlowFollower -- To view, visit http://gerrit.cloudera.org:8080/20908 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I83580659bac39d9171f1ee0d0e88676ed0d50b99 Gerrit-Change-Number: 20908 Gerrit-PatchSet: 5 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Yingchun Lai Gerrit-Comment-Date: Thu, 25 Jan 2024 05:43:52 + Gerrit-HasComments: No
[kudu-CR] [rpc] introduce rpc listened socket rx queue size metric
Hello Yingchun Lai, Attila Bukor, Kudu Jenkins, Abhishek Chennaka, Wang Xixu, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20908 to look at the new patch set (#5). Change subject: [rpc] introduce rpc_listened_socket_rx_queue_size metric .. [rpc] introduce rpc_listened_socket_rx_queue_size metric This patch introduces a new 'rpc_listened_socket_rx_queue_size' histogram metric for AcceptorPool. The metric allows for tracking the size of the listened RPC socket's RX queue. The new metric shows meaningful numbers only on Linux since it's based on the DiagnosticSocket, where the latter is implemented only on Linux as of now. The new metric is sampled by each acceptor thread when accepting an RPC connection. It's possible to change the frequency of the sampling (completely disabling it, if necessary) by tuning the --rpc_listen_socket_stats_every_log2 flag. I added basic tests scenarios to cover the newly introduced functionality. In addition, an extra performance test scenario has been added into rpc-bench. The new scenario is to measure an extra latency introduced by capturing diagnostic snapshots on the listening RPC socket. Some results are below (3 measurement at each setting), and based on these I think that setting --rpc_listen_socket_stats_every_log2=3 by default makes sense: this about 1.2us of latency in average per RPC request. The numbers are in microseconds, so the overall latency of handling RPC requests and the sustainable RPC rate do not seem to be adversely affected by this patch. If anybody finds otherwise, they can always set --rpc_listen_socket_stats_every_log2=-1 for their Kudu cluster if they don't care about the listening socket's backlog stats. The results below have been captured when running the command below for N in the set of { -1, 0, 3, 5 }: ./rpc-bench --gtest_filter='*RpcAcceptorBench*' \ --client_threads=2 \ --rpc_listen_socket_stats_every_log2=N --- collecting diagnostics on the listening RPC socket ... is disabled Dispatched 99851 connection requests in 1 seconds Request dispatching time (us): min 0 max 48 average 2.4426495478262611 Dispatched 98651 connection requests in 1 seconds Request dispatching time (us): min 0 max 54 average 2.4904663916229941 Dispatched 99200 connection requests in 1 seconds Request dispatching time (us): min 0 max 53 average 2.4747076612903225 --- collecting diagnostics on the listening RPC socket ... every 1 connection(s) Dispatched 65383 connection requests in 1 seconds Request dispatching time (us): min 6 max 208 average 11.071424201639545 Dispatched 65162 connection requests in 1 seconds Request dispatching time (us): min 6 max 392 average 11.258256092320913 Dispatched 65428 connection requests in 1 seconds Request dispatching time (us): min 6 max 290 average 11.209445208619899 --- collecting diagnostics on the listening RPC socket ... every 8 connection(s) Dispatched 99902 connection requests in 1 seconds Request dispatching time (us): min 0 max 148 average 3.628295729815219 Dispatched 98139 connection requests in 1 seconds Request dispatching time (us): min 0 max 101 average 3.6546429044518489 Dispatched 101681 connection requests in 1 seconds Request dispatching time (us): min 0 max 98 average 3.549552030369489 --- collecting diagnostics on the listening RPC socket ... every 32 connection(s) Dispatched 100832 connection requests in 1 seconds Request dispatching time (us): min 0 max 114 average 2.727457553157727 Dispatched 100214 connection requests in 1 seconds Request dispatching time (us): min 0 max 71 average 2.8103658171512964 Dispatched 99106 connection requests in 1 seconds Request dispatching time (us): min 0 max 52 average 2.7968841442495913 Change-Id: I83580659bac39d9171f1ee0d0e88676ed0d50b99 --- M src/kudu/rpc/acceptor_pool.cc M src/kudu/rpc/acceptor_pool.h M src/kudu/rpc/negotiation.cc M src/kudu/rpc/rpc-bench.cc M src/kudu/rpc/rpc-test-base.h M src/kudu/rpc/rpc-test.cc M src/kudu/util/CMakeLists.txt M src/kudu/util/net/diagnostic_socket.cc M src/kudu/util/net/diagnostic_socket.h M src/kudu/util/net/socket.cc M src/kudu/util/net/socket.h 11 files changed, 500 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/08/20908/5 -- To view, visit http://gerrit.cloudera.org:8080/20908 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I83580659bac39d9171f1ee0d0e88676ed0d50b99 Gerrit-Change-Number: 20908
[kudu-CR] [rpc] introduce rpc listened socket rx queue size metric
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/20908 ) Change subject: [rpc] introduce rpc_listened_socket_rx_queue_size metric .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/20908/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20908/3//COMMIT_MSG@28 PS3, Line 28: 3 > I guess the rationale behind that would be 32 connections might be too in-f Yes, Abhishek is correct. And wth --rpc_listen_socket_stats_every_log2=10 it's even less extra latency added (i.e. sampling every 1024 accepted connections), of course :) However, the idea was to find a value that would allow for having meaningful data for the new metric, while not adding too much latency. After some consideration, I decided to set the default value of the new flag to 3. Setting it to 5 would mean the sampling might give us values with deltas up to of 32 from the actual queue length, and I thought that deviation was too high. Of course, if there was some chronic issue with getting to many incoming connections requests, it would be possible to spot that the RX queue had been at capacity even with such low sampling frequency, but the idea was to provide some meaningful metric for non-pathological cases as well. However, I can change this if you feel strongly about this. http://gerrit.cloudera.org:8080/#/c/20908/3/src/kudu/rpc/rpc-test.cc File src/kudu/rpc/rpc-test.cc: http://gerrit.cloudera.org:8080/#/c/20908/3/src/kudu/rpc/rpc-test.cc@1908 PS3, Line 1908: UnixSocket > UnixSocket will not be tested? Ah, indeed -- the idea was to skip testing Unix sockets, right. I corrected this. Or you actually meant to enable this test for Unix sockets as well? -- To view, visit http://gerrit.cloudera.org:8080/20908 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I83580659bac39d9171f1ee0d0e88676ed0d50b99 Gerrit-Change-Number: 20908 Gerrit-PatchSet: 3 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Yingchun Lai Gerrit-Comment-Date: Thu, 25 Jan 2024 02:23:08 + Gerrit-HasComments: Yes
[kudu-CR] [rpc] introduce rpc listened socket rx queue size metric
Hello Yingchun Lai, Attila Bukor, Kudu Jenkins, Abhishek Chennaka, Wang Xixu, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20908 to look at the new patch set (#4). Change subject: [rpc] introduce rpc_listened_socket_rx_queue_size metric .. [rpc] introduce rpc_listened_socket_rx_queue_size metric This patch introduces a new 'rpc_listened_socket_rx_queue_size' histogram metric for AcceptorPool. The metric allows for tracking the size of the listened RPC socket's RX queue. The new metric shows meaningful numbers only on Linux since it's based on the DiagnosticSocket, where the latter is implemented only on Linux as of now. The new metric is sampled by each acceptor thread when accepting an RPC connection. It's possible to change the frequency of the sampling (completely disabling it, if necessary) by tuning the --rpc_listen_socket_stats_every_log2 flag. I added basic tests scenarios to cover the newly introduced functionality. In addition, an extra performance test scenario has been added into rpc-bench. The new scenario is to measure an extra latency introduced by capturing diagnostic snapshots on the listening RPC socket. Some results are below (3 measurement at each setting), and based on these I think that setting --rpc_listen_socket_stats_every_log2=3 by default makes sense: this about 1.2us of latency in average per RPC request. The numbers are in microseconds, so the overall latency of handling RPC requests and the sustainable RPC rate do not seem to be adversely affected by this patch. If anybody finds otherwise, they can always set --rpc_listen_socket_stats_every_log2=-1 for their Kudu cluster if they don't care about the listening socket's backlog stats. The results below have been captured when running the command below for N in the set of { -1, 0, 3, 5 }: ./rpc-bench --gtest_filter='*RpcAcceptorBench*' \ --client_threads=2 \ --rpc_listen_socket_stats_every_log2=N --- collecting diagnostics on the listening RPC socket ... is disabled Dispatched 99851 connection requests in 1 seconds Request dispatching time (us): min 0 max 48 average 2.4426495478262611 Dispatched 98651 connection requests in 1 seconds Request dispatching time (us): min 0 max 54 average 2.4904663916229941 Dispatched 99200 connection requests in 1 seconds Request dispatching time (us): min 0 max 53 average 2.4747076612903225 --- collecting diagnostics on the listening RPC socket ... every 1 connection(s) Dispatched 65383 connection requests in 1 seconds Request dispatching time (us): min 6 max 208 average 11.071424201639545 Dispatched 65162 connection requests in 1 seconds Request dispatching time (us): min 6 max 392 average 11.258256092320913 Dispatched 65428 connection requests in 1 seconds Request dispatching time (us): min 6 max 290 average 11.209445208619899 --- collecting diagnostics on the listening RPC socket ... every 8 connection(s) Dispatched 99902 connection requests in 1 seconds Request dispatching time (us): min 0 max 148 average 3.628295729815219 Dispatched 98139 connection requests in 1 seconds Request dispatching time (us): min 0 max 101 average 3.6546429044518489 Dispatched 101681 connection requests in 1 seconds Request dispatching time (us): min 0 max 98 average 3.549552030369489 --- collecting diagnostics on the listening RPC socket ... every 32 connection(s) Dispatched 100832 connection requests in 1 seconds Request dispatching time (us): min 0 max 114 average 2.727457553157727 Dispatched 100214 connection requests in 1 seconds Request dispatching time (us): min 0 max 71 average 2.8103658171512964 Dispatched 99106 connection requests in 1 seconds Request dispatching time (us): min 0 max 52 average 2.7968841442495913 Change-Id: I83580659bac39d9171f1ee0d0e88676ed0d50b99 --- M src/kudu/rpc/acceptor_pool.cc M src/kudu/rpc/acceptor_pool.h M src/kudu/rpc/negotiation.cc M src/kudu/rpc/rpc-bench.cc M src/kudu/rpc/rpc-test-base.h M src/kudu/rpc/rpc-test.cc M src/kudu/util/CMakeLists.txt M src/kudu/util/net/diagnostic_socket.cc M src/kudu/util/net/diagnostic_socket.h M src/kudu/util/net/socket.cc M src/kudu/util/net/socket.h 11 files changed, 501 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/08/20908/4 -- To view, visit http://gerrit.cloudera.org:8080/20908 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I83580659bac39d9171f1ee0d0e88676ed0d50b99 Gerrit-Change-Number: 20908
[kudu-CR] [rpc] introduce rpc listened socket rx queue size metric
Abhishek Chennaka has posted comments on this change. ( http://gerrit.cloudera.org:8080/20908 ) Change subject: [rpc] introduce rpc_listened_socket_rx_queue_size metric .. Patch Set 3: Code-Review+1 (1 comment) Looks good to me. Let's get Yingchun's comments addressed. http://gerrit.cloudera.org:8080/#/c/20908/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20908/3//COMMIT_MSG@28 PS3, Line 28: 3 > According to the test result below, isn't it better when --rpc_listen_socke I guess the rationale behind that would be 32 connections might be too in-frequent to collect the metrics. -- To view, visit http://gerrit.cloudera.org:8080/20908 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I83580659bac39d9171f1ee0d0e88676ed0d50b99 Gerrit-Change-Number: 20908 Gerrit-PatchSet: 3 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Yingchun Lai Gerrit-Comment-Date: Wed, 24 Jan 2024 22:11:10 + Gerrit-HasComments: Yes
[kudu-CR] [rpc] introduce rpc listened socket rx queue size metric
Yingchun Lai has posted comments on this change. ( http://gerrit.cloudera.org:8080/20908 ) Change subject: [rpc] introduce rpc_listened_socket_rx_queue_size metric .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/20908/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20908/3//COMMIT_MSG@28 PS3, Line 28: 3 According to the test result below, isn't it better when --rpc_listen_socket_stats_every_log2=5? http://gerrit.cloudera.org:8080/#/c/20908/3/src/kudu/rpc/rpc-test.cc File src/kudu/rpc/rpc-test.cc: http://gerrit.cloudera.org:8080/#/c/20908/3/src/kudu/rpc/rpc-test.cc@1908 PS3, Line 1908: UnixSocket UnixSocket will not be tested? -- To view, visit http://gerrit.cloudera.org:8080/20908 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I83580659bac39d9171f1ee0d0e88676ed0d50b99 Gerrit-Change-Number: 20908 Gerrit-PatchSet: 3 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Yingchun Lai Gerrit-Comment-Date: Wed, 24 Jan 2024 14:41:31 + Gerrit-HasComments: Yes
[kudu-CR] [rpc] introduce rpc listened socket rx queue size metric
Wang Xixu has posted comments on this change. ( http://gerrit.cloudera.org:8080/20908 ) Change subject: [rpc] introduce rpc_listened_socket_rx_queue_size metric .. Patch Set 3: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/20908 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I83580659bac39d9171f1ee0d0e88676ed0d50b99 Gerrit-Change-Number: 20908 Gerrit-PatchSet: 3 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Yingchun Lai Gerrit-Comment-Date: Wed, 24 Jan 2024 02:02:26 + Gerrit-HasComments: No
[kudu-CR] [rpc] introduce rpc listened socket rx queue size metric
Hello Yingchun Lai, Attila Bukor, Kudu Jenkins, Abhishek Chennaka, Wang Xixu, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20908 to look at the new patch set (#3). Change subject: [rpc] introduce rpc_listened_socket_rx_queue_size metric .. [rpc] introduce rpc_listened_socket_rx_queue_size metric This patch introduces a new 'rpc_listened_socket_rx_queue_size' histogram metric for AcceptorPool. The metric allows for tracking the size of the listened RPC socket's RX queue. The new metric shows meaningful numbers only on Linux since it's based on the DiagnosticSocket, where the latter is implemented only on Linux as of now. The new metric is sampled by each acceptor thread when accepting an RPC connection. It's possible to change the frequency of the sampling (completely disabling it, if necessary) by tuning the --rpc_listen_socket_stats_every_log2 flag. I added basic tests scenarios to cover the newly introduced functionality. In addition, an extra performance test scenario has been added into rpc-bench. The new scenario is to measure an extra latency introduced by capturing diagnostic snapshots on the listening RPC socket. Some results are below (3 measurement at each setting), and based on these I think that setting --rpc_listen_socket_stats_every_log2=3 by default makes sense: this about 1.2us of latency in average per RPC request. The numbers are in microseconds, so the overall latency of handling RPC requests and the sustainable RPC rate do not seem to be adversely affected by this patch. If anybody finds otherwise, they can always set --rpc_listen_socket_stats_every_log2=-1 for their Kudu cluster if they don't care about the listening socket's backlog stats. The results below have been captured when running the command below for N in the set of { -1, 0, 3, 5 }: ./rpc-bench --gtest_filter='*RpcAcceptorBench*' \ --client_threads=2 \ --rpc_listen_socket_stats_every_log2=N --- collecting diagnostics on the listening RPC socket ... is disabled Dispatched 99851 connection requests in 1 seconds Request dispatching time (us): min 0 max 48 average 2.4426495478262611 Dispatched 98651 connection requests in 1 seconds Request dispatching time (us): min 0 max 54 average 2.4904663916229941 Dispatched 99200 connection requests in 1 seconds Request dispatching time (us): min 0 max 53 average 2.4747076612903225 --- collecting diagnostics on the listening RPC socket ... every 1 connection(s) Dispatched 65383 connection requests in 1 seconds Request dispatching time (us): min 6 max 208 average 11.071424201639545 Dispatched 65162 connection requests in 1 seconds Request dispatching time (us): min 6 max 392 average 11.258256092320913 Dispatched 65428 connection requests in 1 seconds Request dispatching time (us): min 6 max 290 average 11.209445208619899 --- collecting diagnostics on the listening RPC socket ... every 8 connection(s) Dispatched 99902 connection requests in 1 seconds Request dispatching time (us): min 0 max 148 average 3.628295729815219 Dispatched 98139 connection requests in 1 seconds Request dispatching time (us): min 0 max 101 average 3.6546429044518489 Dispatched 101681 connection requests in 1 seconds Request dispatching time (us): min 0 max 98 average 3.549552030369489 --- collecting diagnostics on the listening RPC socket ... every 32 connection(s) Dispatched 100832 connection requests in 1 seconds Request dispatching time (us): min 0 max 114 average 2.727457553157727 Dispatched 100214 connection requests in 1 seconds Request dispatching time (us): min 0 max 71 average 2.8103658171512964 Dispatched 99106 connection requests in 1 seconds Request dispatching time (us): min 0 max 52 average 2.7968841442495913 Change-Id: I83580659bac39d9171f1ee0d0e88676ed0d50b99 --- M src/kudu/rpc/acceptor_pool.cc M src/kudu/rpc/acceptor_pool.h M src/kudu/rpc/negotiation.cc M src/kudu/rpc/rpc-bench.cc M src/kudu/rpc/rpc-test-base.h M src/kudu/rpc/rpc-test.cc M src/kudu/util/CMakeLists.txt M src/kudu/util/net/diagnostic_socket.cc M src/kudu/util/net/diagnostic_socket.h M src/kudu/util/net/socket.cc M src/kudu/util/net/socket.h 11 files changed, 502 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/08/20908/3 -- To view, visit http://gerrit.cloudera.org:8080/20908 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I83580659bac39d9171f1ee0d0e88676ed0d50b99 Gerrit-Change-Number: 20908
[kudu-CR] [rpc] introduce rpc listened socket rx queue size metric
Abhishek Chennaka has posted comments on this change. ( http://gerrit.cloudera.org:8080/20908 ) Change subject: [rpc] introduce rpc_listened_socket_rx_queue_size metric .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/20908 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I83580659bac39d9171f1ee0d0e88676ed0d50b99 Gerrit-Change-Number: 20908 Gerrit-PatchSet: 2 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Yingchun Lai Gerrit-Comment-Date: Tue, 23 Jan 2024 19:54:28 + Gerrit-HasComments: No
[kudu-CR] [rpc] introduce rpc listened socket rx queue size metric
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/20908 ) Change subject: [rpc] introduce rpc_listened_socket_rx_queue_size metric .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/20908/2/src/kudu/rpc/acceptor_pool.cc File src/kudu/rpc/acceptor_pool.cc: http://gerrit.cloudera.org:8080/#/c/20908/2/src/kudu/rpc/acceptor_pool.cc@118 PS2, Line 118: $0: invalid setting for $1; must be less than 64" > why should less than 64? Because of the possible width of the counter. Do think anybody needs to do sampling less frequently than once in 2^63 accepted connections? :) -- To view, visit http://gerrit.cloudera.org:8080/20908 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I83580659bac39d9171f1ee0d0e88676ed0d50b99 Gerrit-Change-Number: 20908 Gerrit-PatchSet: 2 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Yingchun Lai Gerrit-Comment-Date: Tue, 23 Jan 2024 15:15:29 + Gerrit-HasComments: Yes
[kudu-CR] [rpc] introduce rpc listened socket rx queue size metric
Wang Xixu has posted comments on this change. ( http://gerrit.cloudera.org:8080/20908 ) Change subject: [rpc] introduce rpc_listened_socket_rx_queue_size metric .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/20908/2/src/kudu/rpc/acceptor_pool.cc File src/kudu/rpc/acceptor_pool.cc: http://gerrit.cloudera.org:8080/#/c/20908/2/src/kudu/rpc/acceptor_pool.cc@118 PS2, Line 118: $0: invalid setting for $1; must be less than 64" why should less than 64? -- To view, visit http://gerrit.cloudera.org:8080/20908 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I83580659bac39d9171f1ee0d0e88676ed0d50b99 Gerrit-Change-Number: 20908 Gerrit-PatchSet: 2 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Wang Xixu <1450306...@qq.com> Gerrit-Reviewer: Yingchun Lai Gerrit-Comment-Date: Tue, 23 Jan 2024 07:42:32 + Gerrit-HasComments: Yes
[kudu-CR] [rpc] introduce rpc listened socket rx queue size metric
Hello Yingchun Lai, Attila Bukor, Kudu Jenkins, Abhishek Chennaka, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20908 to look at the new patch set (#2). Change subject: [rpc] introduce rpc_listened_socket_rx_queue_size metric .. [rpc] introduce rpc_listened_socket_rx_queue_size metric This patch introduces a new 'rpc_listened_socket_rx_queue_size' histogram metric for AcceptorPool. The metric allows for tracking the size of the listened RPC socket's RX queue. The new metric shows meaningful numbers only on Linux since it's based on the DiagnosticSocket, where the latter is implemented only on Linux as of now. The new metric is sampled by each acceptor thread when accepting an RPC connection. It's possible to change the frequency of the sampling (completely disabling it, if necessary) by tuning the --rpc_listen_socket_stats_every_log2 flag. I also added basic tests scenarios to cover the newly introduced functionality. Change-Id: I83580659bac39d9171f1ee0d0e88676ed0d50b99 --- M src/kudu/rpc/acceptor_pool.cc M src/kudu/rpc/acceptor_pool.h M src/kudu/rpc/rpc-test-base.h M src/kudu/rpc/rpc-test.cc M src/kudu/util/CMakeLists.txt M src/kudu/util/net/diagnostic_socket.cc M src/kudu/util/net/diagnostic_socket.h 7 files changed, 335 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/08/20908/2 -- To view, visit http://gerrit.cloudera.org:8080/20908 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I83580659bac39d9171f1ee0d0e88676ed0d50b99 Gerrit-Change-Number: 20908 Gerrit-PatchSet: 2 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yingchun Lai
[kudu-CR] [rpc] introduce rpc listened socket rx queue size metric
Alexey Serbin has removed a vote on this change. Change subject: [rpc] introduce rpc_listened_socket_rx_queue_size metric .. Removed Verified-1 by Kudu Jenkins (120) -- To view, visit http://gerrit.cloudera.org:8080/20908 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: I83580659bac39d9171f1ee0d0e88676ed0d50b99 Gerrit-Change-Number: 20908 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yingchun Lai
[kudu-CR] [rpc] introduce rpc listened socket rx queue size metric
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/20908 ) Change subject: [rpc] introduce rpc_listened_socket_rx_queue_size metric .. Patch Set 1: Verified+1 unrelated test failures in a Java test -- To view, visit http://gerrit.cloudera.org:8080/20908 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I83580659bac39d9171f1ee0d0e88676ed0d50b99 Gerrit-Change-Number: 20908 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yingchun Lai Gerrit-Comment-Date: Thu, 18 Jan 2024 18:39:54 + Gerrit-HasComments: No
[kudu-CR] [rpc] introduce rpc listened socket rx queue size metric
Alexey Serbin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/20908 Change subject: [rpc] introduce rpc_listened_socket_rx_queue_size metric .. [rpc] introduce rpc_listened_socket_rx_queue_size metric This patch introduces a new 'rpc_listened_socket_rx_queue_size' histogram metric for AcceptorPool. The metric allows for tracking the size of the listened RPC socket's RX queue. The new metric shows meaningful numbers only on Linux since it's based on the DiagnosticSocket, where the latter is implemented only on Linux as of now. I also added basic tests scenarios to cover the newly introduced functionality. Change-Id: I83580659bac39d9171f1ee0d0e88676ed0d50b99 --- M src/kudu/rpc/acceptor_pool.cc M src/kudu/rpc/acceptor_pool.h M src/kudu/rpc/rpc-test-base.h M src/kudu/rpc/rpc-test.cc M src/kudu/util/CMakeLists.txt M src/kudu/util/net/diagnostic_socket.cc M src/kudu/util/net/diagnostic_socket.h 7 files changed, 224 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/08/20908/1 -- To view, visit http://gerrit.cloudera.org:8080/20908 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I83580659bac39d9171f1ee0d0e88676ed0d50b99 Gerrit-Change-Number: 20908 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin