[kudu-CR] [c++ client] AUTO FLUSH BACKGROUND optimizations
Alexey Serbin has submitted this change and it was merged.

Change subject: [c++ client] AUTO_FLUSH_BACKGROUND optimizations
..

[c++ client] AUTO_FLUSH_BACKGROUND optimizations

Optimizations after initial performance testing of the Kudu C++ client
library with AUTO_FLUSH_BACKGROUND flush mode support.

The most important tuning is the default flush watermark for the
mutation buffer. Changing it from 80% to 50% gave a near-30% performance
boost in throughput for scenarios where a client pushes data to the
server as fast as it can, using workloads of 8M rows like
(int64, int32, string, string, int) where strings are about 32 bytes
long on average. Each thread ran its own single-session KuduClient,
where each session was running in AUTO_FLUSH_BACKGROUND flush mode.

1-thread insertion (8M rows per thread)
  80% watermark:
    total  : 35229.7 ms
    per row: 0.00440372 ms
  50% watermark:
    total  : 22562.8 ms
    per row: 0.00282035 ms

2-thread insertion (4M rows per thread)
  80% watermark:
    total  : 19683.6 ms
    per row: 0.00246046 ms
  50% watermark:
    total  : 12931.8 ms
    per row: 0.00161647 ms

4-thread insertion (2M rows per thread)
  80% watermark:
    total  : 11941.9 ms
    per row: 0.00149274 ms
  50% watermark:
    total  : 7724.68 ms
    per row: 0.000965585 ms

Other related session parameters:
  mutation buffer size: 7M (default)
  maximum number of batchers: 2 (default)
  time-based flush interval: 1 second (default)

The tests were run on a dual Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
(12 cores per CPU) with 98GiB of memory.

This is a follow-up for 93be1310d227cf05025864654ca3f6713c2ddc2c.
Change-Id: I1f0aa6d02c51bb063498709e8570e8c7214a31a0
Reviewed-on: http://gerrit.cloudera.org:8080/4308
Tested-by: Kudu Jenkins
Tested-by: Alexey Serbin
Reviewed-by: David Ribeiro Alves
---
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client.h
M src/kudu/client/session-internal.cc
M src/kudu/client/write_op.cc
5 files changed, 10 insertions(+), 13 deletions(-)

Approvals:
  David Ribeiro Alves: Looks good to me, approved
  Alexey Serbin: Verified
  Kudu Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/4308
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I1f0aa6d02c51bb063498709e8570e8c7214a31a0
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: David Ribeiro Alves
Gerrit-Reviewer: Kudu Jenkins
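Editorial note on the figures in the commit message above: the message does not say how its "near 30%" was derived, so the sketch below simply recomputes two common metrics from the reported totals. The function names are made up for this illustration and are not part of the Kudu code base. With the reported numbers, the drop in total time comes out at roughly 34-36% and the gain in rows per unit time at roughly 52-56% across the three thread counts.

```cpp
#include <cassert>

// Hypothetical helper: relative reduction in wall-clock time.
// A result of 0.36 means the run took 36% less time.
double TimeReduction(double before_ms, double after_ms) {
  assert(before_ms > 0.0 && after_ms > 0.0);
  return (before_ms - after_ms) / before_ms;
}

// Hypothetical helper: relative gain in throughput (rows per unit time)
// when the same number of rows is inserted in less time.
double ThroughputGain(double before_ms, double after_ms) {
  assert(before_ms > 0.0 && after_ms > 0.0);
  return before_ms / after_ms - 1.0;
}
```

For example, TimeReduction(35229.7, 22562.8) covers the 1-thread case and ThroughputGain(11941.9, 7724.68) the 4-thread case.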
[kudu-CR] [c++ client] AUTO FLUSH BACKGROUND optimizations
David Ribeiro Alves has posted comments on this change.

Change subject: [c++ client] AUTO_FLUSH_BACKGROUND optimizations
..

Patch Set 3: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4308/3/src/kudu/client/batcher.h
File src/kudu/client/batcher.h:

Line 127: static int64_t GetOperationSizeInBuffer(KuduWriteOperation* write_op) {
> The reason I added this method is to allow calling private method of the Ku
Didn't realize. I think you could friend it still, but this was already here so likely not worth the trouble.

--
To view, visit http://gerrit.cloudera.org:8080/4308
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1f0aa6d02c51bb063498709e8570e8c7214a31a0
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: David Ribeiro Alves
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes
[kudu-CR] [c++ client] AUTO FLUSH BACKGROUND optimizations
Alexey Serbin has posted comments on this change.

Change subject: [c++ client] AUTO_FLUSH_BACKGROUND optimizations
..

Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4308/3/src/kudu/client/batcher.h
File src/kudu/client/batcher.h:

Line 127: static int64_t GetOperationSizeInBuffer(KuduWriteOperation* write_op) {
> does it even make sense to keep this method?
The reason I added this method is to allow calling a private method of KuduWriteOperation from KuduSession::Data (an embedded class). As far as I know, it's not possible to make an embedded class a friend of another class. So, I didn't find a better way to keep SizeInBuffer() private for KuduWriteOperation and call it from an embedded class. If you know a better way to resolve this situation, I would be happy to remove this method.

--
To view, visit http://gerrit.cloudera.org:8080/4308
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1f0aa6d02c51bb063498709e8570e8c7214a31a0
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: David Ribeiro Alves
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes
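Editorial note: the pattern described in the comment above can be sketched as follows. This is a simplified illustration, not the Kudu source. WriteOp, Batcher and Session are stand-ins for KuduWriteOperation, the batcher class and KuduSession, and the friend relationship between WriteOp and Batcher is an assumption inferred from the discussion. A top-level class that is already a friend exposes a static forwarder, so a private nested class elsewhere can reach the private method without being a friend itself.

```cpp
#include <cassert>
#include <cstdint>

class WriteOp;  // stand-in for KuduWriteOperation

// Stand-in for the batcher: a top-level class that WriteOp befriends.
class Batcher {
 public:
  // Static forwarder: the only thing it does is call the private method.
  static int64_t GetOperationSizeInBuffer(const WriteOp* write_op);
};

class WriteOp {
 public:
  explicit WriteOp(int64_t size) : size_(size) {}

 private:
  friend class Batcher;  // assumed friendship, see the note above
  int64_t SizeInBuffer() const { return size_; }
  int64_t size_;
};

inline int64_t Batcher::GetOperationSizeInBuffer(const WriteOp* write_op) {
  assert(write_op != nullptr);
  return write_op->SizeInBuffer();
}

// Stand-in for KuduSession with its private nested Data class.
class Session {
 public:
  static int64_t Account(const WriteOp& op) { return Data::SizeOf(op); }

 private:
  class Data {
   public:
    // Data is not a friend of WriteOp, so it goes through the forwarder.
    static int64_t SizeOf(const WriteOp& op) {
      return Batcher::GetOperationSizeInBuffer(&op);
    }
  };
};
```

With these stand-ins, Session::Account(WriteOp(42)) returns 42. On the alternative raised in the review: C++ does accept a friend declaration that names a nested class (friend class Outer::Inner;), but only if the nested name is accessible where the declaration appears, which is not the case for a private nested class unless the outer class grants access in turn.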
[kudu-CR] [c++ client] AUTO FLUSH BACKGROUND optimizations
David Ribeiro Alves has posted comments on this change.

Change subject: [c++ client] AUTO_FLUSH_BACKGROUND optimizations
..

Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4308/3/src/kudu/client/batcher.h
File src/kudu/client/batcher.h:

Line 127: static int64_t GetOperationSizeInBuffer(KuduWriteOperation* write_op) {
does it even make sense to keep this method?

--
To view, visit http://gerrit.cloudera.org:8080/4308
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1f0aa6d02c51bb063498709e8570e8c7214a31a0
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: David Ribeiro Alves
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes
[kudu-CR] [c++ client] AUTO FLUSH BACKGROUND optimizations
Alexey Serbin has posted comments on this change.

Change subject: [c++ client] AUTO_FLUSH_BACKGROUND optimizations
..

Patch Set 3: Verified+1

--
To view, visit http://gerrit.cloudera.org:8080/4308
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1f0aa6d02c51bb063498709e8570e8c7214a31a0
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: David Ribeiro Alves
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No
[kudu-CR] [c++ client] AUTO FLUSH BACKGROUND optimizations
Alexey Serbin has posted comments on this change.

Change subject: [c++ client] AUTO_FLUSH_BACKGROUND optimizations
..

Patch Set 2:

(2 comments)

Thank you for the review! I posted the updated version.

http://gerrit.cloudera.org:8080/#/c/4308/2//COMMIT_MSG
Commit Message:

Line 13: for the mutation buffer. Changing it from 80% to 50% gave good
> s/good/some actual numbers (e.g. XX%-YY% throughput increase or something)
Done

Line 53: The tests were run at ve0518.halxg.cloudera.com against binaries
> don't point to an internal cloudera machine on the commit message, if needed
Good idea -- I'll add info on CPU cores and available memory.

--
To view, visit http://gerrit.cloudera.org:8080/4308
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1f0aa6d02c51bb063498709e8570e8c7214a31a0
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: David Ribeiro Alves
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes
[kudu-CR] [c++ client] AUTO FLUSH BACKGROUND optimizations
Kudu Jenkins has posted comments on this change.

Change subject: [c++ client] AUTO_FLUSH_BACKGROUND optimizations
..

Patch Set 3:

Build Started http://104.196.14.100/job/kudu-gerrit/3401/

--
To view, visit http://gerrit.cloudera.org:8080/4308
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1f0aa6d02c51bb063498709e8570e8c7214a31a0
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: David Ribeiro Alves
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No
[kudu-CR] [c++ client] AUTO FLUSH BACKGROUND optimizations
Hello Kudu Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/4308

to look at the new patch set (#3).

Change subject: [c++ client] AUTO_FLUSH_BACKGROUND optimizations
..

[c++ client] AUTO_FLUSH_BACKGROUND optimizations

Optimizations after initial performance testing of the Kudu C++ client
library with AUTO_FLUSH_BACKGROUND flush mode support.

The most important tuning is the default flush watermark for the
mutation buffer. Changing it from 80% to 50% gave a near-30% performance
boost in throughput for scenarios where a client pushes data to the
server as fast as it can, using workloads of 8M rows like
(int64, int32, string, string, int) where strings are about 32 bytes
long on average. Each thread ran its own single-session KuduClient,
where each session was running in AUTO_FLUSH_BACKGROUND flush mode.

1-thread insertion (8M rows per thread)
  80% watermark:
    total  : 35229.7 ms
    per row: 0.00440372 ms
  50% watermark:
    total  : 22562.8 ms
    per row: 0.00282035 ms

2-thread insertion (4M rows per thread)
  80% watermark:
    total  : 19683.6 ms
    per row: 0.00246046 ms
  50% watermark:
    total  : 12931.8 ms
    per row: 0.00161647 ms

4-thread insertion (2M rows per thread)
  80% watermark:
    total  : 11941.9 ms
    per row: 0.00149274 ms
  50% watermark:
    total  : 7724.68 ms
    per row: 0.000965585 ms

Other related session parameters:
  mutation buffer size: 7M (default)
  maximum number of batchers: 2 (default)
  time-based flush interval: 1 second (default)

The tests were run on a dual Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
(12 cores per CPU) with 98GiB of memory.

This is a follow-up for 93be1310d227cf05025864654ca3f6713c2ddc2c.
Change-Id: I1f0aa6d02c51bb063498709e8570e8c7214a31a0
---
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client.h
M src/kudu/client/session-internal.cc
M src/kudu/client/write_op.cc
5 files changed, 10 insertions(+), 13 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/08/4308/3

--
To view, visit http://gerrit.cloudera.org:8080/4308
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1f0aa6d02c51bb063498709e8570e8c7214a31a0
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: David Ribeiro Alves
Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] [c++ client] AUTO FLUSH BACKGROUND optimizations
David Ribeiro Alves has posted comments on this change.

Change subject: [c++ client] AUTO_FLUSH_BACKGROUND optimizations
..

Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4308/2//COMMIT_MSG
Commit Message:

Line 13: for the mutation buffer. Changing it from 80% to 50% gave good
s/good/some actual numbers (e.g. XX%-YY% throughput increase or something)

Line 53: The tests were run at ve0518.halxg.cloudera.com against binaries
don't point to an internal cloudera machine on the commit message, if needed (though I doubt it) you can mention the machine specs.

--
To view, visit http://gerrit.cloudera.org:8080/4308
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1f0aa6d02c51bb063498709e8570e8c7214a31a0
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: David Ribeiro Alves
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes
[kudu-CR] [c++ client] AUTO FLUSH BACKGROUND optimizations
Kudu Jenkins has posted comments on this change.

Change subject: [c++ client] AUTO_FLUSH_BACKGROUND optimizations
..

Patch Set 2:

Build Started http://104.196.14.100/job/kudu-gerrit/3376/

--
To view, visit http://gerrit.cloudera.org:8080/4308
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1f0aa6d02c51bb063498709e8570e8c7214a31a0
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No
[kudu-CR] [c++ client] AUTO FLUSH BACKGROUND optimizations
Hello Kudu Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/4308

to look at the new patch set (#2).

Change subject: [c++ client] AUTO_FLUSH_BACKGROUND optimizations
..

[c++ client] AUTO_FLUSH_BACKGROUND optimizations

Some optimizations after initial performance testing of the Kudu C++
client library with AUTO_FLUSH_BACKGROUND flush mode support.

The most important tuning is the default flush watermark
for the mutation buffer. Changing it from 80% to 50% gave good
performance boost. The data below is for workloads of 8M rows like
(int64, int32, string, string, int) where strings are about 32 bytes
long on average. Each thread ran its own single-session KuduClient,
where each session was running in AUTO_FLUSH_BACKGROUND flush mode.

1-thread insertion (8M rows per thread)
  80% watermark:
    total  : 35229.7 ms
    per row: 0.00440372 ms
  50% watermark:
    total  : 22562.8 ms
    per row: 0.00282035 ms

2-thread insertion (4M rows per thread)
  80% watermark:
    total  : 19683.6 ms
    per row: 0.00246046 ms
  50% watermark:
    total  : 12931.8 ms
    per row: 0.00161647 ms

4-thread insertion (2M rows per thread)
  80% watermark:
    total  : 11941.9 ms
    per row: 0.00149274 ms
  50% watermark:
    total  : 7724.68 ms
    per row: 0.000965585 ms

Other related session parameters:
  mutation buffer size: 7M (default)
  maximum number of batchers: 2 (default)
  time-based flush interval: 1 second (default)

The tests were run at ve0518.halxg.cloudera.com against binaries
built in release configuration on the same machine.
Change-Id: I1f0aa6d02c51bb063498709e8570e8c7214a31a0
---
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client.h
M src/kudu/client/session-internal.cc
M src/kudu/client/write_op.cc
5 files changed, 10 insertions(+), 13 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/08/4308/2

--
To view, visit http://gerrit.cloudera.org:8080/4308
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1f0aa6d02c51bb063498709e8570e8c7214a31a0
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Alexey Serbin
Gerrit-Reviewer: Kudu Jenkins
[kudu-CR] [c++ client] AUTO FLUSH BACKGROUND optimizations
Adar Dembo has posted comments on this change.

Change subject: [c++ client] AUTO_FLUSH_BACKGROUND optimizations
..

Patch Set 1:

(2 comments)

The Java client uses 50% too, so even if the performance didn't change, having that consistency is nice.

http://gerrit.cloudera.org:8080/#/c/4308/1//COMMIT_MSG
Commit Message:

PS1, Line 13: Chaning
Nit: Changing

PS1, Line 53: ve0518.halxg.cloudera.com
Presumably the machine was relatively idle at the time?

--
To view, visit http://gerrit.cloudera.org:8080/4308
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1f0aa6d02c51bb063498709e8570e8c7214a31a0
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin
Gerrit-Reviewer: Adar Dembo
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes
[kudu-CR] [c++ client] AUTO FLUSH BACKGROUND optimizations
Alexey Serbin has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4308

Change subject: [c++ client] AUTO_FLUSH_BACKGROUND optimizations
..

[c++ client] AUTO_FLUSH_BACKGROUND optimizations

Some optimizations after initial performance testing of the Kudu C++
client library with AUTO_FLUSH_BACKGROUND flush mode support.

The most important tuning is the default flush watermark
for the mutation buffer. Chaning it from 80% to 50% gave good
performance boost. The data below is for workloads of 8M rows like
(int64, int32, string, string, int) where strings are about 32 bytes
long on average. Each thread ran its own single-session KuduClient,
where each session was running in AUTO_FLUSH_BACKGROUND flush mode.

1-thread insertion (8M rows per thread)
  80% watermark:
    total  : 35229.7 ms
    per row: 0.00440372 ms
  50% watermark:
    total  : 22562.8 ms
    per row: 0.00282035 ms

2-thread insertion (4M rows per thread)
  80% watermark:
    total  : 19683.6 ms
    per row: 0.00246046 ms
  50% watermark:
    total  : 12931.8 ms
    per row: 0.00161647 ms

4-thread insertion (2M rows per thread)
  80% watermark:
    total  : 11941.9 ms
    per row: 0.00149274 ms
  50% watermark:
    total  : 7724.68 ms
    per row: 0.000965585 ms

Other related session parameters:
  mutation buffer size: 7M (default)
  maximum number of batchers: 2 (default)
  time-based flush interval: 1 second (default)

The tests were run at ve0518.halxg.cloudera.com against binaries
built in release configuration on the same machine.
Change-Id: I1f0aa6d02c51bb063498709e8570e8c7214a31a0
---
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/session-internal.cc
M src/kudu/client/write_op.cc
4 files changed, 9 insertions(+), 12 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/08/4308/1

--
To view, visit http://gerrit.cloudera.org:8080/4308
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I1f0aa6d02c51bb063498709e8570e8c7214a31a0
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin
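Editorial note: for readers unfamiliar with the term, a flush watermark of the kind tuned in this change can be pictured as a simple threshold on buffer usage. The sketch below is an illustration only and is not taken from the Kudu client. The function name is hypothetical, the watermark is expressed as a fraction between 0 and 1, and the "7M" default from the commit message is assumed to mean 7 MiB.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns true once the bytes accumulated in the
// mutation buffer reach the given fraction (the watermark) of its capacity.
bool ReachedFlushWatermark(std::size_t buffered_bytes,
                           std::size_t buffer_size_bytes,
                           double watermark) {
  assert(watermark >= 0.0 && watermark <= 1.0);
  return static_cast<double>(buffered_bytes) >=
         watermark * static_cast<double>(buffer_size_bytes);
}
```

Under these assumptions, a 7 MiB buffer hits a 50% watermark at 3.5 MiB and an 80% watermark at 5.6 MiB, so the lower setting starts a background flush earlier while more of the buffer is still free.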
[kudu-CR] [c++ client] AUTO FLUSH BACKGROUND optimizations
Kudu Jenkins has posted comments on this change.

Change subject: [c++ client] AUTO_FLUSH_BACKGROUND optimizations
..

Patch Set 1:

Build Started http://104.196.14.100/job/kudu-gerrit/3224/

--
To view, visit http://gerrit.cloudera.org:8080/4308
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1f0aa6d02c51bb063498709e8570e8c7214a31a0
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No