Hello Kudu Jenkins, Andrew Wong, I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/15995

to look at the new patch set (#5).

Change subject: [maintenance] use workload statistics to scale perf_improvement
......................................................................

[maintenance] use workload statistics to scale perf_improvement

When we consider the performance improvement brought by maintenance
operations, we can use workload statistics to find how 'hot' a tablet
has been in the last few minutes, so that we prefer 'hot' tablets.

This patch uses the recent read/write rate of a tablet as a workload
score and calculates a final perf score based on an op's raw
perf_improvement, the tablet's workload score, and the table's priority,
so maintenance ops for a 'hotter' tablet are more likely to launch.

We tested this on a 6-node cluster with
maintenance_manager_num_threads=2, running various YCSB workloads on a
table with 64 tablets.

workload_a:
phase=load
recordcount=1000000000
operationcount=1000000000
insertproportion=1

result:
measurements                        Before change   After change
[INSERT]AverageLatency(us)          46.08928        47.35706
[INSERT]95thPercentileLatency(us)   4               4
[INSERT]99thPercentileLatency(us)   8               7

workload_b:
phase=run
recordcount=1000000000
operationcount=10000000
insertproportion=0
updateproportion=0.2
scanproportion=0.8
deleteproportion=0
requestdistribution=zipfian
hotspotdatafraction=0.2
hotspotopnfraction=0.8
maxscanlength=100
scanlengthdistribution=zipfian

result:
measurements                        Before change   After change
[UPDATE]AverageLatency(us)          4.38919         5.10809
[UPDATE]95thPercentileLatency(us)   7               8
[UPDATE]99thPercentileLatency(us)   11              14
[SCAN]AverageLatency(us)            1249.86523      1081.45440
[SCAN]95thPercentileLatency(us)     1036            1119
[SCAN]99thPercentileLatency(us)     3993            2891

workload_c:
phase=run
recordcount=2000000000
operationcount=100000000
insertproportion=0.8
updateproportion=0
scanproportion=0.2
deleteproportion=0
requestdistribution=zipfian
hotspotdatafraction=0.2
hotspotopnfraction=0.8
maxscanlength=100
scanlengthdistribution=zipfian

result:
measurements                        Before change   After change
[INSERT]AverageLatency(us)          7.98435         8.20646
[INSERT]95thPercentileLatency(us)   7               7
[INSERT]99thPercentileLatency(us)   11              11
[SCAN]AverageLatency(us)            1376.30270      1207.82823
[SCAN]95thPercentileLatency(us)     2449            1398
[SCAN]99thPercentileLatency(us)     23615           19775

We can see that in the zipfian update/scan and insert/scan workloads,
average and 99th-percentile scan latency improved with this change.

Change-Id: Ie3afcc359002d1392164ba2fda885f8930ef8696
---
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_mm_ops.cc
M src/kudu/tablet/tablet_mm_ops.h
M src/kudu/tablet/tablet_replica_mm_ops.cc
M src/kudu/tablet/tablet_replica_mm_ops.h
M src/kudu/util/maintenance_manager-test.cc
M src/kudu/util/maintenance_manager.cc
M src/kudu/util/maintenance_manager.h
9 files changed, 134 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/95/15995/5
--
To view, visit http://gerrit.cloudera.org:8080/15995
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie3afcc359002d1392164ba2fda885f8930ef8696
Gerrit-Change-Number: 15995
Gerrit-PatchSet: 5
Gerrit-Owner: Yifan Zhang <chinazhangyi...@163.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yifan Zhang <chinazhangyi...@163.com>
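
Illustrative note on the scoring described in the commit message: the message names three inputs to the final perf score (an op's raw perf_improvement, the tablet's workload score, and the table's priority) but does not spell out how they are combined. The sketch below is one possible combination, not the code in this patch. Every name in it (OpStats, WorkloadScore, FinalPerfScore, kHotReadRate, kHotWriteRate), the rate thresholds, and the multiplicative form are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical inputs; none of these names are taken from the patch.
struct OpStats {
  double raw_perf_improvement;  // the op's own perf_improvement estimate
  double reads_per_sec;         // recent read rate of the tablet
  double writes_per_sec;        // recent write rate of the tablet
  int table_priority;           // assumed small integer; higher is more important
};

// Assumed rates at which a tablet counts as fully "hot".
constexpr double kHotReadRate = 1000.0;
constexpr double kHotWriteRate = 1000.0;

// Workload score in [0, 2]: reads and writes each contribute at most 1,
// so a single very high rate cannot dominate the score.
double WorkloadScore(const OpStats& s) {
  assert(s.reads_per_sec >= 0 && s.writes_per_sec >= 0);
  const double read_part = std::min(1.0, s.reads_per_sec / kHotReadRate);
  const double write_part = std::min(1.0, s.writes_per_sec / kHotWriteRate);
  return read_part + write_part;
}

// Final score: the raw improvement, scaled up for hot tablets and then
// doubled (or halved) per priority level. An op with no raw improvement
// stays at zero regardless of how hot the tablet is.
double FinalPerfScore(const OpStats& s) {
  if (s.raw_perf_improvement <= 0) return 0.0;
  const double priority_factor = std::ldexp(1.0, s.table_priority);  // 2^priority
  return s.raw_perf_improvement * (1.0 + WorkloadScore(s)) * priority_factor;
}
```

Under these assumptions, two ops with the same raw perf_improvement of 10 on tables of priority 0 would score 10 on an idle tablet and 25 on a tablet seeing 2000 reads/s and 500 writes/s, so the op on the hotter tablet would be picked first. Capping each rate keeps the workload term bounded, so it can reorder ops with comparable raw improvement without letting a hot tablet's trivial op outrank a cold tablet's large one by an unbounded factor.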