Todd Lipcon has submitted this change and it was merged.

Change subject: compaction_policy: avoid O(n^2) calls to EstimateOnDiskSize
......................................................................


compaction_policy: avoid O(n^2) calls to EstimateOnDiskSize

In a cluster workload with a 130GB+ tablet, I found that the maintenance
manager scheduler thread was spending tens of seconds inside
RowSetInfo::CollectOrdered(), mostly inside calls to
EstimateOnDiskSize(). No individual call is especially slow, but each
one involves many virtual function calls and potential CPU cache
misses, so the cost appears to add up.

I deployed this patch on the cluster and found that the
MaintenanceManager 'FindBestOps' call went from ~16 seconds to ~350ms.
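
The general idea of replacing repeated size estimates with one cached
value per rowset can be sketched as below. This is only an illustration
under stated assumptions, not the code in this patch: FakeRowSet,
CachedInfo, NaiveFractions and CachedFractions are hypothetical names,
and the call counter exists only to make the difference visible.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a rowset; NOT Kudu's actual class.
struct FakeRowSet {
  uint64_t bytes;
  mutable int estimate_calls;  // instrumentation for this sketch only

  FakeRowSet(uint64_t b) : bytes(b), estimate_calls(0) {}

  // Stand-in for an expensive size estimate (in the real system this
  // would go through several virtual calls).
  uint64_t EstimateOnDiskSize() const {
    ++estimate_calls;
    return bytes;
  }
};

// Hypothetical per-rowset record that stores the estimate once.
struct CachedInfo {
  const FakeRowSet* rowset;
  uint64_t size_bytes;
};

// Naive version: re-estimates every rowset for each rowset, so the
// number of EstimateOnDiskSize() calls grows as n * (n + 1).
inline std::vector<double> NaiveFractions(const std::vector<FakeRowSet>& rowsets) {
  std::vector<double> out;
  for (const auto& rs : rowsets) {
    uint64_t total = 0;
    for (const auto& other : rowsets) {
      total += other.EstimateOnDiskSize();
    }
    out.push_back(total == 0 ? 0.0
                             : static_cast<double>(rs.EstimateOnDiskSize()) / total);
  }
  return out;
}

// Cached version: one EstimateOnDiskSize() call per rowset, after which
// all further work reads the stored size_bytes.
inline std::vector<double> CachedFractions(const std::vector<FakeRowSet>& rowsets) {
  std::vector<CachedInfo> infos;
  infos.reserve(rowsets.size());
  uint64_t total = 0;
  for (const auto& rs : rowsets) {
    CachedInfo info{&rs, rs.EstimateOnDiskSize()};
    total += info.size_bytes;
    infos.push_back(info);
  }
  std::vector<double> out;
  out.reserve(infos.size());
  for (const auto& info : infos) {
    out.push_back(total == 0 ? 0.0
                             : static_cast<double>(info.size_bytes) / total);
  }
  return out;
}
```

Both functions return the same fractions; only the number of estimate
calls differs, which is where the time went in the profile described
above.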

Change-Id: Ic2949218d7f5fd822571a7b14d1d0b4430aeee1d
Reviewed-on: http://gerrit.cloudera.org:8080/4191
Tested-by: Kudu Jenkins
Reviewed-by: David Ribeiro Alves <dral...@apache.org>
---
M src/kudu/tablet/rowset_info.cc
M src/kudu/tablet/rowset_info.h
2 files changed, 13 insertions(+), 6 deletions(-)

Approvals:
  David Ribeiro Alves: Looks good to me, approved
  Kudu Jenkins: Verified




Gerrit-MessageType: merged
Gerrit-Change-Id: Ic2949218d7f5fd822571a7b14d1d0b4430aeee1d
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dral...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
