Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/15254 )
Change subject: KUDU-3056: Reduce HdrHistogramAccumulator overhead
......................................................................


Patch Set 1:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/15254/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15254/1//COMMIT_MSG@19
PS1, Line 19: adjust
> adjusts
Done


http://gerrit.cloudera.org:8080/#/c/15254/1//COMMIT_MSG@24
PS1, Line 24: Last
> Lastly
Done


http://gerrit.cloudera.org:8080/#/c/15254/1//COMMIT_MSG@25
PS1, Line 25: The result is relatively similar
: output in the Spark accumulator with a significantly smaller
: histogram.
> I took a quick look at the HdrHistograms we use server-side. Most use a num
It's good to know this aligns with the server side significant digits.

Yeah, his claim is accurate. But the total transmitted size back to the spark driver is num-tasks * histogram-size. So 1 MiB is a lot for the driver to collect when there are hundreds of tasks. (In the reported Jira there are ~900).


http://gerrit.cloudera.org:8080/#/c/15254/1//COMMIT_MSG@30
PS1, Line 30: implimentation
> implementation
Done


http://gerrit.cloudera.org:8080/#/c/15254/1//COMMIT_MSG@32
PS1, Line 32: not
> drop this
Done


http://gerrit.cloudera.org:8080/#/c/15254/1//COMMIT_MSG@53
PS1, Line 53: to
> "so as to"
Done


http://gerrit.cloudera.org:8080/#/c/15254/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/HdrHistogramAccumulator.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/HdrHistogramAccumulator.scala:

http://gerrit.cloudera.org:8080/#/c/15254/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/HdrHistogramAccumulator.scala@47
PS1, Line 47: histogram
> Why don't we need .copy() here any more?
ooh, good catch.

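To make the num-tasks * histogram-size point from the Line 25 reply concrete, here is a rough sketch of the arithmetic. It assumes the 1 MiB figure is the size of one task's histogram and uses the approximate count of 900 tasks from the Jira; the class and method names are hypothetical and are not part of the patch.

```java
public class DriverHistogramCost {
    // Bytes in one mebibyte.
    static final long MIB = 1024L * 1024L;

    // Total bytes the driver collects if every task sends back one histogram.
    // multiplyExact throws ArithmeticException on overflow instead of wrapping.
    static long totalBytes(long numTasks, long histogramBytes) {
        return Math.multiplyExact(numTasks, histogramBytes);
    }

    public static void main(String[] args) {
        long numTasks = 900;        // assumption: approximate task count from the Jira
        long histogramBytes = MIB;  // assumption: about 1 MiB per task histogram
        long total = totalBytes(numTasks, histogramBytes);
        System.out.println(total / MIB + " MiB collected by the driver");
    }
}
```

With these inputs the sketch reports 900 MiB, which is why shrinking each histogram matters more as the task count grows.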
--
To view, visit http://gerrit.cloudera.org:8080/15254
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic7c2a33bc61a2baa38703ea3340a07e06ab39db3
Gerrit-Change-Number: 15254
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Hao Hao <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Thu, 20 Feb 2020 18:30:20 +0000
Gerrit-HasComments: Yes
