[ https://issues.apache.org/jira/browse/HBASE-6066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-6066:
-------------------------
Priority: Critical (was: Minor)
Tags: noob
Labels: noob (was: )
I upped the priority on this so we're less likely to forget about it. It seems
like small changes can make a big improvement. Tagged it noob since the hard
part -- the prescription -- has been done by two gentlemen software engineers
(Kannan and Todd).
> some low hanging read path improvement ideas
> ---------------------------------------------
>
> Key: HBASE-6066
> URL: https://issues.apache.org/jira/browse/HBASE-6066
> Project: HBase
> Issue Type: Improvement
> Reporter: Kannan Muthukkaruppan
> Priority: Critical
> Labels: noob
> Attachments: metric-stringbuilder-fix.patch
>
>
> I was running some single-threaded scan performance tests for a table with
> small rows that is fully cached. Some observations:
> We seem to be doing several wasteful iterations over and/or building of
> temporary lists.
> 1) One such is the following code in HRegionServer.next():
> {code}
> boolean moreRows = s.next(values, HRegion.METRIC_NEXTSIZE);
> if (!values.isEmpty()) {
>   for (KeyValue kv : values) { // wasteful in most cases
>     currentScanResultSize += kv.heapSize();
>   }
>   results.add(new Result(values));
> }
> {code}
> By default the "maxScannerResultSize" is Long.MAX_VALUE. In those cases,
> we can avoid the unnecessary iteration to compute currentScanResultSize.
> 2) An example of a wasteful temporary array is "results" in
> RegionScanner.next().
> {code}
> results.clear();
> boolean returnResult = nextInternal(limit, metric);
> outResults.addAll(results);
> {code}
> results then gets copied over to outResults via an addAll(). Not sure why we
> cannot collect the results directly in outResults.
> 3) Another, very similar example of a wasteful array is "results" in
> StoreScanner.next(), which eventually also copies its results into
> "outResults".
> 4) Reduce overhead of "size metric" maintained in StoreScanner.next().
> {code}
> if (metric != null) {
>   HRegion.incrNumericMetric(this.metricNamePrefix + metric,
>       copyKv.getLength());
> }
> results.add(copyKv);
> {code}
> A single call to next() might fetch a lot of KVs. We can first add up the
> size of those KVs in a local variable and then in a finally clause increment
> the metric one shot, rather than updating AtomicLongs for each KV.
> 5) RegionScanner.next() calls a helper RegionScanner.next() on the same
> object. Both are synchronized methods. Synchronized methods calling nested
> synchronized methods on the same object are probably adding some small
> overhead. The inner next() also calls isFilterDone(), which is a
> synchronized method as well. We should refactor the code to avoid these
> nested synchronized methods.
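
For anyone picking this up, here is a rough, standalone sketch of ideas 1), 2)/3)
and 4) from the description. It is not the HBase code: ScanBatchingSketch, Cell,
sizeMetric, collect() and resultSize() are hypothetical stand-ins (Cell for
KeyValue, the AtomicLong for whatever HRegion.incrNumericMetric() updates, and
getLength() is used for both size computations to keep the example short), so
treat it as an illustration of the pattern only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class ScanBatchingSketch {
    // Hypothetical stand-in for the shared size metric.
    static final AtomicLong sizeMetric = new AtomicLong();

    /** Hypothetical stand-in for KeyValue; it only exposes a length. */
    static final class Cell {
        private final int length;
        Cell(int length) { this.length = length; }
        int getLength() { return length; }
    }

    /**
     * Ideas 2/3 and 4: add cells straight into the caller's list (no temporary
     * list plus addAll()), sum their sizes in a local variable, and touch the
     * shared AtomicLong once per call in a finally clause.
     */
    static void collect(List<Cell> source, List<Cell> outResults) {
        long batchBytes = 0;
        try {
            for (Cell c : source) {
                batchBytes += c.getLength();
                outResults.add(c);
            }
        } finally {
            if (batchBytes > 0) {
                sizeMetric.addAndGet(batchBytes); // one atomic update per call
            }
        }
    }

    /**
     * Idea 1: skip the per-cell size loop when no result size limit is set
     * (the limit is Long.MAX_VALUE), since the total is never compared to it.
     */
    static long resultSize(List<Cell> values, long maxScannerResultSize) {
        if (maxScannerResultSize == Long.MAX_VALUE) {
            return 0L; // no limit configured, nothing to track
        }
        long size = 0;
        for (Cell c : values) {
            size += c.getLength();
        }
        return size;
    }

    public static void main(String[] args) {
        List<Cell> source = new ArrayList<>();
        source.add(new Cell(10));
        source.add(new Cell(20));
        List<Cell> out = new ArrayList<>();
        collect(source, out);
        System.out.println("cells=" + out.size() + " metric=" + sizeMetric.get());
    }
}
```

The finally clause keeps the metric accurate even if the loop throws part way
through, which matches the "one shot" increment suggested in 4).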