[
https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16569146#comment-16569146
]
Hudson commented on HBASE-20866:
--------------------------------
Results for branch branch-1
[build #402 on
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/402/]:
(x) *{color:red}-1 overall{color}*
----
details (if available):
(x) {color:red}-1 general checks{color}
-- For more information [see general
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/402//General_Nightly_Build_Report/]
(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/402//JDK7_Nightly_Build_Report/]
(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2)
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/402//JDK8_Nightly_Build_Report_(Hadoop2)/]
(x) {color:red}-1 source release artifact{color}
-- See build output for details.
> HBase 1.x scan performance degradation compared to 0.98 version
> ---------------------------------------------------------------
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
> Issue Type: Bug
> Components: Performance
> Affects Versions: 1.3.2
> Reporter: Vikas Vishwakarma
> Assignee: Vikas Vishwakarma
> Priority: Critical
> Fix For: 1.3.3
>
> Attachments: HBASE-20866.branch-1.3.001.patch,
> HBASE-20866.branch-1.3.002.patch, HBASE-20866.branch-1.3.003.patch
>
>
> Internally, while testing 1.3 as part of our migration from 0.98 to 1.3, we
> observed degradation in scan performance for Phoenix queries, ranging from a
> few tens of percent up to 200% depending on the query being executed. We also
> tried a simple native HBase scan, and there too we saw up to 40% degradation
> in performance when the number of column qualifiers is high (40-50+).
> To identify the root cause of the performance difference between 0.98 and
> 1.3, we carried out many experiments with profiling and git bisect
> iterations; however, we were not able to identify any single source of the
> scan performance degradation. It appears to be an accumulated degradation of
> 5-10% across various enhancements and refactorings.
> We identified a few major enhancements, such as partialResult handling,
> ScannerContext with heartbeat processing, time/size limiting, and RPC
> refactoring, each of which could have contributed a small degradation in
> performance that, put together, could lead to a large overall degradation.
> One of these changes is
> [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544], which
> implements partialResult handling. In ClientScanner.java, the results
> received from the server are cached on the client side by converting the
> result array into an ArrayList. This conversion is called in a loop whose
> iteration count depends on the number of rows in the scan result; for
> example, when tens of millions of rows are scanned, it can be called on the
> order of millions of times.
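> The per-batch cost described above can be sketched outside HBase as follows.
> This is a minimal, hypothetical stand-in (plain Strings instead of Result
> objects and a List as the cache; none of these classes are actual HBase
> code), showing the extra allocation and second traversal that the
> intermediate ArrayList introduces for every server batch:
> {code:java}
> import java.util.ArrayList;
> import java.util.List;
>
> // Hypothetical sketch of the two caching strategies, not HBase code.
> public class BatchCachingSketch {
>
>   // Mirrors the existing path: array -> intermediate ArrayList -> cache.
>   static List<String> cacheViaIntermediateList(String[] batch, List<String> cache) {
>     List<String> toAdd = new ArrayList<>(); // extra allocation per batch
>     for (String r : batch) {
>       toAdd.add(r);                         // first pass: copy into the list
>     }
>     for (String r : toAdd) {
>       cache.add(r);                         // second pass: copy into the cache
>     }
>     return cache;
>   }
>
>   // Mirrors the proposed path: array -> cache, no intermediate list.
>   static List<String> cacheDirectly(String[] batch, List<String> cache) {
>     for (String r : batch) {
>       cache.add(r);                         // single pass, no extra allocation
>     }
>     return cache;
>   }
> }
> {code}
> Both paths produce an identical cache; the direct path merely skips one list
> allocation and one traversal per server batch, which adds up when the loop is
> executed millions of times.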
> In almost all cases (99% of the time, except when handling partial results,
> etc.), we simply take resultsFromServer, convert it into an ArrayList
> resultsToAddToCache in addResultsToList(..), and then iterate over that list
> again, adding it to the cache in loadCache(..), along the code path below:
> ClientScanner → loadCache(..) → getResultsToAddToCache(..) →
> addResultsToList(..)
> {code:java}
> loadCache() {
>   ...
>   List<Result> resultsToAddToCache =
>       getResultsToAddToCache(values, callable.isHeartbeatMessage());
>   ...
>   for (Result rs : resultsToAddToCache) {
>     rs = filterLoadedCell(rs);
>     cache.add(rs);
>     ...
>   }
> }
>
> getResultsToAddToCache(..) {
>   ...
>   final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>   final boolean allowPartials = scan != null && scan.getAllowPartialResults();
>   ...
>   if (allowPartials || isBatchSet) {
>     addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>         (null == resultsFromServer ? 0 : resultsFromServer.length));
>     return resultsToAddToCache;
>   }
>   ...
> }
>
> private void addResultsToList(List<Result> outputList, Result[] inputArray,
>     int start, int end) {
>   if (inputArray == null || start < 0 || end > inputArray.length) return;
>   for (int i = start; i < end; i++) {
>     outputList.add(inputArray[i]);
>   }
> }
> {code}
>
> It looks like we can avoid the result array to ArrayList conversion
> (resultsFromServer --> resultsToAddToCache) for the first case, which is also
> the most frequent one, and instead directly take the values array returned by
> the callable and add it to the cache without converting it into an ArrayList.
> I have hoisted both flags, allowPartials and isBatchSet, into loadCache(),
> and I directly add values to the scanner cache when the above condition
> passes, instead of converting them into an ArrayList by calling
> getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
>   Result[] values = null;
>   ...
>   final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>   final boolean allowPartials = scan != null && scan.getAllowPartialResults();
>   ...
>   for (;;) {
>     try {
>       values = call(callable, caller, scannerTimeout);
>       ...
>     } catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
>       ...
>     }
>     if (allowPartials || isBatchSet) { // DIRECTLY COPY values TO CACHE
>       if (values != null) {
>         for (int v = 0; v < values.length; v++) {
>           Result rs = values[v];
>           ...
>           cache.add(rs);
>           ...
>         }
>       }
>     } else { // DO ALL THE REGULAR PARTIAL RESULT HANDLING
>       List<Result> resultsToAddToCache =
>           getResultsToAddToCache(values, callable.isHeartbeatMessage());
>       for (Result rs : resultsToAddToCache) {
>         ...
>         cache.add(rs);
>         ...
>       }
>     }
>     ...
>   }
> }
> {code}
>
> I am seeing up to 10% improvement in scan time with these changes; sample PE
> execution results are given below.
> ||PE (1M, 1 thread)||with addResultsToList||without addResultsToList||% improvement||
> |ScanTest|9228|8448|9|
> |RandomScanWithRange10Test|393413|378222|4|
> |RandomScanWithRange100Test|1041860|980147|6|
> Similarly, we are observing up to 10% improvement in a simple native HBase
> scan test used internally that just scans through a large region, filtering
> out all the rows. I still have to run the Phoenix query tests with this
> change. Posting these initial observations for feedback, comments, and
> suggestions.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)