[ 
https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16558097#comment-16558097
 ] 

Vikas Vishwakarma edited comment on HBASE-20896 at 7/26/18 9:47 AM:
--------------------------------------------------------------------

Summarizing the changes

The following array conversions and iterations are avoided by adding the
results directly to the cache and updating the relevant counts while loading
the cache, instead of doing these as separate steps.

In CompleteScanResultCache.java, the separate array creation in prependCombined
and the separate array iteration for updating numberOfCompleteRows have been
avoided:
{code:java}
private Result[] prependCombined(Result[] results, int length) throws IOException {
  ....
  Result[] prependResults = new Result[length + 1];
  prependResults[0] = combine();
  System.arraycopy(results, start, prependResults, 1, length);
  return prependResults;
}

private Result[] updateNumberOfCompleteResultsAndReturn(Result... results) {
  numberOfCompleteRows += results.length;
  return results;
}
{code}
In BatchScanResultCache.java, building the regroupedResults ArrayList and
converting that ArrayList to an array have been avoided:
{code:java}
public Result[] addAndGet(Result[] results, boolean isHeartbeatMessage) throws IOException {
  ...
  List<Result> regroupedResults = new ArrayList<>();
  for (Result result : results) {
  ..
  regroupedResults.add(...);
  ..
  return regroupedResults.toArray(new Result[0]);
}
{code}
In AllowPartialScanResultCache.java, we avoid copying the array into a subarray
and the separate iteration over the array to update numberOfCompleteRows:
{code:java}
public Result[] addAndGet(Result[] results, boolean isHeartbeatMessage) throws IOException {
  ...
  if (i > 0) {
    results = Arrays.copyOfRange(results, i, results.length);
  }
  for (Result result : results) {
    if (!result.mayHaveMoreCellsInRow()) {
      numberOfCompleteRows++;
    }
  }
  return results;
}
{code}

At a high level, the above has been replaced by adding the results directly to
the cache that is passed in from ClientScanner.java, and by updating
numberOfCompleteRows as each element is added to the cache.

ConnectionUtils.java:
{code}
  public static ScanResultCache createScanResultCache(Scan scan, LinkedList<Result> cache) {
    if (scan.getAllowPartialResults()) {
      return new AllowPartialScanResultCache(cache);
    } else if (scan.getBatch() > 0) {
      return new BatchScanResultCache(cache, scan.getBatch());
    } else {
      return new CompleteScanResultCache(cache);
    }
  }
{code}

Each scanner cache then uses the following, where addResultToCache does the
same work that was previously done in ScanResultCache.java:
{code}
  protected void checkUpdateNumberOfCompleteRowsAndCache(Result rs) {
    numberOfCompleteRows++;
    addResultToCache(rs);
  }
....
  protected void addResultToCache(Result rs) {
    cache.add(rs);
    for (Cell cell : rs.rawCells()) {
      resultSize += CellUtil.estimatedHeapSizeOf(cell);
    }
    count++;
    lastResult = rs;
  }

{code}
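
As a rough illustration of the idea summarized above, the sketch below contrasts the copy-then-count pattern with a single pass that adds to a shared cache and updates the count at the same time. It is not the patch itself: FakeResult, DirectCacheSketch, countAfterCopy and addDirectly are hypothetical names made up for this example, and FakeResult only models the mayHaveMoreCellsInRow flag of a real Result.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

final class DirectCacheSketch {

  /** Hypothetical stand-in for an HBase Result; it only carries the partial-row flag. */
  static final class FakeResult {
    final String row;
    final boolean mayHaveMoreCellsInRow;

    FakeResult(String row, boolean mayHaveMoreCellsInRow) {
      this.row = row;
      this.mayHaveMoreCellsInRow = mayHaveMoreCellsInRow;
    }
  }

  /** Old pattern: copy the tail into a new array, count in a second pass, then load the cache. */
  static int countAfterCopy(FakeResult[] results, int skip, List<FakeResult> cache) {
    FakeResult[] kept = skip > 0 ? Arrays.copyOfRange(results, skip, results.length) : results;
    int completeRows = 0;
    for (FakeResult r : kept) {
      if (!r.mayHaveMoreCellsInRow) {
        completeRows++;
      }
    }
    cache.addAll(Arrays.asList(kept));
    return completeRows;
  }

  /** New pattern: one pass, no intermediate array; count while adding to the shared cache. */
  static int addDirectly(FakeResult[] results, int skip, List<FakeResult> cache) {
    int completeRows = 0;
    for (int i = skip; i < results.length; i++) {
      FakeResult r = results[i];
      if (!r.mayHaveMoreCellsInRow) {
        completeRows++;
      }
      cache.add(r);
    }
    return completeRows;
  }

  public static void main(String[] args) {
    FakeResult[] results = {
      new FakeResult("r1", false),
      new FakeResult("r2", true),
      new FakeResult("r2", false),
      new FakeResult("r3", false)
    };
    List<FakeResult> oldCache = new LinkedList<>();
    List<FakeResult> newCache = new LinkedList<>();
    int oldCount = countAfterCopy(results, 1, oldCache);
    int newCount = addDirectly(results, 1, newCache);
    System.out.println(oldCount + " " + newCount + " " + oldCache.equals(newCache));
  }
}
```

Both methods leave the cache with the same contents and report the same number of complete rows; the second simply avoids the extra array allocation and the extra pass over the results.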


> Port HBASE-20866 to branch-1 and branch-1.4 
> --------------------------------------------
>
>                 Key: HBASE-20896
>                 URL: https://issues.apache.org/jira/browse/HBASE-20896
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Andrew Purtell
>            Assignee: Vikas Vishwakarma
>            Priority: Major
>             Fix For: 1.5.0, 1.4.7
>
>         Attachments: HBASE-20896.branch-1.4.001.patch, 
> HBASE-20896.branch-1.4.002.patch, HBASE-20896.branch-1.4.003.patch
>
>




