Hongjiang Zhang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17773 )

Change subject: KUDU-1260: Fix prefetching bug on Java scanner
......................................................................


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17773/4/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduScanner.java
File 
java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduScanner.java:

http://gerrit.cloudera.org:8080/#/c/17773/4/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduScanner.java@290
PS4, Line 290:    * Every time the client calls nextRows(), a new RPC together 
with a prefetching RPC
             :    * are sent to the server side, so the queue will not have 
more than 2 items.
> Right -- every Deferred in the queue has its prefetch callback, and the cal
Hmm, there is no explicitly code to guarantee "the callback for the added 
Deferred called before nextRows() called by the user", but if there are two 
items in the queue, we know neither of them chains the prefetch because only 
dequeued Deferred can chain prefetch, which means if both of them were called, 
no new item was added to the queue.


http://gerrit.cloudera.org:8080/#/c/17773/4/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduScanner.java@642
PS4, Line 642:       Deferred<RowResultIterator> prefetcherDeferred = 
cachedPrefetcherDeferred.poll();
             :       prefetcherDeferred.chain(new 
Deferred<RowResultIterator>().addCallback(prefetch));
> Right, but the next prefetching makes sense only after the latest prefetch
It is risky to chain prefetch for the Deferred still in the queue, which may 
enqueue more than 2 items. That is why only dequeued Deferred chains prefetch 
callback.

In addition, two preteching happens if user calls nextRows() twice like the 
following, which is not a proper usage.

while(it.hasMore()) {
  it.nextRows();
  it.nextRows();
}

Here the main concern is adding more prefetch callbacks than 2, which can avoid 
extra IO on the server. I think we can add another check:

      Deferred<RowResultIterator> prefetcherDeferred = 
cachedPrefetcherDeferred.poll();
      if (cachedPrefetcherDeferred.size() < CACHE_CAPACITY) {
         prefetcherDeferred.chain(new 
Deferred<RowResultIterator>().addCallback(prefetch));
      }



--
To view, visit http://gerrit.cloudera.org:8080/17773
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I853a041d86c75ec196d7d4ff45af4673c5c5f5cd
Gerrit-Change-Number: 17773
Gerrit-PatchSet: 7
Gerrit-Owner: Hongjiang Zhang <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Hongjiang Zhang <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 20 Aug 2021 04:12:49 +0000
Gerrit-HasComments: Yes

Reply via email to