Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/17773 )
Change subject: KUDU-1260: Fix prefetching bug on Java scanner ...................................................................... Patch Set 7: (2 comments) Thank you very much for iterating on this patch! Just curious -- is it possible to simplify things a bit, switching to trivial case of queue of size 1 and using blocking put() method instead of add() when adding next pre-fetch deferred into the queue? http://gerrit.cloudera.org:8080/#/c/17773/4/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduScanner.java File java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduScanner.java: http://gerrit.cloudera.org:8080/#/c/17773/4/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduScanner.java@290 PS4, Line 290: * Every time the client calls nextRows(), a new RPC together with a prefetching RPC : * are sent to the server side, so the queue will not have more than 2 items. > Hmm, there is no explicitly code to guarantee "the callback for the added D I'm just trying to understand how we could make this code bullet-proof in the sense of not trying to put too many pre-fetch RPCs into the queue, so we are sure the code doesn't unexpectedly throw an exception when pre-fetching RPCs return fast. What if we make the queue to be just size 1 and call cachedPrefetcherDeferred.put(prefetcherDeferred) instead of cachedPrefetcherDeferred.add(prefetcherDeferred), basically switching to blocking inserts into the queue? Could it work as expected? http://gerrit.cloudera.org:8080/#/c/17773/4/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduScanner.java@642 PS4, Line 642: Deferred<RowResultIterator> prefetcherDeferred = cachedPrefetcherDeferred.poll(); : prefetcherDeferred.chain(new Deferred<RowResultIterator>().addCallback(prefetch)); > It is risky to chain prefetch for the Deferred still in the queue, which ma Yep, I think ideally we don't want to schedule many prefetches to run in parallel. If we can get to some reliable way to have no more than one active prefetch at any time, that would be great. What if the queue were of size 1 and instead of add() the code would call put()? Could it work? -- To view, visit http://gerrit.cloudera.org:8080/17773 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I853a041d86c75ec196d7d4ff45af4673c5c5f5cd Gerrit-Change-Number: 17773 Gerrit-PatchSet: 7 Gerrit-Owner: Hongjiang Zhang <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Grant Henke <[email protected]> Gerrit-Reviewer: Hongjiang Zhang <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Tue, 24 Aug 2021 05:39:16 +0000 Gerrit-HasComments: Yes
