Repository: kudu
Updated Branches:
  refs/heads/master 56cc69226 -> 591e6ddfd


[java client] ITClient's new snapshot scanners can easily timeout

This patch makes it so that a scanner timeout won't kill the test. It also
takes care of a typo in related javadoc.

The failure in question:
19:56:29.459 [ERROR - Thread-10] (ITClient.java:134) Got error while getting 
row 260
org.apache.kudu.client.NonRecoverableException: could not wait for desired 
snapshot timestamp to be consistent: Timed out waiting for all transactions 
with ts < P: 1475006188645381 usec, L: 0 to commit
        at 
org.apache.kudu.client.TabletClient.dispatchTSErrorOrReturnException(TabletClient.java:548)
        at org.apache.kudu.client.TabletClient.decode(TabletClient.java:482)
        at org.apache.kudu.client.TabletClient.decode(TabletClient.java:83)
...

On the server-side:
W0927 19:56:29.455293   746 tablet_service.cc:1509] Error setting up scanner 
with request new_scan_request { tablet_id: "5584580c8ff3463e8092814f4b45ee99" 
limit: 9223372036854775807 projected_columns { name: "key" type: INT32 is_key: 
true is_nullable: false encoding: AUTO_ENCODING compression: 
DEFAULT_COMPRESSION cfile_block_size: 0 } projected_columns { name: "column1_i" 
type: INT32 is_key: false is_nullable: false encoding: AUTO_ENCODING 
compression: DEFAULT_COMPRESSION cfile_block_size: 0 } projected_columns { 
name: "column2_i" type: INT32 is_key: false is_nullable: false encoding: 
AUTO_ENCODING compression: DEFAULT_COMPRESSION cfile_block_size: 0 } 
projected_columns { name: "column3_s" type: STRING is_key: false is_nullable: 
true encoding: DICT_ENCODING compression: LZ4 cfile_block_size: 4096 } 
projected_columns { name: "column4_b" type: BOOL is_key: false is_nullable: 
false encoding: AUTO_ENCODING compression: DEFAULT_COMPRESSION 
cfile_block_size: 0 } read_mode: READ_AT_SNAPSHOT
  snap_timestamp: 6041625348691480576 propagated_timestamp: 6041625348556406784 
cache_blocks: true order_mode: UNORDERED column_predicates { column: "key" 
equality { value: "\004\001\000\000" } } } batch_size_bytes: 1048576
W0927 19:56:29.457337   746 rpcz_store.cc:234] Call 
kudu.tserver.TabletServerService.Scan from 127.0.0.1:44116 (request call id 3) 
took 492ms (client timeout 500).
W0927 19:56:29.458102   746 rpcz_store.cc:238] Trace:
 0927 19:56:28.964746 (+     0us) service_pool.cc:143] Inserting onto call queue
 0927 19:56:28.965109 (+   363us) service_pool.cc:202] Handling call
 0927 19:56:28.967353 (+  2244us) tablet_service.cc:1466] Creating iterator
 0927 19:56:28.967400 (+    47us) tablet_service.cc:1782] Waiting for 
operations in snapshot to commit
 0927 19:56:29.455140 (+487740us) tablet_service.cc:1485] Iterator created
 0927 19:56:29.455202 (+    62us) tablet_service.cc:1500] Iterator init: Timed 
out: could not wait for desired snapshot timestamp to be consistent: Timed out 
waiting for all transactions with ts < P: 1475006188645381 usec, L: 0 to commit

Change-Id: I08de2fcd6736c2bc75ef0c92c8c8f48f9d7da49e
Reviewed-on: http://gerrit.cloudera.org:8080/4545
Reviewed-by: Adar Dembo <[email protected]>
Tested-by: Kudu Jenkins


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/591e6ddf
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/591e6ddf
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/591e6ddf

Branch: refs/heads/master
Commit: 591e6ddfdda0e685dba3ce4ca3ebaa9bc6731c0c
Parents: 56cc692
Author: Jean-Daniel Cryans <[email protected]>
Authored: Tue Sep 27 14:13:44 2016 -0700
Committer: Jean-Daniel Cryans <[email protected]>
Committed: Tue Sep 27 22:34:50 2016 +0000

----------------------------------------------------------------------
 .../src/test/java/org/apache/kudu/client/ITClient.java   | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/591e6ddf/java/kudu-client/src/test/java/org/apache/kudu/client/ITClient.java
----------------------------------------------------------------------
diff --git 
a/java/kudu-client/src/test/java/org/apache/kudu/client/ITClient.java 
b/java/kudu-client/src/test/java/org/apache/kudu/client/ITClient.java
index c482a86..d4ffec0 100644
--- a/java/kudu-client/src/test/java/org/apache/kudu/client/ITClient.java
+++ b/java/kudu-client/src/test/java/org/apache/kudu/client/ITClient.java
@@ -409,9 +409,18 @@ public class ITClient extends BaseKuduTest {
      * We need to do this because the scans in this client aren't fault 
tolerant.
      * @param message message to print if the exception contains a real error
      * @param e the exception to check
-     * @return true if the scanner failed because it wasn't false, otherwise 
false
+     * @return true if the scanner failed on a non-FATAL error, otherwise 
false which will kill
+     *         this test
      */
     private boolean checkAndReportError(String message, KuduException e) {
+      // It's possible to get timeouts if we're unlucky. A particularly common 
one is
+      // "could not wait for desired snapshot timestamp to be consistent" 
since we're using
+      // READ_AT_SNAPSHOT scanners.
+      // TODO revisit once KUDU-1656 is taken care of.
+      if (e.getStatus().isTimedOut()) {
+        LOG.warn("Received a scan timeout", e);
+        return true;
+      }
       // Do nasty things, expect nasty results. The scanners are a bit too 
happy to retry TS
       // disconnections so we might end up retrying a scanner on a node that 
restarted, or we might
       // get disconnected just after sending an RPC so when we reconnect to 
the same TS we might get

Reply via email to