Repository: kudu
Updated Branches:
refs/heads/master 56cc69226 -> 591e6ddfd
[java client] ITClient's new snapshot scanners can easily timeout
This patch makes it so that a scanner timeout won't kill the test. It also
takes care of a typo in related javadoc.
The failure in question:
19:56:29.459 [ERROR - Thread-10] (ITClient.java:134) Got error while getting
row 260
org.apache.kudu.client.NonRecoverableException: could not wait for desired
snapshot timestamp to be consistent: Timed out waiting for all transactions
with ts < P: 1475006188645381 usec, L: 0 to commit
at
org.apache.kudu.client.TabletClient.dispatchTSErrorOrReturnException(TabletClient.java:548)
at org.apache.kudu.client.TabletClient.decode(TabletClient.java:482)
at org.apache.kudu.client.TabletClient.decode(TabletClient.java:83)
...
On the server-side:
W0927 19:56:29.455293 746 tablet_service.cc:1509] Error setting up scanner
with request new_scan_request { tablet_id: "5584580c8ff3463e8092814f4b45ee99"
limit: 9223372036854775807 projected_columns { name: "key" type: INT32 is_key:
true is_nullable: false encoding: AUTO_ENCODING compression:
DEFAULT_COMPRESSION cfile_block_size: 0 } projected_columns { name: "column1_i"
type: INT32 is_key: false is_nullable: false encoding: AUTO_ENCODING
compression: DEFAULT_COMPRESSION cfile_block_size: 0 } projected_columns {
name: "column2_i" type: INT32 is_key: false is_nullable: false encoding:
AUTO_ENCODING compression: DEFAULT_COMPRESSION cfile_block_size: 0 }
projected_columns { name: "column3_s" type: STRING is_key: false is_nullable:
true encoding: DICT_ENCODING compression: LZ4 cfile_block_size: 4096 }
projected_columns { name: "column4_b" type: BOOL is_key: false is_nullable:
false encoding: AUTO_ENCODING compression: DEFAULT_COMPRESSION
cfile_block_size: 0 } read_mode: READ_AT_SNAPSHOT
snap_timestamp: 6041625348691480576 propagated_timestamp: 6041625348556406784
cache_blocks: true order_mode: UNORDERED column_predicates { column: "key"
equality { value: "\004\001\000\000" } } } batch_size_bytes: 1048576
W0927 19:56:29.457337 746 rpcz_store.cc:234] Call
kudu.tserver.TabletServerService.Scan from 127.0.0.1:44116 (request call id 3)
took 492ms (client timeout 500).
W0927 19:56:29.458102 746 rpcz_store.cc:238] Trace:
0927 19:56:28.964746 (+ 0us) service_pool.cc:143] Inserting onto call queue
0927 19:56:28.965109 (+ 363us) service_pool.cc:202] Handling call
0927 19:56:28.967353 (+ 2244us) tablet_service.cc:1466] Creating iterator
0927 19:56:28.967400 (+ 47us) tablet_service.cc:1782] Waiting for
operations in snapshot to commit
0927 19:56:29.455140 (+487740us) tablet_service.cc:1485] Iterator created
0927 19:56:29.455202 (+ 62us) tablet_service.cc:1500] Iterator init: Timed
out: could not wait for desired snapshot timestamp to be consistent: Timed out
waiting for all transactions with ts < P: 1475006188645381 usec, L: 0 to commit
Change-Id: I08de2fcd6736c2bc75ef0c92c8c8f48f9d7da49e
Reviewed-on: http://gerrit.cloudera.org:8080/4545
Reviewed-by: Adar Dembo <[email protected]>
Tested-by: Kudu Jenkins
Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/591e6ddf
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/591e6ddf
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/591e6ddf
Branch: refs/heads/master
Commit: 591e6ddfdda0e685dba3ce4ca3ebaa9bc6731c0c
Parents: 56cc692
Author: Jean-Daniel Cryans <[email protected]>
Authored: Tue Sep 27 14:13:44 2016 -0700
Committer: Jean-Daniel Cryans <[email protected]>
Committed: Tue Sep 27 22:34:50 2016 +0000
----------------------------------------------------------------------
.../src/test/java/org/apache/kudu/client/ITClient.java | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/kudu/blob/591e6ddf/java/kudu-client/src/test/java/org/apache/kudu/client/ITClient.java
----------------------------------------------------------------------
diff --git
a/java/kudu-client/src/test/java/org/apache/kudu/client/ITClient.java
b/java/kudu-client/src/test/java/org/apache/kudu/client/ITClient.java
index c482a86..d4ffec0 100644
--- a/java/kudu-client/src/test/java/org/apache/kudu/client/ITClient.java
+++ b/java/kudu-client/src/test/java/org/apache/kudu/client/ITClient.java
@@ -409,9 +409,18 @@ public class ITClient extends BaseKuduTest {
* We need to do this because the scans in this client aren't fault
tolerant.
* @param message message to print if the exception contains a real error
* @param e the exception to check
- * @return true if the scanner failed because it wasn't false, otherwise
false
+ * @return true if the scanner failed on a non-FATAL error, otherwise
false which will kill
+ * this test
*/
private boolean checkAndReportError(String message, KuduException e) {
+ // It's possible to get timeouts if we're unlucky. A particularly common
one is
+ // "could not wait for desired snapshot timestamp to be consistent"
since we're using
+ // READ_AT_SNAPSHOT scanners.
+ // TODO revisit once KUDU-1656 is taken care of.
+ if (e.getStatus().isTimedOut()) {
+ LOG.warn("Received a scan timeout", e);
+ return true;
+ }
// Do nasty things, expect nasty results. The scanners are a bit too
happy to retry TS
// disconnections so we might end up retrying a scanner on a node that
restarted, or we might
// get disconnected just after sending an RPC so when we reconnect to
the same TS we might get