Adar Dembo created KUDU-2668:
--------------------------------
Summary: TestKuduClient.readYourWrites tests are flaky
Key: KUDU-2668
URL: https://issues.apache.org/jira/browse/KUDU-2668
Project: Kudu
Issue Type: Bug
Components: java, test
Affects Versions: 1.9.0
Reporter: Adar Dembo
I looped TestKuduClient 1000 times in dist-test while working on another
problem, and saw the following failures:
{noformat}
1 testReadYourWritesBatchLeaderReplica
14 testReadYourWritesSyncClosestReplica
15 testReadYourWritesSyncLeaderReplica
{noformat}
In all cases, the stack trace of the failure was effectively this:
{noformat}
java.util.concurrent.ExecutionException: java.lang.AssertionError
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at
org.apache.kudu.client.TestKuduClient.readYourWrites(TestKuduClient.java:1113)
...
Caused by: java.lang.AssertionError
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at
org.apache.kudu.client.TestKuduClient$2.call(TestKuduClient.java:1098)
at
org.apache.kudu.client.TestKuduClient$2.call(TestKuduClient.java:1055)
...
{noformat}
The offending lines:
{code}
AsyncKuduScanner scanner = asyncClient.newScannerBuilder(table)
.readMode(AsyncKuduScanner.ReadMode.READ_YOUR_WRITES)
.replicaSelection(replicaSelection)
.build();
KuduScanner syncScanner = new KuduScanner(scanner);
long preTs = asyncClient.getLastPropagatedTimestamp();
assertNotEquals(AsyncKuduClient.NO_TIMESTAMP,
asyncClient.getLastPropagatedTimestamp());
long row_count = countRowsInScan(syncScanner);
long expected_count = 100L * (i + 1);
assertTrue(expected_count <= row_count);
// After the scan, verify that the chosen snapshot timestamp is
// returned from the server and it is larger than the previous
// propagated timestamp.
assertNotEquals(AsyncKuduClient.NO_TIMESTAMP,
scanner.getSnapshotTimestamp());
--> assertTrue(preTs < scanner.getSnapshotTimestamp());
{code}
It's possible that this is just test flakiness, but I'm setting a higher
priority so we can understand whether that's the case, or whether there's
something wrong with read-your-writes scans.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)