Alexey says that KuduSession::Flush() already blocks on all outstanding async flushes, and apparently that's true, so either there is a non-obvious bug there or it's caused by something else.
Mike On Tue, Nov 22, 2016 at 4:46 PM, Mike Percy <[email protected]> wrote: > Hi Alexey, > Thank you very much for investigating this test and providing an easy > repro case. I just got around to spending some time on this due to David's > read-your-writes (RYW) patches adding to this test. I think we should try > to make this test reliable to be sure RYW doesn't have anomalies we aren't > aware of. I applied your (second) patch mentioned earlier in this thread > and was able to quickly reproduce the issues you described. > > I suspect that the problem stems from the fact that in the C++ client, > session->Flush() is a no-op if an async flush is already in-progress. At > least, it looks to me like those are the current semantics. If I remove the > session->FlushAsync(null) from your patch then the test passes for me. > > The problem with a non-blocking Flush() while an async flush is > outstanding that is that tablet_history_gc-itest relies on the following > series of operations in order to ensure it knows what the consistent set of > values should be for a particular timestamp: > > FlushSessionOrDie(session); > } > SaveSnapshot(std::move(snapshot), clock_->Now()); > > So it waits until the data is flushed to the server and then captures the > current server timestamp as a key for its record of expected data values at > that timestamp. The above appears in all of the write code paths in the > test. If FlushSessionOrDie() is an async no-op, and the client has not > truly flushed outstanding data at the current time stamp, then the > "SaveSnapshot()" code path will think the latest writes are included in the > snapshot for the current time, but in actuality the old value will be > stored for that timestamp on the server. > > This matches with the observed behavior based on running the (patched) > test with --vmodule=tablet_history_gc-itest=2 and seeing the following in > the log: > > I1122 15:42:48.355686 15303 tablet_history_gc-itest.cc:166] Saving > snapshot at ts = P: 1000000000 usec, L: 1 (4096000000001) > I1122 15:42:48.355768 15303 tablet_history_gc-itest.cc:380] Starting > round 1 > ... > I1122 15:42:48.355917 15303 tablet_history_gc-itest.cc:463] Updating row > to { 511, 1535296252, 1535296251, NOT_DELETED } > I1122 15:42:48.355926 15303 tablet_history_gc-itest.cc:463] Updating row > to { 511, 1708088659, 1708088658, NOT_DELETED } > ... > I1122 15:42:48.975769 15303 tablet_history_gc-itest.cc:166] Saving > snapshot at ts = P: 1000000000 usec, L: 514 (4096000000514) > I1122 15:42:48.975796 15303 tablet_history_gc-itest.cc:496] Updated 512 > rows > I1122 15:42:48.975807 15303 tablet_history_gc-itest.cc:380] Starting > round 2 > ... > I1122 15:42:54.547406 15303 tablet_history_gc-itest.cc:213] Round 26: > Verifying snapshot scan for timestamp P: 1000000000 usec, L: 1559 > (4096000001559) > ../../src/kudu/integration-tests/tablet_history_gc-itest.cc:242: Failure > Value of: int_val > Actual: 1535296252 > Expected: snap_iter->second.int_val > Which is: 1708088659 > at row key 511 > > So I think we have a couple options: > > 1. Make KuduSession::Flush() "timeline-consistent" (block until all > outstanding flushes complete), or > 2. Only use manual flushing in this test​ > > Personally, I think the current observed semantics of KuduSession::Flush() > are not ideal and I'd advocate that we do #1. > > Mike > >
