Alexey says that KuduSession::Flush() already blocks on all outstanding
async flushes, and apparently that's true, so either there is a non-obvious
bug there or it's caused by something else.

Mike

On Tue, Nov 22, 2016 at 4:46 PM, Mike Percy <[email protected]> wrote:

> Hi Alexey,
> Thank you very much for investigating this test and providing an easy
> repro case. I just got around to spending some time on this due to David's
> read-your-writes (RYW) patches adding to this test. I think we should try
> to make this test reliable to be sure RYW doesn't have anomalies we aren't
> aware of. I applied your (second) patch mentioned earlier in this thread
> and was able to quickly reproduce the issues you described.
>
> I suspect that the problem stems from the fact that in the C++ client,
> session->Flush() is a no-op if an async flush is already in-progress. At
> least, it looks to me like those are the current semantics. If I remove the
> session->FlushAsync(null) from your patch then the test passes for me.
>
> The problem with a non-blocking Flush() while an async flush is
> outstanding that is that tablet_history_gc-itest relies on the following
> series of operations in order to ensure it knows what the consistent set of
> values should be for a particular timestamp:
>
>           FlushSessionOrDie(session);
>         }
>         SaveSnapshot(std::move(snapshot), clock_->Now());
>
> So it waits until the data is flushed to the server and then captures the
> current server timestamp as a key for its record of expected data values at
> that timestamp. The above appears in all of the write code paths in the
> test. If FlushSessionOrDie() is an async no-op, and the client has not
> truly flushed outstanding data at the current time stamp, then the
> "SaveSnapshot()" code path will think the latest writes are included in the
> snapshot for the current time, but in actuality the old value will be
> stored for that timestamp on the server.
>
> This matches with the observed behavior based on running the (patched)
> test with --vmodule=tablet_history_gc-itest=2 and seeing the following in
> the log:
>
> I1122 15:42:48.355686 15303 tablet_history_gc-itest.cc:166] Saving
> snapshot at ts = P: 1000000000 usec, L: 1 (4096000000001)
> I1122 15:42:48.355768 15303 tablet_history_gc-itest.cc:380] Starting
> round 1
> ...
> I1122 15:42:48.355917 15303 tablet_history_gc-itest.cc:463] Updating row
> to { 511, 1535296252, 1535296251, NOT_DELETED }
> I1122 15:42:48.355926 15303 tablet_history_gc-itest.cc:463] Updating row
> to { 511, 1708088659, 1708088658, NOT_DELETED }
> ...
> I1122 15:42:48.975769 15303 tablet_history_gc-itest.cc:166] Saving
> snapshot at ts = P: 1000000000 usec, L: 514 (4096000000514)
> I1122 15:42:48.975796 15303 tablet_history_gc-itest.cc:496] Updated 512
> rows
> I1122 15:42:48.975807 15303 tablet_history_gc-itest.cc:380] Starting
> round 2
> ...
> I1122 15:42:54.547406 15303 tablet_history_gc-itest.cc:213] Round 26:
> Verifying snapshot scan for timestamp P: 1000000000 usec, L: 1559
> (4096000001559)
> ../../src/kudu/integration-tests/tablet_history_gc-itest.cc:242: Failure
> Value of: int_val
>   Actual: 1535296252
> Expected: snap_iter->second.int_val
> Which is: 1708088659
> at row key 511
>
> So I think we have a couple options:
>
>    1. Make KuduSession::Flush() "timeline-consistent" (block until all
>    outstanding flushes complete), or
>    2. Only use manual flushing in this test​
>
> Personally, I think the current observed semantics of KuduSession::Flush()
> are not ideal and I'd advocate that we do #1.
>
> Mike
>
>

Reply via email to