Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16952 )

Change subject: wip KUDU-2612: background task to commit transaction
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16952/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16952/1//COMMIT_MSG@24
PS1, Line 24: There are some nuances here
> What if the leadership of txn status replica has changed while the commit 
> phase was in progress?  I saw the status of the finalization tasks is being 
> re-fetched from the txn status tablet, but I guess it's better to ask those 
> explicitly.

This is a good question and isn't addressed in this patch yet. Essentially, 
once we get responses from each participant, we'll try to write to the 
transaction status table, and at that point we should take the leadership 
lock. If leadership has changed by then, the task should just exit.
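
To make the idea concrete, here is a rough sketch in Python rather than the actual Kudu C++ code. Every name in it (StatusTablet, NotLeaderError, finish_commit) is hypothetical, and comparing Raft terms is only an assumed way of detecting a leadership change; the patch may end up doing this differently.

```python
from dataclasses import dataclass, field


class NotLeaderError(Exception):
    """Raised when this replica no longer leads the txn status tablet."""


@dataclass
class StatusTablet:
    # Hypothetical stand-in for a txn status tablet replica.
    current_term: int
    records: dict = field(default_factory=dict)

    def write_commit_record(self, txn_id: int, observed_term: int) -> None:
        # In a real system the check and the write would both happen while
        # holding the leadership lock, so the term cannot change in between.
        if observed_term != self.current_term:
            raise NotLeaderError(
                f"term changed from {observed_term} to {self.current_term}")
        self.records[txn_id] = "COMMITTED"


def finish_commit(tablet: StatusTablet, txn_id: int, term_at_start: int) -> bool:
    """Return True if the commit record was written, False if the task exited."""
    try:
        tablet.write_commit_record(txn_id, term_at_start)
    except NotLeaderError:
        # Leadership moved while participants were finalizing: just exit and
        # leave the rest to whichever replica is the leader now.
        return False
    return True
```

The point of the sketch is only that the task remembers which leadership epoch it started in and gives up quietly if that epoch is no longer current when it tries to write.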

> What does TxnStatusManager do?  I guess it would try sending FINALIZE_COMMIT 
> again and again before the whole process times out.  What's next?

It depends on whether the participants ever come back around. If the error is 
transient, we can expect that eventually the FINALIZE_COMMIT will succeed. If 
not, e.g. because the participant was deleted, it's interesting. Right now, the 
behavior is that we would try to abort, but that's at odds with the fact that 
we've already called FINALIZE_COMMIT on some of the participants. To avoid 
this, we could assume that a deleted participant is considered successful 
(which I suppose makes sense, if you imagine the FINALIZE_COMMIT ran and then 
the participant was immediately deleted). For other errors that don't clear 
up, I think we should just stop running the task at a certain point and leave 
partially committed data on the participants. Scans that are waiting on 
non-committed data will just time out.
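
The handling described above could look roughly like the following Python sketch. It is an illustration under assumptions, not the patch's code: the Result values, finalize_all, and the max_attempts bound are all made up here, and the real task would be written in C++ against Kudu's own status codes.

```python
from enum import Enum


class Result(Enum):
    OK = "ok"
    NOT_FOUND = "not_found"      # participant was deleted
    UNAVAILABLE = "unavailable"  # error that may or may not clear up


def finalize_all(participants, send_finalize, max_attempts=3):
    """Send FINALIZE_COMMIT to each participant, retrying failures.

    Returns (done, pending): participants considered finalized, and
    participants that were still failing when the task gave up.
    """
    pending = list(participants)
    done = []
    for _ in range(max_attempts):
        still_pending = []
        for p in pending:
            result = send_finalize(p)
            if result in (Result.OK, Result.NOT_FOUND):
                # A deleted participant counts as success, as if it had been
                # finalized and then immediately deleted.
                done.append(p)
            else:
                still_pending.append(p)
        pending = still_pending
        if not pending:
            break
    # Anything left in `pending` keeps its partially committed data; the task
    # never falls back to aborting after some participants were finalized.
    return done, pending
```

Note that there is deliberately no abort path once FINALIZE_COMMIT has been sent anywhere.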

> Another error handling question is maybe a little bit early to think about 
> since we only want to support REPEATED READ isolation level first: what 
> happens if one of the two (or more) transactions have concurrent write to the 
> same range of data based on the read which can cause write skew.

I don't think that is addressable by changing how we do commits. If we want 
serializability, we need read locks: a transaction should attempt to take a 
read lock before reading, and fail if there's a conflict.
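
As a toy illustration of the read-lock idea (not anything that exists in Kudu today), the Python sketch below fails a transaction as soon as one of its lock requests conflicts. KeyLocks and LockConflict are hypothetical names, and it locks individual keys rather than ranges to keep the example short.

```python
class LockConflict(Exception):
    """Raised to fail a transaction whose lock request conflicts."""


class KeyLocks:
    # Hypothetical per-key lock table: many readers or one writer per key.
    def __init__(self):
        self.readers = {}  # key -> set of txn ids holding read locks
        self.writers = {}  # key -> txn id holding the write lock

    def read_lock(self, txn, key):
        owner = self.writers.get(key)
        if owner is not None and owner != txn:
            # Another transaction is writing this key: fail immediately.
            raise LockConflict(f"txn {txn} cannot read {key!r}")
        self.readers.setdefault(key, set()).add(txn)

    def write_lock(self, txn, key):
        owner = self.writers.get(key)
        other_readers = self.readers.get(key, set()) - {txn}
        if (owner is not None and owner != txn) or other_readers:
            # Someone else has read or written this key: fail immediately.
            raise LockConflict(f"txn {txn} cannot write {key!r}")
        self.writers[key] = txn
```

With this in place, the write-skew pattern (two transactions each reading what the other writes) cannot complete: at least one of them hits a conflict and fails instead of committing.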



--
To view, visit http://gerrit.cloudera.org:8080/16952

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie2258dded3ab3d527cb5d0abdc7d5e7deb4da15e
Gerrit-Change-Number: 16952
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Hao Hao <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 19 Jan 2021 07:44:59 +0000
Gerrit-HasComments: Yes
