David Ribeiro Alves has posted comments on this change.

Change subject: Allow tablet shutdown without completing txs
......................................................................


Patch Set 12:

Posting a transcript of a side chat, for posterity:

...
[1:12 PM] David Alves: no, no other transaction with a timestamp t1 > t should 
be _made visible_ for the tablet
[1:12 PM] David Alves: (not the tserver)
[1:12 PM] Andrew Wong: ah ok
[1:12 PM] Andrew Wong: good
[1:12 PM] David Alves: that's what I meant that I don't care too much about 
actually stopping txns from applying
[1:13 PM] David Alves: but I do care that no new scan at a timestamp > t can 
start (or continue)
[1:13 PM] Andrew Wong: but also i was wondering whether there were any 
preclusions if we want to support multi-tablet txns
[1:13 PM] Andrew Wong: as far as things go _now_, i think we can enforce that 
by changing the replica state
[1:13 PM] David Alves: not sure that is true
[1:14 PM] Andrew Wong: oh?
[1:14 PM] Andrew Wong: hrmm i thought we'd only scan or copy if the replica 
were RUNNING
[1:14 PM] Andrew Wong: let me check that
[1:14 PM] Andrew Wong: or do you mean if we go mid-scan / mid-copy
[1:14 PM] David Alves: I think there's opportunity for a scan to get a replica 
state of RUNNING, then be assigned a timestamp that is > t
[1:14 PM] David Alves: yeah
[1:15 PM] David Alves: so I was suggesting that, instead of propagating the 
status of the apply to the transaction driver, you could instead change the 
state of the mvcc manager internally
[1:16 PM] David Alves: and pass along a shutdown timestamp
[1:16 PM] David Alves: and then, when creating a snapshot on mvcc you would 
check whether the snapshot timestamp is lower than the shutdown timestamp
[1:17 PM] David Alves: this also has the advantage of simplifying your patch, 
no changes to the transaction driver are required
[1:21 PM] Andrew Wong: why bother snapshotting at all if the tablet is going to 
be FAILED/shut down anyway?
[1:22 PM] Andrew Wong: instead of aborting or something
[1:25 PM] Andrew Wong: brb grabbing some lunch
[1:25 PM] David Alves: was doing the same
[1:25 PM] David Alves: yeah, that's kind of what I meant
[1:27 PM] David Alves: oh, wait. yeah, I guess maybe we don't need the 
timestamp at all
[1:28 PM] David Alves: but we do need the mvcc manager to return a non-ok 
status for transactions that were waiting on timestamps equal to or greater than 
t to be "consistent"
[1:29 PM] David Alves: so, in summary:
[1:30 PM] David Alves: - the tablet can (and must) shut down the mvcc manager 
internally. when this happens it should break all pending scans waiting for 
timestamp >= t to be consistent
[1:31 PM] David Alves: eventually this will propagate to the tablet replica 
(through the async callback triggered by the error manager) and scans won't 
even have the chance to start
[1:31 PM] David Alves: sounds reasonable?
[1:33 PM] Andrew Wong: Mostly, but why not abort all scans?
[1:39 PM] David Alves: what do you mean?
[1:41 PM] Andrew Wong:
    |  - the tablet can (and must) shutdown the mvcc manager internally. when 
this happens it should break all pending scans waiting for timestamp >= t to be 
consistent
    |  Shutting down mvcc should about all scans to the tablet, right?
[1:41 PM] Andrew Wong: Should abort*
[1:42 PM] David Alves: only "consistent" scans actually wait for timestamps
[1:42 PM] David Alves: the other ones don't. they take a snapshot of mvcc and 
move forth
[1:42 PM] David Alves: don't see how you can abort those
[1:42 PM] David Alves: since they don't check back with mvcc
[1:43 PM] David Alves: they will once they see replica in non-RUNNING state
[1:43 PM] David Alves: but I don't think we should concern ourselves with these 
scans too much
[1:46 PM] Andrew Wong: Ah I see
[1:47 PM] Andrew Wong: Yeah sounds reasonable
[1:48 PM] Andrew Wong: I'll also consider whether setting the replica state 
synchronously makes sense. I think it does, hopefully there won't be races
[1:54 PM] David Alves: that I'm not so sure of
[1:55 PM] David Alves: particularly as there's work to do
[1:55 PM] David Alves: in that case
[1:55 PM] David Alves: which opens the possibility of deadlocks, etc
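
To make the summary above concrete, here is a rough sketch of the idea, not 
the actual patch. MiniMvcc, WaitResult, WaitUntilClean, AdvanceCleanTime, Close 
and DemoShutdownRejectsScans are hypothetical names invented for illustration 
and are not Kudu's real MvccManager API. It assumes a single "clean time" 
below which everything is applied, and that a closed manager rejects every 
consistent wait regardless of timestamp.

```cpp
// Hypothetical sketch only: these names are NOT Kudu's actual API.
#include <condition_variable>
#include <cstdint>
#include <mutex>

enum class WaitResult { kOk, kAborted };

class MiniMvcc {
 public:
  // Blocks until everything at or below 'ts' has been applied, or until the
  // manager is closed. Returns kAborted if the manager is (or becomes) closed,
  // so a "consistent" scan gets a non-OK result instead of waiting forever.
  WaitResult WaitUntilClean(uint64_t ts) {
    std::unique_lock<std::mutex> l(lock_);
    cond_.wait(l, [&] { return closed_ || clean_time_ >= ts; });
    return closed_ ? WaitResult::kAborted : WaitResult::kOk;
  }

  // Called as transactions finish applying. Ignored once closed, so nothing
  // new is made visible after shutdown.
  void AdvanceCleanTime(uint64_t ts) {
    std::lock_guard<std::mutex> l(lock_);
    if (closed_ || ts <= clean_time_) return;
    clean_time_ = ts;
    cond_.notify_all();
  }

  // Called by the tablet on shutdown/failure. Wakes every blocked waiter;
  // each one then observes closed_ and returns kAborted.
  void Close() {
    std::lock_guard<std::mutex> l(lock_);
    closed_ = true;
    cond_.notify_all();
  }

 private:
  std::mutex lock_;
  std::condition_variable cond_;
  uint64_t clean_time_ = 0;
  bool closed_ = false;
};

// Single-threaded walk-through: a wait below the clean time succeeds before
// shutdown, and any wait issued after Close() is rejected.
bool DemoShutdownRejectsScans() {
  MiniMvcc mvcc;
  mvcc.AdvanceCleanTime(5);
  WaitResult before = mvcc.WaitUntilClean(3);   // 3 <= 5, returns right away
  mvcc.Close();
  mvcc.AdvanceCleanTime(20);                    // no effect after Close()
  WaitResult after = mvcc.WaitUntilClean(10);   // closed, so aborted
  return before == WaitResult::kOk && after == WaitResult::kAborted;
}
```

Because the closed flag is checked under the same lock as the wait, a scan 
that saw the replica as RUNNING just before shutdown still gets rejected when 
it reaches the mvcc wait. As noted in the chat, scans that only take a 
snapshot and never wait are not covered by this and would have to notice the 
replica state on their own.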

-- 
To view, visit http://gerrit.cloudera.org:8080/7439
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I983620f27e7226806a2cca253db7619731914d42
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-HasComments: No
