[kudu-CR] KUDU-798 (part 3) Make replica transactions start/abort on the consensus thread

2016-12-01 Thread David Ribeiro Alves (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5294

to look at the new patch set (#8).

Change subject: KUDU-798 (part 3) Make replica transactions start/abort on the 
consensus thread
..

KUDU-798 (part 3) Make replica transactions start/abort on the consensus thread

In order for a consensus replica to safely move "safe" time, it needs
to know that all transactions that come before it have at least
started. One way to do this is to call transaction_->Start() on
transaction_driver_->Init() which we know is called by the thread
updating consensus for non-leader transctions. This should be ok, as
the leader has already correctly serialized the transactions.

However, if transactions are now started on Init() then, in the case
they abort (i.e. when transaction_driver_->ReplicationFinished() is
called with a non-ok status, again done by the thread updating
consensus), we now must make sure that the transaction is actually
removed from mvcc before that method returns. Otherwise consensus
might call Start() on another transaction with the same timestamp
before the failed transaction is removed from mvcc.

To do this, this patch adds a latch to make sure that, if replication
failed, the prepare phase has completed before returning from
ReplicationFinished().

Finally this patch adds a couple of statements to increase the
driver's refcount so that it survives until the latch may be called.

Ran this in dist-test, slow mode, asan, 1000 times. No failures.

Results:
exactly_once_writes-itest: 
http://dist-test.cloudera.org//job?job_id=david.alves.1480659005.9899
raft_consensus-itest : 
http://dist-test.cloudera.org//job?job_id=david.alves.1480659414.10085

Change-Id: Ie360e597eea86551c453717d7a1a000848027f4c
---
M src/kudu/tablet/transactions/transaction_driver.cc
M src/kudu/tablet/transactions/transaction_driver.h
2 files changed, 58 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/94/5294/8
-- 
To view, visit http://gerrit.cloudera.org:8080/5294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie360e597eea86551c453717d7a1a000848027f4c
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: David Ribeiro Alves 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-798 (part 3) Make replica transactions start/abort on the consensus thread

2016-12-01 Thread David Ribeiro Alves (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5294

to look at the new patch set (#7).

Change subject: KUDU-798 (part 3) Make replica transactions start/abort on the 
consensus thread
..

KUDU-798 (part 3) Make replica transactions start/abort on the consensus thread

In order for a consensus replica to safely move "safe" time, it needs
to know that all transactions that come before it have at least
started. One way to do this is to call transaction_->Start() on
transaction_driver_->Init() which we know is called by the thread
updating consensus, for non-leader transctions. This should be ok as
the leader has already correctly serialized the transactions.

However, if transactions are now started on Init() then, in the case
they abort (i.e. when transaction_driver_->ReplicationFinished() is
called with a non-ok status, again done by the thread updating
consensus), we now must make sure that the transaction is actually
removed from mvcc before that method returns. Otherwise consensus
might call Start() on another transaction with the same timestamp
before the failed transaction is removed from mvcc.

To do this, this patch adds a latch to make sure that, if replication
failed, the prepare phase has completed before returning from
ReplicationFinished().

Finally this patch adds a couple of statements to increase the
driver's refcount so that it survives until the latch may be called.

Ran this in dist-test, slow mode, asan, 1000 times. No failures.

Results:
exactly_once_writes-itest: 
http://dist-test.cloudera.org//job?job_id=david.alves.1480659005.9899
raft_consensus-itest : 
http://dist-test.cloudera.org//job?job_id=david.alves.1480659414.10085

Change-Id: Ie360e597eea86551c453717d7a1a000848027f4c
---
M src/kudu/tablet/transactions/transaction_driver.cc
M src/kudu/tablet/transactions/transaction_driver.h
2 files changed, 58 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/94/5294/7
-- 
To view, visit http://gerrit.cloudera.org:8080/5294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie360e597eea86551c453717d7a1a000848027f4c
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: David Ribeiro Alves 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-798 (part 3) Make replica transactions start/abort on the consensus thread

2016-12-01 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-798 (part 3) Make replica transactions start/abort on the 
consensus thread
..


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/5294/6/src/kudu/tablet/transactions/transaction_driver.cc
File src/kudu/tablet/transactions/transaction_driver.cc:

Line 148:   RETURN_NOT_OK(txn_tracker_->Add(this));
> if this fails, then it seems like transaction_->Start() would have been cal
you're right, I will add a test to make sure this doesn't happen. maybe even 
solve KUDU-625


PS6, Line 421: If replication has failed we need to make sure the transaction 
is removed from mvcc before we
 :   // return from this method,
> seems like it would be cleaner to just be using some kind of locking here..
yeah, that makes sense to me. I think this class is due for a new overall.


PS6, Line 429: replication_state_copy
> why not just read 'status' directly?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie360e597eea86551c453717d7a1a000848027f4c
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: David Ribeiro Alves 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-798 (part 3) Make replica transactions start/abort on the consensus thread

2016-12-01 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: KUDU-798 (part 3) Make replica transactions start/abort on the 
consensus thread
..


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/5294/6/src/kudu/tablet/transactions/transaction_driver.cc
File src/kudu/tablet/transactions/transaction_driver.cc:

Line 148:   RETURN_NOT_OK(txn_tracker_->Add(this));
if this fails, then it seems like transaction_->Start() would have been called, 
and then we're relying on the destructor here to abort the mvcc transaction? is 
that path well covered? not sure if we have a test which makes a replica 
transaction tracker hit its memory limit, and KUDU-625 makes me nervous


PS6, Line 421: If replication has failed we need to make sure the transaction 
is removed from mvcc before we
 :   // return from this method,
seems like it would be cleaner to just be using some kind of locking here... I 
don't want to block progress on this if we have stress tests showing that it 
works OK, but let's think more about how to restructure the different 
prepare/replicate/etc interleavings to be more comprehensible.


PS6, Line 429: replication_state_copy
why not just read 'status' directly?


-- 
To view, visit http://gerrit.cloudera.org:8080/5294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie360e597eea86551c453717d7a1a000848027f4c
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: David Ribeiro Alves 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-798 (part 3) Make replica transactions start/abort on the consensus thread

2016-12-01 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-798 (part 3) Make replica transactions start/abort on the 
consensus thread
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5294/4/src/kudu/tablet/transactions/transaction_driver.cc
File src/kudu/tablet/transactions/transaction_driver.cc:

Line 102:   prepare_latch_(0) {
> we need to only set the latch if the transaction actually Init'ed want to b
Actually I changed my mind. I think you're right, if submitting to prepare 
fails then calling ReplicationFinished() is wrong. Will start the latch at 1


-- 
To view, visit http://gerrit.cloudera.org:8080/5294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie360e597eea86551c453717d7a1a000848027f4c
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: David Ribeiro Alves 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-798 (part 3) Make replica transactions start/abort on the consensus thread

2016-12-01 Thread David Ribeiro Alves (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5294

to look at the new patch set (#5).

Change subject: KUDU-798 (part 3) Make replica transactions start/abort on the 
consensus thread
..

KUDU-798 (part 3) Make replica transactions start/abort on the consensus thread

In order for a consensus replica to safely move "safe" time, it needs
to know that all transactions that come before it have at least
started. One way to do this is to call transaction_->Start() on
transaction_driver_->Init() which we know is called by the thread
updating consensus, for non-leader transctions. This should be ok as
the leader has already correctly serialized the transactions.

However, if transactions are now started on Init() then, in the case
they abort (i.e. when transaction_driver_->ReplicationFinished() is
called with a non-ok status, again done by the thread updating
consensus), we now must make sure that the transaction is actually
removed from mvcc before that method returns. Otherwise consensus
might call Start() on another transaction with the same timestamp
before the failed transaction is removed from mvcc.

To do this, this patch adds a latch to make sure that, if replication
failed, the prepare phase has completed before returning from
ReplicationFinished().

Finally this patch adds a couple of statements to increase the
driver's refcount so that it survives until the latch may be called.

Change-Id: Ie360e597eea86551c453717d7a1a000848027f4c
---
M src/kudu/tablet/transactions/transaction_driver.cc
M src/kudu/tablet/transactions/transaction_driver.h
2 files changed, 61 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/94/5294/5
-- 
To view, visit http://gerrit.cloudera.org:8080/5294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie360e597eea86551c453717d7a1a000848027f4c
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: David Ribeiro Alves 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-798 (part 3) Make replica transactions start/abort on the consensus thread

2016-12-01 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-798 (part 3) Make replica transactions start/abort on the 
consensus thread
..


Patch Set 3:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/5294/4//COMMIT_MSG
Commit Message:

PS4, Line 12: which we know
> Not sure this is true, see comments inline
I meant "replica" transactions, will clarify


PS4, Line 18: consensus thread
> Is the consensus thread the prepare thread? Not sure which pool we are talk
by "consensus thread" I mean the thread that is updating consensus and that is 
triggering replica transactions. This doesn't change the "leader transactions" 
path in terms of Start()


http://gerrit.cloudera.org:8080/#/c/5294/3/src/kudu/tablet/transactions/transaction_driver.cc
File src/kudu/tablet/transactions/transaction_driver.cc:

PS3, Line 431:   if (PREDICT_FALSE(replication_state_copy == 
REPLICATION_FAILED)) {
 : prepare_latch_.Wait();
 :   }
> at first glance this latch thing seems a bit sketchy, but I need to spend a
I tried a couple of different things before I resorted to the latch. The main 
point here is that we need to make sure that the transaction is no longer on 
mvcc before returning from this method.

Because of the race between the Prepare(), running on the prepare pool, and 
this method, running on the consensus update thread, to call ApplyAsync() it's 
hard to have this guarantee.

I tried different ways of fudging with the vars but ultimately could not find a 
non-racy way to make sure that, If we changed the state to REPLICATION_FAILED 
above in time for the prepare thread to see it (and move on to apply, which 
will actually abort), making sure that it completes before returning from this 
method.

Note that we only wait on the latch for replica transactions that have failed.


http://gerrit.cloudera.org:8080/#/c/5294/4/src/kudu/tablet/transactions/transaction_driver.cc
File src/kudu/tablet/transactions/transaction_driver.cc:

Line 102:   prepare_latch_(0) {
> Since a given transaction prepares only once, why not initialize the prepar
we need to only set the latch if the transaction actually Init'ed want to be 
double sure that it's submitted to prepare before we risk waiting.


PS4, Line 118: in the consensus thread
> I'm confused by this comment. I'm pretty sure this is executed on a TabletS
I meant by the thread that is updating consensus. note that this is the 
"REPLICA" path. Will make this clearer in the comments.


PS4, Line 148: innited
> typo: inited
Done


Line 391:   scoped_refptr ref(this);
> This is pretty smelly. Can we not ensure the caller has a ref, instead? Als
the object won't be destroyed before we change the state below.
I agree that lifetime of this class is screwed up, but we seem to resort to 
increasing the refcount in several other places and would like not to put 
sorting that out on the critical path.


Line 435: 
> nit: spurious line
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie360e597eea86551c453717d7a1a000848027f4c
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: David Ribeiro Alves 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-798 (part 3) Make replica transactions start/abort on the consensus thread

2016-12-01 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: KUDU-798 (part 3) Make replica transactions start/abort on the 
consensus thread
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5294/3/src/kudu/tablet/transactions/transaction_driver.cc
File src/kudu/tablet/transactions/transaction_driver.cc:

PS3, Line 431:   if (PREDICT_FALSE(replication_state_copy == 
REPLICATION_FAILED)) {
 : prepare_latch_.Wait();
 :   }
at first glance this latch thing seems a bit sketchy, but I need to spend an 
hour or so looking at it to see if I can come up with a better idea... sorry I 
didn't get a chance to look in detail this evening.


-- 
To view, visit http://gerrit.cloudera.org:8080/5294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie360e597eea86551c453717d7a1a000848027f4c
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: David Ribeiro Alves 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-798 (part 3) Make replica transactions start/abort on the consensus thread

2016-12-01 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change.

Change subject: KUDU-798 (part 3) Make replica transactions start/abort on the 
consensus thread
..


Patch Set 4:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/5294/4//COMMIT_MSG
Commit Message:

PS4, Line 12: which we know
Not sure this is true, see comments inline


PS4, Line 18: consensus thread
Is the consensus thread the prepare thread? Not sure which pool we are talking 
about here.


http://gerrit.cloudera.org:8080/#/c/5294/4/src/kudu/tablet/transactions/transaction_driver.cc
File src/kudu/tablet/transactions/transaction_driver.cc:

Line 102:   prepare_latch_(0) {
Since a given transaction prepares only once, why not initialize the 
prepare_latch_ to 1? Then on abort we can count it down.


PS4, Line 118: in the consensus thread
I'm confused by this comment. I'm pretty sure this is executed on a 
TabletService service pool thread:

TabletServiceImpl::Write() -> TabletPeer::SubmitWrite() -> 
TabletPeer::NewLeaderTransactionDriver() -> TransactionDriver::Init()


PS4, Line 148: innited
typo: inited


Line 391:   scoped_refptr ref(this);
This is pretty smelly. Can we not ensure the caller has a ref, instead? Also, 
if this is so racy, what guarantees that this object is not destroyed before we 
take the ref on this line?

If we really must do this then we should do a more careful job of documenting 
the lifecycle guarantees.


Line 435: 
nit: spurious line


-- 
To view, visit http://gerrit.cloudera.org:8080/5294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie360e597eea86551c453717d7a1a000848027f4c
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: David Ribeiro Alves 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-798 (part 3) Make replica transactions start/abort on the consensus thread

2016-11-30 Thread David Ribeiro Alves (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5294

to look at the new patch set (#3).

Change subject: KUDU-798 (part 3) Make replica transactions start/abort on the 
consensus thread
..

KUDU-798 (part 3) Make replica transactions start/abort on the consensus thread

In order for a replica to safely move "safe" time, it needs to know
that all transactions that come before it have at least started.
One way to do this is to call transaction_->Start() on
transaction_driver_->Init() which we know is called by the consensus
thread. This should be ok as the leader has already correctly serialized
the transactions.

However, if transactions are now started on Init() then, in the case
they abort (i.e. when transaction_driver_->ReplicationFinished() is
called with a non-ok status, again done on the consensus thread), we
now must make sure that the transaction is actually removed from
mvcc in-line. Otherwise the consensus thread might call Start() on
another transaction with the same timestamp before the failed
transaction is removed from mvcc.

To do this, this patch adds a latch to make sure that, if replication
failed, the prepare phase has completed.

Finally this patch adds a couple of statements to increase the
driver's refcount so that it survives until the latch may be called.

Change-Id: Ie360e597eea86551c453717d7a1a000848027f4c
---
M src/kudu/tablet/transactions/transaction_driver.cc
M src/kudu/tablet/transactions/transaction_driver.h
2 files changed, 60 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/94/5294/3
-- 
To view, visit http://gerrit.cloudera.org:8080/5294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie360e597eea86551c453717d7a1a000848027f4c
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins