[kudu-CR] KUDU-2612: add PartitionLock and ScopedPartitionLock

2021-03-12 Thread Hao Hao (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Andrew Wong, Bankim Bhavsar,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17097

to look at the new patch set (#10).

Change subject: KUDU-2612: add PartitionLock and ScopedPartitionLock
..

KUDU-2612: add PartitionLock and ScopedPartitionLock

This patch introduces a coarse grained partition level lock, PartitionLock
to prevent dirty writes for multi-rows transaction. A partition lock can
only be held by a single transaction (or write op) at a time, the same
transaction can acquire the lock for multiple times. To prevent deadlock,
'wait-die' scheme is used, which if the transaction requires a lock held by
another transaction,
  1) abort the transaction immediately if it has a higher txn ID than
 the other one.
  2) Otherwise, retry the write op (or participant op) of the transaction.

A ScopedPartitionLock is also introduced to automatically manage the
lifecycle of a PartitionLock.

Change-Id: I158115739ce3e7cfb77bbcb854e834336c1256b1
---
M src/kudu/tablet/lock_manager-test.cc
M src/kudu/tablet/lock_manager.cc
M src/kudu/tablet/lock_manager.h
M src/kudu/tserver/tserver.proto
4 files changed, 470 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/97/17097/10
--
To view, visit http://gerrit.cloudera.org:8080/17097
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I158115739ce3e7cfb77bbcb854e834336c1256b1
Gerrit-Change-Number: 17097
Gerrit-PatchSet: 10
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-2612: add PartitionLock and ScopedPartitionLock

2021-03-12 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17097 )

Change subject: KUDU-2612: add PartitionLock and ScopedPartitionLock
..


Patch Set 10:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17097/6/src/kudu/tablet/lock_manager.h
File src/kudu/tablet/lock_manager.h:

http://gerrit.cloudera.org:8080/#/c/17097/6/src/kudu/tablet/lock_manager.h@193
PS6, Line 193:
> Yeah, we updated the reference in the constructor, so if copying field-to-f
Ah makes sense. Sorry I missed that! Thanks a lot for pointing out.


http://gerrit.cloudera.org:8080/#/c/17097/9/src/kudu/tablet/lock_manager.cc
File src/kudu/tablet/lock_manager.cc:

http://gerrit.cloudera.org:8080/#/c/17097/9/src/kudu/tablet/lock_manager.cc@460
PS9, Line 460: (MonoDelta::FromMilliseconds(25
> This seems to be pretty short timeout.  Why just a millisecond?  Our timeou
Ack


http://gerrit.cloudera.org:8080/#/c/17097/9/src/kudu/tablet/lock_manager.cc@474
PS9, Line 474:  if (!partition_lock_) {
> What if by this time the lock has been released by the other party?  Will i
Done



--
To view, visit http://gerrit.cloudera.org:8080/17097
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I158115739ce3e7cfb77bbcb854e834336c1256b1
Gerrit-Change-Number: 17097
Gerrit-PatchSet: 10
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 12 Mar 2021 21:33:57 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: add PartitionLock and ScopedPartitionLock

2021-03-12 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17097 )

Change subject: KUDU-2612: add PartitionLock and ScopedPartitionLock
..


Patch Set 9:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17097/8/src/kudu/tablet/lock_manager-test.cc
File src/kudu/tablet/lock_manager-test.cc:

http://gerrit.cloudera.org:8080/#/c/17097/8/src/kudu/tablet/lock_manager-test.cc@214
PS8, Line 214: , t
> We need to pass 't' as a copy.
Done


http://gerrit.cloudera.org:8080/#/c/17097/8/src/kudu/tablet/lock_manager.cc
File src/kudu/tablet/lock_manager.cc:

http://gerrit.cloudera.org:8080/#/c/17097/8/src/kudu/tablet/lock_manager.cc@461
PS8, Line 461: MonoDelta elapsed(MonoTime::Now() - start);
 : if (elapsed > timeout) {
> nit: just store elapsed and inline MonoTime::Now()?
Done



--
To view, visit http://gerrit.cloudera.org:8080/17097
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I158115739ce3e7cfb77bbcb854e834336c1256b1
Gerrit-Change-Number: 17097
Gerrit-PatchSet: 9
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 12 Mar 2021 18:14:19 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: add PartitionLock and ScopedPartitionLock

2021-03-12 Thread Hao Hao (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Andrew Wong, Bankim Bhavsar,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17097

to look at the new patch set (#9).

Change subject: KUDU-2612: add PartitionLock and ScopedPartitionLock
..

KUDU-2612: add PartitionLock and ScopedPartitionLock

This patch introduces a coarse grained partition level lock, PartitionLock
to prevent dirty writes for multi-rows transaction. A partition lock can
only be held by a single transaction (or write op) at a time, the same
transaction can acquire the lock for multiple times. To prevent deadlock,
'wait-die' scheme is used, which if the transaction requires a lock held by
another transaction,
  1) abort the transaction immediately if it has a higher txn ID than
 the other one.
  2) Otherwise, retry the write op (or participant op) of the transaction.

A ScopedPartitionLock is also introduced to automatically manage the
lifecycle of a PartitionLock.

Change-Id: I158115739ce3e7cfb77bbcb854e834336c1256b1
---
M src/kudu/tablet/lock_manager-test.cc
M src/kudu/tablet/lock_manager.cc
M src/kudu/tablet/lock_manager.h
M src/kudu/tserver/tserver.proto
4 files changed, 464 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/97/17097/9
--
To view, visit http://gerrit.cloudera.org:8080/17097
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I158115739ce3e7cfb77bbcb854e834336c1256b1
Gerrit-Change-Number: 17097
Gerrit-PatchSet: 9
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-2612: acquire and release partition lock

2021-03-12 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17159 )

Change subject: KUDU-2612: acquire and release partition lock
..


Patch Set 4:

(11 comments)

Addressed partial comments, will try to address the rest later.

http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/client/batcher.cc
File src/kudu/client/batcher.cc:

http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/client/batcher.cc@530
PS3, Line 530: //For example, Status::IllegalState() 
originated from
 : //TabletServerErrorPB::TXN_ILLEGAL_STATE 
response and
 :
> +1
No, this is not about TXN_LOCKED_RETRY. It is about error handling of 
TxnOpDispatcher. IIUC, TxnOpDispatcher currently only wraps the status of the 
call but not the error code. So TXN_LOCKED_ABORT is retried even it should not 
be (and handled by L510) for transactional ones. But for transactional ones it 
is taking care by L510.

For TXN_LOCKED_RETRY_OP we return it with ServiceUnavailable Status. so now it 
is handled in L473. But added the specific checking there.


http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/integration-tests/txn_participant-itest.cc
File src/kudu/integration-tests/txn_participant-itest.cc:

http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/integration-tests/txn_participant-itest.cc@813
PS3, Line 813: TEST_F(TxnParticipantITest, 
TestTxnSystemClientBeginTxnLockAbort) {
> Do you mind adding a scenario with calling ParticipateInTransaction() for t
Done


http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/integration-tests/txn_participant-itest.cc@833
PS3, Line 833: s.IsAborted
> Hmm, interesting: what actor is to abort the transaction?  I'm curious abou
Sure, will add a comment. Now TxnManager will just pass the error back to the 
client, which will cause the op to fail. And the client need to figure out what 
to do with the error (instead of TxnManager silently abort the transaction). Or 
do you have other suggestions?


http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/integration-tests/txn_participant-itest.cc@930
PS3, Line 930: FLAGS_enable_txn_partition_lock = false;
> It would be great if you could add a note explaining that this scenario bec
Done


http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/integration-tests/txn_write_ops-itest.cc
File src/kudu/integration-tests/txn_write_ops-itest.cc:

http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/integration-tests/txn_write_ops-itest.cc@1046
PS3, Line 1046:
> nit: for completeness sake, maybe commit `second_txn` and ensure we no long
Done


http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/tablet/ops/participant_op.cc
File src/kudu/tablet/ops/participant_op.cc:

http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/tablet/ops/participant_op.cc@123
PS3, Line 123:   }
> Probably, we want to distinguish in the trace whether this is was a no-op o
Done


http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/tablet/ops/write_op.cc
File src/kudu/tablet/ops/write_op.cc:

http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/tablet/ops/write_op.cc@67
PS3, Line 67: DEFINE_int32(tablet_inject_latency_on_apply_write_op_ms, 0,
:  "How much latency to inject when a write op is 
applied. "
:  "For testing only!");
: TAG_FLAG(tablet_inject_latency_on_apply_write_op_ms, un
> Could we just reuse --enable_txn_partition_lock? When might we want to set
Done


http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/tablet/ops/write_op.cc@569
PS3, Line 569: }
> LOG(DFATAL) or LOG(FATAL)?
Done


http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/tablet/tablet.h
File src/kudu/tablet/tablet.h:

http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/tablet/tablet.h@166
PS3, Line 166: return
 :   // 'TXN_LOCKED_ABORT' or 'TXN_LOCKED_RETRY_OP' error
> How is it to return those if the return type of this method is Status?
Ah, sorry for not being precise. Will update the comment.


http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/tablet/txn_participant.h
File src/kudu/tablet/txn_participant.h:

http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/tablet/txn_participant.h@220
PS3, Line 220: partition_lock_ = std::move(partition_lock);
> nit: maybe DCHECK or CHECK that we own the lock?
Done


http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/transactions/participant_rpc.cc
File src/kudu/transactions/participant_rpc.cc:

http://gerrit.cloudera.org:8080/#/c/17159/3/src/kudu/transactions/participant_rpc.cc@180
PS3, Line 180:   case TabletServerErrorPB::TXN_ILLEGAL_STATE
> Can we also explicitly handle TXN_LOCKED_RETRY_OP?
Done



--
To view, visit http://gerrit.cloudera.org:8080/17159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: 

[kudu-CR] KUDU-2612: acquire and release partition lock

2021-03-12 Thread Hao Hao (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17159

to look at the new patch set (#4).

Change subject: KUDU-2612: acquire and release partition lock
..

KUDU-2612: acquire and release partition lock

This patch introduces the logic for actual acquiring and releasing
the partition lock for transactions and non-transactional operations.
For each transaction, we try to acquire the partition lock in the
ParticipantOp prepare phase of BEGIN_TXN and release the lock when
COMMIT_TXN or ABORT_TXN is applied. Moreover, we take the parition
lock for non-transactional operations as well to ensure we don’t
have duplicate keys. If the partition lock cannot be acquired,
the transaction (or write op) will be aborted or retried.

Change-Id: If26733cae16810f3b3afd1fd05dcb984e6366939
---
M src/kudu/client/batcher.cc
M src/kudu/integration-tests/fuzz-itest.cc
M src/kudu/integration-tests/txn_commit-itest.cc
M src/kudu/integration-tests/txn_participant-itest.cc
M src/kudu/integration-tests/txn_write_ops-itest.cc
M src/kudu/tablet/ops/participant_op.cc
M src/kudu/tablet/ops/participant_op.h
M src/kudu/tablet/ops/write_op.cc
M src/kudu/tablet/ops/write_op.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/txn_participant-test.cc
M src/kudu/tablet/txn_participant.h
M src/kudu/transactions/participant_rpc.cc
14 files changed, 465 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/59/17159/4
--
To view, visit http://gerrit.cloudera.org:8080/17159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If26733cae16810f3b3afd1fd05dcb984e6366939
Gerrit-Change-Number: 17159
Gerrit-PatchSet: 4
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-2612: add PartitionLock and ScopedPartitionLock

2021-03-11 Thread Hao Hao (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Andrew Wong, Bankim Bhavsar,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17097

to look at the new patch set (#8).

Change subject: KUDU-2612: add PartitionLock and ScopedPartitionLock
..

KUDU-2612: add PartitionLock and ScopedPartitionLock

This patch introduces a coarse grained partition level lock, PartitionLock
to prevent dirty writes for multi-rows transaction. A partition lock can
only be held by a single transaction (or write op) at a time, the same
transaction can acquire the lock for multiple times. To prevent deadlock,
'wait-die' scheme is used, which if the transaction requires a lock held by
another transaction,
  1) abort the transaction immediately if it has a higher txn ID than
 the other one.
  2) Otherwise, retry the write op (or participant op) of the transaction.

A ScopedPartitionLock is also introduced to automatically manage the
lifecycle of a PartitionLock.

Change-Id: I158115739ce3e7cfb77bbcb854e834336c1256b1
---
M src/kudu/tablet/lock_manager-test.cc
M src/kudu/tablet/lock_manager.cc
M src/kudu/tablet/lock_manager.h
M src/kudu/tserver/tserver.proto
4 files changed, 464 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/97/17097/8
--
To view, visit http://gerrit.cloudera.org:8080/17097
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I158115739ce3e7cfb77bbcb854e834336c1256b1
Gerrit-Change-Number: 17097
Gerrit-PatchSet: 8
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-2612: add PartitionLock and ScopedPartitionLock

2021-03-11 Thread Hao Hao (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Andrew Wong, Bankim Bhavsar,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17097

to look at the new patch set (#7).

Change subject: KUDU-2612: add PartitionLock and ScopedPartitionLock
..

KUDU-2612: add PartitionLock and ScopedPartitionLock

This patch introduces a coarse grained partition level lock, PartitionLock
to prevent dirty writes for multi-rows transaction. A partition lock can
only be held by a single transaction (or write op) at a time, the same
transaction can acquire the lock for multiple times. To prevent deadlock,
'wait-die' scheme is used, which if the transaction requires a lock held by
another transaction,
  1) abort the transaction immediately if it has a higher txn ID than
 the other one.
  2) Otherwise, retry the write op (or participant op) of the transaction.

A ScopedPartitionLock is also introduced to automatically manage the
lifecycle of a PartitionLock.

Change-Id: I158115739ce3e7cfb77bbcb854e834336c1256b1
---
M src/kudu/tablet/lock_manager-test.cc
M src/kudu/tablet/lock_manager.cc
M src/kudu/tablet/lock_manager.h
M src/kudu/tserver/tserver.proto
4 files changed, 464 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/97/17097/7
--
To view, visit http://gerrit.cloudera.org:8080/17097
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I158115739ce3e7cfb77bbcb854e834336c1256b1
Gerrit-Change-Number: 17097
Gerrit-PatchSet: 7
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-2612: add PartitionLock and ScopedPartitionLock

2021-03-11 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17097 )

Change subject: KUDU-2612: add PartitionLock and ScopedPartitionLock
..


Patch Set 6:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/17097/6/src/kudu/tablet/lock_manager-test.cc
File src/kudu/tablet/lock_manager-test.cc:

http://gerrit.cloudera.org:8080/#/c/17097/6/src/kudu/tablet/lock_manager-test.cc@158
PS6, Line 158:   ASSERT_EQ(1, lock_manager_.partition_lock_refs());
> Looking at this as a block box, does it make sense to check that the first
Done


http://gerrit.cloudera.org:8080/#/c/17097/6/src/kudu/tablet/lock_manager-test.cc@252
PS6, Line 252: if (acquired) {
> Why it might not be acquired?  Is it possible?
This might happen if the non-transactional op take the op first.


http://gerrit.cloudera.org:8080/#/c/17097/6/src/kudu/tablet/lock_manager-test.cc@267
PS6, Line 267: if (acquired) {
 :   CHECK_EQ(TabletServerErrorPB::UNKNOWN_ERROR, code);
 :   CHECK_EQ(1, lock_manager_.partition_lock_refs());
 : }
> I'm not sure I understand this piece if reading the comment for the thread'
Sorry forget to update the comment.. Updated it.


http://gerrit.cloudera.org:8080/#/c/17097/6/src/kudu/tablet/lock_manager.h
File src/kudu/tablet/lock_manager.h:

http://gerrit.cloudera.org:8080/#/c/17097/6/src/kudu/tablet/lock_manager.h@193
PS6, Line 193: ScopedPartitionLock
> What if an instance of this class is copied?  Would it result in incorrect
I don't think so, since we only update the reference at the constructor. But I 
disabled the copy constructor anyway.


http://gerrit.cloudera.org:8080/#/c/17097/6/src/kudu/tablet/lock_manager.cc
File src/kudu/tablet/lock_manager.cc:

http://gerrit.cloudera.org:8080/#/c/17097/6/src/kudu/tablet/lock_manager.cc@448
PS6, Line 448: kMaxBackoffExp
> nit: this might be a constexpr
Ack


http://gerrit.cloudera.org:8080/#/c/17097/6/src/kudu/tablet/lock_manager.cc@496
PS6, Line 496: ms
> Is it in seconds or milliseconds?
Ah, sorry for the confusion. It is in milliseconds, since I think we don't need 
to wait at the second level, as the caller() should already know the lock can 
be acquired (fast) with TryAcquirePartitionLock() succeeded previously. WDYT?

I moved the log to TryAcquirePartitionLock() to avoid the report being screwed.


http://gerrit.cloudera.org:8080/#/c/17097/6/src/kudu/tablet/lock_manager.cc@520
PS6, Line 520:   partition_lock_refs_ -= 1;
 :   DCHECK_GE(partition_lock_refs_, 0);
 :   if (partition_lock_refs_ == 0) {
 : partition_lock_.reset();
 :   }
> nit: this might be shortened to
Done



--
To view, visit http://gerrit.cloudera.org:8080/17097
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I158115739ce3e7cfb77bbcb854e834336c1256b1
Gerrit-Change-Number: 17097
Gerrit-PatchSet: 6
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 12 Mar 2021 06:28:23 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: add PartitionLock and ScopedPartitionLock

2021-03-11 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17097 )

Change subject: KUDU-2612: add PartitionLock and ScopedPartitionLock
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17097/6/src/kudu/tablet/lock_manager.cc
File src/kudu/tablet/lock_manager.cc:

http://gerrit.cloudera.org:8080/#/c/17097/6/src/kudu/tablet/lock_manager.cc@485
PS6, Line 485: SleepFor(MonoDelta::FromMilliseconds(1L << backoff_exp));
> The point here is that the lock based on spinlock is taken at line 452 and
Ah I though you are referring to the partition lock. Makes sense, updated it.



--
To view, visit http://gerrit.cloudera.org:8080/17097
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I158115739ce3e7cfb77bbcb854e834336c1256b1
Gerrit-Change-Number: 17097
Gerrit-PatchSet: 6
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 12 Mar 2021 05:03:49 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: add PartitionLock and ScopedPartitionLock

2021-03-11 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17097 )

Change subject: KUDU-2612: add PartitionLock and ScopedPartitionLock
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17097/6/src/kudu/tablet/lock_manager.cc
File src/kudu/tablet/lock_manager.cc:

http://gerrit.cloudera.org:8080/#/c/17097/6/src/kudu/tablet/lock_manager.cc@485
PS6, Line 485: SleepFor(MonoDelta::FromMilliseconds(1L << backoff_exp));
> BTW, why to hold that spinlock-based lock when sleeping (i.e., why not to r
Hmm, we are only sleeping if we cannot hold the lock. But I will try to use a 
semaphore instead.



--
To view, visit http://gerrit.cloudera.org:8080/17097
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I158115739ce3e7cfb77bbcb854e834336c1256b1
Gerrit-Change-Number: 17097
Gerrit-PatchSet: 6
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 12 Mar 2021 03:56:59 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: add PartitionLock and ScopedPartitionLock

2021-03-11 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17097 )

Change subject: KUDU-2612: add PartitionLock and ScopedPartitionLock
..


Patch Set 5:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/17097/5/src/kudu/tablet/lock_manager-test.cc
File src/kudu/tablet/lock_manager-test.cc:

PS5:
> Could you also add some (maybe multithreaded) tests for the Wait... functio
Ah right, missed that.


http://gerrit.cloudera.org:8080/#/c/17097/5/src/kudu/tablet/lock_manager-test.cc@261
PS5, Line 261: partition_lock.IsAcquired()
> Ah, I guess you can use the moved instances, but maybe add specific NOLINT
Ack


http://gerrit.cloudera.org:8080/#/c/17097/5/src/kudu/tablet/lock_manager.h
File src/kudu/tablet/lock_manager.h:

http://gerrit.cloudera.org:8080/#/c/17097/5/src/kudu/tablet/lock_manager.h@81
PS5, Line 81:   // Similar to the above, but wait until the lock is acquired.
> nit: could you mention that the caller is expected to ensure there is no de
Done


http://gerrit.cloudera.org:8080/#/c/17097/5/src/kudu/tablet/lock_manager.cc
File src/kudu/tablet/lock_manager.cc:

http://gerrit.cloudera.org:8080/#/c/17097/5/src/kudu/tablet/lock_manager.cc@367
PS5, Line 367: bool must_acquire
> nit: maybe use an enum?
Done


http://gerrit.cloudera.org:8080/#/c/17097/5/src/kudu/tablet/lock_manager.cc@372
PS5, Line 372: DCHECK_NOTNULL(lock_);
> nit: can just used DCHECK(lock_)
Done


http://gerrit.cloudera.org:8080/#/c/17097/5/src/kudu/tablet/lock_manager.cc@409
PS5, Line 409:   manager_ = other->manager_;
> nit: add a DCHECK() on other != this, just to catch programming errors.
Done


http://gerrit.cloudera.org:8080/#/c/17097/5/src/kudu/tablet/lock_manager.cc@437
PS5, Line 437:   if (!txn_id.IsValid()) {
 : id = std::numeric_limits::max();
 :   } else {
 : id = txn_id.value();
 :   }
> nit: this might be done using tri-state operator:
Done


http://gerrit.cloudera.org:8080/#/c/17097/5/src/kudu/tablet/lock_manager.cc@473
PS5, Line 473: PartitionLock* LockManager::WaitUntilAcquiredPartitionLock(const 
TxnId& txn_id) {
> What happens when this is running for too long, going over the overall time
Yeah, this is used for case where we expect the lock can be acquired eventually 
(e.g. for a follower to replicate the participanOp).


http://gerrit.cloudera.org:8080/#/c/17097/5/src/kudu/tablet/lock_manager.cc@478
PS5, Line 478: (lock = TryAcquirePartitionLock(txn_id, ))
> Would it make sense to extend the TryAcquirePartitionLock() method with a t
Makes sense, updated it.



--
To view, visit http://gerrit.cloudera.org:8080/17097
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I158115739ce3e7cfb77bbcb854e834336c1256b1
Gerrit-Change-Number: 17097
Gerrit-PatchSet: 5
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 12 Mar 2021 00:18:21 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: add PartitionLock and ScopedPartitionLock

2021-03-11 Thread Hao Hao (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Andrew Wong, Bankim Bhavsar,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17097

to look at the new patch set (#6).

Change subject: KUDU-2612: add PartitionLock and ScopedPartitionLock
..

KUDU-2612: add PartitionLock and ScopedPartitionLock

This patch introduces a coarse grained partition level lock, PartitionLock
to prevent dirty writes for multi-rows transaction. A partition lock can
only be held by a single transaction (or write op) at a time, the same
transaction can acquire the lock for multiple times. To prevent deadlock,
'wait-die' scheme is used, which if the transaction requires a lock held by
another transaction,
  1) abort the transaction immediately if it has a higher txn ID than
 the other one.
  2) Otherwise, retry the write op (or participant op) of the transaction.

A ScopedPartitionLock is also introduced to automatically manage the
lifecycle of a PartitionLock.

Change-Id: I158115739ce3e7cfb77bbcb854e834336c1256b1
---
M src/kudu/tablet/lock_manager-test.cc
M src/kudu/tablet/lock_manager.cc
M src/kudu/tablet/lock_manager.h
M src/kudu/tserver/tserver.proto
4 files changed, 457 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/97/17097/6
--
To view, visit http://gerrit.cloudera.org:8080/17097
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I158115739ce3e7cfb77bbcb854e834336c1256b1
Gerrit-Change-Number: 17097
Gerrit-PatchSet: 6
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-2612: acquire and release partition lock

2021-03-11 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17159 )

Change subject: KUDU-2612: acquire and release partition lock
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17159/1/src/kudu/tablet/lock_manager.cc
File src/kudu/tablet/lock_manager.cc:

http://gerrit.cloudera.org:8080/#/c/17159/1/src/kudu/tablet/lock_manager.cc@401
PS1, Line 401: }
> For consistency, maybe add a check that other != this ?
Will update it in the other CR.



--
To view, visit http://gerrit.cloudera.org:8080/17159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If26733cae16810f3b3afd1fd05dcb984e6366939
Gerrit-Change-Number: 17159
Gerrit-PatchSet: 3
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Thu, 11 Mar 2021 18:57:42 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: acquire and release partition lock

2021-03-11 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17159 )

Change subject: KUDU-2612: acquire and release partition lock
..


Patch Set 3:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/17159/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17159/1//COMMIT_MSG@13
PS1, Line 13: COMMI
> We'll also need to make sure the TxnOpDispatcher handles the error cases we
Done


http://gerrit.cloudera.org:8080/#/c/17159/1/src/kudu/tablet/ops/participant_op.cc
File src/kudu/tablet/ops/participant_op.cc:

http://gerrit.cloudera.org:8080/#/c/17159/1/src/kudu/tablet/ops/participant_op.cc@95
PS1, Line 95: l acquired = partition_lock_.IsAcquired();
> nit: based on the current code, this cannot be any other code, right? If so
Done


http://gerrit.cloudera.org:8080/#/c/17159/1/src/kudu/tablet/ops/participant_op.cc@199
PS1, Line 199:  "ParticipantOp::Prepare
> nit: can use 'replica'
Done


http://gerrit.cloudera.org:8080/#/c/17159/1/src/kudu/tablet/ops/participant_op.cc@200
PS1, Line 200: TRACE("PREPARE: Starting.");
> Shouldn't there be a 'break' before this line? Or are we taking the partiti
Ah, right, 'break' is needed. Thanks for catching it!


http://gerrit.cloudera.org:8080/#/c/17159/1/src/kudu/tablet/ops/participant_op.cc@300
PS1, Line 300:   return Status::InvalidArgument("unknown op type");
> As for the sequence of releasing txn and its lock and partition lock: shoul
Yes, it makes sense. Updated it.


http://gerrit.cloudera.org:8080/#/c/17159/1/src/kudu/tablet/ops/participant_op.cc@303
PS1, Line 303: return Status::OK();
> nit: could store the type of the operation instead since the operation as i
Will do.


http://gerrit.cloudera.org:8080/#/c/17159/1/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

http://gerrit.cloudera.org:8080/#/c/17159/1/src/kudu/tablet/tablet.cc@223
PS1, Line 223: using std::endl;
> warning: using decl 'TabletServerErrorPB' is unused [misc-unused-using-decl
Done



--
To view, visit http://gerrit.cloudera.org:8080/17159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If26733cae16810f3b3afd1fd05dcb984e6366939
Gerrit-Change-Number: 17159
Gerrit-PatchSet: 3
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Thu, 11 Mar 2021 18:57:01 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: add PartitionLock and ScopedPartitionLock

2021-03-10 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17097 )

Change subject: KUDU-2612: add PartitionLock and ScopedPartitionLock
..


Patch Set 5:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/17097/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17097/3//COMMIT_MSG@12
PS3, Line 12: sact
> nit: acquire?
Done


http://gerrit.cloudera.org:8080/#/c/17097/3/src/kudu/tablet/lock_manager-test.cc
File src/kudu/tablet/lock_manager-test.cc:

http://gerrit.cloudera.org:8080/#/c/17097/3/src/kudu/tablet/lock_manager-test.cc@195
PS3, Line 195:   CountDownLatch latch(kNumStartAtOnce);
> nit: since lock acquisition is quite fast operation, maybe it's worth synch
Done


http://gerrit.cloudera.org:8080/#/c/17097/3/src/kudu/tablet/lock_manager-test.cc@208
PS3, Line 208:   ASSERT_EQ(0, lock_manager_.partition_lock_refs());
> nit: since lock acquisition is quite fast operation, maybe it's worth synch
Done


http://gerrit.cloudera.org:8080/#/c/17097/3/src/kudu/tablet/lock_manager.h
File src/kudu/tablet/lock_manager.h:

http://gerrit.cloudera.org:8080/#/c/17097/3/src/kudu/tablet/lock_manager.h@19
PS3, Line 19: #include 
> Nit: Can use the C++ compatible  instead.
Done


http://gerrit.cloudera.org:8080/#/c/17097/3/src/kudu/tablet/lock_manager.h@157
PS3, Line 157:  multi-rows t
> I find it weird that a class named *Lock doesn't have any lock/unlock/relea
I found it is more clear to have a specific entity to represent the lock. Even 
though it is just a wrapper around TxnId. But it can be further extended to 
contain more required field if needed in future.



--
To view, visit http://gerrit.cloudera.org:8080/17097
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I158115739ce3e7cfb77bbcb854e834336c1256b1
Gerrit-Change-Number: 17097
Gerrit-PatchSet: 5
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Thu, 11 Mar 2021 07:21:28 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: add PartitionLock and ScopedPartitionLock

2021-03-10 Thread Hao Hao (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Andrew Wong, Bankim Bhavsar,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17097

to look at the new patch set (#5).

Change subject: KUDU-2612: add PartitionLock and ScopedPartitionLock
..

KUDU-2612: add PartitionLock and ScopedPartitionLock

This patch introduces a coarse grained partition level lock, PartitionLock
to prevent dirty writes for multi-rows transaction. A partition lock can
only be held by a single transaction (or write op) at a time, the same
transaction can acquire the lock for multiple times. To prevent deadlock,
'wait-die' scheme is used, which if the transaction requires a lock held by
another transaction,
  1) abort the transaction immediately if it has a higher txn ID than
 the other one.
  2) Otherwise, retry the write op (or participant op) of the transaction.

A ScopedPartitionLock is also introduced to automatically manage the
lifecycle of a PartitionLock.

Change-Id: I158115739ce3e7cfb77bbcb854e834336c1256b1
---
M src/kudu/tablet/lock_manager-test.cc
M src/kudu/tablet/lock_manager.cc
M src/kudu/tablet/lock_manager.h
M src/kudu/tserver/tserver.proto
4 files changed, 396 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/97/17097/5
--
To view, visit http://gerrit.cloudera.org:8080/17097
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I158115739ce3e7cfb77bbcb854e834336c1256b1
Gerrit-Change-Number: 17097
Gerrit-PatchSet: 5
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-2612: acquire and release partition lock

2021-03-10 Thread Hao Hao (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17159

to look at the new patch set (#3).

Change subject: KUDU-2612: acquire and release partition lock
..

KUDU-2612: acquire and release partition lock

This patch introduces the logic for actual acquiring and releasing
the partition lock for transactions and non-transactional operations.
For each transaction, we try to acquire the partition lock in the
ParticipantOp prepare phase of BEGIN_TXN and release the lock when
COMMIT_TXN or ABORT_TXN is applied. Moreover, we take the parition
lock for non-transactional operations as well to ensure we don’t
have duplicate keys. If the partition lock cannot be acquired,
the transaction (or write op) will be aborted or retried.

Change-Id: If26733cae16810f3b3afd1fd05dcb984e6366939
---
M src/kudu/client/batcher.cc
M src/kudu/integration-tests/txn_participant-itest.cc
M src/kudu/integration-tests/txn_write_ops-itest.cc
M src/kudu/tablet/ops/participant_op.cc
M src/kudu/tablet/ops/participant_op.h
M src/kudu/tablet/ops/write_op.cc
M src/kudu/tablet/ops/write_op.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/txn_participant-test.cc
M src/kudu/tablet/txn_participant.h
M src/kudu/transactions/participant_rpc.cc
12 files changed, 376 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/59/17159/3
--
To view, visit http://gerrit.cloudera.org:8080/17159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If26733cae16810f3b3afd1fd05dcb984e6366939
Gerrit-Change-Number: 17159
Gerrit-PatchSet: 3
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] (wip) KUDU-2612: acquire and release PartitionLock

2021-03-09 Thread Hao Hao (Code Review)
Hello Tidy Bot, Kudu Jenkins, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17159

to look at the new patch set (#2).

Change subject: (wip) KUDU-2612: acquire and release PartitionLock
..

(wip) KUDU-2612: acquire and release PartitionLock

This patch introduces the logic for acquiring and releasing the
PartitionLock at ParticipantOp for BEGIN_TXN, FINALIZE_COMMIT
and ABORT_TXN.

wip: move code for ScopedPartitionLock to the previous patch,
 add acquire and release PartitionLock at WriteOp,
 update TxnOpDispatcher error handling,
 add tests.

Change-Id: If26733cae16810f3b3afd1fd05dcb984e6366939
---
M src/kudu/tablet/lock_manager.cc
M src/kudu/tablet/lock_manager.h
M src/kudu/tablet/ops/participant_op.cc
M src/kudu/tablet/ops/participant_op.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/txn_participant-test.cc
M src/kudu/tablet/txn_participant.h
8 files changed, 138 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/59/17159/2
--
To view, visit http://gerrit.cloudera.org:8080/17159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If26733cae16810f3b3afd1fd05dcb984e6366939
Gerrit-Change-Number: 17159
Gerrit-PatchSet: 2
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] (wip) KUDU-2612: acquire and release PartitionLock

2021-03-08 Thread Hao Hao (Code Review)
Hao Hao has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17159


Change subject: (wip) KUDU-2612: acquire and release PartitionLock
..

(wip) KUDU-2612: acquire and release PartitionLock

This patch introduces the logic for acquiring and releasing the
PartitionLock at ParticipantOp for BEGIN_TXN, FINALIZE_COMMIT
and ABORT_TXN.

wip: move code for ScopedPartitionLock to the previous patch,
 add acquire and release PartitionLock at WriteOp,
 add tests.

Change-Id: If26733cae16810f3b3afd1fd05dcb984e6366939
---
M src/kudu/tablet/lock_manager.cc
M src/kudu/tablet/lock_manager.h
M src/kudu/tablet/ops/participant_op.cc
M src/kudu/tablet/ops/participant_op.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
6 files changed, 91 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/59/17159/1
--
To view, visit http://gerrit.cloudera.org:8080/17159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If26733cae16810f3b3afd1fd05dcb984e6366939
Gerrit-Change-Number: 17159
Gerrit-PatchSet: 1
Gerrit-Owner: Hao Hao 


[kudu-CR] [java] KUDU-3213: try at different server on TABLET NOT RUNNING

2021-03-01 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17124 )

Change subject: [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17124
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I38ac84a52676ff361fa1ba996665b338d1bbfba1
Gerrit-Change-Number: 17124
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 02 Mar 2021 00:03:12 +
Gerrit-HasComments: No


[kudu-CR] [client-test] more robust TestRetrieveAuthzTokenInParallel

2021-02-26 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17128 )

Change subject: [client-test] more robust TestRetrieveAuthzTokenInParallel
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17128
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0afa5ae72ccb96a218b6c6ff026ad88a70dea4f7
Gerrit-Change-Number: 17128
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 26 Feb 2021 20:28:16 +
Gerrit-HasComments: No


[kudu-CR] [java] KUDU-3213: try at different server on TABLET NOT RUNNING

2021-02-26 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17124 )

Change subject: [java] KUDU-3213: try at different server on TABLET_NOT_RUNNING
..


Patch Set 3: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17124/3/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java
File java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java:

http://gerrit.cloudera.org:8080/#/c/17124/3/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScanner.java@118
PS3, Line 118: quiesceTserver.waitFor();
 : assertEquals(0, quiesceTserver.waitFor()
why do we need to call waitFor() twice here?



--
To view, visit http://gerrit.cloudera.org:8080/17124
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I38ac84a52676ff361fa1ba996665b338d1bbfba1
Gerrit-Change-Number: 17124
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 26 Feb 2021 20:25:15 +
Gerrit-HasComments: Yes


[kudu-CR] k8s: minor updates to the StatefulSet

2021-02-26 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17098 )

Change subject: k8s: minor updates to the StatefulSet
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I50b0084f70d30187bcca4e356e38c35b2486c611
Gerrit-Change-Number: 17098
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 26 Feb 2021 20:12:13 +
Gerrit-HasComments: No


[kudu-CR] test: add more natural test for KUDU-2233

2021-02-26 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17114 )

Change subject: test: add more natural test for KUDU-2233
..


Patch Set 3: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17114/2/src/kudu/tablet/compaction.cc
File src/kudu/tablet/compaction.cc:

http://gerrit.cloudera.org:8080/#/c/17114/2/src/kudu/tablet/compaction.cc@77
PS2, Line 77: DEFINE_bool(dcheck_on_kudu_2233_invariants, true
> Yes, this one different as I can see: it's not quite used in NDEBUG case, w
I am fine with the current code, but if Alexey thinks differentiate the flag to 
be not exposed in NDEBUG case can help with the performance of execution. Then 
I am also good with that.



--
To view, visit http://gerrit.cloudera.org:8080/17114
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icac3ad0a1b6bb9b17d5b6a901dc5bba79c0840fa
Gerrit-Change-Number: 17114
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 26 Feb 2021 20:03:04 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: don't return NOT FOUND when BEGIN TXN has not yet run

2021-02-26 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17127 )

Change subject: KUDU-2612: don't return NOT_FOUND when BEGIN_TXN has not yet run
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17127
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8fa8caba384ee30536114a3e6466ad90b6d8e45d
Gerrit-Change-Number: 17127
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 26 Feb 2021 19:56:59 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2612: allow aborting after beginning to commit

2021-02-25 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17022 )

Change subject: KUDU-2612: allow aborting after beginning to commit
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17022/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17022/3//COMMIT_MSG@10
PS3, Line 10: FINALIZE_IN_PROGRESS state to serve as an intermediate step 
between
: COMMIT_IN_PROGRESS
Can you briefly describe the difference between COMMIT_IN_PROGRESS and 
FINALIZE_IN_PROGRESS. e.g. COMMIT_IN_PROGRESS covers the process from 
BeginCommit to Finalize. And FINALIZE_IN_PROGRESS covers the process from 
Finalize to Committed? Also for my own understanding once in 
FINALIZE_IN_PROGRESS state the transaction cannot be aborted?


http://gerrit.cloudera.org:8080/#/c/17022/3/src/kudu/transactions/txn_status_manager.h
File src/kudu/transactions/txn_status_manager.h:

http://gerrit.cloudera.org:8080/#/c/17022/3/src/kudu/transactions/txn_status_manager.h@155
PS3, Line 155: DCHECK_EQ(TxnStatePB::OPEN, abort_txn_);
I don't quite understand why not check TxnStatePB::ABORT_IN_PROGRESS instead of 
OPEN, Since  ABORT_IN_PROGRESS is the previous state of Aborted?



--
To view, visit http://gerrit.cloudera.org:8080/17022
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If1b6596df2db5601f7e17e528ad6dc68057b67f8
Gerrit-Change-Number: 17022
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Thu, 25 Feb 2021 19:57:09 +
Gerrit-HasComments: Yes


[kudu-CR] tablet: allow interleaving of row liveness between compaction input rows

2021-02-25 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16752 )

Change subject: tablet: allow interleaving of row liveness between compaction 
input rows
..


Patch Set 8: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16752/2/src/kudu/tablet/compaction.cc
File src/kudu/tablet/compaction.cc:

http://gerrit.cloudera.org:8080/#/c/16752/2/src/kudu/tablet/compaction.cc@453
PS2, Line 453: / If the last chang
> We discussed this on Slack, that I don't think it's a valid sequence in ter
Ack



--
To view, visit http://gerrit.cloudera.org:8080/16752
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I042a7d70d32edf9d2a3a077790821893f162880a
Gerrit-Change-Number: 16752
Gerrit-PatchSet: 8
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Thu, 25 Feb 2021 19:00:49 +
Gerrit-HasComments: Yes


[kudu-CR] test: add more natural test for KUDU-2233

2021-02-25 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17114 )

Change subject: test: add more natural test for KUDU-2233
..


Patch Set 2: Code-Review+1

(1 comment)

LGTM, thanks a lot for providing this patch test upgrade cases.

http://gerrit.cloudera.org:8080/#/c/17114/2/src/kudu/tablet/compaction.cc
File src/kudu/tablet/compaction.cc:

http://gerrit.cloudera.org:8080/#/c/17114/2/src/kudu/tablet/compaction.cc@78
PS2, Line 78: caused by KUDU-2233
nit: maybe specify this is test only as well.



--
To view, visit http://gerrit.cloudera.org:8080/17114
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icac3ad0a1b6bb9b17d5b6a901dc5bba79c0840fa
Gerrit-Change-Number: 17114
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Thu, 25 Feb 2021 18:45:36 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: add PartitionLock and ScopedPartitionLock

2021-02-23 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17097 )

Change subject: KUDU-2612: add PartitionLock and ScopedPartitionLock
..


Patch Set 2:

(23 comments)

http://gerrit.cloudera.org:8080/#/c/17097/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17097/2//COMMIT_MSG@17
PS2, Line 17: transaction
> nit: you meant a write operation here, not the whole transaction, right?
Done


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager-test.cc
File src/kudu/tablet/lock_manager-test.cc:

http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager-test.cc@159
PS2, Line 159: TEST_F(LockManagerTest, TestLockUnlockPartitionMultiTxn) {
 :   // Acquiring a lock that held by another transaction which has 
a lower txn ID
 :   // will get a 'TXN_LOCKED_ABORT' server error code.
 :   {
 : TxnId txn1(0);
 : ScopedPartitionLock first_lock(_manager_, 
LockManager::LOCK_EXCLUSIVE, txn1);
 : TabletServerErrorPB::Code code = 
TabletServerErrorPB::UNKNOWN_ERROR;
 : ASSERT_TRUE(first_lock.Acquired());
 : ASSERT_EQ(TabletServerErrorPB::UNKNOWN_ERROR, code);
 : ASSERT_EQ(1, lock_manager_.partition_lock_refs());
 :
 : TxnId txn2(1);
 : ScopedPartitionLock second_lock(_manager_, 
LockManager::LOCK_EXCLUSIVE, txn2);
 : ASSERT_FALSE(second_lock.Acquired());
 : ASSERT_EQ(TabletServerErrorPB::TXN_LOCKED_ABORT, code);
 :   }
> nit: probably makes sense to split TXN_LOCKED_ABORT code testing into a sep
Done


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager-test.cc@179
PS2, Line 179: TxnId txn1(1);
 : ScopedPartitionLock first_lock(_manager_, 
LockManager::LOCK_EXCLUSIVE, txn1);
 : TabletServerErrorPB::Code code = 
TabletServerErrorPB::UNKNOWN_ERROR;
 : ASSERT_TRUE(first_lock.Acquired());
 : ASSERT_EQ(TabletServerErrorPB::UNKNOWN_ERROR, code);
 : ASSERT_EQ(1, lock_manager_.partition_lock_refs());
 :
 : TxnId txn2(0);
> nit: the values of these transaction IDs are pretty confusing IMO. Might be
Done


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager-test.cc@223
PS2, Line 223: }
> nit: are there any post-conditions to check after releasing all the row loc
Done


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.h
File src/kudu/tablet/lock_manager.h:

http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.h@46
PS2, Line 46: // Super-simple lock manager implementation. This only supports 
exclusive
: // locks, and makes no attempt to prevent deadlocks if a single 
thread
: // takes multiple locks.
: //
: // In the future when we want to support multi-row transactions 
of some kind
: // we'll have to implement a proper lock manager with all its 
trappings,
: // but this should be enough for the single-row use case.
> nit: maybe, update this to reflect changes in this patch?
Done


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.h@85
PS2, Line 85:   // Lock to protect 'partition_lock_' and
:   // 'partition_lock_refs_'.
> nit: might be at a single line
Done


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.h@90
PS2, Line 90: by
:   // the same transaction.
> nit: do you mind extending the comment to explain where the information to
Done


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.h@154
PS2, Line 154: class PartitionLock {
> nit: mind adding a short description for this class?
Done


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.h@181
PS2, Line 181: assignment
> nit: assignment operator
Done


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.h@188
PS2, Line 188: Acquired
> style nit: rename to 'IsAcquired' or 'is_acquired'
Done


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.h@189
PS2, Line 189: Release
> nit: add a doc?
Done


http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.cc
File src/kudu/tablet/lock_manager.cc:

http://gerrit.cloudera.org:8080/#/c/17097/2/src/kudu/tablet/lock_manager.cc@365
PS2, Line 365: LockManager::LockMode mode,
> I could not find where this parameter is used.  Did I miss anything?
Yes, this is not used. It is actually not used anywhere in the codebase. I 
added it just to be symmetric with the definition of ScopedRowLock. My 
understanding is it may be used later when we introduce shared lock. But I can 
remove it if you think it is redundant.



[kudu-CR] KUDU-2612: add PartitionLock and ScopedPartitionLock

2021-02-23 Thread Hao Hao (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17097

to look at the new patch set (#3).

Change subject: KUDU-2612: add PartitionLock and ScopedPartitionLock
..

KUDU-2612: add PartitionLock and ScopedPartitionLock

This patch introduces a coarse grained partition level lock, PartitionLock
to prevent dirty writes for multi-rows transaction. A partition lock can
only be held by a single transaction at a time, the same transaction
can hold the lock for multiple times. To prevent deadlock, 'wait-die'
scheme is used, which if the transaction requires a lock held by another
transaction,
  1) abort the transaction immediately if it has a higher txn ID than
 the other one.
  2) Otherwise, retry the write op (or participant op) of the transaction.

A ScopedPartitionLock is also introduced to automatically manage the
lifecycle of a PartitionLock.

Change-Id: I158115739ce3e7cfb77bbcb854e834336c1256b1
---
M src/kudu/tablet/lock_manager-test.cc
M src/kudu/tablet/lock_manager.cc
M src/kudu/tablet/lock_manager.h
M src/kudu/tserver/tserver.proto
4 files changed, 278 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/97/17097/3
--
To view, visit http://gerrit.cloudera.org:8080/17097
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I158115739ce3e7cfb77bbcb854e834336c1256b1
Gerrit-Change-Number: 17097
Gerrit-PatchSet: 3
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-2612 tablet servers automatically register txn participants

2021-02-23 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17037 )

Change subject: KUDU-2612 tablet servers automatically register txn participants
..


Patch Set 9: Code-Review+1

(1 comment)

Not sure if the failed test is related?

http://gerrit.cloudera.org:8080/#/c/17037/9/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

http://gerrit.cloudera.org:8080/#/c/17037/9/src/kudu/tablet/tablet_replica.cc@1208
PS9, Line 1208: CancelPendingTxnOps
Is it possible for CancelPendingTxnOps to be called twice if BEGIN_TXN op 
failed? Since RegisteredParticipantCb can also trigger CancelPendingTxnOps? 
What will the impact be if so?



--
To view, visit http://gerrit.cloudera.org:8080/17037
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia383f7afd208c44695c57aab82e3818fa1712ce6
Gerrit-Change-Number: 17037
Gerrit-PatchSet: 9
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 23 Feb 2021 22:15:29 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: add PartitionLock and ScopedPartitionLock

2021-02-20 Thread Hao Hao (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17097

to look at the new patch set (#2).

Change subject: KUDU-2612: add PartitionLock and ScopedPartitionLock
..

KUDU-2612: add PartitionLock and ScopedPartitionLock

This patch introduces a coase grained partition level lock, PartitionLock
to prevent dirty writes for multi-rows transaction. A partition lock can
only be held by a single transaction at a time, the same transaction
can hold the lock for multiple times. To prevent deadlock, 'wait-die'
scheme is used, which if the transaction requires a lock held by another
transaction,
  1) abort the transaction immediately if it has a higher txn ID than
 the other one.
  2) Otherwise, retry the transaction.

A ScopedPartitionLock is also introduced to automatically manage the
lifecycle of a PartitionLock.

Change-Id: I158115739ce3e7cfb77bbcb854e834336c1256b1
---
M src/kudu/tablet/lock_manager-test.cc
M src/kudu/tablet/lock_manager.cc
M src/kudu/tablet/lock_manager.h
M src/kudu/tserver/tserver.proto
4 files changed, 279 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/97/17097/2
--
To view, visit http://gerrit.cloudera.org:8080/17097
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I158115739ce3e7cfb77bbcb854e834336c1256b1
Gerrit-Change-Number: 17097
Gerrit-PatchSet: 2
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-2612: add PartitionLock and ScopedPartitionLock

2021-02-20 Thread Hao Hao (Code Review)
Hao Hao has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17097


Change subject: KUDU-2612: add PartitionLock and ScopedPartitionLock
..

KUDU-2612: add PartitionLock and ScopedPartitionLock

This patch introduces a coase grained partition level lock, PartitionLock
to prevent dirty writes for multi-rows transaction. A partition lock can
only be held by a single transaction at a time, the same transaction
can hold the lock for multiple times. To prevent deadlock, 'wait-die'
scheme is used, which if the transaction requires a lock held by another
transaction,
  1) abort the transaction immediately if it has a higher txn ID than
 the other one.
  2) Otherwise, retry the transaction.

A ScopedPartitionLock is also introduced to automatically manage the
lifecycle of a PartitionLock.

Change-Id: I158115739ce3e7cfb77bbcb854e834336c1256b1
---
M src/kudu/tablet/lock_manager-test.cc
M src/kudu/tablet/lock_manager.cc
M src/kudu/tablet/lock_manager.h
M src/kudu/tserver/tserver.proto
4 files changed, 281 insertions(+), 7 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/97/17097/1
--
To view, visit http://gerrit.cloudera.org:8080/17097
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I158115739ce3e7cfb77bbcb854e834336c1256b1
Gerrit-Change-Number: 17097
Gerrit-PatchSet: 1
Gerrit-Owner: Hao Hao 


[kudu-CR] KUDU-2612 tablet servers automatically register txn participants

2021-02-18 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17037 )

Change subject: KUDU-2612 tablet servers automatically register txn participants
..


Patch Set 6:

(7 comments)

Just a first pass through.

http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/ops/participant_op.cc
File src/kudu/tablet/ops/participant_op.cc:

http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/ops/participant_op.cc@197
PS6, Line 197: 
RETURN_NOT_OK(tablet_replica_->UnregisterTxnOpDispatcher(txn_id()));
So begin commit can time out if the there are still writes pending when the 
retries finish?


http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.h
File src/kudu/tablet/tablet_replica.h:

http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.h@176
PS6, Line 176: void SubmitPendingTxnOps(int64_t txn_id);
 :   void CancelPendingTxnOps(int64_t txn_id, const Status& s);
nit: would you mind doc these two? Thanks!


http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.h@458
PS6, Line 458: bool ready_;
 : bool unregistered_;
 : Status inflight_status_;
 : std::deque> ops_queue_;
nit: would you mind also doc these?


http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

http://gerrit.cloudera.org:8080/#/c/17037/6/src/kudu/tablet/tablet_replica.cc@79
PS6, Line 79: 2
how this is evaluated. If default is 2, it doesn't seem to help a lot for 
buffering writes before begin txn?


http://gerrit.cloudera.org:8080/#/c/17037/5/src/kudu/tserver/ts_tablet_manager.h
File src/kudu/tserver/ts_tablet_manager.h:

http://gerrit.cloudera.org:8080/#/c/17037/5/src/kudu/tserver/ts_tablet_manager.h@247
PS5, Line 247: process
nit: to process for transactional write operations.


http://gerrit.cloudera.org:8080/#/c/17037/5/src/kudu/tserver/ts_tablet_manager.h@273
PS5, Line 273: static void RegisterAndBeginParticipantTxnTask(
nit: add a comment.


http://gerrit.cloudera.org:8080/#/c/17037/5/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

http://gerrit.cloudera.org:8080/#/c/17037/5/src/kudu/tserver/ts_tablet_manager.cc@1115
PS5, Line 1115: MonoTime deadline,
  : StatusCallback cb) {
  :   VLOG(3) << Substitute("registering participant $0 for txn ID 
$1",
  : replica->tablet_id(), txn_id);
  :   DCHECK(txn_system_client);
  :   //
it is ok to check time out even before the actual RegisterParticipant get 
called?



--
To view, visit http://gerrit.cloudera.org:8080/17037
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia383f7afd208c44695c57aab82e3818fa1712ce6
Gerrit-Change-Number: 17037
Gerrit-PatchSet: 6
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Thu, 18 Feb 2021 21:49:50 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: make BeginCommit return OK if already committed

2021-02-16 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17073 )

Change subject: KUDU-2612: make BeginCommit return OK if already committed
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17073/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17073/1//COMMIT_MSG@9
PS1, Line 9: This patch makes repeat calls to BeginCommit by the 
TxnStatusManager
   : return OK rather than an IllegalState error
Maybe I am missing something, I didn't see this actual change in the patch?



--
To view, visit http://gerrit.cloudera.org:8080/17073
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I420ba1c5460d59103984ea9a1f16177b82b8d0e1
Gerrit-Change-Number: 17073
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 17 Feb 2021 00:49:23 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: add background task to abort transaction participants

2021-02-09 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17017 )

Change subject: KUDU-2612: add background task to abort transaction participants
..


Patch Set 8: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17017/5/src/kudu/integration-tests/txn_commit-itest.cc
File src/kudu/integration-tests/txn_commit-itest.cc:

http://gerrit.cloudera.org:8080/#/c/17017/5/src/kudu/integration-tests/txn_commit-itest.cc@332
PS5, Line 332: UALLY([&] {
> At first my rationale was that if the transaction was aborted, from a clien
Ack, sounds good to me as well.


http://gerrit.cloudera.org:8080/#/c/17017/5/src/kudu/transactions/txn_status_manager.cc
File src/kudu/transactions/txn_status_manager.cc:

http://gerrit.cloudera.org:8080/#/c/17017/5/src/kudu/transactions/txn_status_manager.cc@982
PS5, Line 982: RETURN_NOT_OK(status_tablet_.UpdateTransaction(
 :   txn_id, mutable_data->pb, ts_error));
 :   txn_lock.Commit();
 :   return Status::OK();
 : }
 :
 : Status TxnStatusManager::FinalizeAbortTransaction(int64_t 
txn_id) {
 :   leader_lock_.AssertAcquiredForReading();
 :
> Ah nice catch! This isn't necessary -- the commit tasks clean up after them
Ack



--
To view, visit http://gerrit.cloudera.org:8080/17017
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I484c315c6f7331c5ec12cb06370fbaae9c7c343e
Gerrit-Change-Number: 17017
Gerrit-PatchSet: 8
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Wed, 10 Feb 2021 00:37:52 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: add background task to abort transaction participants

2021-02-08 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17017 )

Change subject: KUDU-2612: add background task to abort transaction participants
..


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17017/5/src/kudu/client/transaction-internal.cc
File src/kudu/client/transaction-internal.cc:

http://gerrit.cloudera.org:8080/#/c/17017/5/src/kudu/client/transaction-internal.cc@309
PS5, Line 309: case TxnStatePB::ABORT_IN_PROGRESS:
 : case TxnStatePB::ABORTED:
 :   *is_complete = true;
 :   *completion_status = Status::Aborted("transaction has been 
aborted");
I understand we consider ABORT_IN_PROGRESS to be eventually aborted no matter 
what. But do you think it still worth returning different error msg to 
differentiate txn that is actually aborted v.s. still aborting?


http://gerrit.cloudera.org:8080/#/c/17017/5/src/kudu/transactions/txn_status_manager.cc
File src/kudu/transactions/txn_status_manager.cc:

http://gerrit.cloudera.org:8080/#/c/17017/5/src/kudu/transactions/txn_status_manager.cc@331
PS5, Line 331: write the finalized
 : // commit timestamp to the tablet.
nit: update to write the abort record?


http://gerrit.cloudera.org:8080/#/c/17017/5/src/kudu/transactions/txn_status_manager.cc@982
PS5, Line 982: mutable_data->pb.set_commit_timestamp(commit_timestamp.value());
 :   RETURN_NOT_OK(status_tablet_.UpdateTransaction(
 :   txn_id, mutable_data->pb, ts_error));
 :
 :   {
 : std::lock_guard l(lock_);
 : commits_in_flight_.erase(txn_id);
 :   }
 :
Just curious, it seems this should belong to the previous patch which 
introduced background task to commit transactions?



--
To view, visit http://gerrit.cloudera.org:8080/17017
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I484c315c6f7331c5ec12cb06370fbaae9c7c343e
Gerrit-Change-Number: 17017
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 09 Feb 2021 06:50:47 +
Gerrit-HasComments: Yes


[kudu-CR] [util] remove AutoReleasePool and cleanup related code

2021-02-08 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17036 )

Change subject: [util] remove AutoReleasePool and cleanup related code
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17036
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8adb5d08ec37e716baf524cb6c5f52366553883c
Gerrit-Change-Number: 17036
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 09 Feb 2021 04:38:18 +
Gerrit-HasComments: No


[kudu-CR] [master] turn off client location assignment by default

2021-02-08 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17024 )

Change subject: [master] turn off client location assignment by default
..


Patch Set 3: Code-Review+1

Looks good to me, but I am not sure if Grant or Andrew have other thoughts on 
it.


--
To view, visit http://gerrit.cloudera.org:8080/17024
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I78474ced0a0129b3f2b1add55f6f908a136106d0
Gerrit-Change-Number: 17024
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Mon, 08 Feb 2021 22:56:02 +
Gerrit-HasComments: No


[kudu-CR] [master] turn off client location assignment by default

2021-02-08 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17024 )

Change subject: [master] turn off client location assignment by default
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17024/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17024/3//COMMIT_MSG@11
PS3, Line 11: resource consuming
> Right: clients do not actively participate in location assignment, the only
Ack



--
To view, visit http://gerrit.cloudera.org:8080/17024
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I78474ced0a0129b3f2b1add55f6f908a136106d0
Gerrit-Change-Number: 17024
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Mon, 08 Feb 2021 22:55:01 +
Gerrit-HasComments: Yes


[kudu-CR] [util] remove AutoReleasePool and cleanup related code

2021-02-08 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17036 )

Change subject: [util] remove AutoReleasePool and cleanup related code
..


Patch Set 1: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17036/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17036/1//COMMIT_MSG@9
PS1, Line 9: AuthReleasePool
nit: AutoReleasePool



--
To view, visit http://gerrit.cloudera.org:8080/17036
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8adb5d08ec37e716baf524cb6c5f52366553883c
Gerrit-Change-Number: 17036
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Mon, 08 Feb 2021 22:54:28 +
Gerrit-HasComments: Yes


[kudu-CR] [master] turn off client location assignment by default

2021-02-08 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17024 )

Change subject: [master] turn off client location assignment by default
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17024/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17024/3//COMMIT_MSG@11
PS3, Line 11: resource consuming
> From KUDU-3212:
Thanks for the explanation! So this is only master side resource consumption 
but not client side?



--
To view, visit http://gerrit.cloudera.org:8080/17024
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I78474ced0a0129b3f2b1add55f6f908a136106d0
Gerrit-Change-Number: 17024
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Mon, 08 Feb 2021 19:33:55 +
Gerrit-HasComments: Yes


[kudu-CR] [master] turn off client location assignment by default

2021-02-05 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17024 )

Change subject: [master] turn off client location assignment by default
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17024/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17024/3//COMMIT_MSG@11
PS3, Line 11: resource consuming
Sorry, can you elaborate that a bit more? is t resource consuming on masters or 
clients? I looked into KUDU-3212, but wasn't able to find description related 
to this. Or maybe I missed it?



--
To view, visit http://gerrit.cloudera.org:8080/17024
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I78474ced0a0129b3f2b1add55f6f908a136106d0
Gerrit-Change-Number: 17024
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Sat, 06 Feb 2021 05:50:26 +
Gerrit-HasComments: Yes


[kudu-CR] txn commit-itest: reduce runtime of TestCommitWhileDeletingTxnStatusManager

2021-02-05 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17033 )

Change subject: txn_commit-itest: reduce runtime of 
TestCommitWhileDeletingTxnStatusManager
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17033
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f4f3868bf7d88b0fe34cc63d6139ec867eba58e
Gerrit-Change-Number: 17033
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Sat, 06 Feb 2021 05:01:43 +
Gerrit-HasComments: No


[kudu-CR] rpc: reduce logging when server is shutting down

2021-02-05 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17032 )

Change subject: rpc: reduce logging when server is shutting down
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17032
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I943a192bc12e3bae534b9129ac4ea8d24404adc6
Gerrit-Change-Number: 17032
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Sat, 06 Feb 2021 05:00:26 +
Gerrit-HasComments: No


[kudu-CR] txn commit-itest: deflake TestCommitTasksReloadOnLeadershipChange

2021-02-05 Thread Hao Hao (Code Review)
Hao Hao has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17031 )

Change subject: txn_commit-itest: deflake 
TestCommitTasksReloadOnLeadershipChange
..

txn_commit-itest: deflake TestCommitTasksReloadOnLeadershipChange

The test shows up on the flaky test dashboard as failing around 20% of
the time. As it turns out, transferring leadership by quiescing multiple
replicas can lead to flakiness if we happen to pick a lagging replica as
the new leader.

Instead of targeting a specific tablet server as the host of the new
leaders, we'll now just quiesce the old leader tablet server and stop
quiescing the other tablet servers.

I ran the test in DEBUG mode 100 times. Before this patch, it failed 16
times; with it, it passed 100/100 times.

Change-Id: I2b27864e72888367eb0af7de59e044a9e018c31b
Reviewed-on: http://gerrit.cloudera.org:8080/17031
Tested-by: Kudu Jenkins
Reviewed-by: Hao Hao 
---
M src/kudu/integration-tests/txn_commit-itest.cc
1 file changed, 8 insertions(+), 7 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Hao Hao: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/17031
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I2b27864e72888367eb0af7de59e044a9e018c31b
Gerrit-Change-Number: 17031
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] txn commit-itest: deflake TestCommitTasksReloadOnLeadershipChange

2021-02-05 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17031 )

Change subject: txn_commit-itest: deflake 
TestCommitTasksReloadOnLeadershipChange
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17031
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2b27864e72888367eb0af7de59e044a9e018c31b
Gerrit-Change-Number: 17031
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Sat, 06 Feb 2021 04:58:31 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2612: background task to commit transaction

2021-02-05 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16952 )

Change subject: KUDU-2612: background task to commit transaction
..


Patch Set 16: Code-Review+2

(3 comments)

Looks good to me. I didn't get a chance to check the tests closely. But I 
believe Alexey has done so. Thanks a lot Andrew for the patch!

http://gerrit.cloudera.org:8080/#/c/16952/14/src/kudu/transactions/txn_status_manager.cc
File src/kudu/transactions/txn_status_manager.cc:

http://gerrit.cloudera.org:8080/#/c/16952/14/src/kudu/transactions/txn_status_manager.cc@257
PS14, Line 257:   participant_ids_[participant_idx],
  :   std::move(op_pb),
  :   
MonoDelta::FromMilliseconds(FLAGS_txn_background_rpc_timeout_ms),
  :   std::move(participated_cb),
  :   _commit_timestamps_[participant_idx]);
  : }
> It's not identical everywhere, but sure.
Ack


http://gerrit.cloudera.org:8080/#/c/16952/14/src/kudu/transactions/txn_status_manager.cc@268
PS14, Line 268: pated_cb = [this, scoped_this = std::move(scoped_this),
  :   participant_idx, commit_timestamp] 
(const Status& s) {
  : if (IsShuttingDownCleanupIfLastOp()) {
  :
> Yeah, added a comment to make that clear that it's intentional.
Ack


http://gerrit.cloudera.org:8080/#/c/16952/14/src/kudu/transactions/txn_status_manager.cc@787
PS14, Line 787:
  :   std::lock_guar
> There is always at least the BEGIN_COMMIT op timestamp, regardless of any w
Ack



--
To view, visit http://gerrit.cloudera.org:8080/16952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie2258dded3ab3d527cb5d0abdc7d5e7deb4da15e
Gerrit-Change-Number: 16952
Gerrit-PatchSet: 16
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 05 Feb 2021 08:12:29 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: background task to commit transaction

2021-02-04 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16952 )

Change subject: KUDU-2612: background task to commit transaction
..


Patch Set 14:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/16952/14/src/kudu/transactions/txn_status_manager.h
File src/kudu/transactions/txn_status_manager.h:

http://gerrit.cloudera.org:8080/#/c/16952/14/src/kudu/transactions/txn_status_manager.h@92
PS14, Line 92: BeginCommitTask
nit: maybe name it BeginCommitAsyncTask if it is asynchronously as well. Same 
for below.


http://gerrit.cloudera.org:8080/#/c/16952/14/src/kudu/transactions/txn_status_manager.h@102
PS14, Line 102: commit_timestamp
nit: can you comment on how the timestamp will be used?


http://gerrit.cloudera.org:8080/#/c/16952/14/src/kudu/transactions/txn_status_manager.h@321
PS14, Line 321: // Loads the contents of the transaction status tablet into 
memory.
nit: can you update the comment to add that we will also load commits in flight 
and resume these commits?


http://gerrit.cloudera.org:8080/#/c/16952/7/src/kudu/transactions/txn_status_manager.cc
File src/kudu/transactions/txn_status_manager.cc:

http://gerrit.cloudera.org:8080/#/c/16952/7/src/kudu/transactions/txn_status_manager.cc@205
PS7, Line 205: return;
 : }
nit: should this comment be moved to L201 to cover 's.IsNotFound()' case?


http://gerrit.cloudera.org:8080/#/c/16952/14/src/kudu/transactions/txn_status_manager.cc
File src/kudu/transactions/txn_status_manager.cc:

http://gerrit.cloudera.org:8080/#/c/16952/14/src/kudu/transactions/txn_status_manager.cc@123
PS14, Line 123: "
nit: space.


http://gerrit.cloudera.org:8080/#/c/16952/14/src/kudu/transactions/txn_status_manager.cc@215
PS14, Line 215: If the participant has been deleted (it's not found), proceed 
as if it
  :   // were committed.
nit: remove?


http://gerrit.cloudera.org:8080/#/c/16952/14/src/kudu/transactions/txn_status_manager.cc@257
PS14, Line 257: if (stop_task_ || txn_status_manager_->shutting_down()) {
  :   if (--ops_in_flight_ == 0) {
  : 
txn_status_manager_->RemoveCommitTask(this->txn_id_.value(), this);
  :   }
  :   return;
  : }
nit: do you think this block of code worth writing to a method of own? As I see 
it is repeating serval places?


http://gerrit.cloudera.org:8080/#/c/16952/14/src/kudu/transactions/txn_status_manager.cc@268
PS14, Line 268: (PREDICT_FALSE(!s.ok())) {
  :   LOG(WARNING) << Substitute("Participant $0 
FINALIZE_COMMIT op returned $1",
  :  
participant_ids_[participant_idx], s.ToString());
  : }
So we proceed to ScheduleFinalizeCommitWrite even we encounter error here?


http://gerrit.cloudera.org:8080/#/c/16952/14/src/kudu/transactions/txn_status_manager.cc@787
PS14, Line 787: // If there are no participants for this transaction; just 
write an invalid
  : // timestamp
This happens when there is no ops in the transaction? What does invalid 
timestamp mean in this case?



--
To view, visit http://gerrit.cloudera.org:8080/16952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie2258dded3ab3d527cb5d0abdc7d5e7deb4da15e
Gerrit-Change-Number: 16952
Gerrit-PatchSet: 14
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 05 Feb 2021 01:35:35 +
Gerrit-HasComments: Yes


[kudu-CR] [test] enable TxnStatusManagerITest.TxnKeepAliveMultiTxnStatusManagerInstances

2021-02-04 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17021 )

Change subject: [test] enable 
TxnStatusManagerITest.TxnKeepAliveMultiTxnStatusManagerInstances
..


Patch Set 4:

> Patch Set 3: Code-Review+2
> 
> Thank you for addressing the flakiness issue!
>
> I guess I'll need to clarify on the original settings because they are very 
> close to what client's keepalive logic has.

Sounds good, thanks a lot!


--
To view, visit http://gerrit.cloudera.org:8080/17021
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27afb53ecb9d28dc15fa7f0c26e677dafee2030f
Gerrit-Change-Number: 17021
Gerrit-PatchSet: 4
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Thu, 04 Feb 2021 18:49:47 +
Gerrit-HasComments: No


[kudu-CR] [test] enable TxnStatusManagerITest.TxnKeepAliveMultiTxnStatusManagerInstances

2021-02-03 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17021 )

Change subject: [test] enable 
TxnStatusManagerITest.TxnKeepAliveMultiTxnStatusManagerInstances
..


Patch Set 3: Verified+1

Unrelated flaky failure 
ExactlyOnceSemanticsITest.TestWritesWithExactlyOnceSemanticsWithCrashyNodes


--
To view, visit http://gerrit.cloudera.org:8080/17021
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27afb53ecb9d28dc15fa7f0c26e677dafee2030f
Gerrit-Change-Number: 17021
Gerrit-PatchSet: 3
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Thu, 04 Feb 2021 06:04:11 +
Gerrit-HasComments: No


[kudu-CR] [test] enable TxnStatusManagerITest.TxnKeepAliveMultiTxnStatusManagerInstances

2021-02-03 Thread Hao Hao (Code Review)
Hao Hao has removed a vote on this change.

Change subject: [test] enable 
TxnStatusManagerITest.TxnKeepAliveMultiTxnStatusManagerInstances
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/17021
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I27afb53ecb9d28dc15fa7f0c26e677dafee2030f
Gerrit-Change-Number: 17021
Gerrit-PatchSet: 3
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] [test] enable TxnStatusManagerITest.TxnKeepAliveMultiTxnStatusManagerInstances

2021-02-03 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17021 )

Change subject: [test] enable 
TxnStatusManagerITest.TxnKeepAliveMultiTxnStatusManagerInstances
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17021/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17021/1//COMMIT_MSG@10
PS1, Line 10: And deflake it by decreasing the intervals
: of KeepTransactionAlive calls and reduce the RPC timeout
> I saw some other changes made.  Are those extra changes necessary to make t
Yeah, it is not relevant to address flakiness. I was trying to make the test 
more clean. Will revert it.


http://gerrit.cloudera.org:8080/#/c/17021/1/src/kudu/integration-tests/txn_status_manager-itest.cc
File src/kudu/integration-tests/txn_status_manager-itest.cc:

http://gerrit.cloudera.org:8080/#/c/17021/1/src/kudu/integration-tests/txn_status_manager-itest.cc@391
PS1, Line 391: SleepFor(MonoDelta::FromMilliseconds(
> It seems this line was used in debugging and is no longer needed -- please
Done



--
To view, visit http://gerrit.cloudera.org:8080/17021
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27afb53ecb9d28dc15fa7f0c26e677dafee2030f
Gerrit-Change-Number: 17021
Gerrit-PatchSet: 3
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Thu, 04 Feb 2021 05:15:14 +
Gerrit-HasComments: Yes


[kudu-CR] [test] enable TxnStatusManagerITest.TxnKeepAliveMultiTxnStatusManagerInstances

2021-02-03 Thread Hao Hao (Code Review)
Hello Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17021

to look at the new patch set (#3).

Change subject: [test] enable 
TxnStatusManagerITest.TxnKeepAliveMultiTxnStatusManagerInstances
..

[test] enable TxnStatusManagerITest.TxnKeepAliveMultiTxnStatusManagerInstances

This patch enables TxnKeepAliveMultiTxnStatusManagerInstances test in
txn_status_manager-itest. And deflake it by decreasing the intervals
of KeepTransactionAlive calls and reduce the RPC timeout. I looped
TxnStatusManagerITest 100 times and it passed.

Change-Id: I27afb53ecb9d28dc15fa7f0c26e677dafee2030f
---
M src/kudu/integration-tests/txn_status_manager-itest.cc
1 file changed, 3 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/21/17021/3
--
To view, visit http://gerrit.cloudera.org:8080/17021
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I27afb53ecb9d28dc15fa7f0c26e677dafee2030f
Gerrit-Change-Number: 17021
Gerrit-PatchSet: 3
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] [test] enable TxnStatusManagerITest.TxnKeepAliveMultiTxnStatusManagerInstances

2021-02-03 Thread Hao Hao (Code Review)
Hello Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17021

to look at the new patch set (#2).

Change subject: [test] enable 
TxnStatusManagerITest.TxnKeepAliveMultiTxnStatusManagerInstances
..

[test] enable TxnStatusManagerITest.TxnKeepAliveMultiTxnStatusManagerInstances

This patch enables TxnKeepAliveMultiTxnStatusManagerInstances test in
txn_status_manager-itest. And deflake it by decreasing the intervals
of KeepTransactionAlive calls and reduce the RPC timeout. I looped
TxnStatusManagerITest 100 times and it passed.

Change-Id: I27afb53ecb9d28dc15fa7f0c26e677dafee2030f
---
M src/kudu/integration-tests/txn_status_manager-itest.cc
1 file changed, 41 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/21/17021/2
--
To view, visit http://gerrit.cloudera.org:8080/17021
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I27afb53ecb9d28dc15fa7f0c26e677dafee2030f
Gerrit-Change-Number: 17021
Gerrit-PatchSet: 2
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] [test] enable TxnStatusManagerITest.TxnKeepAliveMultiTxnStatusManagerInstances

2021-02-03 Thread Hao Hao (Code Review)
Hao Hao has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17021


Change subject: [test] enable 
TxnStatusManagerITest.TxnKeepAliveMultiTxnStatusManagerInstances
..

[test] enable TxnStatusManagerITest.TxnKeepAliveMultiTxnStatusManagerInstances

This patch enables TxnKeepAliveMultiTxnStatusManagerInstances test in
txn_status_manager-itest. And deflake it by decreasing the intervals
of KeepTransactionAlive calls and reduce the RPC timeout. I looped
TxnStatusManagerITest 100 times and it passed.

Change-Id: I27afb53ecb9d28dc15fa7f0c26e677dafee2030f
---
M src/kudu/integration-tests/txn_status_manager-itest.cc
1 file changed, 14 insertions(+), 11 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/21/17021/1
--
To view, visit http://gerrit.cloudera.org:8080/17021
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I27afb53ecb9d28dc15fa7f0c26e677dafee2030f
Gerrit-Change-Number: 17021
Gerrit-PatchSet: 1
Gerrit-Owner: Hao Hao 


[kudu-CR] KUDU-2612: follow up of commit c033487

2021-02-02 Thread Hao Hao (Code Review)
Hao Hao has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17013 )

Change subject: KUDU-2612: follow up of commit c033487
..

KUDU-2612: follow up of commit c033487

This patch address leftover comments for commit c033487.

Change-Id: I3ac5c1c123e33d60bc7c23dec015072d5c5ed9fd
Reviewed-on: http://gerrit.cloudera.org:8080/17013
Reviewed-by: Hao Hao 
Reviewed-by: Alexey Serbin 
Tested-by: Kudu Jenkins
---
M src/kudu/integration-tests/txn_status_table-itest.cc
M src/kudu/tserver/ts_tablet_manager.cc
2 files changed, 4 insertions(+), 3 deletions(-)

Approvals:
  Hao Hao: Looks good to me, approved
  Alexey Serbin: Looks good to me, approved
  Kudu Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/17013
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I3ac5c1c123e33d60bc7c23dec015072d5c5ed9fd
Gerrit-Change-Number: 17013
Gerrit-PatchSet: 4
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] KUDU-2612: follow up of commit c033487

2021-02-02 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17013 )

Change subject: KUDU-2612: follow up of commit c033487
..


Patch Set 3: Code-Review+2

Patch Set 2: Code-Review+2
>
> Yes, sure: please feel free to address the flakiness in 
> TxnStatusManagerITest.TxnKeepAliveMultiTxnStatusManagerInstances as a 
> separate changelist.
>
> Thank you!
Thanks! Carried over your +2, as the latest version is just to add a comment.


-- 
To view, visit http://gerrit.cloudera.org:8080/17013
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ac5c1c123e33d60bc7c23dec015072d5c5ed9fd
Gerrit-Change-Number: 17013
Gerrit-PatchSet: 3
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 02 Feb 2021 19:17:15 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2612: follow up of commit c033487

2021-02-02 Thread Hao Hao (Code Review)
Hello Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17013

to look at the new patch set (#3).

Change subject: KUDU-2612: follow up of commit c033487
..

KUDU-2612: follow up of commit c033487

This patch address leftover comments for commit c033487.

Change-Id: I3ac5c1c123e33d60bc7c23dec015072d5c5ed9fd
---
M src/kudu/integration-tests/txn_status_table-itest.cc
M src/kudu/tserver/ts_tablet_manager.cc
2 files changed, 4 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/13/17013/3
--
To view, visit http://gerrit.cloudera.org:8080/17013
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3ac5c1c123e33d60bc7c23dec015072d5c5ed9fd
Gerrit-Change-Number: 17013
Gerrit-PatchSet: 3
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] KUDU-2612: follow up of commit c033487

2021-02-01 Thread Hao Hao (Code Review)
Hello Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17013

to look at the new patch set (#2).

Change subject: KUDU-2612: follow up of commit c033487
..

KUDU-2612: follow up of commit c033487

This patch address leftover comments for commit c033487.

Change-Id: I3ac5c1c123e33d60bc7c23dec015072d5c5ed9fd
---
M src/kudu/integration-tests/txn_status_table-itest.cc
M src/kudu/tserver/ts_tablet_manager.cc
2 files changed, 2 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/13/17013/2
--
To view, visit http://gerrit.cloudera.org:8080/17013
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3ac5c1c123e33d60bc7c23dec015072d5c5ed9fd
Gerrit-Change-Number: 17013
Gerrit-PatchSet: 2
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] KUDU-2612: follow up of commit c033487

2021-02-01 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17013 )

Change subject: KUDU-2612: follow up of commit c033487
..


Patch Set 1:

(1 comment)

> Patch Set 1: Code-Review+1
>
> (1 comment)
>
> Thank you for addressing those comments from 
> https://gerrit.cloudera.org/#/c/16648/
>
> This change looks good, but 
> TxnStatusManagerITest.TxnKeepAliveMultiTxnStatusManagerInstances is still 
> failing.

Right, I haven't got a chance to dig whether this is a test flakiness or not. 
So I will revert the disable change for now and address it in another patch 
tomorrow. As they don't need to be within the same one. Does this sound good to 
you?

http://gerrit.cloudera.org:8080/#/c/17013/1/src/kudu/integration-tests/txn_status_table-itest.cc
File src/kudu/integration-tests/txn_status_table-itest.cc:

http://gerrit.cloudera.org:8080/#/c/17013/1/src/kudu/integration-tests/txn_status_table-itest.cc@749
PS1, Line 749: 2
> In prior version it was 3, but I guess txn_id 2 works flawlessly as well he
Yes. :)



--
To view, visit http://gerrit.cloudera.org:8080/17013
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ac5c1c123e33d60bc7c23dec015072d5c5ed9fd
Gerrit-Change-Number: 17013
Gerrit-PatchSet: 1
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 02 Feb 2021 02:24:24 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

2021-02-01 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16648 )

Change subject: KUDU-2612: restrict TxnStatusManager calls to be made by the 
leader only
..


Patch Set 16:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/integration-tests/txn_status_table-itest.cc
File src/kudu/integration-tests/txn_status_table-itest.cc:

http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/integration-tests/txn_status_table-itest.cc@714
PS9, Line 714:   orig_leader
> nit: I guess this isn't necessary once the ASSERT_EVENTUALLY() below is gon
Done


http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/integration-tests/txn_status_table-itest.cc@715
PS9, Line 715: TONED, /
> nit: just put 3 here?
Address it in a follow up patch.


http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/tserver/ts_tablet_manager.cc@1457
PS9, Line 1457: nStatusM
> Because the l.first_failed_status().ok() most likely will return not-OK for
Hmm, makes sense. I will address it in a follow up commit.



--
To view, visit http://gerrit.cloudera.org:8080/16648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
Gerrit-Change-Number: 16648
Gerrit-PatchSet: 16
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 01 Feb 2021 22:57:56 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: follow up of commit c033487

2021-02-01 Thread Hao Hao (Code Review)
Hao Hao has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17013


Change subject: KUDU-2612: follow up of commit c033487
..

KUDU-2612: follow up of commit c033487

This patch address leftover comments for commit c033487.

Change-Id: I3ac5c1c123e33d60bc7c23dec015072d5c5ed9fd
---
M src/kudu/integration-tests/txn_status_manager-itest.cc
M src/kudu/integration-tests/txn_status_table-itest.cc
M src/kudu/tserver/ts_tablet_manager.cc
3 files changed, 3 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/13/17013/1
--
To view, visit http://gerrit.cloudera.org:8080/17013
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I3ac5c1c123e33d60bc7c23dec015072d5c5ed9fd
Gerrit-Change-Number: 17013
Gerrit-PatchSet: 1
Gerrit-Owner: Hao Hao 


[kudu-CR] KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

2021-01-30 Thread Hao Hao (Code Review)
Hao Hao has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16648 )

Change subject: KUDU-2612: restrict TxnStatusManager calls to be made by the 
leader only
..

KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

Currently, even though a non-leader TxnStatusManager will be unable to
write to the underlying table (in the Raft subsystem the write would be
aborted), we may want to restrict calls to be made by the leader
TxnStatusManagers only. The motivation is to provide a more robust system,
which avoids cases when the request was sent to a laggy follower, we may
end up failing the request with an error.

This patch introduces ScopedLeaderSharedLock (similar to the one in Catalog
Manager) to be used to ensure the requests were sent to leaders only and
to block all other operations while reloading the persistent transaction
status metadata upon leadership changes. Note that during failover the
leader replica will wait until in-flight ops in the previous consensus
term to be applied before reloading the metadata.

Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
Reviewed-on: http://gerrit.cloudera.org:8080/16648
Tested-by: Hao Hao 
Reviewed-by: Andrew Wong 
---
M src/kudu/client/client-test.cc
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/integration-tests/txn_status_manager-itest.cc
M src/kudu/integration-tests/txn_status_table-itest.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/tablet_replica-test-base.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tablet/txn_coordinator.h
M src/kudu/transactions/txn_status_manager-test.cc
M src/kudu/transactions/txn_status_manager.cc
M src/kudu/transactions/txn_status_manager.h
M src/kudu/transactions/txn_status_tablet.h
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
17 files changed, 850 insertions(+), 140 deletions(-)

Approvals:
  Hao Hao: Verified
  Andrew Wong: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/16648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
Gerrit-Change-Number: 16648
Gerrit-PatchSet: 16
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

2021-01-30 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16648 )

Change subject: KUDU-2612: restrict TxnStatusManager calls to be made by the 
leader only
..


Patch Set 15:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16648/15/src/kudu/integration-tests/txn_status_table-itest.cc
File src/kudu/integration-tests/txn_status_table-itest.cc:

http://gerrit.cloudera.org:8080/#/c/16648/15/src/kudu/integration-tests/txn_status_table-itest.cc@774
PS15, Line 774:   FLAGS_txn_status_tablet_inject_load_failure_error = true;
> You mentioned on slack that this won't do io unless we flush. It's probably
Ack



--
To view, visit http://gerrit.cloudera.org:8080/16648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
Gerrit-Change-Number: 16648
Gerrit-PatchSet: 15
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sun, 31 Jan 2021 06:53:16 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

2021-01-30 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16648 )

Change subject: KUDU-2612: restrict TxnStatusManager calls to be made by the 
leader only
..


Patch Set 15:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16648/15/src/kudu/integration-tests/txn_status_table-itest.cc
File src/kudu/integration-tests/txn_status_table-itest.cc:

http://gerrit.cloudera.org:8080/#/c/16648/15/src/kudu/integration-tests/txn_status_table-itest.cc@774
PS15, Line 774:   FLAGS_txn_status_tablet_inject_load_failure_error = true;
> Do disk failures exercise this codepath? If so, we can just use --env_injec
Try to use --env_inject_eio but that seems to affect the log to be appended. Or 
maybe you have suggestions on how to narrow down to only affect transaction 
status tablet?



--
To view, visit http://gerrit.cloudera.org:8080/16648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
Gerrit-Change-Number: 16648
Gerrit-PatchSet: 15
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sun, 31 Jan 2021 01:13:00 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

2021-01-30 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16648 )

Change subject: KUDU-2612: restrict TxnStatusManager calls to be made by the 
leader only
..


Patch Set 15: Verified+1

Unrelated org.apache.kudu.client.TestKuduMetrics flaky test failure.


--
To view, visit http://gerrit.cloudera.org:8080/16648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
Gerrit-Change-Number: 16648
Gerrit-PatchSet: 15
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sun, 31 Jan 2021 00:31:39 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

2021-01-30 Thread Hao Hao (Code Review)
Hao Hao has removed a vote on this change.

Change subject: KUDU-2612: restrict TxnStatusManager calls to be made by the 
leader only
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/16648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
Gerrit-Change-Number: 16648
Gerrit-PatchSet: 15
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

2021-01-30 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16648 )

Change subject: KUDU-2612: restrict TxnStatusManager calls to be made by the 
leader only
..


Patch Set 15:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16648/14/src/kudu/integration-tests/txn_status_manager-itest.cc
File src/kudu/integration-tests/txn_status_manager-itest.cc:

http://gerrit.cloudera.org:8080/#/c/16648/14/src/kudu/integration-tests/txn_status_manager-itest.cc@447
PS14, Line 447:   ASSERT_EVENTUALLY([&]{
> nit: Why do we need to sleep for so long if we're already retrying?
Done


http://gerrit.cloudera.org:8080/#/c/16648/14/src/kudu/transactions/txn_status_manager.cc
File src/kudu/transactions/txn_status_manager.cc:

http://gerrit.cloudera.org:8080/#/c/16648/14/src/kudu/transactions/txn_status_manager.cc@443
PS14, Line 443:
  :   const int64_t term = consensus.CurrentTerm();
  :   // If the term has changed we assume the new leader is 
about to do the
  :   // necessary work in its
> Why might we get here? I thought we didn't want to crash at all? Can we get
Good point! It is overkilled to crash here if there is any failures preventing 
read from the transaction status tablet. Updated.



--
To view, visit http://gerrit.cloudera.org:8080/16648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
Gerrit-Change-Number: 16648
Gerrit-PatchSet: 15
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 30 Jan 2021 23:51:19 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

2021-01-30 Thread Hao Hao (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16648

to look at the new patch set (#15).

Change subject: KUDU-2612: restrict TxnStatusManager calls to be made by the 
leader only
..

KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

Currently, even though a non-leader TxnStatusManager will be unable to
write to the underlying table (in the Raft subsystem the write would be
aborted), we may want to restrict calls to be made by the leader
TxnStatusManagers only. The motivation is to provide a more robust system,
which avoids cases when the request was sent to a laggy follower, we may
end up failing the request with an error.

This patch introduces ScopedLeaderSharedLock (similar to the one in Catalog
Manager) to be used to ensure the requests were sent to leaders only and
to block all other operations while reloading the persistent transaction
status metadata upon leadership changes. Note that during failover the
leader replica will wait until in-flight ops in the previous consensus
term to be applied before reloading the metadata.

Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
---
M src/kudu/client/client-test.cc
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/integration-tests/txn_status_manager-itest.cc
M src/kudu/integration-tests/txn_status_table-itest.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/tablet_replica-test-base.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tablet/txn_coordinator.h
M src/kudu/transactions/txn_status_manager-test.cc
M src/kudu/transactions/txn_status_manager.cc
M src/kudu/transactions/txn_status_manager.h
M src/kudu/transactions/txn_status_tablet.h
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
17 files changed, 850 insertions(+), 140 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/48/16648/15
--
To view, visit http://gerrit.cloudera.org:8080/16648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
Gerrit-Change-Number: 16648
Gerrit-PatchSet: 15
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

2021-01-30 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16648 )

Change subject: KUDU-2612: restrict TxnStatusManager calls to be made by the 
leader only
..


Patch Set 14:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/16648/13/src/kudu/integration-tests/txn_status_table-itest.cc
File src/kudu/integration-tests/txn_status_table-itest.cc:

http://gerrit.cloudera.org:8080/#/c/16648/13/src/kudu/integration-tests/txn_status_table-itest.cc@818
PS13, Line 818: class TxnStatusTableElectionStormITest : public 
TxnStatusTableITest {
> Is this guaranteed to happen? If we need to change leadership, quiescing mi
Done


http://gerrit.cloudera.org:8080/#/c/16648/13/src/kudu/master/txn_manager-test.cc
File src/kudu/master/txn_manager-test.cc:

PS13:
> Why the changes in this file?
Came across this code and saw some nits can be improved (but I guess also 
introduced some redundant dependency). Will revert it back.


http://gerrit.cloudera.org:8080/#/c/16648/13/src/kudu/transactions/txn_status_manager.cc
File src/kudu/transactions/txn_status_manager.cc:

http://gerrit.cloudera.org:8080/#/c/16648/13/src/kudu/transactions/txn_status_manager.cc@214
PS13, Line 214: PREDICT_FALSE(!replica_status_.ok() ||
  :   
FLAGS_txn_status_tablet_inject_uninitialized_leader_status_error)
> nit: wrap in a single PREDICT_FALSE
Done


http://gerrit.cloudera.org:8080/#/c/16648/13/src/kudu/transactions/txn_status_manager.cc@400
PS13, Line 400: PREDICT_FALSE(!s.ok() ||
> nit: maybe wrap the entire thing in PREDICT_FALSE instead of just the flag
Done



--
To view, visit http://gerrit.cloudera.org:8080/16648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
Gerrit-Change-Number: 16648
Gerrit-PatchSet: 14
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 30 Jan 2021 08:21:54 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

2021-01-30 Thread Hao Hao (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16648

to look at the new patch set (#14).

Change subject: KUDU-2612: restrict TxnStatusManager calls to be made by the 
leader only
..

KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

Currently, even though a non-leader TxnStatusManager will be unable to
write to the underlying table (in the Raft subsystem the write would be
aborted), we may want to restrict calls to be made by the leader
TxnStatusManagers only. The motivation is to provide a more robust system,
which avoids cases when the request was sent to a laggy follower, we may
end up failing the request with an error.

This patch introduces ScopedLeaderSharedLock (similar to the one in Catalog
Manager) to be used to ensure the requests were sent to leaders only and
to block all other operations while reloading the persistent transaction
status metadata upon leadership changes. Note that during failover the
leader replica will wait until in-flight ops in the previous consensus
term to be applied before reloading the metadata.

Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
---
M src/kudu/client/client-test.cc
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/integration-tests/txn_status_manager-itest.cc
M src/kudu/integration-tests/txn_status_table-itest.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/tablet_replica-test-base.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tablet/txn_coordinator.h
M src/kudu/transactions/txn_status_manager-test.cc
M src/kudu/transactions/txn_status_manager.cc
M src/kudu/transactions/txn_status_manager.h
M src/kudu/transactions/txn_status_tablet.h
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
17 files changed, 833 insertions(+), 140 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/48/16648/14
--
To view, visit http://gerrit.cloudera.org:8080/16648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
Gerrit-Change-Number: 16648
Gerrit-PatchSet: 14
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

2021-01-28 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16648 )

Change subject: KUDU-2612: restrict TxnStatusManager calls to be made by the 
leader only
..


Patch Set 13: Verified+1

Unrelated flaky test ExternalMiniClusterTest.TestBasicOperation


--
To view, visit http://gerrit.cloudera.org:8080/16648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
Gerrit-Change-Number: 16648
Gerrit-PatchSet: 13
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 29 Jan 2021 06:52:16 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

2021-01-28 Thread Hao Hao (Code Review)
Hao Hao has removed a vote on this change.

Change subject: KUDU-2612: restrict TxnStatusManager calls to be made by the 
leader only
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/16648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
Gerrit-Change-Number: 16648
Gerrit-PatchSet: 13
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

2021-01-28 Thread Hao Hao (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16648

to look at the new patch set (#13).

Change subject: KUDU-2612: restrict TxnStatusManager calls to be made by the 
leader only
..

KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

Currently, even though a non-leader TxnStatusManager will be unable to
write to the underlying table (in the Raft subsystem the write would be
aborted), we may want to restrict calls to be made by the leader
TxnStatusManagers only. The motivation is to provide a more robust system,
which avoids cases when the request was sent to a laggy follower, we may
end up failing the request with an error.

This patch introduces ScopedLeaderSharedLock (similar to the one in Catalog
Manager) to be used to ensure the requests were sent to leaders only and
to block all other operations while reloading the persistent transaction
status metadata upon leadership changes. Note that during failover the
leader replica will wait until in-flight ops in the previous consensus
term to be applied before reloading the metadata.

Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
---
M src/kudu/client/client-test.cc
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/integration-tests/txn_status_manager-itest.cc
M src/kudu/integration-tests/txn_status_table-itest.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/master/txn_manager-test.cc
M src/kudu/tablet/tablet_replica-test-base.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tablet/txn_coordinator.h
M src/kudu/transactions/txn_status_manager-test.cc
M src/kudu/transactions/txn_status_manager.cc
M src/kudu/transactions/txn_status_manager.h
M src/kudu/transactions/txn_status_tablet.h
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
18 files changed, 825 insertions(+), 144 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/48/16648/13
--
To view, visit http://gerrit.cloudera.org:8080/16648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
Gerrit-Change-Number: 16648
Gerrit-PatchSet: 13
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

2021-01-28 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16648 )

Change subject: KUDU-2612: restrict TxnStatusManager calls to be made by the 
leader only
..


Patch Set 12:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/16648/11/src/kudu/tablet/tablet_replica.h
File src/kudu/tablet/tablet_replica.h:

http://gerrit.cloudera.org:8080/#/c/16648/11/src/kudu/tablet/tablet_replica.h@329
PS11, Line 329: transaction status table and is
> nit: just "transaction status table"
Done


http://gerrit.cloudera.org:8080/#/c/16648/11/src/kudu/tablet/tablet_replica.h@332
PS11, Line 332: More concretely, if this returns
  :   // 'true', callers must call 
DecreaseTxnCoordinatorTaskCounter().
  :   //
> nit: more concretely, if this returns 'true', callers must call DecreaseTxn
Done


http://gerrit.cloudera.org:8080/#/c/16648/11/src/kudu/tablet/tablet_replica.h@336
PS11, Line 336:
> nit: these methods don't run anything, so how about "if the caller should r
Done


http://gerrit.cloudera.org:8080/#/c/16648/11/src/kudu/tablet/tablet_replica.h@340
PS11, Line 340: transaction status table
  :   // and is
> nit: here too
Done


http://gerrit.cloudera.org:8080/#/c/16648/11/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

http://gerrit.cloudera.org:8080/#/c/16648/11/src/kudu/tablet/tablet_replica.cc@1010
PS11, Line 1010: std::l
> nit: could use DCHECK_GE
Done


http://gerrit.cloudera.org:8080/#/c/16648/11/src/kudu/transactions/txn_status_manager.h
File src/kudu/transactions/txn_status_manager.h:

http://gerrit.cloudera.org:8080/#/c/16648/11/src/kudu/transactions/txn_status_manager.h@100
PS11, Line 100: 'txn_coordinator' must be a 'TxnStatusManager*' because of the
  : // downcast in the initia
> nit: more specifically, 'txn_coordinator' must be a TxnStatusManager* becau
Done


http://gerrit.cloudera.org:8080/#/c/16648/6/src/kudu/transactions/txn_status_manager.cc
File src/kudu/transactions/txn_status_manager.cc:

http://gerrit.cloudera.org:8080/#/c/16648/6/src/kudu/transactions/txn_status_manager.cc@83
PS6, Line 83:  false,
: "Finalize commit
> Sounds good. Curious if we have test coverage for this.
Done


http://gerrit.cloudera.org:8080/#/c/16648/11/src/kudu/transactions/txn_status_manager.cc
File src/kudu/transactions/txn_status_manager.cc:

http://gerrit.cloudera.org:8080/#/c/16648/11/src/kudu/transactions/txn_status_manager.cc@239
PS11, Line 239:
> Do we have tests for this?
Done


http://gerrit.cloudera.org:8080/#/c/16648/11/src/kudu/transactions/txn_status_manager.cc@356
PS11, Line 356:
> nit: why not put this directly in the if() condition? Same with other insta
Done


http://gerrit.cloudera.org:8080/#/c/16648/11/src/kudu/transactions/txn_status_manager.cc@358
PS11, Line 358: t get an unexpected error response and bail instead
> nit: this might not be that useful. It'll be obvious that the tablet is shu
Done



--
To view, visit http://gerrit.cloudera.org:8080/16648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
Gerrit-Change-Number: 16648
Gerrit-PatchSet: 12
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 29 Jan 2021 02:32:45 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

2021-01-28 Thread Hao Hao (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16648

to look at the new patch set (#12).

Change subject: KUDU-2612: restrict TxnStatusManager calls to be made by the 
leader only
..

KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

Currently, even though a non-leader TxnStatusManager will be unable to
write to the underlying table (in the Raft subsystem the write would be
aborted), we may want to restrict calls to be made by the leader
TxnStatusManagers only. The motivation is to provide a more robust system,
which avoids cases when the request was sent to a laggy follower, we may
end up failing the request with an error.

This patch introduces ScopedLeaderSharedLock (similar to the one in Catalog
Manager) to be used to ensure the requests were sent to leaders only and
to block all other operations while reloading the persistent transaction
status metadata upon leadership changes. Note that during failover the
leader replica will wait until in-flight ops in the previous consensus
term to be applied before reloading the metadata.

Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
---
M src/kudu/client/client-test.cc
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/integration-tests/txn_status_manager-itest.cc
M src/kudu/integration-tests/txn_status_table-itest.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/master/txn_manager-test.cc
M src/kudu/tablet/tablet_replica-test-base.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tablet/txn_coordinator.h
M src/kudu/transactions/txn_status_manager-test.cc
M src/kudu/transactions/txn_status_manager.cc
M src/kudu/transactions/txn_status_manager.h
M src/kudu/transactions/txn_status_tablet.h
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
18 files changed, 825 insertions(+), 144 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/48/16648/12
--
To view, visit http://gerrit.cloudera.org:8080/16648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
Gerrit-Change-Number: 16648
Gerrit-PatchSet: 12
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR](gh-pages) Fix 1.14.0 blog post

2021-01-28 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17007 )

Change subject: Fix 1.14.0 blog post
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17007
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I4343f2cd4e9cc194f44f84d82255c5de95e777aa
Gerrit-Change-Number: 17007
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Comment-Date: Thu, 28 Jan 2021 22:55:40 +
Gerrit-HasComments: No


[kudu-CR](gh-pages) Update website for 1.14.0 release

2021-01-28 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16990 )

Change subject: Update website for 1.14.0 release
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16990
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia97cc8456a8ad0a81bb9732c7576641f41e165aa
Gerrit-Change-Number: 16990
Gerrit-PatchSet: 2
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Hao Hao 
Gerrit-Comment-Date: Thu, 28 Jan 2021 21:07:06 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2612: add a TxnSystemClient to the tservers

2021-01-28 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16974 )

Change subject: KUDU-2612: add a TxnSystemClient to the tservers
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I33b5a2bb5c56ae4bb4b42069af5813e2780fc4bc
Gerrit-Change-Number: 16974
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Thu, 28 Jan 2021 19:38:19 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2612: add a TxnSystemClient to the tservers

2021-01-28 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16974 )

Change subject: KUDU-2612: add a TxnSystemClient to the tservers
..


Patch Set 6:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/16974/4/src/kudu/rpc/sasl_common.cc
File src/kudu/rpc/sasl_common.cc:

http://gerrit.cloudera.org:8080/#/c/16974/4/src/kudu/rpc/sasl_common.cc@a280
PS4, Line 280:
> Servers initialize SASL in ServerBase::Init(), which calls the Once functio
Ack


http://gerrit.cloudera.org:8080/#/c/16974/4/src/kudu/transactions/txn_system_client.cc
File src/kudu/transactions/txn_system_client.cc:

http://gerrit.cloudera.org:8080/#/c/16974/4/src/kudu/transactions/txn_system_client.cc@62
PS4, Line 62: disable_txn_system_client_init
> Right, 'unsafe' flags are automatically 'hidden'.
Ack


http://gerrit.cloudera.org:8080/#/c/16974/4/src/kudu/transactions/txn_system_client.cc@383
PS4, Line 383: ile (!shutting_down_) {
 : static const MonoDelta kRetryInterval = 
MonoDelta::FromSeconds(1);
 : if (PREDICT_FALSE(FLAGS_disable_txn_system_client_init)) 
{
 :   KLOG_EVERY_N_SECS(WARNING, 60) <<
 :   Substitute("initialization of TxnSystemClient 
disabled, will retry in $0",
 :  kRetryInterval.ToString());
 :
> I don't, but it can.
Ack


http://gerrit.cloudera.org:8080/#/c/16974/4/src/kudu/transactions/txn_system_client.cc@390
PS4, Line 390:   continue;
 : }
 : // HACK: if the master addresses are all totally 
unreachable,
 : // KuduClientBuilder::Build() will hang, attempting 
fruitlessly to
 : // retry, in the below call to Create(). So first, make 
sure we can at
 : // least reach the masters; if not, try again.
 : // TODO(awong): there's still a small window between 
these pings and
 : // client creation. If this ends up being a problem, we 
may need to
 : // come to a more robust solution, e.g. adding a timeout 
to Create().
 : DnsResolver dns_resolver;
 : Status s;
 : for (const auto& hp : master_addrs) {
 :   vector addrs;
 :   s = dns_resolver.ResolveAddresses(hp, 
).AndThen([&] {
 : unique_ptr proxy(
 : new MasterServiceProxy(messenger, addrs[0], 
hp.host()));
 : PingRequestPB req;
 : PingResponsePB resp;
 : RpcController rpc;
 : 
rpc.set_timeout(MonoDelta::FromMilliseconds(FLAGS_rpc_negotiation_timeout_ms));
 :
> I'm not sure I understand the question. Are you asking what happens if, bet
Ack


http://gerrit.cloudera.org:8080/#/c/16974/4/src/kudu/transactions/txn_system_client.cc@443
PS4, Line 443: nava
> It happens in the threadpool's destructor anyway, but I can add it here too
Ack



--
To view, visit http://gerrit.cloudera.org:8080/16974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I33b5a2bb5c56ae4bb4b42069af5813e2780fc4bc
Gerrit-Change-Number: 16974
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Thu, 28 Jan 2021 19:38:14 +
Gerrit-HasComments: Yes


[kudu-CR](gh-pages) Update website for 1.14.0 release

2021-01-28 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16990 )

Change subject: Update website for 1.14.0 release
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16990
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia97cc8456a8ad0a81bb9732c7576641f41e165aa
Gerrit-Change-Number: 16990
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Hao Hao 
Gerrit-Comment-Date: Thu, 28 Jan 2021 19:30:10 +
Gerrit-HasComments: No


[kudu-CR] KUDU-3237 fix MaintenanceManagerTest.TestCompletedOpsHistory

2021-01-27 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16991 )

Change subject: KUDU-3237 fix MaintenanceManagerTest.TestCompletedOpsHistory
..


Patch Set 1: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16991/1/src/kudu/util/maintenance_manager-test.cc
File src/kudu/util/maintenance_manager-test.cc:

http://gerrit.cloudera.org:8080/#/c/16991/1/src/kudu/util/maintenance_manager-test.cc@587
PS1, Line 587: information
nit: information.



--
To view, visit http://gerrit.cloudera.org:8080/16991
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I760287b3ed4d50e32d2f9257e5390fdf8fa8f288
Gerrit-Change-Number: 16991
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Thu, 28 Jan 2021 07:18:19 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612 Java client transaction implementation

2021-01-26 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16929 )

Change subject: KUDU-2612 Java client transaction implementation
..


Patch Set 9: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16929/5/java/kudu-client/src/main/java/org/apache/kudu/client/RpcProxy.java
File java/kudu-client/src/main/java/org/apache/kudu/client/RpcProxy.java:

http://gerrit.cloudera.org:8080/#/c/16929/5/java/kudu-client/src/main/java/org/apache/kudu/client/RpcProxy.java@439
PS5, Line 439:
> I think yes -- the idea is that Kudu clusters are expected to be deployed w
Ack



--
To view, visit http://gerrit.cloudera.org:8080/16929
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie0236875e7264877c3f7a13da4a5a3da6423786b
Gerrit-Change-Number: 16929
Gerrit-PatchSet: 9
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 27 Jan 2021 05:25:07 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612 keep-alive txn heartbeating for Java client

2021-01-26 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16967 )

Change subject: KUDU-2612 keep-alive txn heartbeating for Java client
..


Patch Set 4: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16967/3/java/kudu-client/src/main/java/org/apache/kudu/client/KuduTransaction.java
File java/kudu-client/src/main/java/org/apache/kudu/client/KuduTransaction.java:

http://gerrit.cloudera.org:8080/#/c/16967/3/java/kudu-client/src/main/java/org/apache/kudu/client/KuduTransaction.java@628
PS3, Line 628:failure like this. The idea is to avoid missing
 : //heartbeats in situations where the 
second attempt
 : //after keepaliveMillis/2 would as well 
due to a network
 : //issue, but immediate retry could 
succeed.
 : LOG.debug("continuing keepalive heartbeating (txn ID 
{}): {}",
 : txnId, e.toString());
> Nope, no test case for this one yet.  I guess that's conditional upon imple
Ack



--
To view, visit http://gerrit.cloudera.org:8080/16967
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6326a5452223d8173b2da004bb5f3ab0c1e6ae4e
Gerrit-Change-Number: 16967
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 27 Jan 2021 05:24:09 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: add a TxnSystemClient to the tservers

2021-01-26 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16974 )

Change subject: KUDU-2612: add a TxnSystemClient to the tservers
..


Patch Set 4:

(7 comments)

Overall looks good to me, just some nits. Thanks a lot Andrew!

http://gerrit.cloudera.org:8080/#/c/16974/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16974/4//COMMIT_MSG@23
PS4, Line 23: connecting to
nit: connecting to master?


http://gerrit.cloudera.org:8080/#/c/16974/4/src/kudu/rpc/sasl_common.cc
File src/kudu/rpc/sasl_common.cc:

http://gerrit.cloudera.org:8080/#/c/16974/4/src/kudu/rpc/sasl_common.cc@a280
PS4, Line 280:
Why this is disabled? Sorry,I think you briefly mentioned offline, but I didn't 
quite get it.


http://gerrit.cloudera.org:8080/#/c/16974/4/src/kudu/transactions/txn_system_client.cc
File src/kudu/transactions/txn_system_client.cc:

http://gerrit.cloudera.org:8080/#/c/16974/4/src/kudu/transactions/txn_system_client.cc@62
PS4, Line 62: disable_txn_system_client_init
Should it be tagged as hidden as well? Do you expect this to be used outside 
test?


http://gerrit.cloudera.org:8080/#/c/16974/4/src/kudu/transactions/txn_system_client.cc@383
PS4, Line 383: if (PREDICT_FALSE(FLAGS_disable_txn_system_client_init)) {
 :   KLOG_EVERY_N_SECS(WARNING, 60) <<
 :   Substitute("initialization of TxnSystemClient 
disabled, will retry in $0",
 :  kRetryInterval.ToString());
 :   SleepFor(kRetryInterval);
 :   continue;
 : }
As we are retrying, do you expect FLAGS_disable_txn_system_client_init to 
change runtime?


http://gerrit.cloudera.org:8080/#/c/16974/4/src/kudu/transactions/txn_system_client.cc@390
PS4, Line 390: // HACK: if the master addresses are all totally unreachable,
 : // KuduClientBuilder::Build() will hang, attempting 
fruitlessly to
 : // retry, in the below call to Create(). So first, make 
sure we can at
 : // least reach the masters; if not, try again.
 : DnsResolver dns_resolver;
 : Status s;
 : for (const auto& hp : master_addrs) {
 :   vector addrs;
 :   s = dns_resolver.ResolveAddresses(hp, 
).AndThen([&] {
 : unique_ptr proxy(
 : new MasterServiceProxy(messenger, addrs[0], 
hp.host()));
 : PingRequestPB req;
 : PingResponsePB resp;
 : RpcController rpc;
 : rpc.set_timeout(MonoDelta::FromSeconds(1));
 : return proxy->Ping(req, , );
 :   });
 :   if (s.ok()) {
 : break;
 :   }
 : }
I guess the sanitize check is sufficient enough, even later when we actually 
call Create() all masters become unreachable?


http://gerrit.cloudera.org:8080/#/c/16974/4/src/kudu/transactions/txn_system_client.cc@435
PS4, Line 435: txn_client_
Do we want to check(dcheck) txn_client_ is not null here?


http://gerrit.cloudera.org:8080/#/c/16974/4/src/kudu/transactions/txn_system_client.cc@443
PS4, Line 443: Wait
nit: why not Shutdown() after Wait()?



--
To view, visit http://gerrit.cloudera.org:8080/16974
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I33b5a2bb5c56ae4bb4b42069af5813e2780fc4bc
Gerrit-Change-Number: 16974
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 27 Jan 2021 00:10:01 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612 keep-alive txn heartbeating for Java client

2021-01-26 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16967 )

Change subject: KUDU-2612 keep-alive txn heartbeating for Java client
..


Patch Set 3: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16967/3/java/kudu-client/src/main/java/org/apache/kudu/client/KeepTransactionAliveRequest.java
File 
java/kudu-client/src/main/java/org/apache/kudu/client/KeepTransactionAliveRequest.java:

http://gerrit.cloudera.org:8080/#/c/16967/3/java/kudu-client/src/main/java/org/apache/kudu/client/KeepTransactionAliveRequest.java@46
PS3, Line 46: txnId
nit: check txnId > INVALID_TXN_ID?


http://gerrit.cloudera.org:8080/#/c/16967/3/java/kudu-client/src/main/java/org/apache/kudu/client/KuduTransaction.java
File java/kudu-client/src/main/java/org/apache/kudu/client/KuduTransaction.java:

http://gerrit.cloudera.org:8080/#/c/16967/3/java/kudu-client/src/main/java/org/apache/kudu/client/KuduTransaction.java@628
PS3, Line 628: TODO(aserbin): should we send next heartbeat sooner? E.g., retry
 : //immediately, and do such retry only 
once after a
 : //failure like this. The idea is to 
avoid missing
 : //heartbeats in situations where the 
second attempt
 : //after keepaliveMillis/2 would as well 
due to a network
 : //issue, but immediate retry could 
succeed.
Do we have a test case for this?



--
To view, visit http://gerrit.cloudera.org:8080/16967
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6326a5452223d8173b2da004bb5f3ab0c1e6ae4e
Gerrit-Change-Number: 16967
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 26 Jan 2021 21:22:12 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-3239: [build] Disable errorprone on Java 11

2021-01-26 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16983 )

Change subject: KUDU-3239: [build] Disable errorprone on Java 11
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16983
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic13f771222e67ac754f464f8015caf0a4901831d
Gerrit-Change-Number: 16983
Gerrit-PatchSet: 2
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 26 Jan 2021 21:10:44 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2612 Java client transaction implementation

2021-01-26 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16929 )

Change subject: KUDU-2612 Java client transaction implementation
..


Patch Set 8: Code-Review+1

(2 comments)

Overall looks good to me, thanks a lot Alexey!

http://gerrit.cloudera.org:8080/#/c/16929/5/java/kudu-client/src/main/java/org/apache/kudu/client/RpcProxy.java
File java/kudu-client/src/main/java/org/apache/kudu/client/RpcProxy.java:

http://gerrit.cloudera.org:8080/#/c/16929/5/java/kudu-client/src/main/java/org/apache/kudu/client/RpcProxy.java@418
PS5, Line 418: Handle for various kinds of TxnManager errors.
nit: mind to extend it a bit about how errors are handled? (e.g. recoverable vs 
nonrecoverable)


http://gerrit.cloudera.org:8080/#/c/16929/5/java/kudu-client/src/main/java/org/apache/kudu/client/RpcProxy.java@439
PS5, Line 439: try sending request to other TxnManager instance, if possible
Is this necessarily to be better?



--
To view, visit http://gerrit.cloudera.org:8080/16929
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie0236875e7264877c3f7a13da4a5a3da6423786b
Gerrit-Change-Number: 16929
Gerrit-PatchSet: 8
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 26 Jan 2021 20:31:19 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

2021-01-19 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16648 )

Change subject: KUDU-2612: restrict TxnStatusManager calls to be made by the 
leader only
..


Patch Set 11: Verified+1

TSAN build failure doesn't seem to relate to the patch.


--
To view, visit http://gerrit.cloudera.org:8080/16648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
Gerrit-Change-Number: 16648
Gerrit-PatchSet: 11
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 19 Jan 2021 18:06:06 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

2021-01-19 Thread Hao Hao (Code Review)
Hao Hao has removed a vote on this change.

Change subject: KUDU-2612: restrict TxnStatusManager calls to be made by the 
leader only
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/16648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
Gerrit-Change-Number: 16648
Gerrit-PatchSet: 11
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

2021-01-18 Thread Hao Hao (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16648

to look at the new patch set (#11).

Change subject: KUDU-2612: restrict TxnStatusManager calls to be made by the 
leader only
..

KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

Currently, even though a non-leader TxnStatusManager will be unable to
write to the underlying table (in the Raft subsystem the write would be
aborted), we may want to restrict calls to be made by the leader
TxnStatusManagers only. The motivation is to provide a more robust system,
which avoids cases when the request was sent to a laggy follower, we may
end up failing the request with an error.

This patch introduces ScopedLeaderSharedLock (similar to the one in Catalog
Manager) to be used to ensure the requests were sent to leaders only and
to block all other operations while reloading the persistent transaction
status metadata upon leadership changes. Note that during failover the
leader replica will wait until in-flight ops in the previous consensus
term to be applied before reloading the metadata.

Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
---
M src/kudu/client/client-test.cc
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/integration-tests/txn_status_manager-itest.cc
M src/kudu/integration-tests/txn_status_table-itest.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/master/txn_manager-test.cc
M src/kudu/tablet/tablet_replica-test-base.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tablet/txn_coordinator.h
M src/kudu/transactions/txn_status_manager-test.cc
M src/kudu/transactions/txn_status_manager.cc
M src/kudu/transactions/txn_status_manager.h
M src/kudu/transactions/txn_status_tablet.h
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
18 files changed, 766 insertions(+), 144 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/48/16648/11
--
To view, visit http://gerrit.cloudera.org:8080/16648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
Gerrit-Change-Number: 16648
Gerrit-PatchSet: 11
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] wip KUDU-2612: background task to commit transaction

2021-01-18 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16952 )

Change subject: wip KUDU-2612: background task to commit transaction
..


Patch Set 1:

(3 comments)

Just a quick walk through, haven't looked into the detail yet.

http://gerrit.cloudera.org:8080/#/c/16952/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16952/1//COMMIT_MSG@24
PS1, Line 24: There are some nuances here
> I haven't looked deep into the code yet, just a few top-level questions at
Another error handling question is maybe a little bit early to think about 
since we only want to support REPEATED READ isolation level first: what happens 
if one of the two (or more) transactions have concurrent write to the same 
range of data based on the read which can cause write skew. We may want to 
abort one of the affected transaction and roll back? Or should we just send 
error back to client indicate something going wrong with the transaction and to 
retry.


http://gerrit.cloudera.org:8080/#/c/16952/1/src/kudu/transactions/txn_status_entry.cc
File src/kudu/transactions/txn_status_entry.cc:

http://gerrit.cloudera.org:8080/#/c/16952/1/src/kudu/transactions/txn_status_entry.cc@a45
PS1, Line 45:
This condition no longer hold?


http://gerrit.cloudera.org:8080/#/c/16952/1/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

http://gerrit.cloudera.org:8080/#/c/16952/1/src/kudu/tserver/ts_tablet_manager.cc@1258
PS1, Line 1258:   txn_status_manager_pool_->Shutdown();
shut down the txn_commit_pool_ when tserver is shutting down?



--
To view, visit http://gerrit.cloudera.org:8080/16952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie2258dded3ab3d527cb5d0abdc7d5e7deb4da15e
Gerrit-Change-Number: 16952
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 19 Jan 2021 01:59:07 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

2021-01-18 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16648 )

Change subject: KUDU-2612: restrict TxnStatusManager calls to be made by the 
leader only
..


Patch Set 9:

(7 comments)

> Patch Set 9:
>
> (7 comments)
>
> IWYU isn't happy yet:
>
> >>> Fixing #includes in 
> >>> '/home/jenkins-slave/workspace/kudu-master/2/src/kudu/transactions/txn_status_manager.cc'
> @@ -32,6 +32,7 @@
>  #include "kudu/common/wire_protocol.h"
>  #include "kudu/consensus/metadata.pb.h"
>  #include "kudu/consensus/raft_consensus.h"
> +#include "kudu/gutil/casts.h"
>  #include "kudu/gutil/macros.h"
>  #include "kudu/gutil/map-util.h"
>  #include "kudu/gutil/port.h"
> IWYU would have edited 1 files on your behalf.

Thanks a lot for looking into the failure!

http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/tablet/tablet_replica.cc@304
PS9, Line 304: WaitUntilTxnCoordinatorTasksFinished();
> How do we protect against the race when right after this 'txn_coordinator_t
Are you asking after L304 which we waited 'txn_coordinator_task_counter_' to 
become zero. and then it again becomes non-zero and tablet is shutdown state. 
what happens if TxnStatusManager tries to access the tablet? In that case,  the 
task will not be enabled/ran when the tablet is shutting down. So that 
ShouldRunTxnCoordinatorStalenessTask() will return false and no new tasks will 
run.


http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/tablet/tablet_replica.cc@471
PS9, Line 471:   if (state_ == STOPPING || state_ == STOPPED) {
 : return false;
 :   }
 :   return true;
> This looks a bit strange: it returns 'true' even if 'state_' is RUNNING?
Ah, good catch. Sorry it is the opposite meaning as what the method name 
implies here. Adjust the caller as well.


http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/tablet/tablet_replica.cc@997
PS9, Line 997: state_
> What about SHUTDOWN state?
SHUTDOWN is only set after the tablet is fully stopped. Which the raft 
consensus has been stopped and the call back should no longer be triggered.


http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/tablet/tablet_replica.cc@1009
PS9, Line 1009: txn_coordinator_task_counter_
> nit: is it expected that this counter goes negative?  If not, maybe add CHE
Done


http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/transactions/txn_status_manager.h
File src/kudu/transactions/txn_status_manager.h:

http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/transactions/txn_status_manager.h@101
PS9, Line 101: txn_coordinator
> nit: it would be great to add a comment about the expectations about the ty
Done


http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/tserver/ts_tablet_manager.cc@1439
PS9, Line 1439: !r->CheckRunning().ok()
> Does it mean that the TxnStalenessTrackerTask() will cease to run at this t
Hmm, yeah, I guess we wouldn't want to stop the TxnStalenessTrackerTask() if 
some txn status tablets is shutdown. We may want to resume on checking the 
other tablets.


http://gerrit.cloudera.org:8080/#/c/16648/9/src/kudu/tserver/ts_tablet_manager.cc@1457
PS9, Line 1457: continue
> Maybe, we should jump not just to the next element 'replicas' container, bu
I am not sure why we want to do that? Why don't we check for the rest tablets?



--
To view, visit http://gerrit.cloudera.org:8080/16648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
Gerrit-Change-Number: 16648
Gerrit-PatchSet: 9
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 19 Jan 2021 01:15:04 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

2021-01-18 Thread Hao Hao (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16648

to look at the new patch set (#10).

Change subject: KUDU-2612: restrict TxnStatusManager calls to be made by the 
leader only
..

KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

Currently, even though a non-leader TxnStatusManager will be unable to
write to the underlying table (in the Raft subsystem the write would be
aborted), we may want to restrict calls to be made by the leader
TxnStatusManagers only. The motivation is to provide a more robust system,
which avoids cases when the request was sent to a laggy follower, we may
end up failing the request with an error.

This patch introduces ScopedLeaderSharedLock (similar to the one in Catalog
Manager) to be used to ensure the requests were sent to leaders only and
to block all other operations while reloading the persistent transaction
status metadata upon leadership changes. Note that during failover the
leader replica will wait until in-flight ops in the previous consensus
term to be applied before reloading the metadata.

Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
---
M src/kudu/client/client-test.cc
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/integration-tests/txn_status_manager-itest.cc
M src/kudu/integration-tests/txn_status_table-itest.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/master/txn_manager-test.cc
M src/kudu/tablet/tablet_replica-test-base.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tablet/txn_coordinator.h
M src/kudu/transactions/txn_status_manager-test.cc
M src/kudu/transactions/txn_status_manager.cc
M src/kudu/transactions/txn_status_manager.h
M src/kudu/transactions/txn_status_tablet.h
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
18 files changed, 766 insertions(+), 144 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/48/16648/10
--
To view, visit http://gerrit.cloudera.org:8080/16648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
Gerrit-Change-Number: 16648
Gerrit-PatchSet: 10
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

2021-01-14 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16648 )

Change subject: KUDU-2612: restrict TxnStatusManager calls to be made by the 
leader only
..


Patch Set 8:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/16648/6/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/16648/6/src/kudu/client/client-test.cc@426
PS6, Line 426: dynamic_cast
> nit: I guess we prefer to use down_cast<> for this type of casts in the cod
Done


http://gerrit.cloudera.org:8080/#/c/16648/7/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/16648/7/src/kudu/tserver/tablet_service.cc@1262
PS7, Line 1262: dynamic_cast
> nit: use down_cast<> here as well?
Done


http://gerrit.cloudera.org:8080/#/c/16648/7/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

http://gerrit.cloudera.org:8080/#/c/16648/7/src/kudu/tserver/ts_tablet_manager.cc@1453
PS7, Line 1453:   TxnStatusManager* txn_status_manager =
  :   dynamic_cast(coordinator);
  :   TxnStatusManager::ScopedLeaderSharedLock 
l(txn_status_manager);
> Sounds good to me, will update it.
Done



--
To view, visit http://gerrit.cloudera.org:8080/16648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
Gerrit-Change-Number: 16648
Gerrit-PatchSet: 8
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 15 Jan 2021 05:19:06 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

2021-01-14 Thread Hao Hao (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16648

to look at the new patch set (#9).

Change subject: KUDU-2612: restrict TxnStatusManager calls to be made by the 
leader only
..

KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

Currently, even though a non-leader TxnStatusManager will be unable to
write to the underlying table (in the Raft subsystem the write would be
aborted), we may want to restrict calls to be made by the leader
TxnStatusManagers only. The motivation is to provide a more robust system,
which avoids cases when the request was sent to a laggy follower, we may
end up failing the request with an error.

This patch introduces ScopedLeaderSharedLock (similar to the one in Catalog
Manager) to be used to ensure the requests were sent to leaders only and
to block all other operations while reloading the persistent transaction
status metadata upon leadership changes. Note that during failover the
leader replica will wait until in-flight ops in the previous consensus
term to be applied before reloading the metadata.

Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
---
M src/kudu/client/client-test.cc
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/integration-tests/txn_status_manager-itest.cc
M src/kudu/integration-tests/txn_status_table-itest.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/master/txn_manager-test.cc
M src/kudu/tablet/tablet_replica-test-base.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tablet/txn_coordinator.h
M src/kudu/transactions/txn_status_manager-test.cc
M src/kudu/transactions/txn_status_manager.cc
M src/kudu/transactions/txn_status_manager.h
M src/kudu/transactions/txn_status_tablet.h
M src/kudu/tserver/tablet_copy_source_session-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
18 files changed, 760 insertions(+), 145 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/48/16648/9
--
To view, visit http://gerrit.cloudera.org:8080/16648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
Gerrit-Change-Number: 16648
Gerrit-PatchSet: 9
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-2612: restrict TxnStatusManager calls to be made by the leader only

2021-01-14 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16648 )

Change subject: KUDU-2612: restrict TxnStatusManager calls to be made by the 
leader only
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16648/7/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

http://gerrit.cloudera.org:8080/#/c/16648/7/src/kudu/tserver/ts_tablet_manager.cc@1453
PS7, Line 1453:   TxnStatusManager* txn_status_manager =
  :   dynamic_cast(coordinator);
  :   TxnStatusManager::ScopedLeaderSharedLock 
l(txn_status_manager);
> The rationale is to prevent code duplication: as I could see, all call site
Sounds good to me, will update it.



--
To view, visit http://gerrit.cloudera.org:8080/16648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I42c1ad095dcb4bdffcbe0ecf9631a60bac208c2a
Gerrit-Change-Number: 16648
Gerrit-PatchSet: 8
Gerrit-Owner: Hao Hao 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Thu, 14 Jan 2021 21:02:44 +
Gerrit-HasComments: Yes


[kudu-CR] [client] updated multi-row transaction API

2021-01-14 Thread Hao Hao (Code Review)
Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16948 )

Change subject: [client] updated multi-row transaction API
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16948
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id646c41d60f385f079ebfaec2daf7c2a108306b2
Gerrit-Change-Number: 16948
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Thu, 14 Jan 2021 21:01:43 +
Gerrit-HasComments: No


  1   2   3   4   5   6   7   8   9   10   >