[kudu-CR] KUDU-2612 p11: persist txn metadata in superblock

2020-09-29 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16492 )

Change subject: KUDU-2612 p11: persist txn metadata in superblock
..


Patch Set 7:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/16492/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16492/3//COMMIT_MSG@17
PS3, Line 17: this can be used to store the owner of
:   the transaction
> We currently use the owner on the TxnStatusManager to ensure that only the
Thank you for the clarification!


http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/integration-tests/txn_participant-itest.cc
File src/kudu/integration-tests/txn_participant-itest.cc:

http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/integration-tests/txn_participant-itest.cc@231
PS7, Line 231: FLAGS_flush_threshold_mb = 1;
 : FLAGS_flush_threshold_secs = 1;
 : FLAGS_log_segment_size_mb = 1;
nit: it would be nice to document the reasoning behind the customization of 
these flags


http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/integration-tests/txn_participant-itest.cc@234
PS7, Line 234: NO_FATALS(TxnParticipantITest::SetUp());
nit: should we also set FLAGS_maintenance_manager_polling_interval_ms to some 
lower value (default is 250) ?


http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/ops/participant_op.cc
File src/kudu/tablet/ops/participant_op.cc:

http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/ops/participant_op.cc@181
PS7, Line 181: op.finalized_commit_timestamp();
nit: does it make sense to add

  DCHECK(op.has_finalized_commit_timestamp())

?


http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/tablet.h
File src/kudu/tablet/tablet.h:

http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/tablet.h@122
PS7, Line 122: std::unordered_set
nit: could we get away with just

  const std::unordered_set& in_flight_txn_ids = {}

?


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet.h
File src/kudu/tablet/tablet.h:

http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet.h@181
PS3, Line 181: Begins
> I don't think it's worth updating all such instances in such a large file,
Yep, my point was to keep the tense as with the majority of comments in the 
file.  OK, I guess it's not a big deal anyways :)


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet_metadata.h
File src/kudu/tablet/tablet_metadata.h:

http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet_metadata.h@68
PS3, Line 68: ent state
> See https://kudu.apache.org/docs/contributing.html#_pointers
SGTM: it seems memory usage benefits (i.e. using 8 instead of 16 bytes) is the 
crux here.


http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/txn_participant-test.cc
File src/kudu/tablet/txn_participant-test.cc:

http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/txn_participant-test.cc@448
PS7, Line 448:   ASSERT_OK(tablet_replica_->GetGCableDataSize(_size));
nit: do you mind adding an assert just before performing any txn-related 
participant ops to make sure GC-able size was 0 at that point?


http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/txn_participant-test.cc@462
PS7, Line 462: ASSERT_EQ(0, gcable_size);
Not sure whether whether I'm missing it, but I cannot see the call to 
tablet_replica_->GetGCableDataSize() before this assertion to update 
'gcable_size'.

Also, if performing such a verification, should we set 
--log_min_segments_to_retain to unnatural low value of 0 to make sure we don't 
hit the case of non-finished segment in 
Log::AllocateSegmentAndRollOverForTests()?


http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/txn_participant-test.cc@468
PS7, Line 468: ASSERT_EQ(0, gcable_size);
Not sure I see how 'gcable_size' could be updated before this assertion.  Am I 
missing something?


http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/txn_participant-test.cc@506
PS7, Line 506: Test that rebuilding transaction state from the WALs and metadata
An incomplete sentence?



--
To view, visit http://gerrit.cloudera.org:8080/16492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
Gerrit-Change-Number: 16492
Gerrit-PatchSet: 7
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 30 Sep 2020 04:35:43 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2612 p11: persist txn metadata in superblock

2020-09-29 Thread Andrew Wong (Code Review)
Hello Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16492

to look at the new patch set (#7).

Change subject: KUDU-2612 p11: persist txn metadata in superblock
..

KUDU-2612 p11: persist txn metadata in superblock

Currently, transaction metadata is entirely stored in the WALs; once a
transaction is begun, it anchors WALs indefinitely. This is untenable,
as any tablet that participates in a transaction would never be able to
GC its WALs.

This patch addresses this by storing transaction metadata in the tablet
superblock for the following participant ops:
- BEGIN_TXN: an empty record is added to the superblock for the given
  transaction ID. In the future, this can be used to store the owner of
  the transaction.
- FINALIZE_COMMIT: the commit timestamp is added to the superblock for
  the given transaction ID. The commit timestamp of a transaction will
  be used when scanning over mutations applied to a tablet instead of
  the apply timestamp.
- ABORT_TXN: an abort bit is added to the superblock for the given
  transaction ID.

Upon applying each of these ops, the op's OpId is anchored in the WALs
until the tablet metadata is successfully flushed, ensuring that if we
don't flush the tablet metadata before restarting, we can rebuild any
ops that whose mutations had not been persisted by replaying the WALs.

What about BEGIN_COMMIT ops? While these ops will still need to be
anchored, BEGIN_COMMIT ops don't need to persist any long-term
information. As such, their anchoring is slightly different -- rather
than mutating the tablet metadata and unanchoring on a metadata flush,
BEGIN_COMMIT ops will update the in-memory state for the in-flight Txn,
and unanchor once the corresponding FINALIZE_COMMIT or ABORT_TXN begins
applying.

On bootstrap, the following rules are applied:
- If we've persisted terminal information (i.e. abort or commit) about a
  transaction ID, we can skip replaying any participant op for that
  transaction.
- If we otherwise have a persisted record for the transaction ID, we
  should start an open transaction and replay all participant ops for
  that transaction.
- If we have no record of a transaction persisted, we must replay all
  participant ops for the given transaction.

While storing things in the superblock is not ideal, there are further
improvements that can be made to reduce its size, e.g. after finalizing
a transaction, the transactions mutations may be merged in with the rest
of the tablet; once all mutations of a given transaction have been
merged, the transaction metadata may be removed from the metadata.

Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
---
M src/kudu/integration-tests/txn_participant-itest.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/ops/participant_op.cc
M src/kudu/tablet/ops/participant_op.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/txn_participant-test.cc
M src/kudu/tablet/txn_participant.cc
M src/kudu/tablet/txn_participant.h
13 files changed, 680 insertions(+), 121 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/92/16492/7
--
To view, visit http://gerrit.cloudera.org:8080/16492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
Gerrit-Change-Number: 16492
Gerrit-PatchSet: 7
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] KUDU-3192: [hms] Leverage the cluster ID in HMS sync

2020-09-29 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16494 )

Change subject: KUDU-3192: [hms] Leverage the cluster ID in HMS sync
..


Patch Set 1:

(8 comments)

I am working on adding more tests, but pushing my rebased updates for now.

http://gerrit.cloudera.org:8080/#/c/16494/1/src/kudu/hms/hms_catalog.h
File src/kudu/hms/hms_catalog.h:

http://gerrit.cloudera.org:8080/#/c/16494/1/src/kudu/hms/hms_catalog.h@66
PS1, Line 66:  // Creates a new table entry in the HMS.
:   //
:   // If 'owner' is omitted the table will be created without an 
owner. This is
:   // useful in circumstances where the owner is not known, for 
example when
:   // creating an HMS table entry for an existing Kudu table.
:   //
:   // Fails the HMS is unreachable, or a table with the same name 
is already present.
> nit: update the comment with the cluster_id? The same for other similar met
This is internal API and doesn't mention all the other fields.
I imagine the docs would be fairly self evident.


http://gerrit.cloudera.org:8080/#/c/16494/1/src/kudu/hms/hms_client-test.cc
File src/kudu/hms/hms_client-test.cc:

http://gerrit.cloudera.org:8080/#/c/16494/1/src/kudu/hms/hms_client-test.cc@60
PS1, Line 60:   Status CreateTable(HmsClient* client,
> warning: method 'CreateTable' can be made static [readability-convert-membe
Done


http://gerrit.cloudera.org:8080/#/c/16494/1/src/kudu/hms/hms_client-test.cc@169
PS1, Line 169:   ASSERT_TRUE(CreateTable(, database_name, table_name,
 :   table_id, cluster_id).IsAlreadyPresent());
> What if supplying the same table_id, but different cluster_id?  What should
CreateTable in the HMS client is HMS specific. Only the table_name matters for 
an already present table. This is the same behavior as all other fields and 
defined by the HMS itself.


http://gerrit.cloudera.org:8080/#/c/16494/1/src/kudu/hms/hms_client-test.cc@179
PS1, Line 179:   EXPECT_EQ(HmsClient::kManagedTable, my_table.tableType);
> I guess GetTable() should return HmsClient::kKuduClusterIdKey property alon
Done


http://gerrit.cloudera.org:8080/#/c/16494/1/src/kudu/master/hms_notification_log_listener.cc
File src/kudu/master/hms_notification_log_listener.cc:

http://gerrit.cloudera.org:8080/#/c/16494/1/src/kudu/master/hms_notification_log_listener.cc@356
PS1, Line 356: This is safe because we still validate the table ID
 :   // which is universally unique
> Does it mean the table ID will never be the same even in across different c
Right, the ID is a UUID.


http://gerrit.cloudera.org:8080/#/c/16494/1/src/kudu/master/hms_notification_log_listener.cc@363
PS1, Line 363: before_table.tableName, cluster_id);
> Should dereference cluster_id
Done


http://gerrit.cloudera.org:8080/#/c/16494/1/src/kudu/tools/tool_action_hms.cc
File src/kudu/tools/tool_action_hms.cc:

http://gerrit.cloudera.org:8080/#/c/16494/1/src/kudu/tools/tool_action_hms.cc@245
PS1, Line 245: row.resize
> nit: samer here.
Done


http://gerrit.cloudera.org:8080/#/c/16494/1/src/kudu/tools/tool_action_hms.cc@256
PS1, Line 256: adjust row.resize below
> nit: adjust 'row.resize' below.
Done



--
To view, visit http://gerrit.cloudera.org:8080/16494
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I865b418a3cc4e11c889cc4757cd940831c43af17
Gerrit-Change-Number: 16494
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 29 Sep 2020 21:54:05 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-3192: [hms] Leverage the cluster ID in HMS sync

2020-09-29 Thread Grant Henke (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Andrew Wong, Hao Hao,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16494

to look at the new patch set (#2).

Change subject: KUDU-3192: [hms] Leverage the cluster ID in HMS sync
..

KUDU-3192: [hms] Leverage the cluster ID in HMS sync

This patch propagates the cluster ID to the HMS entries for
Kudu tables and leverages the cluster ID to ignore irrelevant
notifications from the HMS.

Additionally the `kudu hms` tool is updated to identify
and repair tables that do not have the cluster ID set.

Change-Id: I865b418a3cc4e11c889cc4757cd940831c43af17
---
M src/kudu/hms/hms_catalog-test.cc
M src/kudu/hms/hms_catalog.cc
M src/kudu/hms/hms_catalog.h
M src/kudu/hms/hms_client-test.cc
M src/kudu/hms/hms_client.cc
M src/kudu/hms/hms_client.h
M src/kudu/integration-tests/hms_itest-base.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
M src/kudu/master/hms_notification_log_listener.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_hms.cc
12 files changed, 185 insertions(+), 110 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/94/16494/2
--
To view, visit http://gerrit.cloudera.org:8080/16494
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I865b418a3cc4e11c889cc4757cd940831c43af17
Gerrit-Change-Number: 16494
Gerrit-PatchSet: 2
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-3192: [client] Expose the cluster ID in the client KuduTable

2020-09-29 Thread Grant Henke (Code Review)
Grant Henke has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16493 )

Change subject: KUDU-3192: [client] Expose the cluster ID in the client 
KuduTable
..

KUDU-3192: [client] Expose the cluster ID in the client KuduTable

This patch adds a cluster ID field to the C++ and Java Kudu
clients. This field will be used in a follow on change to better
enable the HMS integration to use the cluster ID.

Change-Id: Ia1f5a451aaa44834534d2387ee1c9aa9cf95dd37
Reviewed-on: http://gerrit.cloudera.org:8080/16493
Reviewed-by: Andrew Wong 
Tested-by: Grant Henke 
---
M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KuduClient.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java
M src/kudu/client/client-internal.cc
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/integration-tests/cluster_itest_util.cc
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/integration-tests/master_failover-itest.cc
M src/kudu/master/master-test.cc
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
14 files changed, 119 insertions(+), 15 deletions(-)

Approvals:
  Andrew Wong: Looks good to me, approved
  Grant Henke: Verified

--
To view, visit http://gerrit.cloudera.org:8080/16493
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ia1f5a451aaa44834534d2387ee1c9aa9cf95dd37
Gerrit-Change-Number: 16493
Gerrit-PatchSet: 5
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] KUDU-3192: [client] Expose the cluster ID in the client KuduTable

2020-09-29 Thread Grant Henke (Code Review)
Grant Henke has removed a vote on this change.

Change subject: KUDU-3192: [client] Expose the cluster ID in the client 
KuduTable
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/16493
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: Ia1f5a451aaa44834534d2387ee1c9aa9cf95dd37
Gerrit-Change-Number: 16493
Gerrit-PatchSet: 4
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] KUDU-3192: [client] Expose the cluster ID in the client KuduTable

2020-09-29 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16493 )

Change subject: KUDU-3192: [client] Expose the cluster ID in the client 
KuduTable
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16493
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1f5a451aaa44834534d2387ee1c9aa9cf95dd37
Gerrit-Change-Number: 16493
Gerrit-PatchSet: 4
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 29 Sep 2020 21:15:19 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2612 p11: persist txn metadata in superblock

2020-09-29 Thread Andrew Wong (Code Review)
Hello Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16492

to look at the new patch set (#6).

Change subject: KUDU-2612 p11: persist txn metadata in superblock
..

KUDU-2612 p11: persist txn metadata in superblock

Currently, transaction metadata is entirely stored in the WALs; once a
transaction is begun, it anchors WALs indefinitely. This is untenable,
as any tablet that participates in a transaction would never be able to
GC its WALs.

This patch addresses this by storing transaction metadata in the tablet
superblock for the following participant ops:
- BEGIN_TXN: an empty record is added to the superblock for the given
  transaction ID. In the future, this can be used to store the owner of
  the transaction.
- FINALIZE_COMMIT: the commit timestamp is added to the superblock for
  the given transaction ID. The commit timestamp of a transaction will
  be used when scanning over mutations applied to a tablet instead of
  the apply timestamp.
- ABORT_TXN: an abort bit is added to the superblock for the given
  transaction ID.

Upon applying each of these ops, the op's OpId is anchored in the WALs
until the tablet metadata is successfully flushed, ensuring that if we
don't flush the tablet metadata before restarting, we can rebuild any
ops that whose mutations had not been persisted by replaying the WALs.

What about BEGIN_COMMIT ops? While these ops will still need to be
anchored, BEGIN_COMMIT ops don't need to persist any long-term
information. As such, their anchoring is slightly different -- rather
than mutating the tablet metadata and unanchoring on a metadata flush,
BEGIN_COMMIT ops will update the in-memory state for the in-flight Txn,
and unanchor once the corresponding FINALIZE_COMMIT or ABORT_TXN begins
applying.

On bootstrap, the following rules are applied:
- If we've persisted terminal information (i.e. abort or commit) about a
  transaction ID, we can skip replaying any participant op for that
  transaction.
- If we otherwise have a persisted record for the transaction ID, we
  should start an open transaction and replay all participant ops for
  that transaction.
- If we have no record of a transaction persisted, we must replay all
  participant ops for the given transaction.

While storing things in the superblock is not ideal, there are further
improvements that can be made to reduce its size, e.g. after finalizing
a transaction, the transactions mutations may be merged in with the rest
of the tablet; once all mutations of a given transaction have been
merged, the transaction metadata may be removed from the metadata.

Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
---
M src/kudu/integration-tests/txn_participant-itest.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/ops/participant_op.cc
M src/kudu/tablet/ops/participant_op.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/txn_participant-test.cc
M src/kudu/tablet/txn_participant.cc
M src/kudu/tablet/txn_participant.h
13 files changed, 678 insertions(+), 121 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/92/16492/6
--
To view, visit http://gerrit.cloudera.org:8080/16492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
Gerrit-Change-Number: 16492
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] KUDU-2612 p10: have timestamp assignment account for commit timestamps

2020-09-29 Thread Andrew Wong (Code Review)
Hello Kudu Jenkins, Hao Hao,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16470

to look at the new patch set (#5).

Change subject: KUDU-2612 p10: have timestamp assignment account for commit 
timestamps
..

KUDU-2612 p10: have timestamp assignment account for commit timestamps

One of the hurdles to performing a transaction's commit on a participant
is that the commit process must ensure repeatable reads. Without
multi-op transactions, this is done via a dance between the TimeManager
(the entity that tracks and assigns timestamps) and the MvccManager (the
entity that tracks the lifecycle of ops).

This patch extends the dance between the TimeManager and MvccManager to
ensure that when a participant commits a transaction, all future ops
will be assigned a higher timestamp.

I found it time-consuming to peruse the existing codebase for how
timestamp assignment ensured repeatable reads, so I added a block
comment to time_manager.h describing how it does so. I also added a
section about how this patch extends it to ensure repeatable reads in
the context of transactions.

Change-Id: I0412fd0cf778d96f3fe6b14624d8d06942f40e72
---
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/time_manager-test.cc
M src/kudu/consensus/time_manager.cc
M src/kudu/consensus/time_manager.h
M src/kudu/integration-tests/txn_participant-itest.cc
M src/kudu/tablet/ops/participant_op.cc
M src/kudu/tablet/ops/participant_op.h
M src/kudu/tablet/ops/write_op.cc
M src/kudu/tablet/ops/write_op.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/txn_participant.h
13 files changed, 436 insertions(+), 79 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/70/16470/5
--
To view, visit http://gerrit.cloudera.org:8080/16470
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0412fd0cf778d96f3fe6b14624d8d06942f40e72
Gerrit-Change-Number: 16470
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] KUDU-3192: [client] Expose the cluster ID in the client KuduTable

2020-09-29 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16493 )

Change subject: KUDU-3192: [client] Expose the cluster ID in the client 
KuduTable
..


Patch Set 4: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16493/3/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java
File java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java:

http://gerrit.cloudera.org:8080/#/c/16493/3/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java@1328
PS3, Line 1328: assertTrue(client.getClusterId().isEmpty());
  : // Do something that will cause the client to connect to 
the cluster.
  : client.listTabletServers();
  : assertFalse(client.getClusterId().isEmpty());
> It is auto-generated server side and there isn't a great way to get it from
Ugh, I totally misread this test -- as you deduced I was referring to the 
cluster ID before the client even connected to the cluster, which I agree is 
fine to not test, given that's tested elsewhere.



--
To view, visit http://gerrit.cloudera.org:8080/16493
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1f5a451aaa44834534d2387ee1c9aa9cf95dd37
Gerrit-Change-Number: 16493
Gerrit-PatchSet: 4
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 29 Sep 2020 20:47:45 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-3192: [client] Expose the cluster ID in the client KuduTable

2020-09-29 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16493 )

Change subject: KUDU-3192: [client] Expose the cluster ID in the client 
KuduTable
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16493/3/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java
File java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java:

http://gerrit.cloudera.org:8080/#/c/16493/3/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java@1328
PS3, Line 1328: assertTrue(client.getClusterId().isEmpty());
  : // Do something that will cause the client to connect to 
the cluster.
  : client.listTabletServers();
  : assertFalse(client.getClusterId().isEmpty());
> nit: maybe assert the strings are equal?
It is auto-generated server side and there isn't a great way to get it from the 
server (without using the RPC this is testing).


http://gerrit.cloudera.org:8080/#/c/16493/3/src/kudu/integration-tests/cluster_itest_util.h
File src/kudu/integration-tests/cluster_itest_util.h:

http://gerrit.cloudera.org:8080/#/c/16493/3/src/kudu/integration-tests/cluster_itest_util.h@465
PS3, Line 465:  const MonoDelta& timeout,
 :  std::string* cluster_id);
> nit: spacing
Done



--
To view, visit http://gerrit.cloudera.org:8080/16493
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1f5a451aaa44834534d2387ee1c9aa9cf95dd37
Gerrit-Change-Number: 16493
Gerrit-PatchSet: 3
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 29 Sep 2020 20:31:58 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-3192: [client] Expose the cluster ID in the client KuduTable

2020-09-29 Thread Grant Henke (Code Review)
Hello Kudu Jenkins, Andrew Wong, Hao Hao,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16493

to look at the new patch set (#4).

Change subject: KUDU-3192: [client] Expose the cluster ID in the client 
KuduTable
..

KUDU-3192: [client] Expose the cluster ID in the client KuduTable

This patch adds a cluster ID field to the C++ and Java Kudu
clients. This field will be used in a follow on change to better
enable the HMS integration to use the cluster ID.

Change-Id: Ia1f5a451aaa44834534d2387ee1c9aa9cf95dd37
---
M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KuduClient.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java
M src/kudu/client/client-internal.cc
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/integration-tests/cluster_itest_util.cc
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/integration-tests/master_failover-itest.cc
M src/kudu/master/master-test.cc
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
14 files changed, 119 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/93/16493/4
--
To view, visit http://gerrit.cloudera.org:8080/16493
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia1f5a451aaa44834534d2387ee1c9aa9cf95dd37
Gerrit-Change-Number: 16493
Gerrit-PatchSet: 4
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] KUDU-3192: [client] Expose the cluster ID in the client KuduTable

2020-09-29 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16493 )

Change subject: KUDU-3192: [client] Expose the cluster ID in the client 
KuduTable
..


Patch Set 3: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16493/3/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java
File java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java:

http://gerrit.cloudera.org:8080/#/c/16493/3/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java@1328
PS3, Line 1328: assertTrue(client.getClusterId().isEmpty());
  : // Do something that will cause the client to connect to 
the cluster.
  : client.listTabletServers();
  : assertFalse(client.getClusterId().isEmpty());
nit: maybe assert the strings are equal?


http://gerrit.cloudera.org:8080/#/c/16493/3/src/kudu/integration-tests/cluster_itest_util.h
File src/kudu/integration-tests/cluster_itest_util.h:

http://gerrit.cloudera.org:8080/#/c/16493/3/src/kudu/integration-tests/cluster_itest_util.h@465
PS3, Line 465:  const MonoDelta& timeout,
 :  std::string* cluster_id);
nit: spacing



--
To view, visit http://gerrit.cloudera.org:8080/16493
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1f5a451aaa44834534d2387ee1c9aa9cf95dd37
Gerrit-Change-Number: 16493
Gerrit-PatchSet: 3
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 29 Sep 2020 19:31:42 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-3192: [client] Expose the cluster ID in the client KuduTable

2020-09-29 Thread Grant Henke (Code Review)
Hello Kudu Jenkins, Andrew Wong, Hao Hao,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16493

to look at the new patch set (#3).

Change subject: KUDU-3192: [client] Expose the cluster ID in the client 
KuduTable
..

KUDU-3192: [client] Expose the cluster ID in the client KuduTable

This patch adds a cluster ID field to the C++ and Java Kudu
clients. This field will be used in a follow on change to better
enable the HMS integration to use the cluster ID.

Change-Id: Ia1f5a451aaa44834534d2387ee1c9aa9cf95dd37
---
M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduClient.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KuduClient.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java
M src/kudu/client/client-internal.cc
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/integration-tests/cluster_itest_util.cc
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/integration-tests/master_failover-itest.cc
M src/kudu/master/master-test.cc
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
14 files changed, 118 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/93/16493/3
--
To view, visit http://gerrit.cloudera.org:8080/16493
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia1f5a451aaa44834534d2387ee1c9aa9cf95dd37
Gerrit-Change-Number: 16493
Gerrit-PatchSet: 3
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Hao Hao 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] [thirdparty] Fix LLVM build on macOS with Xcode 12

2020-09-29 Thread Grant Henke (Code Review)
Grant Henke has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16520 )

Change subject: [thirdparty] Fix LLVM build on macOS with Xcode 12
..

[thirdparty] Fix LLVM build on macOS with Xcode 12

After upgrading Xcode I started seeing failures when building
thirdparty. An example is below. It looks like the issue
stems from a change in the default architecture to include
arm64. Full details on the issue can be seen here:
https://gitlab.kitware.com/cmake/cmake/-/issues/20893

To work around this issue this patch sets
`-DCMAKE_OSX_ARCHITECTURES="x86_64”` in the
LLVM build.

Example failure log:
ld: warning: building for iOS, but linking in .tbd file 
(/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator14.0.sdk/usr/lib/system/libsystem_coreservices.tbd)
 built for iOS Simulator
…
fatal error: 
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/lipo:
 
/Users/ghenke/Source/kudu/thirdparty/build/llvm-9.0.0/lib/libclang_rt.builtins_arm64_ios.a
 and 
/Users/ghenke/Source/kudu/thirdparty/build/llvm-9.0.0/lib/libclang_rt.builtins_arm64_iossim.a
 have the same architectures (arm64) and can't be in the same fat output file

Change-Id: I0e7808fcd92227cd83b35f5bf1b9320c7887b603
Reviewed-on: http://gerrit.cloudera.org:8080/16520
Reviewed-by: Alexey Serbin 
Tested-by: Kudu Jenkins
---
M thirdparty/build-definitions.sh
1 file changed, 1 insertion(+), 0 deletions(-)

Approvals:
  Alexey Serbin: Looks good to me, approved
  Kudu Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/16520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I0e7808fcd92227cd83b35f5bf1b9320c7887b603
Gerrit-Change-Number: 16520
Gerrit-PatchSet: 2
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] [thirdparty] Fix LLVM build on macOS with Xcode 12

2020-09-29 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16520 )

Change subject: [thirdparty] Fix LLVM build on macOS with Xcode 12
..


Patch Set 1: Code-Review+2

Thank you for addressing this.


--
To view, visit http://gerrit.cloudera.org:8080/16520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0e7808fcd92227cd83b35f5bf1b9320c7887b603
Gerrit-Change-Number: 16520
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 29 Sep 2020 17:06:26 +
Gerrit-HasComments: No


[kudu-CR] [thirdparty] Fix LLVM build on macOS with Xcode 12

2020-09-29 Thread Grant Henke (Code Review)
Grant Henke has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16520


Change subject: [thirdparty] Fix LLVM build on macOS with Xcode 12
..

[thirdparty] Fix LLVM build on macOS with Xcode 12

After upgrading Xcode I started seeing failures when building
thirdparty. An example is below. It looks like the issue
stems from a change in the default architecture to include
arm64. Full details on the issue can be seen here:
https://gitlab.kitware.com/cmake/cmake/-/issues/20893

To work around this issue this patch sets
`-DCMAKE_OSX_ARCHITECTURES="x86_64”` in the
LLVM build.

Example failure log:
ld: warning: building for iOS, but linking in .tbd file 
(/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator14.0.sdk/usr/lib/system/libsystem_coreservices.tbd)
 built for iOS Simulator
…
fatal error: 
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/lipo:
 
/Users/ghenke/Source/kudu/thirdparty/build/llvm-9.0.0/lib/libclang_rt.builtins_arm64_ios.a
 and 
/Users/ghenke/Source/kudu/thirdparty/build/llvm-9.0.0/lib/libclang_rt.builtins_arm64_iossim.a
 have the same architectures (arm64) and can't be in the same fat output file

Change-Id: I0e7808fcd92227cd83b35f5bf1b9320c7887b603
---
M thirdparty/build-definitions.sh
1 file changed, 1 insertion(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/20/16520/1
--
To view, visit http://gerrit.cloudera.org:8080/16520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I0e7808fcd92227cd83b35f5bf1b9320c7887b603
Gerrit-Change-Number: 16520
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke