[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.

2021-07-30 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17643 )

Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash 
schemas.
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner-test.cc
File src/kudu/common/partition_pruner-test.cc:

http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner-test.cc@1491
PS6, Line 1491: } // namespace kudu
Also, does it make sense to add a few test cases to cover situations when range 
partitions consist just of a single element in one of its PK dimension?  E.g., 
for PK (int32, string) have a table range partitioned by int32, and hash 
bucketed in different number of buckets based on string per range?



--
To view, visit http://gerrit.cloudera.org:8080/17643
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7
Gerrit-Change-Number: 17643
Gerrit-PatchSet: 6
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 31 Jul 2021 02:23:36 +
Gerrit-HasComments: Yes


[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.

2021-07-30 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17643 )

Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash 
schemas.
..


Patch Set 6:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner-test.cc
File src/kudu/common/partition_pruner-test.cc:

http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner-test.cc@1490
PS6, Line 1490: }
I'm not sure whether I saw testcases to cover situations when 
scan_spec.CanShortCircuit() in the very beginning of the 
PartitionPruner::Init() method would return 'true' in case of using custom hash 
bucket schemas per range partition.  Does it make sense to add a few to cover 
the cases to cover various cases when ScanSpec::CanShortCircuit() returns 
'true'?


http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc
File src/kudu/common/partition_pruner.cc:

http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc@241
PS6, Line 241: ColumnId
nit: could use 'auto' here as well?


http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc@243
PS6, Line 243: ColumnPredicate *predicate
nit: per code style, the asterisk should stick to the type, not variable


http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc@521
PS6, Line 521: range
nit: for better readability, maybe name this iterator 'range_it'?


http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc@521
PS6, Line 521: partition_key_range.second
nit: could establish a reference variable to refer to 
partition_key_range.second for brevity:

  auto& key_range = partition_key_range.second;


http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc@523
PS6, Line 523: range++
nit: prefer using prefix increment operators -- they might be faster because 
they don't need to store the intermediate result in a temporary variable (you 
could check the implementation of prefix and postfix operators for STD 
containers' iterators)


http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc@557
PS6, Line 557: &&
nit: move '&&' to the previous line



--
To view, visit http://gerrit.cloudera.org:8080/17643
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7
Gerrit-Change-Number: 17643
Gerrit-PatchSet: 6
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 31 Jul 2021 02:12:04 +
Gerrit-HasComments: Yes


[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.

2021-07-30 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17643 )

Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash 
schemas.
..


Patch Set 6:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc
File src/kudu/common/partition_pruner-test.cc:

http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1366
PS3, Line 1366:
> Bad asserts result in the test being marked as failed, while a CHECK failur
Yep, I agree that ASSERT_OK() is a better choice here.

AFAIK, the only case where resorting to CHECK_OK() is a good option is if 
ASSERT_OK() isn't working as expected if the code is run by other than the main 
test thread.


http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.h
File src/kudu/common/partition_pruner.h:

http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.h@46
PS6, Line 46: LowerUpperRangeBounds
nit: why not just RangeBounds -- could there be any other than lower and upper 
bounds for a range?


http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.h@48
PS6, Line 48: std::pair
What is semantic significance of using std::pair instead of struct here?  Does 
the first pair's component somehow complements the second and vice versa?


http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.h@71
PS6, Line 71: size_t num_ranges = 0;
: for (const auto& range: partition_key_ranges_) {
:   num_ranges += range.second.size();
: }
: return num_ranges;
I see this method is indirectly called in NextPartitionKey() method, and that 
would result in n^2 complexity while building, scan tokens.  Is it possible to 
compute this once in Init() method and then use the pre-computed result 
elsewhere?



--
To view, visit http://gerrit.cloudera.org:8080/17643
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7
Gerrit-Change-Number: 17643
Gerrit-PatchSet: 6
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 31 Jul 2021 01:18:04 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-3304: [Alter] Support to alter table's replication factor

2021-07-30 Thread Yingchun Lai (Code Review)
Yingchun Lai has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17674 )

Change subject: KUDU-3304: [Alter] Support to alter table's replication factor
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17674/4/src/kudu/client/client.h
File src/kudu/client/client.h:

http://gerrit.cloudera.org:8080/#/c/17674/4/src/kudu/client/client.h@1721
PS4, Line 1721:
  :   /// Set the table's replication factor.
  :   ///
  :   /// @note The replication factor should be a odd number and 
range in
  :   /// [1, --max_num_replicas]
  :   ///
  :   /// @param [in] replication_factor
  :   ///   The table's new replication factor.
> Alexey and I discussed this offline at length. Introducing this method as a
Thanks for your suggestion, not expose to C++ client API and just use internal 
methods maybe a better way to resolve this issue, I‘ll try to update the patch.



--
To view, visit http://gerrit.cloudera.org:8080/17674
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3aa2d5b12c508ba761fa9410ad1a465cf31fb7d7
Gerrit-Change-Number: 17674
Gerrit-PatchSet: 7
Gerrit-Owner: Yingchun Lai 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Comment-Date: Sat, 31 Jul 2021 00:59:30 +
Gerrit-HasComments: Yes


[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.

2021-07-30 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17643 )

Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash 
schemas.
..


Patch Set 6: Verified+1

Test failures were unrelated to this patch.


--
To view, visit http://gerrit.cloudera.org:8080/17643
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7
Gerrit-Change-Number: 17643
Gerrit-PatchSet: 6
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 31 Jul 2021 00:06:01 +
Gerrit-HasComments: No


[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.

2021-07-30 Thread Andrew Wong (Code Review)
Andrew Wong has removed a vote on this change.

Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash 
schemas.
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/17643
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7
Gerrit-Change-Number: 17643
Gerrit-PatchSet: 6
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-3304: [Alter] Support to alter table's replication factor

2021-07-30 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17674 )

Change subject: KUDU-3304: [Alter] Support to alter table's replication factor
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17674/4/src/kudu/client/client.h
File src/kudu/client/client.h:

http://gerrit.cloudera.org:8080/#/c/17674/4/src/kudu/client/client.h@1721
PS4, Line 1721:
  :   /// Set the table's replication factor.
  :   ///
  :   /// @note The replication factor should be a odd number and 
range in
  :   /// [1, --max_num_replicas]
  :   ///
  :   /// @param [in] replication_factor
  :   ///   The table's new replication factor.
> Right -- in some aspects it's close to CreateTable() and DDL operations per
Alexey and I discussed this offline at length. Introducing this method as a 
part of public API has deeper repercussions that are worth pondering. We strive 
to have our programatic public interface be as stable as possible, so ironing 
out these details up front would help alleviate much of the concern (or see 
below for an alternate path).
- As pointed out, the semantics of altering the replication factor are a bit 
fuzzy, and if thinking about replication factor as a first class citizen of 
metadata that can be altered, there are things we should probably iron out 
before pushing this into the public client API. Namely, how long should the 
wait() method actually wait (majority replicated? fully replicated?)? And how 
should this metadata be synchronized with the HMS, if using Impala and/or Hive? 
These are certainly questions we can answer now (just spitballing, but we could 
update IsAlterTableDone() to check replica count), but my point is that these 
are questions that should be thought through before committing to public API.
- Given the expectations of altering the replication factor may be different 
from other kinds of alter DDL, should this be a part of the KuduTableAlterer 
API at all, or should we introduce some other API that encompasses only 
changing replication factor? That way, we could build something more tailored 
replication factor (e.g. wait for full replication, majority replication, 
etc.). And beyond that, there may be other use cases to consider, like writing 
a single-replica table and then increasing the number of replicas once fully 
compacted as a means to potentially faster replication.

Public API aside, if the main goal is a means to bump the replication factor 
via tooling, an alternate route would be to not expose this as C++ client API 
just yet, and instead rely on friendship and internal methods to alter the 
replication factor. We do something similar to set table types for the 
transactions system table, since table types are also not exposed to public API 
yet. See [1] for an example. Going this route would still allow us to introduce 
tooling still, but not lock us into a client API. Then, as we fix any rough 
edges with tooling, we can learn and deliberate on what the semantics for the 
public API ought to be, without releasing anything too unstable.

[1] 
https://github.com/apache/kudu/blob/add7cb99fc8d6ab586e27d0ecd7d1698f0f45ae3/src/kudu/transactions/txn_system_client.cc#L130



--
To view, visit http://gerrit.cloudera.org:8080/17674
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3aa2d5b12c508ba761fa9410ad1a465cf31fb7d7
Gerrit-Change-Number: 17674
Gerrit-PatchSet: 7
Gerrit-Owner: Yingchun Lai 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Comment-Date: Fri, 30 Jul 2021 23:58:37 +
Gerrit-HasComments: Yes


[kudu-CR] [common] style update on schema.{cc,h} and partition.cc

2021-07-30 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17742 )

Change subject: [common] style update on schema.{cc,h} and partition.cc
..

[common] style update on schema.{cc,h} and partition.cc

While I was updating the code around, I decided to separate changes
related to the project's style guide and comparison of signed/unsigned
integers into a separate patch.  In addition, I updated the code to
consistently use dedicated constants (like kColumnNotFound) throughout
the code in schema.cc.

This changelist doesn't contain any functional changes.

Change-Id: I55d02fed53a39d311b43ea7bd956316abc998f60
Reviewed-on: http://gerrit.cloudera.org:8080/17742
Reviewed-by: Andrew Wong 
Tested-by: Kudu Jenkins
---
M src/kudu/common/partition.cc
M src/kudu/common/schema.cc
M src/kudu/common/schema.h
3 files changed, 82 insertions(+), 86 deletions(-)

Approvals:
  Andrew Wong: Looks good to me, approved
  Kudu Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/17742
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I55d02fed53a39d311b43ea7bd956316abc998f60
Gerrit-Change-Number: 17742
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 


[kudu-CR] [common] style update on schema.{cc,h} and partition.cc

2021-07-30 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17742 )

Change subject: [common] style update on schema.{cc,h} and partition.cc
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17742
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I55d02fed53a39d311b43ea7bd956316abc998f60
Gerrit-Change-Number: 17742
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Comment-Date: Fri, 30 Jul 2021 21:11:39 +
Gerrit-HasComments: No


[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.

2021-07-30 Thread Mahesh Reddy (Code Review)
Mahesh Reddy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17643 )

Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash 
schemas.
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17643/5/src/kudu/common/partition_pruner.cc
File src/kudu/common/partition_pruner.cc:

http://gerrit.cloudera.org:8080/#/c/17643/5/src/kudu/common/partition_pruner.cc@269
PS5, Line 269: constrained_index = hash_bucket_schemas.size() -
> warning: narrowing conversion from 'unsigned long' to signed type 'int' is
Done



--
To view, visit http://gerrit.cloudera.org:8080/17643
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7
Gerrit-Change-Number: 17643
Gerrit-PatchSet: 6
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 30 Jul 2021 20:36:51 +
Gerrit-HasComments: Yes


[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.

2021-07-30 Thread Mahesh Reddy (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17643

to look at the new patch set (#6).

Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash 
schemas.
..

[pruning] KUDU-2671: Pruning compatible with custom hash schemas.

This patch introduces changes to the PartitionPruner class
to be compatible with custom hash bucket schemas per range.

There are three ways to set bounds on a scan.

- Adding predicates (e.g. range/equality)
- Setting lower and upper bound primary keys
- Setting lower and upper bound partition keys

This patch introduces changes that make the first two methods
of setting bounds on a scan compatible with custom hash bucket
schemas per range. The last way using partition keys is unstable
and for internal use only. While it's not necessary for the last
way to be compatible with per range hash bucket schemas, the
entire pruning functionality will not be complete until
PartitionPruner::RemovePartitionKeyRange() is modified.
That work will be done in a follow up patch.

Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7
---
M src/kudu/common/partial_row.h
M src/kudu/common/partition_pruner-test.cc
M src/kudu/common/partition_pruner.cc
M src/kudu/common/partition_pruner.h
4 files changed, 918 insertions(+), 373 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/43/17643/6
--
To view, visit http://gerrit.cloudera.org:8080/17643
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7
Gerrit-Change-Number: 17643
Gerrit-PatchSet: 6
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] [common] style update on schema.{cc,h} and partition.cc

2021-07-30 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17742


Change subject: [common] style update on schema.{cc,h} and partition.cc
..

[common] style update on schema.{cc,h} and partition.cc

While I was updating the code around, I decided to separate changes
related to the project's style guide and comparison of signed/unsigned
integers into a separate patch.  In addition, I updated the code to
consistently use dedicated constants (like kColumnNotFound) throughout
the code in schema.cc.

This changelist doesn't contain any functional changes.

Change-Id: I55d02fed53a39d311b43ea7bd956316abc998f60
---
M src/kudu/common/partition.cc
M src/kudu/common/schema.cc
M src/kudu/common/schema.h
3 files changed, 82 insertions(+), 86 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/42/17742/1
--
To view, visit http://gerrit.cloudera.org:8080/17742
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I55d02fed53a39d311b43ea7bd956316abc998f60
Gerrit-Change-Number: 17742
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 


[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.

2021-07-30 Thread Mahesh Reddy (Code Review)
Mahesh Reddy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17643 )

Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash 
schemas.
..


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc
File src/kudu/common/partition_pruner-test.cc:

http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1366
PS3, Line 1366:
> Bad asserts result in the test being marked as failed, while a CHECK failur
Thanks for the clarification.


http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1428
PS3, Line 1428:
> Just to clarify the suggestion is:
Done


http://gerrit.cloudera.org:8080/#/c/17643/4/src/kudu/common/partition_pruner.cc
File src/kudu/common/partition_pruner.cc:

http://gerrit.cloudera.org:8080/#/c/17643/4/src/kudu/common/partition_pruner.cc@481
PS4, Line 481: auto&
> nit: ampersand should be right after "auto"
Done



--
To view, visit http://gerrit.cloudera.org:8080/17643
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7
Gerrit-Change-Number: 17643
Gerrit-PatchSet: 5
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 30 Jul 2021 20:25:52 +
Gerrit-HasComments: Yes


[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.

2021-07-30 Thread Mahesh Reddy (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17643

to look at the new patch set (#5).

Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash 
schemas.
..

[pruning] KUDU-2671: Pruning compatible with custom hash schemas.

This patch introduces changes to the PartitionPruner class
to be compatible with custom hash bucket schemas per range.

There are three ways to set bounds on a scan.

- Adding predicates (e.g. range/equality)
- Setting lower and upper bound primary keys
- Setting lower and upper bound partition keys

This patch introduces changes that make the first two methods
of setting bounds on a scan compatible with custom hash bucket
schemas per range. The last way using partition keys is unstable
and for internal use only. While it's not necessary for the last
way to be compatible with per range hash bucket schemas, the
entire pruning functionality will not be complete until
PartitionPruner::RemovePartitionKeyRange() is modified.
That work will be done in a follow up patch.

Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7
---
M src/kudu/common/partial_row.h
M src/kudu/common/partition_pruner-test.cc
M src/kudu/common/partition_pruner.cc
M src/kudu/common/partition_pruner.h
4 files changed, 918 insertions(+), 373 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/43/17643/5
--
To view, visit http://gerrit.cloudera.org:8080/17643
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7
Gerrit-Change-Number: 17643
Gerrit-PatchSet: 5
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] [WIP] KUDU-1959 - Implement server startup progress page for tablet and master servers

2021-07-30 Thread Abhishek Chennaka (Code Review)
Abhishek Chennaka has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17730 )

Change subject: [WIP] KUDU-1959 - Implement server startup progress page for 
tablet and master servers
..


Patch Set 2:

The patch set 2 has all the previously pointed out unaddressed things addressed


--
To view, visit http://gerrit.cloudera.org:8080/17730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1db1fcf16261d4ced1b3657a697766a5335271b4
Gerrit-Change-Number: 17730
Gerrit-PatchSet: 2
Gerrit-Owner: Abhishek Chennaka 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 30 Jul 2021 19:02:42 +
Gerrit-HasComments: No


[kudu-CR] [WIP] KUDU-1959 - Implement server startup progress page for tablet and master servers

2021-07-30 Thread Abhishek Chennaka (Code Review)
Hello Tidy Bot, Attila Bukor, Kudu Jenkins, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17730

to look at the new patch set (#2).

Change subject: [WIP] KUDU-1959 - Implement server startup progress page for 
tablet and master servers
..

[WIP] KUDU-1959 - Implement server startup progress page for tablet and
master servers

This task is broken down into the below tasks:
1. Adjust server startup to start webserver displaying static startup page
2. Implement startup WebUI using existing data
3. Enhance the startup WebUI with missing data
4. Expose startup WebUI data to metrics
5. Expose the metrics to CM for health monitoring

This patch implements the step 1 from above - Adjust the server startup
sequence to startup the web server and load an empty startup page right
before the filesystem is read, tablets are bootstrapped and rpc server
is started. This startup page will be used to display the startup
progress which will be implemented in following patches. Manually
validated the startup when security is enabled and filesystem is not
present.

The footer of the WebUI has the UUID by default. In case of new
instances, since we are starting the Webserver before the UUID is
generated the footer doesn’t contain UUID but it is set later when
all the other pages are loaded.

Change-Id: I1db1fcf16261d4ced1b3657a697766a5335271b4
---
M src/kudu/server/CMakeLists.txt
M src/kudu/server/server_base.cc
A src/kudu/server/startup_path_handler.cc
A src/kudu/server/startup_path_handler.h
M src/kudu/server/tracing_path_handlers.h
5 files changed, 127 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/30/17730/2
--
To view, visit http://gerrit.cloudera.org:8080/17730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1db1fcf16261d4ced1b3657a697766a5335271b4
Gerrit-Change-Number: 17730
Gerrit-PatchSet: 2
Gerrit-Owner: Abhishek Chennaka 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.

2021-07-30 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17643 )

Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash 
schemas.
..


Patch Set 4:

I'll look into the test -- it seems pretty flaky (the test dashboard says 11%!).


--
To view, visit http://gerrit.cloudera.org:8080/17643
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7
Gerrit-Change-Number: 17643
Gerrit-PatchSet: 4
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 30 Jul 2021 18:59:53 +
Gerrit-HasComments: No


[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.

2021-07-30 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17643 )

Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash 
schemas.
..


Patch Set 4: Verified+1

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc
File src/kudu/common/partition_pruner-test.cc:

http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1366
PS3, Line 1366:
> I just reused the existing test code that uses CHECK_OK(). What's the diffe
Bad asserts result in the test being marked as failed, while a CHECK failure 
results in the process crashing. A failed test is much easier to diagnose and 
report compared to a crash. And depending on the test, e.g. in a multi-process 
test like one that uses an ExternalMiniCluster, a crash may not even result in 
a test failure without explicit checking for crashed processes.


http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1428
PS3, Line 1428:  int8_t>(4, 4, 0), 7, 7);
> nope, have to explicitly use `make_tuple`.
Just to clarify the suggestion is:

 CreatePartitionSchemaPB({"A"}, { { columns, 3, 0 } }, );
or
 CreatePartitionSchemaPB({"A"}, { { { "A", "B" }, 3, 0 } }, );

which _should_ work. I think compilers have issues with inferring types for 
optionals (e.g. in the check() lambdas), but that shouldn't be the case here or 
with AddRangePartitionWithSchema().


http://gerrit.cloudera.org:8080/#/c/17643/4/src/kudu/common/partition_pruner.cc
File src/kudu/common/partition_pruner.cc:

http://gerrit.cloudera.org:8080/#/c/17643/4/src/kudu/common/partition_pruner.cc@481
PS4, Line 481: auto &
nit: ampersand should be right after "auto"



--
To view, visit http://gerrit.cloudera.org:8080/17643
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7
Gerrit-Change-Number: 17643
Gerrit-PatchSet: 4
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 30 Jul 2021 18:57:22 +
Gerrit-HasComments: Yes


[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.

2021-07-30 Thread Andrew Wong (Code Review)
Andrew Wong has removed a vote on this change.

Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash 
schemas.
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/17643
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7
Gerrit-Change-Number: 17643
Gerrit-PatchSet: 4
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] [helm] fix usage of multiple directories

2021-07-30 Thread Andrew Wong (Code Review)
Andrew Wong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17739 )

Change subject: [helm] fix usage of multiple directories
..

[helm] fix usage of multiple directories

This patch updates the /helm implementation to use the new FS_WAL_DIR and
FS_DATA_DIRS environment variables. This patch also updates the README with
steps I used to test the changes.

I verified that I could update the number of directories in kudu/values.yaml,
confirming that the servers used multiple directories. Below is a snippet of
the logs from one tserver with multiple directories (5, set via
kudu/values.yaml). Previously, even with multiple storage devices set,
each server would use a single directory /var/lib/kudu for storage.

$ kubectl logs kudu-tserver-1
I0730 00:36:07.404302 1 tablet_server_runner.cc:78] Tablet server 
non-default flags:
--use_hybrid_clock=false
--fs_data_dirs=/mnt/disk1,/mnt/disk2,/mnt/disk3,/mnt/disk4
--fs_wal_dir=/mnt/disk0
--webserver_doc_root=/opt/kudu/www
--tserver_master_addrs=kudu-master-0.kudu-masters.default.svc.cluster.local,kudu-master-1.kudu-masters.default.svc.cluster.local,kudu-master-2.kudu-masters.default.svc.cluster.local
--heap_profile_path=/tmp/kudu.1
--stderrthreshold=0

Tablet server version:
kudu 1.16.0-SNAPSHOT
revision 2ceec7749
build type RELEASE
...

A rendered README can be found here:
https://github.com/andrwng/kudu/blob/helm_multidir/kubernetes/helm/README.adoc

Change-Id: I583dff69c809e0851052c98267740a3e70c60efa
---
M kubernetes/helm/README.adoc
M kubernetes/helm/kudu/templates/_helmutils.tpl
M kubernetes/helm/kudu/templates/kudu-service.yaml
M kubernetes/helm/kudu/values.yaml
4 files changed, 109 insertions(+), 23 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Bankim Bhavsar: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/17739
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I583dff69c809e0851052c98267740a3e70c60efa
Gerrit-Change-Number: 17739
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] [docker] KUDU-3307: allow using multiple directories

2021-07-30 Thread Andrew Wong (Code Review)
Andrew Wong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17738 )

Change subject: [docker] KUDU-3307: allow using multiple directories
..

[docker] KUDU-3307: allow using multiple directories

Previously, the Docker entrypoint would by default use a single root
directory, suffixed with either 'master' or 'tserver'. This makes it a
poor fit for production systems that may want to use Docker, since most
reasonable deployments use multple data directories, and don't colocate
WALs and data.

This patch adds a couple of environment variables to the entrypoint,
FS_WAL_DIR and FS_DATA_DIRS, that map directly to the --fs_wal_dir and
--fs_data_dirs gflags that Kudu masters and tservers know and love.

To maintain compatibility with existing guides, the existing single-root
DATA_DIR remains, but the new environment variables, if set, take
precedence.

Change-Id: I20e5f75c1f1f8280dca60de75255a4a948e44be1
Reviewed-on: http://gerrit.cloudera.org:8080/17738
Tested-by: Kudu Jenkins
Reviewed-by: Bankim Bhavsar 
---
M docker/kudu-entrypoint.sh
1 file changed, 43 insertions(+), 6 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Bankim Bhavsar: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/17738
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I20e5f75c1f1f8280dca60de75255a4a948e44be1
Gerrit-Change-Number: 17738
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] [helm] fix usage of multiple directories

2021-07-30 Thread Bankim Bhavsar (Code Review)
Bankim Bhavsar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17739 )

Change subject: [helm] fix usage of multiple directories
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17739
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I583dff69c809e0851052c98267740a3e70c60efa
Gerrit-Change-Number: 17739
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 30 Jul 2021 18:18:31 +
Gerrit-HasComments: No


[kudu-CR] [docker] KUDU-3307: allow using multiple directories

2021-07-30 Thread Bankim Bhavsar (Code Review)
Bankim Bhavsar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17738 )

Change subject: [docker] KUDU-3307: allow using multiple directories
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17738
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I20e5f75c1f1f8280dca60de75255a4a948e44be1
Gerrit-Change-Number: 17738
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 30 Jul 2021 18:15:46 +
Gerrit-HasComments: No


[kudu-CR] [helm] fix usage of multiple directories

2021-07-30 Thread Andrew Wong (Code Review)
Hello Kudu Jenkins, Bankim Bhavsar,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17739

to look at the new patch set (#3).

Change subject: [helm] fix usage of multiple directories
..

[helm] fix usage of multiple directories

This patch updates the /helm implementation to use the new FS_WAL_DIR and
FS_DATA_DIRS environment variables. This patch also updates the README with
steps I used to test the changes.

I verified that I could update the number of directories in kudu/values.yaml,
confirming that the servers used multiple directories. Below is a snippet of
the logs from one tserver with multiple directories (5, set via
kudu/values.yaml). Previously, even with multiple storage devices set,
each server would use a single directory /var/lib/kudu for storage.

$ kubectl logs kudu-tserver-1
I0730 00:36:07.404302 1 tablet_server_runner.cc:78] Tablet server 
non-default flags:
--use_hybrid_clock=false
--fs_data_dirs=/mnt/disk1,/mnt/disk2,/mnt/disk3,/mnt/disk4
--fs_wal_dir=/mnt/disk0
--webserver_doc_root=/opt/kudu/www
--tserver_master_addrs=kudu-master-0.kudu-masters.default.svc.cluster.local,kudu-master-1.kudu-masters.default.svc.cluster.local,kudu-master-2.kudu-masters.default.svc.cluster.local
--heap_profile_path=/tmp/kudu.1
--stderrthreshold=0

Tablet server version:
kudu 1.16.0-SNAPSHOT
revision 2ceec7749
build type RELEASE
...

A rendered README can be found here:
https://github.com/andrwng/kudu/blob/helm_multidir/kubernetes/helm/README.adoc

Change-Id: I583dff69c809e0851052c98267740a3e70c60efa
---
M kubernetes/helm/README.adoc
M kubernetes/helm/kudu/templates/_helmutils.tpl
M kubernetes/helm/kudu/templates/kudu-service.yaml
M kubernetes/helm/kudu/values.yaml
4 files changed, 109 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/39/17739/3
--
To view, visit http://gerrit.cloudera.org:8080/17739
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I583dff69c809e0851052c98267740a3e70c60efa
Gerrit-Change-Number: 17739
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] [helm] fix usage of multiple directories

2021-07-30 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17739 )

Change subject: [helm] fix usage of multiple directories
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17739/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17739/1//COMMIT_MSG@22
PS1, Line 22: I0730 00:36:07.404302 1 tablet_server_runner.cc:78] Ta
> Just to be sure you modified storage.server.count in values.yaml to 5 while
Yeah, updated the commit message to mention that.


http://gerrit.cloudera.org:8080/#/c/17739/1/kubernetes/helm/kudu/templates/_helmutils.tpl
File kubernetes/helm/kudu/templates/_helmutils.tpl:

http://gerrit.cloudera.org:8080/#/c/17739/1/kubernetes/helm/kudu/templates/_helmutils.tpl@78
PS1, Line 78: {{range $index := untilStep 1 $num_dirs 1 -}}{{if ne $index 
1}},{{end}}/mnt/disk{{ $index }}{{end}}
> If storage.master.count happens to be 1 then the docker entry point script
The logic here works since it leaves the field empty, and so the entrypoint 
should fall back on using the WAL directory.

That said, I did find an issue with the entrypoint, fixed in the other patch.



--
To view, visit http://gerrit.cloudera.org:8080/17739
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I583dff69c809e0851052c98267740a3e70c60efa
Gerrit-Change-Number: 17739
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 30 Jul 2021 18:11:19 +
Gerrit-HasComments: Yes


[kudu-CR] [docker] KUDU-3307: allow using multiple directories

2021-07-30 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17738 )

Change subject: [docker] KUDU-3307: allow using multiple directories
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17738/2/docker/kudu-entrypoint.sh
File docker/kudu-entrypoint.sh:

http://gerrit.cloudera.org:8080/#/c/17738/2/docker/kudu-entrypoint.sh@42
PS2, Line 42: Defines the root directory to use. Subdirectories are added 
depending on whether a "
:   echo "  Kudu master or a Kudu tablet server is being deployed. 
Ignored if the FS_WAL_DIR "
:   echo "  environment variable is set."
> Nit: Might be worth adding a note about deprecating this variable.
Done


http://gerrit.cloudera.org:8080/#/c/17738/2/docker/kudu-entrypoint.sh@52
PS2, Line 52:  s
> Nit: extra "to"
Done



--
To view, visit http://gerrit.cloudera.org:8080/17738
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I20e5f75c1f1f8280dca60de75255a4a948e44be1
Gerrit-Change-Number: 17738
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 30 Jul 2021 18:11:10 +
Gerrit-HasComments: Yes


[kudu-CR] [docker] KUDU-3307: allow using multiple directories

2021-07-30 Thread Andrew Wong (Code Review)
Hello Kudu Jenkins, Bankim Bhavsar,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17738

to look at the new patch set (#3).

Change subject: [docker] KUDU-3307: allow using multiple directories
..

[docker] KUDU-3307: allow using multiple directories

Previously, the Docker entrypoint would by default use a single root
directory, suffixed with either 'master' or 'tserver'. This makes it a
poor fit for production systems that may want to use Docker, since most
reasonable deployments use multple data directories, and don't colocate
WALs and data.

This patch adds a couple of environment variables to the entrypoint,
FS_WAL_DIR and FS_DATA_DIRS, that map directly to the --fs_wal_dir and
--fs_data_dirs gflags that Kudu masters and tservers know and love.

To maintain compatibility with existing guides, the existing single-root
DATA_DIR remains, but the new environment variables, if set, take
precedence.

Change-Id: I20e5f75c1f1f8280dca60de75255a4a948e44be1
---
M docker/kudu-entrypoint.sh
1 file changed, 43 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/38/17738/3
--
To view, visit http://gerrit.cloudera.org:8080/17738
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I20e5f75c1f1f8280dca60de75255a4a948e44be1
Gerrit-Change-Number: 17738
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] [helm] fix usage of multiple directories

2021-07-30 Thread Andrew Wong (Code Review)
Hello Kudu Jenkins, Bankim Bhavsar,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17739

to look at the new patch set (#2).

Change subject: [helm] fix usage of multiple directories
..

[helm] fix usage of multiple directories

This patch updates the /helm implementation to use the new FS_WAL_DIR and
FS_DATA_DIRS environment variables. This patch also updates the README with
steps I used to test the changes.

I verified that I could update the number of directories in kudu/values.yaml,
confirming that the servers used multiple directories. Below is a snippet of
the logs from one tserver with multiple directories (5, set via
kudu/values.yaml). Previously, even with multiple storage devices set,
each server would use a single directory /var/lib/kudu for storage.

Below is a snippet of the new behavior:

$ kubectl logs kudu-tserver-1
I0730 00:36:07.404302 1 tablet_server_runner.cc:78] Tablet server 
non-default flags:
--use_hybrid_clock=false
--fs_data_dirs=/mnt/disk1,/mnt/disk2,/mnt/disk3,/mnt/disk4
--fs_wal_dir=/mnt/disk0
--webserver_doc_root=/opt/kudu/www
--tserver_master_addrs=kudu-master-0.kudu-masters.default.svc.cluster.local,kudu-master-1.kudu-masters.default.svc.cluster.local,kudu-master-2.kudu-masters.default.svc.cluster.local
--heap_profile_path=/tmp/kudu.1
--stderrthreshold=0

Tablet server version:
kudu 1.16.0-SNAPSHOT
revision 2ceec7749
build type RELEASE
...

A rendered README can be found here:
https://github.com/andrwng/kudu/blob/helm_multidir/kubernetes/helm/README.adoc

Change-Id: I583dff69c809e0851052c98267740a3e70c60efa
---
M kubernetes/helm/README.adoc
M kubernetes/helm/kudu/templates/_helmutils.tpl
M kubernetes/helm/kudu/templates/kudu-service.yaml
M kubernetes/helm/kudu/values.yaml
4 files changed, 109 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/39/17739/2
--
To view, visit http://gerrit.cloudera.org:8080/17739
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I583dff69c809e0851052c98267740a3e70c60efa
Gerrit-Change-Number: 17739
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] [helm] fix usage of multiple directories

2021-07-30 Thread Bankim Bhavsar (Code Review)
Bankim Bhavsar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17739 )

Change subject: [helm] fix usage of multiple directories
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17739/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17739/1//COMMIT_MSG@22
PS1, Line 22: --fs_data_dirs=/mnt/disk1,/mnt/disk2,/mnt/disk3,/mnt/disk4
Just to be sure you modified storage.server.count in values.yaml to 5 while 
testing your changes, right?


http://gerrit.cloudera.org:8080/#/c/17739/1/kubernetes/helm/kudu/templates/_helmutils.tpl
File kubernetes/helm/kudu/templates/_helmutils.tpl:

http://gerrit.cloudera.org:8080/#/c/17739/1/kubernetes/helm/kudu/templates/_helmutils.tpl@78
PS1, Line 78: {{range $index := untilStep 1 $num_dirs 1 -}}{{if ne $index 
1}},{{end}}/mnt/disk{{ $index }}{{end}}
If storage.master.count happens to be 1 then the docker entry point script will 
be okay, right?



-- 
To view, visit http://gerrit.cloudera.org:8080/17739
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I583dff69c809e0851052c98267740a3e70c60efa
Gerrit-Change-Number: 17739
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 30 Jul 2021 17:10:12 +
Gerrit-HasComments: Yes


[kudu-CR] [tools] Add a tool to recover master data from tablet servers

2021-07-30 Thread Andrew Wong (Code Review)
Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9490 )

Change subject: [tools] Add a tool to recover master data from tablet servers
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9490/5/src/kudu/tools/master_rebuilder.cc
File src/kudu/tools/master_rebuilder.cc:

http://gerrit.cloudera.org:8080/#/c/9490/5/src/kudu/tools/master_rebuilder.cc@175
PS5, Line 175:
> Wouldn't this brick single-replica tables? It would think it's under-replic
Ah that's fair. It should actually be max(FLAGS_min_num_replicas, 
odd_reported_replicas). Still, let's do that in a follow-up.



--
To view, visit http://gerrit.cloudera.org:8080/9490
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If29e421d466a531ebad72e281ae27e74e458f8c6
Gerrit-Change-Number: 9490
Gerrit-PatchSet: 6
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Will Berkeley 
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Comment-Date: Fri, 30 Jul 2021 17:05:35 +
Gerrit-HasComments: Yes


[kudu-CR] [docker] KUDU-3307: allow using multiple directories

2021-07-30 Thread Bankim Bhavsar (Code Review)
Bankim Bhavsar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17738 )

Change subject: [docker] KUDU-3307: allow using multiple directories
..


Patch Set 2:

(2 comments)

Thanks for taking this up, Andrew. Change looks good, only minor comments.

http://gerrit.cloudera.org:8080/#/c/17738/2/docker/kudu-entrypoint.sh
File docker/kudu-entrypoint.sh:

http://gerrit.cloudera.org:8080/#/c/17738/2/docker/kudu-entrypoint.sh@42
PS2, Line 42: Defines the root directory to use. Subdirectories are added 
depending on whether a "
:   echo "  Kudu master or a Kudu tablet server is being deployed. 
Ignored if the FS_WAL_DIR "
:   echo "  environment variable is set."
Nit: Might be worth adding a note about deprecating this variable.


http://gerrit.cloudera.org:8080/#/c/17738/2/docker/kudu-entrypoint.sh@52
PS2, Line 52: to
Nit: extra "to"



--
To view, visit http://gerrit.cloudera.org:8080/17738
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I20e5f75c1f1f8280dca60de75255a4a948e44be1
Gerrit-Change-Number: 17738
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong 
Gerrit-Reviewer: Bankim Bhavsar 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 30 Jul 2021 16:54:25 +
Gerrit-HasComments: Yes


[kudu-CR] [tools] Add a tool to recover master data from tablet servers

2021-07-30 Thread Attila Bukor (Code Review)
Attila Bukor has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9490 )

Change subject: [tools] Add a tool to recover master data from tablet servers
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9490/5/src/kudu/tools/master_rebuilder.cc
File src/kudu/tools/master_rebuilder.cc:

http://gerrit.cloudera.org:8080/#/c/9490/5/src/kudu/tools/master_rebuilder.cc@175
PS5, Line 175:
> It's good, tables in the cluster may have different RFs, FLAGS_default_num_
Wouldn't this brick single-replica tables? It would think it's under-replicated 
with no majority, so it couldn't re-replicate to 3 automatically, or am I 
missing something?



--
To view, visit http://gerrit.cloudera.org:8080/9490
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If29e421d466a531ebad72e281ae27e74e458f8c6
Gerrit-Change-Number: 9490
Gerrit-PatchSet: 6
Gerrit-Owner: Will Berkeley 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Will Berkeley 
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Comment-Date: Fri, 30 Jul 2021 13:13:59 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-3204: Add a metric for amount of available space

2021-07-30 Thread Attila Bukor (Code Review)
Attila Bukor has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17725 )

Change subject: KUDU-3204: Add a metric for amount of available space
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17725/2/src/kudu/server/server_base.cc
File src/kudu/server/server_base.cc:

http://gerrit.cloudera.org:8080/#/c/17725/2/src/kudu/server/server_base.cc@446
PS2, Line 446: // Calculates the total free space on all the data directories 
combined accounting for
> So if I am understanding it right, you are suggesting a new function which
Yes. It could be inlined too if you want to avoid the overhead of calling a 
function.



--
To view, visit http://gerrit.cloudera.org:8080/17725
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58a7419847d080498aaf431d1dab098e1af71ad0
Gerrit-Change-Number: 17725
Gerrit-PatchSet: 3
Gerrit-Owner: Abhishek Chennaka 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 30 Jul 2021 13:12:09 +
Gerrit-HasComments: Yes


[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.

2021-07-30 Thread Mahesh Reddy (Code Review)
Mahesh Reddy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17643 )

Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash 
schemas.
..


Patch Set 4:

(23 comments)

http://gerrit.cloudera.org:8080/#/c/17643/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17643/3//COMMIT_MSG@13
PS3, Line 13:
: - Adding predicates (e.g. range/equality)
: - Setting lower and upper bound primary keys
: - Setting lower and upper bound partition keys
:
: This patch introduces changes that make the first two methods
:
> nit: remove the extra newlines between each bullet?
Done


http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition.cc
File src/kudu/common/partition.cc:

http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition.cc@1109
PS3, Line 1109: y_end(),
> nit: parameter's name is different between the declaration and the definiti
Done


http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition.cc@1147
PS3, Line 1147: Status PartitionSchema::EncodeColumns(const ConstContiguousRow& 
row,
  :   const vector& 
column_ids,
  :   string* buf) {
  :   for (int i = 0; i < column_ids.size(); i++) {
  : ColumnId column_id = column_ids[i];
  :
> +1
Will move these changes to a separate patch.


http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc
File src/kudu/common/partition_pruner-test.cc:

http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1056
PS3, Line 1056: .column(0), ),
> nit: consider a typedef for this? Maybe ColumnNamesNumBucketsAndSeed or som
Done


http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1079
PS3, Line 1079:   check({ ColumnPredicate::Equality
> nit: the indent for this line is off
Done


http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1100
PS3, Line 1100: Part
> nit: for pointers, prefer being explicit and use auto*
Done


http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1103
PS3, Line 1103: nge_
> nit: same here
Done


http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1103
PS3, Line 1103: chema->add_col
> nit: to avoid any ambiguity with hash_bucket_schemas, maybe call this hash_
Done


http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1304
PS3, Line 1304:
  :   // C < "a"
  :   check({ ColumnPredicate::Range(schema.column(2), nullptr, 
)}, "", "", 0, 0);
  :
  :   // C >= "m"
  :   check({ ColumnPredicate::Range(schema.column(2), , 
nullptr)}, "", "", 0, 0);
  :
  :   // partition key >= (hash=1, hash=0)
  :   check({}, string("\0\0\0\1\0\0\0\0", 8), "", 8, 8);
  :
  :   // partition key
> Do you mind adding a scenario for a scheme like this but with custom hash b
Done


http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1326
PS3, Line 1326:
> nit here and below where std::make_pair() is used: would it be enough to ha
Done


http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1355
PS3, Line 1355:
> nit: add 'const' and turn into all-lowercase
Done


http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1366
PS3, Line 1366:
> Why not to use ASSERT_OK() here?
I just reused the existing test code that uses CHECK_OK(). What's the 
difference between two?


http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1381
PS3, Line 1381: size_
> nit: add NO_FATALS() here?
Done


http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1411
PS3, Line 1411:   // PK < (2, 2, min)
> Do you mind adding a scenario where the range for the primary key should le
Done


http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1428
PS3, Line 1428:  int8_t>(4, 4, 0), 7, 7);
> nit: would {columns, 3, 0} be enough here?
nope, have to explicitly use `make_tuple`.


http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1478
PS3, Line 1478:
  :   ASSERT_OK(PartitionSchema::FromPB(pb, schema, 
_schema));
  :
> nit: make these constexpr?
Done


http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner.h
File src/kudu/common/partition_pruner.h:

http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner.h@49
PS3, Line 49:
> nit: it should be 4 spaces here per style guide
Done



[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.

2021-07-30 Thread Mahesh Reddy (Code Review)
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Andrew Wong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17643

to look at the new patch set (#4).

Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash 
schemas.
..

[pruning] KUDU-2671: Pruning compatible with custom hash schemas.

This patch introduces changes to the PartitionPruner class
to be compatible with custom hash bucket schemas per range.

There are three ways to set bounds on a scan.

- Adding predicates (e.g. range/equality)
- Setting lower and upper bound primary keys
- Setting lower and upper bound partition keys

This patch introduces changes that make the first two methods
of setting bounds on a scan compatible with custom hash bucket
schemas per range. The last way using partition keys is unstable
and for internal use only. While it's not necessary for the last
way to be compatible with per range hash bucket schemas, the
entire pruning functionality will not be complete until
PartitionPruner::RemovePartitionKeyRange() is modified.
That work will be done in a follow up patch.

Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7
---
M src/kudu/common/partial_row.h
M src/kudu/common/partition_pruner-test.cc
M src/kudu/common/partition_pruner.cc
M src/kudu/common/partition_pruner.h
4 files changed, 913 insertions(+), 315 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/43/17643/4
--
To view, visit http://gerrit.cloudera.org:8080/17643
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7
Gerrit-Change-Number: 17643
Gerrit-PatchSet: 4
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Andrew Wong 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Tidy Bot (241)