[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/17643 ) Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash schemas. .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner-test.cc File src/kudu/common/partition_pruner-test.cc: http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner-test.cc@1491 PS6, Line 1491: } // namespace kudu Also, does it make sense to add a few test cases to cover situations when range partitions consist just of a single element in one of its PK dimension? E.g., for PK (int32, string) have a table range partitioned by int32, and hash bucketed in different number of buckets based on string per range? -- To view, visit http://gerrit.cloudera.org:8080/17643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7 Gerrit-Change-Number: 17643 Gerrit-PatchSet: 6 Gerrit-Owner: Mahesh Reddy Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Sat, 31 Jul 2021 02:23:36 + Gerrit-HasComments: Yes
[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/17643 ) Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash schemas. .. Patch Set 6: (7 comments) http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner-test.cc File src/kudu/common/partition_pruner-test.cc: http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner-test.cc@1490 PS6, Line 1490: } I'm not sure whether I saw testcases to cover situations when scan_spec.CanShortCircuit() in the very beginning of the PartitionPruner::Init() method would return 'true' in case of using custom hash bucket schemas per range partition. Does it make sense to add a few to cover the cases to cover various cases when ScanSpec::CanShortCircuit() returns 'true'? http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc File src/kudu/common/partition_pruner.cc: http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc@241 PS6, Line 241: ColumnId nit: could use 'auto' here as well? http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc@243 PS6, Line 243: ColumnPredicate *predicate nit: per code style, the asterisk should stick to the type, not variable http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc@521 PS6, Line 521: range nit: for better readability, maybe name this iterator 'range_it'? http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc@521 PS6, Line 521: partition_key_range.second nit: could establish a reference variable to refer to partition_key_range.second for brevity: auto& key_range = partition_key_range.second; http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc@523 PS6, Line 523: range++ nit: prefer using prefix increment operators -- they might be faster because they don't need to store the intermediate result in a temporary variable (you could check the implementation of prefix and postfix operators for STD containers' iterators) http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc@557 PS6, Line 557: && nit: move '&&' to the previous line -- To view, visit http://gerrit.cloudera.org:8080/17643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7 Gerrit-Change-Number: 17643 Gerrit-PatchSet: 6 Gerrit-Owner: Mahesh Reddy Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Sat, 31 Jul 2021 02:12:04 + Gerrit-HasComments: Yes
[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/17643 ) Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash schemas. .. Patch Set 6: (4 comments) http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc File src/kudu/common/partition_pruner-test.cc: http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1366 PS3, Line 1366: > Bad asserts result in the test being marked as failed, while a CHECK failur Yep, I agree that ASSERT_OK() is a better choice here. AFAIK, the only case where resorting to CHECK_OK() is a good option is if ASSERT_OK() isn't working as expected if the code is run by other than the main test thread. http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.h File src/kudu/common/partition_pruner.h: http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.h@46 PS6, Line 46: LowerUpperRangeBounds nit: why not just RangeBounds -- could there be any other than lower and upper bounds for a range? http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.h@48 PS6, Line 48: std::pair What is semantic significance of using std::pair instead of struct here? Does the first pair's component somehow complements the second and vice versa? http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.h@71 PS6, Line 71: size_t num_ranges = 0; : for (const auto& range: partition_key_ranges_) { : num_ranges += range.second.size(); : } : return num_ranges; I see this method is indirectly called in NextPartitionKey() method, and that would result in n^2 complexity while building, scan tokens. Is it possible to compute this once in Init() method and then use the pre-computed result elsewhere? -- To view, visit http://gerrit.cloudera.org:8080/17643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7 Gerrit-Change-Number: 17643 Gerrit-PatchSet: 6 Gerrit-Owner: Mahesh Reddy Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Sat, 31 Jul 2021 01:18:04 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-3304: [Alter] Support to alter table's replication factor
Yingchun Lai has posted comments on this change. ( http://gerrit.cloudera.org:8080/17674 ) Change subject: KUDU-3304: [Alter] Support to alter table's replication factor .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/17674/4/src/kudu/client/client.h File src/kudu/client/client.h: http://gerrit.cloudera.org:8080/#/c/17674/4/src/kudu/client/client.h@1721 PS4, Line 1721: : /// Set the table's replication factor. : /// : /// @note The replication factor should be a odd number and range in : /// [1, --max_num_replicas] : /// : /// @param [in] replication_factor : /// The table's new replication factor. > Alexey and I discussed this offline at length. Introducing this method as a Thanks for your suggestion, not expose to C++ client API and just use internal methods maybe a better way to resolve this issue, I‘ll try to update the patch. -- To view, visit http://gerrit.cloudera.org:8080/17674 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3aa2d5b12c508ba761fa9410ad1a465cf31fb7d7 Gerrit-Change-Number: 17674 Gerrit-PatchSet: 7 Gerrit-Owner: Yingchun Lai Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yingchun Lai Gerrit-Comment-Date: Sat, 31 Jul 2021 00:59:30 + Gerrit-HasComments: Yes
[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17643 ) Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash schemas. .. Patch Set 6: Verified+1 Test failures were unrelated to this patch. -- To view, visit http://gerrit.cloudera.org:8080/17643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7 Gerrit-Change-Number: 17643 Gerrit-PatchSet: 6 Gerrit-Owner: Mahesh Reddy Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Sat, 31 Jul 2021 00:06:01 + Gerrit-HasComments: No
[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.
Andrew Wong has removed a vote on this change. Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash schemas. .. Removed Verified-1 by Kudu Jenkins (120) -- To view, visit http://gerrit.cloudera.org:8080/17643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7 Gerrit-Change-Number: 17643 Gerrit-PatchSet: 6 Gerrit-Owner: Mahesh Reddy Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] KUDU-3304: [Alter] Support to alter table's replication factor
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17674 ) Change subject: KUDU-3304: [Alter] Support to alter table's replication factor .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/17674/4/src/kudu/client/client.h File src/kudu/client/client.h: http://gerrit.cloudera.org:8080/#/c/17674/4/src/kudu/client/client.h@1721 PS4, Line 1721: : /// Set the table's replication factor. : /// : /// @note The replication factor should be a odd number and range in : /// [1, --max_num_replicas] : /// : /// @param [in] replication_factor : /// The table's new replication factor. > Right -- in some aspects it's close to CreateTable() and DDL operations per Alexey and I discussed this offline at length. Introducing this method as a part of public API has deeper repercussions that are worth pondering. We strive to have our programatic public interface be as stable as possible, so ironing out these details up front would help alleviate much of the concern (or see below for an alternate path). - As pointed out, the semantics of altering the replication factor are a bit fuzzy, and if thinking about replication factor as a first class citizen of metadata that can be altered, there are things we should probably iron out before pushing this into the public client API. Namely, how long should the wait() method actually wait (majority replicated? fully replicated?)? And how should this metadata be synchronized with the HMS, if using Impala and/or Hive? These are certainly questions we can answer now (just spitballing, but we could update IsAlterTableDone() to check replica count), but my point is that these are questions that should be thought through before committing to public API. - Given the expectations of altering the replication factor may be different from other kinds of alter DDL, should this be a part of the KuduTableAlterer API at all, or should we introduce some other API that encompasses only changing replication factor? That way, we could build something more tailored replication factor (e.g. wait for full replication, majority replication, etc.). And beyond that, there may be other use cases to consider, like writing a single-replica table and then increasing the number of replicas once fully compacted as a means to potentially faster replication. Public API aside, if the main goal is a means to bump the replication factor via tooling, an alternate route would be to not expose this as C++ client API just yet, and instead rely on friendship and internal methods to alter the replication factor. We do something similar to set table types for the transactions system table, since table types are also not exposed to public API yet. See [1] for an example. Going this route would still allow us to introduce tooling still, but not lock us into a client API. Then, as we fix any rough edges with tooling, we can learn and deliberate on what the semantics for the public API ought to be, without releasing anything too unstable. [1] https://github.com/apache/kudu/blob/add7cb99fc8d6ab586e27d0ecd7d1698f0f45ae3/src/kudu/transactions/txn_system_client.cc#L130 -- To view, visit http://gerrit.cloudera.org:8080/17674 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3aa2d5b12c508ba761fa9410ad1a465cf31fb7d7 Gerrit-Change-Number: 17674 Gerrit-PatchSet: 7 Gerrit-Owner: Yingchun Lai Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yingchun Lai Gerrit-Comment-Date: Fri, 30 Jul 2021 23:58:37 + Gerrit-HasComments: Yes
[kudu-CR] [common] style update on schema.{cc,h} and partition.cc
Alexey Serbin has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17742 ) Change subject: [common] style update on schema.{cc,h} and partition.cc .. [common] style update on schema.{cc,h} and partition.cc While I was updating the code around, I decided to separate changes related to the project's style guide and comparison of signed/unsigned integers into a separate patch. In addition, I updated the code to consistently use dedicated constants (like kColumnNotFound) throughout the code in schema.cc. This changelist doesn't contain any functional changes. Change-Id: I55d02fed53a39d311b43ea7bd956316abc998f60 Reviewed-on: http://gerrit.cloudera.org:8080/17742 Reviewed-by: Andrew Wong Tested-by: Kudu Jenkins --- M src/kudu/common/partition.cc M src/kudu/common/schema.cc M src/kudu/common/schema.h 3 files changed, 82 insertions(+), 86 deletions(-) Approvals: Andrew Wong: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/17742 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I55d02fed53a39d311b43ea7bd956316abc998f60 Gerrit-Change-Number: 17742 Gerrit-PatchSet: 2 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy
[kudu-CR] [common] style update on schema.{cc,h} and partition.cc
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17742 ) Change subject: [common] style update on schema.{cc,h} and partition.cc .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17742 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I55d02fed53a39d311b43ea7bd956316abc998f60 Gerrit-Change-Number: 17742 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Comment-Date: Fri, 30 Jul 2021 21:11:39 + Gerrit-HasComments: No
[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.
Mahesh Reddy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17643 ) Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash schemas. .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/17643/5/src/kudu/common/partition_pruner.cc File src/kudu/common/partition_pruner.cc: http://gerrit.cloudera.org:8080/#/c/17643/5/src/kudu/common/partition_pruner.cc@269 PS5, Line 269: constrained_index = hash_bucket_schemas.size() - > warning: narrowing conversion from 'unsigned long' to signed type 'int' is Done -- To view, visit http://gerrit.cloudera.org:8080/17643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7 Gerrit-Change-Number: 17643 Gerrit-PatchSet: 6 Gerrit-Owner: Mahesh Reddy Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Fri, 30 Jul 2021 20:36:51 + Gerrit-HasComments: Yes
[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Andrew Wong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17643 to look at the new patch set (#6). Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash schemas. .. [pruning] KUDU-2671: Pruning compatible with custom hash schemas. This patch introduces changes to the PartitionPruner class to be compatible with custom hash bucket schemas per range. There are three ways to set bounds on a scan. - Adding predicates (e.g. range/equality) - Setting lower and upper bound primary keys - Setting lower and upper bound partition keys This patch introduces changes that make the first two methods of setting bounds on a scan compatible with custom hash bucket schemas per range. The last way using partition keys is unstable and for internal use only. While it's not necessary for the last way to be compatible with per range hash bucket schemas, the entire pruning functionality will not be complete until PartitionPruner::RemovePartitionKeyRange() is modified. That work will be done in a follow up patch. Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7 --- M src/kudu/common/partial_row.h M src/kudu/common/partition_pruner-test.cc M src/kudu/common/partition_pruner.cc M src/kudu/common/partition_pruner.h 4 files changed, 918 insertions(+), 373 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/43/17643/6 -- To view, visit http://gerrit.cloudera.org:8080/17643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7 Gerrit-Change-Number: 17643 Gerrit-PatchSet: 6 Gerrit-Owner: Mahesh Reddy Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] [common] style update on schema.{cc,h} and partition.cc
Alexey Serbin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17742 Change subject: [common] style update on schema.{cc,h} and partition.cc .. [common] style update on schema.{cc,h} and partition.cc While I was updating the code around, I decided to separate changes related to the project's style guide and comparison of signed/unsigned integers into a separate patch. In addition, I updated the code to consistently use dedicated constants (like kColumnNotFound) throughout the code in schema.cc. This changelist doesn't contain any functional changes. Change-Id: I55d02fed53a39d311b43ea7bd956316abc998f60 --- M src/kudu/common/partition.cc M src/kudu/common/schema.cc M src/kudu/common/schema.h 3 files changed, 82 insertions(+), 86 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/42/17742/1 -- To view, visit http://gerrit.cloudera.org:8080/17742 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I55d02fed53a39d311b43ea7bd956316abc998f60 Gerrit-Change-Number: 17742 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin
[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.
Mahesh Reddy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17643 ) Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash schemas. .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc File src/kudu/common/partition_pruner-test.cc: http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1366 PS3, Line 1366: > Bad asserts result in the test being marked as failed, while a CHECK failur Thanks for the clarification. http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1428 PS3, Line 1428: > Just to clarify the suggestion is: Done http://gerrit.cloudera.org:8080/#/c/17643/4/src/kudu/common/partition_pruner.cc File src/kudu/common/partition_pruner.cc: http://gerrit.cloudera.org:8080/#/c/17643/4/src/kudu/common/partition_pruner.cc@481 PS4, Line 481: auto& > nit: ampersand should be right after "auto" Done -- To view, visit http://gerrit.cloudera.org:8080/17643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7 Gerrit-Change-Number: 17643 Gerrit-PatchSet: 5 Gerrit-Owner: Mahesh Reddy Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Fri, 30 Jul 2021 20:25:52 + Gerrit-HasComments: Yes
[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Andrew Wong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17643 to look at the new patch set (#5). Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash schemas. .. [pruning] KUDU-2671: Pruning compatible with custom hash schemas. This patch introduces changes to the PartitionPruner class to be compatible with custom hash bucket schemas per range. There are three ways to set bounds on a scan. - Adding predicates (e.g. range/equality) - Setting lower and upper bound primary keys - Setting lower and upper bound partition keys This patch introduces changes that make the first two methods of setting bounds on a scan compatible with custom hash bucket schemas per range. The last way using partition keys is unstable and for internal use only. While it's not necessary for the last way to be compatible with per range hash bucket schemas, the entire pruning functionality will not be complete until PartitionPruner::RemovePartitionKeyRange() is modified. That work will be done in a follow up patch. Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7 --- M src/kudu/common/partial_row.h M src/kudu/common/partition_pruner-test.cc M src/kudu/common/partition_pruner.cc M src/kudu/common/partition_pruner.h 4 files changed, 918 insertions(+), 373 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/43/17643/5 -- To view, visit http://gerrit.cloudera.org:8080/17643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7 Gerrit-Change-Number: 17643 Gerrit-PatchSet: 5 Gerrit-Owner: Mahesh Reddy Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] [WIP] KUDU-1959 - Implement server startup progress page for tablet and master servers
Abhishek Chennaka has posted comments on this change. ( http://gerrit.cloudera.org:8080/17730 ) Change subject: [WIP] KUDU-1959 - Implement server startup progress page for tablet and master servers .. Patch Set 2: The patch set 2 has all the previously pointed out unaddressed things addressed -- To view, visit http://gerrit.cloudera.org:8080/17730 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1db1fcf16261d4ced1b3657a697766a5335271b4 Gerrit-Change-Number: 17730 Gerrit-PatchSet: 2 Gerrit-Owner: Abhishek Chennaka Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Fri, 30 Jul 2021 19:02:42 + Gerrit-HasComments: No
[kudu-CR] [WIP] KUDU-1959 - Implement server startup progress page for tablet and master servers
Hello Tidy Bot, Attila Bukor, Kudu Jenkins, Andrew Wong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17730 to look at the new patch set (#2). Change subject: [WIP] KUDU-1959 - Implement server startup progress page for tablet and master servers .. [WIP] KUDU-1959 - Implement server startup progress page for tablet and master servers This task is broken down into the below tasks: 1. Adjust server startup to start webserver displaying static startup page 2. Implement startup WebUI using existing data 3. Enhance the startup WebUI with missing data 4. Expose startup WebUI data to metrics 5. Expose the metrics to CM for health monitoring This patch implements the step 1 from above - Adjust the server startup sequence to startup the web server and load an empty startup page right before the filesystem is read, tablets are bootstrapped and rpc server is started. This startup page will be used to display the startup progress which will be implemented in following patches. Manually validated the startup when security is enabled and filesystem is not present. The footer of the WebUI has the UUID by default. In case of new instances, since we are starting the Webserver before the UUID is generated the footer doesn’t contain UUID but it is set later when all the other pages are loaded. Change-Id: I1db1fcf16261d4ced1b3657a697766a5335271b4 --- M src/kudu/server/CMakeLists.txt M src/kudu/server/server_base.cc A src/kudu/server/startup_path_handler.cc A src/kudu/server/startup_path_handler.h M src/kudu/server/tracing_path_handlers.h 5 files changed, 127 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/30/17730/2 -- To view, visit http://gerrit.cloudera.org:8080/17730 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1db1fcf16261d4ced1b3657a697766a5335271b4 Gerrit-Change-Number: 17730 Gerrit-PatchSet: 2 Gerrit-Owner: Abhishek Chennaka Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17643 ) Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash schemas. .. Patch Set 4: I'll look into the test -- it seems pretty flaky (the test dashboard says 11%!). -- To view, visit http://gerrit.cloudera.org:8080/17643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7 Gerrit-Change-Number: 17643 Gerrit-PatchSet: 4 Gerrit-Owner: Mahesh Reddy Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Fri, 30 Jul 2021 18:59:53 + Gerrit-HasComments: No
[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17643 ) Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash schemas. .. Patch Set 4: Verified+1 (3 comments) http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc File src/kudu/common/partition_pruner-test.cc: http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1366 PS3, Line 1366: > I just reused the existing test code that uses CHECK_OK(). What's the diffe Bad asserts result in the test being marked as failed, while a CHECK failure results in the process crashing. A failed test is much easier to diagnose and report compared to a crash. And depending on the test, e.g. in a multi-process test like one that uses an ExternalMiniCluster, a crash may not even result in a test failure without explicit checking for crashed processes. http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1428 PS3, Line 1428: int8_t>(4, 4, 0), 7, 7); > nope, have to explicitly use `make_tuple`. Just to clarify the suggestion is: CreatePartitionSchemaPB({"A"}, { { columns, 3, 0 } }, ); or CreatePartitionSchemaPB({"A"}, { { { "A", "B" }, 3, 0 } }, ); which _should_ work. I think compilers have issues with inferring types for optionals (e.g. in the check() lambdas), but that shouldn't be the case here or with AddRangePartitionWithSchema(). http://gerrit.cloudera.org:8080/#/c/17643/4/src/kudu/common/partition_pruner.cc File src/kudu/common/partition_pruner.cc: http://gerrit.cloudera.org:8080/#/c/17643/4/src/kudu/common/partition_pruner.cc@481 PS4, Line 481: auto & nit: ampersand should be right after "auto" -- To view, visit http://gerrit.cloudera.org:8080/17643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7 Gerrit-Change-Number: 17643 Gerrit-PatchSet: 4 Gerrit-Owner: Mahesh Reddy Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Fri, 30 Jul 2021 18:57:22 + Gerrit-HasComments: Yes
[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.
Andrew Wong has removed a vote on this change. Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash schemas. .. Removed Verified-1 by Kudu Jenkins (120) -- To view, visit http://gerrit.cloudera.org:8080/17643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7 Gerrit-Change-Number: 17643 Gerrit-PatchSet: 4 Gerrit-Owner: Mahesh Reddy Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Tidy Bot (241)
[kudu-CR] [helm] fix usage of multiple directories
Andrew Wong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17739 ) Change subject: [helm] fix usage of multiple directories .. [helm] fix usage of multiple directories This patch updates the /helm implementation to use the new FS_WAL_DIR and FS_DATA_DIRS environment variables. This patch also updates the README with steps I used to test the changes. I verified that I could update the number of directories in kudu/values.yaml, confirming that the servers used multiple directories. Below is a snippet of the logs from one tserver with multiple directories (5, set via kudu/values.yaml). Previously, even with multiple storage devices set, each server would use a single directory /var/lib/kudu for storage. $ kubectl logs kudu-tserver-1 I0730 00:36:07.404302 1 tablet_server_runner.cc:78] Tablet server non-default flags: --use_hybrid_clock=false --fs_data_dirs=/mnt/disk1,/mnt/disk2,/mnt/disk3,/mnt/disk4 --fs_wal_dir=/mnt/disk0 --webserver_doc_root=/opt/kudu/www --tserver_master_addrs=kudu-master-0.kudu-masters.default.svc.cluster.local,kudu-master-1.kudu-masters.default.svc.cluster.local,kudu-master-2.kudu-masters.default.svc.cluster.local --heap_profile_path=/tmp/kudu.1 --stderrthreshold=0 Tablet server version: kudu 1.16.0-SNAPSHOT revision 2ceec7749 build type RELEASE ... A rendered README can be found here: https://github.com/andrwng/kudu/blob/helm_multidir/kubernetes/helm/README.adoc Change-Id: I583dff69c809e0851052c98267740a3e70c60efa --- M kubernetes/helm/README.adoc M kubernetes/helm/kudu/templates/_helmutils.tpl M kubernetes/helm/kudu/templates/kudu-service.yaml M kubernetes/helm/kudu/values.yaml 4 files changed, 109 insertions(+), 23 deletions(-) Approvals: Kudu Jenkins: Verified Bankim Bhavsar: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/17739 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I583dff69c809e0851052c98267740a3e70c60efa Gerrit-Change-Number: 17739 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [docker] KUDU-3307: allow using multiple directories
Andrew Wong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17738 ) Change subject: [docker] KUDU-3307: allow using multiple directories .. [docker] KUDU-3307: allow using multiple directories Previously, the Docker entrypoint would by default use a single root directory, suffixed with either 'master' or 'tserver'. This makes it a poor fit for production systems that may want to use Docker, since most reasonable deployments use multple data directories, and don't colocate WALs and data. This patch adds a couple of environment variables to the entrypoint, FS_WAL_DIR and FS_DATA_DIRS, that map directly to the --fs_wal_dir and --fs_data_dirs gflags that Kudu masters and tservers know and love. To maintain compatibility with existing guides, the existing single-root DATA_DIR remains, but the new environment variables, if set, take precedence. Change-Id: I20e5f75c1f1f8280dca60de75255a4a948e44be1 Reviewed-on: http://gerrit.cloudera.org:8080/17738 Tested-by: Kudu Jenkins Reviewed-by: Bankim Bhavsar --- M docker/kudu-entrypoint.sh 1 file changed, 43 insertions(+), 6 deletions(-) Approvals: Kudu Jenkins: Verified Bankim Bhavsar: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/17738 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I20e5f75c1f1f8280dca60de75255a4a948e44be1 Gerrit-Change-Number: 17738 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [helm] fix usage of multiple directories
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17739 ) Change subject: [helm] fix usage of multiple directories .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17739 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I583dff69c809e0851052c98267740a3e70c60efa Gerrit-Change-Number: 17739 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Fri, 30 Jul 2021 18:18:31 + Gerrit-HasComments: No
[kudu-CR] [docker] KUDU-3307: allow using multiple directories
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17738 ) Change subject: [docker] KUDU-3307: allow using multiple directories .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17738 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I20e5f75c1f1f8280dca60de75255a4a948e44be1 Gerrit-Change-Number: 17738 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Fri, 30 Jul 2021 18:15:46 + Gerrit-HasComments: No
[kudu-CR] [helm] fix usage of multiple directories
Hello Kudu Jenkins, Bankim Bhavsar, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17739 to look at the new patch set (#3). Change subject: [helm] fix usage of multiple directories .. [helm] fix usage of multiple directories This patch updates the /helm implementation to use the new FS_WAL_DIR and FS_DATA_DIRS environment variables. This patch also updates the README with steps I used to test the changes. I verified that I could update the number of directories in kudu/values.yaml, confirming that the servers used multiple directories. Below is a snippet of the logs from one tserver with multiple directories (5, set via kudu/values.yaml). Previously, even with multiple storage devices set, each server would use a single directory /var/lib/kudu for storage. $ kubectl logs kudu-tserver-1 I0730 00:36:07.404302 1 tablet_server_runner.cc:78] Tablet server non-default flags: --use_hybrid_clock=false --fs_data_dirs=/mnt/disk1,/mnt/disk2,/mnt/disk3,/mnt/disk4 --fs_wal_dir=/mnt/disk0 --webserver_doc_root=/opt/kudu/www --tserver_master_addrs=kudu-master-0.kudu-masters.default.svc.cluster.local,kudu-master-1.kudu-masters.default.svc.cluster.local,kudu-master-2.kudu-masters.default.svc.cluster.local --heap_profile_path=/tmp/kudu.1 --stderrthreshold=0 Tablet server version: kudu 1.16.0-SNAPSHOT revision 2ceec7749 build type RELEASE ... A rendered README can be found here: https://github.com/andrwng/kudu/blob/helm_multidir/kubernetes/helm/README.adoc Change-Id: I583dff69c809e0851052c98267740a3e70c60efa --- M kubernetes/helm/README.adoc M kubernetes/helm/kudu/templates/_helmutils.tpl M kubernetes/helm/kudu/templates/kudu-service.yaml M kubernetes/helm/kudu/values.yaml 4 files changed, 109 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/39/17739/3 -- To view, visit http://gerrit.cloudera.org:8080/17739 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I583dff69c809e0851052c98267740a3e70c60efa Gerrit-Change-Number: 17739 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [helm] fix usage of multiple directories
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17739 ) Change subject: [helm] fix usage of multiple directories .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/17739/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17739/1//COMMIT_MSG@22 PS1, Line 22: I0730 00:36:07.404302 1 tablet_server_runner.cc:78] Ta > Just to be sure you modified storage.server.count in values.yaml to 5 while Yeah, updated the commit message to mention that. http://gerrit.cloudera.org:8080/#/c/17739/1/kubernetes/helm/kudu/templates/_helmutils.tpl File kubernetes/helm/kudu/templates/_helmutils.tpl: http://gerrit.cloudera.org:8080/#/c/17739/1/kubernetes/helm/kudu/templates/_helmutils.tpl@78 PS1, Line 78: {{range $index := untilStep 1 $num_dirs 1 -}}{{if ne $index 1}},{{end}}/mnt/disk{{ $index }}{{end}} > If storage.master.count happens to be 1 then the docker entry point script The logic here works since it leaves the field empty, and so the entrypoint should fall back on using the WAL directory. That said, I did find an issue with the entrypoint, fixed in the other patch. -- To view, visit http://gerrit.cloudera.org:8080/17739 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I583dff69c809e0851052c98267740a3e70c60efa Gerrit-Change-Number: 17739 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Fri, 30 Jul 2021 18:11:19 + Gerrit-HasComments: Yes
[kudu-CR] [docker] KUDU-3307: allow using multiple directories
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17738 ) Change subject: [docker] KUDU-3307: allow using multiple directories .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/17738/2/docker/kudu-entrypoint.sh File docker/kudu-entrypoint.sh: http://gerrit.cloudera.org:8080/#/c/17738/2/docker/kudu-entrypoint.sh@42 PS2, Line 42: Defines the root directory to use. Subdirectories are added depending on whether a " : echo " Kudu master or a Kudu tablet server is being deployed. Ignored if the FS_WAL_DIR " : echo " environment variable is set." > Nit: Might be worth adding a note about deprecating this variable. Done http://gerrit.cloudera.org:8080/#/c/17738/2/docker/kudu-entrypoint.sh@52 PS2, Line 52: s > Nit: extra "to" Done -- To view, visit http://gerrit.cloudera.org:8080/17738 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I20e5f75c1f1f8280dca60de75255a4a948e44be1 Gerrit-Change-Number: 17738 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Fri, 30 Jul 2021 18:11:10 + Gerrit-HasComments: Yes
[kudu-CR] [docker] KUDU-3307: allow using multiple directories
Hello Kudu Jenkins, Bankim Bhavsar, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17738 to look at the new patch set (#3). Change subject: [docker] KUDU-3307: allow using multiple directories .. [docker] KUDU-3307: allow using multiple directories Previously, the Docker entrypoint would by default use a single root directory, suffixed with either 'master' or 'tserver'. This makes it a poor fit for production systems that may want to use Docker, since most reasonable deployments use multple data directories, and don't colocate WALs and data. This patch adds a couple of environment variables to the entrypoint, FS_WAL_DIR and FS_DATA_DIRS, that map directly to the --fs_wal_dir and --fs_data_dirs gflags that Kudu masters and tservers know and love. To maintain compatibility with existing guides, the existing single-root DATA_DIR remains, but the new environment variables, if set, take precedence. Change-Id: I20e5f75c1f1f8280dca60de75255a4a948e44be1 --- M docker/kudu-entrypoint.sh 1 file changed, 43 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/38/17738/3 -- To view, visit http://gerrit.cloudera.org:8080/17738 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I20e5f75c1f1f8280dca60de75255a4a948e44be1 Gerrit-Change-Number: 17738 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [helm] fix usage of multiple directories
Hello Kudu Jenkins, Bankim Bhavsar, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17739 to look at the new patch set (#2). Change subject: [helm] fix usage of multiple directories .. [helm] fix usage of multiple directories This patch updates the /helm implementation to use the new FS_WAL_DIR and FS_DATA_DIRS environment variables. This patch also updates the README with steps I used to test the changes. I verified that I could update the number of directories in kudu/values.yaml, confirming that the servers used multiple directories. Below is a snippet of the logs from one tserver with multiple directories (5, set via kudu/values.yaml). Previously, even with multiple storage devices set, each server would use a single directory /var/lib/kudu for storage. Below is a snippet of the new behavior: $ kubectl logs kudu-tserver-1 I0730 00:36:07.404302 1 tablet_server_runner.cc:78] Tablet server non-default flags: --use_hybrid_clock=false --fs_data_dirs=/mnt/disk1,/mnt/disk2,/mnt/disk3,/mnt/disk4 --fs_wal_dir=/mnt/disk0 --webserver_doc_root=/opt/kudu/www --tserver_master_addrs=kudu-master-0.kudu-masters.default.svc.cluster.local,kudu-master-1.kudu-masters.default.svc.cluster.local,kudu-master-2.kudu-masters.default.svc.cluster.local --heap_profile_path=/tmp/kudu.1 --stderrthreshold=0 Tablet server version: kudu 1.16.0-SNAPSHOT revision 2ceec7749 build type RELEASE ... A rendered README can be found here: https://github.com/andrwng/kudu/blob/helm_multidir/kubernetes/helm/README.adoc Change-Id: I583dff69c809e0851052c98267740a3e70c60efa --- M kubernetes/helm/README.adoc M kubernetes/helm/kudu/templates/_helmutils.tpl M kubernetes/helm/kudu/templates/kudu-service.yaml M kubernetes/helm/kudu/values.yaml 4 files changed, 109 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/39/17739/2 -- To view, visit http://gerrit.cloudera.org:8080/17739 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I583dff69c809e0851052c98267740a3e70c60efa Gerrit-Change-Number: 17739 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120)
[kudu-CR] [helm] fix usage of multiple directories
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17739 ) Change subject: [helm] fix usage of multiple directories .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/17739/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17739/1//COMMIT_MSG@22 PS1, Line 22: --fs_data_dirs=/mnt/disk1,/mnt/disk2,/mnt/disk3,/mnt/disk4 Just to be sure you modified storage.server.count in values.yaml to 5 while testing your changes, right? http://gerrit.cloudera.org:8080/#/c/17739/1/kubernetes/helm/kudu/templates/_helmutils.tpl File kubernetes/helm/kudu/templates/_helmutils.tpl: http://gerrit.cloudera.org:8080/#/c/17739/1/kubernetes/helm/kudu/templates/_helmutils.tpl@78 PS1, Line 78: {{range $index := untilStep 1 $num_dirs 1 -}}{{if ne $index 1}},{{end}}/mnt/disk{{ $index }}{{end}} If storage.master.count happens to be 1 then the docker entry point script will be okay, right? -- To view, visit http://gerrit.cloudera.org:8080/17739 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I583dff69c809e0851052c98267740a3e70c60efa Gerrit-Change-Number: 17739 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Fri, 30 Jul 2021 17:10:12 + Gerrit-HasComments: Yes
[kudu-CR] [tools] Add a tool to recover master data from tablet servers
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9490 ) Change subject: [tools] Add a tool to recover master data from tablet servers .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/9490/5/src/kudu/tools/master_rebuilder.cc File src/kudu/tools/master_rebuilder.cc: http://gerrit.cloudera.org:8080/#/c/9490/5/src/kudu/tools/master_rebuilder.cc@175 PS5, Line 175: > Wouldn't this brick single-replica tables? It would think it's under-replic Ah that's fair. It should actually be max(FLAGS_min_num_replicas, odd_reported_replicas). Still, let's do that in a follow-up. -- To view, visit http://gerrit.cloudera.org:8080/9490 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If29e421d466a531ebad72e281ae27e74e458f8c6 Gerrit-Change-Number: 9490 Gerrit-PatchSet: 6 Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Will Berkeley Gerrit-Reviewer: Yingchun Lai Gerrit-Comment-Date: Fri, 30 Jul 2021 17:05:35 + Gerrit-HasComments: Yes
[kudu-CR] [docker] KUDU-3307: allow using multiple directories
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/17738 ) Change subject: [docker] KUDU-3307: allow using multiple directories .. Patch Set 2: (2 comments) Thanks for taking this up, Andrew. Change looks good, only minor comments. http://gerrit.cloudera.org:8080/#/c/17738/2/docker/kudu-entrypoint.sh File docker/kudu-entrypoint.sh: http://gerrit.cloudera.org:8080/#/c/17738/2/docker/kudu-entrypoint.sh@42 PS2, Line 42: Defines the root directory to use. Subdirectories are added depending on whether a " : echo " Kudu master or a Kudu tablet server is being deployed. Ignored if the FS_WAL_DIR " : echo " environment variable is set." Nit: Might be worth adding a note about deprecating this variable. http://gerrit.cloudera.org:8080/#/c/17738/2/docker/kudu-entrypoint.sh@52 PS2, Line 52: to Nit: extra "to" -- To view, visit http://gerrit.cloudera.org:8080/17738 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I20e5f75c1f1f8280dca60de75255a4a948e44be1 Gerrit-Change-Number: 17738 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Wong Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Fri, 30 Jul 2021 16:54:25 + Gerrit-HasComments: Yes
[kudu-CR] [tools] Add a tool to recover master data from tablet servers
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/9490 ) Change subject: [tools] Add a tool to recover master data from tablet servers .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/9490/5/src/kudu/tools/master_rebuilder.cc File src/kudu/tools/master_rebuilder.cc: http://gerrit.cloudera.org:8080/#/c/9490/5/src/kudu/tools/master_rebuilder.cc@175 PS5, Line 175: > It's good, tables in the cluster may have different RFs, FLAGS_default_num_ Wouldn't this brick single-replica tables? It would think it's under-replicated with no majority, so it couldn't re-replicate to 3 automatically, or am I missing something? -- To view, visit http://gerrit.cloudera.org:8080/9490 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If29e421d466a531ebad72e281ae27e74e458f8c6 Gerrit-Change-Number: 9490 Gerrit-PatchSet: 6 Gerrit-Owner: Will Berkeley Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Will Berkeley Gerrit-Reviewer: Yingchun Lai Gerrit-Comment-Date: Fri, 30 Jul 2021 13:13:59 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-3204: Add a metric for amount of available space
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17725 ) Change subject: KUDU-3204: Add a metric for amount of available space .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/17725/2/src/kudu/server/server_base.cc File src/kudu/server/server_base.cc: http://gerrit.cloudera.org:8080/#/c/17725/2/src/kudu/server/server_base.cc@446 PS2, Line 446: // Calculates the total free space on all the data directories combined accounting for > So if I am understanding it right, you are suggesting a new function which Yes. It could be inlined too if you want to avoid the overhead of calling a function. -- To view, visit http://gerrit.cloudera.org:8080/17725 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I58a7419847d080498aaf431d1dab098e1af71ad0 Gerrit-Change-Number: 17725 Gerrit-PatchSet: 3 Gerrit-Owner: Abhishek Chennaka Gerrit-Reviewer: Abhishek Chennaka Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Fri, 30 Jul 2021 13:12:09 + Gerrit-HasComments: Yes
[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.
Mahesh Reddy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17643 ) Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash schemas. .. Patch Set 4: (23 comments) http://gerrit.cloudera.org:8080/#/c/17643/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17643/3//COMMIT_MSG@13 PS3, Line 13: : - Adding predicates (e.g. range/equality) : - Setting lower and upper bound primary keys : - Setting lower and upper bound partition keys : : This patch introduces changes that make the first two methods : > nit: remove the extra newlines between each bullet? Done http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition.cc File src/kudu/common/partition.cc: http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition.cc@1109 PS3, Line 1109: y_end(), > nit: parameter's name is different between the declaration and the definiti Done http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition.cc@1147 PS3, Line 1147: Status PartitionSchema::EncodeColumns(const ConstContiguousRow& row, : const vector& column_ids, : string* buf) { : for (int i = 0; i < column_ids.size(); i++) { : ColumnId column_id = column_ids[i]; : > +1 Will move these changes to a separate patch. http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc File src/kudu/common/partition_pruner-test.cc: http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1056 PS3, Line 1056: .column(0), ), > nit: consider a typedef for this? Maybe ColumnNamesNumBucketsAndSeed or som Done http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1079 PS3, Line 1079: check({ ColumnPredicate::Equality > nit: the indent for this line is off Done http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1100 PS3, Line 1100: Part > nit: for pointers, prefer being explicit and use auto* Done http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1103 PS3, Line 1103: nge_ > nit: same here Done http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1103 PS3, Line 1103: chema->add_col > nit: to avoid any ambiguity with hash_bucket_schemas, maybe call this hash_ Done http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1304 PS3, Line 1304: : // C < "a" : check({ ColumnPredicate::Range(schema.column(2), nullptr, )}, "", "", 0, 0); : : // C >= "m" : check({ ColumnPredicate::Range(schema.column(2), , nullptr)}, "", "", 0, 0); : : // partition key >= (hash=1, hash=0) : check({}, string("\0\0\0\1\0\0\0\0", 8), "", 8, 8); : : // partition key > Do you mind adding a scenario for a scheme like this but with custom hash b Done http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1326 PS3, Line 1326: > nit here and below where std::make_pair() is used: would it be enough to ha Done http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1355 PS3, Line 1355: > nit: add 'const' and turn into all-lowercase Done http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1366 PS3, Line 1366: > Why not to use ASSERT_OK() here? I just reused the existing test code that uses CHECK_OK(). What's the difference between two? http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1381 PS3, Line 1381: size_ > nit: add NO_FATALS() here? Done http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1411 PS3, Line 1411: // PK < (2, 2, min) > Do you mind adding a scenario where the range for the primary key should le Done http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1428 PS3, Line 1428: int8_t>(4, 4, 0), 7, 7); > nit: would {columns, 3, 0} be enough here? nope, have to explicitly use `make_tuple`. http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1478 PS3, Line 1478: : ASSERT_OK(PartitionSchema::FromPB(pb, schema, _schema)); : > nit: make these constexpr? Done http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner.h File src/kudu/common/partition_pruner.h: http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner.h@49 PS3, Line 49: > nit: it should be 4 spaces here per style guide Done
[kudu-CR] [pruning] KUDU-2671: Pruning compatible with custom hash schemas.
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Andrew Wong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17643 to look at the new patch set (#4). Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash schemas. .. [pruning] KUDU-2671: Pruning compatible with custom hash schemas. This patch introduces changes to the PartitionPruner class to be compatible with custom hash bucket schemas per range. There are three ways to set bounds on a scan. - Adding predicates (e.g. range/equality) - Setting lower and upper bound primary keys - Setting lower and upper bound partition keys This patch introduces changes that make the first two methods of setting bounds on a scan compatible with custom hash bucket schemas per range. The last way using partition keys is unstable and for internal use only. While it's not necessary for the last way to be compatible with per range hash bucket schemas, the entire pruning functionality will not be complete until PartitionPruner::RemovePartitionKeyRange() is modified. That work will be done in a follow up patch. Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7 --- M src/kudu/common/partial_row.h M src/kudu/common/partition_pruner-test.cc M src/kudu/common/partition_pruner.cc M src/kudu/common/partition_pruner.h 4 files changed, 913 insertions(+), 315 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/43/17643/4 -- To view, visit http://gerrit.cloudera.org:8080/17643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7 Gerrit-Change-Number: 17643 Gerrit-PatchSet: 4 Gerrit-Owner: Mahesh Reddy Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Andrew Wong Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy Gerrit-Reviewer: Tidy Bot (241)