Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14306 )
Change subject: add a tool to create table ...................................................................... Patch Set 16: (13 comments) http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/client/schema.h File src/kudu/client/schema.h: http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/client/schema.h@170 PS16, Line 170: /// @param [in] String representation of the column encoding type : /// Column encoding type. : /// @return Encoding type of the column. : static Status StringToEncodingType(const std::string& encoding, : EncodingType* type); : : /// @param [in] String representation of the column compression type : /// Column compression type. : /// @return Compression type of the column. : static Status StringToCompressionType(const std::string& compression, : CompressionType* type); Need to update the Doxygen comments. http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/client/schema.h@212 PS16, Line 212: /// @param [in] String representation of the column data type : /// Column data type. : /// @return Data type of the column. Update this too. http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/client/schema.cc File src/kudu/client/schema.cc: PS16: You can simplify (*type) to just *type http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/client/schema.cc@210 PS16, Line 210: Status s = Status::OK(); Status::OK() is the default value for Status; you don't need the assignment. Below too. http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/tools/tool.proto File src/kudu/tools/tool.proto: http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/tools/tool.proto@342 PS16, Line 342: // The encoding type of column. Default type is AUTO_ENCODING, the types : // PLAIN_ENCODING, PREFIX_ENCODING, GROUP_VARINT, RLE, DICT_ENCODING, : // BIT_SHUFFLE are also supported. : optional string encoding = 6; : // The compression of column. Default type is DEFAULT_COMPRESSION, the types : // NO_COMPRESSION, SNAPPY, LZ4, ZLIB are also supported. : optional string compression = 7; Hmm, we should actually use PB enums for these, as we did for RangePartitionPB.BoundPB.Type. http://gerrit.cloudera.org:8080/#/c/14306/15/src/kudu/tools/tool_action_table.cc File src/kudu/tools/tool_action_table.cc: http://gerrit.cloudera.org:8080/#/c/14306/15/src/kudu/tools/tool_action_table.cc@752 PS15, Line 752: *value = KuduValue::FromDecimal(precision, scale); > Here I'm not sure if decimal type supports setting default value. And no si Did you see what I wrote earlier? https://gerrit.cloudera.org/c/14306/10/src/kudu/tools/tool_action_table.cc#753 http://gerrit.cloudera.org:8080/#/c/14306/15/src/kudu/tools/tool_action_table.cc@1290 PS15, Line 1290: "\"lower_bound\": {\"bound_type\":1,\"bound_values\": [\"2\"]},\"" > Here I just list an example, maybe too simple. I think that's too verbose (and hard to format) to be useful help. Instead just point people at tool.proto. We'll make sure the comments there are understandable and clear. http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/tools/tool_action_table.cc File src/kudu/tools/tool_action_table.cc: http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/tools/tool_action_table.cc@806 PS16, Line 806: KuduColumnStorageAttributes::CompressionType compression_type = You don't need to initialize this; if we make it to L810, we're guaranteed to have a value written into it. Same below, and for encoding type. http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/tools/tool_action_table.cc@888 PS16, Line 888: { return json_value; }, ","); Ah, I was hoping L878-886 would be in the body of JoinMapped. Like this: if (values.empty() || values.size() != range_columns.size()) { return Status::InvalidArgument(Substitute( "Invalid range value size, value size should be equal to number of range keys.")); } int i = 0; string joined = JoinMapped(values, [&](const string& v) { auto data_type = range_columns[i++].second; if (data_type == KuduColumnSchema::STRING || data_type == KuduColumnSchema::BINARY) { return "\"" + v + "\""; } return v; }, ","); *json_value = "[" + joined + "]"; return Status::OK(); http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/tools/tool_action_table.cc@940 PS16, Line 940: vector<string> primary_keys; : for (const auto& primary_key : schema.key_column_names()) { : primary_keys.push_back(primary_key); : } : b->SetPrimaryKey(primary_keys); Does this work instead? b->SetPrimaryKey(schema.key_column_names().begin(), schema.key_column_names().end()); Or maybe: b->SetPrimaryKey(vector<string>(schema.key_column_names().begin(), schema.key_column_names().end())); http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/tools/tool_action_table.cc@988 PS16, Line 988: KuduTableCreator::RangePartitionBound upper_bound_type = : KuduTableCreator::INCLUSIVE_BOUND; This should default to EXCLUSIVE_BOUND (as per add_range_partition() in client.h). http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/tools/tool_action_table.cc@1000 PS16, Line 1000: } else if (bound.lower_bound().bound_type() == I think it's OK to assume that if the bound type wasn't INCLUSIVE, it is EXCLUSIVE. Like this: if (bound_type == INCLUSIVE) { ... } else { DCHECK_EQ(EXCLUSIVE, bound_type); ... } Then you can avoid duplicating the ConvertToKuduPartialRow calls. In fact, you could simplify this further: if (has_lower_bound) { RETURN_NOT_OK(ToJsonPartialRow(...)); RETURN_NOT_OK(ConvertToKuduPartialRow(...)); lower_bound_type = bound_type == INCLUSIVE ? INCLUSIVE_BOUND : EXCLUSIVE_BOUND; } http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/tools/tool_action_table.cc@1057 PS16, Line 1057: RETURN_NOT_OK(b.Build(&kudu_schema)); Could you move this (and the definition of KuduSchemaBuilder) into ParseTableSchema, so that its output is a KuduSchema? -- To view, visit http://gerrit.cloudera.org:8080/14306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0bce4733a504f8ef5f024089a16fe3c6f1e493f1 Gerrit-Change-Number: 14306 Gerrit-PatchSet: 16 Gerrit-Owner: YangSong <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: YangSong <[email protected]> Gerrit-Comment-Date: Tue, 15 Oct 2019 22:57:26 +0000 Gerrit-HasComments: Yes
