Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14306 )

Change subject: add a tool to create table
......................................................................


Patch Set 16:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/client/schema.h
File src/kudu/client/schema.h:

http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/client/schema.h@170
PS16, Line 170:   /// @param [in] String representation of the column encoding 
type
              :   ///   Column encoding type.
              :   /// @return Encoding type of the column.
              :   static Status StringToEncodingType(const std::string& 
encoding,
              :       EncodingType* type);
              :
              :   /// @param [in] String representation of the column 
compression type
              :   ///   Column compression type.
              :   /// @return Compression type of the column.
              :   static Status StringToCompressionType(const std::string& 
compression,
              :       CompressionType* type);
Need to update the Doxygen comments.


http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/client/schema.h@212
PS16, Line 212:   /// @param [in] String representation of the column data type
              :   ///   Column data type.
              :   /// @return Data type of the column.
Update this too.


http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/client/schema.cc
File src/kudu/client/schema.cc:

PS16:
You can simplify (*type) to just *type


http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/client/schema.cc@210
PS16, Line 210:   Status s = Status::OK();
Status::OK() is the default value for Status; you don't need the assignment.

Below too.


http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/tools/tool.proto
File src/kudu/tools/tool.proto:

http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/tools/tool.proto@342
PS16, Line 342:   // The encoding type of column. Default type is 
AUTO_ENCODING, the types
              :   // PLAIN_ENCODING, PREFIX_ENCODING, GROUP_VARINT, RLE, 
DICT_ENCODING,
              :   // BIT_SHUFFLE are also supported.
              :   optional string encoding = 6;
              :   // The compression of column. Default type is 
DEFAULT_COMPRESSION, the types
              :   // NO_COMPRESSION, SNAPPY, LZ4, ZLIB are also supported.
              :   optional string compression = 7;
Hmm, we should actually use PB enums for these, as we did for 
RangePartitionPB.BoundPB.Type.


http://gerrit.cloudera.org:8080/#/c/14306/15/src/kudu/tools/tool_action_table.cc
File src/kudu/tools/tool_action_table.cc:

http://gerrit.cloudera.org:8080/#/c/14306/15/src/kudu/tools/tool_action_table.cc@752
PS15, Line 752:     *value = KuduValue::FromDecimal(precision, scale);
> Here I'm not sure if decimal type supports setting default value. And no si
Did you see what I wrote earlier? 
https://gerrit.cloudera.org/c/14306/10/src/kudu/tools/tool_action_table.cc#753


http://gerrit.cloudera.org:8080/#/c/14306/15/src/kudu/tools/tool_action_table.cc@1290
PS15, Line 1290:                         "\"lower_bound\": 
{\"bound_type\":1,\"bound_values\": [\"2\"]},\""
> Here I just list an example, maybe too simple.
I think that's too verbose (and hard to format) to be useful help. Instead just 
point people at tool.proto. We'll make sure the comments there are 
understandable and clear.


http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/tools/tool_action_table.cc
File src/kudu/tools/tool_action_table.cc:

http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/tools/tool_action_table.cc@806
PS16, Line 806:   KuduColumnStorageAttributes::CompressionType compression_type 
=
You don't need to initialize this; if we make it to L810, we're guaranteed to 
have a value written into it.

Same below, and for encoding type.


http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/tools/tool_action_table.cc@888
PS16, Line 888:       { return json_value; }, ",");
Ah, I was hoping L878-886 would be in the body of JoinMapped. Like this:

  if (values.empty() || values.size() != range_columns.size()) {
    return Status::InvalidArgument(Substitute(
        "Invalid range value size, value size should be equal to number of 
range keys."));
  }
  int i = 0;
  string joined = JoinMapped(values, [&](const string& v) {
    auto data_type = range_columns[i++].second;
    if (data_type == KuduColumnSchema::STRING ||
        data_type == KuduColumnSchema::BINARY) {
      return "\"" + v + "\"";
    }
    return v;
  }, ",");
  *json_value = "[" + joined + "]";
  return Status::OK();


http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/tools/tool_action_table.cc@940
PS16, Line 940:   vector<string> primary_keys;
              :   for (const auto& primary_key : schema.key_column_names()) {
              :     primary_keys.push_back(primary_key);
              :   }
              :   b->SetPrimaryKey(primary_keys);
Does this work instead?

  b->SetPrimaryKey(schema.key_column_names().begin(),
                   schema.key_column_names().end());

Or maybe:

  b->SetPrimaryKey(vector<string>(schema.key_column_names().begin(),
                                  schema.key_column_names().end()));


http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/tools/tool_action_table.cc@988
PS16, Line 988:     KuduTableCreator::RangePartitionBound upper_bound_type =
              :         KuduTableCreator::INCLUSIVE_BOUND;
This should default to EXCLUSIVE_BOUND (as per add_range_partition() in 
client.h).


http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/tools/tool_action_table.cc@1000
PS16, Line 1000:       } else if (bound.lower_bound().bound_type() ==
I think it's OK to assume that if the bound type wasn't INCLUSIVE, it is 
EXCLUSIVE. Like this:

  if (bound_type == INCLUSIVE) {
    ...
  } else {
    DCHECK_EQ(EXCLUSIVE, bound_type);
    ...
  }

Then you can avoid duplicating the ConvertToKuduPartialRow calls.

In fact, you could simplify this further:

  if (has_lower_bound) {
    RETURN_NOT_OK(ToJsonPartialRow(...));
    RETURN_NOT_OK(ConvertToKuduPartialRow(...));
    lower_bound_type = bound_type == INCLUSIVE ? INCLUSIVE_BOUND : 
EXCLUSIVE_BOUND;
  }


http://gerrit.cloudera.org:8080/#/c/14306/16/src/kudu/tools/tool_action_table.cc@1057
PS16, Line 1057:   RETURN_NOT_OK(b.Build(&kudu_schema));
Could you move this (and the definition of KuduSchemaBuilder) into 
ParseTableSchema, so that its output is a KuduSchema?



--
To view, visit http://gerrit.cloudera.org:8080/14306
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0bce4733a504f8ef5f024089a16fe3c6f1e493f1
Gerrit-Change-Number: 14306
Gerrit-PatchSet: 16
Gerrit-Owner: YangSong <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: YangSong <[email protected]>
Gerrit-Comment-Date: Tue, 15 Oct 2019 22:57:26 +0000
Gerrit-HasComments: Yes

Reply via email to