Adar Dembo has posted comments on this change. (
http://gerrit.cloudera.org:8080/14306 )
Change subject: add a tool to create table
......................................................................
Patch Set 20:
(5 comments)
I experimented with JSON serialization and enums, and I learned that you can
specify an enum as a integer value _or_ as a string. So we should definitely
use enums throughout, as this allows both the ease of use of specifying a
string value or the terseness of specifying an integer value, and the enum
definition itself provides a good source of documentation.
Here's my code:
test.proto:
syntax = "proto2";
message Foo {
enum Color {
BLUE = 0;
RED = 1;
GREEN = 2;
}
optional Color c = 1;
}
test.cc:
#include <iostream>
#include <google/protobuf/stubs/status.h>
#include <google/protobuf/stubs/stringpiece.h>
#include <google/protobuf/util/json_util.h>
#include "test.pb.h"
using namespace std;
int main(int argc, char* argv[]) {
if (argc != 2) {
cout << "Usage: " << argv[0] << " <json>" << endl;
return 1;
}
cout << "Attempting to parse JSON: \'" << argv[1] << "\'" << endl;
Foo message;
const auto& google_status =
google::protobuf::util::JsonStringToMessage(argv[1], &message);
if (!google_status.ok()) {
cout << "unable to parse JSON: " <<
google_status.error_message().ToString() << endl;
return 1;
}
cout << message.DebugString();
return 0;
}
I compiled it with:
- protoc --cpp_out . test.proto
- g++ -o test test.cc test.pb.cc -I. -lprotobuf
Then I ran it a few times:
$ ./test "{ c : 1 }"
Attempting to parse JSON: '{ c : 1 }'
c: RED
$ ./test "{ c : red }"
Attempting to parse JSON: '{ c : red }'
unable to parse JSON: Unexpected token.
{ c : red }
^
$ ./test "{ c : 'red' }"
Attempting to parse JSON: '{ c : 'red' }'
c: RED
$ ./test "{ c : 'green' }"
Attempting to parse JSON: '{ c : 'green' }'
c: GREEN
$ ./test "{ c : 'pink' }"
Attempting to parse JSON: '{ c : 'pink' }'
unable to parse JSON: c: invalid value "pink" for type TYPE_ENUM
$ ./test "{ c : 2 }"
Attempting to parse JSON: '{ c : 2 }'
c: GREEN
http://gerrit.cloudera.org:8080/#/c/14306/18/src/kudu/tools/tool.proto
File src/kudu/tools/tool.proto:
http://gerrit.cloudera.org:8080/#/c/14306/18/src/kudu/tools/tool.proto@335
PS18, Line 335: // GROUP_VARINT encoding is deprecated and no longer
implemented.
: GROUP_VARINT = 3;
Should be removed in new code.
http://gerrit.cloudera.org:8080/#/c/14306/18/src/kudu/tools/tool.proto@359
PS18, Line 359: Default type is AUTO_ENCODING, the types
: // PLAIN_ENCODING, PREFIX_ENCODING, GROUP_VARINT, RLE,
: // DICT_ENCODING, BIT_SHUFFLE are also supported.
Not necessary; it's obvious from the enum definition.
http://gerrit.cloudera.org:8080/#/c/14306/18/src/kudu/tools/tool.proto@363
PS18, Line 363: Default type is DEFAULT_COMPRESSION, the
: // types NO_COMPRESSION, SNAPPY, LZ4, ZLIB are also supported.
Not necessary; it's obvious from the enum definition.
http://gerrit.cloudera.org:8080/#/c/14306/18/src/kudu/tools/tool_action_table.cc
File src/kudu/tools/tool_action_table.cc:
http://gerrit.cloudera.org:8080/#/c/14306/18/src/kudu/tools/tool_action_table.cc@910
PS18, Line 910: column.encoding()));
> Here if the user uses an incorrect value, such as '"encoding" : 100', the v
Right, the way in which we usually handle that is by defining an
UNKNOWN_ENCODING (or UNKNOWN_COMPRESSION or whatever) enum entry that's first
in the PB enum. You could set it to a high value (like 999) to ensure that
users don't accidentally specify it, or just set it to 0. Either way, its
presence means that invalid values will be set to UNKNOWN_ENCODING, which you
can then filter for in the code and return an error if you see them.
You could do the same for INCLUSIVE/EXCLUSIVE if you want to catch mistakes
there too (vs. converting a mistake into INCLUSIVE).
Oh, I understand the problem now: you want these new enum values to match up
with the values defined in schema.h so you can pass them through 1-1. I'm not
sure if we should do that; it'll make it difficult to evolve one side without
also changing the other side.
http://gerrit.cloudera.org:8080/#/c/14306/20/src/kudu/tools/tool_action_table.cc
File src/kudu/tools/tool_action_table.cc:
http://gerrit.cloudera.org:8080/#/c/14306/20/src/kudu/tools/tool_action_table.cc@1017
PS20, Line 1017: unique_ptr<KuduTableCreator>
table_creator(client->NewTableCreator());
> 'table_creator' must be defined behind 'kudu_schema', otherwise in ASAN and
Makes sense; client.h says " /// Schema to use. Must remain valid for the
lifetime of the builder.".
--
To view, visit http://gerrit.cloudera.org:8080/14306
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0bce4733a504f8ef5f024089a16fe3c6f1e493f1
Gerrit-Change-Number: 14306
Gerrit-PatchSet: 20
Gerrit-Owner: YangSong <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: YangSong <[email protected]>
Gerrit-Comment-Date: Wed, 16 Oct 2019 17:05:12 +0000
Gerrit-HasComments: Yes