Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12247 )

Change subject: IMPALA-5051: Add INT64 timestamp write support in Parquet
......................................................................


Patch Set 9:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/12247/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12247/9//COMMIT_MSG@17
PS9, Line 17: INT96_NANOS
I think the NANOS suffix is somewhat misleading as it suggests that this is 
just ns precision with a larger range. Since we also want to discourage use of 
this I suggest to name it INT96_DEPRECATED (or INT96_IMPALA since this is where 
it came from).


http://gerrit.cloudera.org:8080/#/c/12247/9//COMMIT_MSG@42
PS9, Line 42: net
typo


http://gerrit.cloudera.org:8080/#/c/12247/9/be/src/exec/parquet/parquet-metadata-utils.h
File be/src/exec/parquet/parquet-metadata-utils.h:

http://gerrit.cloudera.org:8080/#/c/12247/9/be/src/exec/parquet/parquet-metadata-utils.h@61
PS9, Line 61: that the type
nit: that the input type..


http://gerrit.cloudera.org:8080/#/c/12247/9/be/src/exec/parquet/parquet-metadata-utils.cc
File be/src/exec/parquet/parquet-metadata-utils.cc:

http://gerrit.cloudera.org:8080/#/c/12247/9/be/src/exec/parquet/parquet-metadata-utils.cc@150
PS9, Line 150:   parquet::TimestampType timestamp_type;
nit: declare these before using them?


http://gerrit.cloudera.org:8080/#/c/12247/9/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

http://gerrit.cloudera.org:8080/#/c/12247/9/be/src/runtime/timestamp-test.cc@713
PS9, Line 713:   int64_t tm_min_micros, tm_min_millis;
I'd declare that when you use it


http://gerrit.cloudera.org:8080/#/c/12247/9/be/src/runtime/timestamp-test.cc@721
PS9, Line 721:   // Add 250ns and check the value is rounded down
Maybe prefix each of these comments with a comment or assert of the current 
value of min_time. Otherwise one has to count how many times we added 250ns all 
the time.


http://gerrit.cloudera.org:8080/#/c/12247/9/be/src/runtime/timestamp-test.cc@737
PS9, Line 737:   EXPECT_TRUE(min_date.FloorUtcToUnixTimeMicros(&tm_min_micros));
This test does not correspond to the comment above, does it? I.e., it is 
rounding towards -?


http://gerrit.cloudera.org:8080/#/c/12247/9/be/src/runtime/timestamp-test.cc@740
PS9, Line 740: 250us
The switch between us and ns confused me for a bit, not sure if there's a good 
way to make it stand out more.

Also, some of these blocks add to the current value of min_date, while other 
blocks reset it. Maybe you can make more clear by not writing to it and using 
new variables instead (here and in the other blocks)


http://gerrit.cloudera.org:8080/#/c/12247/9/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

http://gerrit.cloudera.org:8080/#/c/12247/9/be/src/runtime/timestamp-value.h@239
PS9, Line 239: 2262
The famous Impala Year-2262 problem being conceived and I get to be part of it. 
:)


http://gerrit.cloudera.org:8080/#/c/12247/9/be/src/runtime/timestamp-value.h@375
PS9, Line 375: 1970.01.01
nit: 1970-01-01


http://gerrit.cloudera.org:8080/#/c/12247/9/be/src/runtime/timestamp-value.inline.h
File be/src/runtime/timestamp-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/12247/9/be/src/runtime/timestamp-value.inline.h@98
PS9, Line 98: 24 * 60 * 60
While you're here, this could probably be a constant, too, given it's used in 
other places.


http://gerrit.cloudera.org:8080/#/c/12247/9/be/src/runtime/timestamp-value.inline.h@133
PS9, Line 133:
Nit: I think some of these newlines could go.


http://gerrit.cloudera.org:8080/#/c/12247/9/be/src/runtime/timestamp-value.inline.h@156
PS9, Line 156:     static_cast<kudu::int128_t>(unixtime_seconds) * NANOS_PER_SEC
nit: indent 4 spaces


http://gerrit.cloudera.org:8080/#/c/12247/9/be/src/runtime/timestamp-value.inline.h@160
PS9, Line 160:       || nanos128 >  std::numeric_limits<int64_t>::max()) return 
false;
nit: If it doesn't fit on one line, it should have braces {}


http://gerrit.cloudera.org:8080/#/c/12247/9/tests/query_test/test_insert_parquet.py
File tests/query_test/test_insert_parquet.py:

http://gerrit.cloudera.org:8080/#/c/12247/9/tests/query_test/test_insert_parquet.py@848
PS9, Line 848:       ColumnStats('ts', -1001001, 1001001, 0)
Can you add a test for the case that values don't fit into 64bit nanos?


http://gerrit.cloudera.org:8080/#/c/12247/9/tests/util/get_parquet_metadata.py
File tests/util/get_parquet_metadata.py:

http://gerrit.cloudera.org:8080/#/c/12247/9/tests/util/get_parquet_metadata.py@180
PS9, Line 180: normpath
Why is "normpath" needed here?


http://gerrit.cloudera.org:8080/#/c/12247/9/tests/util/get_parquet_metadata.py@180
PS9, Line 180:   table_dir = tmp_dir + "/" + 
os.path.basename(os.path.normpath(hdfs_path))
use os.path.join



--
To view, visit http://gerrit.cloudera.org:8080/12247
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib41ad532ec902ed5a9a1528513726eac1c11441f
Gerrit-Change-Number: 12247
Gerrit-PatchSet: 9
Gerrit-Owner: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Zoltan Ivanfi <[email protected]>
Gerrit-Comment-Date: Wed, 20 Mar 2019 17:40:25 +0000
Gerrit-HasComments: Yes

Reply via email to