Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12247 )

Change subject: IMPALA-5051: Add INT64 timestamp write support in Parquet
......................................................................


Patch Set 3:

(7 comments)

About 'parquet-tools dump': it would be great to use it during Impala EE tests, 
but adding it as a dependency would also mean +1 way Impala builds can break, 
so I am not sure at the moment.

http://gerrit.cloudera.org:8080/#/c/12247/3/be/src/exec/parquet/hdfs-parquet-table-writer.cc
File be/src/exec/parquet/hdfs-parquet-table-writer.cc:

http://gerrit.cloudera.org:8080/#/c/12247/3/be/src/exec/parquet/hdfs-parquet-table-writer.cc@400
PS3, Line 400: !=
> Shouldn't it be '=='?
It is intentionally !=. -1 means variable length, and plain_encoded_value_size_ 
should remain -1 in that case.

Actually this code led to DCHECK error during the writing of some decimals, 
which was not caught by the tests I ran locally. After running to issues on 
jenkins, I moved this logic Int64TimestampColumnWriterBase.


http://gerrit.cloudera.org:8080/#/c/12247/3/be/src/exec/parquet/hdfs-parquet-table-writer.cc@583
PS3, Line 583: Overrides of this function are expected to set 'result_'
> I think an output parameter would be more conventional.
I prefer the current solution - my guess is that adding an additional argument 
to a non-inline functions will make it slower, while writing to result_ is 
cheaper as 'this' is used during the function call anyway.


http://gerrit.cloudera.org:8080/#/c/12247/3/be/src/exec/parquet/parquet-metadata-utils.cc
File be/src/exec/parquet/parquet-metadata-utils.cc:

http://gerrit.cloudera.org:8080/#/c/12247/3/be/src/exec/parquet/parquet-metadata-utils.cc@158
PS3, Line 158:   // converted_type is not set because older readers that do not 
use logical types
> Ah, yes, good point, I haven't noticed we are only dealing with LocalDateTi
Done


http://gerrit.cloudera.org:8080/#/c/12247/3/be/src/runtime/timestamp-value.inline.h
File be/src/runtime/timestamp-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/12247/3/be/src/runtime/timestamp-value.inline.h@135
PS3, Line 135: (NANOS_PER_MILLI / 2)
> Might be worth to add a short comment about that we do the rounding here.
The whole rounding logic is under discussion - Impala uses rounding to microsec 
when writing Kudu, but I think that truncating towards negative infinity would 
be better. I sent an email to [email protected] about this topic, I plan 
to wait for an agreement there before changing these functions.


http://gerrit.cloudera.org:8080/#/c/12247/3/be/src/runtime/timestamp-value.inline.h@151
PS3, Line 151:   kudu::int128_t nanos128 =
             :     static_cast<kudu::int128_t>(unixtime_seconds) * NANOS_PER_SEC
             :     + time_.fractional_seconds();
             :
             :   if (nanos128 <  std::numeric_limits<int64_t>::min()
             :       || nanos128 >  std::numeric_limits<int64_t>::max()) return 
false;
> I think we can avoid using int128_t here. We now what are the min and max u
I agree, but I am waiting for a decision in rounding vs truncating, see line 
135.


http://gerrit.cloudera.org:8080/#/c/12247/3/be/src/service/query-options.cc
File be/src/service/query-options.cc:

http://gerrit.cloudera.org:8080/#/c/12247/3/be/src/service/query-options.cc@728
PS3, Line 728:         if (iequals(value, "INT96_NANO")) {
> You are right, I think that I messed something up like running the tests wi
Done


http://gerrit.cloudera.org:8080/#/c/12247/3/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

http://gerrit.cloudera.org:8080/#/c/12247/3/common/thrift/ImpalaInternalService.thrift@1
PS3, Line 1:
> nit: accidental new line
Done



--
To view, visit http://gerrit.cloudera.org:8080/12247
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib41ad532ec902ed5a9a1528513726eac1c11441f
Gerrit-Change-Number: 12247
Gerrit-PatchSet: 3
Gerrit-Owner: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Zoltan Ivanfi <[email protected]>
Gerrit-Comment-Date: Wed, 23 Jan 2019 20:50:38 +0000
Gerrit-HasComments: Yes

Reply via email to