Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/12247 )
Change subject: IMPALA-5051: Add INT64 timestamp write support in Parquet ...................................................................... Patch Set 3: (7 comments) About 'parquet-tools dump': it would be great to use it during Impala EE tests, but adding it as a dependency would also mean +1 way Impala builds can break, so I am not sure at the moment. http://gerrit.cloudera.org:8080/#/c/12247/3/be/src/exec/parquet/hdfs-parquet-table-writer.cc File be/src/exec/parquet/hdfs-parquet-table-writer.cc: http://gerrit.cloudera.org:8080/#/c/12247/3/be/src/exec/parquet/hdfs-parquet-table-writer.cc@400 PS3, Line 400: != > Shouldn't it be '=='? It is intentionally !=. -1 means variable length, and plain_encoded_value_size_ should remain -1 in that case. Actually this code led to DCHECK error during the writing of some decimals, which was not caught by the tests I ran locally. After running to issues on jenkins, I moved this logic Int64TimestampColumnWriterBase. http://gerrit.cloudera.org:8080/#/c/12247/3/be/src/exec/parquet/hdfs-parquet-table-writer.cc@583 PS3, Line 583: Overrides of this function are expected to set 'result_' > I think an output parameter would be more conventional. I prefer the current solution - my guess is that adding an additional argument to a non-inline functions will make it slower, while writing to result_ is cheaper as 'this' is used during the function call anyway. http://gerrit.cloudera.org:8080/#/c/12247/3/be/src/exec/parquet/parquet-metadata-utils.cc File be/src/exec/parquet/parquet-metadata-utils.cc: http://gerrit.cloudera.org:8080/#/c/12247/3/be/src/exec/parquet/parquet-metadata-utils.cc@158 PS3, Line 158: // converted_type is not set because older readers that do not use logical types > Ah, yes, good point, I haven't noticed we are only dealing with LocalDateTi Done http://gerrit.cloudera.org:8080/#/c/12247/3/be/src/runtime/timestamp-value.inline.h File be/src/runtime/timestamp-value.inline.h: http://gerrit.cloudera.org:8080/#/c/12247/3/be/src/runtime/timestamp-value.inline.h@135 PS3, Line 135: (NANOS_PER_MILLI / 2) > Might be worth to add a short comment about that we do the rounding here. The whole rounding logic is under discussion - Impala uses rounding to microsec when writing Kudu, but I think that truncating towards negative infinity would be better. I sent an email to [email protected] about this topic, I plan to wait for an agreement there before changing these functions. http://gerrit.cloudera.org:8080/#/c/12247/3/be/src/runtime/timestamp-value.inline.h@151 PS3, Line 151: kudu::int128_t nanos128 = : static_cast<kudu::int128_t>(unixtime_seconds) * NANOS_PER_SEC : + time_.fractional_seconds(); : : if (nanos128 < std::numeric_limits<int64_t>::min() : || nanos128 > std::numeric_limits<int64_t>::max()) return false; > I think we can avoid using int128_t here. We now what are the min and max u I agree, but I am waiting for a decision in rounding vs truncating, see line 135. http://gerrit.cloudera.org:8080/#/c/12247/3/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/12247/3/be/src/service/query-options.cc@728 PS3, Line 728: if (iequals(value, "INT96_NANO")) { > You are right, I think that I messed something up like running the tests wi Done http://gerrit.cloudera.org:8080/#/c/12247/3/common/thrift/ImpalaInternalService.thrift File common/thrift/ImpalaInternalService.thrift: http://gerrit.cloudera.org:8080/#/c/12247/3/common/thrift/ImpalaInternalService.thrift@1 PS3, Line 1: > nit: accidental new line Done -- To view, visit http://gerrit.cloudera.org:8080/12247 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib41ad532ec902ed5a9a1528513726eac1c11441f Gerrit-Change-Number: 12247 Gerrit-PatchSet: 3 Gerrit-Owner: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Reviewer: Zoltan Ivanfi <[email protected]> Gerrit-Comment-Date: Wed, 23 Jan 2019 20:50:38 +0000 Gerrit-HasComments: Yes
