[Impala-CR](cdh5-trunk) IMPALA-1903: Add support for partitioning by TIMESTAMP

Dimitris Tsirogiannis (Code Review) Thu, 10 Mar 2016 16:42:07 -0800

Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-1903: Add support for partitioning by TIMESTAMP
......................................................................

Patch Set 3:

(6 comments)

Wanted to flush some responses to previous comments. Starting a new round now.

http://gerrit.cloudera.org:8080/#/c/1621/3/be/src/exprs/expr.cc
File be/src/exprs/expr.cc:

Line 209: else {
: *expr = pool->Add(new NullLiteral(texpr_node));
: }
> Some Hive timestamps are invalid Impala timestamps, like ones before 1400AD
It is debatable whether turning invalid values into NULLs is the correct thing
to do, but I can accept that for consistency reasons. For the same reasons,
though, I think creating a null partition because a bogus value was passed in
the DDL is not consistent with the existing behavior for other data types.

http://gerrit.cloudera.org:8080/#/c/1621/3/be/src/runtime/timestamp-parse-util.h
File be/src/runtime/timestamp-parse-util.h:

Line 358: CheckParse
> Kind of. Parse actually returns a result by modifying parameters 3 and 4. T
Why not just Parse?

http://gerrit.cloudera.org:8080/#/c/1621/3/fe/src/main/java/com/cloudera/impala/analysis/TimestampLiteral.java
File fe/src/main/java/com/cloudera/impala/analysis/TimestampLiteral.java:

Line 84: public int compareTo(LiteralExpr o) {
> I think so, because partition pruning uses the backend. In particular, some
I am not so sure about that. If this Literal implements compareTo, it sends the
wrong message that I can use it in the FE to perform static partitioning,
which, as we know, is not true because parsing is performed in the backend.

http://gerrit.cloudera.org:8080/#/c/1621/3/fe/src/main/java/com/cloudera/impala/planner/HdfsPartitionPruner.java
File fe/src/main/java/com/cloudera/impala/planner/HdfsPartitionPruner.java:

Line 182: slot.getType().isTimestamp() || bindingExpr.getType().isTimestamp()
> It looks to me like BinaryPredicates can compare different types. If that's
But why do you care about the type of the binding expr?

http://gerrit.cloudera.org:8080/#/c/1621/3/testdata/workloads/functional-query/queries/QueryTest/partition-col-types.test
File testdata/workloads/functional-query/queries/QueryTest/partition-col-types.test:

Line 60: timestamp_col=__HIVE_DEFAULT_PARTITION__
> Hive does not allow the partition to be added.
My main point is that I don't believe we should convert an invalid timestamp
into a NULL and create the partition. I believe an error should be thrown in a
case like this.
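
To make the suggested behavior concrete, here is a minimal sketch of rejecting
a bad partition value instead of mapping it to NULL. It is not Impala code:
parse_timestamp_partition_value, InvalidPartitionValueError, and the accepted
formats are all hypothetical names and assumptions chosen for illustration.
The 1400 lower bound comes from the quoted remark about timestamps before
1400AD.

```python
from datetime import datetime

# Lower bound taken from the quoted remark about timestamps before 1400AD.
MIN_SUPPORTED_YEAR = 1400

# Assumed input formats; the second one allows fractional seconds.
FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d %H:%M:%S.%f")


class InvalidPartitionValueError(ValueError):
    """Hypothetical error for a partition value that cannot be used."""


def parse_timestamp_partition_value(raw):
    """Return the parsed timestamp, or raise instead of falling back to NULL."""
    for fmt in FORMATS:
        try:
            parsed = datetime.strptime(raw, fmt)
        except ValueError:
            continue  # try the next assumed format
        if parsed.year < MIN_SUPPORTED_YEAR:
            raise InvalidPartitionValueError(
                "timestamp out of supported range: %r" % raw)
        return parsed
    raise InvalidPartitionValueError("not a valid timestamp: %r" % raw)
```

With this shape, a DDL that passes a bogus value fails loudly at the point
where the partition would be created, which matches how a bogus value for any
other partition column type would be expected to behave.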

http://gerrit.cloudera.org:8080/#/c/1621/3/tests/metadata/test_recover_partitions.py
File tests/metadata/test_recover_partitions.py:

Line 196: 1987-05-29 15:45:44.6
> Which one? The two paths in the comment are intentionally different.
There are two paths mentioned in L188-189. The one used here doesn't match
either of them. Is this intentional? Maybe I am missing something. Too many
zeroes :P

--
To view, visit http://gerrit.cloudera.org:8080/1621
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icad7dcdc1b199cce9483dc414072bbe24efd625c
Gerrit-PatchSet: 3
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Jim Apple
Gerrit-Reviewer: Dimitris Tsirogiannis
Gerrit-Reviewer: Jim Apple
Gerrit-HasComments: Yes
