Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/12481 )
Change subject: IMPALA-7368: Add initial support for DATE type ...................................................................... Patch Set 3: (9 comments) http://gerrit.cloudera.org:8080/#/c/12481/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12481/3//COMMIT_MSG@22 PS3, Line 22: - from BIGINT, INT, SMALLINT, TINYINT to DATE. The source value is : interpreted as a number of days since the epoch. : - from DATE to BIGINT, INT, SMALLINT, TINYINT. The resulting : integer is the number of days since epoch. : - from DOUBLE, FLOAT, DECIMAL to DATE. The source value's : fractional part is ignored, the integer part is interpreted as a : number of days since epoch. : - from DATE to DOUBLE, FLOAT, DECIMAL. The resulting value is the : number of days since epoch. Hive does not support these casts (I checked with Hive 2.1.1), are you sure that it is useful to support them in Impala? My first impression is that these casts are more confusing than useful. http://gerrit.cloudera.org:8080/#/c/12481/3/be/src/udf/udf.h File be/src/udf/udf.h: http://gerrit.cloudera.org:8080/#/c/12481/3/be/src/udf/udf.h@586 PS3, Line 586: /// Date value represented as days since epoch 1970-01-01. The valid range and the fact that the value is interpreted as proleptic Gregorian could be also mentioned here. http://gerrit.cloudera.org:8080/#/c/12481/3/be/src/util/string-parser-test.cc File be/src/util/string-parser-test.cc: http://gerrit.cloudera.org:8080/#/c/12481/3/be/src/util/string-parser-test.cc@526 PS3, Line 526: TestDateValue("0000-01-01", DateValue(0, 1, 1), StringParser::PARSE_SUCCESS); : TestDateValue("9999-12-31", DateValue(9999, 12, 31), StringParser::PARSE_SUCCESS); Please add the "wrong side" of the edge values: -0001-12-31 (Hive had a bug with these kinds of values) 10000-01-01 http://gerrit.cloudera.org:8080/#/c/12481/3/be/src/util/string-parser-test.cc@531 PS3, Line 531: TestDateValue("2-11-10", invalid_date, StringParser::PARSE_FAILURE); Please add tests where month/day have the correct number of characters but the value is invalid, e.g 2018-00-10, 2018-13-10, 2018-01-0, 2018-01-32, 2019-02-39 http://gerrit.cloudera.org:8080/#/c/12481/3/common/thrift/Exprs.thrift File common/thrift/Exprs.thrift: http://gerrit.cloudera.org:8080/#/c/12481/3/common/thrift/Exprs.thrift@56 PS3, Line 56: // string representation of date formatted as yyyy-MM-dd. : 1: required string value; Why do you use string instead of i32 to represent the date? TTimestampLiteral uses binary representation instead of string. http://gerrit.cloudera.org:8080/#/c/12481/3/fe/src/main/java/org/apache/impala/analysis/DateLiteral.java File fe/src/main/java/org/apache/impala/analysis/DateLiteral.java: http://gerrit.cloudera.org:8080/#/c/12481/3/fe/src/main/java/org/apache/impala/analysis/DateLiteral.java@32 PS3, Line 32: yyy Isn't it yyyy? http://gerrit.cloudera.org:8080/#/c/12481/3/fe/src/main/java/org/apache/impala/analysis/StringLiteral.java File fe/src/main/java/org/apache/impala/analysis/StringLiteral.java: http://gerrit.cloudera.org:8080/#/c/12481/3/fe/src/main/java/org/apache/impala/analysis/StringLiteral.java@155 PS3, Line 155: isDateType By implementing the Date type the name of this function became very misleading. Can you rename it to something like isDateOrTimeType()? http://gerrit.cloudera.org:8080/#/c/12481/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java: http://gerrit.cloudera.org:8080/#/c/12481/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java@194 PS3, Line 194: AnalyzesOk("select cast (cast ('1970-10-10 10:00:00.123' as timestamp) as date)"); What will happen if some writes select cast('1970-10-10 10:00:00.123' as date) ? Will it return NULL without warning? http://gerrit.cloudera.org:8080/#/c/12481/3/fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java File fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java: http://gerrit.cloudera.org:8080/#/c/12481/3/fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java@368 PS3, Line 368: assertToSqlWithImplicitCasts(ctx, "select * from functional.alltypes, " : + "functional.date_tbl where timestamp_col = date_col", : "SELECT * FROM functional.alltypes, functional.date_tbl " : + "WHERE timestamp_col = CAST(date_col AS TIMESTAMP)"); I saw some weirdness around this in Hive, it seems to convert everything to string: select cast("1970-01-01" as timestamp) > cast("1970-01-01" as date); return True What happened is that the appended 00:00:00 part made the timestamp version larger. I prefer to compare them as timestamps, but this difference could be documented somewhere. -- To view, visit http://gerrit.cloudera.org:8080/12481 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iea8155ef09557e0afa2f8b2d0b2dc9d0896dc30f Gerrit-Change-Number: 12481 Gerrit-PatchSet: 3 Gerrit-Owner: Attila Jeges <atti...@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Comment-Date: Fri, 15 Feb 2019 17:03:12 +0000 Gerrit-HasComments: Yes