Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12267 )

Change subject: IMPALA-4018 Part1: Add FORMAT clause in CAST()
......................................................................


Patch Set 3:

(1 comment)

The key question here is the syntax of the format string. It raises several 
considerations.

First, does our syntax here have to be identical to that of Hive? As I 
understand it, Impala can write view definitions to HMS that Hive should be 
able to read, and vice versa. If we used a different format string than Hive, 
would we know which format to use for a given view?

The other consideration is that we'd like to be ISO SQL compliant. The SQL 
formats are similar to, but distinct from, the Java 8 date/time formats. If we 
go the ISO SQL route, we'd have to translate the SQL format into the Impala 
(that is, Java 8) format.
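
To make the translation step concrete, here is a rough sketch of mapping a 
handful of SQL-style tokens onto a Java 8 DateTimeFormatter pattern. The class 
name SqlFormatTranslator, the method toJavaPattern, and the token subset are 
all assumptions made for illustration; this is not existing Impala code and it 
covers only a small part of the ISO SQL template language.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Impala: translates a small subset of
// SQL-style datetime format tokens into a java.time pattern string.
final class SqlFormatTranslator {
  // Insertion-ordered so that longer tokens (YYYY, HH24) are tried first.
  private static final Map<String, String> TOKENS = new LinkedHashMap<>();
  static {
    TOKENS.put("YYYY", "yyyy"); // four-digit year
    TOKENS.put("HH24", "HH");   // hour of day, 00-23
    TOKENS.put("MM", "MM");     // month of year
    TOKENS.put("DD", "dd");     // day of month
    TOKENS.put("MI", "mm");     // minute of hour
    TOKENS.put("SS", "ss");     // second of minute
  }

  // Separators that mean the same thing in both pattern languages.
  private static final String SEPARATORS = "-/:., ";

  static String toJavaPattern(String sqlFormat) {
    StringBuilder out = new StringBuilder();
    int i = 0;
    outer:
    while (i < sqlFormat.length()) {
      for (Map.Entry<String, String> e : TOKENS.entrySet()) {
        String token = e.getKey();
        // Case-insensitive match of the token at position i.
        if (sqlFormat.regionMatches(true, i, token, 0, token.length())) {
          out.append(e.getValue());
          i += token.length();
          continue outer;
        }
      }
      char c = sqlFormat.charAt(i);
      if (SEPARATORS.indexOf(c) < 0) {
        // Reject anything this sketch does not understand rather than guess.
        throw new IllegalArgumentException(
            "Unsupported format character '" + c + "' at position " + i);
      }
      out.append(c);
      i++;
    }
    return out.toString();
  }
}
```

With this sketch, toJavaPattern("YYYY-MM-DD HH24:MI:SS") yields a pattern that 
DateTimeFormatter.ofPattern() accepts. A real implementation would need the 
full token set and an answer to the view-compatibility question above.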

Finally, if we use ISO SQL formats in CAST, we should also create new functions 
to do the same tasks using the SQL format strings. (These would be in addition 
to the existing Hive-compatible functions.) Unfortunately, unless Hive adds the 
same functions, these will break view compatibility if used in views.

http://gerrit.cloudera.org:8080/#/c/12267/2/fe/src/main/java/org/apache/impala/analysis/CastExpr.java
File fe/src/main/java/org/apache/impala/analysis/CastExpr.java:

http://gerrit.cloudera.org:8080/#/c/12267/2/fe/src/main/java/org/apache/impala/analysis/CastExpr.java@220
PS2, Line 220:     }
> It would also be doable. I know because I went for that approach initially
Probably doable. As you noted, we load functions on startup. Cast functions are 
declared in CastExpr.initBuiltins(). You would have to specially handle the 
String --> Timestamp and Timestamp --> String functions. The code already has a 
number of type-specific rules.

Then, when analyzing the cast, the FORMAT string would be moved into the third 
argument position in the AST. From here on, the normal function mechanism 
should take you the rest of the way.
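
As a toy illustration of that rewrite (the types below are stand-ins invented 
for this comment, not the real CastExpr or Expr classes), the analysis step 
could turn the parsed cast into a positional argument list, appending the 
format only when one was given:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch; names and shapes are assumptions, not Impala code.
final class CastRewriteSketch {
  /** Toy stand-in for a parsed CAST(operand AS targetType [FORMAT 'fmt']). */
  static final class ParsedCast {
    final String operand;
    final String targetType;
    final Optional<String> format;

    ParsedCast(String operand, String targetType, String formatOrNull) {
      this.operand = operand;
      this.targetType = targetType;
      this.format = Optional.ofNullable(formatOrNull);
    }
  }

  /**
   * Builds the positional arguments handed to the function machinery:
   * operand first, target type second, and the format literal third
   * when the FORMAT clause was present.
   */
  static List<String> toFunctionArgs(ParsedCast cast) {
    List<String> args = new ArrayList<>();
    args.add(cast.operand);
    args.add(cast.targetType);
    cast.format.ifPresent(f -> args.add("'" + f + "'"));
    return args;
  }
}
```

A cast without FORMAT produces two arguments and a cast with FORMAT produces 
three, so everything downstream only needs to look at the argument count.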

You would need corresponding changes on the BE. Basically, if the format is 
null, you'd treat the call as a two-arg cast.

In fact, you may even be able to register both two- and three-arg overloads of 
the function and choose between them depending on whether a FORMAT string 
exists.
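
Roughly what I have in mind for the overload idea, as a self-contained sketch. 
The registry, the function name "casttotimestamp", and the BE symbol strings 
are made up for illustration and do not reflect how initBuiltins() actually 
registers functions:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of arity-based overload selection; not Impala code.
final class CastOverloadSketch {
  // Keyed by "name/arity"; values stand in for the BE symbols to call.
  private static final Map<String, String> REGISTRY = new HashMap<>();
  static {
    REGISTRY.put("casttotimestamp/2", "CastToTimestamp(value, type)");
    REGISTRY.put("casttotimestamp/3", "CastToTimestamp(value, type, format)");
  }

  /** Picks the three-arg overload only when a FORMAT string is present. */
  static String resolve(String fnName, String formatOrNull) {
    int arity = (formatOrNull == null) ? 2 : 3;
    String key = fnName + "/" + arity;
    String impl = REGISTRY.get(key);
    if (impl == null) {
      throw new IllegalArgumentException("No overload registered for " + key);
    }
    return impl;
  }
}
```

With two overloads registered, the BE would not need a null check on the 
format argument, since the two-arg path never sees one.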



--
To view, visit http://gerrit.cloudera.org:8080/12267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia514aaa9e8f5487d396587d5ed24c7348a492697
Gerrit-Change-Number: 12267
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Attila Jeges <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Greg Rahn <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Paul Rogers <[email protected]>
Gerrit-Comment-Date: Fri, 01 Feb 2019 06:10:44 +0000
Gerrit-HasComments: Yes
