Matthew Jacobs has posted comments on this change. Change subject: IMPALA-5338: Fix Kudu timestamp column default values ......................................................................
Patch Set 2: (7 comments) http://gerrit.cloudera.org:8080/#/c/6936/2/be/src/exprs/timestamp-functions.h File be/src/exprs/timestamp-functions.h: Line 106: static TimestampVal TimestampFromUnixMicros(FunctionContext* context, > the other functions don't have a "Timestamp" prefix, leave it out here as w The difference is that this returns a TimestampVal, those return a StringVal. I was hoping to differentiate. Does that seem OK or would you still prefer no prefix? http://gerrit.cloudera.org:8080/#/c/6936/2/be/src/runtime/timestamp-value.inline.h File be/src/runtime/timestamp-value.inline.h: Line 46: tm temp_tm = boost::posix_time::to_tm(temp); > i pointed this out as a potential performance problem and asked you to leav The problem with that approach [1] is that (t - epoch) is a boost::time_duration and that class internally stores ticks which are nanosecond-resolution. Therefore you cannot represent our entire supported date range with a time_duration. i.e. dates will only work within 1677 to 2262, even though our supported range is 1400-9999. I spent a while looking for a way around this boost issue (see IMPALA-5357 which is the same underlying issue I think) because the stdlib functions are indeed expensive, but I couldn't find anything yet. Even newer versions of boost will still suffer the same issue: time_duration cannot represent as large durations as the gregorian dates we want to represent. That said, I will add a comment now that this does have perf impact and we should look for an alternative solution. I don't have perf measurements yet specifically for this code because this isn't the bottleneck at the moment, the partition/sort is -- we've been studying that so far. 1: https://stackoverflow.com/questions/4461586/how-do-i-convert-boostposix-timeptime-to-time-t http://gerrit.cloudera.org:8080/#/c/6936/2/fe/src/main/java/org/apache/impala/analysis/ColumnDef.java File fe/src/main/java/org/apache/impala/analysis/ColumnDef.java: Line 271: "cast to a default value for a TIMESTAMP column.", defaultValue_.toSql())); > i'd say "to a TIMESTAMP literal', just to be absolutely clear Done Line 274: > trailing Done Line 296: "Error converting timestamp default value: " + defaultValue_.toSql(), e); > so you're converting an internal exception to an analysis exception here? a Good point, though in this case I think the exception is actionable: the user specified an expression that isn't a valid Impala TIMESTAMP. How about I change the wording of this message? Or what would you prefer I do? http://gerrit.cloudera.org:8080/#/c/6936/2/fe/src/main/java/org/apache/impala/util/KuduUtil.java File fe/src/main/java/org/apache/impala/util/KuduUtil.java: Line 168: * Returns the actual value of the specified defaultValue literal. Checks that > i still don't understand what the return types could be (looks like T<>Lite Ok, hopefully I made this more clear http://gerrit.cloudera.org:8080/#/c/6936/2/tests/query_test/test_kudu.py File tests/query_test/test_kudu.py: Line 650: def test_timestamp_default_value(self, cursor): > can't you do this with a simple .test file? that would be easier to read an Good point, I'm not sure why this was originally done this way (this just mirrors the existing code) - I think this was added by Martin or Casey. Would you mind if I moved all of these to .test files in a separate change so this change doesn't get held up? I filed IMPALA-5406 -- To view, visit http://gerrit.cloudera.org:8080/6936 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I655910fb4805bb204a999627fa9f68e43ea8aaf2 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]> Gerrit-HasComments: Yes
