Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/8508 )
Change subject: IMPALA-5237: Support a quoted string in date/time format ...................................................................... Patch Set 3: (6 comments) http://gerrit.cloudera.org:8080/#/c/8508/3/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/8508/3/be/src/exprs/expr-test.cc@5746 PS3, Line 5746: TestValue("unix_timestamp('\\'Epoch time: \\'1970|01|01 00|00|00',\ I checked and Hive in fact returns NULL for this input. To be honest I was expecting a different result containing the given string plus the time in the desired format. (instead of NULL or zero) e.g. in this case: "Epoch time: 1970|01|01 00|00|00" Am I missing something? Shouldn't this be the expected result? http://gerrit.cloudera.org:8080/#/c/8508/3/be/src/exprs/expr-test.cc@5756 PS3, Line 5756: // 2) Not allowd unclosed quoted string nit: typo: allowed http://gerrit.cloudera.org:8080/#/c/8508/3/be/src/exprs/expr-test.cc@5764 PS3, Line 5764: TestValue("unix_timestamp('\\'Epoch time: \\'1970|01|01 00|00|00',\ There is one more thing I don't understand: Regardless of giving this format correctly or with a mismatch with the string literals we still expect the query to return the same result (bigint zero) without any of them returning an error. Shouldn't this one return an error? http://gerrit.cloudera.org:8080/#/c/8508/3/be/src/runtime/timestamp-parse-util.h File be/src/runtime/timestamp-parse-util.h: http://gerrit.cloudera.org:8080/#/c/8508/3/be/src/runtime/timestamp-parse-util.h@77 PS3, Line 77: /// IMPALA-5237: Support a quoted string in date/time format I have the feeling that copying the whole commit message here is a bit of an overkill. Would it be possible to wrap this up and have a few lines of description about the new functionality. In addition I think it's not necessary to mention here how Hive handles these cases. We should focus here on how Impala handles them. http://gerrit.cloudera.org:8080/#/c/8508/3/be/src/runtime/timestamp-parse-util.cc File be/src/runtime/timestamp-parse-util.cc: http://gerrit.cloudera.org:8080/#/c/8508/3/be/src/runtime/timestamp-parse-util.cc@163 PS3, Line 163: if (*str == '\'') { I would find this part of code a bit more self-descriptive if it was in the form of a series of function calls. It might be just myself but for me figuring out the intent behind these pointer manipulations is something that takes some time. This is what I mean: if(*str == '\'') { GetClosingQuote(str); /* +some error handling if closing not found */ SaveTokenWhaterver(string_literal, str); /* Or whatever parameter is essential to this function in order to calculate len, pos, and val*/ } What is your opinion? http://gerrit.cloudera.org:8080/#/c/8508/3/be/src/runtime/timestamp-parse-util.cc@173 PS3, Line 173: if (len == 0) len = 1; The naming of len, val and pos aren't that self-explanatory for me. Would it be possible to rename them for something more informative? -- To view, visit http://gerrit.cloudera.org:8080/8508 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie34055ac695748bcfb110bfa6ed5308f469ea178 Gerrit-Change-Number: 8508 Gerrit-PatchSet: 3 Gerrit-Owner: Kim Jin Chul <jinc...@gmail.com> Gerrit-Reviewer: Attila Jeges <atti...@cloudera.com> Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com> Gerrit-Reviewer: Kim Jin Chul <jinc...@gmail.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Comment-Date: Tue, 05 Dec 2017 14:52:43 +0000 Gerrit-HasComments: Yes