Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/14291 )
Change subject: IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2 ...................................................................... Patch Set 6: (11 comments) http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-parser.cc File be/src/runtime/datetime-iso-sql-format-parser.cc: http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-parser.cc@225 PS6, Line 225: switch (**format) { : case 'b': return '\b'; : case 'n': return '\n'; : case 'r': return '\r'; : case 't': return '\t'; : } Please check what escape sequences are supported by ANSI SQL standard. (same as datetime-parser-common.cc:226) http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-tokenizer.cc File be/src/runtime/datetime-iso-sql-format-tokenizer.cc: http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@109 PS6, Line 109: *current_pos != nullptr nit: current_pos != nullptr && *current_pos != nullptr http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@112 PS6, Line 112: *current_pos < str_end nit: str_begin <= *current_pos && *current_pos < str_end http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@240 PS6, Line 240: strncmp(current_pos, "\\\"", 2) == 0) This is safe only if 'current_pos' is zero-terminated, but I don't think it is. Probably you should check first that there are at least 2 characters between current_pos and end_pos. http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@273 PS6, Line 273: strncmp(*current_pos, "\\\"", 2) same as L240. http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@277 PS6, Line 277: strncmp(*current_pos, "\\\\\\\"", 4) same as L240. http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@282 PS6, Line 282: strncmp(*current_pos, "\\\\", 2) same as L240. http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-parser-common.cc File be/src/runtime/datetime-parser-common.cc: http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-parser-common.cc@201 PS6, Line 201: continue nit: can be removed, here and below. http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-parser-common.cc@202 PS6, Line 202: } else if (!tok.is_double_escaped && strncmp(text_it, "\\\"", 2) == 0) { : result.append("\""); : ++text_it; : continue; : } else if (strncmp(text_it, "\\\\", 2) == 0) { : result.append("\\"); : ++text_it; : continue; : } else if (strncmp(text_it, "\\b", 2) == 0) { : result.append("\b"); : ++text_it; : continue; : } else if (strncmp(text_it, "\\n", 2) == 0) { : result.append("\n"); : ++text_it; : continue; : } else if (strncmp(text_it, "\\r", 2) == 0) { : result.append("\r"); : ++text_it; : continue; : } else if (strncmp(text_it, "\\t", 2) == 0) { : result.append("\t"); : ++text_it; : continue; : } Are these the only escape sequences supported by ANSI SQL? C, for instance supports a bunch of others too: https://en.wikipedia.org/wiki/Escape_sequences_in_C http://gerrit.cloudera.org:8080/#/c/14291/6/tests/query_test/test_cast_with_format.py File tests/query_test/test_cast_with_format.py: http://gerrit.cloudera.org:8080/#/c/14291/6/tests/query_test/test_cast_with_format.py@663 PS6, Line 663: # Format part is surrounded by double quotes so the quotes indicating the start and : # end of the text token has to be escaped. I did some testing around when the format string is surrounded by ' and the text token is surrounded by \" Date to string conversion: >> select cast(date"1985-12-08" as string format 'YYYY-MM-DD \"X\"'); returns: 1985-12-08 X I'm not sure if this should work or return an error instead. also if only the opening " to the text token is escaped: >> select cast(date"1985-12-08" as string format 'YYYY-MM-DD \"X"'); returns 1985-12-08 This should probably return an error. On the other hand when doing string to date conversion: >> select cast("1985-12-08 X" as date format 'YYYY-MM-DD \"X\"'); returns: 1985-12-08 >> select cast("1985-12-08 X" as date format 'YYYY-MM-DD \"X"'); returns an error. >> select cast("1985-12-08" as date format 'YYYY-MM-DD \"X"'); returns an error too. http://gerrit.cloudera.org:8080/#/c/14291/6/tests/query_test/test_cast_with_format.py@768 PS6, Line 768: covered surrounded -- To view, visit http://gerrit.cloudera.org:8080/14291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d2f6656054371476aaa8bd0d51f572b9369855 Gerrit-Change-Number: 14291 Gerrit-PatchSet: 6 Gerrit-Owner: Gabor Kaszab <gaborkas...@cloudera.com> Gerrit-Reviewer: Attila Jeges <atti...@cloudera.com> Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Comment-Date: Tue, 15 Oct 2019 15:42:17 +0000 Gerrit-HasComments: Yes