Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/14291 )

Change subject: IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2
......................................................................


Patch Set 10:

(10 comments)

Thanks for making the changes. A few more comments:

http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-parser.h
File be/src/runtime/datetime-iso-sql-format-parser.h:

http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-parser.h@81
PS10, Line 81: As a side effect moves '*format' to the next character in the
             :   // format.
It doesn't move *format to the next character, it moves it to the last character of the escape sequence. If *format doesn't point at an escape sequence, *format is not changed.

Maybe something like this:
"
If '*format' points at the beginning of an escape sequence, '*format' is moved to the last character of the escape sequence. Otherwise, '*format' is not changed.
"


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-parser.cc
File be/src/runtime/datetime-iso-sql-format-parser.cc:

http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-parser.cc@57
PS10, Line 57: ==
'>=' might be safer to use here


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-parser.cc@251
PS10, Line 251:     // If we reached the end of input or the end of token sequence, we can return.
              :     if (*current_pos >= end_pos || *current_tok_idx >= dt_ctx.toks.size()) {
              :       return (*current_pos >= end_pos && *current_tok_idx >= dt_ctx.toks.size());
              :     }
What if we reached the end of input but dt_ctx.toks still contains some empty TEXT tokens?

select cast('1985-12-09-' as date format 'YYYY-MM-DD-""');

I think this corner-case should be handled here, instead of just returning false.

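One possible shape for that check, as a minimal standalone sketch rather than a drop-in change. The names Tok, TokType, OnlyEmptyTextToksRemain and CanFinishParsing are hypothetical stand-ins and are not taken from the patch; the sketch assumes a TEXT token stores its literal content, so that an empty TEXT token (such as "") matches zero input characters.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the token types in the patch.
enum class TokType { YEAR, MONTH, DAY, SEPARATOR, TEXT };

struct Tok {
  TokType type;
  std::string text;  // Literal content; only meaningful for SEPARATOR and TEXT.
};

// Returns true if every token from 'idx' onwards is a TEXT token with empty
// content, i.e. a token that can match zero characters of input. Also returns
// true when no tokens are left at all.
bool OnlyEmptyTextToksRemain(const std::vector<Tok>& toks, std::size_t idx) {
  for (; idx < toks.size(); ++idx) {
    if (toks[idx].type != TokType::TEXT || !toks[idx].text.empty()) return false;
  }
  return true;
}

// Value to return once the end of the input or the end of the token sequence
// has been reached. Succeeds only if the input is fully consumed and the
// remaining tokens (if any) cannot consume anything.
bool CanFinishParsing(bool input_exhausted, const std::vector<Tok>& toks,
    std::size_t idx) {
  if (!input_exhausted) return false;  // Leftover input with no tokens to match it.
  return OnlyEmptyTextToksRemain(toks, idx);
}
```

With something like this, the format 'YYYY-MM-DD-""' would accept '1985-12-09-', while a format that still has a non-empty token left at the end of input would keep failing as it does now.
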

http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.h
File be/src/runtime/datetime-iso-sql-format-tokenizer.h:

http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.h@91
PS10, Line 91: function string functions


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.h@128
PS10, Line 128:   bool IsStartOfTextToken(const char* current_pos) const;
This should probably be a static function instead of const.


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.h@137
PS10, Line 137: start_str
str_start, here and elsewhere in the comment.


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.h@141
PS10, Line 141:   const char* FindEndOfTextToken(const char* str_start, const char* str_end,
              :       bool is_escaped);
This should be a static function too.


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.cc
File be/src/runtime/datetime-iso-sql-format-tokenizer.cc:

http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@124
PS10, Line 124:   if (token->second.type == FX_MODIFIER) {
              :     if (used_tokens_.size() > 0) return MISPLACED_FX_MODIFIER_ERROR;
              :     dt_ctx_->fx_modifier = true;
              :     *current_pos += curr_token_size;
              :     return SUCCESS;
              :   }
              :   if (token->second.type == FM_MODIFIER) {
              :     fm_modifier_active_ = true;
              :     *current_pos += curr_token_size;
              :     return SUCCESS;
              :   }
This allows weird format strings too, e.g.:
'FXFMFMFXYYYY-MM-DD'

Probably these should return an error.

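To make the suggestion concrete, here is a minimal standalone sketch of one rule set that would reject such strings. It is an assumption about the desired behavior, not something the patch or the standard is quoted as requiring: FX is accepted only as the very first token and only once, and FM is rejected when a previous FM is still pending. ModifierCheck and CheckLeadingModifiers are hypothetical names, and the sketch only looks at the run of modifiers at the start of the format and only at upper-case spellings.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result codes, not the error codes used in the patch.
enum class ModifierCheck { OK, MISPLACED_FX, DUPLICATE_FM };

// Scans the run of FX/FM modifiers at the start of 'format' and reports the
// first violation of the assumed rules:
//   - FX may only appear at position 0 (which also rules out a second FX).
//   - FM may not directly follow another FM that has not been consumed yet.
ModifierCheck CheckLeadingModifiers(const std::string& format) {
  bool fm_pending = false;
  std::size_t pos = 0;
  while (pos + 2 <= format.size()) {
    const std::string tok = format.substr(pos, 2);
    if (tok == "FX") {
      if (pos != 0) return ModifierCheck::MISPLACED_FX;
    } else if (tok == "FM") {
      if (fm_pending) return ModifierCheck::DUPLICATE_FM;
      fm_pending = true;
    } else {
      break;  // First non-modifier token; the leading run is over.
    }
    pos += 2;
  }
  return ModifierCheck::OK;
}
```

Under these rules 'FXFMYYYY-MM-DD' is still accepted, while 'FXFMFMFXYYYY-MM-DD', 'FMFXYYYY' and 'FXFXYYYY' are rejected. In the tokenizer itself the same effect could probably be had by checking fm_modifier_active_ and dt_ctx_->fx_modifier before setting them.
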

http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@251
PS10, Line 251:   DCHECK(str_begin < str_end);
nit: DCHECK(str_begin <= *current_pos && *current_pos < str_end);


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@254
PS10, Line 254: (is_escaped)
nit: no need to put is_escaped inside parentheses.



-- 
To view, visit http://gerrit.cloudera.org:8080/14291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I30d2f6656054371476aaa8bd0d51f572b9369855
Gerrit-Change-Number: 14291
Gerrit-PatchSet: 10
Gerrit-Owner: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Attila Jeges <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Comment-Date: Fri, 18 Oct 2019 12:05:26 +0000
Gerrit-HasComments: Yes
