Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/14291 )
Change subject: IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2
......................................................................


Patch Set 11:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-parser.h
File be/src/runtime/datetime-iso-sql-format-parser.h:

http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-parser.h@81
PS10, Line 81: If '*format' points at a beginning of an escape sequence,
             : // '*forma
> It doesn't move *format to the next character, it moves it to the last char
Indeed, this better reflects what happens to 'format'.
Done


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-parser.cc
File be/src/runtime/datetime-iso-sql-format-parser.cc:

http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-parser.cc@57
PS10, Line 57: >=
> '>=' might be safer to use here
Done


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-parser.cc@251
PS10, Line 251: // If we reached the end of input or the end of token sequence, we can return.
              : if (*current_pos >= end_pos || *current_tok_idx >= dt_ctx.toks.size()) {
              :   // Skip trailing empty text tokens in format.
              :
> What if we reached the end of input but dt_ctx.toks still contains some emp
Thanks for spotting this.
Done


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.h
File be/src/runtime/datetime-iso-sql-format-tokenizer.h:

http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.h@91
PS10, Line 91: string
> string functions
Done


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.h@128
PS10, Line 128: static bool IsStartOfTextToken(const char* current_pos)
> This should probably be a static function instead of const.
Done


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.h@137
PS10, Line 137: str_start
> str_start, here and elsewhere in the comment.
Done


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.h@141
PS10, Line 141: static const char* FindEndOfTextToken(const char* str_start, const char* str_end,
              :     bool is_escaped);
> This should be a static function too.
Done


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.cc
File be/src/runtime/datetime-iso-sql-format-tokenizer.cc:

http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@124
PS10, Line 124: if (token->second.type == FX_MODIFIER) {
              :   if (used_tokens_.size() > 0 || dt_ctx_->fx_modifier) {
              :     return MISPLACED_FX_MODIFIER_ERROR;
              :   }
              :   dt_ctx_->fx_modifier = true;
              :   *current_pos += curr_token_size;
              :   return SUCCESS;
              : }
              : if (token->second.type == FM_MODIFIER) {
              :   fm_modifier_active_ = true;
              :
> This allows weird format strings too, e.g.: 'FXFMFMFXYYYY-MM-DD'
Initially I thought that giving the user this freedom wouldn't hurt, but on a second look I feel it would cause some ambiguity. Let me restrict FX so that it can only be given once, at the very beginning.
I could also restrict FM to be given only once for a particular token, but that doesn't seem as important. "FXFMFMYYYY-FMFMDD-MM" is not ambiguous at all, as it is clear which token is FM-modified and which isn't.


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@251
PS10, Line 251: DCHECK(str_begin != nullptr)
> nit: DCHECK(str_begin <= *current_pos && *current_pos < str_end);
Done


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@254
PS10, Line 254: = (**current
> nit: no need to put is_ecaped inside parentheses.
Done



--
To view, visit http://gerrit.cloudera.org:8080/14291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I30d2f6656054371476aaa8bd0d51f572b9369855
Gerrit-Change-Number: 14291
Gerrit-PatchSet: 11
Gerrit-Owner: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Attila Jeges <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Comment-Date: Mon, 28 Oct 2019 13:43:48 +0000
Gerrit-HasComments: Yes
