Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/14291 )
Change subject: IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2
......................................................................


Patch Set 14:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/14291/12//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14291/12//COMMIT_MSG@24
PS12, Line 24:
             : In a string to datetime conversion when using this modi
> Please clarify that this is about datetime to string conversion.
That is in fact for string to datetime conversion. Re-phrased this part of the comment to express the point better.


http://gerrit.cloudera.org:8080/#/c/14291/12/be/src/runtime/datetime-iso-sql-format-parser.h
File be/src/runtime/datetime-iso-sql-format-parser.h:

http://gerrit.cloudera.org:8080/#/c/14291/12/be/src/runtime/datetime-iso-sql-format-parser.h@81
PS12, Line 81: t
> nit: the
Done


http://gerrit.cloudera.org:8080/#/c/14291/12/be/src/runtime/datetime-iso-sql-format-parser.cc
File be/src/runtime/datetime-iso-sql-format-parser.cc:

http://gerrit.cloudera.org:8080/#/c/14291/12/be/src/runtime/datetime-iso-sql-format-parser.cc@238
PS12, Line 238: DCHECK(current_tok_idx != nullptr && *current_tok_idx < dt_ctx.toks.size());
> Add
Done


http://gerrit.cloudera.org:8080/#/c/14291/12/be/src/runtime/datetime-iso-sql-format-tokenizer.cc
File be/src/runtime/datetime-iso-sql-format-tokenizer.cc:

http://gerrit.cloudera.org:8080/#/c/14291/12/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@113
PS12, Line 113:   DCHECK(str_begin <= *current_pos && *current_pos < str_end);
              :   if (IsStartOfTextToken(*current_pos)) {
              :
> What if text token is preceded by an FM modifier?
Hmm, this is in fact an issue for text tokens. Although a text token behaves the same with or without FM/FX modifiers, it's misleading that FM is applied to the token after the text token.
For separators, I don't feel this is an issue. FM is applied to the next (non-separator) token in the format. In your example I think it's fine to apply FM to DD.

Made the adjustment for TEXT.


http://gerrit.cloudera.org:8080/#/c/14291/12/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@124
PS12, Line 124:   const auto token = VALID_TOKENS.find(token_to_probe);
              :   if (token != VALID_TOKENS.end()) {
              :     if (token->second.type == FX_MODIFIER) return MISPLACED_FX_MODIFIER_ERROR;
              :     if (token->second.type == FM_MODIFIER) {
              :       fm_modifier_active_ = true;
              :       *current_pos += curr_token_size;
              :       return SUCCESS;
              :     }
> 'FMFXYYYY-MM-DD' is another format string that should be rejected but curre
See above


http://gerrit.cloudera.org:8080/#/c/14291/12/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@124
PS12, Line 124:   const auto token = VALID_TOKENS.find(token_to_probe);
              :   if (token != VALID_TOKENS.end()) {
              :     if (token->second.type == FX_MODIFIER) return MISPLACED_FX_MODIFIER_ERROR;
              :     if (token->second.type == FM_MODIFIER) {
              :       fm_modifier_active_ = true;
              :       *current_pos += curr_token_size;
              :       return SUCCESS;
              :     }
> This still allows format strings like:
Agreed. To avoid confusion, FX tokenization should happen right before the loop. Done



--
To view, visit http://gerrit.cloudera.org:8080/14291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I30d2f6656054371476aaa8bd0d51f572b9369855
Gerrit-Change-Number: 14291
Gerrit-PatchSet: 14
Gerrit-Owner: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Attila Jeges <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Comment-Date: Thu, 31 Oct 2019 13:29:04 +0000
Gerrit-HasComments: Yes
