Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/14291 )

Change subject: IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2
......................................................................


Patch Set 12:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/14291/12//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14291/12//COMMIT_MSG@24
PS12, Line 24: Using this the value of a token can be
             : shorter than the max length if followed by a separator.
Please clarify that this is about datetime to string conversion.


http://gerrit.cloudera.org:8080/#/c/14291/12/be/src/runtime/datetime-iso-sql-format-parser.h
File be/src/runtime/datetime-iso-sql-format-parser.h:

http://gerrit.cloudera.org:8080/#/c/14291/12/be/src/runtime/datetime-iso-sql-format-parser.h@81
PS12, Line 81: a
nit: the


http://gerrit.cloudera.org:8080/#/c/14291/12/be/src/runtime/datetime-iso-sql-format-parser.cc
File be/src/runtime/datetime-iso-sql-format-parser.cc:

http://gerrit.cloudera.org:8080/#/c/14291/12/be/src/runtime/datetime-iso-sql-format-parser.cc@238
PS12, Line 238: DCHECK(current_tok_idx != nullptr && *current_tok_idx < dt_ctx.toks.size());
Add && dt_ctx.toks[*current_tok_idx].type == SEPARATOR to the DCHECK condition.


http://gerrit.cloudera.org:8080/#/c/14291/12/be/src/runtime/datetime-iso-sql-format-tokenizer.cc
File be/src/runtime/datetime-iso-sql-format-tokenizer.cc:

http://gerrit.cloudera.org:8080/#/c/14291/12/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@113
PS12, Line 113:   if (IsStartOfTextToken(*current_pos)) {
              :     return ProcessTextToken(current_pos, str_begin, str_end);
              :   }
What if a text token is preceded by an FM modifier? I did some testing and it looks like in a format string like 'FXYYYY-MM-FM"text"DD' the FM modifier applies to the DD token instead of the "text" token. I think we need to do something here like L141-144.

ProcessSeparators() has a similar problem: in 'FXYYYY-MM-FM-DD', FM applies to the DD token instead of the '-' separator.

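
To make the FM concern concrete, here is a minimal, self-contained sketch. It is not the Impala tokenizer: Token, TokenType and Tokenize() are hypothetical names made up for illustration, the FX modifier is left out, and a datetime token is simply a run of identical letters. It assumes the intended behavior is that FM affects only the token directly after it, so the pending flag is handed to the next token of any type and then cleared.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified token model; not the types used in the patch.
enum class TokenType { DATETIME, SEPARATOR, TEXT };

struct Token {
  TokenType type;
  std::string value;
  bool fm_modifier;  // true if an FM modifier directly preceded this token
};

// Splits a format string into tokens. A pending FM flag is attached to the
// very next token, whatever its type, and is then cleared. This way it cannot
// leak past a text token or a separator to a later datetime token.
std::vector<Token> Tokenize(const std::string& fmt) {
  std::vector<Token> toks;
  bool pending_fm = false;
  std::size_t pos = 0;
  while (pos < fmt.size()) {
    if (fmt.compare(pos, 2, "FM") == 0) {
      pending_fm = true;
      pos += 2;
      continue;
    }
    TokenType type;
    std::size_t end;
    if (fmt[pos] == '"') {
      // Text token: everything up to and including the closing quote
      // (or the end of the string if the quote is never closed).
      std::size_t close = fmt.find('"', pos + 1);
      end = (close == std::string::npos) ? fmt.size() : close + 1;
      type = TokenType::TEXT;
    } else if (std::isalnum(static_cast<unsigned char>(fmt[pos]))) {
      // Datetime token: a run of the same character, e.g. YYYY, MM, DD.
      end = pos;
      while (end < fmt.size() && fmt[end] == fmt[pos]) ++end;
      type = TokenType::DATETIME;
    } else {
      // Separator: a single non-alphanumeric character.
      end = pos + 1;
      type = TokenType::SEPARATOR;
    }
    toks.push_back(Token{type, fmt.substr(pos, end - pos), pending_fm});
    pending_fm = false;  // FM is consumed by the token right after it.
    pos = end;
  }
  return toks;
}
```

In this sketch, Tokenize("YYYY-MM-FM\"text\"DD") flags the "text" token and leaves DD unflagged, and Tokenize("YYYY-MM-FM-DD") flags the '-' separator rather than DD. Whether FM in front of a text token or separator should be a no-op or an error is a separate decision that the sketch does not make.
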

http://gerrit.cloudera.org:8080/#/c/14291/12/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@124
PS12, Line 124:   if (token->second.type == FX_MODIFIER) {
              :     if (used_tokens_.size() > 0 || dt_ctx_->fx_modifier) {
              :       return MISPLACED_FX_MODIFIER_ERROR;
              :     }
              :     dt_ctx_->fx_modifier = true;
              :     *current_pos += curr_token_size;
              :     return SUCCESS;
              :   }
This still allows format strings like:
'--FXYYYY-MM-DD' or
'"text"FXYYYY-MM-DD'

Maybe it would be easier to parse the optional FX modifier before calling ProcessNextToken() in a loop.



-- 
To view, visit http://gerrit.cloudera.org:8080/14291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I30d2f6656054371476aaa8bd0d51f572b9369855
Gerrit-Change-Number: 14291
Gerrit-PatchSet: 12
Gerrit-Owner: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Attila Jeges <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Comment-Date: Wed, 30 Oct 2019 14:06:30 +0000
Gerrit-HasComments: Yes
