Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/14291 )

Change subject: IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2
......................................................................


Patch Set 11:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-parser.h
File be/src/runtime/datetime-iso-sql-format-parser.h:

http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-parser.h@81
PS10, Line 81:  If '*format' points at a beginning of an escape sequence,
             :   // '*forma
> It doesn't move *format to the next character, it moves it to the last char
Indeed, this better reflects what happens to 'format'. Done


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-parser.cc
File be/src/runtime/datetime-iso-sql-format-parser.cc:

http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-parser.cc@57
PS10, Line 57: >=
> '>=' might be safer to use here
Done


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-parser.cc@251
PS10, Line 251:  // If we reached the end of input or the end of token sequence, we can return.
              :   if (*current_pos >= end_pos || *current_tok_idx >= dt_ctx.toks.size()) {
              :     // Skip trailing empty text tokens in format.
              :
> What if we reached the end of input but dt_ctx.toks still contains some emp
Thanks for spotting this. Done


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.h
File be/src/runtime/datetime-iso-sql-format-tokenizer.h:

http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.h@91
PS10, Line 91: string
> string functions
Done


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.h@128
PS10, Line 128:  static bool IsStartOfTextToken(const char* current_pos)
> This should probably be a static function instead of const.
Done


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.h@137
PS10, Line 137: str_start
> str_start, here and elsewhere in the comment.
Done


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.h@141
PS10, Line 141:   static const char* FindEndOfTextToken(const char* str_start, const char* str_end,
              :       bool is_escaped);
> This should be a static function too.
Done


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.cc
File be/src/runtime/datetime-iso-sql-format-tokenizer.cc:

http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@124
PS10, Line 124:       if (token->second.type == FX_MODIFIER) {
              :         if (used_tokens_.size() > 0 || dt_ctx_->fx_modifier) {
              :           return MISPLACED_FX_MODIFIER_ERROR;
              :         }
              :         dt_ctx_->fx_modifier = true;
              :         *current_pos += curr_token_size;
              :         return SUCCESS;
              :       }
              :       if (token->second.type == FM_MODIFIER) {
              :         fm_modifier_active_ = true;
              :
> This allows weird format strings too, e.g.: 'FXFMFMFXYYYY-MM-DD'
Initially I thought that giving the user this freedom wouldn't hurt, but on a
second look I feel it would cause some ambiguity. Let me restrict FX so that
it can only be given once, at the very beginning.

I could also restrict FM to be given only once for a particular token, but that
doesn't seem that important. "FXFMFMYYYY-FMFMDD-MM" is not ambiguous at all, as
it is clear which token is FM-modified and which isn't.
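
As a rough illustration of the restriction described above (not the actual
tokenizer code), a standalone check might look like the sketch below. The
names FormatCheckResult, HasModifierAt and CheckModifierPlacement are
hypothetical; only MISPLACED_FX_MODIFIER_ERROR comes from the quoted patch.
The sketch assumes upper-case modifiers and ignores quoted text tokens, which
the real tokenizer would have to handle.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical result codes for this sketch; the real tokenizer has its own.
enum class FormatCheckResult { SUCCESS, MISPLACED_FX_MODIFIER_ERROR };

// Returns true if the two-character modifier 'mod' starts at 'pos' in 'format'.
static bool HasModifierAt(const std::string& format, size_t pos, const char* mod) {
  // std::string::compare() throws if 'pos' is past the end of the string.
  assert(pos <= format.size());
  return format.compare(pos, 2, mod) == 0;
}

// Accepts at most one FX, and only at the very beginning of the format.
// FM is deliberately not restricted: it may be repeated before any token.
// Assumption: plain character scan, no handling of quoted text or lower case.
FormatCheckResult CheckModifierPlacement(const std::string& format) {
  size_t pos = 0;
  // Consume the single FX that is allowed to lead the format string.
  if (HasModifierAt(format, pos, "FX")) pos += 2;
  // Any FX found after that point is misplaced.
  while (pos < format.size()) {
    if (HasModifierAt(format, pos, "FX")) {
      return FormatCheckResult::MISPLACED_FX_MODIFIER_ERROR;
    }
    ++pos;
  }
  return FormatCheckResult::SUCCESS;
}
```

Under these assumptions "FXFMFMYYYY-FMFMDD-MM" would be accepted, while
"FXFMFMFXYYYY-MM-DD" would be rejected because of the second FX.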


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@251
PS10, Line 251: DCHECK(str_begin != nullptr)
> nit: DCHECK(str_begin <= *current_pos && *current_pos < str_end);
Done


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@254
PS10, Line 254: = (**current
> nit: no need to put is_escaped inside parentheses.
Done



--
To view, visit http://gerrit.cloudera.org:8080/14291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I30d2f6656054371476aaa8bd0d51f572b9369855
Gerrit-Change-Number: 14291
Gerrit-PatchSet: 11
Gerrit-Owner: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Attila Jeges <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Comment-Date: Mon, 28 Oct 2019 13:43:48 +0000
Gerrit-HasComments: Yes