Attila Jeges has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14291 )

Change subject: IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2
......................................................................


Patch Set 10:

(10 comments)

Thanks for making the changes. A few more comments:

http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-parser.h
File be/src/runtime/datetime-iso-sql-format-parser.h:

http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-parser.h@81
PS10, Line 81:  As a side effect moves '*format' to the next character in the
             :   // format.
It doesn't move *format to the next character, it moves it to the last 
character of the escape sequence. If *format doesn't point at an escape 
sequence, *format is not changed.

Maybe something like this:
"
If '*format' points at the beginning of an escape sequence, '*format' is moved to 
the last character of the escape sequence. Otherwise, '*format' is not changed.
"
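To make the proposed contract concrete, here is a minimal sketch. The helper name MoveToEndOfEscapeSequence, the 'format_end' parameter, and the definition of an escape sequence (a backslash followed by exactly one character) are assumptions for illustration only, not the function under review.

```cpp
#include <cassert>

// Hypothetical helper, not the function in the patch. Assumes an escape
// sequence is a backslash followed by exactly one character.
// If '*format' points at the beginning of an escape sequence, '*format' is
// moved to the last character of the sequence and true is returned.
// Otherwise '*format' is not changed and false is returned.
bool MoveToEndOfEscapeSequence(const char** format, const char* format_end) {
  assert(format != nullptr && *format != nullptr);
  assert(*format < format_end);
  const char* pos = *format;
  // Not a backslash, or a lone backslash at the very end: nothing to skip.
  if (*pos != '\\' || pos + 1 >= format_end) return false;
  *format = pos + 1;
  return true;
}
```

With this contract the caller can always advance by one character afterwards, whether or not an escape sequence was found.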


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-parser.cc
File be/src/runtime/datetime-iso-sql-format-parser.cc:

http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-parser.cc@57
PS10, Line 57: ==
'>=' might be safer to use here


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-parser.cc@251
PS10, Line 251:  // If we reached the end of input or the end of token sequence, we can return.
              :   if (*current_pos >= end_pos || *current_tok_idx >= dt_ctx.toks.size()) {
              :     return (*current_pos >= end_pos && *current_tok_idx >= dt_ctx.toks.size());
              :   }
What if we reached the end of input but dt_ctx.toks still contains some empty 
TEXT tokens?

select cast('1985-12-09-' as date format 'YYYY-MM-DD-""');

I think this corner-case should be handled here, instead of just returning 
false.
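One possible shape for handling this, as a rough sketch only: when the input is exhausted, accept the parse if every remaining token is an empty TEXT token. The Token struct, the TokenType enum, and both function names below are simplified stand-ins that I made up for the example; they are not the types in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-ins for the real token types.
enum TokenType { YEAR, MONTH, DAY, SEPARATOR, TEXT };
struct Token {
  TokenType type;
  int len;  // length of the token's content; 0 for an empty TEXT token
};

// Returns true if every token from 'tok_idx' onwards is an empty TEXT token.
// Also returns true when no tokens are left.
bool OnlyEmptyTextTokensRemain(const std::vector<Token>& toks, size_t tok_idx) {
  for (size_t i = tok_idx; i < toks.size(); ++i) {
    if (toks[i].type != TEXT || toks[i].len != 0) return false;
  }
  return true;
}

// Returns false if parsing should continue. Otherwise returns true and stores
// in '*result' whether the input matched the token sequence.
bool CheckEndOfParse(const char* current_pos, const char* end_pos,
    const std::vector<Token>& toks, size_t tok_idx, bool* result) {
  assert(result != nullptr);
  const bool input_done = current_pos >= end_pos;
  const bool toks_done = tok_idx >= toks.size();
  if (!input_done && !toks_done) return false;
  // Leftover tokens are acceptable only if they cannot consume any input.
  *result = input_done && OnlyEmptyTextTokensRemain(toks, tok_idx);
  return true;
}
```

Under these assumptions the query above would succeed, because the only token left after the final '-' is the empty TEXT token.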


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.h
File be/src/runtime/datetime-iso-sql-format-tokenizer.h:

http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.h@91
PS10, Line 91: function
string functions


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.h@128
PS10, Line 128:  bool IsStartOfTextToken(const char* current_pos) const;
This should probably be a static function instead of const.


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.h@137
PS10, Line 137: start_str
str_start, here and elsewhere in the comment.


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.h@141
PS10, Line 141:   const char* FindEndOfTextToken(const char* str_start, const char* str_end,
              :       bool is_escaped);
This should be a static function too.
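To show what I mean by static here, a small sketch. The class name TokenizerSketch and the function body are assumptions (I am guessing that a text token starts with a double quote); only the signature is taken from the header.

```cpp
#include <cassert>

// Hypothetical class, used only to illustrate the static-vs-const point.
class TokenizerSketch {
 public:
  // The result depends only on the argument and on no member state, so the
  // function can be static rather than a const member function.
  static bool IsStartOfTextToken(const char* current_pos) {
    assert(current_pos != nullptr);
    // Assumption: a text token begins with a double quote.
    return *current_pos == '"';
  }
};
```

A static function can be called without a tokenizer instance, e.g. TokenizerSketch::IsStartOfTextToken(pos), and the declaration itself documents that no object state is read.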


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.cc
File be/src/runtime/datetime-iso-sql-format-tokenizer.cc:

http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@124
PS10, Line 124:       if (token->second.type == FX_MODIFIER) {
              :         if (used_tokens_.size() > 0) return MISPLACED_FX_MODIFIER_ERROR;
              :         dt_ctx_->fx_modifier = true;
              :         *current_pos += curr_token_size;
              :         return SUCCESS;
              :       }
              :       if (token->second.type == FM_MODIFIER) {
              :         fm_modifier_active_ = true;
              :         *current_pos += curr_token_size;
              :         return SUCCESS;
              :       }
This allows weird format strings too, e.g.: 'FXFMFMFXYYYY-MM-DD'
Probably these should return an error.
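A possible way to reject such strings, sketched over a simplified token stream. The rules are my assumption, not something taken from the spec: FX is accepted only as the very first item, and FM must be followed by a regular token before another FM may appear. DUPLICATE_FM_MODIFIER_ERROR, ValidateModifiers and the TokType enum are hypothetical names; only MISPLACED_FX_MODIFIER_ERROR exists in the patch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified token kinds; OTHER stands for any regular token.
enum TokType { FX_MODIFIER, FM_MODIFIER, OTHER };
// DUPLICATE_FM_MODIFIER_ERROR is a made-up error code for this sketch.
enum Result { SUCCESS, MISPLACED_FX_MODIFIER_ERROR, DUPLICATE_FM_MODIFIER_ERROR };

Result ValidateModifiers(const std::vector<TokType>& toks) {
  bool seen_any = false;   // a token or modifier was already processed
  bool fm_active = false;  // an FM modifier is still waiting for its token
  for (TokType t : toks) {
    if (t == FX_MODIFIER) {
      // FX is valid only at the very beginning of the format string.
      if (seen_any) return MISPLACED_FX_MODIFIER_ERROR;
    } else if (t == FM_MODIFIER) {
      // A second FM before any regular token is rejected.
      if (fm_active) return DUPLICATE_FM_MODIFIER_ERROR;
      fm_active = true;
    } else {
      fm_active = false;  // the regular token consumes the pending FM
    }
    seen_any = true;
  }
  return SUCCESS;
}

void ModifierExamples() {
  // 'FXFMFMFXYYYY' is rejected at the second FM.
  assert(ValidateModifiers({FX_MODIFIER, FM_MODIFIER, FM_MODIFIER, FX_MODIFIER, OTHER})
      == DUPLICATE_FM_MODIFIER_ERROR);
  // 'FXFMYYYY' is accepted.
  assert(ValidateModifiers({FX_MODIFIER, FM_MODIFIER, OTHER}) == SUCCESS);
  // 'FMFXYYYY' and 'FXFXYYYY' are rejected because FX is not first.
  assert(ValidateModifiers({FM_MODIFIER, FX_MODIFIER, OTHER}) == MISPLACED_FX_MODIFIER_ERROR);
  assert(ValidateModifiers({FX_MODIFIER, FX_MODIFIER, OTHER}) == MISPLACED_FX_MODIFIER_ERROR);
}
```

The sketch does not cover a trailing FM with no token after it; that would need its own decision.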


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@251
PS10, Line 251: DCHECK(str_begin < str_end);
nit: DCHECK(str_begin <= *current_pos && *current_pos < str_end);


http://gerrit.cloudera.org:8080/#/c/14291/10/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@254
PS10, Line 254: (is_escaped)
nit: no need to put is_escaped inside parentheses.



--
To view, visit http://gerrit.cloudera.org:8080/14291

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I30d2f6656054371476aaa8bd0d51f572b9369855
Gerrit-Change-Number: 14291
Gerrit-PatchSet: 10
Gerrit-Owner: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Attila Jeges <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Comment-Date: Fri, 18 Oct 2019 12:05:26 +0000
Gerrit-HasComments: Yes
