Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14291 )

Change subject: IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2
......................................................................


Patch Set 6:

(14 comments)

Thanks for taking another look!

http://gerrit.cloudera.org:8080/#/c/14291/7/be/src/runtime/datetime-iso-sql-format-parser.h
File be/src/runtime/datetime-iso-sql-format-parser.h:

http://gerrit.cloudera.org:8080/#/c/14291/7/be/src/runtime/datetime-iso-sql-format-parser.h@78
PS7, Line 78: format
> text token inside the format
Done


http://gerrit.cloudera.org:8080/#/c/14291/7/be/src/runtime/datetime-iso-sql-format-parser.h@81
PS7, Line 81: format
> text token
Done


http://gerrit.cloudera.org:8080/#/c/14291/7/be/src/runtime/datetime-iso-sql-format-parser.h@81
PS7, Line 81: and moves
            :   // '*format' to 'a'.
> I don't think part this is true.
Ideed, thx.
Done


http://gerrit.cloudera.org:8080/#/c/14291/7/be/src/runtime/datetime-iso-sql-format-parser.cc
File be/src/runtime/datetime-iso-sql-format-parser.cc:

http://gerrit.cloudera.org:8080/#/c/14291/7/be/src/runtime/datetime-iso-sql-format-parser.cc@48
PS7, Line 48:     if (tok->type == SEPARATOR && !dt_ctx.fx_modifier) {
            :       bool res = ProcessSeparatorSequence(&current_pos, end_pos, 
dt_ctx, &i);
            :       if (!res || current_pos == end_pos) return res;
            :       DCHECK(i < dt_ctx.toks.size());
            :       // Next token, following the separator sequence.
            :       tok = &dt_ctx.toks[i];
            :     }
> I'd prefer to handle strict separator matching (tok->type == SEPARATOR && d
Done


http://gerrit.cloudera.org:8080/#/c/14291/7/be/src/runtime/datetime-iso-sql-format-parser.cc@56
PS7, Line 56:     const char* token_end_pos = FindEndOfToken(current_pos, 
end_pos - current_pos, *tok);
            :     if (token_end_pos == nullptr) return false;
            :     int token_len = token_end_pos - current_pos;
            :
            :     if (dt_ctx.fx_modifier && !tok->fm_modifier && tok->type != 
TEXT &&
            :         token_len != tok->len) {
            :       return false;
            :     }
> for TEXT tokens, 'token_end_pos' and 'token_len' aren't set here. 'token_en
Done


http://gerrit.cloudera.org:8080/#/c/14291/7/be/src/runtime/datetime-iso-sql-format-parser.cc@185
PS7, Line 185: token_len == 1
> dt_ctx.fx_modifier && token_len == 1
Merged this with the loose separator handling. This case is no longer needed 
here.


http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-tokenizer.cc
File be/src/runtime/datetime-iso-sql-format-tokenizer.cc:

http://gerrit.cloudera.org:8080/#/c/14291/6/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@240
PS6, Line 240: strncmp(current_pos, "\\\"", 2) == 0)
> You're correct, it is null-terminated for iso-sql format strings, because t
I see. I wasn't aware that the format is not null-terminated for the 
SingleDateFormat path. Thanks for letting me know!
I'll add comments about this assumption on the IsoSqlFormat branch for both 
IsoSqlFormatTokenizer::dt_ctx_ and IsoSqlFormatParser::ParseDateTime().

Thanks again for spotting this!


http://gerrit.cloudera.org:8080/#/c/14291/7/be/src/runtime/datetime-parser-common.h
File be/src/runtime/datetime-parser-common.h:

http://gerrit.cloudera.org:8080/#/c/14291/7/be/src/runtime/datetime-parser-common.h@134
PS7, Line 134: then
> that
Done


http://gerrit.cloudera.org:8080/#/c/14291/6/tests/query_test/test_cast_with_format.py
File tests/query_test/test_cast_with_format.py:

http://gerrit.cloudera.org:8080/#/c/14291/6/tests/query_test/test_cast_with_format.py@738
PS6, Line 738: select cast("1985-\a\b\f\n\r\t\v\'12-09" as '''
             :         r'''date format 'YYYY-"\a\b\f\n\r\t\v\'"MM-DD')
> This test is about special escape sequences but non-special chars in the te
Isn't intentional. Remained from a previous approach I had implemented to 
handle special char. I'm removing the non-special chars from this test.
Done


http://gerrit.cloudera.org:8080/#/c/14291/7/tests/query_test/test_cast_with_format.py
File tests/query_test/test_cast_with_format.py:

http://gerrit.cloudera.org:8080/#/c/14291/7/tests/query_test/test_cast_with_format.py@782
PS7, Line 782: # Strict separator matching.
             :     result = self.client.execute("select cast('2001-03-02 
03:10:15' as timestamp format"
             :         "'FXYYYY MM-DD HH12:MI:SS')")
             :     assert result.data == ["NULL"]
             :
             :     result = self.client.execute("select cast('2001-03-03 
03:10:15' as timestamp format"
             :         "'FXYYYY-MM-DD HH12::MI:SS')")
             :     assert result.data == ["NULL"]
             :
             :     result = self.client.execute("select cast('2001-03-04    ' 
as timestamp format"
             :         "'FXYYYY-MM-DD ')")
             :     assert result.data == ["NULL"]
             :
             :     # Strict token length matching.
             :     result = self.client.execute("select cast('2001-3-05' as 
timestamp format "
             :         "'FXYYYY-MM-DD')")
             :     assert result.data == ["NULL"]
> Maybe add positive tests too for these.
L778 is a positive test for FX modifier. Does that cover what you ask for?


http://gerrit.cloudera.org:8080/#/c/14291/7/tests/query_test/test_cast_with_format.py@866
PS7, Line 866: doesn't
> don't
Done


http://gerrit.cloudera.org:8080/#/c/14291/7/tests/query_test/test_cast_with_format.py@872
PS7, Line 872:     result = self.client.execute("select cast(cast('2001-03-09' 
as date) "
             :         "as string format 'YYYY-FMMM-FMDD')")
             :     assert result.data == ["2001-3-9"]
> Maybe add a test were year token is FM-modified
Done


http://gerrit.cloudera.org:8080/#/c/14291/7/tests/query_test/test_cast_with_format.py@882
PS7, Line 882:    result = self.client.execute("select cast(cast('2001-04-09' 
as date) "
             :         "as string format 'FXYYYY-FMMM-FMDD')")
> Test FM-modified year token here too.
Done


http://gerrit.cloudera.org:8080/#/c/14291/7/tests/query_test/test_cast_with_format.py@1045
PS7, Line 1045: "select cast('1985-11-21text' as timestamp format 
'YYYY-MM-DD\"text')"
> Again, please use raw python strings, here and below.
Done



--
To view, visit http://gerrit.cloudera.org:8080/14291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I30d2f6656054371476aaa8bd0d51f572b9369855
Gerrit-Change-Number: 14291
Gerrit-PatchSet: 6
Gerrit-Owner: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Attila Jeges <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Comment-Date: Thu, 17 Oct 2019 18:11:33 +0000
Gerrit-HasComments: Yes

Reply via email to