Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7009 )

Change subject: IMPALA-5315: Cast to timestamp fails for YYYY-M-D format
......................................................................


Patch Set 14:

(17 comments)

It looks like we're getting close.

http://gerrit.cloudera.org:8080/#/c/7009/14//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/7009/14//COMMIT_MSG@25
PS14, Line 25: in expr-test.cc to be inline with existing tests for CAST() 
function.
I think we still need some end-to-end tests that exercise the new function in 
the context of a query, and probably also one that scans a text table with this 
format. Let me know if I can help with setting those up.


http://gerrit.cloudera.org:8080/#/c/7009/14/be/src/runtime/timestamp-parse-util.h
File be/src/runtime/timestamp-parse-util.h:

http://gerrit.cloudera.org:8080/#/c/7009/14/be/src/runtime/timestamp-parse-util.h@182
PS14, Line 182: parse
nit: "to be parsed" here and below.


http://gerrit.cloudera.org:8080/#/c/7009/14/be/src/runtime/timestamp-parse-util.cc
File be/src/runtime/timestamp-parse-util.cc:

http://gerrit.cloudera.org:8080/#/c/7009/14/be/src/runtime/timestamp-parse-util.cc@217
PS14, Line 217: const char* TimestampParser::ParseDigitToken(const char* str, 
const char* str_end){
nit: needs space


http://gerrit.cloudera.org:8080/#/c/7009/14/be/src/runtime/timestamp-parse-util.cc@229
PS14, Line 229:   while(tok_end < str_end) {
nit: needs space


http://gerrit.cloudera.org:8080/#/c/7009/14/be/src/runtime/timestamp-parse-util.cc@236
PS14, Line 236: bool 
TimestampParser::ParseFormatTokensByStr(DateTimeFormatContext* dt_ctx){
nit: needs space


http://gerrit.cloudera.org:8080/#/c/7009/14/be/src/runtime/timestamp-parse-util.cc@239
PS14, Line 239:   DCHECK(dt_ctx->fmt_len > 0);
DCHECK_GT(dt_ctx->fmt_len, 0);


http://gerrit.cloudera.org:8080/#/c/7009/14/be/src/runtime/timestamp-parse-util.cc@240
PS14, Line 240: size
DCHECK_EQ(dt_ctx->toks.size(), 0);


http://gerrit.cloudera.org:8080/#/c/7009/14/be/src/runtime/timestamp-parse-util.cc@249
PS14, Line 249:     dt_ctx->toks.push_back(DateTimeFormatToken(YEAR,str - 
str_begin, tok_end - str,
nit: needs space


http://gerrit.cloudera.org:8080/#/c/7009/14/be/src/runtime/timestamp-parse-util.cc@250
PS14, Line 250:         str));
nit: replace tab with spaces


http://gerrit.cloudera.org:8080/#/c/7009/14/be/src/runtime/timestamp-parse-util.cc@263
PS14, Line 263:     dt_ctx->toks.push_back(DateTimeFormatToken(MONTH_IN_YEAR, 
str - str_begin, tok_end - str,
nit: > 90 chars


http://gerrit.cloudera.org:8080/#/c/7009/14/be/src/runtime/timestamp-parse-util.cc@283
PS14, Line 283:  dt_ctx->has_date_toks
We can just return true here, right?


http://gerrit.cloudera.org:8080/#/c/7009/14/be/src/runtime/timestamp-parse-util.cc@340
PS14, Line 340: factional
typo: fractional


http://gerrit.cloudera.org:8080/#/c/7009/14/be/src/runtime/timestamp-parse-util.cc@346
PS14, Line 346:     if (tok_end - str > 9) {
This is really equivalent to the following, right?

      int num_digits = min(9, tok_end - str);
      dt_ctx->toks.push_back(DateTimeFormatToken(FRACTION, str - str_begin, 
num_digits, str));


http://gerrit.cloudera.org:8080/#/c/7009/14/be/src/runtime/timestamp-parse-util.cc@357
PS14, Line 357: dt_ctx->has_time_toks
this can just be 'true', right?


http://gerrit.cloudera.org:8080/#/c/7009/14/be/src/runtime/timestamp-parse-util.cc@457
PS14, Line 457: NULL
nit: nullptr


http://gerrit.cloudera.org:8080/#/c/7009/14/be/src/runtime/timestamp-parse-util.cc@458
PS14, Line 458:       lazy_ctx.Reset(str, lazy_len);
It's annoying that we have to pay the cost of initializing lazy_ctx when we 
don't use it. It's probably not that bad, but I'd rather that the cost of this 
extra functionality was as close to zero as possible. Can we avoid that by 
restructuring this as:

  if (dt_ctx != nullptr) {
    return Parse(str, len, *dt_ctx, d, t);
  } else {
    DateTimeFormatContext lazy_ctx;
    if (ParseFormatTokensByStr(&lazy_ctx)) {
      // Use lazy_ctx
    } else {

      *d = boost::gregorian::date();
      *t = boost::posix_time::time_duration(boost::posix_time::not_a_date_time);
      return false;
    }
  }


http://gerrit.cloudera.org:8080/#/c/7009/14/be/src/runtime/timestamp-parse-util.cc@460
PS14, Line 460:       dt_ctx = &lazy_ctx;
nit: indentation



--
To view, visit http://gerrit.cloudera.org:8080/7009
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9a184a09d7e7783f04d47588537612c2ecec28f
Gerrit-Change-Number: 7009
Gerrit-PatchSet: 14
Gerrit-Owner: Vincent Tran <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Vincent Tran <[email protected]>
Gerrit-Comment-Date: Thu, 08 Mar 2018 01:42:29 +0000
Gerrit-HasComments: Yes

Reply via email to