Hello Lars Volker, Tim Armstrong, Dan Hecht,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/7009

to look at the new patch set (#16).

Change subject: IMPALA-5315: Cast to timestamp fails for YYYY-M-D format
......................................................................

IMPALA-5315: Cast to timestamp fails for YYYY-M-D format

This change allows casting of a string in 'lazy' date/time
format to timestamp. The supported lazy date formats are:
  yyyy-[M]M-[d]d
  yyyy-[M]M-[d]d [H]H:[m]m:[s]s[.SSSSSSSSS]
  [H]H:[m]m:[s]s[.SSSSSSSSS]

We will incur a SCAN performance penalty (approximately 1/2
TotalReadThroughput) when the string is in one of these
lazy date/time format.

Testing:
Benchmarked the performance consequence by executing this SQL on
a private build over 3.8 billion rows:
select min(cast (time_string as timestamp)) from private.impala_5315

Added tests for valid and invalid date/time format strings
in expr-test.cc to be inline with existing tests for CAST() function.

Added end-to-end tests into exprs.test and
cast-lazy-datetime-string.test to exercise the new function within
the context of a query.

Added tests to exercise the leading and trailing white space trimming
behaviour in default and lazy date/time string format (IMPALA-6630).

Change-Id: Ib9a184a09d7e7783f04d47588537612c2ecec28f
---
M be/src/exprs/expr-test.cc
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-parse-util.h
A testdata/data/lazy_timestamp.csv
A 
testdata/workloads/functional-query/queries/QueryTest/cast-lazy-datetime-string.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M tests/query_test/test_scanners.py
7 files changed, 400 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/7009/16
--
To view, visit http://gerrit.cloudera.org:8080/7009
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib9a184a09d7e7783f04d47588537612c2ecec28f
Gerrit-Change-Number: 7009
Gerrit-PatchSet: 16
Gerrit-Owner: Vincent Tran <vtt...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Vincent Tran <vtt...@cloudera.com>

Reply via email to