Sounds like a good idea to me.

A quick scan of the current TPC-H spec at
http://www.tpc.org/tpc_documents_current_versions/pdf/tpc-h_v2.18.0.pdf
suggests to me that such a change is conformant to the spec. Specifically:

In 1.3 Data Types:
"Date is a value whose external representation can be expressed as
YYYY-MM-DD, where all characters are numeric. A date must be able to
express any day within at least 14 consecutive years. There is no
requirement specific to the internal representation of a date."

In 2.1.3 Substitution Parameters and Output Data:
"Comment 1: When dates are part of the substitution parameters, they must
be expressed in a format that includes
the year, month and day in integer form, in that order (e.g., YYYY-MM-DD).
The delimiter between the year,
month and day is not specified. Other date representations, for example the
number of days since 1970-01-01, are
specifically not allowed."

In 2.2.3 Minor Query Modifications:
"Date expressions - For queries that include an expression involving
manipulation of dates (e.g.,
adding/subtracting days/months/years, or extracting years from dates),
vendor-specific syntax may be used
instead of the specified SQL-92 syntax. Replacement syntax must have
equivalent semantic behavior.
Examples of acceptable implementations include "YEAR(<column>)" to extract
the year from a date
column or "DATE(<date>) + 3 MONTHS" to add 3 months to a date."

In 4.2 DBGEN and Database Population
4.2.2 Definition Of Terms:
"4.2.2.8 The term date represents a string of numeric characters separated
by hyphens and comprised of a 4 digit year, 2 digit
month and 2 digit day of the month."

I believe that our DATE implementation conforms to all these requirements,
so we should be OK.
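
As a rough illustration of the external-representation rules quoted above, here is a minimal Python sketch. The helper name is_tpch_date and the sample values are my own assumptions for illustration; none of this comes from the spec or from our code base. It accepts only a 4-digit year, 2-digit month and 2-digit day separated by hyphens, and rejects things like a days-since-1970-01-01 count.

```python
from datetime import date


def is_tpch_date(text: str) -> bool:
    """Hypothetical helper: True if text looks like YYYY-MM-DD and is a real calendar day."""
    parts = text.split("-")
    # Exactly three fields of width 4, 2 and 2.
    if [len(p) for p in parts] != [4, 2, 2]:
        return False
    # All characters must be plain ASCII digits.
    if not all(p.isascii() and p.isdigit() for p in parts):
        return False
    try:
        # Constructing a date rejects impossible days such as February 30.
        date(int(parts[0]), int(parts[1]), int(parts[2]))
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    # Sample values chosen for illustration only.
    for candidate in ["1998-12-01", "19981201", "10561", "1998-02-30"]:
        print(candidate, is_tpch_date(candidate))
```

A native date type only has to print and parse this external form; what it stores internally is left open by section 1.3.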

Thanks,

  - Laszlo

On Tue, Jan 28, 2020 at 1:59 PM Gabor Kaszab <gaborkas...@apache.org> wrote:

> Hey,
>
> Recently I have been running some perf tests using the TPCH database, and I
> observed that the date columns are still stored as strings even though we
> recently implemented the date type. I'm wondering if there are any
> objections to changing the type of these columns to date. My
> assumption is that none of the queries have to change, only the SQL that
> creates the table.
>
> Cheers,
> Gabor
>
