Here is my understanding of the relevant specifications.

The SQL standard says that TIMESTAMP values do not have a time zone. So, 
“1970-01-01 00:00:00” means just that; it does not mean “1970-01-01 00:00:00 
UTC” or "“1970-01-01 00:00:00 CET” or anything on your local time zone. The 
time zone is an interpretation placed on the value when they read it.

Internally Calcite represents TIMESTAMP '1970-01-01 00:00:00’ as 0 long.

Now JDBC is another matter. When you read a TIMESTAMP value via JDBC, 
specifically the ResultSet.getTimestamp(int) method, it translates it into the 
local timezone. So, when you read it, it becomes “1970-01-01 00:00:00 CET”, 
whose value is -36000000L (one hour, in milliseconds, before the UTC epoch).

Remember that a Timestamp value has no time zone. It represents a moment in 
time, e.g. the moment that Apollo 11 landed on the moon. Its internal value is 
milliseconds from the UTC epoch. But the JVM will “helpfully” render it in your 
local time.

So, your JVM will render a Timestamp(-36000000L) as “1970-01-01 00:00:00 CET”.

If you want a value to be interpreted in a particular time zone, us the 
ResultSet.getTimestamp(int, Calendar) method. If you pass in a Calendar whose 
timezone is UTC, you will get a Timestamp whose value is 0L.

Net net: if you have “YYYY-MM-DD HH:MM:SS" in a CSV file and read it using 
getTimestamp (i.e. in local timezone), you should expect to that timestamp to 
print as "YYYY-MM-DD HH:MM:SS <your timezone>”.

Julian


> On Sep 25, 2015, at 1:42 AM, Jan Van Besien <[email protected]> wrote:
> 
> I might be misunderstanding something, but I think calcite is wrong
> with respect to how it handles timezones in date, time and timestamp.
> 
> Allow me to use the existing CsvTest#testDateType test as an example,
> in particular the test with a timestamp on line 385.
> 
> The data in the CSV file itself for the timestamp field is the string
> "1996-08-03 00:01:02". According to the TIME_FORMAT_TIMESTAMP in the
> CsvEnumerator, this String is interpreted as being a timestamp in GMT
> timezone. So it is considered "1996-08-03 00:01:02 GMT". So far, so
> good.
> 
> Note that my local timezone is CEST (2 hours east of GMT). When
> querying above timezone, I think the correct thing to do is to return
> the Timezone exactly as is, hence "1996-08-03 00:01:02 GMT" or
> "1996-08-03 02:01:02 CEST". Given that the string representation of a
> Timestamp object returns a String in the local timezone (without
> mentioning that local timezone), the resulting string would be
> "1996-08-03 02:01:02". This might seem wrong (2 hours to late) but it
> is actually correct because it is in the local (CEST) timezone. More
> importantly, the millis inside this Timestamp object will in this case
> be exactly the same as the millis inside the Timestamp object returned
> by the CsvEnumerator.
> 
> However, calcite seems to do a negative correction on the millis value
> with the local timezone offset, such that the result is actually 2
> hours earlier, i.e. "1996-08-02 22:01:02 GMT". Thanks to the way
> string formatting on the Timestamp object works, this ends up as being
> the string "1996-08-03 00:01:02" which looks correct, but is actually
> wrong because it is to be interpreted in the local Timezone which is
> CEST, hence it really means "1996-08-02 22:01:02 GMT" which is not
> what was in the database (CsvEnumerator).
> 
> If you agree with my analysis, this also means that the CsvTest on
> line 385 is wrong, because it checks that the timestamp from the
> resultset is java.sql.Timestamp.valueOf("1996-08-03 00:01:02") while
> (in my timezone) it should really be
> java.sql.Timestamp.valueOf("1996-08-03 02:01:02"). The test will have
> to be rewritten to work in any timezone obviously, which will imply
> creating a millis value from a Calendar with a certain fixed timezone.
> 
> 
> Jan

Reply via email to