Hi,

I believe I've come up against a pretty big issue with Date and Time
handling (and to a certain extent Timestamp handling) in Phoenix.

>From what I understand from the JDBC documentation (mostly API docs),
the way that a java.sql.Date should be handled via JDBC is simply as a
day, month, and year, despite the fact that it is internally
represented as a timestamp (the same kind of thing applies to Time
objects, which are a triple of hours, minutes, and seconds).

Further, my understanding is that it is the responsibility of a JDBC
driver to do normalization of incoming Date and Time (and maybe
Timestamp) objects to interpret them as being in the current time
zone, and remove the extra components (i.e. time components for a
Date, and date components for a Time) before storing the value.

This means that today, if I insert a column value consisting of 'new
Date(System.currentTimeMillis())', then I should be able to retrieve
that same value with a filter on 'Date.valueOf("2014-03-18")'.
Additionally, that filter should work regardless of my own local
timezone where I'm querying from after the data has been written.

It also means that if I store 'Time.valueOf("07:00:00")' in a TIME
field in a database in my current timezone, someone should get
"07:00:00" if they run 'ResultSet#getTime(1).toString()' on that
value, even if they're in a different timezone than me.

>From what I can see right now, Phoenix doesn't currently exhibit this
behavior. Instead, the full long representation of Date, Time, and
Timestamps is stored directly in HBase, without dropping the extra
date fields or doing timezone conversion.

>From the current analysis, what is required for Phoenix to be
JDBC-compliant in terms of time/date/timestamp handling is:
* All incoming time-style values should be interpreted in the local
timezone of the driver, then be normalized and converted to UTC before
serialization (unless a Calendar is supplied) in PreparedStatement
calls
* All outgoing time-style values should be converted from UTC into the
local timezone (unless a Calendar is supplied) in ResultSet calls
* Supplying a Calendar to PreparedStatement methods should cause the
time value to be converted from the local timezone to the timezone of
the calendar (instead of UTC) before being serialized
* Supplying a Calendar to ResultSet methods should cause the time
value from the database to be interpreted as if it was serialized in
the timezone of the Calendar, instead of UTC.

I started on this earlier today as part of a smaller issue that I had
come up against (that the Calendar objects are incorrectly used in
PhoenixPreparedStatement and PhoenixResultSet), but digging deeper has
shown that this is a more fundamental issue in Phoenix.

Making the above changes would mean breaking backwards compatibility
with existing Phoenix installs (unless some kind of
backwards-compatibility mode is introduced or something similar). On
the other hand, not making the above changes will mean that the
Phoenix JDBC driver doesn't treat time-style values in the same way as
(most) other JDBC drivers, which can cause issues with external
tooling that goes via JDBC.

Any thoughts on the direction we want to go with this?

- Gabriel

Reply via email to