[GitHub] drill pull request #1184: DRILL-6242 - Use java.sql.[Date|Time|Timestamp] cl...
Github user jiang-wu commented on a diff in the pull request: https://github.com/apache/drill/pull/1184#discussion_r184424638 --- Diff: exec/vector/src/main/codegen/templates/FixedValueVectors.java --- @@ -509,15 +509,15 @@ public long getTwoAsLong(int index) { public ${friendlyType} getObject(int index) { org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), org.joda.time.DateTimeZone.UTC); date = date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault()); - return date; + return new java.sql.Date(date.getMillis()); --- End diff -- The string representation between LocalDateTime and Timestamp are not exactly the same, but that is potentially fixable since we can alter the way the values are displayed via formatters. Though JDBC is not just for getting string representations. It is on the programmatic use cases where we are getting the value objects where one would see the disconnect on the data types. Will try this out with a use case I have with programmatic JDBC access and see what are the impacts on different types for the same expected value. ---
[GitHub] drill pull request #1184: DRILL-6242 - Use java.sql.[Date|Time|Timestamp] cl...
Github user parthchandra commented on a diff in the pull request: https://github.com/apache/drill/pull/1184#discussion_r184243856 --- Diff: exec/vector/src/main/codegen/templates/FixedValueVectors.java --- @@ -509,15 +509,15 @@ public long getTwoAsLong(int index) { public ${friendlyType} getObject(int index) { org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), org.joda.time.DateTimeZone.UTC); date = date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault()); - return date; + return new java.sql.Date(date.getMillis()); --- End diff -- Hmm. That takes us back to the original problem, that of the date|time|timestamp field inside a complex object. ``` select t.context.date, t.context from test t; will return a java.sql.Date object for column 1, but a java.time.LocalDate for the same object inside column 2. This doesn't seem like a good thing. ``` Why should that be a bad thing though? Ultimately, the object returned by getObject() is displayed to the end user thru the toString method. The string representation of Local[Date|Time|Timestamp] should be the same as that of java.sql.[Date|Time|Timestamp]. Isn't it? ---
[GitHub] drill pull request #1184: DRILL-6242 - Use java.sql.[Date|Time|Timestamp] cl...
Github user parthchandra commented on a diff in the pull request: https://github.com/apache/drill/pull/1184#discussion_r181513279 --- Diff: exec/vector/src/main/codegen/templates/FixedValueVectors.java --- @@ -509,15 +509,15 @@ public long getTwoAsLong(int index) { public ${friendlyType} getObject(int index) { org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), org.joda.time.DateTimeZone.UTC); date = date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault()); - return date; + return new java.sql.Date(date.getMillis()); --- End diff -- Sounds good. ---
[GitHub] drill pull request #1184: DRILL-6242 - Use java.sql.[Date|Time|Timestamp] cl...
Github user jiang-wu commented on a diff in the pull request: https://github.com/apache/drill/pull/1184#discussion_r181493581 --- Diff: exec/vector/src/main/codegen/templates/FixedValueVectors.java --- @@ -509,15 +509,15 @@ public long getTwoAsLong(int index) { public ${friendlyType} getObject(int index) { org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), org.joda.time.DateTimeZone.UTC); date = date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault()); - return date; + return new java.sql.Date(date.getMillis()); --- End diff -- How about we use Java 8 Local[Data|Time|Timestamp] for the public interface methods? That sets things up for the future. Internally, I won't change the logic that is using Joda DateTime, that is doing the various time zone stuff. That behind the scene logic can be separately updated after determine what is the right behavior Drill wants to support. ---
[GitHub] drill pull request #1184: DRILL-6242 - Use java.sql.[Date|Time|Timestamp] cl...
Github user vdiravka commented on a diff in the pull request: https://github.com/apache/drill/pull/1184#discussion_r181472373 --- Diff: exec/vector/src/main/codegen/templates/FixedValueVectors.java --- @@ -509,15 +509,15 @@ public long getTwoAsLong(int index) { public ${friendlyType} getObject(int index) { org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), org.joda.time.DateTimeZone.UTC); date = date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault()); - return date; + return new java.sql.Date(date.getMillis()); --- End diff -- I think updating to Java8 LocalDate/Time classes would be good choice. And it will be step forward in the resolving of the Drill's Date/Time issues mentioned in different Jiras: [DRILL-5334](https://issues.apache.org/jira/browse/DRILL-5334), [DRILL-5332](https://issues.apache.org/jira/browse/DRILL-5332) etc. ---
[GitHub] drill pull request #1184: DRILL-6242 - Use java.sql.[Date|Time|Timestamp] cl...
Github user parthchandra commented on a diff in the pull request: https://github.com/apache/drill/pull/1184#discussion_r181461481 --- Diff: exec/vector/src/main/codegen/templates/FixedValueVectors.java --- @@ -509,15 +509,15 @@ public long getTwoAsLong(int index) { public ${friendlyType} getObject(int index) { org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), org.joda.time.DateTimeZone.UTC); date = date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault()); - return date; + return new java.sql.Date(date.getMillis()); --- End diff -- Either one is fine (since java.time is based on Joda). We've switched to Java 8, but just for consistency with the rest of the code, we might as well use Joda. ---
[GitHub] drill pull request #1184: DRILL-6242 - Use java.sql.[Date|Time|Timestamp] cl...
Github user jiang-wu commented on a diff in the pull request: https://github.com/apache/drill/pull/1184#discussion_r181256294 --- Diff: exec/vector/src/main/codegen/templates/FixedValueVectors.java --- @@ -509,15 +509,15 @@ public long getTwoAsLong(int index) { public ${friendlyType} getObject(int index) { org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), org.joda.time.DateTimeZone.UTC); date = date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault()); - return date; + return new java.sql.Date(date.getMillis()); --- End diff -- sure, I can update to the joda version or the java.time version. Which one is preferred? java.time.* is available in java 1.8+ ---
[GitHub] drill pull request #1184: DRILL-6242 - Use java.sql.[Date|Time|Timestamp] cl...
Github user parthchandra commented on a diff in the pull request: https://github.com/apache/drill/pull/1184#discussion_r181248969 --- Diff: exec/vector/src/main/codegen/templates/FixedValueVectors.java --- @@ -509,15 +509,15 @@ public long getTwoAsLong(int index) { public ${friendlyType} getObject(int index) { org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), org.joda.time.DateTimeZone.UTC); date = date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault()); - return date; + return new java.sql.Date(date.getMillis()); --- End diff -- Agree that messing with Timezones is a Bad Thing. Probably an artifact of the way java.util did things. Anyway, I did mean using org.joda.time.Local[Data|Time|TimeStamp] or the corresponding java.time.* classes. ---
[GitHub] drill pull request #1184: DRILL-6242 - Use java.sql.[Date|Time|Timestamp] cl...
Github user jiang-wu commented on a diff in the pull request: https://github.com/apache/drill/pull/1184#discussion_r180933601 --- Diff: exec/vector/src/main/codegen/templates/FixedValueVectors.java --- @@ -509,15 +509,15 @@ public long getTwoAsLong(int index) { public ${friendlyType} getObject(int index) { org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), org.joda.time.DateTimeZone.UTC); date = date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault()); - return date; + return new java.sql.Date(date.getMillis()); --- End diff -- BTW, Drill internally tries to fool around with the timezone to preserve the "textual representation" look complex. I am not convinced this is the "right" way to handle time. But in any case, that is outside of the scope of this change. I mentioned in the Jira a comment on how such timezone manipulation is dangerous and lead to errors. I ran into that when attempting at creating a unit test for the change made in the pull request. ---
[GitHub] drill pull request #1184: DRILL-6242 - Use java.sql.[Date|Time|Timestamp] cl...
Github user jiang-wu commented on a diff in the pull request: https://github.com/apache/drill/pull/1184#discussion_r180925699 --- Diff: exec/vector/src/main/codegen/templates/FixedValueVectors.java --- @@ -509,15 +509,15 @@ public long getTwoAsLong(int index) { public ${friendlyType} getObject(int index) { org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), org.joda.time.DateTimeZone.UTC); date = date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault()); - return date; + return new java.sql.Date(date.getMillis()); --- End diff -- Good point. Someone with deeper knowledge should take a look. As far as I can tell, I think the problem with SqlAccessor is that it uses the vector type to know whether to invoke getDate() vs getTimestamp(). However, such vector type knowledge is not there when complex type such as List and Map are materialized in memory to form Java JsonStringArrayList and JsonStringHashMap. In https://issues.apache.org/jira/browse/DRILL-6242, the example describes this scenario: `select t.context.`date`, t.context from test t;` where the `date` field is inside a Map type. And we select the field by itself as well as select the Map on the same query. This query returns: `++-+ ` `| EXPR$0 | context | ` `++-+ ` `| 2018-03-13 | {"date":{"dayOfYear":72,"year":2018, ... |` One can see that the first column shows the date in the right type. But the same date is shown as a different type inside the Map. In the vector package, when reading the [List|Map]Vector, the code produces its nested member values via the generic method "getObject()". Since all three vector type returned the same DataObject type as the representation, there are no distinction. For the type information to be carried within the List | Map, it would seem that the value should be of distinct types. These can be java.sql.[Date|Time|Timestamp] or some other [Date|Time|Timestamp] classes. ---
[GitHub] drill pull request #1184: DRILL-6242 - Use java.sql.[Date|Time|Timestamp] cl...
Github user parthchandra commented on a diff in the pull request: https://github.com/apache/drill/pull/1184#discussion_r180914676 --- Diff: exec/vector/src/main/codegen/templates/FixedValueVectors.java --- @@ -509,15 +509,15 @@ public long getTwoAsLong(int index) { public ${friendlyType} getObject(int index) { org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), org.joda.time.DateTimeZone.UTC); date = date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault()); - return date; + return new java.sql.Date(date.getMillis()); --- End diff -- @jiang-wu Thanks for making these changes. Your fix is on the right track. However, I'm not sure if we want to introduce a dependency on JDBC classes in the vectors. Take a look at DateAccessor, TimeAccessor, and TimeStampAccessor. These are generated from [SqlAccessor](https://github.com/apache/drill/blob/master/exec/java-exec/src/main/codegen/templates/SqlAccessors.java). The get methods in these convert from UTC to a Local{Date|Time|TimeStamp}. Subsequently they convert to the JDBC type since they are used by the JDBC driver. The vectors should be able to do the same, just returning the Local{Date|Time|TimeStamp} object. I'm not sure if that might affect tests that depend on timezone though. Perhaps @vdiravka can comment. ---
[GitHub] drill pull request #1184: DRILL-6242 - Use java.sql.[Date|Time|Timestamp] cl...
GitHub user jiang-wu opened a pull request: https://github.com/apache/drill/pull/1184 DRILL-6242 - Use java.sql.[Date|Time|Timestamp] classes to hold value⦠See Jira ticket for details. Use java.sql.Date, java.sql.Time, and java.sql.Timestamp as the Java representation for their corresponding Drill types. This does not lose any precisions as these classes are just simple subclasses of java.util.Date with millisecond precision. But using these classes allows the command line to properly format the data using org.apache.drill.exec.util.JsonStringArrayList and org.apache.drill.exec.util.JsonStringHashMap. The changes are simple enough. But many Test** methods need to be updated to use java.sql.Date|Time|Timestamp. Opt not to optimize these changes. Places still use joda DateTime to parse date and time as before, but then converted to the java.sql.Date|Time|Timestamp as appropriate. You can merge this pull request into a Git repository by running: $ git pull https://github.com/jiang-wu/drill DRILL-6242-master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1184.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1184 commit 7cbb8b81196732cb223c031cd629d9bc941640d9 Author: Jiang WuDate: 2018-03-22T20:42:20Z DRILL-6242 - Use java.sql.[Date|Time|Timestamp] classes to hold values from corresponding Drill date, time, and timestamp types. ---