[GitHub] drill pull request #1184: DRILL-6242 - Use java.sql.[Date|Time|Timestamp] cl...

2018-04-26 Thread jiang-wu
Github user jiang-wu commented on a diff in the pull request:

https://github.com/apache/drill/pull/1184#discussion_r184424638
  
--- Diff: exec/vector/src/main/codegen/templates/FixedValueVectors.java ---
@@ -509,15 +509,15 @@ public long getTwoAsLong(int index) {
 public ${friendlyType} getObject(int index) {
   org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), 
org.joda.time.DateTimeZone.UTC);
   date = 
date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault());
-  return date;
+  return new java.sql.Date(date.getMillis());
--- End diff --

The string representation between LocalDateTime and Timestamp are not 
exactly the same, but that is potentially fixable since we can alter the way 
the values are displayed via formatters.  Though JDBC is not just for getting 
string representations.  It is on the programmatic use cases where we are 
getting the value objects where one would see the disconnect on the data types. 
 Will try this out with a use case I have with programmatic JDBC access and see 
what are the impacts on different types for the same expected value.


---


[GitHub] drill pull request #1184: DRILL-6242 - Use java.sql.[Date|Time|Timestamp] cl...

2018-04-25 Thread parthchandra
Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/1184#discussion_r184243856
  
--- Diff: exec/vector/src/main/codegen/templates/FixedValueVectors.java ---
@@ -509,15 +509,15 @@ public long getTwoAsLong(int index) {
 public ${friendlyType} getObject(int index) {
   org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), 
org.joda.time.DateTimeZone.UTC);
   date = 
date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault());
-  return date;
+  return new java.sql.Date(date.getMillis());
--- End diff --

Hmm. That takes us back to the original problem, that of the 
date|time|timestamp field inside a complex object. 
```
select t.context.date, t.context from test t;

will return a java.sql.Date object for column 1, but a java.time.LocalDate 
for the same object inside column 2. This doesn't seem like a good thing.
```
Why should that be a bad thing though? Ultimately, the object returned by 
getObject() is displayed to the end user thru the toString method. The string 
representation of Local[Date|Time|Timestamp]  should be the same as that of 
java.sql.[Date|Time|Timestamp]. Isn't it?



---


[GitHub] drill pull request #1184: DRILL-6242 - Use java.sql.[Date|Time|Timestamp] cl...

2018-04-13 Thread parthchandra
Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/1184#discussion_r181513279
  
--- Diff: exec/vector/src/main/codegen/templates/FixedValueVectors.java ---
@@ -509,15 +509,15 @@ public long getTwoAsLong(int index) {
 public ${friendlyType} getObject(int index) {
   org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), 
org.joda.time.DateTimeZone.UTC);
   date = 
date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault());
-  return date;
+  return new java.sql.Date(date.getMillis());
--- End diff --

Sounds good.


---


[GitHub] drill pull request #1184: DRILL-6242 - Use java.sql.[Date|Time|Timestamp] cl...

2018-04-13 Thread jiang-wu
Github user jiang-wu commented on a diff in the pull request:

https://github.com/apache/drill/pull/1184#discussion_r181493581
  
--- Diff: exec/vector/src/main/codegen/templates/FixedValueVectors.java ---
@@ -509,15 +509,15 @@ public long getTwoAsLong(int index) {
 public ${friendlyType} getObject(int index) {
   org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), 
org.joda.time.DateTimeZone.UTC);
   date = 
date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault());
-  return date;
+  return new java.sql.Date(date.getMillis());
--- End diff --

How about we use Java 8 Local[Data|Time|Timestamp] for the public interface 
methods?  That sets things up for the future.

Internally, I won't change the logic that is using Joda DateTime, that is 
doing the various time zone stuff.  That behind the scene logic can be 
separately updated after determine what is the right behavior Drill wants to 
support.


---


[GitHub] drill pull request #1184: DRILL-6242 - Use java.sql.[Date|Time|Timestamp] cl...

2018-04-13 Thread vdiravka
Github user vdiravka commented on a diff in the pull request:

https://github.com/apache/drill/pull/1184#discussion_r181472373
  
--- Diff: exec/vector/src/main/codegen/templates/FixedValueVectors.java ---
@@ -509,15 +509,15 @@ public long getTwoAsLong(int index) {
 public ${friendlyType} getObject(int index) {
   org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), 
org.joda.time.DateTimeZone.UTC);
   date = 
date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault());
-  return date;
+  return new java.sql.Date(date.getMillis());
--- End diff --

I think updating to Java8 LocalDate/Time classes would be good choice. And 
it will be step forward in the resolving of the Drill's Date/Time issues 
mentioned in different Jiras: 
[DRILL-5334](https://issues.apache.org/jira/browse/DRILL-5334), 
[DRILL-5332](https://issues.apache.org/jira/browse/DRILL-5332) etc.


---


[GitHub] drill pull request #1184: DRILL-6242 - Use java.sql.[Date|Time|Timestamp] cl...

2018-04-13 Thread parthchandra
Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/1184#discussion_r181461481
  
--- Diff: exec/vector/src/main/codegen/templates/FixedValueVectors.java ---
@@ -509,15 +509,15 @@ public long getTwoAsLong(int index) {
 public ${friendlyType} getObject(int index) {
   org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), 
org.joda.time.DateTimeZone.UTC);
   date = 
date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault());
-  return date;
+  return new java.sql.Date(date.getMillis());
--- End diff --

Either one is fine (since java.time is based on Joda). We've switched to 
Java 8, but just for consistency with the rest of the code, we might as well 
use Joda.


---


[GitHub] drill pull request #1184: DRILL-6242 - Use java.sql.[Date|Time|Timestamp] cl...

2018-04-12 Thread jiang-wu
Github user jiang-wu commented on a diff in the pull request:

https://github.com/apache/drill/pull/1184#discussion_r181256294
  
--- Diff: exec/vector/src/main/codegen/templates/FixedValueVectors.java ---
@@ -509,15 +509,15 @@ public long getTwoAsLong(int index) {
 public ${friendlyType} getObject(int index) {
   org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), 
org.joda.time.DateTimeZone.UTC);
   date = 
date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault());
-  return date;
+  return new java.sql.Date(date.getMillis());
--- End diff --

sure, I can update to the joda version or the java.time version.  Which one 
is preferred?  java.time.* is available in java 1.8+


---


[GitHub] drill pull request #1184: DRILL-6242 - Use java.sql.[Date|Time|Timestamp] cl...

2018-04-12 Thread parthchandra
Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/1184#discussion_r181248969
  
--- Diff: exec/vector/src/main/codegen/templates/FixedValueVectors.java ---
@@ -509,15 +509,15 @@ public long getTwoAsLong(int index) {
 public ${friendlyType} getObject(int index) {
   org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), 
org.joda.time.DateTimeZone.UTC);
   date = 
date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault());
-  return date;
+  return new java.sql.Date(date.getMillis());
--- End diff --

Agree that messing with Timezones is a Bad Thing. Probably an artifact of 
the way java.util did things.
Anyway, I did mean using org.joda.time.Local[Data|Time|TimeStamp]  or the 
corresponding java.time.* classes.



---


[GitHub] drill pull request #1184: DRILL-6242 - Use java.sql.[Date|Time|Timestamp] cl...

2018-04-11 Thread jiang-wu
Github user jiang-wu commented on a diff in the pull request:

https://github.com/apache/drill/pull/1184#discussion_r180933601
  
--- Diff: exec/vector/src/main/codegen/templates/FixedValueVectors.java ---
@@ -509,15 +509,15 @@ public long getTwoAsLong(int index) {
 public ${friendlyType} getObject(int index) {
   org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), 
org.joda.time.DateTimeZone.UTC);
   date = 
date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault());
-  return date;
+  return new java.sql.Date(date.getMillis());
--- End diff --

BTW, Drill internally tries to fool around with the timezone to preserve 
the "textual representation" look complex.  I am not convinced this is the 
"right" way to handle time.  But in any case, that is outside of the scope of 
this change.

I mentioned in the Jira a comment on how such timezone manipulation is 
dangerous and lead to errors.  I ran into that when attempting at creating a 
unit test for the change made in the pull request.


---


[GitHub] drill pull request #1184: DRILL-6242 - Use java.sql.[Date|Time|Timestamp] cl...

2018-04-11 Thread jiang-wu
Github user jiang-wu commented on a diff in the pull request:

https://github.com/apache/drill/pull/1184#discussion_r180925699
  
--- Diff: exec/vector/src/main/codegen/templates/FixedValueVectors.java ---
@@ -509,15 +509,15 @@ public long getTwoAsLong(int index) {
 public ${friendlyType} getObject(int index) {
   org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), 
org.joda.time.DateTimeZone.UTC);
   date = 
date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault());
-  return date;
+  return new java.sql.Date(date.getMillis());
--- End diff --

Good point.  Someone with deeper knowledge should take a look.  As far as I 
can tell, I think the problem with SqlAccessor is that it uses the vector type 
to know whether to invoke getDate() vs getTimestamp().  However, such vector 
type knowledge is not there when complex type such as List and Map are 
materialized in memory to form Java JsonStringArrayList and JsonStringHashMap.  

In https://issues.apache.org/jira/browse/DRILL-6242, the example describes 
this scenario:

`select t.context.`date`, t.context from test t;`

where the `date` field is inside a Map type.  And we select the field by 
itself as well as select the Map on the same query.  This query returns:

`++-+ `
`| EXPR$0 | context | `
`++-+ `
`| 2018-03-13 | {"date":{"dayOfYear":72,"year":2018, ... |`

One can see that the first column shows the date in the right type.  But 
the same date is shown as a different type inside the Map.  In the vector 
package, when reading the [List|Map]Vector, the code produces its nested member 
values via the generic method "getObject()".  Since all three vector type 
returned the same DataObject type as the representation, there are no 
distinction.

For the type information to be carried within the List | Map, it would seem 
that the value should be of distinct types.  These can be 
java.sql.[Date|Time|Timestamp] or some other [Date|Time|Timestamp] classes.




---


[GitHub] drill pull request #1184: DRILL-6242 - Use java.sql.[Date|Time|Timestamp] cl...

2018-04-11 Thread parthchandra
Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/1184#discussion_r180914676
  
--- Diff: exec/vector/src/main/codegen/templates/FixedValueVectors.java ---
@@ -509,15 +509,15 @@ public long getTwoAsLong(int index) {
 public ${friendlyType} getObject(int index) {
   org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), 
org.joda.time.DateTimeZone.UTC);
   date = 
date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault());
-  return date;
+  return new java.sql.Date(date.getMillis());
--- End diff --

@jiang-wu Thanks for making these changes. Your fix is on the right track.  
However, I'm not sure if we want to introduce a dependency on JDBC classes in 
the vectors. 
Take a look at DateAccessor, TimeAccessor, and TimeStampAccessor. These are 
generated from 
[SqlAccessor](https://github.com/apache/drill/blob/master/exec/java-exec/src/main/codegen/templates/SqlAccessors.java).
 
The get methods in these convert from UTC to a Local{Date|Time|TimeStamp}.  
Subsequently they convert to the JDBC type since they are used by the JDBC 
driver.
The vectors should be able to do the same, just returning the 
Local{Date|Time|TimeStamp} object.
I'm not sure if that might affect tests that depend on timezone though. 
Perhaps @vdiravka can comment. 


---


[GitHub] drill pull request #1184: DRILL-6242 - Use java.sql.[Date|Time|Timestamp] cl...

2018-03-22 Thread jiang-wu
GitHub user jiang-wu opened a pull request:

https://github.com/apache/drill/pull/1184

DRILL-6242 - Use java.sql.[Date|Time|Timestamp] classes to hold value…

See Jira ticket for details.  Use java.sql.Date, java.sql.Time, and 
java.sql.Timestamp as the Java representation for their corresponding Drill 
types.  This does not lose any precisions as these classes are just simple 
subclasses of java.util.Date with millisecond precision.  But using these 
classes allows the command line to properly format the data using 
org.apache.drill.exec.util.JsonStringArrayList and 
org.apache.drill.exec.util.JsonStringHashMap.

The changes are simple enough.  But many Test** methods need to be updated 
to use java.sql.Date|Time|Timestamp.  Opt not to optimize these changes.  
Places still use joda DateTime to parse date and time as before, but then 
converted to the java.sql.Date|Time|Timestamp as appropriate.  

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jiang-wu/drill DRILL-6242-master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1184.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1184


commit 7cbb8b81196732cb223c031cd629d9bc941640d9
Author: Jiang Wu 
Date:   2018-03-22T20:42:20Z

DRILL-6242 - Use java.sql.[Date|Time|Timestamp] classes to hold values from 
corresponding Drill date, time, and timestamp types.




---