pwoody commented on a change in pull request #8167:
URL: https://github.com/apache/arrow/pull/8167#discussion_r487010013
##########
File path:
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/consumer/DateConsumer.java
##########
@@ -85,13 +85,13 @@ public void consume(ResultSet resultSet) throws
SQLException {
Date date = calendar == null ? resultSet.getDate(columnIndexInResultSet)
:
resultSet.getDate(columnIndexInResultSet, calendar);
if (!resultSet.wasNull()) {
- int day = (int) TimeUnit.MILLISECONDS.toDays(date.getTime());
- if (day < 0 || day > MAX_DAY) {
+ long day = TimeUnit.MILLISECONDS.toDays(date.getTime());
+ if (day < Integer.MIN_VALUE || day > MAX_DAY) {
Review comment:
Is there context as to why the date range of -9999-12-31 to 9999-12-31?
I hadn't seen it in the format spec.
##########
File path:
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/consumer/DateConsumer.java
##########
@@ -85,13 +85,13 @@ public void consume(ResultSet resultSet) throws
SQLException {
Date date = calendar == null ? resultSet.getDate(columnIndexInResultSet)
:
resultSet.getDate(columnIndexInResultSet, calendar);
if (!resultSet.wasNull()) {
- int day = (int) TimeUnit.MILLISECONDS.toDays(date.getTime());
- if (day < 0 || day > MAX_DAY) {
+ long day = TimeUnit.MILLISECONDS.toDays(date.getTime());
+ if (day < Integer.MIN_VALUE || day > MAX_DAY) {
Review comment:
Interesting, I'm mostly inclined to just open the bounds to whatever
epoch days fit within an integer range given that this is simply a reader and
underlying systems have different ranges. Thoughts?
##########
File path:
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/consumer/DateConsumer.java
##########
@@ -85,13 +85,13 @@ public void consume(ResultSet resultSet) throws
SQLException {
Date date = calendar == null ? resultSet.getDate(columnIndexInResultSet)
:
resultSet.getDate(columnIndexInResultSet, calendar);
if (!resultSet.wasNull()) {
- int day = (int) TimeUnit.MILLISECONDS.toDays(date.getTime());
- if (day < 0 || day > MAX_DAY) {
+ long day = TimeUnit.MILLISECONDS.toDays(date.getTime());
+ if (day < Integer.MIN_VALUE || day > MAX_DAY) {
Review comment:
Interesting, I'm mostly inclined to just open the bounds to whatever
epoch days fit within an integer range given that this is simply a reader and
underlying systems have different ranges. Thoughts?
e.g. postgres has a max year of 5874897
##########
File path:
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/consumer/DateConsumer.java
##########
@@ -85,13 +85,13 @@ public void consume(ResultSet resultSet) throws
SQLException {
Date date = calendar == null ? resultSet.getDate(columnIndexInResultSet)
:
resultSet.getDate(columnIndexInResultSet, calendar);
if (!resultSet.wasNull()) {
- int day = (int) TimeUnit.MILLISECONDS.toDays(date.getTime());
- if (day < 0 || day > MAX_DAY) {
+ long day = TimeUnit.MILLISECONDS.toDays(date.getTime());
+ if (day < Integer.MIN_VALUE || day > MAX_DAY) {
Review comment:
Yeah, but otherwise we can end up with particularly bad data. I've
changed the condition to just be something similar to Math.toExactInt where we
see if the casted value is the same.
Without doing the big int comparison I think we run the risk of wrapping
around into a totally valid positive range (e.g. 2*Integer.MAX_VALUE + x)
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]