Github user paul-rogers commented on the issue:
https://github.com/apache/drill/pull/916
Back to my original question. The premise of this bug seems to be that we
corrupt Parquet dates and convert perfectly valid 4-digit years into invalid
5-digit years. That is clearly a data corruption bug that should never occur.
Why don't we fix that?
Given that we've accepted the data corruption, we need to display
five-digit years, which the Java date and time classes don't support in
`toString()`. The code relies on `toString()` because it does not format
dates properly with the formatting classes provided. Date display
should honor the format preferences provided by the user, not the defaults
built into `toString()`. That's bug number two.
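
To illustrate the difference between default `toString()` output and explicit formatting, here is a minimal sketch. It assumes the standard `java.time` classes rather than whatever date classes Drill uses internally, and `DateFormatSketch` and its `format` helper are hypothetical names invented for this example:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch, not Drill code: contrasts toString() with
// formatting driven by a caller-supplied pattern.
class DateFormatSketch {

    // Format a date with a pattern chosen by the user (assumed helper).
    static String format(LocalDate date, String userPattern) {
        return DateTimeFormatter.ofPattern(userPattern).format(date);
    }

    public static void main(String[] args) {
        LocalDate valid = LocalDate.of(2017, 8, 21);
        LocalDate fiveDigit = LocalDate.of(12017, 8, 21);

        // toString() always uses the built-in ISO-style layout; years
        // above 9999 are printed with a leading '+' sign.
        System.out.println(valid);      // 2017-08-21
        System.out.println(fiveDigit);  // +12017-08-21

        // An explicit pattern puts the layout under the caller's control.
        System.out.println(format(valid, "dd/MM/yyyy"));    // 21/08/2017
        System.out.println(format(fiveDigit, "y-MM-dd"));   // 12017-08-21
    }
}
```

The point of the sketch is only that the layout comes from a pattern the caller supplies, so no special date class is needed to control how a value is displayed.
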
Now, given the above two bugs, we introduce a third by creating ad-hoc,
Drill-specific date/time classes, violating the JDBC standard, to display the
corrupt five-digit years. So, Drill will no longer return the `java.sql.Date`
class as specified by the standard, but rather our own subclass. How will this
affect client code that relies on standard behavior?
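
As one possible answer, here is a hedged sketch of two ways client code could notice a subclass. `CustomDate` is a hypothetical stand-in, not the class proposed in this PR, and its `toString()` override is invented purely for illustration:

```java
import java.sql.Date;

// Hypothetical sketch: shows client-side checks that behave differently
// when a driver returns a subclass of java.sql.Date.
class SubclassSketch {

    // Stand-in for a driver-specific subclass with a custom toString().
    static class CustomDate extends Date {
        CustomDate(long millis) { super(millis); }
        @Override public String toString() { return "custom:" + super.toString(); }
    }

    // Some clients compare the exact class rather than using instanceof.
    static boolean isExactlySqlDate(Object o) {
        return o != null && o.getClass() == Date.class;
    }

    // Some clients round-trip a date through its string form.
    static boolean roundTrips(Date d) {
        try {
            return Date.valueOf(d.toString()).toString().equals(d.toString());
        } catch (IllegalArgumentException e) {
            return false; // valueOf rejects anything not in yyyy-[m]m-[d]d form
        }
    }

    public static void main(String[] args) {
        Date standard = Date.valueOf("2017-08-21");
        Date custom = new CustomDate(standard.getTime());

        System.out.println(isExactlySqlDate(standard)); // true
        System.out.println(isExactlySqlDate(custom));   // false
        System.out.println(roundTrips(standard));       // true
        System.out.println(roundTrips(custom));         // false
    }
}
```

Whether real clients perform checks like these is an assumption; the sketch only shows that such code would see different results once a subclass is returned.
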
I feel we are compounding error upon error. Can we go back and fix the
original problem: that users might prefer that we don't corrupt dates in their
data? That is, the problem is not so much that we don't format corrupt data
correctly, but rather that we do, in fact, corrupt data.
---