MaxGekk opened a new pull request #31589:
URL: https://github.com/apache/spark/pull/31589
### What changes were proposed in this pull request?
Modify `RandomDataGenerator.forType()` to allow generation of
dates/timestamps that are valid in both the Julian and Proleptic Gregorian
calendars. Currently, the function can produce a date (for example,
`1582-10-06`) that is valid in the Proleptic Gregorian calendar but cannot be
saved to ORC files as is, because the ORC format (in fact, the ORC libraries)
assumes the Julian calendar. As a result, Spark shifts `1582-10-06` to the
next valid date, `1582-10-15`, while saving it to ORC files, and the test
fails because it compares the original date `1582-10-06` with the date
`1582-10-15` loaded back from the ORC files.
In this PR, I propose to generate dates/timestamps that are valid in both
calendars for the ORC datasource until SPARK-34440 is resolved.
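
To illustrate what "valid in both calendars" means, here is a minimal standalone sketch in Python. It is not the code in `RandomDataGenerator`. All function names are hypothetical. It assumes the usual hybrid calendar, which is Julian up to `1582-10-04` and Gregorian from `1582-10-15`, and it only handles years from 1 onward.

```python
import random
from datetime import date

# Assumption: the hybrid calendar is Julian up to 1582-10-04 and Gregorian
# from 1582-10-15; the ten days in between do not exist in it.
LAST_JULIAN_DAY = (1582, 10, 4)
GREGORIAN_CUTOVER = (1582, 10, 15)

_DAYS = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)


def _is_julian_leap(year):
    return year % 4 == 0


def _is_gregorian_leap(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def _day_fits(year, month, day, leap):
    if year < 1 or not 1 <= month <= 12:
        return False
    limit = 29 if (month == 2 and leap) else _DAYS[month - 1]
    return 1 <= day <= limit


def exists_in_proleptic_gregorian(year, month, day):
    return _day_fits(year, month, day, _is_gregorian_leap(year))


def exists_in_hybrid(year, month, day):
    ymd = (year, month, day)
    if LAST_JULIAN_DAY < ymd < GREGORIAN_CUTOVER:
        return False  # falls into the cutover gap
    leap = _is_gregorian_leap(year) if ymd >= GREGORIAN_CUTOVER else _is_julian_leap(year)
    return _day_fits(year, month, day, leap)


def valid_in_both(year, month, day):
    return exists_in_proleptic_gregorian(year, month, day) and exists_in_hybrid(year, month, day)


def random_date_valid_in_both(rng):
    """Rejection sampling: draw Proleptic Gregorian dates until one also exists in the hybrid calendar."""
    while True:
        d = date.fromordinal(rng.randint(1, date.max.toordinal()))
        if exists_in_hybrid(d.year, d.month, d.day):
            return d


print(valid_in_both(1582, 10, 6))   # False: the date from the failing test
print(valid_in_both(1582, 10, 15))  # True: the first Gregorian day
```

Python's `datetime.date` is itself Proleptic Gregorian, so the sampler only has to reject dates that fall into the cutover gap. The opposite mismatch also exists: `1500-02-29` is a real Julian date but has no Proleptic Gregorian counterpart.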
### Why are the changes needed?
The changes fix failures of `HiveOrcHadoopFsRelationSuite`. For instance,
the test "test all data types" fails with the seed **610710213676**:
```
== Results ==
!== Correct Answer - 20 == == Spark Answer - 20 ==
struct<index:int,col:date> struct<index:int,col:date>
...
![9,1582-10-06] [9,1582-10-15]
```
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
By running the modified test suite:
```
$ build/sbt -Phive -Phive-thriftserver "test:testOnly *HiveOrcHadoopFsRelationSuite"
```
Authored-by: Max Gekk <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
(cherry picked from commit 03161055de0c132070354407160553363175c4d7)
Signed-off-by: Max Gekk <[email protected]>