Hello Kudu Jenkins, Grant Henke,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/17213
to look at the new patch set (#2).
Change subject: [backup] set spark.sql.legacy.parquet.int96RebaseModeInWrite
......................................................................
[backup] set spark.sql.legacy.parquet.int96RebaseModeInWrite
After the bump to Spark 3.1.1, TestKuduBackup.testRandomBackupAndRestore
started failing with errors like the following:
02:04:37.919 [ERROR - Executor task launch worker for task 0.0 in stage 0.0
(TID 0)] (Logging.scala:94) Aborting task
org.apache.spark.SparkUpgradeException: You may get a different result due to
the upgrading of Spark 3.0: writing dates before 1582-10-15 or timestamps
before 1900-01-01T00:00:00Z into Parquet INT96 files can be dangerous, as the
files may be read by Spark 2.x or legacy versions of Hive later, which uses a
legacy hybrid calendar that is different from Spark 3.0+'s Proleptic Gregorian
calendar. See more details in SPARK-31404. You can set
spark.sql.legacy.parquet.int96RebaseModeInWrite to 'LEGACY' to rebase the
datetime values w.r.t. the calendar difference during writing, to get maximum
interoperability. Or set spark.sql.legacy.parquet.int96RebaseModeInWrite to
'CORRECTED' to write the datetime values as it is, if you are 100% sure that
the written files will only be read by Spark 3.0+ or other systems that use
Proleptic Gregorian calendar.
at
org.apache.spark.sql.execution.datasources.DataSourceUtils$.newRebaseExceptionInWrite(DataSourceUtils.scala:165)
~[spark-sql_2.12-3.1.1.jar:3.1.1]
...
Per the instructions in the error message, this patch sets the
spark.sql.legacy.parquet.int96RebaseModeInWrite option.
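For reference, a minimal sketch of how this option can be set when building a Spark session. This is not the actual one-line change to KuduBackup.scala. The object name, app name, local master, and the choice of 'LEGACY' over 'CORRECTED' are assumptions made for illustration only.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical standalone example; not taken from the Kudu source tree.
object Int96RebaseModeSketch {
  // Config key quoted in the SparkUpgradeException above (Spark 3.1.x name).
  val RebaseKey = "spark.sql.legacy.parquet.int96RebaseModeInWrite"

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("int96-rebase-sketch") // assumed name
      .master("local[1]")             // assumed: local run for illustration
      // Assumed value: "LEGACY" rebases INT96 timestamps for older readers.
      // "CORRECTED" would write the values unchanged instead.
      .config(RebaseKey, "LEGACY")
      .getOrCreate()

    // Read the setting back to confirm it is applied to this session.
    println(s"$RebaseKey = ${spark.conf.get(RebaseKey)}")

    spark.stop()
  }
}
```

As the error text explains, 'LEGACY' targets files that Spark 2.x or legacy Hive may read later, while 'CORRECTED' is only safe when every reader uses the Proleptic Gregorian calendar.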
Change-Id: Ib9ca4d9e69785dd9d056fa8e62c944d56cf219ed
---
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala
1 file changed, 1 insertion(+), 0 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/13/17213/2
--
To view, visit http://gerrit.cloudera.org:8080/17213
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib9ca4d9e69785dd9d056fa8e62c944d56cf219ed
Gerrit-Change-Number: 17213
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong
Gerrit-Reviewer: Andrew Wong
Gerrit-Reviewer: Grant Henke
Gerrit-Reviewer: Kudu Jenkins (120)