Andrew Wong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17213 )

Change subject: [backup] set spark.sql.legacy.parquet.int96RebaseModeInWrite
......................................................................

[backup] set spark.sql.legacy.parquet.int96RebaseModeInWrite

After the bump to Spark 3.1.1, TestKuduBackup.testRandomBackupAndRestore
started failing with errors like the following:

02:04:37.919 [ERROR - Executor task launch worker for task 0.0 in stage 0.0 
(TID 0)] (Logging.scala:94) Aborting task
org.apache.spark.SparkUpgradeException: You may get a different result due to 
the upgrading of Spark 3.0: writing dates before 1582-10-15 or timestamps 
before 1900-01-01T00:00:00Z into Parquet INT96 files can be dangerous, as the 
files may be read by Spark 2.x or legacy versions of Hive later, which uses a 
legacy hybrid calendar that is different from Spark 3.0+'s Proleptic Gregorian 
calendar. See more details in SPARK-31404. You can set 
spark.sql.legacy.parquet.int96RebaseModeInWrite to 'LEGACY' to rebase the 
datetime values w.r.t. the calendar difference during writing, to get maximum 
interoperability. Or set spark.sql.legacy.parquet.int96RebaseModeInWrite to 
'CORRECTED' to write the datetime values as it is, if you are 100% sure that 
the written files will only be read by Spark 3.0+ or other systems that use 
Proleptic Gregorian calendar.
        at org.apache.spark.sql.execution.datasources.DataSourceUtils$.newRebaseExceptionInWrite(DataSourceUtils.scala:165) ~[spark-sql_2.12-3.1.1.jar:3.1.1]
...

Following the guidance in that error message, this patch sets the
spark.sql.legacy.parquet.int96RebaseModeInWrite option in KuduBackup.scala.
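
For illustration only, here is a minimal sketch of how such an option can be set on a SparkSession. It is not the actual one-line change to KuduBackup.scala. The object name, app name, and local master are hypothetical, and the 'LEGACY' value is an assumption: the error message above offers both 'LEGACY' and 'CORRECTED', and this message does not say which one the patch uses.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical standalone example; not taken from KuduBackup.scala.
object Int96RebaseModeExample {
  // Config key quoted in the SparkUpgradeException message above.
  val RebaseKey = "spark.sql.legacy.parquet.int96RebaseModeInWrite"

  def main(args: Array[String]): Unit = {
    val session = SparkSession
      .builder()
      .appName("int96-rebase-mode-example") // hypothetical app name
      .master("local[1]")                   // local master, for illustration only
      // Assumed value: "LEGACY" rebases datetime values for interoperability
      // with Spark 2.x and legacy Hive readers. "CORRECTED" writes them as is.
      .config(RebaseKey, "LEGACY")
      .getOrCreate()

    try {
      // Read the setting back to confirm it is applied to this session.
      println(s"$RebaseKey = ${session.conf.get(RebaseKey)}")
    } finally {
      session.stop()
    }
  }
}
```

With either value set, Spark no longer raises the SparkUpgradeException on INT96 writes. The choice between the two depends on which readers will later consume the Parquet files.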

Change-Id: Ib9ca4d9e69785dd9d056fa8e62c944d56cf219ed
Reviewed-on: http://gerrit.cloudera.org:8080/17213
Reviewed-by: Grant Henke <[email protected]>
Tested-by: Andrew Wong <[email protected]>
---
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala
1 file changed, 1 insertion(+), 0 deletions(-)

Approvals:
  Grant Henke: Looks good to me, approved
  Andrew Wong: Verified

--
To view, visit http://gerrit.cloudera.org:8080/17213
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ib9ca4d9e69785dd9d056fa8e62c944d56cf219ed
Gerrit-Change-Number: 17213
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
