Virmaline commented on issue #6278: URL: https://github.com/apache/hudi/issues/6278#issuecomment-1353791597
@alexeykudinkin Hey Alexey, I'm also still getting the same error after updating to 0.12.1.

Hudi: 0.12.1-amzn-0-SNAPSHOT
Spark: 3.3.0
EMR: 6.9.0

`spark-submit --master yarn --deploy-mode cluster --conf spark.serializer=org.apache.spark.serializer.KryoSerializer,spark.sql.parquet.datetimeRebaseModeInRead=CORRECTED,spark.sql.parquet.datetimeRebaseModeInWrite=CORRECTED,spark.sql.avro.datetimeRebaseModeInWrite=CORRECTED,spark.sql.avro.datetimeRebaseModeInRead=CORRECTED,spark.sql.legacy.parquet.datetimeRebaseModeInRead=CORRECTED,spark.sql.legacy.parquet.datetimeRebaseModeInWrite=CORRECTED,spark.sql.legacy.parquet.int96RebaseModeInRead=CORRECTED,spark.sql.legacy.parquet.int96RebaseModeInWrite=CORRECTED --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer /usr/lib/hudi/hudi-utilities-bundle.jar --table-type COPY_ON_WRITE --source-ordering-field replicadmstimestamp --source-class org.apache.hudi.utilities.sources.ParquetDFSSource --target-base-path s3://bucket/folder/folder/table --target-table table --payload-class org.apache.hudi.common.model.AWSDmsAvroPayload --hoodie-conf hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.TimestampBasedKeyGenerator --hoodie-conf hoodie.deltastreamer.keygen.timebased.timestamp.type=DATE_STRING --hoodie-conf hoodie.deltastreamer.keygen.timebased.output.dateformat=yyyy-MM --hoodie-conf "hoodie.deltastreamer.keygen.timebased.input.dateformat=yyyy-MM-dd HH:mm:ss.SSSSSS" --hoodie-conf hoodie.datasource.write.recordkey.field=_id --hoodie-conf hoodie.datasource.write.partitionpath.field=replicadmstimestamp --hoodie-conf hoodie.deltastreamer.source.dfs.root=s3://bucket/folder/folder/table`

I've tried just about every combination of the datetimeRebaseMode settings I could think of, and the result is always the same. The stack trace is attached. Is there any possible workaround for this? I currently have a separate process that changes the timestamp columns, which works but adds a bunch of overhead to the pipeline.
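One thing that may be worth ruling out (an assumption, not something confirmed in this thread): `spark-submit` documents `--conf` as taking a single `PROP=VALUE` pair, so the comma-joined list in the command above could end up being read as one long value for `spark.serializer` rather than as nine separate settings. A minimal sketch that passes each property with its own `--conf` flag follows. It uses only the property names from the command above, prints the command instead of running it, and omits the remaining DeltaStreamer arguments, which would stay unchanged.

```shell
#!/usr/bin/env bash
# Sketch: expand a list of Spark properties into one --conf flag per property.
# The property list is copied from the command above; nothing here is Hudi-specific.
set -euo pipefail

props=(
  "spark.serializer=org.apache.spark.serializer.KryoSerializer"
  "spark.sql.parquet.datetimeRebaseModeInRead=CORRECTED"
  "spark.sql.parquet.datetimeRebaseModeInWrite=CORRECTED"
  "spark.sql.avro.datetimeRebaseModeInWrite=CORRECTED"
  "spark.sql.avro.datetimeRebaseModeInRead=CORRECTED"
  "spark.sql.legacy.parquet.datetimeRebaseModeInRead=CORRECTED"
  "spark.sql.legacy.parquet.datetimeRebaseModeInWrite=CORRECTED"
  "spark.sql.legacy.parquet.int96RebaseModeInRead=CORRECTED"
  "spark.sql.legacy.parquet.int96RebaseModeInWrite=CORRECTED"
)

# Build the argument array: each property gets its own --conf flag.
conf_args=()
for p in "${props[@]}"; do
  conf_args+=(--conf "$p")
done

# Print the resulting command prefix rather than executing it,
# so the sketch can be run on a machine without Spark installed.
printf '%s ' spark-submit --master yarn --deploy-mode cluster "${conf_args[@]}"
printf '\n'
```

If the settings do take effect once they are split out, the driver's Environment tab in the Spark UI should list each `datetimeRebaseMode` property separately.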
[stacktrace.txt](https://github.com/apache/hudi/files/10241150/stacktrace.txt)
