[GitHub] [hudi] jonathanmorais edited a comment on issue #2123: Timestamp not parsed correctly on Athena

GitBox Thu, 10 Dec 2020 08:43:09 -0800


jonathanmorais edited a comment on issue #2123:
URL: https://github.com/apache/hudi/issues/2123#issuecomment-742607488



   @satishkotha not exactly. I followed the tip 
`hoodie.datasource.hive_sync.support_timestamp=true` and it's not working for 
me. I'm using partitioned tables, and I thought this could be the problem.
   
   Here is my spark-submit:
   
   ```
   spark-submit --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
   --packages org.apache.spark:spark-avro_2.11:2.4.4,org.apache.hudi:hudi-utilities-bundle_2.11:0.5.1-incubating \
   --conf "spark.serializer=org.apache.spark.serializer.KryoSerializer" \
   --master yarn --deploy-mode client \
   /usr/lib/hudi/hudi-spark-bundle.jar,/usr/lib/hudi/hudi-utilities-bundle.jar,/usr/lib/spark/external/lib/spark-avro.jar \
   --source-ordering-field <id_field> \
   --source-class org.apache.hudi.utilities.sources.ParquetDFSSource \
   --target-base-path s3://<bucket>/<prefix> --target-table <table> --table-type COPY_ON_WRITE \
   --transformer-class org.apache.hudi.utilities.transform.AWSDmsTransformer \
   --payload-class org.apache.hudi.payload.AWSDmsAvroPayload \
   --hoodie-conf hoodie.datasource.hive_sync.support_timestamp=true,hoodie.datasource.write.recordkey.field=<id_field>,hoodie.datasource.write.partitionpath.field=<partition_field>,hoodie.deltastreamer.source.dfs.root=s3://<bucket>/<prefix>
   ```


