neerajpadarthi commented on issue #6232:
URL: https://github.com/apache/hudi/issues/6232#issuecomment-1199914603

   @yihua 
   
   Hey, I tested the same in Hudi 0.10.1, but no luck: the timestamp precision is 
still getting truncated. Below are the configs, Spark session details, and 
Spark/Hudi outputs. Could you please verify and let me know if anything is 
missing here? Thanks.
   
   ===Environment Details
   
   EMR: emr-6.6.0
   Hudi version : 0.10.1
   Spark version : Spark 3.2.0
   Hive version : Hive 3.1.2
   Hadoop version :
   Storage (HDFS/S3/GCS..) : S3
   Running on Docker? (yes/no) : no
   
   ===Spark Configs
   
   from pyspark.sql import SparkSession

   def create_spark_session():
       # Hive-enabled session with the Hudi SQL extension and microsecond Parquet timestamps.
       spark = SparkSession \
           .builder \
           .config("spark.sql.extensions", "org.apache.spark.sql.hudi.HoodieSparkSessionExtension") \
           .config("spark.sql.parquet.writeLegacyFormat", "true") \
           .config("spark.sql.parquet.outputTimestampType", "TIMESTAMP_MICROS") \
           .config("spark.sql.legacy.parquet.datetimeRebaseModeInRead", "LEGACY") \
           .config("spark.sql.legacy.parquet.int96RebaseModeInRead", "LEGACY") \
           .enableHiveSupport() \
           .getOrCreate()

       return spark
   
   ===Hudi Configs
   
   db_name = <>
   tableName = <>
   pk = <>
   de_dup = <>
   commonConfig = {
       'hoodie.datasource.hive_sync.database': db_name,
       'hoodie.table.name': tableName,
       'hoodie.datasource.hive_sync.support_timestamp': 'true',
       'hoodie.datasource.write.recordkey.field': pk,
       'hoodie.datasource.write.precombine.field': de_dup,
       'hoodie.datasource.hive_sync.enable': 'true',
       'hoodie.datasource.hive_sync.table': tableName,
   }
   nonPartitionConfig = {
       'hoodie.datasource.hive_sync.partition_extractor_class': 'org.apache.hudi.hive.NonPartitionedExtractor',
       'hoodie.datasource.write.keygenerator.class': 'org.apache.hudi.keygen.NonpartitionedKeyGenerator',
   }
   config = {
       'hoodie.bulkinsert.shuffle.parallelism': 10,
       'hoodie.datasource.write.operation': 'bulk_insert',
       'hoodie.parquet.outputtimestamptype': 'TIMESTAMP_MICROS',
       # 'hoodie.datasource.write.row.writer.enable': 'false',
   }
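
The three dictionaries above are presumably merged into one options map before the write; the write call itself is not shown in this comment, so the following is only a minimal sketch under that assumption. The helper name `build_hudi_options`, the example values, and the commented-out write call are hypothetical and stand in for the elided `<>` fields.

```python
def build_hudi_options(*config_dicts):
    """Merge Hudi option dicts left to right; later dicts win on key clashes."""
    merged = {}
    for cfg in config_dicts:
        merged.update(cfg)
    # Stringify values so every option is passed in a uniform form.
    return {key: str(value) for key, value in merged.items()}


# Hypothetical example values; the real ones are elided in the comment.
example_common = {
    "hoodie.table.name": "example_table",
    "hoodie.datasource.hive_sync.support_timestamp": "true",
}
example_write = {
    "hoodie.bulkinsert.shuffle.parallelism": 10,
    "hoodie.datasource.write.operation": "bulk_insert",
    "hoodie.parquet.outputtimestamptype": "TIMESTAMP_MICROS",
}
hudi_options = build_hudi_options(example_common, example_write)

# With a real DataFrame `df` and a target path (both assumed), the write could look like:
# df.write.format("hudi").options(**hudi_options).mode("overwrite").save(target_path)
```

Merging in a fixed order makes it easy to see which dictionary supplies the final value for a key such as `hoodie.parquet.outputtimestamptype`.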
   
   ===Spark DF Output
   +----------+--------------------------+--------------------------+
   |id        |creation_date             |last_updated              |
   +----------+--------------------------+--------------------------+
   |1340225   |2017-01-24 00:02:10       |2022-02-25 07:03:54.000853|
   |722b232f-e|2022-02-22 06:02:32.000481|2022-02-25 08:54:05.00042 |
   |53773de3-9|2022-02-25 07:21:06.000037|2022-02-25 08:35:57.000877|
   +----------+--------------------------+--------------------------+
   
   ===Hudi V0.10.1 Output
   
   +-------------------+---------------------+------------------+----------------------+------------------------------------------------------------------------+----------+-------------------+-------------------+
   |_hoodie_commit_time|_hoodie_commit_seqno |_hoodie_record_key|_hoodie_partition_path|_hoodie_file_name                                                       |id        |creation_date      |last_updated       |
   +-------------------+---------------------+------------------+----------------------+------------------------------------------------------------------------+----------+-------------------+-------------------+
   |20220729201157281  |20220729201157281_1_2|53773de3-9        |                      |55f7c820-c289-4eb7-aabc-4f079bd44536-0_1-11-10_20220729201157281.parquet|53773de3-9|2022-02-25 07:21:06|2022-02-25 08:35:57|
   |20220729201157281  |20220729201157281_2_3|722b232f-e        |                      |0dd8d6c2-9d64-40d7-a4db-bf7cf95bd02c-0_2-11-11_20220729201157281.parquet|722b232f-e|2022-02-22 06:02:32|2022-02-25 08:54:05|
   |20220729201157281  |20220729201157281_0_1|1340225           |                      |2e0cf27b-999d-4d5e-9c4e-52d27c25294e-0_0-9-9_20220729201157281.parquet  |1340225   |2017-01-24 00:02:10|2022-02-25 07:03:54|
   +-------------------+---------------------+------------------+----------------------+------------------------------------------------------------------------+----------+-------------------+-------------------+
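
To make the difference between the two outputs explicit, here is a minimal sketch that compares the printed timestamp strings and flags values whose sub-second digits disappeared. It works only on the displayed text, not on the Parquet files, and the helper names `parse_ts` and `lost_precision` are hypothetical.

```python
from datetime import datetime


def parse_ts(text):
    """Parse a Spark-style timestamp string, with or without fractional seconds."""
    text = text.strip()
    if "." in text:
        head, frac = text.split(".", 1)
        # Spark trims trailing zeros when printing, so right-pad the fraction to 6 digits.
        text = f"{head}.{frac.ljust(6, '0')}"
        return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")


def lost_precision(source_text, written_text):
    """True when the written value dropped sub-second digits present in the source."""
    source, written = parse_ts(source_text), parse_ts(written_text)
    return source.microsecond != 0 and written.microsecond == 0


# Values copied from the outputs above (id 53773de3-9, last_updated).
print(lost_precision("2022-02-25 08:35:57.000877", "2022-02-25 08:35:57"))
# id 1340225, creation_date: the source has no fraction, so nothing was lost.
print(lost_precision("2017-01-24 00:02:10", "2017-01-24 00:02:10"))
```

A check like this only confirms what the tables show; it does not say whether the precision was lost at write time or when the table was read back.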


