rajgowtham24 opened a new issue #2075:
URL: https://github.com/apache/hudi/issues/2075


   Hi team,
   
   In one of our tables we use VERSION as the pre-combine field and pass it 
in the write options, but it is not working as expected. In some other tables 
we use a changed-on column holding timestamp values as the pre-combine field, 
and it works fine for those tables.
   
   When the pre-combine key is numeric, it works as expected as long as the 
values are less than 10. Once a value is greater than 9, the record with 
Version 9 is written to the table and the values greater than 9 are ignored.
   
   I'm not sure whether this is a bug or whether I'm using an incorrect option 
while writing.
   
   Sample dataframe values
   
   +-------+-------+----------+
   |   NAME|VERSION|CHANGED_BY|
   +-------+-------+----------+
   |T009S50|      3|   USER001|
   |T009S50|      2|   USER002|
   |T009S50|      1|   USER002|
   |T009S50|      5|   USER001|
   |T009S50|      4|   USER001|
   |T009S50|      6|   USER002|
   |T009S50|      7|   USER002|
   |T009S50|      8|   USER001|
   |T009S50|      9|   USER001|
   |T009S50|     10|   USER003|
   +-------+-------+----------+
   
   Write Options used
   
   
(input_df.write.format("org.apache.hudi")
    .option("hoodie.datasource.write.recordkey.field", "NAME")
    .option("hoodie.datasource.write.precombine.field", "VERSION")
    .option("hoodie.table.name", "TABLE1")
    .option("hoodie.datasource.write.storage.type", "MERGE_ON_READ")
    .option("hoodie.datasource.hive_sync.enable", "true")
    .option("hoodie.datasource.hive_sync.table", "TABLE1")
    .option("hoodie.datasource.hive_sync.assume_date_partitioning", "false")
    .option("hoodie.datasource.hive_sync.partition_extractor_class", "org.apache.hudi.hive.NonPartitionedExtractor")
    .mode("Overwrite")
    .save("s3://mybucket/tablepath/"))
   
   Record returned when querying the table from Hive
   
   T009S50  9       USER001
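
The expected and observed results can be sketched in plain Python, without Spark or Hudi. This is only an illustration: `precombine` is a hypothetical helper, not Hudi's implementation, and the string-ordering case is an unconfirmed assumption about one way the Version 9 result could arise (the text "10" sorts before "9").

```python
# Rows from the sample dataframe above: (NAME, VERSION, CHANGED_BY).
rows = [
    ("T009S50", 3, "USER001"),
    ("T009S50", 2, "USER002"),
    ("T009S50", 1, "USER002"),
    ("T009S50", 5, "USER001"),
    ("T009S50", 4, "USER001"),
    ("T009S50", 6, "USER002"),
    ("T009S50", 7, "USER002"),
    ("T009S50", 8, "USER001"),
    ("T009S50", 9, "USER001"),
    ("T009S50", 10, "USER003"),
]


def precombine(rows, order_key):
    """Keep, per NAME, the row whose VERSION is largest under order_key.

    Hypothetical helper: a plain-Python stand-in for deduplicating on a
    pre-combine field, not Hudi's actual code path.
    """
    latest = {}
    for row in rows:
        name, version, _ = row
        if name not in latest or order_key(version) > order_key(latest[name][1]):
            latest[name] = row
    return list(latest.values())


# Numeric ordering: the result I expect from the write.
expected = precombine(rows, order_key=int)

# ASSUMPTION: if VERSION were compared as text, "9" would rank above "10".
observed = precombine(rows, order_key=str)

print(expected)  # [('T009S50', 10, 'USER003')]
print(observed)  # [('T009S50', 9, 'USER001')]
```

If the string-ordering assumption holds, checking the type of VERSION in the dataframe schema before the write would be a reasonable first step.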
   
   I have tested the above scenario on a few of the datasets.
   
   Environment Details
   emr-6.0.0
   Hudi Version - 0.5.0
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

