abhishekshenoy edited a comment on issue #3313:
URL: https://github.com/apache/hudi/issues/3313#issuecomment-1054225531


   @nsivabalan I see the issue is closed, but in 0.10.1 I still get duplicate 
records when a timestamp column is part of the composite record key.
   ```
   hoodiConfigs.put("hoodie.insert.shuffle.parallelism", "1")
   hoodiConfigs.put("hoodie.upsert.shuffle.parallelism", "1")
   hoodiConfigs.put("hoodie.bulkinsert.shuffle.parallelism", "1")
   hoodiConfigs.put("hoodie.delete.shuffle.parallelism", "1")
   hoodiConfigs.put("hoodie.datasource.write.row.writer.enable", "true")
   hoodiConfigs.put("hoodie.table.keygenerator.class", classOf[ComplexKeyGenerator].getName)
   hoodiConfigs.put("hoodie.datasource.write.keygenerator.class", classOf[ComplexKeyGenerator].getName)
   hoodiConfigs.put("hoodie.datasource.write.recordkey.field", "transactionId,storeNbr,transactionTs")
   hoodiConfigs.put("hoodie.datasource.write.precombine.field", "messageMetadata.srcLoadTs")
   hoodiConfigs.put("hoodie.table.precombine.field", "messageMetadata.srcLoadTs")
   hoodiConfigs.put("hoodie.datasource.write.partitionpath.field", "transactionDt")
   hoodiConfigs.put("hoodie.datasource.write.payload.class", classOf[DefaultHoodieRecordPayload].getName)
   hoodiConfigs.put("hoodie.datasource.write.hive_style_partitioning", "true")
   hoodiConfigs.put("hoodie.datasource.write.table.type", COW_TABLE_TYPE_OPT_VAL)
   hoodiConfigs.put("hoodie.combine.before.upsert", "true")
   hoodiConfigs.put("hoodie.table.name", "huditransaction")
   hoodiConfigs.put("hoodie.datasource.write.keygenerator.consistent.logical.timestamp.enabled", "true")
   ```    
   
   I tried both BULK_INSERT_OPERATION_OPT_VAL and UPSERT_OPERATION_OPT_VAL as 
the write operation.
       
   Output dataset after the first insert. As you can see, combine before insert 
did not work: the key (abc, 4162, 2022-02-25 05:08:10.073) appears twice.
   ```
   
   +-------------------+---------------------+---------------------------------------------------------------------+------------------------+-----------------------------------------------------------------------+-------------+--------+-----------------------+--------------------------------------------------+----------+--------+---------+----------------+-------------+
   |_hoodie_commit_time|_hoodie_commit_seqno |_hoodie_record_key                                                   |_hoodie_partition_path  |_hoodie_file_name                                                      |transactionId|storeNbr|transactionTs          |messageMetadata                                   |prefixes  |dummyInt|dummyLong|dummyObjects    |transactionDt|
   +-------------------+---------------------+---------------------------------------------------------------------+------------------------+-----------------------------------------------------------------------+-------------+--------+-----------------------+--------------------------------------------------+----------+--------+---------+----------------+-------------+
   |20220228183206147  |20220228183206147_0_1|transactionId:abc,storeNbr:4162,transactionTs:2022-02-25 05:08:10.073|transactionDt=2022-02-25|f86d9a60-8465-410d-bca6-c478bf3a48e9-0_0-10-0_20220228183206147.parquet|abc          |4162    |2022-02-25 05:08:10.073|{key, value, 1, 2, 2022-02-25 05:09:10, 1, { -> }}|[abc, def]|1       |1        |[{a, 1}, {a, 1}]|2022-02-25   |
   |20220228183206147  |20220228183206147_0_2|transactionId:abc,storeNbr:4162,transactionTs:2022-02-25 05:08:10.073|transactionDt=2022-02-25|f86d9a60-8465-410d-bca6-c478bf3a48e9-0_0-10-0_20220228183206147.parquet|abc          |4162    |2022-02-25 05:08:10.073|{key, value, 1, 2, 2022-02-25 05:09:10, 1, { -> }}|[abc, def]|1       |1        |[{a, 1}, {a, 1}]|2022-02-25   |
   |20220228183206147  |20220228183206147_0_3|transactionId:bcd,storeNbr:4162,transactionTs:2022-02-25 05:08:10.073|transactionDt=2022-02-25|f86d9a60-8465-410d-bca6-c478bf3a48e9-0_0-10-0_20220228183206147.parquet|bcd          |4162    |2022-02-25 05:08:10.073|{key, value, 1, 2, 2022-02-25 05:09:10, 1, { -> }}|[abc, def]|1       |1        |[{a, 1}, {a, 1}]|2022-02-25   |
   |20220228183206147  |20220228183206147_0_4|transactionId:cde,storeNbr:4163,transactionTs:2022-02-25 05:08:10.073|transactionDt=2022-02-25|f86d9a60-8465-410d-bca6-c478bf3a48e9-0_0-10-0_20220228183206147.parquet|cde          |4163    |2022-02-25 05:08:10.073|{key, value, 1, 2, 2022-02-25 05:09:10, 1, { -> }}|[abc, def]|1       |1        |[{a, 1}, {a, 1}]|2022-02-25   |
   |20220228183206147  |20220228183206147_0_5|transactionId:def,storeNbr:4163,transactionTs:2022-02-25 05:08:10.073|transactionDt=2022-02-25|f86d9a60-8465-410d-bca6-c478bf3a48e9-0_0-10-0_20220228183206147.parquet|def          |4163    |2022-02-25 05:08:10.073|{key, value, 1, 2, 2022-02-25 05:09:10, 1, { -> }}|[abc, def]|1       |1        |[{a, 1}, {a, 1}]|2022-02-25   |
   +-------------------+---------------------+---------------------------------------------------------------------+------------------------+-----------------------------------------------------------------------+-------------+--------+-----------------------+--------------------------------------------------+----------+--------+---------+----------------+-------------+
   ```
   
   Republishing an additional record for the same key (abc, 4162, 
2022-02-25 05:08:10.073) does not dedupe it; the key now appears three times:
   ```
   
   +-------------------+---------------------+---------------------------------------------------------------------+------------------------+-----------------------------------------------------------------------+-------------+--------+-----------------------+--------------------------------------------------+----------+--------+---------+----------------+-------------+
   |_hoodie_commit_time|_hoodie_commit_seqno |_hoodie_record_key                                                   |_hoodie_partition_path  |_hoodie_file_name                                                      |transactionId|storeNbr|transactionTs          |messageMetadata                                   |prefixes  |dummyInt|dummyLong|dummyObjects    |transactionDt|
   +-------------------+---------------------+---------------------------------------------------------------------+------------------------+-----------------------------------------------------------------------+-------------+--------+-----------------------+--------------------------------------------------+----------+--------+---------+----------------+-------------+
   |20220228183206147  |20220228183206147_0_1|transactionId:abc,storeNbr:4162,transactionTs:2022-02-25 05:08:10.073|transactionDt=2022-02-25|f86d9a60-8465-410d-bca6-c478bf3a48e9-0_0-10-0_20220228183206147.parquet|abc          |4162    |2022-02-25 05:08:10.073|{key, value, 1, 2, 2022-02-25 05:09:10, 1, { -> }}|[abc, def]|1       |1        |[{a, 1}, {a, 1}]|2022-02-25   |
   |20220228183206147  |20220228183206147_0_2|transactionId:abc,storeNbr:4162,transactionTs:2022-02-25 05:08:10.073|transactionDt=2022-02-25|f86d9a60-8465-410d-bca6-c478bf3a48e9-0_0-10-0_20220228183206147.parquet|abc          |4162    |2022-02-25 05:08:10.073|{key, value, 1, 2, 2022-02-25 05:09:10, 1, { -> }}|[abc, def]|1       |1        |[{a, 1}, {a, 1}]|2022-02-25   |
   |20220228183206147  |20220228183206147_0_3|transactionId:bcd,storeNbr:4162,transactionTs:2022-02-25 05:08:10.073|transactionDt=2022-02-25|f86d9a60-8465-410d-bca6-c478bf3a48e9-0_0-10-0_20220228183206147.parquet|bcd          |4162    |2022-02-25 05:08:10.073|{key, value, 1, 2, 2022-02-25 05:09:10, 1, { -> }}|[abc, def]|1       |1        |[{a, 1}, {a, 1}]|2022-02-25   |
   |20220228183206147  |20220228183206147_0_4|transactionId:cde,storeNbr:4163,transactionTs:2022-02-25 05:08:10.073|transactionDt=2022-02-25|f86d9a60-8465-410d-bca6-c478bf3a48e9-0_0-10-0_20220228183206147.parquet|cde          |4163    |2022-02-25 05:08:10.073|{key, value, 1, 2, 2022-02-25 05:09:10, 1, { -> }}|[abc, def]|1       |1        |[{a, 1}, {a, 1}]|2022-02-25   |
   |20220228183206147  |20220228183206147_0_5|transactionId:def,storeNbr:4163,transactionTs:2022-02-25 05:08:10.073|transactionDt=2022-02-25|f86d9a60-8465-410d-bca6-c478bf3a48e9-0_0-10-0_20220228183206147.parquet|def          |4163    |2022-02-25 05:08:10.073|{key, value, 1, 2, 2022-02-25 05:09:10, 1, { -> }}|[abc, def]|1       |1        |[{a, 1}, {a, 1}]|2022-02-25   |
   |20220228183821299  |20220228183821299_0_1|transactionId:abc,storeNbr:4162,transactionTs:2022-02-25 05:08:10.073|transactionDt=2022-02-25|66ee158e-93f3-4ccc-8b2a-1712c3cdf5cf-0_0-2-0_20220228183821299.parquet |abc          |4162    |2022-02-25 05:08:10.073|{key, value, 1, 2, 2022-02-25 05:09:10, 1, { -> }}|[abc, def]|1       |1        |[{a, 1}, {a, 1}]|2022-02-25   |
   +-------------------+---------------------+---------------------------------------------------------------------+------------------------+-----------------------------------------------------------------------+-------------+--------+-----------------------+--------------------------------------------------+----------+--------+---------+----------------+-------------+
   ```
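
   For comparison, the result I expected can be sketched in plain Scala, with no 
Spark or Hudi involved. This is only a model of the expected outcome, not Hudi's 
actual implementation: `Txn`, `ExpectedDedup`, `recordKey` and `combine` are 
hypothetical names, and `srcLoadTs` stands in for the precombine field 
`messageMetadata.srcLoadTs`.

   ```scala
   // Hypothetical model of the rows above; only the fields relevant to dedup are kept.
   final case class Txn(transactionId: String, storeNbr: Int, transactionTs: String, srcLoadTs: String)

   object ExpectedDedup {
     // Composite record key, laid out like the _hoodie_record_key column above.
     def recordKey(t: Txn): String =
       s"transactionId:${t.transactionId},storeNbr:${t.storeNbr},transactionTs:${t.transactionTs}"

     // Keep one row per record key, preferring the largest precombine value.
     // srcLoadTs is a fixed-width timestamp string, so string ordering matches time ordering.
     def combine(rows: Seq[Txn]): Seq[Txn] =
       rows.groupBy(recordKey).values.map(_.maxBy(_.srcLoadTs)).toSeq.sortBy(recordKey)

     def main(args: Array[String]): Unit = {
       val ts   = "2022-02-25 05:08:10.073"
       val load = "2022-02-25 05:09:10"
       // Same keys as the first insert: (abc, 4162) is present twice.
       val firstBatch = Seq(
         Txn("abc", 4162, ts, load), Txn("abc", 4162, ts, load),
         Txn("bcd", 4162, ts, load), Txn("cde", 4163, ts, load), Txn("def", 4163, ts, load))
       val afterInsert = combine(firstBatch)
       // Republishing (abc, 4162) should replace the existing row, not add one.
       val afterUpsert = combine(afterInsert :+ Txn("abc", 4162, ts, load))
       println(afterInsert.size) // 4 distinct keys expected; the table holds 5 rows
       println(afterUpsert.size) // still 4 expected; the table holds 6 rows
     }
   }
   ```

   In other words, I expected 4 rows after both writes, whereas the table holds 
5 rows after the first insert and 6 after the second.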
       

