rshanmugam1 commented on issue #7244:
URL: https://github.com/apache/hudi/issues/7244#issuecomment-1335229534

   Tried with the precombine key set; same behavior.
   
   ![Screen Shot 2022-12-02 at 5 16 48 AM](https://user-images.githubusercontent.com/42749351/205301347-7ee66b56-2096-4c49-bb88-2a3a0f00f0ef.png)
   
   
   ```
   {{ config(
       materialized = 'incremental',
       incremental_strategy = 'merge',
       file_format = 'hudi',
       options={
         'type': 'cow',
         'primaryKey': 'id',
         'preCombineKey': 'ts',
       },
       unique_key = 'id',
   ) }}
   {% if not is_incremental() %}
   
   select cast(1 as bigint) as id, 'yo' as msg, current_timestamp() as ts
   union all
   select cast(2 as bigint) as id, 'anyway' as msg, current_timestamp() as ts
   union all
   select cast(3 as bigint) as id, 'bye' as msg, current_timestamp() as ts
   
   {% else %}
   
   select cast(1 as bigint) as id, 'yo_updated' as msg, current_timestamp() as ts
   union all
   select cast(2 as bigint) as id, 'anyway_updated' as msg, current_timestamp() as ts
   union all
   select cast(3 as bigint) as id, 'bye_updated' as msg, current_timestamp() as ts
   
   {% endif %}
   ```
   
   Queries generated by dbt:
   ```
   create table analytics.test_merge_3
   using hudi
   options (type "cow" , primaryKey "id" , preCombineKey "ts")
   as
   select cast(1 as bigint) as id, 'yo' as msg, current_timestamp() as ts
   union all
   select cast(2 as bigint) as id, 'anyway' as msg, current_timestamp() as ts
   union all
   select cast(3 as bigint) as id, 'bye' as msg, current_timestamp() as ts
   
   
   merge into analytics.test_merge_3 as DBT_INTERNAL_DEST
   using test_merge_3__dbt_tmp as DBT_INTERNAL_SOURCE
   on 
           DBT_INTERNAL_SOURCE.id = DBT_INTERNAL_DEST.id
   
   when matched then update set
      * 
   when not matched then insert *
   ```
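   For what it's worth, one way to check whether the merge actually rewrote the rows is to select the Hudi metadata columns after the incremental run. This is just a sketch against the table name used above; if the precombine/update path worked, the `msg` values should carry the `_updated` suffix and `_hoodie_commit_time` should reflect the second commit:
   ```
   select id, msg, ts, _hoodie_commit_time, _hoodie_commit_seqno
   from analytics.test_merge_3
   order by id
   ```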
   
   Spark stages:
   ![Screen Shot 2022-12-02 at 5 18 03 AM](https://user-images.githubusercontent.com/42749351/205301572-d79b7d59-dc29-4f6d-9ee9-e71defa6519c.png)
   
   hoodie.properties:
   ```
   #Properties saved on Fri Dec 02 12:15:42 UTC 2022
   #Fri Dec 02 12:15:42 UTC 2022
   hoodie.table.partition.fields=
   hoodie.table.type=COPY_ON_WRITE
   hoodie.archivelog.folder=archived
   hoodie.timeline.layout.version=1
   hoodie.table.version=3
   hoodie.table.recordkey.fields=id
   hoodie.datasource.write.partitionpath.urlencode=false
   hoodie.table.keygenerator.class=org.apache.hudi.keygen.ComplexKeyGenerator
   hoodie.table.name=test_merge_3
   hoodie.datasource.write.hive_style_partitioning=true
   hoodie.table.create.schema={"type"\:"record","name"\:"topLevelRecord","fields"\:[{"name"\:"_hoodie_commit_time","type"\:["string","null"]},{"name"\:"_hoodie_commit_seqno","type"\:["string","null"]},{"name"\:"_hoodie_record_key","type"\:["string","null"]},{"name"\:"_hoodie_partition_path","type"\:["string","null"]},{"name"\:"_hoodie_file_name","type"\:["string","null"]},{"name"\:"id","type"\:"long"},{"name"\:"msg","type"\:"string"},{"name"\:"ts","type"\:{"type"\:"long","logicalType"\:"timestamp-micros"}}]}
   ```
   
   
   

