maheshguptags commented on issue #12738:
URL: https://github.com/apache/hudi/issues/12738#issuecomment-2687208952
I tried the same configuration with Hudi 0.15 and it gives the expected result, unlike Hudi 1.0.0.
Here is a screenshot, along with the table-creation schema and the
hoodie.properties file.
```
#Properties saved on 2025-02-27T07:54:16.016244Z
#Thu Feb 27 07:54:16 UTC 2025
hoodie.table.type=COPY_ON_WRITE
hoodie.table.precombine.field=ts
hoodie.table.partition.fields=x
hoodie.archivelog.folder=archived
hoodie.table.cdc.enabled=false
hoodie.timeline.layout.version=1
hoodie.table.checksum=1292384652
hoodie.datasource.write.drop.partition.columns=false
hoodie.table.recordkey.fields=x,y
hoodie.table.name=customer_profile
hoodie.datasource.write.hive_style_partitioning=false
hoodie.table.keygenerator.class=org.apache.hudi.keygen.ComplexAvroKeyGenerator
hoodie.database.name=default_database
hoodie.datasource.write.partitionpath.urlencode=false
hoodie.table.version=6
```
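When comparing the two table versions, it can help to diff the `hoodie.properties` of both tables. It is a plain Java properties file, so a short script suffices (a sketch; the sample content and the keys inspected are illustrative):

```python
# Sketch: parse hoodie.properties so key settings can be diffed between
# two Hudi tables (e.g. one written by 0.15 and one by 1.0.0).
# The sample text below is illustrative, taken from the properties above.

def load_properties(text):
    """Parse Java .properties content: '#' comment lines, key=value pairs."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

sample = """\
#Properties saved on 2025-02-27T07:54:16.016244Z
hoodie.table.type=COPY_ON_WRITE
hoodie.table.version=6
hoodie.table.recordkey.fields=x,y
"""

props = load_properties(sample)
print(props["hoodie.table.version"])  # -> 6
```

Running the same parse over both tables' `hoodie.properties` and diffing the dicts makes version or key-generator mismatches easy to spot.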
```
app.streamingDdlQuery=CREATE TABLE IF NOT EXISTS customer_profile (
    x STRING,
    y STRING,
    created_date TIMESTAMP(3),
    ts STRING
) PARTITIONED BY (`x`) WITH (
    'connector' = 'hudi',
    'write.task.max.size' = '2048',
    'write.merge.max_memory' = '1024',
    'path' = '${app.cdp.base.path}/customer_profile_temp_6/',
    'table.type' = 'COPY_ON_WRITE',
    'hoodie.datasource.write.recordkey.field' = 'x,y',
    'payload.class' = 'com.gupshup.cdp.poc',
    'precombine.field' = 'ts',
    'hoodie.clean.async' = 'true',
    'hoodie.cleaner.policy' = 'KEEP_LATEST_COMMITS',
    'hoodie.clean.automatic' = 'true',
    'hoodie.clean.max.commits' = '8',
    'hoodie.clean.trigger.strategy' = 'NUM_COMMITS',
    'hoodie.cleaner.parallelism' = '100',
    'hoodie.cleaner.commits.retained' = '6',
    'hoodie.index.type' = 'BUCKET',
    'hoodie.index.bucket.engine' = 'SIMPLE',
    'hoodie.bucket.index.num.buckets' = '16',
    'hoodie.bucket.index.hash.field' = 'y',
    'hoodie.parquet.small.file.limit' = '104857600',
    'hoodie.parquet.compression.codec' = 'snappy',
    'hoodie.schema.on.read.enable' = 'true',
    'hoodie.archive.automatic' = 'true',
    'hoodie.keep.max.commits' = '45',
    'hoodie.keep.min.commits' = '30'
)
```
Screenshot: I ingested 2.5M records and manually killed the TaskManager
during the 2nd checkpoint; no data was lost, and the write to the Hudi
table completed successfully in the 3rd checkpoint.
**After the 3rd checkpoint no data was ingested, so it completes with MS.**
<img width="1180" alt="Image"
src="https://github.com/user-attachments/assets/face9da7-71b7-4af6-be62-514565f2d7b9"
/>
**Not sure, but it seems this is an issue with the MDT (metadata table).**
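If the metadata table is the suspect, one way to narrow it down is to rerun the same 1.0.0 job with the MDT disabled and see whether the behavior changes. A minimal sketch, assuming the Flink-side `metadata.enabled` option (verify the option name against the Hudi docs for your version), added to the `WITH` clause of the DDL above:

```
'metadata.enabled' = 'false'
```

If the job then behaves like 0.15, that would support the MDT hypothesis; if not, the cause is likely elsewhere.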
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.