big-doudou commented on issue #8892: URL: https://github.com/apache/hudi/issues/8892#issuecomment-1649178251
> @big-doudou Apologies for the late reply. I was trying to reproduce this issue on our end, but was unable to do so.
>
> A little context on what we did:
>
> Using a datagen source, we sink the data into a Hudi table. Before a checkpoint, we kill one of the TM's tasks. Upon doing so, a rollback is triggered when all the TMs restart. I checked with a colleague of mine, and they mentioned that when Hudi is performing an upsert, there is a shuffle operation. The presence of a shuffle operation will trigger a "global failover".
>
> Here's the Flink SQL that I used while attempting to reproduce your issue:
>
> ```sql
> CREATE TEMPORARY TABLE buyer_info (
>   id bigint,
>   dec_col decimal(25, 10),
>   country string,
>   age INT,
>   update_time STRING
> ) WITH (
>   'connector' = 'datagen',
>   'rows-per-second' = '10',
>   'fields.age.min' = '0',
>   'fields.age.max' = '7',
>   'fields.country.length' = '1'
> );
>
> -- Hudi table to write to
> CREATE TEMPORARY TABLE dim_buyer_info_test (
>   id bigint,
>   dec_col decimal(25, 10),
>   country string,
>   age INT,
>   update_time STRING
> ) PARTITIONED BY (age)
> WITH (
>   -- Hudi settings
>   'connector' = 'hudi',
>   'hoodie.datasource.write.recordkey.field' = 'id',
>   'path' = '/path/to/hudi_table/duplicate_file_id_issue',
>   'write.operation' = 'UPSERT',
>   'table.type' = 'MERGE_ON_READ',
>   'hoodie.compaction.payload.class' = 'org.apache.hudi.common.model.PartialUpdateAvroPayload',
>   'hoodie.datasource.write.payload.class' = 'org.apache.hudi.common.model.PartialUpdateAvroPayload',
>   'hoodie.table.keygenerator.class' = 'org.apache.hudi.keygen.ComplexAvroKeyGenerator',
>   'write.precombine.field' = 'update_time',
>   'index.type' = 'BUCKET',
>   'hoodie.bucket.index.num.buckets' = '4',
>   'write.tasks' = '8',
>   'hoodie.bucket.index.hash.field' = 'id',
>   'clean.retain_commits' = '5',
>   -- Hive sync settings
>   'hive_sync.enable' = 'false'
> );
>
> -- Insert into Hudi sink
> INSERT INTO dim_buyer_info_test
> SELECT id, dec_col, country, age, update_time
> FROM buyer_info;
> ```
>
> Might have butchered the explanation above...
>
> As such, we were unable to reproduce your issue where a single TM restarts.
>
> Can you please share your job configurations and how you're doing your tests?

Sorry, I didn't see this in time. My Flink job runs on k8s: before the checkpoint, after some log files are generated, I kill the container.
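For reference, the kill step described above can be scripted on k8s roughly as follows. This is only a sketch under assumptions: the label selector `component=taskmanager` is a common Flink-on-k8s convention, not something confirmed in this thread, so adjust it to your deployment.

```shell
# Pick one TaskManager pod (label selector is an assumption; adjust to your deployment)
TM_POD=$(kubectl get pods -l component=taskmanager \
  -o jsonpath='{.items[0].metadata.name}')

# Once the MOR table has produced some log files but before the next
# checkpoint completes, kill the container abruptly (SIGKILL, no grace period)
kubectl delete pod "$TM_POD" --grace-period=0 --force
```

Killing the pod with `--grace-period=0 --force` ensures the TM dies without flushing state, which is the failure mode being discussed (a restart between checkpoints triggering a rollback).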
