jjtjiang commented on issue #7057:
URL: https://github.com/apache/hudi/issues/7057#issuecomment-1846470553

   @ad1happy2go
   I also face this problem.
   Version: Hudi 0.12.3
   How to reproduce the issue: just use the INSERT OVERWRITE SQL when inserting a big table.
   Here is my case:
   rows: 1,148,000,000 (with fewer rows, e.g. 1,000,000, the issue cannot be reproduced)
   DDL:
   ```sql
   create table temp_db.ods_cis_corp_history_profile_hudi_t1_20231208 (
     `_hoodie_is_deleted` BOOLEAN,
     `t_pre_combine_field` long,
     order_type int,
     order_no int,
     profile_no int,
     profile_type string,
     profile_cat string,
     u_version string,
     order_line_no int,
     profile_c string,
     profile_i int,
     profile_f decimal(20,8),
     profile_d timestamp,
     active string,
     entry_datetime timestamp,
     entry_id int,
     h_version int)
   USING hudi
   TBLPROPERTIES (
     'hoodie.write.concurrency.mode' = 'optimistic_concurrency_control',
     'hoodie.cleaner.policy.failed.writes' = 'LAZY',
     'hoodie.write.lock.provider' = 'org.apache.hudi.client.transaction.lock.FileSystemBasedLockProvider',
     'hoodie.write.lock.filesystem.expire' = 5,
     'primaryKey' = 'order_no,profile_type,profile_no,order_type,profile_cat',
     'type' = 'cow',
     'preCombineField' = 't_pre_combine_field')
   CLUSTERED BY (
     order_no, profile_type, profile_no, order_type, profile_cat)
   INTO 2 BUCKETS;
   ```
   
   SQL (the same statement was run twice):
   ```sql
   insert overwrite table temp_db.ods_cis_corp_history_profile_hudi_t1_20231208
   select
        false,
        1,
        order_type,
        order_no,
        profile_no,
        profile_type,
        profile_cat,
        u_version,
        order_line_no,
        profile_c,
        profile_i,
        profile_f,
        profile_d,
        active,
        entry_datetime,
        entry_id,
        h_version
   from
        temp_db.ods_cis_dbo_history_profile_tmp;
   ```
   `.hoodie` dir file list:
   ```
   .hoodie/.aux
   .hoodie/.heartbeat
   .hoodie/.schema
   .hoodie/.temp
   .hoodie/20231207055239027.replacecommit
   .hoodie/20231207055239027.replacecommit.inflight
   .hoodie/20231207055239027.replacecommit.requested
   .hoodie/20231207084620796.replacecommit
   .hoodie/20231207084620796.replacecommit.inflight
   .hoodie/20231207084620796.replacecommit.requested
   .hoodie/20231207100918624.rollback
   .hoodie/20231207100918624.rollback.inflight
   .hoodie/20231207100918624.rollback.requested
   .hoodie/20231207100923823.rollback
   .hoodie/20231207100923823.rollback.inflight
   .hoodie/20231207100923823.rollback.requested
   .hoodie/20231207102003686.replacecommit
   .hoodie/20231207102003686.replacecommit.inflight
   .hoodie/20231207102003686.replacecommit.requested
   .hoodie/archived
   .hoodie/hoodie.properties
   .hoodie/metadata
   ```
   
   
   As you can see, there is no `20231207071610343.replacecommit.requested` file, but the program expects to find it, so it failed. This is what puzzles me.
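   For reference, here is a small diagnostic sketch (a hypothetical helper, not part of Hudi's API) that scans a local copy of the `.hoodie` directory and reports which replacecommit instants are missing one of their three timeline states (requested / inflight / completed):

   ```python
   # Hypothetical diagnostic sketch: flag replacecommit instants whose
   # timeline files are incomplete in a local copy of the .hoodie directory.
   import os
   from collections import defaultdict

   def find_incomplete_instants(hoodie_dir):
       """Return {instant_time: [missing_states]} for replacecommit instants."""
       states = defaultdict(set)
       for name in os.listdir(hoodie_dir):
           # Timeline files look like <instant>.replacecommit[.inflight|.requested]
           if ".replacecommit" not in name:
               continue
           instant = name.split(".")[0]
           if name.endswith(".requested"):
               states[instant].add("requested")
           elif name.endswith(".inflight"):
               states[instant].add("inflight")
           else:
               states[instant].add("completed")
       expected = {"requested", "inflight", "completed"}
       return {t: sorted(expected - s) for t, s in states.items() if s != expected}
   ```

   Running it against the listing above would report only instant `20231207071610343` as incomplete, matching the missing-file error.
   
   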
   
   hoodie.properties:
   ```properties
   hoodie.table.precombine.field=t_pre_combine_field
   hoodie.datasource.write.drop.partition.columns=false
   hoodie.table.type=COPY_ON_WRITE
   hoodie.archivelog.folder=archived
   hoodie.timeline.layout.version=1
   hoodie.table.version=5
   hoodie.table.metadata.partitions=files
   hoodie.table.recordkey.fields=order_no,profile_type,profile_no,order_type,profile_cat
   hoodie.database.name=temp_db
   hoodie.datasource.write.partitionpath.urlencode=false
   hoodie.table.keygenerator.class=org.apache.hudi.keygen.NonpartitionedKeyGenerator
   hoodie.table.name=ods_cis_corp_history_profile_hudi_t1_20231207
   hoodie.datasource.write.hive_style_partitioning=true
   hoodie.table.checksum=2702244832
   hoodie.table.create.schema={"type"\:"record","name"\:"ods_cis_corp_history_profile_hudi_t1_20231207_record","namespace"\:"hoodie.ods_cis_corp_history_profile_hudi_t1_20231207","fields"\:[{"name"\:"_hoodie_commit_time","type"\:["string","null"]},{"name"\:"_hoodie_commit_seqno","type"\:["string","null"]},{"name"\:"_hoodie_record_key","type"\:["string","null"]},{"name"\:"_hoodie_partition_path","type"\:["string","null"]},{"name"\:"_hoodie_file_name","type"\:["string","null"]},{"name"\:"_hoodie_is_deleted","type"\:["boolean","null"]},{"name"\:"t_pre_combine_field","type"\:["long","null"]},{"name"\:"order_type","type"\:["int","null"]},{"name"\:"order_no","type"\:["int","null"]},{"name"\:"profile_no","type"\:["int","null"]},{"name"\:"profile_type","type"\:["string","null"]},{"name"\:"profile_cat","type"\:["string","null"]},{"name"\:"u_version","type"\:["string","null"]},{"name"\:"order_line_no","type"\:["int","null"]},{"name"\:"profile_c","type"\:["string","null"]},{"name"\:"profile_i","type"\:["int","null"]},{"name"\:"profile_f","type"\:[{"type"\:"fixed","name"\:"fixed","namespace"\:"hoodie.ods_cis_corp_history_profile_hudi_t1_20231207.ods_cis_corp_history_profile_hudi_t1_20231207_record.profile_f","size"\:9,"logicalType"\:"decimal","precision"\:20,"scale"\:8},"null"]},{"name"\:"profile_d","type"\:[{"type"\:"long","logicalType"\:"timestamp-micros"},"null"]},{"name"\:"active","type"\:["string","null"]},{"name"\:"entry_datetime","type"\:[{"type"\:"long","logicalType"\:"timestamp-micros"},"null"]},{"name"\:"entry_id","type"\:["int","null"]},{"name"\:"h_version","type"\:["int","null"]}]}
   ```
   
   More logs are in the attachment: [hudi.log](https://github.com/apache/hudi/files/13608380/hudi.log)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
