jjtjiang commented on issue #8981:
URL: https://github.com/apache/hudi/issues/8981#issuecomment-1596539766

   > @jjtjiang Did this upsert run using some older hudi version? It's a bit
   > weird that hudi would write _hoodie_record_key as key:value with the
   > latest version.
   > 
   > Did you manually delete the directory before running with the new hudi
   > version?
   > 
   > There is a high chance that this was an external table, in which case
   > dropping the table alone would not have deleted the hudi directory.
   > Please confirm whether you only dropped the table and did not clean up
   > the directory.
   
   
   @ad1happy2go Yes, I manually deleted the directory before running with the 
new hudi version. I use bulk_insert to initialize the table, then use upsert to 
sync incremental data. I checked the file creation time, and the files are 
newly generated.
   
    The getRecordKey method in 
org.apache.hudi.keygen.NonpartitionedAvroKeyGenerator:

     @Override
     public String getRecordKey(GenericRecord record) {
       // for backward compatibility, we need to use the right format according
       // to the number of record key fields
       // 1. if there is only one record key field, the format of record key is
       //    just "<value>"
       // 2. if there are multiple record key fields, the format is
       //    "<field1>:<value1>,<field2>:<value2>,..."
       if (getRecordKeyFieldNames().size() == 1) {
         return KeyGenUtils.getRecordKey(record, getRecordKeyFieldNames().get(0),
             isConsistentLogicalTimestampEnabled());
       }
       return KeyGenUtils.getRecordKey(record, getRecordKeyFieldNames(),
           isConsistentLogicalTimestampEnabled());
     }

     public String getEmptyPartition() {
       return EMPTY_PARTITION;
     }
   
   According to the comments in the getRecordKey method, writing 
_hoodie_record_key as key:value when using bulk_insert would be right, but the 
bulk_insert output actually does not contain the key; it only has the value. 
When using upsert, the format is right. 
   The upsert and bulk_insert params are the same, except for the param 
"hoodie.datasource.write.operation".
   
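   To make the two formats described in the getRecordKey comments concrete, 
here is a minimal standalone sketch. It does not call Hudi at all: the class 
RecordKeyFormatSketch, the method formatRecordKey, and the field names "id" 
and "region" are hypothetical and only mimic the formats the comments 
describe, so the real KeyGenUtils behavior may differ in details.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class RecordKeyFormatSketch {

  // Hypothetical helper, NOT Hudi code: it only mirrors the two formats
  // described in the getRecordKey comments.
  //   one key field    -> "<value>"
  //   several fields   -> "<field1>:<value1>,<field2>:<value2>,..."
  public static String formatRecordKey(Map<String, String> keyFields) {
    if (keyFields.size() == 1) {
      // Single record key field: only the value is written.
      return keyFields.values().iterator().next();
    }
    // Multiple record key fields: each value is prefixed with its field name.
    StringJoiner joiner = new StringJoiner(",");
    for (Map.Entry<String, String> entry : keyFields.entrySet()) {
      joiner.add(entry.getKey() + ":" + entry.getValue());
    }
    return joiner.toString();
  }

  public static void main(String[] args) {
    // LinkedHashMap keeps the insertion order of the key fields.
    Map<String, String> single = new LinkedHashMap<>();
    single.put("id", "42");
    System.out.println(formatRecordKey(single)); // prints 42

    Map<String, String> multi = new LinkedHashMap<>();
    multi.put("id", "42");
    multi.put("region", "eu");
    System.out.println(formatRecordKey(multi)); // prints id:42,region:eu
  }
}
```

   If both write operations went through logic like this with the same key 
fields, I would expect the same _hoodie_record_key format from both.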
   

