jjtjiang commented on issue #8981:
URL: https://github.com/apache/hudi/issues/8981#issuecomment-1596539766
> @jjtjiang Did this upsert run using some older hudi version? It's a bit
> weird that hudi would write _hoodie_record_key as key:value with the latest
> version.
>
> Did you manually delete the directory before running with the new hudi
> version?
>
> There is a high chance that this was an external table, in which case
> dropping the table alone would not have deleted the hudi directory. Please
> confirm whether you only dropped the table and did not clean up the directory.
@ad1happy2go Yes, I manually deleted the directory before running with the
new hudi version. I use bulk_insert to initialize the table, then use upsert
to sync incremental data. I have checked the file creation time, and the
files are newly generated.
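To make the setup concrete, here is a minimal sketch of two write-option sets that differ only in "hoodie.datasource.write.operation". The table name and record key fields are hypothetical placeholders, not the values from my job, and the class and method names are illustrative only; it just builds plain maps and does not call Hudi or Spark.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: builds option maps, does not invoke Hudi.
class WriteOptionsSketch {

    // Options shared by both writes. Values are hypothetical placeholders.
    public static Map<String, String> baseOptions() {
        Map<String, String> options = new HashMap<>();
        options.put("hoodie.table.name", "example_table");               // hypothetical
        options.put("hoodie.datasource.write.recordkey.field", "id,region"); // hypothetical
        return options;
    }

    // Same options, with only the write operation changed.
    public static Map<String, String> optionsFor(String operation) {
        Map<String, String> options = baseOptions();
        options.put("hoodie.datasource.write.operation", operation);
        return options;
    }

    public static void main(String[] args) {
        // Initial load, then incremental sync.
        System.out.println(optionsFor("bulk_insert"));
        System.out.println(optionsFor("upsert"));
    }
}
```

In a Spark job these maps would be passed to the DataFrame writer as options; that part is omitted here.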
The getRecordKey method in
org.apache.hudi.keygen.NonpartitionedAvroKeyGenerator:

  @Override
  public String getRecordKey(GenericRecord record) {
    // for backward compatibility, we need to use the right format according to the number of record key fields
    // 1. if there is only one record key field, the format of record key is just "<value>"
    // 2. if there are multiple record key fields, the format is "<field1>:<value1>,<field2>:<value2>,..."
    if (getRecordKeyFieldNames().size() == 1) {
      return KeyGenUtils.getRecordKey(record, getRecordKeyFieldNames().get(0), isConsistentLogicalTimestampEnabled());
    }
    return KeyGenUtils.getRecordKey(record, getRecordKeyFieldNames(), isConsistentLogicalTimestampEnabled());
  }

  public String getEmptyPartition() {
    return EMPTY_PARTITION;
  }

According to the comments in getRecordKey, _hoodie_record_key should be in
key:value form when using bulk_insert, but it actually contains only the
value, without the key. With upsert it is correct.
The upsert and bulk_insert parameters are identical except for
"hoodie.datasource.write.operation".
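
For reference, the two formats described in the comments of getRecordKey can be sketched as below. This is a standalone illustration of the format rule only, not Hudi's KeyGenUtils implementation; the class name, the formatRecordKey helper, and the field names "id" and "region" are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Illustrative sketch only: mimics the documented key formats, not Hudi code.
class RecordKeyFormatSketch {

    // One key field  -> "<value>"
    // Several fields -> "<field1>:<value1>,<field2>:<value2>,..."
    public static String formatRecordKey(Map<String, String> keyFields) {
        if (keyFields.isEmpty()) {
            throw new IllegalArgumentException("at least one record key field is required");
        }
        if (keyFields.size() == 1) {
            return keyFields.values().iterator().next();
        }
        StringJoiner joiner = new StringJoiner(",");
        for (Map.Entry<String, String> entry : keyFields.entrySet()) {
            joiner.add(entry.getKey() + ":" + entry.getValue());
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        Map<String, String> single = new LinkedHashMap<>();
        single.put("id", "42");
        System.out.println(formatRecordKey(single));    // prints 42

        Map<String, String> composite = new LinkedHashMap<>();
        composite.put("id", "42");
        composite.put("region", "eu");
        System.out.println(formatRecordKey(composite)); // prints id:42,region:eu
    }
}
```

Under this rule, a table whose files mix "42"-style and "id:42"-style keys for the same key configuration would indicate that the two write paths did not format the key the same way.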