danny0405 commented on code in PR #13650:
URL: https://github.com/apache/hudi/pull/13650#discussion_r2299553145
##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/SparkBaseIndexSupport.scala:
##########
@@ -173,18 +188,14 @@ abstract class SparkBaseIndexSupport(spark: SparkSession,
     var recordKeyQueries: List[Expression] = List.empty
     var compositeRecordKeys: List[String] = List.empty
     val recordKeyOpt = getRecordKeyConfig
-
     val isComplexRecordKey = {
+      val keyGeneratorClassName = metaClient.getTableConfig.getKeyGeneratorClassName
       val fieldCount = recordKeyOpt.map(recordKeys => recordKeys.length).getOrElse(0)
-      val encodeFieldNameConfig = metaClient.getTableConfig.getProps.getProperty(
-        org.apache.hudi.config.HoodieWriteConfig.COMPLEX_KEYGEN_ENCODE_SINGLE_RECORD_KEY_FIELD_NAME.key(),
-        org.apache.hudi.config.HoodieWriteConfig.COMPLEX_KEYGEN_ENCODE_SINGLE_RECORD_KEY_FIELD_NAME.defaultValue().toString
-      ).toBoolean
-
+      val isUsingComplexKeyGen = isComplexKeyGenerator(keyGeneratorClassName)
       // Consider as complex if:
       // 1. Multiple fields (> 1), OR
-      // 2. Single field with complex keygen encoding enabled
-      (fieldCount > 1) || (fieldCount == 1 && encodeFieldNameConfig)
+      // 2. Using complex key generator with single field
+      fieldCount > 1 || (isUsingComplexKeyGen && fieldCount == 1)
Review Comment:
> Though such a configuration setup is not recommended, it can technically happen, so we cannot check the number of partition fields

This is chaotic. I feel we should not encode the record key based on the key gen class; we should follow a uniform criterion for the encoding, one that depends only on the number of record key fields. Otherwise the behavior is hard to control: a user can always switch the key gen class, and there could still be correctness issues.
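The uniform criterion suggested above can be sketched as follows. This is a hypothetical standalone version of the check for illustration only, not Hudi's actual API; the `RecordKeyEncoding` object and its method are made up, and `recordKeyOpt` mirrors the optional array of record key fields seen in the diff:

```scala
// Hypothetical sketch: decide "complex" purely from the number of
// record key fields, independent of the configured key generator class.
object RecordKeyEncoding {
  def isComplexRecordKey(recordKeyOpt: Option[Array[String]]): Boolean = {
    val fieldCount = recordKeyOpt.map(_.length).getOrElse(0)
    // A key is complex only when it is composed of multiple fields;
    // switching the key generator class cannot change the outcome.
    fieldCount > 1
  }
}
```

With this rule, `RecordKeyEncoding.isComplexRecordKey(Some(Array("id")))` is `false` no matter which key generator is configured, which avoids the correctness hazard of users switching key gen classes.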
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]