yihua commented on code in PR #13650:
URL: https://github.com/apache/hudi/pull/13650#discussion_r2299453965
##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/SparkBaseIndexSupport.scala:
##########
@@ -173,18 +188,14 @@ abstract class SparkBaseIndexSupport(spark: SparkSession,
var recordKeyQueries: List[Expression] = List.empty
var compositeRecordKeys: List[String] = List.empty
val recordKeyOpt = getRecordKeyConfig
-
val isComplexRecordKey = {
+ val keyGeneratorClassName = metaClient.getTableConfig.getKeyGeneratorClassName
val fieldCount = recordKeyOpt.map(recordKeys => recordKeys.length).getOrElse(0)
- val encodeFieldNameConfig = metaClient.getTableConfig.getProps.getProperty(
- org.apache.hudi.config.HoodieWriteConfig.COMPLEX_KEYGEN_ENCODE_SINGLE_RECORD_KEY_FIELD_NAME.key(),
- org.apache.hudi.config.HoodieWriteConfig.COMPLEX_KEYGEN_ENCODE_SINGLE_RECORD_KEY_FIELD_NAME.defaultValue().toString
- ).toBoolean
-
+ val isUsingComplexKeyGen = isComplexKeyGenerator(keyGeneratorClassName)
// Consider as complex if:
// 1. Multiple fields (> 1), OR
- // 2. Single field with complex keygen encoding enabled
- (fieldCount > 1) || (fieldCount == 1 && encodeFieldNameConfig)
+ // 2. Using complex key generator with single field
+ fieldCount > 1 || (isUsingComplexKeyGen && fieldCount == 1)
Review Comment:
@danny0405 on the Spark/Java side, a user can specify the Complex key
generator with a single record key field and a single partition path field.
The writer still writes the data successfully, and the table ends up with
the following table configs:
```
hoodie.table.keygenerator.type=COMPLEX
hoodie.table.partition.fields=partition
hoodie.table.recordkey.fields=_row_key
```
This can happen if a user has a centralized config system that always uses
the Complex key generator. The record key is then encoded as
`_row_key:76fa9f9c-a3f5-4d8a-851b-82f07a7ffab1`.
Though such a configuration setup is not recommended, it can technically
happen, so we cannot check the number of partition fields. The correct check
is thus `(isUsingComplexKeyGen && fieldCount == 1)`.
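
To make the check concrete, here is a minimal standalone sketch of the predicate and of the single-field encoding described above. It is not the PR's code: `ComplexKeySketch`, `encodeSingleFieldKey`, and the body of `isComplexKeyGenerator` are hypothetical stand-ins, and the set of class names treated as "complex" is an assumption made for illustration only.

```scala
object ComplexKeySketch {
  // Assumption: these two class names are treated as complex key generators.
  // The helper in the PR may resolve this differently.
  private val complexKeyGenClasses: Set[String] = Set(
    "org.apache.hudi.keygen.ComplexKeyGenerator",
    "org.apache.hudi.keygen.ComplexAvroKeyGenerator"
  )

  // Hypothetical stand-in for the PR's isComplexKeyGenerator helper.
  def isComplexKeyGenerator(className: String): Boolean =
    className != null && complexKeyGenClasses.contains(className)

  // Mirrors the check in the diff: multiple record key fields, OR a single
  // record key field written with a complex key generator. The number of
  // partition fields is deliberately not consulted.
  def isComplexRecordKey(keyGenClassName: String, recordKeyFields: Seq[String]): Boolean = {
    val fieldCount = recordKeyFields.length
    fieldCount > 1 || (isComplexKeyGenerator(keyGenClassName) && fieldCount == 1)
  }

  // Illustrates the "<field>:<value>" form quoted in the comment for a single
  // record key field under the complex key generator.
  def encodeSingleFieldKey(field: String, value: String): String =
    s"$field:$value"
}
```

With a single record key field `_row_key`, `isComplexRecordKey` returns true for the complex key generator class and false for a non-complex one, so index lookups would match the encoded `_row_key:<value>` form only in the former case.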