danny0405 commented on code in PR #13650:
URL: https://github.com/apache/hudi/pull/13650#discussion_r2354559671
##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/SparkBaseIndexSupport.scala:
##########
@@ -179,9 +179,16 @@ abstract class SparkBaseIndexSupport(spark: SparkSession,
// For tables with version < 9, single record key field, and using complex key generator,
// avoid using the index due to ambiguity in record key encoding
+
+ // Test plan: create a version 8 table containing keys written with both the old and the new encoding,
+ // enable the record level index (RLI),
+ // and run a query that exercises this skipping logic.
+ // Validate the query result with data skipping enabled: before the fix the result is wrong;
+ // after the fix, no pruning is done and the correct result is returned.
+ // TestRecordLevelIndex verifies this; the record level index tests are extended to cover key encoding.
+
val tableVersion = metaClient.getTableConfig.getTableVersion
val shouldSkipIndex = tableVersion.lesserThan(HoodieTableVersion.NINE) &&
- fieldCount == 1 &&
Review Comment:
nit: or we can have an `isComplexKeyGenerator` method so we don't parse the record key field count twice.
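
The suggested refactor could look like the sketch below. The helper and its call site are illustrative only: the exact accessor names on `HoodieTableConfig` and the surrounding condition are assumptions, not the PR's actual implementation.

```scala
// Hypothetical sketch of the reviewer's suggestion: classify the key
// generator once, rather than re-deriving it from the record key fields
// at each call site. getKeyGeneratorClassName is assumed to return the
// configured key generator class name (possibly null).
private def isComplexKeyGenerator(tableConfig: HoodieTableConfig): Boolean = {
  val keyGenClass = Option(tableConfig.getKeyGeneratorClassName).getOrElse("")
  // Treat the table as using a complex key generator based on the
  // configured class, even when only one record key field is set.
  keyGenClass.endsWith("ComplexKeyGenerator")
}

// Possible call site, mirroring the condition in the diff above:
// val shouldSkipIndex = tableVersion.lesserThan(HoodieTableVersion.NINE) &&
//   isComplexKeyGenerator(metaClient.getTableConfig)
```

This keeps the version/encoding check in one place and avoids parsing the record key fields a second time.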
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]