yihua commented on code in PR #14090:
URL: https://github.com/apache/hudi/pull/14090#discussion_r2443890909
##########
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestSecondaryIndexPruning.scala:
##########
@@ -1767,6 +1767,146 @@ class TestSecondaryIndexPruning extends SparkClientFunctionalTestHarness {
)
}
+ /**
+ * Test Secondary Index with partition path update using global record index.
+ * This test validates that when a record moves from one partition (file group) to another
+ * using global index, the secondary index is correctly updated and queries work as expected.
+ *
+ * Test flow:
+ * 1. Create a table with global index enabled
+ * 2. Insert records into different partitions with a secondary index
+ * 3. Update partition path of a record (moving it from partition A to B)
+ * 4. Validate secondary index metadata is correct (no duplicates, no missing entry)
+ * 5. Validate query results using secondary index pruning
+ */
+ @ParameterizedTest
+ @CsvSource(Array("COPY_ON_WRITE,true", "COPY_ON_WRITE,false", "MERGE_ON_READ,true", "MERGE_ON_READ,false"))
+ def testSecondaryIndexWithPartitionPathUpdateUsingGlobalIndex(tableType: HoodieTableType,
Review Comment:
For a `MERGE_ON_READ` table, partition path updates add log files for the
deletes and inserts after the global index lookup, which also reads the file
groups. So it would be good to have test coverage for the `MERGE_ON_READ`
table type.
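
To make the concern concrete, here is a minimal toy model of the behavior
described above. It is only a sketch: `PartitionPathUpdateSketch`, `Table`,
`LogEntry`, `upsert`, and `secondaryIndex` are hypothetical names, not Hudi
classes or APIs, and it simplifies by logging both the delete and the insert
as log entries per partition.

```scala
// Toy model only: none of these types are Hudi classes.
object PartitionPathUpdateSketch {
  sealed trait LogEntry
  final case class Delete(recordKey: String) extends LogEntry
  final case class Insert(recordKey: String, secondaryKey: String) extends LogEntry

  // globalIndex: record key -> partition currently holding the record.
  // logs: partition -> entries appended in write order (MERGE_ON_READ style).
  final case class Table(
      globalIndex: Map[String, String],
      logs: Map[String, Vector[LogEntry]])

  private def append(t: Table, partition: String, e: LogEntry): Table =
    t.copy(logs = t.logs.updated(partition, t.logs.getOrElse(partition, Vector.empty) :+ e))

  // Upsert a record. If the global index says the record lives in a different
  // partition, log a delete there before logging the insert in the new one.
  def upsert(t: Table, recordKey: String, secondaryKey: String, partition: String): Table = {
    val afterDelete = t.globalIndex.get(recordKey) match {
      case Some(old) if old != partition => append(t, old, Delete(recordKey))
      case _                             => t
    }
    val afterInsert = append(afterDelete, partition, Insert(recordKey, secondaryKey))
    afterInsert.copy(globalIndex = afterInsert.globalIndex.updated(recordKey, partition))
  }

  // Replay each partition's log to find the live records, then derive the
  // (secondaryKey, recordKey) pairs a secondary index would be expected to hold.
  def secondaryIndex(t: Table): Set[(String, String)] =
    t.logs.values.flatMap { entries =>
      entries.foldLeft(Map.empty[String, String]) {
        case (live, Insert(k, s)) => live.updated(k, s)
        case (live, Delete(k))    => live - k
      }.toSeq.map { case (k, s) => (s, k) }
    }.toSet
}
```

In this model, moving a record from one partition to another leaves a delete
entry in the old partition's log and an insert entry in the new one, so a
reader that ignored the delete would see the secondary key twice. That is the
"no duplicates, no missing entry" condition the `MERGE_ON_READ` cases should
exercise.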