hudi-bot opened a new issue, #17367: URL: https://github.com/apache/hudi/issues/17367
While working towards making partition stats the default, we ran into an issue with the Byte data type [https://github.com/apache/hudi/pull/12671]: the min/max values obtained by merging multiple values did not align with the manually computed stats. Check column "c7" in the tests in TestColStatsIndex.

To reproduce: switch the data type of c7 to "Byte" and run TestColumnStatsIndex.testMetadataColumnStatsIndex. Comment out the following assertion in ColumnStatIndexTestBase:

```
assertEquals(asJson(sort(expectedColStatsIndexTableDf, validationSortColumns)), asJson(sort(transposedColStatsDF.drop("fileName"), validationSortColumns)))
```

Run the test for a COW table. You should then hit the issue in the validation below:

```
assertEquals(asJson(sort(manualColStatsTableDF.drop(colsToDrop: _*), pValidationSortColumns)), asJson(sort(pTransposedColStatsDF.drop(colsToDrop: _*), pValidationSortColumns)))
```

## JIRA info

- Link: https://issues.apache.org/jira/browse/HUDI-8909
- Type: Sub-task
- Parent: https://issues.apache.org/jira/browse/HUDI-9100
- Fix version(s):
  - 1.1.0
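
For reference, the failing validation compares two things that should agree: a min/max range merged from per-file stats, and a range computed manually over all the values. The following is a minimal standalone Java sketch of that invariant for byte values. It does not use Hudi code; `ByteRangeMergeSketch`, `ByteRange`, `fromValues`, and `mergeAll` are hypothetical names introduced only for illustration, and the sketch makes no claim about the root cause of the mismatch.

```java
import java.util.Arrays;
import java.util.List;

final class ByteRangeMergeSketch {

    // Hypothetical holder for the min/max of a Byte column in one file (not a Hudi class).
    static final class ByteRange {
        final byte min;
        final byte max;

        ByteRange(byte min, byte max) {
            this.min = min;
            this.max = max;
        }

        // Merge two ranges using signed byte ordering (-128..127).
        ByteRange merge(ByteRange other) {
            return new ByteRange(
                (byte) Math.min(min, other.min),
                (byte) Math.max(max, other.max));
        }
    }

    // Range computed directly from raw values: the "manually computed" side.
    static ByteRange fromValues(byte[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no values");
        }
        byte lo = values[0];
        byte hi = values[0];
        for (byte v : values) {
            if (v < lo) lo = v;
            if (v > hi) hi = v;
        }
        return new ByteRange(lo, hi);
    }

    // Fold per-file ranges into one partition-level range: the "merged" side.
    static ByteRange mergeAll(List<ByteRange> perFile) {
        if (perFile.isEmpty()) {
            throw new IllegalArgumentException("no ranges");
        }
        ByteRange acc = perFile.get(0);
        for (ByteRange r : perFile.subList(1, perFile.size())) {
            acc = acc.merge(r);
        }
        return acc;
    }

    public static void main(String[] args) {
        // Made-up sample values for two files in the same partition.
        byte[] file1 = {-128, -3, 7};
        byte[] file2 = {5, 127};

        ByteRange merged = mergeAll(Arrays.asList(fromValues(file1), fromValues(file2)));

        // Manual computation over the concatenation of both files.
        byte[] all = Arrays.copyOf(file1, file1.length + file2.length);
        System.arraycopy(file2, 0, all, file1.length, file2.length);
        ByteRange manual = fromValues(all);

        System.out.println("merged: [" + merged.min + ", " + merged.max + "]");
        System.out.println("manual: [" + manual.min + ", " + manual.max + "]");
    }
}
```

With signed ordering on both sides, the two ranges are identical. A mismatch like the one reported would mean the two code paths disagree somewhere, for example in how Byte values are represented or compared before merging; which of these applies here is not established in this issue.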
