sivabalan narayanan created HUDI-8909:
-----------------------------------------
Summary: Support/Fix Byte data type w/ partition stats
Key: HUDI-8909
URL: https://issues.apache.org/jira/browse/HUDI-8909
Project: Apache Hudi
Issue Type: Improvement
Components: metadata
Reporter: sivabalan narayanan
While working towards making partition stats the default
[https://github.com/apache/hudi/pull/12671], we ran into an issue with the
Byte data type: the min/max values produced by merging multiple values did
not match the manually computed stats. See column "c7" in the tests in
TestColumnStatsIndex.
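For reference, here is a minimal sketch of the invariant the validation relies on: merging per-file min/max for a Byte column should give the same result as computing the stats directly over all values. It uses plain Scala collections rather than Hudi or Spark. The names ByteColStats, perFile, merge and manual are hypothetical and are not Hudi APIs, and the sketch does not claim to show the root cause of the mismatch.

```scala
// Hypothetical stats record for a Byte column such as "c7" (not a Hudi class).
final case class ByteColStats(min: Byte, max: Byte, nullCount: Long)

object ByteStatsMergeSketch {
  // Stats for one file; assumes the file has at least one non-null value.
  def perFile(values: Seq[Option[Byte]]): ByteColStats = {
    val present = values.flatten
    ByteColStats(present.min, present.max, values.count(_.isEmpty).toLong)
  }

  // Merge two records: smaller min, larger max, summed null counts.
  // Values are compared as signed Bytes, with no widening or string conversion.
  def merge(a: ByteColStats, b: ByteColStats): ByteColStats =
    ByteColStats(
      min = if (a.min <= b.min) a.min else b.min,
      max = if (a.max >= b.max) a.max else b.max,
      nullCount = a.nullCount + b.nullCount
    )

  // "Manual" stats computed directly from the raw values of all files.
  def manual(files: Seq[Seq[Option[Byte]]]): ByteColStats =
    perFile(files.flatten)

  def main(args: Array[String]): Unit = {
    val files: Seq[Seq[Option[Byte]]] = Seq(
      Seq(Some((-128).toByte), Some(5.toByte), None),
      Seq(Some(127.toByte), Some((-1).toByte))
    )
    val merged = files.map(perFile).reduce(merge)
    // Merged per-file stats are expected to equal the manually computed stats.
    assert(merged == manual(files))
    println(merged)
  }
}
```

The sample values deliberately include negative Bytes and both range boundaries, since those are the cases where a comparison done on a different representation of the value could diverge from a signed Byte comparison.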
To reproduce:
Switch the data type of c7 to "Byte" and run
TestColumnStatsIndex.testMetadataColumnStatsIndex.
Comment out
```
assertEquals(asJson(sort(expectedColStatsIndexTableDf, validationSortColumns)),
  asJson(sort(transposedColStatsDF.drop("fileName"), validationSortColumns)))
```
in ColumnStatIndexTestBase.
Run the test for a COW table. The issue shows up in the validation below:
```
assertEquals(asJson(sort(manualColStatsTableDF.drop(colsToDrop: _*), pValidationSortColumns)),
  asJson(sort(pTransposedColStatsDF.drop(colsToDrop: _*), pValidationSortColumns)))
```