Happy new year to the Paimon user group!

We are in the process of extending Apache XTable (
https://github.com/apache/incubator-xtable) to support Paimon as a source
table format.

I have been testing a draft PR (
https://github.com/apache/incubator-xtable/pull/767/files) for parsing
Paimon column stats. I'd like to validate a couple of assumptions with
folks in the community who know this area better:

- RE: `DataFileMeta.keyStats()` vs `DataFileMeta.valueStats()`: in our
testing, column stats for primary keys are always present in
`DataFileMeta.valueStats()`. Is it safe to ignore `DataFileMeta.keyStats()`
in all scenarios, and assume `.valueStats()` is always complete?
https://github.com/apache/incubator-xtable/pull/767/files#r2670898850

- RE: `DataFileMeta.valueStatsCols()`: this field is sometimes null (e.g.
when stats are being collected on all columns). Is it safe to assume that
when `.valueStatsCols()` is null, `.valueStats()` should be interpreted
with its columns in the same order as the schema?
https://github.com/apache/incubator-xtable/pull/767/files#r2670899086
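
To make the second question concrete, here is a minimal sketch of the
interpretation I am asking about. It uses plain Java lists as stand-ins
rather than the real Paimon classes, and the class `ValueStatsColumns` and
its helpers `resolveStatsColumns` and `byColumn` are hypothetical names for
illustration only. The null-means-schema-order rule is exactly the
assumption I would like confirmed, not established behavior.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ValueStatsColumns {

    // Hypothetical helper: picks the column names that the stats values line up with.
    // ASSUMPTION under discussion: a null valueStatsCols means
    // "stats cover all schema columns, in schema order".
    static List<String> resolveStatsColumns(List<String> schemaColumns,
                                            List<String> valueStatsCols) {
        return valueStatsCols == null ? schemaColumns : valueStatsCols;
    }

    // Pairs each resolved column name with the stat value at the same position.
    static Map<String, Object> byColumn(List<String> columns, List<Object> statValues) {
        if (columns.size() != statValues.size()) {
            throw new IllegalArgumentException("column count and stats arity differ");
        }
        Map<String, Object> result = new LinkedHashMap<>();
        for (int i = 0; i < columns.size(); i++) {
            result.put(columns.get(i), statValues.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> schema = Arrays.asList("id", "name", "ts");

        // Case 1: valueStatsCols is null, so min values are read in schema order.
        List<Object> minAll = Arrays.<Object>asList(1, "a", 100L);
        System.out.println(byColumn(resolveStatsColumns(schema, null), minAll));

        // Case 2: only a subset of columns carries stats.
        List<Object> minSubset = Arrays.<Object>asList(1, 100L);
        System.out.println(
            byColumn(resolveStatsColumns(schema, Arrays.asList("id", "ts")), minSubset));
    }
}
```

The arity check is there on purpose: if the assumption is wrong for some
files, I would rather fail loudly than attach stats to the wrong columns.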

We also have some other items for discussion; I will start separate threads
for them!

Many thanks
Mao
