Thank you, Jingsong! When value stats are empty, should I parse key stats instead? Or, to put the question differently: would value stats always cover a superset of the columns covered by key stats?
Empirically I’ve found that column stats always contained all the columns I expect, but I’m unsure if there are edge cases where they would contain stats for different sets of columns.

Thanks
Mao

On Thu, 8 Jan 2026 at 21:54, Jingsong Li <[email protected]> wrote:

> Hi Mao,
>
> Value stats can be empty, it depends on value stats cols (if this is
> null, all columns will be recorded in value stats).
>
> Best,
> Jingsong
>
> On Thu, Jan 8, 2026 at 2:43 PM Mao Liu <[email protected]> wrote:
> >
> > Happy new year to the Paimon user group!
> >
> > We are in the process of extending Apache Xtable (https://github.com/apache/incubator-xtable) to support Paimon as a source table format.
> >
> > I have been testing a draft PR (https://github.com/apache/incubator-xtable/pull/767/files) for parsing Paimon column stats. I'd like to validate a couple of assumptions with folks with more knowledge in the community:
> >
> > - RE: DataFileMeta.keyStats() vs .valueStats, in our testing I have noticed that column stats for primary keys are always present in `DataFileMeta.valueStats()`. Is it safe to ignore `DataFileMeta.keyStats()` in all scenarios, and assume `.valueStats()` is always complete?
> > https://github.com/apache/incubator-xtable/pull/767/files#r2670898850
> >
> > - RE: DataFileMeta.valueStatsCols(), I have noticed that sometimes this field is null (e.g. when stats are being collected on all columns). Is it safe to assume when .valueStatsCols() is null, we should interpret .valueStats() with columns in the same order as the schema?
> > https://github.com/apache/incubator-xtable/pull/767/files#r2670899086
> >
> > We also have some other items for discussion, I shall make separate threads for them!
> >
> > Many thanks
> > Mao
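
For illustration only, the interpretation discussed in this thread could be sketched roughly as below. This is not Paimon's API: the function names and the plain Python lists standing in for the value stats cols and the min/max/null-count arrays are hypothetical. The sketch assumes that a null value stats cols means stats cover every schema column in schema order (the ordering is the assumption raised in the thread, and the reply does not explicitly confirm it), and that an empty value stats cols yields empty value stats.

```python
from typing import Any, Dict, Optional, Sequence, Tuple


def resolve_stats_columns(
    schema_fields: Sequence[str],
    value_stats_cols: Optional[Sequence[str]],
) -> list:
    """Return the column names that the value stats arrays refer to.

    Assumption: None (null) means "all schema columns, in schema order".
    An explicit list, possibly empty, is used as given.
    """
    if value_stats_cols is None:
        return list(schema_fields)
    return list(value_stats_cols)


def map_value_stats(
    schema_fields: Sequence[str],
    value_stats_cols: Optional[Sequence[str]],
    mins: Sequence[Any],
    maxs: Sequence[Any],
    null_counts: Sequence[int],
) -> Dict[str, Tuple[Any, Any, int]]:
    """Pair each stats column with its (min, max, null_count) entry."""
    cols = resolve_stats_columns(schema_fields, value_stats_cols)
    # Fail loudly rather than silently misalign stats with columns.
    if not (len(cols) == len(mins) == len(maxs) == len(null_counts)):
        raise ValueError("stats arrays do not match the resolved column list")
    return {
        col: (mn, mx, nc)
        for col, mn, mx, nc in zip(cols, mins, maxs, null_counts)
    }
```

With this shape, a file whose value stats cols is an empty list simply produces an empty mapping, so a reader has to decide separately whether to consult key stats in that case. The length check is a deliberate choice: if the ordering assumption is wrong for some file, a mismatch in array lengths surfaces as an error instead of as wrong statistics.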
