Thank you, Jingsong!

When value stats are empty, should I fall back to parsing key stats?
Or, to put the question differently: do value stats always cover a
superset of the columns covered by key stats?

Empirically, I’ve found that value stats have always contained all the
columns I expect, but I’m unsure whether there are edge cases where key
stats and value stats cover different sets of columns.
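
To make the question concrete, the interpretation discussed in this thread can be sketched as below. This is only a minimal sketch under one assumption: that a null value-stats-cols list means the stats entries line up with the schema columns, in schema order. The class and method names (`StatsColumnResolver`, `resolveColumns`, `byColumn`) are hypothetical helpers for illustration, not Paimon or XTable APIs, and plain lists stand in for the real stats types.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Paimon or XTable.
public class StatsColumnResolver {

    // Returns the column names that the positional stats entries refer to.
    // ASSUMPTION: a null statsCols means stats were recorded for every
    // schema column, in schema order.
    public static List<String> resolveColumns(List<String> schemaColumns,
                                              List<String> statsCols) {
        return statsCols == null ? schemaColumns : statsCols;
    }

    // Pairs each resolved column name with the stats value at the same position.
    public static <T> Map<String, T> byColumn(List<String> schemaColumns,
                                              List<String> statsCols,
                                              List<T> statsValues) {
        List<String> columns = resolveColumns(schemaColumns, statsCols);
        if (columns.size() != statsValues.size()) {
            // A mismatch means the positional assumption does not hold.
            throw new IllegalArgumentException(
                "Expected " + columns.size() + " stats entries, got " + statsValues.size());
        }
        Map<String, T> result = new LinkedHashMap<>();
        for (int i = 0; i < columns.size(); i++) {
            result.put(columns.get(i), statsValues.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> schema = List.of("id", "name", "ts");
        // statsCols is null: stats are read in schema order.
        System.out.println(byColumn(schema, null, List.of(1, 2, 3)));
        // statsCols is set: only the listed columns carry stats.
        System.out.println(byColumn(schema, List.of("id", "ts"), List.of(1, 3)));
    }
}
```

The size check is deliberate: if the number of stats entries ever differs from the number of resolved columns, failing loudly seems safer than silently attaching stats to the wrong column.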

Thanks
Mao

On Thu, 8 Jan 2026 at 21:54, Jingsong Li <[email protected]> wrote:

> Hi Mao,
>
> Value stats can be empty; this depends on value stats cols (if it is
> null, all columns are recorded in value stats).
>
> Best,
> Jingsong
>
> On Thu, Jan 8, 2026 at 2:43 PM Mao Liu <[email protected]> wrote:
> >
> > Happy new year to the Paimon user group!
> >
> > We are in the process of extending Apache Xtable
> > (https://github.com/apache/incubator-xtable) to support Paimon as a
> > source table format.
> >
> > I have been testing a draft PR
> > (https://github.com/apache/incubator-xtable/pull/767/files) for parsing
> > Paimon column stats. I'd like to validate a couple of assumptions with
> > folks with more knowledge in the community:
> >
> > - RE: DataFileMeta.keyStats() vs .valueStats(): in our testing I have
> > noticed that column stats for primary keys are always present in
> > `DataFileMeta.valueStats()`. Is it safe to ignore `DataFileMeta.keyStats()`
> > in all scenarios, and assume `.valueStats()` is always complete?
> > https://github.com/apache/incubator-xtable/pull/767/files#r2670898850
> >
> > - RE: DataFileMeta.valueStatsCols(): I have noticed that this field is
> > sometimes null (e.g. when stats are being collected on all columns). Is it
> > safe to assume that when .valueStatsCols() is null, we should interpret
> > .valueStats() with columns in the same order as the schema?
> > https://github.com/apache/incubator-xtable/pull/767/files#r2670899086
> >
> > We also have some other items for discussion; I will start separate
> > threads for them!
> >
> > Many thanks
> > Mao
>
