Re: Import Statistics in postgres_fdw before resorting to sampling.

Ashutosh Bapat Fri, 23 Jan 2026 07:53:38 -0800

On Fri, Jan 23, 2026 at 3:50 AM Corey Huinker <[email protected]> wrote:
>
> On Thu, Jan 22, 2026 at 5:16 AM Ashutosh Bapat <[email protected]> 
> wrote:
>>
>> On Thu, Jan 22, 2026 at 2:21 AM Corey Huinker <[email protected]> 
>> wrote:
>> >>
>> >> Changes in this release, aside from rebasing:
>> >>
>> >> - The generic analyze and fdw.h changes are in their own patch (0001) 
>> >> that ignores contrib/postgres_fdw entirely.
>> >> - The option for remote_analyze has been moved to its own patch (0003).
>> >> - The errors raised are now warnings, to ensure that we can always fall 
>> >> back to row sampling.
>> >> - All local attributes with attstatarget > 0 must get matching remote 
>> >> statistics or the import is considered a failure.
>> >> - The pg_restore_attribute_stats() call has been turned into a prepared 
>> >> statement, for clarity and some minor parsing savings.
>> >> - The calls to pg_restore_relation_stats() are parameterized, but not 
>> >> prepared as this is rarely called more than once.
>> >> - postgresStatisticsAreImportable will now disqualify a table if has 
>> >> extended statistics objects, because we can't compute those without a row 
>> >> sample.
>> >
>>
>> Thanks Corey for breaking down these patches. It makes reviewing easier.
>>
>> analyze_rel() and acquire_inherited_sample_rows() both call
>> fdwroutine->AnalyzeForeignTable() but only the first one uses the
>> statistics import facility. Is that intentional? Typical use case of
>> sharding will create a partitioned table with foreign tables as
>> partitions. The partitions will be analyzed by the second function.
>> Thus a big use case of postgres_fdw won't be able to use the import
>> statistics facility. That seems like a major drawback of this patch.
>> Thinking more about it, acquire_inherited_sample_rows() accumulates
>> the sample rows from the child tables and extracts statistics from
>> those rows and then updates corresponding pg_statistics rows. Doing
>> that through import statistics seems a bit tricky since we need to be
>> able to combine statistics from multiple relations. Can we do that?
>
>
> We can't synthesize sample rows from imported statistics, no.
>
>>
>> There's an advantage if we can combine stats across multiple relations
>> - we don't have to sample children twice when analyzing the parent
>> without ONLY. Instead we could produce parent statistics by combining
>> statistics across children and the parent. To me this looks like
>> altogether a different beast just like partial aggregates.
>
>
> I think this patch is only ever going to get us out of 1 of the 2 samples, 
> which isn't ideal but it is a savings.
>


I am not suggesting to synthesize sample rows. Calculate the
statistics of the parent table from that of its children.

>>
>>
>> It will be good to fix this drawback. If not, at least we should
>> figure out (plan/POC) how to deal with the child tables? We need to at
>> least document this drawback - the documentation in the current patch
>> reads as if all foreign tables will use this facility when available.
>
>
>  Yes, we will have to note the limitation. I have made that note, as well as 
> the documentation fix attached.

The note just mentions partition table but the limitation applies to
any foreign child table.

-- 
Best Wishes,
Ashutosh Bapat

Re: Import Statistics in postgres_fdw before resorting to sampling.

Reply via email to