Hi,

On Tue, Jun 16, 2026 at 7:21 AM Ethan Mertz <[email protected]> wrote:
>
> Several PostgreSQL logical replication consumers outside of the publication
> subscription replication framework require REPLICA IDENTITY FULL for
> their features. Please see a couple examples I researched briefly, but there
> are probably more that I'm not aware of [1] [2]. I have also found that some
> users utilize REPLICA IDENTITY FULL to create auditing solutions.

I'm convinced that there are legitimate uses for capturing the
before-image and/or the after-image of rows.

> > Also, I think the heuristic could go beyond just unique vs. non-unique
> > index preference. Factors like index bloat and index size could also
> > help make a better choice. For example, say there are two unique
> > indexes on a table - one on a few text columns and another on a bigint
> > column. Due to non-HOT updates or updates to the indexed columns
> > (which can happen both on the publisher and the subscriber), bloat on
> > the text index grows faster compared to the bigint index. On the
> > subscriber, if the index with more bloat is chosen, lookups become
> > slower due to more pages to traverse - even though both are unique
> > indexes that the heuristic would treat equally. Of course, fully
> > accounting for all these factors may amount to invoking the planner -
> > but even a simple check on index size (relpages) between
> > otherwise-equal candidates could help.
>
> I'm wary of making this logic more intelligent than it already is. My intention
> was to focus on the most dramatic improvements in replication speed, but still
> keep the behavior that users must configure more fine-grained control as needed
> with REPLICA IDENTITY INDEX or by adding a primary key.
>
> I agree with your point. In my opinion, a better path forward would be a new
> feature allowing logical apply workers to optionally invoke the planner,
> configured at the subscription level, rather than building increasingly
> comprehensive heuristics outside of planning.

I'm not sure we need another option for this. My concern is that just
choosing a unique index over a non-unique one can lead to suboptimal
apply performance if the unique index has more bloat.

Do we know the cost of invoking the planner for index selection on the
subscriber? AFAICS, the selection happens on the first apply or
whenever the relcache entry gets invalidated - so it's infrequent [1].
If that's the case, why not invoke the planner to make a
better-informed decision? It would account for bloat, size,
selectivity, etc. I think it would be worth measuring the performance
impact before proceeding with just the unique vs. non-unique approach.
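
For comparison, the cheaper option mentioned upthread (a relpages check
between otherwise-equal candidates) might look roughly like the sketch
below. CandidateIndex and choose_candidate() are made-up names for
illustration only, not existing PostgreSQL code; a real patch would work
on the actual index relations instead.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of a candidate index; not a PostgreSQL struct. */
typedef struct CandidateIndex
{
	const char *name;
	bool		is_unique;
	unsigned	relpages;	/* page count, as in pg_class.relpages */
} CandidateIndex;

/*
 * Prefer unique over non-unique; between candidates of the same kind,
 * prefer the one with fewer pages.  Returns NULL if n == 0.
 */
static const CandidateIndex *
choose_candidate(const CandidateIndex *c, size_t n)
{
	const CandidateIndex *best = NULL;

	for (size_t i = 0; i < n; i++)
	{
		if (best == NULL ||
			(c[i].is_unique && !best->is_unique) ||
			(c[i].is_unique == best->is_unique &&
			 c[i].relpages < best->relpages))
			best = &c[i];
	}
	return best;
}
```

Note that this only breaks ties within the same kind of index, so it
would still pick a heavily bloated unique index over a small non-unique
one, which is the case I'm worried about above.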

[1]
/*
 * Finding a usable index is an infrequent task. It occurs when an
 * operation is first performed on the relation, or after invalidation
 * of the relation cache entry (such as ANALYZE or CREATE/DROP index
 * on the relation).
 */

-- 
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com