Re: [PATCH] Improving index selection for logical replication apply with replica identity full

Ethan Mertz Mon, 29 Jun 2026 06:52:17 -0700

Hi,

> I want to make sure we don't regress in the case where the unique
> index has more bloat than the non-unique index. I experimented with a
> small dataset where the unique index grew to 4.5 GB due to bloat and
> observed that the patched version was about 20% slower (a couple of seconds at
> this scale, but in production with larger tables, more concurrent
> activity, more bloat and limited memory, I would expect the gap to be
> wider). Would you be able to provide some additional data points on
> this before we proceed further? That would help confirm the heuristic
> holds up across different bloat conditions.


I agree that choosing a bloated unique index could lead to worse
performance than a non-bloated non-unique index in certain cases.
However, I think there are many cases where the larger, more bloated
index would perform significantly better than a less bloated index with
worse selectivity. Even without exact numbers, it should be clear that
the example from my original email is one of those cases. Choosing an
index based on size alone, without taking other statistics into
account, would likely lead to many wrong decisions.
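
To make the intuition concrete, here is a rough back-of-the-envelope
sketch. It is not PostgreSQL's cost model: the function names, the
fanout of 256, and every page and row count below are hypothetical
values chosen only for illustration. It assumes a lookup costs roughly
one page per B-tree level plus one heap fetch per candidate tuple
returned by the index.

```python
import math


def btree_height(leaf_pages, fanout=256):
    """Approximate number of B-tree levels, leaf level included.

    Assumption: every internal page holds `fanout` downlinks.
    """
    return max(1, math.ceil(math.log(max(leaf_pages, 2), fanout)) + 1)


def lookup_cost(leaf_pages, matches_per_key):
    """Toy per-row apply cost: index descent plus one heap fetch per
    candidate tuple that must be compared against the full old row."""
    return btree_height(leaf_pages) + matches_per_key


# Hypothetical inputs: a heavily bloated unique index versus a compact
# non-unique index where each key matches about 1000 rows.
bloated_unique = lookup_cost(leaf_pages=500_000, matches_per_key=1)
compact_nonunique = lookup_cost(leaf_pages=50_000, matches_per_key=1000)

print("bloated unique index:     ", bloated_unique)
print("compact non-unique index: ", compact_nonunique)
```

Under these assumptions, a tenfold increase in index size adds about
one level to the descent, while poor selectivity multiplies the number
of heap tuples that have to be fetched and compared for every applied
change. The sketch ignores caching and I/O effects, which is where
bloat would hurt most in practice.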

Moreover, I would argue that even if the choice of a unique index led to
worse performance, it should not be considered a regression. Today,
index selection is essentially random, so there is no guarantee about
which index is chosen. I'd argue that a savvy user must assume that the
worst index is chosen when reasoning about performance. In addition,
given that this patch will likely only be applied in a new major
version, any stability in the index ordering used for the selection
would be lost during the dump and restore anyway.

I'd also reiterate that this is a small, incremental improvement that I
found would help in a few situations I have observed in user workloads.
I don't think it excludes future optimizations that include more
factors such as size/bloat, but those must be considered in combination
with other statistics. I'd be interested in looking into and helping
with the development of those features in the future.

Best,
Ethan
SDE, Amazon Web Services
