Hello,

I have a Postgres database where many similar expression-based indexes have
been added by a script. It is quite likely that some of the indexes are not
actually used, and I would like to remove these unused indexes.

I can identify indexes that are used directly during query execution using
pg_stat_user_indexes. However, it is possible that the optimizer might be
using the statistics on the indexed expressions to guide query planning. If
that is the case then I might remove an index which appears to be unused,
only to then find that this results in the optimizer choosing poorer plans
as it no longer has the additional statistics on the indexed expressions.

How can I identify indexes which are used by the optimizer when planning?
Are there any relevant internal counters similar to pg_stat_user_indexes?
Or an extension which provides this information?

If there is nothing available inside the database, I was thinking I would
have to parse all of the queries executed on the database, extract the
expressions, work out which columns of which tables are referenced by those
expressions, and then check whether those expressions are indexed. Or
(since that seems like a lot of work), I could remove all indexes which are
reported as unused by pg_stat_user_indexes, but for each and every one of
them create extended statistics on the relevant expression(s) using CREATE
STATISTICS. That might result in a lot of unnecessary expression statistics
- are there any significant costs associated with that? The docs say
"providing benefits similar to an expression index without the overhead of
index maintenance" [1], and don't mention any downside.

Lastly, please could someone point me at the part of the Postgres source
code that handles index expressions in the optimizer?

Thank you
Tim

[1] https://www.postgresql.org/docs/14/sql-createstatistics.html

Reply via email to