Hi Tom,

On Wed, Jul 27, 2022 at 6:39 PM Tom Lane <t...@sss.pgh.pa.us> wrote:

> David Geier <geidav...@gmail.com> writes:
> > We tracked down the root cause of this slowdown to lock contention in
> > 'get_relation_info()'. The index lock of every single index of every
> single
> > table used in that query is acquired. We attempted a fix by pre-filtering
> > out all indexes that anyways cannot be used with a certain query, without
> > taking the index locks (credits to Luc Vlaming for idea and
> > implementation). The patch does so by caching the columns present in
> every
> > index, inside 'struct Relation', similarly to 'rd_indexlist'.
>
> I wonder how much thought you gave to the costs imposed by that extra
> cache space.  We have a lot of users who moan about relcache bloat
> already.


The current representation could be compacted (e.g. by storing the column
indexes as 16-bit integers, instead of using a Bitmapset). That should make
the additional space needed neglectable compared to the current size of
RelationData.  On top of that we could maybe reorder the members of
RelationData to save padding bytes. Currently, there is lots of
interleaving of members of different sizes. Even when taking cache locality
into consideration it looks like a fair amount could be saved, probably
accounting for the additional space needed to store the index column data.

  But more to the point, I do not buy the assumption that
> an index's set of columns is a good filter for which indexes are of
> interest.  A trivial counterexample from the regression database is
>
> regression=# explain select count(*) from tenk1;
>                                          QUERY PLAN
>
>
>
> --------------------------------------------------------------------------------
> ------------
>  Aggregate  (cost=219.28..219.29 rows=1 width=8)
>    ->  Index Only Scan using tenk1_hundred on tenk1  (cost=0.29..194.28
> rows=100
> 00 width=0)
> (2 rows)
>
> Only for queries without index conditions, the current version of the
patch simply discards all indexes but the one with the least columns. This
is case (3) in s64_IsUnnecessaryIndex(). This is an over-simplification
with the goal of picking the index which yields least I/O. For single
column indexes that works, but it can fall short for multi-column indexes
(e.g. [INT, TEXT] index vs [INT, INT]t index have both 2 columns but the
latter would be better suited when there's no other index and we want to
read the first integer column). What we should do here instead is to
discard indexes based on storage size.


> It looks to me like the patch also makes unwarranted assumptions about
> being able to discard all but the smallest index having a given set
> of columns.  This would, for example, possibly lead to dropping the
> index that has the most useful sort order, or that has the operator
> class needed to support a specific WHERE clause.t
>
Why would that be? If we keep all indexes that contain columns that are
used by the query, we also keep the indexes which provide a certain sort
order or operator class.

>
> In short, I'm not sure I buy this concept at all.  I think it might
> be more useful to attack the locking overhead more directly.  I kind
> of wonder why we need per-index locks at all during planning ---
> I think that we already interpret AccessShareLock on the parent table
> as being sufficient to block schema changes on existing indexes.
>
> As I said in the reply to your other mail, there's huge savings also in
the serial case where lock contention is not an issue. It seems like
pre-filtering saves work down the road. I'll do some perf runs to track
down where exactly the savings come from. One source I can think of is only
having to consider a subset of all indexes during path creation.


> Unfortunately, as things stand today, the planner needs more than the
> right to look at the indexes' schemas, because it makes physical accesses
> to btree indexes to find out their tree height (and I think there are some
> comparable behaviors in other AMs).  I've never particularly cared for
> that implementation, and would be glad to rip out that behavior if we can
> find another way.  Maybe we could persuade VACUUM or ANALYZE to store that
> info in the index's pg_index row, or some such, and then the planner
> could use it with no lock?
>
> That's another interesting approach, but I would love to save the planner
cycles for the sequential case.

--
David Geier
(ServiceNow)

Reply via email to