On Fri, Apr 30, 2021 at 6:19 PM Peter Geoghegan <p...@bowt.ie> wrote:
> A remaining problem is that we must generate a new round of index
> tuples for each and every index when only one indexed column is
> logically modified by an UPDATE statement. I think that this is much
> less of a problem now due to bottom-up index deletion. Sure, it sucks
> that we still have to dirty the page at all. But it's nevertheless
> true that it all but eliminates version-driven page splits, which are
> where almost all of the remaining downside is. It's very reasonable to
> now wonder if this particular all-indexes problem is worth solving at
> all in light of that. (Modern hardware characteristics also make a
> comprehensive fix less valuable in practice.)

It's reasonable to wonder. I think it depends on whether the problem is bloat or just general slowness. To the extent that the problem is bloat, bottom-up index deletion will help a lot, but it's not going to help with slowness because, as you say, we still have to dirty the pages. And I am pretty confident that slowness is a very significant part of the problem here.

It's pretty common for people migrating from another database system to have, for example, a table with 10 indexes and then repeatedly update a column that is covered by only one of those indexes. Now, with bottom-up index deletion, this should cause a lot less bloat, and that's good. But you still have to update all 10 indexes in the foreground, and that's bad, because the alternative is to find just the one affected index and update it twice -- once to insert the new tuple, and a second time to delete-mark the old tuple. 10 is a lot more than 2, and that's even ignoring the cost of deferred cleanup on the other 9 indexes.

So I don't really expect this to get us out of the woods. Somebody whose workload runs five times slower on a pristine data load is quite likely to give up on using PostgreSQL before bloat even enters the picture.

-- 
Robert Haas
EDB: http://www.enterprisedb.com
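
To make the 10-versus-2 arithmetic above concrete, here is a rough Python sketch that counts only foreground index modifications per UPDATE under the two schemes being compared. The function name and the scheme labels are hypothetical, invented for illustration; this is not PostgreSQL code. It deliberately ignores heap writes, WAL volume, and deferred cleanup, so the ratio is a back-of-the-envelope figure, not a benchmark.

```python
def index_writes_per_update(total_indexes: int, affected_indexes: int, scheme: str) -> int:
    """Count foreground index modifications for one UPDATE (illustrative only)."""
    if scheme == "all_indexes":
        # Behaviour described in the mail: a new index tuple goes into
        # every index, even those whose columns did not change.
        return total_indexes
    if scheme == "affected_only":
        # Alternative described in the mail: for each affected index,
        # insert the new tuple and delete-mark the old one.
        return 2 * affected_indexes
    raise ValueError(f"unknown scheme: {scheme}")


# The example from the mail: 10 indexes, 1 of which covers the updated column.
current = index_writes_per_update(10, 1, "all_indexes")
alternative = index_writes_per_update(10, 1, "affected_only")
ratio = current / alternative
print(current, alternative, ratio)
```

Under those assumptions the counts come out as 10 and 2, a factor of five, which lines up with the "five times slower" figure, though real-world slowdown would depend on everything this sketch leaves out.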