On Fri, Jun 21, 2024 at 11:31:21AM +0200, Michail Nikolaev wrote:
> Yes, I also have tried that approach, but it doesn't work, unfortunately.
> You may fail test increasing number of connections:
>
> '--no-vacuum --client=10 -j 2 --transactions=1000',
>
> The source of the issue is not the swap of the indexes (and not related to
> REINDEX CONCURRENTLY only), but the fact that indexes are fetched once
> during planning (to find the arbiter), but then later reread with a new
> catalog snapshot for the the actual execution.
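
For readers trying to follow along, here is a rough sketch of the kind
of workload involved; the table and index names below are made up for
illustration, the actual reproducer being the pgbench run quoted above:

    -- One unique index that ON CONFLICT can pick as arbiter.
    CREATE TABLE upsert_tab (id int, val text);
    CREATE UNIQUE INDEX upsert_tab_id_idx ON upsert_tab (id);

    -- Run from many clients at once; the arbiter index is chosen at
    -- planning time, then the relation's index list is read again at
    -- execution time.
    INSERT INTO upsert_tab (id, val) VALUES (1, 'x')
      ON CONFLICT (id) DO UPDATE SET val = EXCLUDED.val;

    -- Meanwhile, in another session, rebuild the index in a loop.
    REINDEX INDEX CONCURRENTLY upsert_tab_id_idx;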
When I first saw this report, my main worry was that I had somehow
managed to break the state of the indexes and cause data corruption
because of an incorrect step in the concurrent operations.  However,
as far as I can see this is not the case, thanks to two properties we
rely on for concurrent index operations, which hold in the planner and
the executor.  Simply put:
- The planner ignores indexes with !indisvalid.
- The executor ignores indexes with !indislive.  The critical point is
that we wait in DROP INDEX CONCURRENTLY and REINDEX CONCURRENTLY for
any transactions still using an index before it is marked as
!indislive, because such indexes *must* not be used in the executor.

> So, other possible fixes I see:
> * fallback to replanning in case we see something changed during the
> execution
> * select arbiter indexes during actual execution

These two properties make ON CONFLICT react the way it should
depending on the state of the indexes selected by the planner based on
the query clauses, with changes reflected at execution time.  Two
patterns are involved:
- An index may be created by a concurrent operation after the planner
has selected the arbiter indexes (the index may be defined but not
valid yet, or created just after), in which case the query execution
has to handle the extra index that has become available at execution
time, with a failure on a ccnew index.
- An index may be selected at planning time, then a different index
could be used to enforce the constraint once both indexes swap, with a
failure on a ccold index.

As far as I can see, it depends on what kind of query semantics and
how much transparency you are looking for in your application.  An
error in the query itself can also be seen as useful, so that your
application is aware of what happens as an effect of the concurrent
index build (reindex or CIC/DIC), and it is not really clear to me why
silently falling back to a re-selection of the arbiter indexes would
always be better.  Replanning could actually be dangerous with a
workload that is heavily REINDEX'd concurrently, as we could fall into
a trap where a query can never decide which index to use.  I'm not
saying that it cannot be improved, but it's not completely clear to me
which query semantics are best for all users, because the behavior of
HEAD and your suggestions each have merits and demerits.  Anything we
could come up with would be an improvement anyway, AFAIU.

>> That's a HEAD-only thing IMO,
>> though.
>
> Do you mean that it needs to be moved to a separate patch?

It should, but I'm wondering whether that's necessary at all, for two
reasons.  First, a check on indisvalid would be incorrect, because an
index marked as !indisvalid && indislive means that a concurrent
operation is happening, and that this concurrent operation is waiting
for all transactions working with a lock on this index to finish
before flipping the live flag and making the index ineligible for
decisions taken in the executor, like HOT updates, etc.

Second, a check on indislive may be an idea, still I have doubts about
its additional value, because any indexes opened for a relation are
fetched from the relcache with RelationGetIndexList(), which already
filters out !indislive indexes, and we rely on that in the executor
for the indexes opened by a relation.
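
For what it's worth, the transient states mentioned above are easy to
observe from a separate session while a concurrent build or rebuild is
running, with something like the following (using the made-up table
from the earlier sketch):

    -- Validity/liveness flags of all indexes on the relation; the
    -- transient "_ccnew" entry (and "_ccold" around the swap) shows up
    -- as !indisvalid and/or !indislive depending on the phase.
    SELECT c.relname, i.indisvalid, i.indislive
      FROM pg_index i
      JOIN pg_class c ON c.oid = i.indexrelid
      WHERE i.indrelid = 'upsert_tab'::regclass;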
--
Michael