On Mon, Sep 07, 2020 at 09:39:16PM -0500, Justin Pryzby wrote: > Also, my previous revision failed to implement your suggestion to first build > catalog entries with INVALID indexes and to then reindex them. Fixed.
- childStmt->oldCreateSubid = InvalidSubTransactionId; - childStmt->oldFirstRelfilenodeSubid = childStmt->InvalidSubTransactionId; + // childStmt->oldCreateSubid = childStmt->InvalidSubTransactionId; + // childStmt->oldFirstRelfilenodeSubid = InvalidSubTransactionId; In the CIC patch, what is that about? It is hard to follow what you are trying to achieve here. + index_set_state_flags(indexRelationId, INDEX_CREATE_SET_READY); + CommandCounterIncrement(); + index_set_state_flags(indexRelationId, INDEX_CREATE_SET_VALID); Anyway, this part of the logic is still not good here. If we fail in the middle here, we may run into problems with a single partition index that has only a portion of its flags set. As this gets called for each partitioned index, it also means that we could still finish in a state where we have only a partially-valid partition tree. For example, imagine a partition tree with 2 levels (as reported by pg_partition_tree), then the following happens if an index is created concurrently from the partitioned table of level 0: 1) indexes are created in level 2 first 2) partitioned table of level 1 is switched to be ready and valid 3) indexes of level 1 are created. 4) partitioned table of level 0 is switched to be ready and valid If the command has a failure, say between 2 and 3, we would have as result a command that has partially succeeded in creating a partition tree, while the user was looking for an index for the full tree. This comes back to my previous points, where we should make index_set_state_flags() first, and I have begun a separate thread about that: https://commitfest.postgresql.org/30/2725/ Second, it also means that the patch should really switch all the indexes to be valid in one single transaction, and that we also need more careful refactoring of DefineIndex(). I also had a quick look at the patch for CLUSTER, and it does a much better job, still it has issues. + MemoryContext old_context = MemoryContextSwitchTo(cluster_context); + + inhoids = find_all_inheritors(indexOid, NoLock, NULL); + foreach(lc, inhoids) The area where a private memory context is used should be minimized. In this case, you just need to hold the context while saving the relation and clustered index information in the list of RelToCluster items. As a whole, this case is more simple than CIC, so I'd like to think that it would be good to work on that as next target. Coming to my last point.. This thread has dealt since the beginning with three different problems: - REINDEX on partitioned relations. - CLUSTER on partitioned relations. - CIC on partitioned relations. (Should we also care about DROP INDEX CONCURRENTLY as well?) The first problem has been solved, not the two others yet. Do you think that it could be possible to begin two new separate threads for the remaining issues, with dedicated CF entries? We could also mark the existing one as committed, retitled for REINDEX as a matter of clarity. Also, please note that I am not sure if I will be able to look more at this thread in this CF. -- Michael
signature.asc
Description: PGP signature