On Wed, Dec 11, 2024 at 12:37 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
>
> I confirmed that the proposed patch fixes these issues. I have one
> question about the patch:
>
> In the main loop in SnapBuildDistributeSnapshotAndInval(), we have the
> following code:
>
>         /*
>          * If we don't have a base snapshot yet, there are no changes in this
>          * transaction which in turn implies we don't yet need a snapshot at
>          * all. We'll add a snapshot when the first change gets queued.
>          *
>          * NB: This works correctly even for subtransactions because
>          * ReorderBufferAssignChild() takes care to transfer the base snapshot
>          * to the top-level transaction, and while iterating the changequeue
>          * we'll get the change from the subtxn.
>          */
>         if (!ReorderBufferXidHasBaseSnapshot(builder->reorder, txn->xid))
>             continue;
>
> Is there any case where we need to distribute inval messages to
> transactions that don't have the base snapshot yet but eventually need
> the inval messages?
>
Good point. For snapshots, the comment says: "We'll add a snapshot when
the first change gets queued." I think we achieve this via the
builder->committed.xip array: when we set a base snapshot for a
transaction, we use that array to form the snapshot. However, I don't
see any such consideration for invalidations. Now, we could either
always add invalidations to xacts that don't have a base snapshot yet,
or have a mechanism similar to the committed.xip array. But it is
better to first reproduce the problem.

> Overall, with this idea, we distribute invalidation messages to all
> concurrent decoded transactions. It could introduce performance
> regressions by several causes. For example, we could end up
> invalidating RelationSyncCache entries in more cases. While this is
> addressed by your selectively cache invalidation patch, there is still
> 5% regression. We might need to accept a certain amount of regressions
> for making it correct but it would be better to figure out where these
> regressions come from. Other than that, I think the performance
> regression could happen due to the costs of distributing invalidation
> messages. You've already observed there is 1~3% performance regression
> in cases where we distribute a large amount of invalidation messages
> to one concurrently decoded transaction[1]. I guess that the
> selectively cache invalidation idea would not help this case. Also, I
> think we might want to test other cases like where we distribute a
> small amount of invalidation messages to many concurrently decoded
> transactions.
>

+1.

--
With Regards,
Amit Kapila.