> On Dec 29, 2025, at 16:24, Chao Li <[email protected]> wrote: > > > >> On Dec 22, 2025, at 14:26, Amit Kapila <[email protected]> wrote: >> >> On Fri, Dec 19, 2025 at 11:14 AM Zhijie Hou (Fujitsu) >> <[email protected]> wrote: >>> >>> On Thursday, December 18, 2025 12:21 PM Chao Li <[email protected]> >>> wrote: >>>> >>>>> On Dec 17, 2025, at 16:48, Zhijie Hou (Fujitsu) <[email protected]> >>>> wrote: >>>>> >>>>> On Wednesday, December 17, 2025 3:56 PM Chao Li >>>> <[email protected]> wrote: >>>>>> Thank you both for all your advice. Here comes my first >>>>>> implementation of INHERIT in the attached v2 patch. >>>>>> >>>>>> On Wed, Dec 17, 2025 at 8:11 AM Euler Taveira >>>> <mailto:[email protected]> wrote: >>>>>> >>>>>>> I wondering if we use INHERIT as default. The main advantage is >>>>>>> usability as Chao Li already mentioned. Is there any cases that >>>>>>> having a different replica identity from parent/partitioned table makes >>>> sense? >>>>>> >>>>>> We can leave this topic open for discussion. In my current >>>>>> implementation, NO INHERIT is still the default. But if we decide to >>>>>> switch the default, I can add a new commit that should include only 1 >>>>>> line code change in gram.y and a tiny doc update. >>>>>> >>>>>> 0001 - when a new partition is created, use the parent's replication >>>>>> identity >>>>>> 0002 - add INHERIT | NO INHERIT >>>>> >>>> >>>> Hi Zhijie, >>>> >>>> Thanks for your feedback and linked information. I think this patch is >>>> avoiding >>>> the hard problem of “index” RI. >>>> >>>>> >>>>> I think there are several design considerations for this proposal: >>>>> >>>>> 1) Since the index names can vary across different partitions, what >>>>> should be the expected behavior if a new partition cannot identify the >>>>> same replica identity key as the root partitioned table? >>>> >>>> Index RI is skipped in this patch. INHERT only works for DEFAULT, FULL and >>>> NONE. >>>> >>> >>> I personally find skipping this part to be inconsistent, particularly given >>> the >>> existing disparities among ALTER TABLE commands related to partitioned >>> table handling. >>> Omitting this part introduces further inconsistency within the ALTER TABLE >>> REPLICA IDENTITY. >>> >> >> Fair point. I think one should summarize the previous discussions with >> key differences and where the previous patch got stuck. Then, it would >> be good to get some feedback from the people involved previously. If >> there is an agreement that we can do INHERIT stuff for specific parts >> then fine, otherwise, I think we need to address the index part as >> well. > >> [1] >> https://www.postgresql.org/message-id/flat/201902041630.gpadougzab7v%40alvherre.pgsql >> [2] >> https://www.postgresql.org/message-id/flat/OSBPR01MB2982A2738F16722899A50082FECB0%40OSBPR01MB2982.jpnprd01.prod.outlook.com#2e5388a7cde3c10430f8668a6c381b06 > > Sure, let me try to summarize the two discussions. > > For [1], key participants included Álvaro Herrera (patch author), Dmitry > Dolgov (reviewer/tester), Robert Haas, Simon Riggs, Michael Paquier, and > Peter Eisentraut. > ---- > > Brief summary with rough timeline: > > * In 2019, Álvaro proposed a patch that essentially did the same thing as > this patch. That patch also attempted to handle “index-based” replica > identity, which my patch intentionally avoids. > * Dmitry tested the patch and reported issues related to attaching > partitions, including index-related errors. > * Robert then pointed out that having REPLICA IDENTITY recurse was > inconsistent with the behavior of other ALTER TABLE actions such as > TABLESPACE, and emphasized that we should really try to think through *all* > actions that might recurse to child partitions. > * Álvaro noted the use of `ALTER TABLE ONLY`, suggesting that WITHOUT ONLY > could recurse to children, while ONLY would affect just the parent. > * Simon commented that recursing TABLESPACE changes could be problematic > because they imply physical data rewrites. > * Robert listed several ALTER TABLE actions that lacked consistent recurse > behavior (identity columns, triggers, CLUSTER, OWNER, TABLESPACE, CHECK > constraints). > * This led to broader discussion about whether TABLESPACE/OWNER/etc. should > recurse. > * Michael echoed support for having REPLICA IDENTITY recurse, controlled via > ONLY. > * Peter pointed out that recursing ADD GENERATED AS IDENTITY / DROP IDENTITY > may not be feasible. > * Álvaro wanted to proceed with the patch. > * Robert maintained that defining consistent semantics for all relevant ALTER > TABLE actions should come first. > > Overall, the blocker was an unresolved semantic disagreement, rather than a > concrete technical objection to the patch itself. There appeared to be broad > agreement that: > * New partitions should use the parent’s replica identity. > * ONLY could be used to control whether replica identity changes recurse. > > However, the discussion about “semantic consistency” significantly broadened > the scope, and there was no clear agreement on whether TABLESPACE/OWNER/etc. > should recurse, which ultimately stalled the effort. > > For [2], key participants included: Takayuki Tsunakawa (patch author), > Bharath Rupireddy (reviewer), Álvaro Herrera, Michael Paquier, and Kyotaro > Horiguchi. > ---- > > Brief summary with rough timeline: > > * In 2020, Takayuki proposed a patch intended to propagate ALTER TABLE SET > LOGGED / UNLOGGED to partitions. > * Bharath reviewed the patch and raised a number of questions and edge cases. > * There were initial discussions about the patch mechanics and expected > behavior. > * Álvaro then pointed out the strong relation to the earlier discussion in > [1]. > * The focus of the discussion shifted to the more fundamental question of > “what is the parent?”: > * Takayuki viewed ALTER TABLE on a partitioned table as a convenient batch > operation on existing partitions, with future partitions remaining > independent. > * Álvaro, Michael, and Kyotaro argued that changing a property on the parent > should define the default for future partitions as well. > * No clear agreement was reached on this semantic question, and the > discussion expanded into broader concerns about consistency across ALTER > TABLE actions. > * Takayuki withdrew the patch > > Overall, [2] also fell under the umbrella of “semantic consistency”, the main > discussion was not about replica identity itself. > > Current situation: > ---- > > * Resolving “semantic consistency” across ALTER TABLE actions in a single > release appears to be the biggest challenge. However, addressing everything > in one patch set does not seem realistic. > * The core question of “what is the parent?” from [1] remains central. That > said, the discussion appeared to lean toward the view that future partitions > should inherit properties from the parent. > * Different properties behave very differently. For example, propagating > REPLICA IDENTITY is a metadata-only change and relatively safe, while > propagating TABLESPACE implies physical data movement and is much riskier. As > a result, each ALTER TABLE action may deserve its own discussion and patch > set. > * The inconsistency is not limited to ALTER TABLE but also exists in CREATE > TABLE behavior. For example, a new partition inherits TABLESPACE from the > parent, but not REPLICA IDENTITY. > * “ALTER TABLE ONLY table_name” is commonly suggested as the mechanism to > control whether changes should recurse to partitions. > > How to proceed? > ---- > > If we stop here, these inconsistencies will remain indefinitely, which I > believe nobody really wants. With that in mind, I’d like to suggest a > two-phase approach. > > Phase 1: Document current behavior and set expectations > > * Identify all ALTER TABLE actions involved in these inconsistencies. > * Update the ALTER TABLE and CREATE TABLE documentation to clearly describe > the current behavior for partitioned tables, and (where appropriate) the > intended or ideal behavior. > * Explicitly document the meaning of ONLY for partitioned tables, and note > that some actions may behave differently, with details described in each > action’s section. > > Phase 2: Address actions incrementally > > * Work on each ALTER TABLE action individually, recognizing that some may be > straightforward while others require more design discussion, coding, and > testing. > * With Phase 1 in place to set expectations, it may not be necessary to > complete all actions in a single release. > * That said, it would still be desirable to keep the work bounded, for > example within one or two major releases, to avoid long-term fragmentation. > > I’ve included the participants from the previous discussions on CC, in case > they want to comment further. >
To move this patch forward, I’ve been working on a few related patches in
parallel:
• [1] Enhances the “ALTER TABLE” documentation to clarify how subcommands
behave on partitioned tables. In that discussion, I summarized the current
behaviors of all subcommands into 15 categories.
• [2] Adds a NOTICE when ONLY is not specified but the action does not
recurse to child partitions.
• [3] Improves the header comments in tablecmds.c to better explain the
meaning of “recurse” and “recursing”.
While working on [1], [2] and [3], I gained a deeper understanding of ALTER
TABLE behavior. Based on that, v3 is a complete rework compared to v2. The
behavior implemented by v3 is:
• Recursion is performed during the preparation phase.
• When ONLY is specified, no recursion occurs; without ONLY, the command
recurses to all child partitions.
• All replica identity types are supported, including USING INDEX (which
was not supported in v2).
• For non-index replica identities, the existing ATSimpleRecursion()
infrastructure is reused; for index-based replica identity, the corresponding
index name must be determined for each child partition.
In a previous discussion [4], Dmitry Dolgov pointed out a test case that
resulted in a DEADLOCK. I ran that test against v3. The test still fails, but I
no longer observe a deadlock; instead, the server now crashes during partition
attachment. I will investigate this further.
While working on v3, I also noticed another issue that may be a bug. When
creating an index on a partitioned table, the index is automatically created on
all partitions. However, if a column type change causes an index rebuild, it
appears that the rebuilt index may be missing from child partitions. I will
verify this behavior and, if confirmed, discuss it in a separate thread.
At this point, I think v3 is in a reviewable state.
[1]
https://postgr.es/m/CAEoWx2%3DmYhCfsnHaN96Qqwq5b0GVS2YgO3zpVqPPRd_iO52wRw%40mail.gmail.com
[2]
https://postgr.es/m/CAEoWx2=SLga-xH09Cq_PAvsHhQHrBK+V0vF821JKgzS=bm0...@mail.gmail.com
[3]
https://postgr.es/m/caeowx2n9e6_zxpbqwmpapuc1c_p4b3y635sjiducpsvm8gb...@mail.gmail.com
[4]
https://www.postgresql.org/message-id/flat/201902041630.gpadougzab7v%40alvherre.pgsql
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
v3-0001-Inherit-replica-identity-for-new-and-attached-par.patch
Description: Binary data
v3-0002-ALTER-TABLE-recurse-REPLICA-IDENTITY-to-partition.patch
Description: Binary data
