Great idea! I've replaced all occurrences of "no update" with "immutable" in
the FLIP. Since the title has been updated, here is the new link[1] to this FLIP.
Additionally, if there are no further comments, I will start the vote
tomorrow.
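
For anyone skimming the thread, here is a minimal Python sketch of the two semantics we discussed: "immutable for this PK forever" versus "immutable while the row exists". It is only an illustration under my own assumptions and is not part of the FLIP: the function name `find_violations`, the tuple layout, and the `pk_lifetime` flag are made up for this example, and it does not model the deletion restriction itself.

```python
from typing import Dict, Iterable, List, Tuple

# A changelog record: (row kind, row). Row kinds use Flink's short names.
Record = Tuple[str, Tuple]


def find_violations(
    changelog: Iterable[Record],
    pk_index: int,
    immutable_indexes: List[int],
    pk_lifetime: bool = True,
) -> List[Tuple]:
    """Return rows whose immutable columns differ from the values first
    seen for the same primary key.

    pk_lifetime=True  -> values are remembered even after a -D.
    pk_lifetime=False -> a -D forgets the key, so a later +I is a new row.
    """
    seen: Dict[object, Tuple] = {}
    violations: List[Tuple] = []
    for kind, row in changelog:
        pk = row[pk_index]
        values = tuple(row[i] for i in immutable_indexes)
        if kind in ("+I", "+U"):
            if pk in seen and seen[pk] != values:
                violations.append(row)
            else:
                seen.setdefault(pk, values)
        elif kind == "-D" and not pk_lifetime:
            seen.pop(pk, None)
        # -U carries the old image of the row, so nothing is checked for it.
    return violations


# The sequence from earlier in the thread: col1 is the PK, col3 is immutable.
changelog = [
    ("+I", ("pk1", "a1", "b1")),
    ("-D", ("pk1", "a1", "b1")),
    ("+I", ("pk1", "a2", "b2")),
]
pk_violations = find_violations(changelog, pk_index=0, immutable_indexes=[2])
row_violations = find_violations(changelog, 0, [2], pk_lifetime=False)
print(pk_violations)   # [('pk1', 'a2', 'b2')]
print(row_violations)  # []
```

Under PK-lifetime semantics the second +I is flagged because col3 changed from b1 to b2 for pk1; under row-lifetime semantics it is treated as a fresh row and passes.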


[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-566%3A+Introduce+a+new+IMMUTABLE+columns+constraint



--

    Best!
    Xuyang



At 2026-03-03 01:01:17, "Gustavo de Morais" <[email protected]> wrote:
>Hey Xuyang,
>
>Thanks for the reply and the discussion. I now understand your primary goal
>with the FLIP. If we want to enable these optimizations, PK-lifetime
>immutability makes sense!
>
>I'm fine with keeping the deletion restriction for v1. Suggestion: what if
>we called it IMMUTABLE instead of NO UPDATE? It's shorter and more naturally
>implies PK-lifetime semantics rather than just no UPDATE operations.
>If you'd rather go with NO UPDATE, we could introduce a separate NO UPDATE
>ALLOW DELETES variant later if there's demand.
>
>Kind regards,
>Gustavo
>
>
>
>On Sat, 28 Feb 2026 at 04:48, Xuyang <[email protected]> wrote:
>
>> Hi, Gustavo. I’d be glad to explore a more relaxed design together with
>> you! Let me share my thoughts.
>>
>>
>> When designing this FLIP, my original intention for the "no update" column
>> semantics was: as long as the primary key remains the same, the values in
>> these columns must not change. This allows us to safely treat the "no
>> update" columns as part of the source's unique (upsert) key, without
>> needing to track or interpret the row kind of intermediate results. The
>> reason is that all intermediate data ultimately originates from the source,
>> unless an explicit key transformation occurs (e.g., an aggregation with a
>> grouping key).
>>
>>
>> I did not prohibit -U (update-before) records because, in most storage
>> systems, an update is treated as a single atomic operation; Flink merely
>> decomposes it into -U and +U for internal processing. Storage systems can
>> easily enforce the "no update" constraint by checking whether a single
>> update violates immutability for certain columns.
>>
>>
>> However, if we instead restrict the immutability guarantee to the span
>> between +I and -D, then because upstream operators often emit multiple
>> +I/-D record pairs, we lose the ability to leverage "no update" column
>> information for optimizations. Consider the following example:
>> ```
>> create table src1(a int, b int, c int, primary key(a) not enforced,
>> column(c) no update not enforced);
>> create table src2(id int, d int, primary key(id) not enforced);
>> select * from src1 join src2 on c = id;
>> ```
>> This join prevents the upstream from generating DropUpdateBefore because,
>> despite being declared as immutable, column `c` can actually take different
>> values for the same primary key `a` across different +I and -D events. As a
>> result, the join key `c` is not stable, so records for the same primary key
>> can be shuffled to different parallel tasks. Consequently, the system must
>> rely on -U to process the data correctly.
>>
>>
>>
>>
>>
>> --
>>
>>     Best!
>>     Xuyang
>>
>>
>>
>> At 2026-02-27 21:35:08, "Gustavo de Morais" <[email protected]>
>> wrote:
>> >Hey Xuyang,
>> >
>> >Thanks for the reply.
>> >
>> >- Could you give an example for "which could maybe lead to the no-update
>> >metadata being incorrectly applied in some certain scenario"?
>> >- Also, if we're banning -D because joins can't distinguish it from -U,
>> >then by the same logic we'd need to ban -U, right? Could you specify that
>> >in the FLIP?
>> >
>> >Just to clarify, I'm +1 for the FLIP, I'm just wondering if we could make
>> >it more general.
>> >
>> >Kind regards,
>> >Gustavo
>> >
>> >On Fri, 27 Feb 2026 at 06:38, Xuyang <[email protected]> wrote:
>> >
>> >> Hi, Gustavo.
>> >> You're absolutely right! In an ideal scenario, the lifecycle of immutable
>> >> columns should indeed be confined within the sequence +I, -U, +U, -D.
>> >> However, in Flink today, we don't fully distinguish between (+I, -D) and
>> >> (+U, -U) (e.g., at join nodes), which could maybe lead to the no-update
>> >> metadata being incorrectly applied in some certain scenarios.
>> >>
>> >>
>> >> Therefore, for simplicity, I'd prefer not to support -D for this first
>> >> step. I'd like to hear your thoughts on this.
>> >>
>> >>
>> >>
>> >> --
>> >>
>> >>     Best!
>> >>     Xuyang
>> >>
>> >>
>> >>
>> >> At 2026-02-26 19:03:57, "Gustavo de Morais" <[email protected]>
>> >> wrote:
>> >> >Hey Xuyang,
>> >> >
>> >> >Thanks for the updates and reply!
>> >> >
>> >> >Regarding dropping the restriction: In my thinking, in Flink's changelog
>> >> >semantics, -D ends the row's lifetime. When we see -D followed by +I for
>> >> >the same PK, like in the example you gave, that's imo creating a new row
>> >> >rather than updating the existing one. I don't think it makes sense to
>> >> >start tracking relations between "conceptually different rows". If it were
>> >> >an update and still the same row, I'd expect a +/-U instead.
>> >> >
>> >> >So I'm still leaning towards +1 on dropping it. That means NO UPDATE
>> >> >would mean "immutable while the row exists" rather than "immutable for this
>> >> >PK forever". For DeltaJoin's stricter needs (no deletes), we could enforce
>> >> >that separately during planning. Does that distinction make sense to you?
>> >> >Let me know what you think.
>> >> >
>> >> >Kind regards,
>> >> >Gustavo
>> >> >
>> >> >On Thu, 26 Feb 2026 at 05:17, Xuyang <[email protected]> wrote:
>> >> >
>> >> >> Hi Gustavo,
>> >> >> Regarding your suggestion to remove the deletion restriction: it's not
>> >> >> only tied to delta joins. My primary concern has been the
>> >> >> significant overhead for the storage engine in tracking no update column
>> >> >> changes across separate INSERT and DELETE messages.
>> >> >> Consider this example, where col1 is the primary key and col3 is declared
>> >> >> as a NO UPDATE column:
>> >> >> Schema: (col1, col2, col3)
>> >> >> 1) +I (pk1, a1, b1)
>> >> >> 2) -D (pk1, a1, b1)
>> >> >> 3) +I (pk1, a2, b2)
>> >> >> In this sequence, the value of the no update column col3 changes from b1
>> >> >> to b2, which violates the NO UPDATE constraint.
>> >> >> If we relax the restriction on DELETE operations, we would effectively
>> >> >> shift the responsibility to users to guarantee that values of no update
>> >> >> columns remain consistent across corresponding INSERT, DELETE, and UPDATE
>> >> >> records.
>> >> >> Given these implications, I’d like to hear your thoughts on whether we
>> >> >> should proceed with removing this restriction.
>> >> >>
>> >> >>
>> >> >> Separately, regarding the additional details you mentioned, I’ve added
>> >> >> them to the FLIP. Here’s a quick summary:
>> >> >> - If a user declares NO UPDATE (b, c) on a table without a primary key, an
>> >> >> error will be thrown: “NO UPDATE constraints must be defined on tables with
>> >> >> a primary key.”
>> >> >> - If a user declares NO UPDATE(a) and column a is already part of the
>> >> >> primary key, the declaration is silently accepted.
>> >> >> - Updated: CONSTRAINT %s COLUMNS (%s) NO UPDATE%s
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >>
>> >> >>     Best!
>> >> >>     Xuyang
>> >> >>
>> >> >>
>> >> >>
>> >> >> At 2026-02-19 20:10:11, "Gustavo de Morais" <[email protected]>
>> >> >> wrote:
>> >> >> >Hey Xuyang,
>> >> >> >
>> >> >> >That's an interesting concept, thanks for the proposal!
>> >> >> >
>> >> >> >I like the FLIP and I think this could open some other optimizations. That
>> >> >> >said, I think it makes sense to remove the deletion restriction from the
>> >> >> >FLIP - since it's mostly a necessity that comes from the DeltaJoin. We
>> >> >> >could make NO UPDATE be about immutability which is not directly connected
>> >> >> >to row permanence. As far as I know, the DeltaJoin already enforces the
>> >> >> >deletion restriction during planning for its sources, so it doesn't have to
>> >> >> >be enforced by this functionality as well.
>> >> >> >
>> >> >> >Also, some small clarifications that could be added to the FLIP:
>> >> >> >- If someone declares NO UPDATE (b, c) on a table without a primary key. I
>> >> >> >suppose that's an error?
>> >> >> >- If someone declares NO UPDATE(a) and a is already a primary key. Is it an
>> >> >> >error or do we silently accept it?
>> >> >> >- nit: CONSTRAINT %s FIELDS (%s) NO UPDATE%s -> you mean COLUMNS instead of
>> >> >> >FIELDS, right?
>> >> >> >
>> >> >> >Kind regards,
>> >> >> >Gustavo
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >On Fri, 13 Feb 2026 at 10:08, Xuyang <[email protected]> wrote:
>> >> >> >
>> >> >> >> Hi, everyone.
>> >> >> >> I’d like to propose FLIP-566: Introduce a new NO UPDATE column
>> >> >> >> constraint[1].
>> >> >> >> Flink has introduced the Delta Join, whose core advantage lies in
>> >> >> >> replacing redundant local state storage with direct queries to external
>> >> >> >> storage systems (e.g., Apache Fluss). It currently relies on the upsert
>> >> >> >> key, which ensures correct changelog processing without UPDATE_BEFORE
>> >> >> >> messages. But this assumes the join key must be part of the primary key.
>> >> >> >> As modern storage systems increasingly support general-purpose secondary
>> >> >> >> indexes (not limited to primary keys), this restriction is
>> >> >> >> becoming outdated. We need a new semantic mechanism to guarantee the
>> >> >> >> immutability of the join key: specifically, that for a given primary key,
>> >> >> >> the column values comprising the join key cannot be modified.
>> >> >> >> Looking forward to your feedback.
>> >> >> >>
>> >> >> >>
>> >> >> >> [1]
>> >> >> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-566%3A+Introduce+a+new+NO+UPDATE+constraint
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> --
>> >> >> >>
>> >> >> >>     Best!
>> >> >> >>     Xuyang
>> >> >>
>> >>
>>
