Hi, Gustavo. I’d be glad to explore a more relaxed design together with you!
Let me share my thoughts.


When designing this FLIP, my original intention for the "no update" column 
semantics was: as long as the primary key remains the same, the values in these 
columns must not change. This allows us to safely treat the "no update" columns 
as part of the source's unique (upsert) key, without needing to track or 
interpret the row kind of intermediate results. The reason is that all 
intermediate data ultimately originates from the source, unless an explicit key 
transformation occurs (e.g., an aggregation with a grouping key).


I did not prohibit -U (update-before) records because, in most storage systems, 
an update is treated as a single atomic operation; Flink merely decomposes it 
into -U and +U for internal processing. Storage systems can easily enforce the 
"no update" constraint by checking whether a single update violates 
immutability for certain columns.
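
To make that check concrete, here is a minimal sketch, assuming the storage 
system sees the before and after images of a single update together. The 
function name and the dict-based row layout are hypothetical and do not 
correspond to any Flink or storage API.

```python
def violated_columns(before, after, no_update_cols):
    """Return the "no update" columns whose values differ between the
    before image (-U) and the after image (+U) of one atomic update."""
    return [col for col in no_update_cols if before[col] != after[col]]


# Hypothetical rows for src1(a, b, c), where c is declared "no update".
before = {"a": 1, "b": 10, "c": 100}
ok_after = {"a": 1, "b": 11, "c": 100}   # only b changes: allowed
bad_after = {"a": 1, "b": 11, "c": 200}  # c changes: violation

print(violated_columns(before, ok_after, ["c"]))
print(violated_columns(before, bad_after, ["c"]))
```

The point is only that both images are available in one operation, so no 
history across separate messages has to be kept.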


However, if we instead guarantee immutability only between a +I and its 
matching -D, we lose the ability to use "no update" column information for 
optimizations, because upstream operators often emit multiple +I/-D record 
pairs for the same primary key. Consider the following example:
```
create table src1(a int, b int, c int, primary key(a) not enforced, column(c) 
no update not enforced);
create table src2(id int, d int, primary key(id) not enforced);
select * from src1 join src2 on c = id;
```
With this join, the planner cannot apply DropUpdateBefore upstream because, 
despite being declared as immutable, column `c` can actually take different 
values for the same primary key `a` across different +I and -D events. As a 
result, the join key `c` is not stable, and records for the same primary key 
can be shuffled to different parallel tasks. Consequently, the system must rely 
on -U to process the data correctly.
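
As a rough illustration of the shuffle problem, the sketch below routes 
changelog records to tasks by the join key `c` and keeps per-task state keyed 
by `a`. Everything here is a simplifying assumption: the modulo routing stands 
in for Flink's real key-group partitioning, and the state model is far simpler 
than an actual join operator.

```python
PARALLELISM = 2


def task_for(join_key):
    # Stand-in for hash partitioning; Flink uses key groups, not a plain modulo.
    return join_key % PARALLELISM


def run(changelog):
    """Apply (kind, a, c) records to per-task state keyed by primary key a."""
    state = [dict() for _ in range(PARALLELISM)]
    for kind, a, c in changelog:
        task = state[task_for(c)]
        if kind in ("+I", "+U"):
            task[a] = c
        else:  # "-U" or "-D" retracts the row on the task owning the old key
            task.pop(a, None)
    return state


# Same primary key a=1, but c changes from 1 to 2 across two lifetimes.
with_retraction = run([("+I", 1, 1), ("-D", 1, 1), ("+I", 1, 2)])
without_retraction = run([("+I", 1, 1), ("+I", 1, 2)])

print(with_retraction)
print(without_retraction)
```

Without the retraction, the task that received `c = 1` still holds a stale row 
for `a = 1`, because the record carrying `c = 2` was routed elsewhere.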





--

    Best!
    Xuyang



At 2026-02-27 21:35:08, "Gustavo de Morais" <[email protected]> wrote:
>Hey Xuyang,
>
>Thanks for the reply.
>
>- Could you give an example for "which could maybe lead to the no-update
>metadata being incorrectly applied in some certain scenario"?
>- Also, if we're banning -D because joins can't distinguish it from -U,
>then by the same logic we'd need to ban -U, right? Could you specify that
>in the FLIP?
>
>Just to clarify, I'm +1 for the FLIP, I'm just wondering if we could make
>it more general.
>
>Kind regards,
>Gustavo
>
>On Fri, 27 Feb 2026 at 06:38, Xuyang <[email protected]> wrote:
>
>> Hi, Gustavo.
>> You're absolutely right! In an ideal scenario, the lifecycle of immutable
>> columns should indeed be confined within the sequence +I, -U, +U, -D.
>> However, in Flink today, we don't fully distinguish between (+I, -D) and
>> (+U, -U) (e.g., at join nodes), which could maybe lead to the no-update
>> metadata being incorrectly applied in some certain scenarios.
>>
>>
>> Therefore, for simplicity, I'd prefer not to support -D for this first
>> step. I'd like to hear your thoughts on this.
>>
>>
>>
>> --
>>
>>     Best!
>>     Xuyang
>>
>>
>>
>> At 2026-02-26 19:03:57, "Gustavo de Morais" <[email protected]>
>> wrote:
>> >Hey Xuyang,
>> >
>> >Thanks for the updates and reply!
>> >
>> >Regarding dropping the restriction: In my thinking, in Flink's changelog
>> >semantics, -D ends the row's lifetime. When we see -D followed by +I for
>> >the same PK, like in the example you gave, that's imo creating a new row
>> >rather than updating the existing one. I don't think it makes sense to
>> >start tracking relations between "conceptually different rows".  If it
>> were
>> >an update and still the same row, I'd expect a +/-U instead.
>> >
>> >So I'm still inclining towards being +1 to drop it. That means, NO UPDATE
>> >would mean "immutable while the row exists" rather than "immutable for
>> this
>> >PK forever". For DeltaJoin's stricter needs (no deletes), we could enforce
>> >that separately during planning. Does that distinction make sense to you?
>> >Let me know what you think.
>> >
>> >Kind regards,
>> >Gustavo
>> >
>> >On Thu, 26 Feb 2026 at 05:17, Xuyang <[email protected]> wrote:
>> >
>> >> Hi Gustavo,
>> >> Regarding your suggestion to remove the deletion restriction: it's not
>> >> only tied to delta joins. My primary concern before has been the
>> >> significant overhead for the storage engine in tracking no update column
>> >> changes across separate INSERT and DELETE messages.
>> >> Consider this example, where col1 is the primary key and col3 is
>> declared
>> >> as a NO UPDATE column:
>> >> Schema: (col1, col2, col3)
>> >> 1)+I (pk1, a1, b1)
>> >> 2)-D (pk1, a1, b1)
>> >> 3)+I (pk1, a2, b2)
>> >> In this sequence, the value of the no update column col3 changes from b1
>> >> to b2, which violates the NO UPDATE constraint.
>> >> If we relax the restriction on DELETE operations, we would effectively
>> >> shift the responsibility to users to guarantee that values of no update
>> >> columns remain consistent across corresponding INSERT, DELETE, and
>> UPDATE
>> >> records.
>> >> Given these implications, I’d like to hear your thoughts on whether we
>> >> should proceed with removing this restriction.
>> >>
>> >>
>> >> Separately, regarding the additional details you mentioned, I’ve updated
>> >> them into the FLIP. Here’s a quick summary:
>> >> - If a user declares NO UPDATE (b, c) on a table without a primary key,
>> an
>> >> error will be thrown: “NO UPDATE constraints must be defined on tables
>> with
>> >> a primary key.”
>> >> - If a user declares NO UPDATE(a) and column a is already part of the
>> >> primary key, the declaration is silently accepted.
>> >> - Updated: CONSTRAINT %s COLUMNS (%s) NO UPDATE%s
>> >>
>> >>
>> >>
>> >> --
>> >>
>> >>     Best!
>> >>     Xuyang
>> >>
>> >>
>> >>
>> >> At 2026-02-19 20:10:11, "Gustavo de Morais" <[email protected]>
>> >> wrote:
>> >> >Hey Xuyang,
>> >> >
>> >> >That's an interesting concept, thanks for the proposal!
>> >> >
>> >> >I like the FLIP and I think this could open some other optimizations.
>> That
>> >> >said, I think it makes sense to remove the deletion restriction from
>> the
>> >> >FLIP - since it's mostly a necessity that comes from the DeltaJoin. We
>> >> >could make NO UPDATE be about immutability which is not directly
>> connected
>> >> >to row permanence. As far as I know, the DeltaJoin already enforces the
>> >> >deletion restriction during planning for its sources, so it doesn't
>> have
>> >> to
>> >> >be enforced by this functionality as well.
>> >> >
>> >> >Also, some small clarifications that could be added to the FLIP:
>> >> >- If someone declares NO UPDATE (b, c) on a table without a primary
>> key. I
>> >> >suppose that's an error?
>> >> >- If someone declares NO UPDATE(a) and a is already a primary key. Is
>> it
>> >> an
>> >> >error or do we silently accept it?
>> >> >- nit: CONSTRAINT %s FIELDS (%s) NO UPDATE%s -> you mean COLUMNS
>> instead
>> >> of
>> >> >FIELDS, right?
>> >> >
>> >> >Kind regards,
>> >> >Gustavo
>> >> >
>> >> >
>> >> >
>> >> >On Fri, 13 Feb 2026 at 10:08, Xuyang <[email protected]> wrote:
>> >> >
>> >> >> Hi, everyone.
>> >> >> I’d like to propose FLIP-566: Introduce a new NO UPDATE column
>> >> >> constraint[1].
>> >> >> Flink has introduced the Delta Join, whose core advantage lies in
>> >> >> replacing redundant local state storage with direct queries to
>> external
>> >> >> storage systems (e.g., Apache Fluss). It currently relies on the
>> upsert
>> >> >> key, which ensures correct changelog processing without UPDATE_BEFORE
>> >> >> messages. But this assumes the join key must be part of the primary
>> key.
>> >> >> As modern storage systems increasingly support general-purpose
>> >> >> secondary indexes (not limited to primary keys), this restriction is
>> >> >> becoming outdated. We need a new semantic mechanism to guarantee the
>> >> >> immutability of the join key—specifically, that for a given primary
>> key,
>> >> >> the column values comprising the join key cannot be modified.
>> >> >> Looking forward to your feedback.
>> >> >>
>> >> >>
>> >> >> [1]
>> >> >>
>> >>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-566%3A+Introduce+a+new+NO+UPDATE+constraint
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >>
>> >> >>     Best!
>> >> >>     Xuyang
>> >>
>>
