Re: [PR] [FLINK-37278] Optimize regular schema evolution topology's performance [flink-cdc]

via GitHub Mon, 24 Feb 2025 21:55:38 -0800


yuxiqian commented on PR #3912:
URL: https://github.com/apache/flink-cdc/pull/3912#issuecomment-2680689436


   > > It seems that `SchemaChangeResponse#ResponseCode` can only be SUCCESS 
now. Can we remove `SchemaChangeResponse#ResponseCode` and simplify the logic 
in `SchemaOperator#handleSchemaChangeEvent` ?
   > 
   > @yuxiqian @Shawn-Hx Hi, have you noticed that during fault tolerance, the 
same table will be flushed multiple times (related to the task parallelism)? So 
I think `SchemaChangeResponse#ResponseCode#DUPLICATE` should not be deleted, 
but should be strengthened instead.
   
   Thanks to @gongzexin for the report. IIUC, the root cause of this problem is 
that `PreTransformOperator` invokes `getUnionListState` to store persistent 
schemas, so all subTasks of `SchemaOperator` obtain the same set of table 
schemas when restoring from state, and `SchemaCoordinator` should expect to 
receive $N$ duplicate requests ($N$ = parallelism). Worse still, 
`UnionListState` will block the checkpointing process when some subTasks have 
entered the FINISHED state (FLINK-37368).
   
   I wonder if we can handle it in another PR, and focus on modifying the 
schema evolution request queueing logic here?
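
   To illustrate the duplicate-request effect described above, here is a minimal, framework-free sketch. It does not use Flink or flink-cdc classes: `restoreUnion`, `Coordinator`, and `simulate` are hypothetical names that only model the behaviour discussed, namely that union-style state hands every subtask the full set of stored entries on restore, so a coordinator sees the same table once per subtask. The deduplication shown is one possible way to answer repeats, not the project's actual implementation.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class UnionStateDuplicateSketch {

    // Models union redistribution: on restore, every subtask receives
    // the concatenation of the entries stored by ALL subtasks.
    static List<List<String>> restoreUnion(List<List<String>> perSubtaskState, int parallelism) {
        List<String> union = new ArrayList<>();
        for (List<String> entries : perSubtaskState) {
            union.addAll(entries);
        }
        List<List<String>> restored = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            restored.add(new ArrayList<>(union));
        }
        return restored;
    }

    // Hypothetical coordinator (an assumption for illustration): applies the
    // first request per table and counts any repeat as a duplicate.
    static class Coordinator {
        private final Set<String> applied = new HashSet<>();
        int appliedCount = 0;
        int duplicateCount = 0;

        boolean request(String tableId) {
            if (applied.add(tableId)) {
                appliedCount++;
                return true;
            }
            duplicateCount++;
            return false;
        }
    }

    // One subtask checkpointed a schema for "db.orders"; after restore,
    // every subtask re-sends a request for that table.
    static int[] simulate(int parallelism) {
        List<List<String>> checkpointed = new ArrayList<>();
        checkpointed.add(List.of("db.orders"));
        for (int i = 1; i < parallelism; i++) {
            checkpointed.add(List.of());
        }
        Coordinator coordinator = new Coordinator();
        for (List<String> subtaskSchemas : restoreUnion(checkpointed, parallelism)) {
            for (String tableId : subtaskSchemas) {
                coordinator.request(tableId);
            }
        }
        return new int[] {coordinator.appliedCount, coordinator.duplicateCount};
    }

    public static void main(String[] args) {
        int[] result = simulate(4);
        System.out.println("applied=" + result[0] + " duplicates=" + result[1]);
    }
}
```

   With a parallelism of 4, the sketch's coordinator receives four requests for the same table, applies one, and treats the other three as duplicates.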


