github-actions[bot] commented on code in PR #64958:
URL: https://github.com/apache/doris/pull/64958#discussion_r3505845838
##########
fe/fe-core/src/main/java/org/apache/doris/mtmv/MTMVJobManager.java:
##########
@@ -143,8 +143,18 @@ public void dropJob(MTMV mtmv, boolean isReplay) {
}
public void alterJob(MTMV mtmv, boolean isReplay) {
- dropJob(mtmv, isReplay);
- createJob(mtmv, isReplay);
+ MTMVJob oldJob = getJobByMTMV(mtmv);
Review Comment:
This still leaves a race because `processAlterMTMV()` has already published
the new refresh metadata before this method waits on the old job lock. While
`oldJob.writeLock()` is blocked by the current refresh, the running task can
observe the new `refreshInfo` through calls such as
`calculateNeedRefreshPartitions()`, and if the change is to `ON COMMIT`,
`MTMVService.processEvent()` can now call `onCommit()` and enqueue another task
on the still-registered old job. If that task reaches `MTMVTask.runTask()` and
blocks behind this ALTER on the old job lock, `dropJobInternal()` marks it
canceled, but after the ALTER drops/recreates the job and releases the old lock
the task continues: it has already passed `TaskProcessor`'s canceled pre-check
and `AbstractTask.before()` resets the status to `RUNNING`. At that point it
can refresh under the removed old job while new on-commit tasks use the
replacement job's lock, so the old/new concurrent refresh is still possible.
The old job lock need
s to cover the refresh-info mutation as well, or the old job must stop
accepting tasks before the MV becomes visible with the new trigger/config.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]