[ 
https://issues.apache.org/jira/browse/FLINK-28187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17557824#comment-17557824
 ] 

Aitozi commented on FLINK-28187:
--------------------------------

> For sessionjobs we need to cover both using the jobid magic somehow 

Can this be done by generating the JobID from the resource UID?
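
A minimal sketch of what deriving the JobID from the resource UID could look like. It assumes the UID is a standard UUID string (as Kubernetes metadata.uid normally is) and produces the 32-character hex form that a Flink JobID prints as. The class and method names are hypothetical, and this is not operator code:

```java
import java.util.UUID;

// Hypothetical helper, not part of the operator: maps a resource UID to a
// stable 32-character hex id, so the same resource always yields the same id.
final class DeterministicJobId {

    // Assumes resourceUid is a UUID string such as Kubernetes metadata.uid.
    static String jobIdFromResourceUid(String resourceUid) {
        UUID uid = UUID.fromString(resourceUid);
        // 128 bits rendered as two zero-padded 64-bit hex halves.
        return String.format(
                "%016x%016x",
                uid.getMostSignificantBits(),
                uid.getLeastSignificantBits());
    }

    public static void main(String[] args) {
        String uid = "123e4567-e89b-12d3-a456-426614174000";
        // Prints 123e4567e89b12d3a456426614174000
        System.out.println(jobIdFromResourceUid(uid));
    }
}
```

Since the id depends only on the UID, a retried submission for the same resource carries the same id as the first attempt.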

1. The Dispatcher will throw DuplicateJobSubmissionException if the same JobID 
is submitted twice.

2. Upgrade happens with the following steps:

 

1) Suspend the old job and set the reconcile status to upgrading.

2) Submit the job with the new spec and the same JobID.

3) If the submission succeeds but the client still sees a timeout, the observer 
can detect that the JobID is already running, then update the reconcile status 
to deployed and update the lastReconciledSpec.
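
The flow above can be sketched with an in-memory stand-in for the cluster. Everything here (FakeCluster, State, observe) is hypothetical and only illustrates the reasoning; it does not use the real Dispatcher or operator classes, and IllegalStateException stands in for DuplicateJobSubmissionException:

```java
import java.util.HashSet;
import java.util.Set;

final class ResubmitSketch {

    // Hypothetical stand-in for the session cluster: rejects a known job id.
    static final class FakeCluster {
        private final Set<String> jobs = new HashSet<>();

        void submit(String jobId) {
            if (!jobs.add(jobId)) {
                throw new IllegalStateException("duplicate job submission: " + jobId);
            }
        }

        void suspend(String jobId) {
            jobs.remove(jobId);
        }

        boolean isRunning(String jobId) {
            return jobs.contains(jobId);
        }
    }

    enum State { UPGRADING, DEPLOYED }

    // Observer step: a job id already present on the cluster means the
    // earlier submission went through, even if the client saw a timeout.
    static State observe(FakeCluster cluster, String jobId) {
        return cluster.isRunning(jobId) ? State.DEPLOYED : State.UPGRADING;
    }

    public static void main(String[] args) {
        FakeCluster cluster = new FakeCluster();
        String jobId = "123e4567e89b12d3a456426614174000";

        cluster.submit(jobId);   // initial deployment
        cluster.suspend(jobId);  // step 1: suspend, status is UPGRADING
        cluster.submit(jobId);   // step 2: resubmit; assume the client times out here

        // Step 3: the observer finds the id and marks the resource deployed.
        System.out.println(observe(cluster, jobId));

        // A blind retry with the same id is rejected instead of duplicating the job.
        try {
            cluster.submit(jobId);
        } catch (IllegalStateException e) {
            System.out.println("retry rejected");
        }
    }
}
```

In this sketch the two safeguards are independent: the observer check avoids the retry in the normal case, and the duplicate rejection catches a retry that slips through anyway.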

Do you think this is a valid solution? [~gyfora] 

 

> Duplicate job submission for FlinkSessionJob
> --------------------------------------------
>
>                 Key: FLINK-28187
>                 URL: https://issues.apache.org/jira/browse/FLINK-28187
>             Project: Flink
>          Issue Type: Bug
>          Components: Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.0.0
>            Reporter: Jeesmon Jacob
>            Priority: Critical
>         Attachments: flink-operator-log.txt
>
>
> During a session job submission if a deployment error (ex: 
> concurrent.TimeoutException) is hit, operator will submit the job again. But 
> first submission could have succeeded in jobManager side and second 
> submission could result in duplicate job. Operator log attached.
> Per [~gyfora]:
> The problem is that in case a deployment error was hit, the 
> SessionJobObserver will not be able to tell whether it has submitted the job 
> or not. So it will simply try to submit it again. We have to find a mechanism 
> to correlate Jobs on the cluster with the SessionJob CR itself. Maybe we 
> could override the job name itself for this purpose or something like that.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
