[GitHub] flink issue #2913: [backport] [FLINK-5114] [network] Handle partition produc...
Github user uce commented on the issue: https://github.com/apache/flink/pull/2913 Closing in favour of #2975. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2913: [backport] [FLINK-5114] [network] Handle partition produc...
Github user uce commented on the issue: https://github.com/apache/flink/pull/2913 I removed the `findExecutionAttemptWithId` and only check the latest attempt. If that does not match the expected producer attempt, I answer with a `PartitionProducerDisposedException` to which the requesting `Task` reacts with a `cancelExecution`. I would really like to merge this and kick off a new RC for 1.1.4 soon.
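The check described in this comment can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the pull request: `ProducerCheckSketch`, `Attempt`, `ConsumerTask`, `requestProducerState` and `onProducerStateRequest` are hypothetical names, and the exception and state types are local stand-ins rather than the real Flink classes.

```java
// Illustrative sketch only: every type here is a simplified stand-in,
// not a class from the Flink code base.
final class ProducerCheckSketch {

    /** Stand-in for an execution state; only a few values are modelled. */
    enum ExecutionState { RUNNING, FINISHED, CANCELED, FAILED }

    /** Local stand-in for the exception named in the comment. */
    static final class PartitionProducerDisposedException extends Exception {
        PartitionProducerDisposedException(String attemptId) {
            super("Producer attempt " + attemptId + " has been disposed.");
        }
    }

    /** Hypothetical, minimal view of the latest execution attempt of a vertex. */
    static final class Attempt {
        final String attemptId;
        final ExecutionState state;

        Attempt(String attemptId, ExecutionState state) {
            this.attemptId = attemptId;
            this.state = state;
        }
    }

    /**
     * Looks only at the latest attempt. If it is not the attempt the consumer
     * expects, the producer is treated as disposed; prior attempts are never searched.
     */
    static ExecutionState requestProducerState(Attempt latest, String expectedAttemptId)
            throws PartitionProducerDisposedException {
        if (!latest.attemptId.equals(expectedAttemptId)) {
            throw new PartitionProducerDisposedException(expectedAttemptId);
        }
        return latest.state;
    }

    /** Hypothetical consumer side: cancels itself when the producer is disposed. */
    static final class ConsumerTask {
        boolean canceled;

        void cancelExecution() {
            canceled = true;
        }

        void onProducerStateRequest(Attempt latest, String expectedAttemptId) {
            try {
                // A real task would act on the returned state; the sketch ignores it.
                requestProducerState(latest, expectedAttemptId);
            } catch (PartitionProducerDisposedException e) {
                cancelExecution();
            }
        }
    }

    public static void main(String[] args) {
        Attempt latest = new Attempt("attempt-2", ExecutionState.RUNNING);
        ConsumerTask task = new ConsumerTask();
        // The consumer still expects the older attempt, so it cancels itself.
        task.onProducerStateRequest(latest, "attempt-1");
        System.out.println("canceled = " + task.canceled);
    }
}
```

The point of the sketch is that a mismatch between the latest attempt and the expected attempt is enough to decide the outcome, so no lookup of earlier attempts is needed.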
[GitHub] flink issue #2913: [backport] [FLINK-5114] [network] Handle partition produc...
Github user uce commented on the issue: https://github.com/apache/flink/pull/2913
> Why is that necessary? Can we not just assume that if the attempt is not equal to the current execution attempt, then the status is some form of "disposed"?

It's not necessary. It's perfectly fine to do it as you describe. If the `currentExecution` is not the producer execution, the producer was restarted (and hence cancelled or failed). Searching the prior attempts only made the handling in `Task` easier, but that should not dictate this change in the `ExecutionVertex`. I'll change that to only check the `currentExecution` and handle it accordingly.
[GitHub] flink issue #2913: [backport] [FLINK-5114] [network] Handle partition produc...
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2913 Having a quick look at this: I think this breaks with a fundamental design in the ExecutionGraph: The `findExecutionAttemptWithId(...)` method searches the prior execution attempts. Why is that necessary? Can we not just assume that if the attempt is not equal to the current execution attempt, then the status is some form of "disposed"? If the produced result is finished, the execution will still not be in the "prior execution attempts". That can only happen once the task restarts, in which case you should not try to fetch the partition any more.
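The argument about current and prior attempts can be illustrated with a small model. It is a sketch under assumptions, not the actual `ExecutionVertex`: `VertexAttemptsSketch` and all of its members are hypothetical names, and the only behaviour modelled is that an attempt moves to the prior attempts when the vertex restarts.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; this is not the Flink ExecutionVertex.
final class VertexAttemptsSketch {

    enum State { RUNNING, FINISHED }

    /** Hypothetical execution attempt, identified by its attempt number. */
    static final class Attempt {
        final int number;
        State state = State.RUNNING;

        Attempt(int number) {
            this.number = number;
        }
    }

    private final List<Attempt> priorAttempts = new ArrayList<>();
    private Attempt current = new Attempt(0);

    Attempt current() {
        return current;
    }

    /** Finishing does not move the attempt; it stays the current one. */
    void finishCurrent() {
        current.state = State.FINISHED;
    }

    /** Only a restart archives the current attempt and creates a new one. */
    void restart() {
        priorAttempts.add(current);
        current = new Attempt(current.number + 1);
    }

    boolean isCurrent(Attempt attempt) {
        return attempt == current;
    }

    boolean isPrior(Attempt attempt) {
        return priorAttempts.contains(attempt);
    }

    public static void main(String[] args) {
        VertexAttemptsSketch vertex = new VertexAttemptsSketch();
        Attempt producer = vertex.current();
        vertex.finishCurrent();
        System.out.println("finished, still current: " + vertex.isCurrent(producer));
        vertex.restart();
        System.out.println("restarted, now prior: " + vertex.isPrior(producer));
    }
}
```

In this model a finished producer is still found by comparing against the current attempt, and an attempt only becomes a prior attempt after a restart, at which point its partition should no longer be requested.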