zhijiangW commented on issue #8463: [FLINK-12530][network] Move Task.inputGatesById to NetworkEnvironment
URL: https://github.com/apache/flink/pull/8463#issuecomment-495561142

Thanks for the explanation @azagrebin!

Setting the implementation details aside, we should think through the partition checker logic and make clear which component owns it. The points below are just my personal thoughts and may not be entirely correct:

- While requesting a partition, `RemoteInputChannel/InputGate` might receive a `PartitionNotFoundException`, so `RemoteInputChannel/InputGate` should decide how to handle it. It could either rethrow the exception directly and fail the task, or query the partition's state further before making the final decision.
- The checker/query should target the partition's state, not the producer's state. The producer's state may be `FINISHED` while the partition's state is already `RELEASED`, so only the partition's state can lead to the right decision.
- `ShuffleMaster` could provide a way to query a partition's state as a future, just as `ShuffleMaster` would communicate with `ShuffleService` to release a partition. For a simple implementation, we could use the RPC between TM/JM for this communication.

If the partition checker refactoring in this PR might not be the final direction, it is better not to touch it now, since it is not closely related to moving `inputGatesById`. Alternatively, we could proceed step by step and keep the current refactoring in this PR.
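
To make the three points above more concrete, here is a rough sketch of how the consumer side might react to a `PartitionNotFoundException`. Everything in it is hypothetical and only for illustration: `PartitionState`, `Decision`, `PartitionStateQuery`, and `onPartitionNotFound` are not existing Flink types or methods, and the mapping from state to decision (retry while the partition may still become readable, fail once it is released or the query itself fails) is an assumption, not an agreed design.

```java
import java.util.concurrent.CompletableFuture;

/** Illustrative sketch only; none of these names are Flink's actual API. */
final class PartitionStateSketch {

    /** Hypothetical lifecycle of a partition, tracked separately from the producer's state. */
    enum PartitionState { PENDING, AVAILABLE, RELEASED }

    /** Hypothetical reactions of the input channel / input gate. */
    enum Decision { RETRY_REQUEST, FAIL_TASK }

    /** Hypothetical query hook, e.g. served by ShuffleMaster over the TM/JM RPC. */
    interface PartitionStateQuery {
        CompletableFuture<PartitionState> queryPartitionState(String partitionId);
    }

    /** Assumed policy: retry while the partition can still be read, otherwise fail. */
    static Decision decide(PartitionState state) {
        switch (state) {
            case PENDING:
            case AVAILABLE:
                return Decision.RETRY_REQUEST;
            default:
                // RELEASED: the data is gone even if the producer is FINISHED.
                return Decision.FAIL_TASK;
        }
    }

    /** Called after the partition request failed with "partition not found". */
    static CompletableFuture<Decision> onPartitionNotFound(
            String partitionId, PartitionStateQuery query) {
        return query.queryPartitionState(partitionId)
                .thenApply(PartitionStateSketch::decide)
                // If the state query itself fails, fall back to failing the task.
                .exceptionally(error -> Decision.FAIL_TASK);
    }

    public static void main(String[] args) {
        PartitionStateQuery released =
                id -> CompletableFuture.completedFuture(PartitionState.RELEASED);
        // prints FAIL_TASK
        System.out.println(onPartitionNotFound("partition-1", released).join());
    }
}
```

The decision depends only on the partition's state, so a `FINISHED` producer with a `RELEASED` partition fails the task instead of retrying forever. Returning a future keeps the query asynchronous, in line with the idea of `ShuffleMaster` answering over RPC.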
