zhijiangW commented on issue #8463: [FLINK-12530][network] Move 
Task.inputGatesById to NetworkEnvironment
URL: https://github.com/apache/flink/pull/8463#issuecomment-495561142
 
 
   Thanks for the explanation @azagrebin !
   
   Setting implementation details aside, we should think through the partition 
checker logic and make clear which component owns it. The following is just my 
personal view and may not be entirely correct:
   
   - When requesting a partition, `RemoteInputChannel/InputGate` might receive a 
`PartitionNotFoundException`, so `RemoteInputChannel/InputGate` should decide 
how to handle it. It could either rethrow the exception directly to fail the 
task, or first query the partition's state and then make the final decision.
   
   - The checker/query should target the partition's state, not the producer's 
state. The producer state may be `FINISHED` while the partition state is 
already `RELEASED`, so only the partition's state can give the right decision.
   
   - `ShuffleMaster` could provide a way to query a partition's state as a 
future, just as `ShuffleMaster` would communicate with `ShuffleService` to 
release a partition. For a simple implementation, we could use the RPC between 
TM/JM for this communication.
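   
   To make the three points above a bit more concrete, here is a rough sketch. 
All names in it (`PartitionStateSketch`, `PartitionState`, `Decision`, 
`PartitionStateQuery`, `onPartitionNotFound`) are hypothetical and not existing 
Flink APIs, and the "fail only if released, otherwise retry" rule is just an 
assumption for illustration:

```java
import java.util.concurrent.CompletableFuture;

public class PartitionStateSketch {

    /** Hypothetical partition states; not Flink's actual types. */
    enum PartitionState { CREATED, AVAILABLE, RELEASED }

    /** Hypothetical outcome after a PartitionNotFoundException. */
    enum Decision { RETRY_REQUEST, FAIL_TASK }

    /**
     * Hypothetical query interface. In the idea above this would be backed by
     * ShuffleMaster, e.g. reached through the TM/JM RPC.
     */
    interface PartitionStateQuery {
        CompletableFuture<PartitionState> queryPartitionState(String partitionId);
    }

    /**
     * Decides based on the partition's state, not the producer's state.
     * Assumption: a released partition can never be consumed, so the task
     * fails; in any other state the request is simply retried.
     */
    static CompletableFuture<Decision> onPartitionNotFound(
            PartitionStateQuery query, String partitionId) {
        return query.queryPartitionState(partitionId)
                .thenApply(state -> state == PartitionState.RELEASED
                        ? Decision.FAIL_TASK
                        : Decision.RETRY_REQUEST);
    }

    public static void main(String[] args) {
        // Stub standing in for the JM side: the producer may well be FINISHED,
        // but the partition itself has already been released.
        PartitionStateQuery stub =
                id -> CompletableFuture.completedFuture(PartitionState.RELEASED);
        System.out.println(onPartitionNotFound(stub, "partition-1").join());
    }
}
```

   The input channel/gate only sees the returned future, so whether the answer 
comes from the JM over RPC or from somewhere else stays behind the interface.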
   
   If the partition checker refactoring in this PR might not be the final 
direction, it would be better not to touch it now, since it is not closely 
related to the scope of moving `inputGatesById`. Alternatively, we could move 
forward step by step and keep the current refactoring in this PR.
   
