zhuzhurk commented on pull request #13628:
URL: https://github.com/apache/flink/pull/13628#issuecomment-708509757


   > A somewhat related question concerning the `ExecutionSlotSharingGroup` is 
the following: How will the `ExecutionSlotSharingGroup` be calculated if we 
have the following JobGraph:
   > 
   > ```
   > v1 --> v2
   > v3 --> v4
   > ```
   > 
   > here `v1, v2, v3, v4` are `JobVertices` and `-->` is a blocking data 
exchange. Given this specification we can say that `v1` and `v2` cannot 
share slots because of the blocking data exchange. The same applies to `v3` 
and `v4`. However, `v3` or `v4` could share the slot with `v1`. How is this 
solved? Will the effective execution slot sharing group for `v1` contain `v3` 
and `v4` or will we say `v1` can share with `v3` but not `v4`?
   
   `ExecutionSlotSharingGroup` depends on how the `SlotSharingGroup` is assigned 
to each `JobVertex`. That is, if `v1`, `v2`, `v3`, and `v4` are in the same 
`SlotSharingGroup`, then we can have an `ExecutionSlotSharingGroup` 
{ev11, ev21, ev31, ev41}. So I think your question is really about how we set 
the `SlotSharingGroup` of job vertices.
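
   To make the {ev11, ev21, ev31, ev41} notation concrete, here is a minimal 
sketch, not Flink's actual implementation. It assumes all vertices in the 
`SlotSharingGroup` have the same parallelism and that subtask `i` of every 
vertex lands in the `i`-th group; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ExecutionSlotSharingSketch {

    /**
     * Simplified illustration (assumption: equal parallelism, index-based grouping).
     * Execution names follow the comment's notation: "ev" + vertex number + subtask number.
     */
    static List<List<String>> groupBySubtaskIndex(List<String> vertexNumbers, int parallelism) {
        List<List<String>> groups = new ArrayList<>();
        for (int subtask = 1; subtask <= parallelism; subtask++) {
            List<String> group = new ArrayList<>();
            for (String vertexNumber : vertexNumbers) {
                group.add("ev" + vertexNumber + subtask);
            }
            groups.add(group);
        }
        return groups;
    }

    public static void main(String[] args) {
        // v1..v4 in one SlotSharingGroup, each with parallelism 2.
        // prints [[ev11, ev21, ev31, ev41], [ev12, ev22, ev32, ev42]]
        System.out.println(groupBySubtaskIndex(Arrays.asList("1", "2", "3", "4"), 2));
    }
}
```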
   The default `SlotSharingGroup` of job vertices is currently set 
differently for streaming jobs (including DataStream and Table/SQL streaming), 
blink planner batch jobs, and DataSet jobs:
    - For streaming jobs, all job vertices are by default in the same "default" 
`SlotSharingGroup`. 
    - For blink planner batch jobs, each logical pipelined region corresponds 
to a `SlotSharingGroup`.
    - For DataSet jobs, all job vertices are by default in the same "default" 
`SlotSharingGroup`. This is not ideal, and I think it should be changed to 
align with blink planner batch jobs.
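
   As a rough illustration of the "one `SlotSharingGroup` per logical pipelined 
region" rule applied to the `v1 --> v2`, `v3 --> v4` example, here is a 
self-contained sketch. It is not Flink code: the class, the edge encoding, and 
the `region-...` group names are all hypothetical. Vertices connected by 
pipelined edges are merged into one region; blocking edges never merge.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SlotSharingGroupAssignmentSketch {

    /** Follows parent links to the representative vertex of a region. */
    static String find(Map<String, String> parent, String vertex) {
        while (!parent.get(vertex).equals(vertex)) {
            vertex = parent.get(vertex);
        }
        return vertex;
    }

    /**
     * Each edge is {source, target, "PIPELINED" or "BLOCKING"}.
     * Returns vertex -> group name, with one group per pipelined region.
     */
    static Map<String, String> groupPerPipelinedRegion(String[] vertices, String[][] edges) {
        Map<String, String> parent = new LinkedHashMap<>();
        for (String vertex : vertices) {
            parent.put(vertex, vertex);
        }
        for (String[] edge : edges) {
            if ("PIPELINED".equals(edge[2])) {
                // Merge the two regions; blocking edges are skipped.
                parent.put(find(parent, edge[1]), find(parent, edge[0]));
            }
        }
        Map<String, String> groups = new LinkedHashMap<>();
        for (String vertex : vertices) {
            groups.put(vertex, "region-" + find(parent, vertex));
        }
        return groups;
    }

    public static void main(String[] args) {
        String[] vertices = {"v1", "v2", "v3", "v4"};
        String[][] edges = {{"v1", "v2", "BLOCKING"}, {"v3", "v4", "BLOCKING"}};
        // prints {v1=region-v1, v2=region-v2, v3=region-v3, v4=region-v4}
        System.out.println(groupPerPipelinedRegion(vertices, edges));
    }
}
```

   Under this rule every vertex in the example ends up in its own group, so 
`v1` would share a slot with neither `v3` nor `v4`. With the streaming or 
DataSet default, all four vertices would instead be in the single "default" 
group.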
   
   Regarding whether we should put `v1` and `v3` in the same `SlotSharingGroup`, 
I do not think it is necessary, and it can complicate things. It is not 
necessary because letting them share a slot would not improve input locality, 
and, unlike for streaming jobs, it would not simplify resource reasoning.

