eaglewatcherwb commented on issue #8309: [FLINK-12229] [runtime] Implement LazyFromSourcesSchedulingStrategy
URL: https://github.com/apache/flink/pull/8309#issuecomment-490734729
 
 
   Hi @GJL @tillrohrmann, I have updated the PR based on the latest master. Would you mind starting another round of code review?
   
   As discussed in the first round of code review, there are still some open questions; I have summarized them here for convenience of discussion.
   
   1. [[SchedulePolicy](https://github.com/apache/flink/pull/8309#discussion_r281049742)] Scheduling vertices via vertex state transitions has the benefit of avoiding a flood of `onPartitionConsumable` notifications, but a consumer of a PIPELINED result would then sit idle waiting for its producer to finish. So, I think we could keep the benefit by relying on both vertex state transitions and `onPartitionConsumable` notifications (a minimal sketch follows this list):
        1) `DeploymentOption#sendScheduleOrUpdateConsumerMessage` is set to true if the vertex has a PIPELINED produced result partition, and to false if all of its produced result partitions are BLOCKING.
        2) Vertices whose input result partitions are BLOCKING are scheduled via vertex state transitions.
        3) Vertices whose input result partitions are PIPELINED are scheduled via `onPartitionConsumable` notifications.
   
   2. [[JobGraph Usage](https://github.com/apache/flink/pull/8309#discussion_r281037134)] The only usage of `JobGraph` is to provide the `InputDependencyConstraint` in `LazyFromSourcesSchedulingStrategy`; it is not used in `EagerSchedulingStrategy` at all. Maybe we could remove `JobGraph` from `SchedulingStrategyFactory#createInstance` and move the `InputDependencyConstraint` information into `SchedulingTopology`, which would need a new method on `SchedulingTopology` (see the second sketch after this list):
   `InputDependencyConstraint getInputDependencyConstraint(JobVertexID jobVertexId)`?
   
   3. [[ANY/ALL Schedule Granularity](https://issues.apache.org/jira/browse/FLINK-12229)] In the original scheduler, the scheduling granularity is ANY/ALL of an IntermediateDataSet finishing. Scheduling at result-partition granularity could speed up deployments, but it may also cause a flood of partition-update network messages and resource deadlocks. Thus, in this PR my implementation is consistent with the original logic.
   However, we are wondering whether there is a way to keep the deployment speedup while still avoiding the flood of partition updates and the resource deadlocks. Based on our production experience, we propose to introduce a new trigger, `InputDependencyConstraint#Progress`: a float between 0.0 and 1.0 that identifies the fraction of input result partitions that must be finished. 1.0 means ALL input result partitions must finish; we configured it to 0.8 by default to balance the deployment speedup against the partition-update flood and possible resource deadlocks (a third sketch follows this list).
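
   To make point 1 concrete, here is a minimal sketch of rule 1), deciding `DeploymentOption#sendScheduleOrUpdateConsumerMessage` from the produced partition types. The type names and accessors below are simplified stand-ins for the PR's `SchedulingTopology` interfaces, not the actual API:

```java
import java.util.List;

// Simplified stand-ins for the scheduling topology types (names are assumptions).
enum PartitionType { PIPELINED, BLOCKING }

interface ProducedPartition {
    PartitionType getPartitionType();
}

interface Vertex {
    List<ProducedPartition> getProducedResultPartitions();
}

final class DeploymentOptionDecider {
    // Rule 1): request consumer-update notifications only when the vertex
    // produces at least one PIPELINED partition; purely BLOCKING producers
    // are scheduled through vertex state transitions instead (rule 2).
    static boolean sendScheduleOrUpdateConsumerMessage(Vertex vertex) {
        return vertex.getProducedResultPartitions().stream()
                .anyMatch(p -> p.getPartitionType() == PartitionType.PIPELINED);
    }
}
```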
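
   For point 2, a sketch of the proposed `SchedulingTopology` addition (all existing methods elided; the import paths are my assumption of where the current classes live):

```java
import org.apache.flink.api.common.InputDependencyConstraint;
import org.apache.flink.runtime.jobgraph.JobVertexID;

// Proposed new method on SchedulingTopology (existing methods elided). With this
// in place, SchedulingStrategyFactory#createInstance could drop its JobGraph
// parameter, since the strategy would query the topology directly.
public interface SchedulingTopology {
    InputDependencyConstraint getInputDependencyConstraint(JobVertexID jobVertexId);
}
```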
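
   For point 3, a minimal sketch of how the proposed `Progress` trigger could be evaluated. The `InputPartition` abstraction and its `isFinished()` accessor are illustrative assumptions, not part of the PR:

```java
import java.util.List;

// Illustrative input abstraction (an assumption, not the PR's API).
interface InputPartition {
    boolean isFinished();
}

final class ProgressTrigger {
    private final double threshold; // 1.0 behaves like ALL; the proposed default is 0.8

    ProgressTrigger(double threshold) {
        this.threshold = threshold;
    }

    // A vertex becomes schedulable once the finished fraction of its input
    // result partitions reaches the configured threshold.
    boolean isSchedulable(List<InputPartition> inputs) {
        long finished = inputs.stream().filter(InputPartition::isFinished).count();
        return (double) finished / inputs.size() >= threshold;
    }
}
```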
   
