mridulm commented on PR #56055: URL: https://github.com/apache/spark/pull/56055#issuecomment-4610662542
> Can you elaborate? For example, this integration could be modeled as I described above - submitting map stages with streaming shuffle wired up between the stages.

> Sorry to hear that! Can you help me understand better what is not clear to you?

The proposed scheduler codifies expectations specific to this implementation rather than generic constructs. In the default DAGScheduler, there is a rationale for why a child waits for its parent before it starts. Here, it is unclear why or when a child can start and when it cannot. The decisions appear to be driven by RTF implementation details and are not robustly defined, which makes the scheduler less extensible to other use cases.

As an example, barrier scheduling was designed for MPI-based apps, but its constructs in the scheduler are generic and applicable to other use cases as well.
