jerrypeng commented on PR #56055: URL: https://github.com/apache/spark/pull/56055#issuecomment-4667167414
@mridulm the implementation presented in this PR is not really a workaround; it is a working solution for the needs of RTM. I agree that we can eventually design a more natively integrated solution that provides more generic functionality. However, can we approach that incrementally?

My approach to software development is iterative. I would like to first introduce something that works for the RTM use case while minimizing risk to existing Spark use cases. That is what this PR is trying to do. The changes are intentionally scoped so that we can test RTM end-to-end without requiring a larger DAGScheduler redesign upfront. I would rather get something working first, validate it end-to-end, and then iteratively refine the abstractions.

Building a more generic framework may take time, and I am happy to work toward that, but I do not think we should block RTM progress on having the fully generalized design in place from day one. Let me know what you think. Regardless, I am going to look into how to natively build this functionality in the DAGScheduler.
