I think in this case, the original design, which predates this document, was implemented on the Spark on K8s fork, which we spent some time building separately before proposing that the fork be merged into the main line.
Specifically, the timeline of events was:

1. We started building Spark on Kubernetes on a fork and were prepared to merge our work directly into master.
2. Discussion on https://issues.apache.org/jira/browse/SPARK-18278 led us to move down the path of working on a fork first. We would harden the fork and have it become more widely used, to prove its value and robustness in practice. See https://github.com/apache-spark-on-k8s/spark
3. On said fork, we made the original design decision to use a step-based builder pattern for the driver, but not the same design for the executors. These original discussions took place among the collaborators of the fork, as much of the work on the fork in general was not done on the mailing list.
4. We eventually decided to merge the fork into the main line, and got the feedback in the corresponding PRs.

Therefore the question may be less about this specific design and more about whether the overarching approach we took, building Spark on K8s on a fork first before merging into mainline, was the correct one in the first place. There’s also the issue that the work done on the fork was isolated from the dev mailing list.

Moving forward, as we push our work into mainline Spark, we aim to be transparent with the Spark community via the Spark mailing list and Spark JIRA tickets. We’re specifically aiming to deprecate the fork and migrate all the work done on the fork into the main line.
-Matt Cheah

From: Mark Hamstra <m...@clearstorydata.com>
Date: Monday, February 5, 2018 at 1:44 PM
To: Matt Cheah <mch...@palantir.com>
Cc: "dev@spark.apache.org" <dev@spark.apache.org>, "ramanath...@google.com" <ramanath...@google.com>, Ilan Filonenko <i...@cornell.edu>, Erik <e...@redhat.com>, Marcelo Vanzin <van...@cloudera.com>
Subject: Re: Spark on Kubernetes Builder Pattern Design Document

That's good, but you should probably stop and consider whether the discussions that led up to this document's creation could have taken place on this dev list -- because if they could have, then they probably should have as part of the whole spark-on-k8s project becoming part of mainline spark development, not a separate fork.

On Mon, Feb 5, 2018 at 1:17 PM, Matt Cheah <mch...@palantir.com> wrote:

Hi everyone,

While we were building the Spark on Kubernetes integration, we realized that some of the abstractions we introduced for building the driver application in spark-submit, and building executor pods in the scheduler backend, could be improved for better readability and clarity. We received feedback in this pull request in particular. In response to this feedback, we’ve put together a design document that proposes a possible refactor to address the given feedback.

You may comment on the proposed design at this link:
https://docs.google.com/document/d/1XPLh3E2JJ7yeJSDLZWXh_lUcjZ1P0dy9QeUEyxIlfak/edit#

I hope that we can have a productive discussion and continue improving the Kubernetes integration further.

Thanks,

-Matt Cheah