It seems we have two standard practices for resource distribution in place
here:

- the Spark way: the application (Spark) distributes resources *during* app
execution, by exposing files/jars on an HTTP server on the driver (or
pre-staging them elsewhere), with executors downloading them from that
location (driver or remote)
- the Kubernetes way: the cluster manager (Kubernetes) distributes resources
*before* app execution, primarily via Docker images and secondarily through
init containers for non-image resources. I'd imagine one motivation for this
choice on Kubernetes's part is immutability of the application at runtime.
(A rough sketch of both approaches follows this list.)
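
To make the contrast concrete, here's a minimal sketch of how each approach
surfaces in Spark configuration, assuming the standard spark.jars/spark.files
settings and an illustrative image name on the Kubernetes side (the URIs and
names below are hypothetical, not a definitive setup):

    import org.apache.spark.SparkConf

    // Spark-side distribution: the driver serves (or resolves) these URIs
    // and executors fetch them at runtime. URIs are purely illustrative.
    val sparkWay = new SparkConf()
      .set("spark.jars", "http://repo.example.com/libs/dep.jar")
      .set("spark.files", "hdfs:///data/lookup.csv")

    // Kubernetes-side distribution: dependencies are baked into the image
    // that the cluster manager pulls before the pod starts (image name is
    // hypothetical).
    val k8sWay = new SparkConf()
      .set("spark.kubernetes.container.image", "registry.example.com/spark-app:1.0")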

When the Spark and K8s standard practices are in conflict (as they seem to
be here), which convention should be followed?

Looking at the Spark-on-YARN integration, there's almost a parallel here
with spark.yarn.archive, which configures the cluster (YARN) to do the
distribution pre-runtime instead of the application doing it mid-runtime.
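
For reference, a minimal sketch of that YARN parallel, assuming a
pre-staged archive of Spark jars on HDFS (the path is illustrative):

    import org.apache.spark.SparkConf

    // spark.yarn.archive points at a pre-staged archive of Spark's jars on
    // HDFS; YARN localizes it into containers before the app runs, instead
    // of Spark uploading jars from the client for every application.
    val conf = new SparkConf()
      .set("spark.yarn.archive", "hdfs:///apps/spark/spark-libs.zip")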

Based purely on the lines of code removed, right now I lean towards
eliminating init containers. Credential segregation between the init
container and the main pod container doesn't seem that valuable right now,
and retryability of downloads could/should live in all of Spark's cluster
managers, not just the Kubernetes one.

So I support Anirudh's suggestion to move towards bringing the change
demonstrated in Marcelo's POC into master.

On Wed, Jan 10, 2018 at 3:00 PM, Anirudh Ramanathan <
ramanath...@google.com.invalid> wrote:

> Thanks for this discussion everyone. It has been very useful in getting an
> overall understanding here.
> I think in general, consensus is that this change doesn't introduce
> behavioral changes, and it's definitely an advantage to reuse the
> constructs that Spark provides to us.
>
> Moving on to a different question here - of pushing this through to Spark.
> The init containers have been tested over the past two Spark releases by
> external users and integration testing - and this would be a fundamental
> change to that behavior.
> We should work on getting enough test coverage and confidence here.
>
> We can start by getting a PR going perhaps, and start augmenting the
> integration testing to ensure that there are no surprises - with/without
> credentials, accessing GCS, S3 etc as well.
> When we get enough confidence and test coverage, let's merge this in.
> Does that sound like a reasonable path forward?
>
>
>
> On Wed, Jan 10, 2018 at 2:53 PM, Marcelo Vanzin <van...@cloudera.com>
> wrote:
>
>> On Wed, Jan 10, 2018 at 2:51 PM, Matt Cheah <mch...@palantir.com> wrote:
>> > those sidecars may perform side effects that are undesirable if the
>> main Spark application failed because dependencies weren’t available
>>
>> If the contract is that the Spark driver pod does not have an init
>> container, and the driver handles its own dependencies, then by
>> definition that situation cannot exist.
>>
>> --
>> Marcelo
>>
>
>
>
> --
> Anirudh Ramanathan
>
