Github user mccheah commented on the issue:
https://github.com/apache/spark/pull/20669
Hm, noted that we're making this tradeoff. We have an internal use case
where we push a custom logging properties file into the container using
`spark.files`. The logging properties file needs to be in the container before
the JVM starts so that the appenders are configured from the get-go, but
logging configuration is relatively dynamic and probably doesn't belong in a
statically built Docker image. We primarily use YARN cluster mode and rely on
its file distribution, and we migrated to the fork's implementation of
Kubernetes without having to change our internal setup.
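For concreteness, here's roughly what that setup looks like (paths and the
application class are illustrative, not our actual values):

```bash
# Sketch of the pattern: --files ships log4j.properties into each YARN
# container's working directory, which is on the classpath before the
# JVM configures logging, so the appenders are right from the start.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --files /etc/spark/log4j.properties \
  --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
  --class com.example.MyApp \
  my-app.jar
```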
I think we can adapt to this change, but I don't think the use case I've
described is as uncommon as one might think. There's plenty of lower-level
tooling out there that requires the JVM to load files during static
initialization.
> Oh, btw, if you think that is a really, really important feature, you
still don't need an init container for that. You can just run the dependency
download tool before you run spark-submit in the driver container. Problem
solved.
Agreed. Init-containers are but one option to support this. The question
was more whether running spark-submit in client mode is completely sufficient,
which it seems it isn't in this specific case.
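To sketch the alternative for anyone following along: something like the
entrypoint below would cover it. `fetch-deps` and the environment variable are
hypothetical stand-ins for whatever download tool and wiring you'd actually
use; the point is just the ordering.

```bash
#!/usr/bin/env bash
# Hypothetical driver-container entrypoint that avoids an init-container:
# fetch remote files onto local disk first, then exec spark-submit, so
# log4j (or any static initializer) sees the files when the JVM starts.
set -euo pipefail

# 'fetch-deps' is a stand-in for the dependency download tool.
fetch-deps --uris "$SPARK_REMOTE_FILES" --dest /opt/spark/work-dir

exec /opt/spark/bin/spark-submit "$@"
```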