We've also had some similar disk fill issues.
For Java/Scala RDDs, shuffle file cleanup is done as part of the JVM
garbage collection. I've noticed that if RDDs are still referenced in the
code, and so cannot be garbage collected, their intermediate shuffle files hang
around.
The best way to handle this is to drop references to RDDs (or explicitly unpersist them) as soon as they're no longer needed, so they become eligible for garbage collection and the ContextCleaner can remove the associated shuffle files.
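A minimal sketch of that pattern (this assumes a live SparkContext `sc`, e.g. from spark-shell; the input path is a placeholder):

```scala
// Sketch only: requires a running Spark cluster / spark-shell session.
var counts = sc.textFile("hdfs:///data/input")  // placeholder path
  .flatMap(_.split("\\s+"))
  .map((_, 1))
  .reduceByKey(_ + _)   // introduces shuffle files on the executors

counts.count()          // run the job

// Once done, release the reference so the RDD can be garbage
// collected and the ContextCleaner can remove its shuffle files:
counts.unpersist()
counts = null
```

Holding `counts` in a long-lived field (or a closure captured by a Future) is exactly the situation described above: the RDD stays reachable, so its shuffle files are never cleaned.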
I will agree that the side effects of using Futures in driver code tend to
be tricky to track down.
If you forget to clear the job description and job group information, the
LocalProperties on the SparkContext remain intact, and
SparkContext#submitJob passes those local properties down to subsequent
jobs, so they end up attributed to the old job group.
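A common defensive pattern (a sketch, assuming a live SparkContext `sc`; the group name, description, and path are placeholders) is to scope the job group and clear it in the same thread, even on failure:

```scala
import scala.concurrent.{ExecutionContext, Future}
import ExecutionContext.Implicits.global

// Sketch only: set the group, run the job, and clear the group in a
// finally block so this thread's LocalProperties don't leak into
// whatever job this thread runs next.
val result: Future[Long] = Future {
  sc.setJobGroup("nightly-report", "placeholder description", interruptOnCancel = true)
  try {
    sc.textFile("hdfs:///data/input").count()  // placeholder job
  } finally {
    sc.clearJobGroup()  // otherwise the group sticks to the thread
  }
}
```

Since local properties are per-thread and Futures reuse pool threads, skipping the `finally` is how unrelated jobs silently inherit a stale job group.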
Unless there were reasons for not doing it that way from the beginning?
>
> thanks,
> Michel
>
> On Thursday, July 2, 2020 at 00:43:25 UTC+1, Edward Mitchell <
> edee...@gmail.com> wrote:
>
Okay, I see what's going on here.
Looks like the way that Spark is coded, the driver container image
(specified by --conf
spark.kubernetes.driver.container.image) and executor container image
(specified by --conf
spark.kubernetes.executor.container.image) are required.
If they're not specified (and no shared fallback is given via
spark.kubernetes.container.image), spark-submit errors out.
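For reference, a sketch of a submission that sets both images explicitly (the master URL, namespace, registry, image tags, class, and jar path below are all placeholders, not values from this thread):

```shell
# Sketch only: every name here is a placeholder for your own cluster.
spark-submit \
  --master k8s://https://my-cluster:6443 \
  --deploy-mode cluster \
  --conf spark.kubernetes.namespace=spark \
  --conf spark.kubernetes.driver.container.image=registry.example.com/spark:3.0.0 \
  --conf spark.kubernetes.executor.container.image=registry.example.com/spark:3.0.0 \
  --class org.example.Main \
  local:///opt/spark/app.jar
```

Setting spark.kubernetes.container.image instead configures a single image for both roles; the driver- and executor-specific confs override it when you need different images.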