Theodore Vasiloudis created SPARK-5838:
------------------------------------------
Summary: Changing SPARK_LOCAL_DIRS option in spark-env.sh does not
take effect without daemon restart
Key: SPARK-5838
URL: https://issues.apache.org/jira/browse/SPARK-5838
Project: Spark
Issue Type: Bug
Components: Deploy, EC2, Spark Submit
Affects Versions: 1.1.1
Reporter: Theodore Vasiloudis
Priority: Minor
This issue has already been raised on the mailing list:
http://apache-spark-user-list.1001560.n3.nabble.com/set-spark-local-dir-on-driver-program-doesn-t-take-effect-td11040.html
The problem is usually that Spark creates too many files during shuffles,
which fills up the small amount of disk space that most EC2 instances
have for root on /mnt2.
The workaround is to set SPARK_LOCAL_DIRS to a larger volume (e.g. to the
/mnt/spark volume only, removing /mnt2).
However, for these changes to take effect, the daemons need to be restarted
(sbin/stop-all.sh followed by sbin/start-all.sh).
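As a rough sketch of the workaround and restart described above: this
assumes that SPARK_HOME points at the Spark install directory (not stated
in this report) and that conf/spark-env.sh has been updated on every node.

```shell
# In conf/spark-env.sh on every node: keep only the larger volume.
export SPARK_LOCAL_DIRS="/mnt/spark"

# Then, on the master, restart the standalone daemons so that they
# re-read spark-env.sh. SPARK_HOME is an assumed variable here.
"$SPARK_HOME"/sbin/stop-all.sh
"$SPARK_HOME"/sbin/start-all.sh
```

Jobs submitted after the restart should then pick up the new local
directory.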
Even more troubling, the Web UI (Environment tab) reports that
spark.local.dir is set to the new path, but Spark still spills to /mnt2 as well.
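One way to check where Spark is actually spilling, rather than relying on
the Web UI, is to look for its scratch directories on each volume. The
helper below is a hypothetical sketch: the function name is made up, and
the spark-local-* directory prefix is an assumption that may differ
between Spark versions.

```shell
# Print the disk usage of Spark scratch directories under each given root.
# Assumption: scratch directories are named spark-local-* (version dependent).
spark_scratch_usage() {
  for root in "$@"; do
    for d in "$root"/spark-local-*; do
      # Skip the unexpanded glob when a root has no matching directory.
      [ -d "$d" ] && du -sh "$d"
    done
  done
  return 0
}
```

Running `spark_scratch_usage /mnt/spark /mnt2` on a worker while a job is
shuffling would then show whether anything is still being written under
/mnt2.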
To my knowledge this is not mentioned anywhere in the documentation or any
other mailing list reply except for the one I linked.
Possible solutions are either to ensure the change does take effect, so
that reality agrees with what the Web UI reports, or to add a section to
the EC2 documentation covering this kind of problem.