Theodore Vasiloudis created SPARK-5838:
------------------------------------------

             Summary: Changing SPARK_LOCAL_DIRS option in spark-env.sh does not take effect without daemon restart
                 Key: SPARK-5838
                 URL: https://issues.apache.org/jira/browse/SPARK-5838
             Project: Spark
          Issue Type: Bug
          Components: Deploy, EC2, Spark Submit
    Affects Versions: 1.1.1
            Reporter: Theodore Vasiloudis
            Priority: Minor


This issue has already been mentioned on the mailing list here: 
http://apache-spark-user-list.1001560.n3.nabble.com/set-spark-local-dir-on-driver-program-doesn-t-take-effect-td11040.html

The problem usually arises when Spark writes too many files during 
shuffles and fills up the limited disk space that most EC2 instances 
have for root on /mnt2.

The workaround is to set SPARK_LOCAL_DIRS to a larger volume (e.g. to the 
/mnt/spark volume only, removing /mnt2).
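
As a minimal sketch of that workaround, the relevant line in 
conf/spark-env.sh might look like the following. It assumes /mnt/spark is 
the larger volume on your instances; adjust the path to your own layout.

```shell
# conf/spark-env.sh (sketch; /mnt/spark is the volume named above)
# SPARK_LOCAL_DIRS takes a comma-separated list of scratch directories.
# /mnt2 is deliberately left out so shuffle files no longer land there.
export SPARK_LOCAL_DIRS="/mnt/spark"
```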

However, for these changes to take effect, the daemons need to be restarted by 
running sbin/stop-all.sh followed by sbin/start-all.sh. 
Even more troubling is the fact that the Web UI -> Environment page reports that 
spark.local.dir is set to the new path, but Spark still spills to /mnt2 as well.
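
The restart, and a quick check of where Spark is actually writing, could 
be scripted roughly as below. This is only a sketch: the helper names 
restart_spark_daemons and report_local_dir_usage are hypothetical, and the 
/root/spark default for SPARK_HOME is an assumption about a typical 
spark-ec2 cluster.

```shell
# Restart the standalone daemons so they re-read conf/spark-env.sh.
# Assumption: Spark lives in $SPARK_HOME, defaulting to /root/spark.
restart_spark_daemons() {
    spark_home="${SPARK_HOME:-/root/spark}"
    # Only start again if the stop step succeeded.
    "$spark_home/sbin/stop-all.sh" && "$spark_home/sbin/start-all.sh"
}

# Print the disk usage of each scratch directory passed as an argument,
# so you can see whether files still land on the old volume after a job.
report_local_dir_usage() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            du -sh "$dir"
        else
            echo "$dir: not present"
        fi
    done
}
```

On the master you would run restart_spark_daemons once after editing 
spark-env.sh, then run a job and call 
report_local_dir_usage /mnt/spark /mnt2 to compare what the Web UI reports 
with what is on disk.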

To my knowledge, this is not mentioned anywhere in the documentation or in any 
other mailing list reply except for the one I linked.

I guess possible solutions are either to ensure the change does take effect, so 
that reality agrees with what the Web UI is reporting, or to include a section in 
the EC2 documentation covering this kind of problem.



