I'm using Spark 1.2.2 on Mesos 0.21. I have a Java job that is submitted to Mesos from Marathon.
I also have cgroups configured for Mesos on each node. Even though the job uses only 512MB while running, it tries to take over 3GB at startup and is killed by cgroups.

mesos-slave is started like this (we use supervisord):

  command=/usr/sbin/mesos-slave --disk_watch_interval=10secs --gc_delay=480mins --isolation=cgroups/cpu,cgroups/mem --cgroups_hierarchy=/cgroup --resources="mem(*):3000;cpus(*):2;ports(*):[25000-30000];disk(*):5000" --cgroups_root=mesos --master=zk://prodMesosMaster01:2181,prodMesosMaster02:2181,prodMesosMaster03:2181/mesos --work_dir=/tmp/mesos --log_dir=/var/log/mesos

In cgconfig.conf:

  memory.limit_in_bytes="3221225472";

spark-submit from Marathon:

  bin/spark-submit --executor-memory 128m --master mesos://zk://prodMesosMaster01:2181,prodMesosMaster02:2181,prodMesosMaster03:2181/mesos --class com.company.alert.AlertConsumer AlertConsumer.jar --zk prodMesosMaster01:2181,prodMesosMaster02:2181,prodMesosMaster03:2181 --mesos mesos://zk://prodMesosMaster01:2181,prodMesosMaster02:2181,prodMesosMaster03:2181/mesos --spark_executor_uri http://prodmesosfileserver01/spark-dist/1.2.2/spark-dist-1.2.2.tgz

We increased the cgroup limit to 6GB and the mem resource from 3000 to 6000 in the mesos-slave startup, and cgroups no longer kills the job. But the question remains: how do I limit the job at startup so it doesn't try to take 3GB, when it only uses 512MB once it's running?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/spark-on-mesos-gets-killed-by-cgroups-for-too-much-memory-tp24769.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
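If the 3GB is being reserved by the JVMs at launch rather than by the workload itself, the usual knobs are the driver heap size, the executor heap size, and the per-executor overhead Spark advertises to Mesos. A hedged sketch of the relevant settings (the values here are hypothetical, and spark.mesos.executor.memoryOverhead is documented for later 1.x releases, so it may not be honored by 1.2.2):

```
# spark-defaults.conf (or the equivalent --driver-memory / --conf flags on spark-submit)
spark.driver.memory                  512m   # caps the driver JVM heap
spark.executor.memory                128m   # same as --executor-memory in the command above
spark.mesos.executor.memoryOverhead  128    # MB of non-heap headroom per executor; may require a newer Spark
```

Comparing what Mesos reports as allocated to the task (in the slave logs or web UI) against these values should show whether the 3GB comes from the JVM sizing or from somewhere else.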
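As a sanity check on the cgconfig.conf value above, the byte arithmetic works out like this (note that the Mesos mem resource is expressed in MB, so mem(*):3000 sits just under the cgroup limit):

```shell
# memory.limit_in_bytes="3221225472" is exactly 3 GiB:
echo $((3 * 1024 * 1024 * 1024))    # prints 3221225472

# the Mesos resource mem(*):3000 (MB) in bytes, just under that limit:
echo $((3000 * 1024 * 1024))        # prints 3145728000
```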