Re: Resource allocation in yarn-cluster mode
Hi Zsolt,

spark.executor.memory, spark.executor.cores, and spark.executor.instances are only honored when launching through spark-submit. Marcelo is working on a Spark launcher (SPARK-4924) that will make it possible to set these programmatically.

You're correct that the error comes up when yarn.scheduler.maximum-allocation-mb is exceeded. The reason Spark doesn't just fall back to a smaller amount of memory is that it could be surprising to the user to find out they're silently getting less memory than they requested. Also, I don't think YARN exposes this maximum up front, so Spark has no way to cap the request before submitting it.

-Sandy
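(For context: SPARK-4924's launcher later shipped as org.apache.spark.launcher.SparkLauncher. A minimal Java sketch of what it enables; the jar path and main class below are hypothetical placeholders:)

    import org.apache.spark.launcher.SparkLauncher;

    public class SubmitToYarn {
      public static void main(String[] args) throws Exception {
        // Launch a yarn-cluster submission; resource settings are passed as
        // Spark conf entries instead of spark-submit command-line flags.
        Process spark = new SparkLauncher()
            .setAppResource("/path/to/my-app.jar")        // hypothetical jar path
            .setMainClass("com.example.MyApp")            // hypothetical main class
            .setMaster("yarn-cluster")
            .setConf(SparkLauncher.EXECUTOR_MEMORY, "4g")
            .setConf(SparkLauncher.EXECUTOR_CORES, "2")
            .setConf("spark.executor.instances", "4")
            .launch();
        spark.waitFor();  // block until the submission process exits
      }
    }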
Re: Resource allocation in yarn-cluster mode
One more question: is there a reason why Spark throws an error when requesting too much memory, instead of capping it to the maximum value (as YARN would do by default)?

Thanks!
Resource allocation in yarn-cluster mode
Hi,

I'm using Spark in yarn-cluster mode and submitting the jobs programmatically from a Java client. I ran into a few issues when I tried to set the resource allocation properties.

1. Setting spark.executor.memory, spark.executor.cores, and spark.executor.instances appears to have no effect, because ClientArguments checks only for the command-line arguments (--num-executors, --executor-cores, etc.). Is it possible to use the properties in yarn-cluster mode instead of the command-line arguments?

2. My nodes have 5 GB of memory, but when I set --executor-memory to 4g (overhead 384m), I get an exception saying that the required executor memory is above the max threshold of this cluster. It looks like this threshold is the value of the yarn.scheduler.maximum-allocation-mb property. Is that correct?

Thanks,
Zsolt
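(For context, the submit-time check behind the exception in point 2 boils down to the arithmetic below. This is an illustrative Java sketch, not Spark's actual source; the 4096 MB cap is an assumption, since for the error to fire, yarn.scheduler.maximum-allocation-mb must be below 4096 + 384 = 4480 MB even though the nodes have 5 GB of physical memory:)

    public class MaxAllocationCheck {
      public static void main(String[] args) {
        int executorMemoryMb = 4096;   // --executor-memory 4g
        int overheadMb = 384;          // overhead cited in the thread
        int maxAllocationMb = 4096;    // assumed yarn.scheduler.maximum-allocation-mb

        // Illustrative mirror of the check Spark performs before submitting to YARN.
        if (executorMemoryMb + overheadMb > maxAllocationMb) {
          // 4096 + 384 = 4480 MB exceeds the assumed 4096 MB cap, so the request
          // is rejected outright rather than silently capped.
          throw new IllegalArgumentException("Required executor memory ("
              + executorMemoryMb + " MB) is above the max threshold ("
              + maxAllocationMb + " MB) of this cluster!");
        }
      }
    }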