Re: Resource allocation in yarn-cluster mode

2015-02-10 Thread Sandy Ryza
Hi Zsolt,

spark.executor.memory, spark.executor.cores, and spark.executor.instances
are only honored when launching through spark-submit.  Marcelo is working
on a Spark launcher (SPARK-4924) that will enable using these
programmatically.
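
Once that launcher lands, programmatic submission from Java should look
roughly like the sketch below (a sketch against the builder-style
org.apache.spark.launcher.SparkLauncher API from SPARK-4924; the Spark home,
jar path, and main class are placeholder values, not anything from this
thread):

import org.apache.spark.launcher.SparkLauncher;

public class LauncherSketch {
  public static void main(String[] args) throws Exception {
    // Placeholder Spark home, jar, and main class; substitute your own.
    Process spark = new SparkLauncher()
        .setSparkHome("/opt/spark")
        .setAppResource("/path/to/my-app.jar")
        .setMainClass("com.example.MyApp")
        .setMaster("yarn-cluster")
        .setConf(SparkLauncher.EXECUTOR_MEMORY, "2g")   // spark.executor.memory
        .setConf(SparkLauncher.EXECUTOR_CORES, "2")     // spark.executor.cores
        .setConf("spark.executor.instances", "4")
        .launch();
    spark.waitFor();
  }
}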

You're right that the error comes up when
yarn.scheduler.maximum-allocation-mb is exceeded.  The reason Spark doesn't
just use a smaller amount of memory is that it could be surprising to the
user to find out they're silently getting less memory than they requested.
Also, I don't think YARN exposes this limit up front, so Spark has no way to
check it beforehand.
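
To make the arithmetic concrete with the numbers from this thread: a 4g
executor plus 384m of overhead is a 4480 MB container request, and YARN
rejects any request above yarn.scheduler.maximum-allocation-mb. A minimal
sketch of that check, assuming the cluster's limit is 4096 MB (an assumed
value, not one posted here):

public class MemoryCheckSketch {
  public static void main(String[] args) {
    int executorMemoryMb = 4 * 1024;     // --executor-memory 4g
    int overheadMb = 384;                // executor memory overhead
    int requestedMb = executorMemoryMb + overheadMb;   // 4480 MB container request

    int maxAllocationMb = 4096;          // assumed yarn.scheduler.maximum-allocation-mb
    if (requestedMb > maxAllocationMb) {
      System.err.println("Required executor memory (" + requestedMb
          + " MB) is above the max threshold (" + maxAllocationMb
          + " MB) of this cluster");
    }
  }
}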

-Sandy

On Tue, Feb 10, 2015 at 8:38 AM, Zsolt Tóth toth.zsolt@gmail.com
wrote:

 One more question: Is there a reason why Spark throws an error when
 requesting too much memory instead of capping it at the maximum value (as
 YARN would do by default)?

 Thanks!

 2015-02-10 17:32 GMT+01:00 Zsolt Tóth toth.zsolt@gmail.com:

 Hi,

 I'm using Spark in yarn-cluster mode and submit the jobs programmatically
 from the client in Java. I ran into a few issues when I tried to set the
 resource allocation properties.

 1. It looks like setting spark.executor.memory, spark.executor.cores and
 spark.executor.instances has no effect because ClientArguments only checks
 the command-line arguments (--num-executors, --executor-cores, etc.).
 Is it possible to use the properties in yarn-cluster mode instead of the
 command-line arguments?

 2. My nodes have 5 GB of memory, but when I set --executor-memory to 4g
 (plus 384m overhead), I get an exception saying that the required executor
 memory is above the max threshold of this cluster. It looks like this
 threshold is the value of the yarn.scheduler.maximum-allocation-mb property.
 Is that correct?

 Thanks,
 Zsolt





Re: Resource allocation in yarn-cluster mode

2015-02-10 Thread Zsolt Tóth
One more question: Is there a reason why Spark throws an error when
requesting too much memory instead of capping it at the maximum value (as
YARN would do by default)?

Thanks!

2015-02-10 17:32 GMT+01:00 Zsolt Tóth toth.zsolt@gmail.com:

 Hi,

 I'm using Spark in yarn-cluster mode and submit the jobs programmatically
 from the client in Java. I ran into a few issues when I tried to set the
 resource allocation properties.

 1. It looks like setting spark.executor.memory, spark.executor.cores and
 spark.executor.instances has no effect because ClientArguments only checks
 the command-line arguments (--num-executors, --executor-cores, etc.).
 Is it possible to use the properties in yarn-cluster mode instead of the
 command-line arguments?

 2. My nodes have 5 GB of memory, but when I set --executor-memory to 4g
 (plus 384m overhead), I get an exception saying that the required executor
 memory is above the max threshold of this cluster. It looks like this
 threshold is the value of the yarn.scheduler.maximum-allocation-mb property.
 Is that correct?

 Thanks,
 Zsolt



Resource allocation in yarn-cluster mode

2015-02-10 Thread Zsolt Tóth
Hi,

I'm using Spark in yarn-cluster mode and submit the jobs programmatically
from the client in Java. I ran into a few issues when I tried to set the
resource allocation properties.

1. It looks like setting spark.executor.memory, spark.executor.cores and
spark.executor.instances has no effect because ClientArguments only checks
the command-line arguments (--num-executors, --executor-cores, etc.).
Is it possible to use the properties in yarn-cluster mode instead of the
command-line arguments?

2. My nodes have 5 GB of memory, but when I set --executor-memory to 4g
(plus 384m overhead), I get an exception saying that the required executor
memory is above the max threshold of this cluster. It looks like this
threshold is the value of the yarn.scheduler.maximum-allocation-mb property.
Is that correct?

Thanks,
Zsolt
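
For reference, the programmatic yarn-cluster submission described in point 1
of the message above usually goes through Spark's internal yarn Client, as in
the sketch below. This is a Spark 1.2-era internal API, not a public
interface; the constructor signatures are approximate, and the jar path and
main class are placeholders. Because ClientArguments only parses
command-line style flags, the resource settings have to be passed as such
flags rather than as spark.executor.* properties:

import org.apache.hadoop.conf.Configuration;
import org.apache.spark.SparkConf;
import org.apache.spark.deploy.yarn.Client;
import org.apache.spark.deploy.yarn.ClientArguments;

public class ProgrammaticSubmitSketch {
  public static void main(String[] args) {
    // Resource settings passed as command-line style flags, since
    // ClientArguments does not read spark.executor.* properties.
    String[] clientArgs = new String[] {
        "--jar", "/path/to/my-app.jar",     // placeholder path
        "--class", "com.example.MyApp",     // placeholder main class
        "--num-executors", "4",
        "--executor-cores", "2",
        "--executor-memory", "2g"
    };
    SparkConf sparkConf = new SparkConf();
    ClientArguments parsedArgs = new ClientArguments(clientArgs, sparkConf);
    new Client(parsedArgs, new Configuration(), sparkConf).run();
  }
}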