Re: Setting only master heap

2014-10-26 Thread Keith Simmons
Hi Guys,

Here are some lines from the log file before the OOM. They don't look that
helpful, so let me know if there's anything else I should be sending. I am
running in standalone mode.

spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5:java.lang.OutOfMemoryError: Java heap space
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5-14/10/22 05:00:36 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-akka.actor.default-dispatcher-52] shutting down ActorSystem [sparkMaster]
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5:java.lang.OutOfMemoryError: Java heap space
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5:Exception in thread qtp2057079871-30 java.lang.OutOfMemoryError: Java heap space
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5-14/10/22 05:00:07 WARN AbstractNioSelector: Unexpected exception in the selector loop.
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5-14/10/22 05:02:51 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-8] shutting down ActorSystem [sparkMaster]
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5:java.lang.OutOfMemoryError: Java heap space
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5-14/10/22 05:03:22 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-akka.actor.default-dispatcher-38] shutting down ActorSystem [sparkMaster]
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5:java.lang.OutOfMemoryError: Java heap space
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5-14/10/22 05:03:22 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-6] shutting down ActorSystem [sparkMaster]
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5:java.lang.OutOfMemoryError: Java heap space
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5-14/10/22 05:03:22 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-akka.actor.default-dispatcher-43] shutting down ActorSystem [sparkMaster]
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5:java.lang.OutOfMemoryError: Java heap space
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5-14/10/22 05:03:22 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-akka.actor.default-dispatcher-13] shutting down ActorSystem [sparkMaster]
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5:java.lang.OutOfMemoryError: Java heap space
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5-14/10/22 05:03:22 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-5] shutting down ActorSystem [sparkMaster]
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5:java.lang.OutOfMemoryError: Java heap space
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5-14/10/22 05:03:22 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-akka.actor.default-dispatcher-12] shutting down ActorSystem [sparkMaster]

On Thu, Oct 23, 2014 at 2:10 PM, Nan Zhu zhunanmcg...@gmail.com wrote:

 h…

 my observation is that the master in Spark 1.1 has a higher frequency of GC…

 Also, before 1.1 I never encountered excessive GC time in the Master; after
 upgrading to 1.1 I have hit it twice (we upgraded soon after the 1.1 release)…
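
 One way to confirm that GC behavior is to enable GC logging for the
 standalone daemons via SPARK_DAEMON_JAVA_OPTS in conf/spark-env.sh. A
 minimal sketch (these are standard HotSpot flags; the log path is only
 illustrative):

   export SPARK_DAEMON_JAVA_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/tmp/spark-master-gc.log"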

 Best,

 --
 Nan Zhu


Re: Setting only master heap

2014-10-23 Thread Andrew Or
Yeah, as Sameer commented, there is unfortunately not an equivalent
`SPARK_MASTER_MEMORY` that you can set. You can work around this by
starting the master and the slaves separately with different settings of
SPARK_DAEMON_MEMORY each time.
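
A minimal sketch of that workaround, assuming a standard layout under
$SPARK_HOME and that conf/spark-env.sh does not itself override
SPARK_DAEMON_MEMORY (the 2g value is illustrative):

  # On the master node, launch only the master with a larger daemon heap.
  SPARK_DAEMON_MEMORY=2g "$SPARK_HOME/sbin/start-master.sh"

  # Start the workers separately; each picks up SPARK_DAEMON_MEMORY from
  # its own conf/spark-env.sh (defaulting to 512m), so the master's larger
  # heap does not apply to them.
  "$SPARK_HOME/sbin/start-slaves.sh"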

AFAIK there haven't been any major changes in the standalone master in
1.1.0, so I don't see an immediate explanation for what you're observing.
In general the Spark master doesn't use that much memory, and even if there
are many applications it will discard old ones appropriately. So unless you
have a ton (like thousands) of concurrently running applications connecting
to it, it's unlikely to OOM. At least that's
my understanding.

-Andrew


Re: Setting only master heap

2014-10-22 Thread Sameer Farooqui
Hi Keith,

It would be helpful if you could post the error message.

Are you running Spark in Standalone mode or with YARN?

In general, the Spark Master is only used for scheduling and it should be
fine with the default setting of 512 MB RAM.

Is it actually the Spark Driver's memory that you intended to change?



*++ If in Standalone mode ++*
You're right that SPARK_DAEMON_MEMORY sets the memory allocated to the
Spark Master, Worker, and even HistoryServer daemons together.
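
A minimal spark-env.sh sketch (the 1g value is only an example):

  # conf/spark-env.sh
  # One knob covers every standalone daemon launched from this node:
  # Master, Worker, and HistoryServer alike.
  export SPARK_DAEMON_MEMORY=1g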

SPARK_WORKER_MEMORY is slightly confusing. In Standalone mode, it is the
amount of memory that a worker advertises as available for drivers to
launch executors. The sum of the memory used by executors spawned from a
worker cannot exceed SPARK_WORKER_MEMORY.
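
A hypothetical example of that budget (all numbers illustrative):

  # conf/spark-env.sh on a worker machine
  export SPARK_WORKER_MEMORY=24g
  # An application requesting 6 GB executors (spark.executor.memory=6g)
  # can then place at most 4 executors' worth of memory on this worker,
  # since 4 x 6g reaches the 24g the worker advertises.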

Unfortunately, I'm not aware of a way to set the memory for Master and
Worker individually, other than launching them manually. You can also try
setting the config differently in each machine's spark-env.sh file, for
example:
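
  # spark-env.sh on the master machine (illustrative value):
  export SPARK_DAEMON_MEMORY=2g

  # spark-env.sh on each worker machine:
  export SPARK_DAEMON_MEMORY=512m

Since spark-env.sh is sourced locally when each daemon starts, the Master
and the Workers end up with different heap sizes.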


*++ If in YARN mode ++*
In YARN mode there is no SPARK_DAEMON_MEMORY setting, which is why it
appears only in the Standalone documentation.

Remember that in YARN mode there is no Spark Worker; instead, the YARN
NodeManagers launch the Executors. There is also no need to run a Spark
Master JVM, since the YARN ResourceManager takes care of the scheduling.

So with YARN, use SPARK_EXECUTOR_MEMORY to set the Executor's memory, and
SPARK_DRIVER_MEMORY to set the Driver's memory.
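
For instance (values illustrative; the same settings can also be passed to
spark-submit as --executor-memory and --driver-memory):

  export SPARK_EXECUTOR_MEMORY=4g   # heap for each executor
  export SPARK_DRIVER_MEMORY=2g     # heap for the driver JVM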

Just an FYI: for compatibility's sake, SPARK_WORKER_MEMORY can still be set
even in YARN mode, but it has been deprecated; if you do set it, it just
does the same thing as setting SPARK_EXECUTOR_MEMORY would have done.


- Sameer


On Wed, Oct 22, 2014 at 1:46 PM, Keith Simmons ke...@pulse.io wrote:

 We've been getting some OOMs from the spark master since upgrading to
 Spark 1.1.0.  I've found SPARK_DAEMON_MEMORY, but that also seems to
 increase the worker heap, which as far as I know is fine.  Is there any
 setting which *only* increases the master heap size?

 Keith