Re: Setting only master heap
Hi guys,

Here are some lines from the log file before the OOM. They don't look that helpful, so let me know if there's anything else I should be sending. I am running in standalone mode.

spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5:java.lang.OutOfMemoryError: Java heap space
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5-14/10/22 05:00:36 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-akka.actor.default-dispatcher-52] shutting down ActorSystem [sparkMaster]
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5:java.lang.OutOfMemoryError: Java heap space
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5:Exception in thread "qtp2057079871-30" java.lang.OutOfMemoryError: Java heap space
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5-14/10/22 05:00:07 WARN AbstractNioSelector: Unexpected exception in the selector loop.
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5-14/10/22 05:02:51 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-8] shutting down ActorSystem [sparkMaster]
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5:java.lang.OutOfMemoryError: Java heap space
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5-14/10/22 05:03:22 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-akka.actor.default-dispatcher-38] shutting down ActorSystem [sparkMaster]
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5:java.lang.OutOfMemoryError: Java heap space
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5-14/10/22 05:03:22 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-6] shutting down ActorSystem [sparkMaster]
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5:java.lang.OutOfMemoryError: Java heap space
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5-14/10/22 05:03:22 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-akka.actor.default-dispatcher-43] shutting down ActorSystem [sparkMaster]
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5:java.lang.OutOfMemoryError: Java heap space
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5-14/10/22 05:03:22 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-akka.actor.default-dispatcher-13] shutting down ActorSystem [sparkMaster]
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5:java.lang.OutOfMemoryError: Java heap space
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5-14/10/22 05:03:22 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-5] shutting down ActorSystem [sparkMaster]
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5:java.lang.OutOfMemoryError: Java heap space
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5-14/10/22 05:03:22 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-akka.actor.default-dispatcher-12] shutting down ActorSystem [sparkMaster]

On Thu, Oct 23, 2014 at 2:10 PM, Nan Zhu wrote:
Re: Setting only master heap
My observation is that the master in Spark 1.1 has a higher frequency of GC. Also, before 1.1 I never encountered GC overtime in the Master; after upgrading to 1.1 I have hit it twice (we upgraded soon after the 1.1 release).

Best,

--
Nan Zhu

On Thursday, October 23, 2014 at 1:08 PM, Andrew Or wrote:
Re: Setting only master heap
Yeah, as Sameer commented, there is unfortunately no equivalent `SPARK_MASTER_MEMORY` that you can set. You can work around this by starting the master and the slaves separately, with a different setting of SPARK_DAEMON_MEMORY each time.

AFAIK there haven't been any major changes in the standalone master in 1.1.0, so I don't see an immediate explanation for what you're observing. In general the Spark master doesn't use that much memory, and even if there are many applications it will discard the old ones appropriately, so unless you have a ton (like thousands) of concurrently running applications connecting to it, there's little likelihood of it OOMing. At least that's my understanding.

-Andrew

2014-10-22 15:51 GMT-07:00 Sameer Farooqui:
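[Editor's note: the workaround Andrew describes can be sketched roughly as follows, assuming the standard sbin launch scripts of a Spark standalone install; the 4g/1g values are purely illustrative, not recommendations.]

```shell
# Sketch of the workaround: launch the master and the workers in separate
# steps, exporting a different SPARK_DAEMON_MEMORY for each invocation.
# SPARK_HOME is assumed to point at the Spark installation directory.

# Start only the master, with a larger daemon heap:
SPARK_DAEMON_MEMORY=4g "$SPARK_HOME"/sbin/start-master.sh

# Then start the workers on the machines listed in conf/slaves,
# with the usual (smaller) daemon heap:
SPARK_DAEMON_MEMORY=1g "$SPARK_HOME"/sbin/start-slaves.sh
```

Because SPARK_DAEMON_MEMORY is read at launch time, each script invocation picks up whatever value is in its environment, which is what makes the two-step launch work.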
Re: Setting only master heap
Hi Keith,

It would be helpful if you could post the error message.

Are you running Spark in Standalone mode or with YARN?

In general, the Spark Master is only used for scheduling, and it should be fine with the default setting of 512 MB RAM.

Is it actually the Spark Driver's memory that you intended to change?

*++ If in Standalone mode ++*
You're right that SPARK_DAEMON_MEMORY sets the memory to allocate to the Spark Master, Worker, and even HistoryServer daemons together.

SPARK_WORKER_MEMORY is slightly confusing. In Standalone mode, it is the amount of memory that a worker advertises as available for drivers to launch executors in. The sum of the memory used by executors spawned from a worker cannot exceed SPARK_WORKER_MEMORY.

Unfortunately, I'm not aware of a way to set the memory for the Master and Worker individually, other than launching them manually. You can also try setting the config differently in each machine's spark-env.sh file.

*++ If in YARN mode ++*
In YARN mode there is no setting for SPARK_DAEMON_MEMORY, which is why it appears only in the Standalone documentation.

Remember that in YARN mode there is no Spark Worker; instead, the YARN NodeManagers launch the Executors. And in YARN there is no need to run a Spark Master JVM, since the YARN ResourceManager takes care of the scheduling.

So, with YARN, use SPARK_EXECUTOR_MEMORY to set the Executors' memory, and use SPARK_DRIVER_MEMORY to set the Driver's memory.

Just an FYI: for compatibility's sake, even in YARN mode there is a SPARK_WORKER_MEMORY setting, but it has been deprecated. If you do set it, it just does the same thing as setting SPARK_EXECUTOR_MEMORY would have done.

- Sameer

On Wed, Oct 22, 2014 at 1:46 PM, Keith Simmons wrote:
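[Editor's note: as an illustration of the Standalone settings Sameer describes, a per-machine spark-env.sh might look like the fragment below. The values are made up; the point is that SPARK_DAEMON_MEMORY covers the daemon JVMs themselves, while SPARK_WORKER_MEMORY is the separate pool a worker advertises for executors.]

```shell
# spark-env.sh sketch -- illustrative values, not recommendations.

# Heap for the daemon JVMs (Master, Worker, HistoryServer) on this machine.
# This is the knob that, frustratingly, applies to Master and Worker alike:
SPARK_DAEMON_MEMORY=2g

# Total memory this worker advertises for launching executors;
# the sum of executor memory spawned from this worker cannot exceed it:
SPARK_WORKER_MEMORY=16g
```

Since spark-env.sh is sourced independently on each machine, one way to approximate per-role settings is to use a different SPARK_DAEMON_MEMORY value in the file on the master host than on the worker hosts.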
Setting only master heap
We've been getting some OOMs from the Spark master since upgrading to Spark 1.1.0. I've found SPARK_DAEMON_MEMORY, but that also seems to increase the worker heap, which as far as I know is already fine. Is there any setting which *only* increases the master heap size?

Keith