Re: Setting only master heap

2014-10-26 Thread Keith Simmons
Hi Guys,

Here are some lines from the log file before the OOM.  They don't look that
helpful, so let me know if there's anything else I should be sending.  I am
running in standalone mode.

spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5:java.lang.OutOfMemoryError: Java heap space
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5-14/10/22 05:00:36 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-akka.actor.default-dispatcher-52] shutting down ActorSystem [sparkMaster]
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5:java.lang.OutOfMemoryError: Java heap space
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5:Exception in thread "qtp2057079871-30" java.lang.OutOfMemoryError: Java heap space
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5-14/10/22 05:00:07 WARN AbstractNioSelector: Unexpected exception in the selector loop.
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5-14/10/22 05:02:51 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-8] shutting down ActorSystem [sparkMaster]
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5:java.lang.OutOfMemoryError: Java heap space
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5-14/10/22 05:03:22 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-akka.actor.default-dispatcher-38] shutting down ActorSystem [sparkMaster]
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5:java.lang.OutOfMemoryError: Java heap space
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5-14/10/22 05:03:22 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-6] shutting down ActorSystem [sparkMaster]
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5:java.lang.OutOfMemoryError: Java heap space
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5-14/10/22 05:03:22 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-akka.actor.default-dispatcher-43] shutting down ActorSystem [sparkMaster]
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5:java.lang.OutOfMemoryError: Java heap space
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5-14/10/22 05:03:22 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-akka.actor.default-dispatcher-13] shutting down ActorSystem [sparkMaster]
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5:java.lang.OutOfMemoryError: Java heap space
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5-14/10/22 05:03:22 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-5] shutting down ActorSystem [sparkMaster]
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5:java.lang.OutOfMemoryError: Java heap space
spark-pulse-org.apache.spark.deploy.master.Master-1-hadoop10.pulse.io.out.5-14/10/22 05:03:22 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkMaster-akka.actor.default-dispatcher-12] shutting down ActorSystem [sparkMaster]


Re: Setting only master heap

2014-10-23 Thread Nan Zhu
h…  

my observation is that the master in Spark 1.1 has a higher frequency of GC……

Also, before 1.1 I never ran into GC overhead problems in the Master; after upgrading to
1.1 I have hit them twice (we upgraded soon after the 1.1 release)….
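
(A hedged aside: to see how often the master is actually collecting, GC logging can be
enabled for the standalone daemons via SPARK_DAEMON_JAVA_OPTS. The flags below are
standard HotSpot options and the log path is only an example.

  # in conf/spark-env.sh on the master machine (illustrative)
  export SPARK_DAEMON_JAVA_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/tmp/spark-master-gc.log"

Restart the master after setting this so the new JVM options take effect.)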

Best,  

--  
Nan Zhu





Re: Setting only master heap

2014-10-23 Thread Andrew Or
Yeah, as Sameer commented, there is unfortunately not an equivalent
`SPARK_MASTER_MEMORY` that you can set. You can work around this by
starting the master and the slaves separately with different settings of
SPARK_DAEMON_MEMORY each time.
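
(A rough sketch of that workaround, assuming SPARK_DAEMON_MEMORY is picked up from the
environment by the launch scripts; the heap sizes are illustrative, not recommendations:

  # on the master machine: give the Master a larger daemon heap
  SPARK_DAEMON_MEMORY=2g ./sbin/start-master.sh

  # then start the workers with the smaller daemon heap
  SPARK_DAEMON_MEMORY=512m ./sbin/start-slaves.sh

Equivalently, set SPARK_DAEMON_MEMORY in conf/spark-env.sh before each script runs,
changing the value in between.)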

AFAIK there haven't been any major changes in the standalone master in
1.1.0, so I don't see an immediate explanation for what you're observing.
In general the Spark master doesn't use that much memory, and even if there
are many applications it will discard the old ones appropriately, so unless
you have a ton (like thousands) of concurrently running applications
connecting to it there's little likelihood for it to OOM. At least that's
my understanding.
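
(A hedged aside, assuming these properties exist in your version: how many completed
applications and drivers the standalone master keeps in memory is governed by
spark.deploy.retainedApplications and spark.deploy.retainedDrivers. They can be passed
to the master as system properties, e.g. in conf/spark-env.sh on the master machine;
the values below are arbitrary:

  export SPARK_MASTER_OPTS="-Dspark.deploy.retainedApplications=50 -Dspark.deploy.retainedDrivers=50"

Lowering them trims how much completed-application state the master holds on to.)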

-Andrew



Re: Setting only master heap

2014-10-22 Thread Sameer Farooqui
Hi Keith,

Would be helpful if you could post the error message.

Are you running Spark in Standalone mode or with YARN?

In general, the Spark Master is only used for scheduling and it should be
fine with the default setting of 512 MB RAM.

Is it actually the Spark Driver's memory that you intended to change?



*++ If in Standalone mode ++*
You're right that SPARK_DAEMON_MEMORY sets the memory to allocate to the
Spark Master, Worker and even HistoryServer daemons together.

SPARK_WORKER_MEMORY is slightly confusing. In Standalone mode, it is the
amount of memory that a worker advertises as available for drivers to
launch executors. The sum of the memory used by executors spawned from a
worker cannot exceed SPARK_WORKER_MEMORY.

Unfortunately, I'm not aware of a way to set the memory for Master and
Worker individually, other than launching them manually. You can also try
setting the config differently in each machine's spark-env.sh file.
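
(To make that concrete, a hedged sketch of per-machine conf/spark-env.sh files; the
sizes are purely illustrative:

  # conf/spark-env.sh on the master machine
  export SPARK_DAEMON_MEMORY=2g     # heap for the Master daemon started from this host

  # conf/spark-env.sh on each worker machine
  export SPARK_DAEMON_MEMORY=512m   # heap for the Worker daemon itself
  export SPARK_WORKER_MEMORY=16g    # total memory the Worker may hand out to executors

Each daemon sources the spark-env.sh on the machine it starts on, so the master host
and the worker hosts can carry different values.)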


*++ If in YARN mode ++*
In YARN mode, there is no SPARK_DAEMON_MEMORY setting, which is why it appears
only in the Standalone documentation.

Remember that in YARN mode there is no Spark Worker; instead, the YARN
NodeManagers launch the Executors. And in YARN, there is no need to run a
Spark Master JVM (since the YARN ResourceManager takes care of the
scheduling).

So, with YARN use SPARK_EXECUTOR_MEMORY to set the Executor's memory. And
use SPARK_DRIVER_MEMORY to set the Driver's memory.

Just an FYI - for compatibility's sake, even in YARN mode there is a
setting for SPARK_WORKER_MEMORY, but this has been deprecated. If you do
set it, it just does the same thing as setting SPARK_EXECUTOR_MEMORY would
have done.
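
(For completeness, a hedged YARN-mode sketch; the values are illustrative, and the
roughly equivalent spark-submit flags are noted in the comments:

  # conf/spark-env.sh (or the environment of whoever runs spark-submit)
  export SPARK_EXECUTOR_MEMORY=4g   # per-executor heap, roughly --executor-memory 4g
  export SPARK_DRIVER_MEMORY=2g     # driver heap, roughly --driver-memory 2g
)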


- Sameer




Setting only master heap

2014-10-22 Thread Keith Simmons
We've been getting some OOMs from the spark master since upgrading to Spark
1.1.0.  I've found SPARK_DAEMON_MEMORY, but that also seems to increase the
worker heap, which, as far as I know, is fine as it is.  Is there any setting
which *only* increases the master heap size?

Keith