Re: Debugging too many files open exception issue in Spark shuffle

2015-09-23 Thread DB Tsai
In ./apps/mesos-0.22.1/sbin/mesos-daemon.sh:

#!/usr/bin/env bash

prefix=/apps/mesos-0.22.1
exec_prefix=/apps/mesos-0.22.1

deploy_dir=${prefix}/etc/mesos

# Increase the default number of open file descriptors.
ulimit -n 8192
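
For what it's worth, here is a rough sketch (assuming Linux, so
/proc/self/limits is available) to confirm from the Spark shell that the
raised limit actually reaches the executors once the slaves are restarted
with the new setting:

import scala.io.Source

// Rough sketch: each task reads its executor's own "Max open files"
// soft/hard limits from /proc/self/limits (Linux-specific assumption).
val limits = sc.parallelize(1 to 100, 100).map { _ =>
  val src = Source.fromFile("/proc/self/limits")
  try src.getLines().filter(_.startsWith("Max open files")).mkString
  finally src.close()
}.distinct().collect()

limits.foreach(println)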


Sincerely,

DB Tsai
--
Blog: https://www.dbtsai.com
PGP Key ID: 0xAF08DF8D


On Wed, Sep 23, 2015 at 5:14 PM, java8964  wrote:
> That is interesting.
>
> I don't have any Mesos experience, but just want to know the reason why it
> does so.
>
> Yong
>
>> Date: Wed, 23 Sep 2015 15:53:54 -0700
>> Subject: Debugging too many files open exception issue in Spark shuffle
>> From: dbt...@dbtsai.com
>> To: user@spark.apache.org
>
>>
>> Hi,
>>
>> Recently, we ran into this notorious exception while doing a large
>> shuffle on Mesos at Netflix. We ensured that `ulimit -n` was set to a
>> very large number, but still hit the issue.
>>
>> It turns out that Mesos overrides `ulimit -n` to a small number,
>> which causes the problem. It's very non-trivial to debug: logging in
>> to the slave shows the right ulimit; it's only in the Mesos context
>> that it gets overridden.
>>
>> Here is the code you can run in the Spark shell to get the actual
>> allowed number of open files for Spark.
>>
>> import sys.process._
>> val p = 1 to 100
>> val rdd = sc.parallelize(p, 100)
>> val openFiles = rdd.map(x => Seq("sh", "-c", "ulimit -n").!!.toDouble.toLong).collect
>>
>> Hope this can help someone in the same situation.
>>
>> Sincerely,
>>
>> DB Tsai
>> --
>> Blog: https://www.dbtsai.com
>> PGP Key ID: 0xAF08DF8D
>>




RE: Debugging too many files open exception issue in Spark shuffle

2015-09-23 Thread java8964
That is interesting.
I don't have any Mesos experience, but just want to know the reason why it does so.
Yong

> Date: Wed, 23 Sep 2015 15:53:54 -0700
> Subject: Debugging too many files open exception issue in Spark shuffle
> From: dbt...@dbtsai.com
> To: user@spark.apache.org
> 
> Hi,
> 
> Recently, we ran into this notorious exception while doing a large
> shuffle on Mesos at Netflix. We ensured that `ulimit -n` was set to a
> very large number, but still hit the issue.
> 
> It turns out that Mesos overrides `ulimit -n` to a small number,
> which causes the problem. It's very non-trivial to debug: logging in
> to the slave shows the right ulimit; it's only in the Mesos context
> that it gets overridden.
> 
> Here is the code you can run in the Spark shell to get the actual
> allowed number of open files for Spark.
> 
> import sys.process._
> val p = 1 to 100
> val rdd = sc.parallelize(p, 100)
> val openFiles = rdd.map(x => Seq("sh", "-c", "ulimit -n").!!.toDouble.toLong).collect
> 
> Hope this can help someone in the same situation.
> 
> Sincerely,
> 
> DB Tsai
> --
> Blog: https://www.dbtsai.com
> PGP Key ID: 0xAF08DF8D
> 