I got the number from the Hadoop admin; the open-file limit is actually 1M.
So I suspect the consolidation didn't work as expected? Or is there another reason?


On Thu, Jul 31, 2014 at 11:01 AM, Shao, Saisai <saisai.s...@intel.com>
wrote:

>  I don’t think it’s a bug in consolidated shuffle; it’s a Linux
> configuration problem. The default open-file limit on Linux is 1024, and once
> your process opens more than 1024 files you will get the error you mentioned
> below. You can raise the open-file limit to a larger value with: ulimit -n
> xxx, or by writing it into /etc/security/limits.conf (e.g. on Ubuntu).
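>
> For example, a minimal sketch of both options, assuming the Spark processes
> run as user "spark" and that 65536 is large enough for your job (adjust both
> to your environment):
>
>   # check the current soft limit for this shell
>   ulimit -n
>   # raise it for the current session (cannot exceed the hard limit)
>   ulimit -n 65536
>
>   # or make it persistent in /etc/security/limits.conf:
>   spark  soft  nofile  65536
>   spark  hard  nofile  65536
>
> The limits.conf entries take effect at the next login of that user.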
>
>
>
> Shuffle consolidation can reduce the total number of shuffle files, but the
> number of concurrently open files is the same as with basic hash-based shuffle.
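>
> As a rough illustration with made-up numbers: with 1,000 map tasks and 1,000
> reduce partitions, plain hash-based shuffle writes 1,000 x 1,000 = 1,000,000
> shuffle files in total, while consolidation on a 16-core executor reuses one
> file group per core, i.e. about 16 x 1,000 = 16,000 files per executor. In
> both cases, though, an executor can still have up to cores x reduce-partitions
> (here 16,000) files open at the same time, which is why the ulimit still
> matters.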
>
>
>
> Thanks
>
> Jerry
>
>
>
> *From:* Jianshi Huang [mailto:jianshi.hu...@gmail.com]
> *Sent:* Thursday, July 31, 2014 10:34 AM
> *To:* user@spark.apache.org
> *Cc:* xia...@sjtu.edu.cn
> *Subject:* Re: spark.shuffle.consolidateFiles seems not working
>
>
>
> Ok... but my question is why spark.shuffle.consolidateFiles isn't working
> (or is it working)? Is this a bug?
>
>
>
> On Wed, Jul 30, 2014 at 4:29 PM, Larry Xiao <xia...@sjtu.edu.cn> wrote:
>
> Hi Jianshi,
>
> I've run into a similar situation before,
> and my solution was 'ulimit'. You can use
>
> -a to see your current settings
> -n to set the open-files limit
> (and other limits as well)
>
> I set -n to 10240.
>
> My understanding is that spark.shuffle.consolidateFiles helps by reusing open
> files (so I don't know to what extent it helps here).
>
> Hope it helps.
>
> Larry
>
>
>
> On 7/30/14, 4:01 PM, Jianshi Huang wrote:
>
> I'm using Spark 1.0.1 in yarn-client mode.
>
> sortByKey always fails with a FileNotFoundException whose message says "too
> many open files".
>
> I already set spark.shuffle.consolidateFiles to true:
>
>   conf.set("spark.shuffle.consolidateFiles", "true")
>
> But it doesn't seem to be working. What other possible reasons are there, and
> how can I fix it?
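>
> For reference, a minimal sketch of how the flag might be set programmatically,
> assuming the SparkConf is built before the SparkContext is created (the app
> name below is just a placeholder):
>
>   import org.apache.spark.{SparkConf, SparkContext}
>
>   // the property must be on the SparkConf before the SparkContext exists
>   val conf = new SparkConf()
>     .setAppName("consolidate-test")  // placeholder
>     .set("spark.shuffle.consolidateFiles", "true")
>   val sc = new SparkContext(conf)
>
> (The same property can also be put in conf/spark-defaults.conf when launching
> with spark-submit.)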
>
> Jianshi
>
> --
> Jianshi Huang
>
> LinkedIn: jianshi
> Twitter: @jshuang
> Github & Blog: http://huangjs.github.com/
>
>
>
>
>
>
>
> --
> Jianshi Huang
>
> LinkedIn: jianshi
> Twitter: @jshuang
> Github & Blog: http://huangjs.github.com/
>



-- 
Jianshi Huang

LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/
