[
https://issues.apache.org/jira/browse/ASTERIXDB-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17405361#comment-17405361
]
Ian Maxon commented on ASTERIXDB-2948:
--------------------------------------
If raising the ulimit is still not enough, the number of file handles open at
any point in time can be capped via the config option
{{storage.buffercache.maxopenfiles}}
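As a sketch of the two knobs involved (the values, and the config-file section placement, are illustrative assumptions, not taken from this issue):

```shell
# Inspect the per-process file-descriptor limits in the shell that
# launches the NC process:
ulimit -Sn    # soft limit -- the one the JVM actually hits
ulimit -Hn    # hard ceiling up to which the soft limit can be raised

# If permitted, raise the soft limit before starting the NC, e.g.:
#   ulimit -n 65536
#
# If that is still not enough, cap the buffer cache's concurrently open
# files in the AsterixDB config (section shown is an assumption; check
# the docs for your version):
#
#   [common]
#   storage.buffercache.maxopenfiles = 1000
```

Note that the run files from the sort/group-by operators are opened in addition to dataset files, so the ulimit should leave headroom above whatever cap is configured.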
> "Too many open files" on large data sets in Parquet/S3
> ------------------------------------------------------
>
> Key: ASTERIXDB-2948
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-2948
> Project: Apache AsterixDB
> Issue Type: Bug
> Components: EXT - External data
> Affects Versions: 0.9.8
> Reporter: Ingo Müller
> Priority: Major
>
> When I run complex queries on a very large machine (96 vCPUs, 48 configured
> IO devices/partitions) with Parquet files on S3, I occasionally get the
> following error:
> {{java.io.FileNotFoundException:
> /data/asterixdb/iodevice40/./ExternalSortGroupByRunGenerator13134601214093461962.waf
> (Too many open files)}}
> This only happens beyond a certain data-set size; I think the smallest
> instance of the data set on which I observed the error was around 0.5 TB. I
> have not been able to test these queries with files on HDFS or the local
> filesystem, since the data does not fit on the system's disk.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)