Ingo Müller created ASTERIXDB-2948:
--------------------------------------
Summary: "Too many open files" on large data sets in Parquet/S3
Key: ASTERIXDB-2948
URL: https://issues.apache.org/jira/browse/ASTERIXDB-2948
Project: Apache AsterixDB
Issue Type: Bug
Components: EXT - External data
Affects Versions: 0.9.8
Reporter: Ingo Müller
When I run complex queries on a very large machine (96 vCPUs, 48 configured IO
devices/partitions) with Parquet files on S3, I occasionally get the following
error:
{{java.io.FileNotFoundException: /data/asterixdb/iodevice40/./ExternalSortGroupByRunGenerator13134601214093461962.waf (Too many open files)}}
This only happens once the data set exceeds a certain size; the smallest data
set for which I observed the error was around 0.5 TB. I have not been able to
test these queries with files on HDFS or the local filesystem because the data
does not fit onto the disk of the system.
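
For anyone trying to reproduce this, one way to see how close a process is to its descriptor limit is sketched below. It is only an illustration under stated assumptions: it assumes a Linux host with /proc mounted, the helper name fd_usage is hypothetical and not part of AsterixDB, and it reports only the count and the soft limit, not which component holds the descriptors.

```python
import os
import resource


def fd_usage(pid=None):
    """Return (open_fds, soft_limit) for a process on Linux.

    pid=None inspects the current process. Inspecting another process
    (for example the JVM of a node controller) requires running as the
    same user or as root.
    """
    if pid is None:
        pid = os.getpid()
    # Each entry in /proc/<pid>/fd corresponds to one open file descriptor.
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    # Calling prlimit without new limits only reads (soft, hard).
    soft, _hard = resource.prlimit(pid, resource.RLIMIT_NOFILE)
    return open_fds, soft


if __name__ == "__main__":
    used, limit = fd_usage()
    print(f"{used} of {limit} file descriptors in use")
```

Passing the PID of the affected Java process while a failing query runs would show whether the open-descriptor count climbs toward the soft limit before the exception appears.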