Ingo Müller created ASTERIXDB-2948:
--------------------------------------

             Summary: "Too many open files" on large data sets in Parquet/S3
                 Key: ASTERIXDB-2948
                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2948
             Project: Apache AsterixDB
          Issue Type: Bug
          Components: EXT - External data
    Affects Versions: 0.9.8
            Reporter: Ingo Müller


When I run complex queries on a very large machine (96 vCPUs, 48 configured IO 
devices/partitions) with Parquet files on S3, I occasionally get the following 
error:

{{java.io.FileNotFoundException: 
/data/asterixdb/iodevice40/./ExternalSortGroupByRunGenerator13134601214093461962.waf
 (Too many open files)}}

This only happens once the data set exceeds a certain size; the smallest 
instance of the data set where I observed the error was around 0.5 TB. I have 
not been able to test these queries with files on HDFS or the local filesystem 
because the data does not fit onto the disk of the system.
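
To help narrow down whether the process is actually hitting its file descriptor 
limit, a minimal diagnostic sketch follows. It is not AsterixDB code: the class 
name FdUsage and the method descriptorCounts are hypothetical, and the snippet 
reports on the JVM it runs in, so it would need to run inside (or be adapted to 
query via JMX) the node controller process to say anything about the failing 
query. It assumes a HotSpot/OpenJDK JVM on a Unix-like system, where the 
platform bean implements com.sun.management.UnixOperatingSystemMXBean.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class FdUsage {
    /**
     * Returns {open, max} file descriptor counts for the current JVM,
     * or {-1, -1} if the platform bean does not expose them
     * (for example on non-Unix systems).
     */
    static long[] descriptorCounts() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return new long[] { -1L, -1L };
    }

    public static void main(String[] args) {
        long[] counts = descriptorCounts();
        System.out.println("open file descriptors: " + counts[0]
                + " / limit: " + counts[1]);
    }
}
```

If the open count approaches the limit while the query runs, comparing it with 
the number of partitions (48 here) may indicate whether the descriptor usage 
scales with the number of IO devices or with the data set size.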



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
