Antonio Piccolboni created SPARK-9921:
-----------------------------------------
Summary: Too many open files in Spark SQL
Key: SPARK-9921
URL: https://issues.apache.org/jira/browse/SPARK-9921
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 1.5.0
Environment: OS X
Reporter: Antonio Piccolboni
The data is a table with 300K rows and 16 columns covering a single year, so there are 12
months and 365 days, each with a roughly similar number of rows (each row is a
scheduled flight).
The error is:
Error in .verify.JDBC.result(r, "Unable to retrieve JDBC result set for ", :
Unable to retrieve JDBC result set for SELECT `year`, `month`, `flights`
FROM (select `year`, `month`, sum(`flights`) as `flights`
from (select `year`, `month`, `day`, count(*) as `flights`
from `flights`
group by `year`, `month`, `day`) as `_w21`
group by `year`, `month`) AS `_w22`
LIMIT 10 (org.apache.spark.SparkException: Job aborted due to stage failure:
Task 0 in stage 237.0 failed 1 times, most recent failure: Lost task 0.0 in
stage 237.0 (TID 8634, localhost): java.io.FileNotFoundException:
/user/hive/warehouse/flights/file11ce460c958e (Too many open files)
at java.io.FileInputStream.open0(Native Method)
at java.io.FileInputStream.open(FileInputStream.java:195)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileInputStream.<init>(RawLocalFileSystem.java:103)
at org.apache.hadoop.fs.RawLocalFileSystem.open(RawLocalFileSystem.java:195)
at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<i
As you can see, the query is not something one would easily write by hand,
because it is computer generated, but it makes perfect sense: it is a count of
flights by month. It could be done without the nested query (see the sketch
below), but that's not the point.
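
For reference, here is a minimal sketch of the same aggregation run directly
from spark-shell rather than through JDBC, assuming the `flights` table is
already registered in the warehouse referenced by the stack trace (table and
column names are taken from the generated query above). It also shows the
equivalent single-level aggregation; this is only an illustration, not the
code path my client actually uses.

// Spark 1.5 spark-shell: sqlContext is assumed to be a HiveContext pointing
// at the same /user/hive/warehouse metastore as in the stack trace.
val nested = sqlContext.sql("""
  SELECT `year`, `month`, sum(`flights`) AS `flights`
  FROM (SELECT `year`, `month`, `day`, count(*) AS `flights`
        FROM `flights`
        GROUP BY `year`, `month`, `day`) t
  GROUP BY `year`, `month`
  LIMIT 10
""")
nested.show()

// The same result without the nested subquery: one count(*) per (year, month).
val flat = sqlContext.sql("""
  SELECT `year`, `month`, count(*) AS `flights`
  FROM `flights`
  GROUP BY `year`, `month`
  LIMIT 10
""")
flat.show()
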
This query used to work on 1.4 but doesn't on 1.5. There has also been an OS
upgrade to Yosemite in the meantime, so it's hard to separate the effects of
the two. Following suggestions that the default system limits for open files are
too low for Spark to work properly, I increased the hard and soft limits to 32k. For
some reason, the error happens when the java process has about 10250 open files as
reported by lsof; it is not clear to me where that limit is coming from. The total
number of files open is 16k. If this is not a bug, I would like to ask what a safe
number of allowed open files is and whether there are other configurations that need
to be tuned.
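
As a sanity check, the following is a minimal sketch (an assumption about how
to inspect this, not something from my original setup) of asking the driver
JVM itself what descriptor limit it sees and how many descriptors it currently
holds; on a HotSpot JVM on a Unix-like system such as OS X, the
operating-system MXBean can be cast to com.sun.management.UnixOperatingSystemMXBean.

// Run inside spark-shell (same JVM as the Spark driver). Assumes a HotSpot
// JVM on a Unix-like OS, where the OS MXBean is a UnixOperatingSystemMXBean.
import java.lang.management.ManagementFactory
import com.sun.management.UnixOperatingSystemMXBean

val osBean = ManagementFactory.getOperatingSystemMXBean
  .asInstanceOf[UnixOperatingSystemMXBean]

// Limit actually visible to this process (may differ from the shell's ulimit
// if the JVM was launched from a different parent process).
println("max file descriptors:  " + osBean.getMaxFileDescriptorCount)
println("open file descriptors: " + osBean.getOpenFileDescriptorCount)
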