Kun Liu created FLUME-2979:
------------------------------
Summary: File descriptor leaks in TaildirSource
Key: FLUME-2979
URL: https://issues.apache.org/jira/browse/FLUME-2979
Project: Flume
Issue Type: Bug
Components: Sinks+Sources
Affects Versions: v1.7.0
Reporter: Kun Liu
TaildirSource creates a ReliableTaildirEventReader object in its start() method,
and the ReliableTaildirEventReader constructor tries to open every matched file
in updateTailFiles(). However, if the Flume process has no permission to open
one of the matched files, updateTailFiles() throws an IOException and aborts
the construction of the ReliableTaildirEventReader object. The following logs
show several matched files being opened, followed by the exception thrown when
trying to open a file without permission.
------------logs--------------
25 Aug 2016 12:50:28,337 INFO [lifecycleSupervisor-1-0] (org.apache.flume.source.taildir.ReliableTaildirEventReader.openFile:294) - Opening file: /home/work/quota/test.2016-06-27-121958, inode: 91668538, pos: 0
25 Aug 2016 12:50:28,337 INFO [lifecycleSupervisor-1-0] (org.apache.flume.source.taildir.ReliableTaildirEventReader.openFile:294) - Opening file: /home/work/quota/test.2016-06-27-111458, inode: 91668558, pos: 0
25 Aug 2016 12:50:28,337 INFO [lifecycleSupervisor-1-0] (org.apache.flume.source.taildir.ReliableTaildirEventReader.openFile:294) - Opening file: /home/work/quota/test.2016-06-27-080458, inode: 91668536, pos: 0
25 Aug 2016 12:50:28,338 INFO [lifecycleSupervisor-1-0] (org.apache.flume.source.taildir.ReliableTaildirEventReader.openFile:294) - Opening file: /home/work/quota/test.2016-06-27-140958, inode: 91668655, pos: 0
25 Aug 2016 12:50:28,338 ERROR [lifecycleSupervisor-1-0] (org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run:253) - Unable to start PollableSourceRunner: { source:Taildir source: { positionFile: /opt/bls/.bls/flume/taildir_position/6d124945-0f73-42ed-7178-56a2168fc5f5/taildir_position.json, skipToEnd: false, byteOffsetHeader: false, idleTimeout: 120000, writePosInterval: 3000 } counterGroup:{ name:null counters:{} } } - Exception follows.
org.apache.flume.FlumeException: Failed opening file: /home/work/quota/test.2016-06-27-140958
    at org.apache.flume.source.taildir.ReliableTaildirEventReader.openFile(ReliableTaildirEventReader.java:297)
    at org.apache.flume.source.taildir.ReliableTaildirEventReader.updateTailFiles(ReliableTaildirEventReader.java:260)
    at org.apache.flume.source.taildir.ReliableTaildirEventReader.<init>(ReliableTaildirEventReader.java:94)
    at org.apache.flume.source.taildir.ReliableTaildirEventReader.<init>(ReliableTaildirEventReader.java:48)
    at org.apache.flume.source.taildir.ReliableTaildirEventReader$Builder.build(ReliableTaildirEventReader.java:361)
    at org.apache.flume.source.taildir.TaildirSource.start(TaildirSource.java:100)
    at org.apache.flume.source.PollableSourceRunner.start(PollableSourceRunner.java:72)
    at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask.runAndReset(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: java.io.FileNotFoundException: /home/work/quota/test.2016-06-27-140958 (Permission denied)
    at java.io.RandomAccessFile.open(Native Method)
    at java.io.RandomAccessFile.<init>(Unknown Source)
    at org.apache.flume.source.taildir.TailFile.<init>(TailFile.java:60)
    at org.apache.flume.source.taildir.ReliableTaildirEventReader.openFile(ReliableTaildirEventReader.java:295)
    ... 14 more
------------logs--------------
Because the creation of ReliableTaildirEventReader is aborted, TaildirSource
catches the IOException and throws a FlumeException in start(), but the files
that were already opened are never closed, so their file descriptors leak. The
PollableSourceRunner keeps trying to start TaildirSource, and each failed
attempt leaks more descriptors. Eventually the number of open files in the
Flume process exceeds the upper limit and the 'Too many open files' error
occurs.
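
One possible way to avoid this kind of leak is sketched below: a constructor
that opens several resources closes the ones it has already opened before
rethrowing the failure. This is only an illustration of the pattern, not the
actual Flume code or a proposed patch. The names LeakFreeReaderSketch and
Opener are hypothetical and do not exist in Flume.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a reader that opens many files in its constructor
// and releases the already opened ones if any later open fails.
final class LeakFreeReaderSketch implements Closeable {

    // Hypothetical stand-in for whatever opens one tailed file.
    interface Opener {
        Closeable open(String path) throws IOException;
    }

    private final List<Closeable> opened = new ArrayList<>();

    LeakFreeReaderSketch(List<String> paths, Opener opener) throws IOException {
        try {
            for (String path : paths) {
                opened.add(opener.open(path));
            }
        } catch (IOException | RuntimeException e) {
            // Construction failed part-way: close what was opened so far,
            // otherwise the caller has no reference through which to do it.
            close();
            throw e;
        }
    }

    // Number of handles currently held by this reader.
    int openCount() {
        return opened.size();
    }

    @Override
    public void close() {
        for (Closeable c : opened) {
            try {
                c.close();
            } catch (IOException ignored) {
                // Best effort: keep closing the remaining handles.
            }
        }
        opened.clear();
    }
}
```

With this pattern, a caller that retries construction repeatedly (as the
PollableSourceRunner does with TaildirSource) would not accumulate open
descriptors across failed attempts. An alternative under the same assumption
would be to skip unreadable files and log a warning instead of failing the
whole start.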