[ https://issues.apache.org/jira/browse/FLUME-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kun Liu updated FLUME-2979:
---------------------------
    Description: 
TaildirSource creates a ReliableTaildirEventReader object in its start() method,
and the constructor of ReliableTaildirEventReader tries to open the matched
files in updateTailFiles(). However, if the Flume process has no permission to
read some of the matched files, updateTailFiles() throws an IOException and
interrupts the creation of the ReliableTaildirEventReader object. The following
logs show several matched files being opened and the exception thrown when
trying to open a file without permission.

------------logs--------------
25 Aug 2016 12:50:28,337 INFO  [lifecycleSupervisor-1-0] (org.apache.flume.source.taildir.ReliableTaildirEventReader.openFile:294)  - Opening file: /home/work/quota/test.2016-06-27-121958, inode: 91668538, pos: 0
25 Aug 2016 12:50:28,337 INFO  [lifecycleSupervisor-1-0] (org.apache.flume.source.taildir.ReliableTaildirEventReader.openFile:294)  - Opening file: /home/work/quota/test.2016-06-27-111458, inode: 91668558, pos: 0
25 Aug 2016 12:50:28,337 INFO  [lifecycleSupervisor-1-0] (org.apache.flume.source.taildir.ReliableTaildirEventReader.openFile:294)  - Opening file: /home/work/quota/test.2016-06-27-080458, inode: 91668536, pos: 0
25 Aug 2016 12:50:28,338 INFO  [lifecycleSupervisor-1-0] (org.apache.flume.source.taildir.ReliableTaildirEventReader.openFile:294)  - Opening file: /home/work/quota/test.2016-06-27-140958, inode: 91668655, pos: 0
25 Aug 2016 12:50:28,338 ERROR [lifecycleSupervisor-1-0] (org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run:253)  - Unable to start PollableSourceRunner: { source:Taildir source: { positionFile: /home/work/taildir_position.json, skipToEnd: false, byteOffsetHeader: false, idleTimeout: 120000, writePosInterval: 3000 } counterGroup:{ name:null counters:{} } } - Exception follows.
org.apache.flume.FlumeException: Failed opening file: /home/work/quota/test.2016-06-27-140958
        at org.apache.flume.source.taildir.ReliableTaildirEventReader.openFile(ReliableTaildirEventReader.java:297)
        at org.apache.flume.source.taildir.ReliableTaildirEventReader.updateTailFiles(ReliableTaildirEventReader.java:260)
        at org.apache.flume.source.taildir.ReliableTaildirEventReader.<init>(ReliableTaildirEventReader.java:94)
        at org.apache.flume.source.taildir.ReliableTaildirEventReader.<init>(ReliableTaildirEventReader.java:48)
        at org.apache.flume.source.taildir.ReliableTaildirEventReader$Builder.build(ReliableTaildirEventReader.java:361)
        at org.apache.flume.source.taildir.TaildirSource.start(TaildirSource.java:100)
        at org.apache.flume.source.PollableSourceRunner.start(PollableSourceRunner.java:72)
        at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
        at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
        at java.util.concurrent.FutureTask.runAndReset(Unknown Source)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown Source)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
Caused by: java.io.FileNotFoundException: /home/work/quota/test.2016-06-27-140958 (Permission denied)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(Unknown Source)
        at org.apache.flume.source.taildir.TailFile.<init>(TailFile.java:60)
        at org.apache.flume.source.taildir.ReliableTaildirEventReader.openFile(ReliableTaildirEventReader.java:295)
        ... 14 more
------------logs--------------


Because the creation of ReliableTaildirEventReader is interrupted, TaildirSource
catches the IOException and throws a FlumeException in its start() method, but
the files that were already opened are never closed, which leaks file
descriptors. PollableSourceRunner keeps starting TaildirSource, so over time the
number of open files in the Flume process exceeds the upper limit and the 'Too
many open files' error occurs.
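
The leak pattern described above (a constructor that opens several resources and
fails partway through) can be avoided by closing whatever was already opened
before rethrowing. The following is only a minimal sketch of that idea, not the
actual Flume code or a proposed patch: TrackedResource and SafeReader are
hypothetical names, and TrackedResource merely counts open handles in place of
a real file.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an open file handle; counts how many are open. */
final class TrackedResource implements Closeable {
    static int openCount = 0;
    private boolean closed = false;

    TrackedResource(String name, boolean readable) throws IOException {
        if (!readable) {
            // Mimics the "Permission denied" failure seen in the logs.
            throw new IOException(name + " (Permission denied)");
        }
        openCount++;
    }

    @Override
    public void close() {
        if (!closed) {
            closed = true;
            openCount--;
        }
    }
}

/** Hypothetical reader that opens many resources in its constructor. */
final class SafeReader implements Closeable {
    private final List<TrackedResource> opened = new ArrayList<>();

    SafeReader(String[] names, boolean[] readable) throws IOException {
        try {
            for (int i = 0; i < names.length; i++) {
                opened.add(new TrackedResource(names[i], readable[i]));
            }
        } catch (IOException e) {
            // Construction failed: release everything opened so far,
            // otherwise each retry would leak those handles.
            close();
            throw e;
        }
    }

    @Override
    public void close() {
        for (TrackedResource r : opened) {
            r.close();
        }
        opened.clear();
    }
}
```

With this structure, a failed construction leaves no handles open, so repeated
start attempts do not accumulate descriptors.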

  was:
TaildirSource creates a ReliableTaildirEventReader object in its start() method,
and the constructor of ReliableTaildirEventReader tries to open the matched
files in updateTailFiles(). However, if the Flume process has no permission to
read some of the matched files, updateTailFiles() throws an IOException and
interrupts the creation of the ReliableTaildirEventReader object. The following
logs show several matched files being opened and the exception thrown when
trying to open a file without permission.

------------logs--------------
25 Aug 2016 12:50:28,337 INFO  [lifecycleSupervisor-1-0] (org.apache.flume.source.taildir.ReliableTaildirEventReader.openFile:294)  - Opening file: /home/work/quota/test.2016-06-27-121958, inode: 91668538, pos: 0
25 Aug 2016 12:50:28,337 INFO  [lifecycleSupervisor-1-0] (org.apache.flume.source.taildir.ReliableTaildirEventReader.openFile:294)  - Opening file: /home/work/quota/test.2016-06-27-111458, inode: 91668558, pos: 0
25 Aug 2016 12:50:28,337 INFO  [lifecycleSupervisor-1-0] (org.apache.flume.source.taildir.ReliableTaildirEventReader.openFile:294)  - Opening file: /home/work/quota/test.2016-06-27-080458, inode: 91668536, pos: 0
25 Aug 2016 12:50:28,338 INFO  [lifecycleSupervisor-1-0] (org.apache.flume.source.taildir.ReliableTaildirEventReader.openFile:294)  - Opening file: /home/work/quota/test.2016-06-27-140958, inode: 91668655, pos: 0
25 Aug 2016 12:50:28,338 ERROR [lifecycleSupervisor-1-0] (org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run:253)  - Unable to start PollableSourceRunner: { source:Taildir source: { positionFile: /opt/bls/.bls/flume/taildir_position/6d124945-0f73-42ed-7178-56a2168fc5f5/taildir_position.json, skipToEnd: false, byteOffsetHeader: false, idleTimeout: 120000, writePosInterval: 3000 } counterGroup:{ name:null counters:{} } } - Exception follows.
org.apache.flume.FlumeException: Failed opening file: /home/work/quota/test.2016-06-27-140958
        at org.apache.flume.source.taildir.ReliableTaildirEventReader.openFile(ReliableTaildirEventReader.java:297)
        at org.apache.flume.source.taildir.ReliableTaildirEventReader.updateTailFiles(ReliableTaildirEventReader.java:260)
        at org.apache.flume.source.taildir.ReliableTaildirEventReader.<init>(ReliableTaildirEventReader.java:94)
        at org.apache.flume.source.taildir.ReliableTaildirEventReader.<init>(ReliableTaildirEventReader.java:48)
        at org.apache.flume.source.taildir.ReliableTaildirEventReader$Builder.build(ReliableTaildirEventReader.java:361)
        at org.apache.flume.source.taildir.TaildirSource.start(TaildirSource.java:100)
        at org.apache.flume.source.PollableSourceRunner.start(PollableSourceRunner.java:72)
        at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
        at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
        at java.util.concurrent.FutureTask.runAndReset(Unknown Source)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown Source)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
Caused by: java.io.FileNotFoundException: /home/work/quota/test.2016-06-27-140958 (Permission denied)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(Unknown Source)
        at org.apache.flume.source.taildir.TailFile.<init>(TailFile.java:60)
        at org.apache.flume.source.taildir.ReliableTaildirEventReader.openFile(ReliableTaildirEventReader.java:295)
        ... 14 more
------------logs--------------


Because the creation of ReliableTaildirEventReader is interrupted, TaildirSource
catches the IOException and throws a FlumeException in its start() method, but
the files that were already opened are never closed, which leaks file
descriptors. PollableSourceRunner keeps starting TaildirSource, so over time the
number of open files in the Flume process exceeds the upper limit and the 'Too
many open files' error occurs.


> File descriptor leaks in TaildirSource
> --------------------------------------
>
>                 Key: FLUME-2979
>                 URL: https://issues.apache.org/jira/browse/FLUME-2979
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: v1.7.0
>            Reporter: Kun Liu



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
