[ https://issues.apache.org/jira/browse/KAFKA-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15352386#comment-15352386 ]
ASF GitHub Bot commented on KAFKA-3904:
---------------------------------------

GitHub user HenryCaiHaiying opened a pull request:

    https://github.com/apache/kafka/pull/1563

    KAFKA-3904: File descriptor leaking (Too many open files) for long running stream process

    I noticed that when my application had been running for more than one day, I would get a 'Too many open files' error. I used 'lsof' to list all the file descriptors used by the process: there were over 32K, and most of them belonged to the .lock file; a single lock file showed up 2700 times.

    I looked at the code, and I think the problem is in:

        FileChannel channel = new RandomAccessFile(lockFile, "rw").getChannel();

    Each time new RandomAccessFile is called, a new fd is created. Fix this by caching the FileChannels created so far.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HenryCaiHaiying/kafka fd

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/1563.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1563

----
commit bb68fb8b820e4a0baef0c549ea1a4a8cfc913187
Author: Henry Cai <h...@pinterest.com>
Date:   2016-06-28T05:24:03Z

    KAFKA-3904: File descriptor leaking (Too many open files) for long running stream process
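The fix described above (reuse one channel per lock file instead of opening a new RandomAccessFile each time) can be sketched roughly as follows. This is a simplified illustration, not the actual patch in PR #1563; the class and method names are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the caching approach: keep at most one open
// FileChannel (and therefore one fd) per lock file, instead of opening
// a fresh RandomAccessFile on every lock attempt.
public class LockChannelCache {
    private final Map<String, FileChannel> channels = new HashMap<>();

    public synchronized FileChannel channelFor(File lockFile) throws IOException {
        String key = lockFile.getCanonicalPath();
        FileChannel channel = channels.get(key);
        if (channel == null || !channel.isOpen()) {
            // Open once and remember; subsequent calls reuse the same fd.
            channel = FileChannel.open(lockFile.toPath(),
                    StandardOpenOption.CREATE,
                    StandardOpenOption.READ,
                    StandardOpenOption.WRITE);
            channels.put(key, channel);
        }
        return channel;
    }

    public synchronized void close() throws IOException {
        for (FileChannel channel : channels.values()) {
            channel.close();
        }
        channels.clear();
    }
}
```

With this shape, repeated lock attempts against the same state directory return the same channel, so the fd count stays flat no matter how long the stream process runs.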
----

> File descriptor leaking (Too many open files) for long running stream process
> -----------------------------------------------------------------------------
>
>                 Key: KAFKA-3904
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3904
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>            Reporter: Henry Cai
>            Assignee: Henry Cai
>              Labels: architecture, newbie
>
> I noticed that when my application ran long (> 1 day), I would get a 'Too many open files' error.
> I used 'lsof' to list all the file descriptors used by the process: there were over 32K, and most of them belonged to the .lock file; this same lock file showed up 2700 times.
> I looked at the code, and I think the problem is in:
>     File lockFile = new File(stateDir, ProcessorStateManager.LOCK_FILE_NAME);
>     FileChannel channel = new RandomAccessFile(lockFile, "rw").getChannel();
> Each time new RandomAccessFile is called, a new fd is created; we should probably either close or reuse this RandomAccessFile object.
> lsof result:
>     java 14799 hcai *740u REG 9,0 0 2415928585 /mnt/stream/join/rocksdb/ads-demo-30/0_16/.lock
>     java 14799 hcai *743u REG 9,0 0 2415928585 /mnt/stream/join/rocksdb/ads-demo-30/0_16/.lock
>     java 14799 hcai *746u REG 9,0 0 2415928585 /mnt/stream/join/rocksdb/ads-demo-30/0_16/.lock
>     java 14799 hcai *755u REG 9,0 0 2415928585 /mnt/stream/join/rocksdb/ads-demo-30/0_16/.lock
>     hcai@teststream02001:~$ lsof -p 14799 | grep lock | grep 0_16 | wc
>         2709   24381  319662

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
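The leak pattern quoted in the issue can be reproduced in isolation. The sketch below (hypothetical class name, not Kafka code) shows that every `new RandomAccessFile(...).getChannel()` call on the same file yields a distinct open channel, i.e. a distinct fd, which is exactly why lsof lists the same .lock file thousands of times when nothing closes them.

```java
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.util.ArrayList;
import java.util.List;

// Demonstrates the leak: each call opens a brand-new FileChannel on the
// same file, and none of them is closed by the caller.
public class FdLeakDemo {
    public static List<FileChannel> openRepeatedly(File lockFile, int times) throws Exception {
        List<FileChannel> channels = new ArrayList<>();
        for (int i = 0; i < times; i++) {
            // A new RandomAccessFile means a new underlying fd every time.
            channels.add(new RandomAccessFile(lockFile, "rw").getChannel());
        }
        return channels;
    }

    public static void main(String[] args) throws Exception {
        File lockFile = File.createTempFile("demo", ".lock");
        List<FileChannel> channels = openRepeatedly(lockFile, 5);
        // Five distinct open channels on one file: five fds consumed.
        System.out.println(channels.size());
        for (FileChannel c : channels) {
            c.close();
        }
        lockFile.delete();
    }
}
```

In a long-running stream process that reattempts the lock on every rebalance, this accumulation eventually exhausts the process fd limit, matching the 32K descriptors observed above.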