Hi. Yes, it happens with 0.18.3.
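If I understand HADOOP-4346 correctly, that would also explain the exact pattern in my lsof output below: on Linux, every java.nio Selector is backed by one eventpoll fd plus a wakeup pipe pair, so each selector the JDK caches accounts for one pipe/pipe/eventpoll triplet. This standalone sketch (not Hadoop code, just my assumption about where the fd's come from) reproduces the same triplets:

    import java.nio.channels.Selector;

    public class SelectorFdDemo {
        public static void main(String[] args) throws Exception {
            // Each open() creates one eventpoll fd plus a wakeup pipe pair
            // on Linux - run `lsof -p <pid>` while this sleeps to see the
            // same pipe/pipe/eventpoll triplets as in my client.
            Selector[] selectors = new Selector[100];
            for (int i = 0; i < selectors.length; i++) {
                selectors[i] = Selector.open();
            }
            Thread.sleep(60000); // inspect the fd's with lsof during this pause

            // Without an explicit close(), these fd's linger until the
            // selectors are garbage collected - the behavior HADOOP-4346
            // works around with Hadoop's own selector cache.
            for (Selector s : selectors) {
                s.close();
            }
        }
    }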
I'm now closing every FSData stream I receive from HDFS, so the number of
open fd's in the DataNode has gone down. The problem is that my own DFS
client still has a high number of fd's open, mostly pipes and epolls. They
sometimes drop quickly to the level of ~400-500, and sometimes just get
stuck at ~1000.

I'm still trying to find out how well it behaves if I set the maximum fd
number to 65K.

Regards.

2009/6/22 Raghu Angadi <rang...@yahoo-inc.com>

> Is this before 0.20.0? Assuming you have closed these streams, it is
> mostly https://issues.apache.org/jira/browse/HADOOP-4346
>
> It is the JDK internal implementation that depends on GC to free up its
> cache of selectors. HADOOP-4346 avoids this by using Hadoop's own cache.
>
> Raghu.
>
> Stas Oskin wrote:
>
>> Hi.
>>
>> After tracing some more with the lsof utility, I managed to stop the
>> fd growth in the DataNode process, but I still have issues with my DFS
>> client.
>>
>> It seems that my DFS client opens hundreds of pipes and eventpolls. Here
>> is a small part of the lsof output:
>>
>> java 10508 root 387w FIFO 0,6    6142565 pipe
>> java 10508 root 388r FIFO 0,6    6142565 pipe
>> java 10508 root 389u 0000 0,10 0 6142566 eventpoll
>> java 10508 root 390u FIFO 0,6    6135311 pipe
>> java 10508 root 391r FIFO 0,6    6135311 pipe
>> java 10508 root 392u 0000 0,10 0 6135312 eventpoll
>> java 10508 root 393r FIFO 0,6    6148234 pipe
>> java 10508 root 394w FIFO 0,6    6142570 pipe
>> java 10508 root 395r FIFO 0,6    6135857 pipe
>> java 10508 root 396r FIFO 0,6    6142570 pipe
>> java 10508 root 397r 0000 0,10 0 6142571 eventpoll
>> java 10508 root 398u FIFO 0,6    6135319 pipe
>> java 10508 root 399w FIFO 0,6    6135319 pipe
>>
>> I'm using FSDataInputStream and FSDataOutputStream, so this might be
>> related to the pipes?
>>
>> So, my questions are:
>>
>> 1) What causes these pipes/epolls to appear?
>>
>> 2) More importantly, how can I prevent their accumulation and growth?
>>
>> Thanks in advance!
>>
>> 2009/6/21 Stas Oskin <stas.os...@gmail.com>
>>
>>> Hi.
>>>
>>> I have an HDFS client and an HDFS datanode running on the same machine.
>>>
>>> When I try to access a dozen files at once from the client, several
>>> times in a row, I start receiving the following errors on the client
>>> and in the HDFS browse function:
>>>
>>> HDFS client: "Could not get block locations. Aborting..."
>>> HDFS browse: "Too many open files"
>>>
>>> I can increase the maximum number of files that can be opened, as I
>>> have it set to the default 1024, but I would like to solve the problem
>>> first, as a larger value just means it would run out of files again
>>> later on.
>>>
>>> So my questions are:
>>>
>>> 1) Does the HDFS datanode keep any files open, even after the HDFS
>>> client has already closed them?
>>>
>>> 2) Is it possible to find out who keeps the files open - the datanode
>>> or the client (so I can pinpoint the source of the problem)?
>>>
>>> Thanks in advance!
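P.S. For reference, here is the close pattern I've switched to on the
client side - a minimal sketch with a placeholder path, closing the stream
in a finally block so its fd's get released even when a read fails:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ReadAndClose {
        public static void main(String[] args) throws IOException {
            FileSystem fs = FileSystem.get(new Configuration());
            // Placeholder path, just for illustration.
            FSDataInputStream in = fs.open(new Path("/tmp/example"));
            try {
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) > 0) {
                    // process buf[0..n) here
                }
            } finally {
                in.close(); // always release the stream, even on exceptions
            }
        }
    }

The same try/finally wrapping applies to every FSDataOutputStream I write to.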