Yes. Otherwise the file descriptors will flow away like water. I also strongly suggest having at least 64k file descriptors as the open file limit.
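For illustration, a minimal sketch of what explicit cleanup looks like with the FileSystem API; the path, buffer handling, and class name below are placeholders, and IOUtils.closeStream is just one null-safe way to do the close:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class ExplicitCloseExample {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Open the stream, read it, and close it in a finally block so the
        // descriptors are released even if the read throws.
        FSDataInputStream in = null;
        try {
            in = fs.open(new Path("/some/hdfs/file"));  // placeholder path
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                // process buf[0..n) here
            }
        } finally {
            IOUtils.closeStream(in);  // null-safe close; never rely on the finalizer
        }
    }
}

With try/finally like this (or try-with-resources on Java 7+), the descriptors are released as soon as the stream is no longer needed, instead of whenever the finalizer happens to run.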
On Sun, Jun 21, 2009 at 12:43 PM, Stas Oskin <stas.os...@gmail.com> wrote:

> Hi.
>
> Thanks for the advice. So you advise explicitly closing each and every file
> handle that I receive from HDFS?
>
> Regards.
>
> 2009/6/21 jason hadoop <jason.had...@gmail.com>
>
> > Just to be clear, I second Brian's opinion. Relying on finalizers is a
> > very good way to run out of file descriptors.
> >
> > On Sun, Jun 21, 2009 at 9:32 AM, <brian.lev...@nokia.com> wrote:
> >
> > > IMHO, you should never rely on finalizers to release scarce resources,
> > > since you don't know when the finalizer will get called, if ever.
> > >
> > > -brian
> > >
> > > -----Original Message-----
> > > From: ext jason hadoop [mailto:jason.had...@gmail.com]
> > > Sent: Sunday, June 21, 2009 11:19 AM
> > > To: core-user@hadoop.apache.org
> > > Subject: Re: "Too many open files" error, which gets resolved after some time
> > >
> > > The HDFS/DFS client uses quite a few file descriptors for each open file.
> > >
> > > Many application developers (but not the Hadoop core) rely on the JVM
> > > finalizer methods to close open files.
> > >
> > > This combination, especially when many HDFS files are open, can result in
> > > very large demands for file descriptors from Hadoop clients.
> > > As a general rule, we never run a cluster with nofile less than 64k, and
> > > for larger clusters with demanding applications we have had it set 10x
> > > higher. I also believe there was a set of JVM versions that leaked file
> > > descriptors used for NIO in the HDFS core. I do not recall the exact
> > > details.
> > >
> > > On Sun, Jun 21, 2009 at 5:27 AM, Stas Oskin <stas.os...@gmail.com> wrote:
> > >
> > > > Hi.
> > > >
> > > > After tracing some more with the lsof utility, I managed to stop the
> > > > growth on the DataNode process, but I still have issues with my DFS
> > > > client.
> > > >
> > > > It seems that my DFS client opens hundreds of pipes and eventpolls.
> > > > Here is a small part of the lsof output:
> > > >
> > > > java 10508 root 387w FIFO 0,6 6142565 pipe
> > > > java 10508 root 388r FIFO 0,6 6142565 pipe
> > > > java 10508 root 389u 0000 0,10 0 6142566 eventpoll
> > > > java 10508 root 390u FIFO 0,6 6135311 pipe
> > > > java 10508 root 391r FIFO 0,6 6135311 pipe
> > > > java 10508 root 392u 0000 0,10 0 6135312 eventpoll
> > > > java 10508 root 393r FIFO 0,6 6148234 pipe
> > > > java 10508 root 394w FIFO 0,6 6142570 pipe
> > > > java 10508 root 395r FIFO 0,6 6135857 pipe
> > > > java 10508 root 396r FIFO 0,6 6142570 pipe
> > > > java 10508 root 397r 0000 0,10 0 6142571 eventpoll
> > > > java 10508 root 398u FIFO 0,6 6135319 pipe
> > > > java 10508 root 399w FIFO 0,6 6135319 pipe
> > > >
> > > > I'm using FSDataInputStream and FSDataOutputStream, so this might be
> > > > related to the pipes?
> > > >
> > > > So, my questions are:
> > > >
> > > > 1) What causes these pipes/epolls to appear?
> > > >
> > > > 2) More importantly, how can I prevent their accumulation and growth?
> > > >
> > > > Thanks in advance!
> > > >
> > > > 2009/6/21 Stas Oskin <stas.os...@gmail.com>
> > > >
> > > > > Hi.
> > > > >
> > > > > I have an HDFS client and an HDFS datanode running on the same
> > > > > machine.
> > > > >
> > > > > When I try to access a dozen files at once from the client, several
> > > > > times in a row, I start to receive the following errors on the
> > > > > client and in the HDFS browse function.
> > > > >
> > > > > HDFS Client: "Could not get block locations. Aborting..."
> > > > > HDFS browse: "Too many open files"
> > > > >
> > > > > I can increase the maximum number of files that can be opened, as I
> > > > > have it set to the default of 1024, but I would like to solve the
> > > > > underlying problem first, as a larger value just means it would run
> > > > > out of files again later on.
> > > > >
> > > > > So my questions are:
> > > > >
> > > > > 1) Does the HDFS datanode keep any files open, even after the HDFS
> > > > > client has already closed them?
> > > > >
> > > > > 2) Is it possible to find out who keeps the files open - the
> > > > > datanode or the client (so I could pinpoint the source of the
> > > > > problem)?
> > > > >
> > > > > Thanks in advance!

--
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals
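As a companion to the lsof tracing discussed up-thread, here is a small Linux-only sketch for counting a JVM's own open descriptors from inside the process; /proc/self/fd is a Linux-specific assumption and the class name is purely illustrative:

import java.io.File;

public class FdCount {
    // Linux-specific: each entry under /proc/self/fd is one open descriptor
    // (regular files, sockets, pipes, and epoll instances all show up here).
    public static int openDescriptors() {
        String[] fds = new File("/proc/self/fd").list();
        return fds == null ? -1 : fds.length;  // -1 if /proc is unavailable
    }

    public static void main(String[] args) {
        System.out.println("Open file descriptors: " + openDescriptors());
    }
}

Logging this number before and after closing streams in the DFS client makes it easy to see whether the client side is leaking, while running lsof against the DataNode pid covers the other half of the question.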