Also, read() returning -1 is not an error; it means EOF. This is the same behavior as the regular Java InputStream.
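If you want to pick up data that the writer flushed after your reader opened the stream, the tail-style loop Harsh pointed to boils down to something like this (an untested sketch; the class name, buffer size and 5-second poll interval are just placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class TailReader {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path path = new Path(args[0]);          // file another process appends to
    FileSystem fs = path.getFileSystem(conf);

    byte[] buf = new byte[8192];
    long pos = 0;                           // offset of the next byte to read

    FSDataInputStream in = fs.open(path);
    while (true) {
      int n = in.read(buf);
      if (n > 0) {
        pos += n;
        // ... process buf[0..n) ...
      } else {
        // -1 means EOF, not an error: a reader only sees the file length
        // as of the time it opened the stream. Reopen and seek to catch
        // up with data the writer has flushed since then.
        in.close();
        Thread.sleep(5000);
        in = fs.open(path);
        in.seek(pos);
      }
    }
  }
}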
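And for reference, the writer side you describe below is roughly the following (again only a sketch; on CDH4 the call is hflush(), on the older CDH3 API it was sync()):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ChunkWriter {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path path = new Path(args[0]);
    FileSystem fs = path.getFileSystem(conf);

    byte[] chunk = new byte[8 * 1024 * 1024];   // 8 MB per write
    FSDataOutputStream out = fs.create(path);
    try {
      for (long written = 0; written < 100L * 1024 * 1024; written += chunk.length) {
        out.write(chunk);
        out.hflush();          // make the new length visible to *new* readers
        Thread.sleep(5000);    // 5 seconds between writes
      }
    } finally {
      out.close();
    }
  }
}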
best,
Colin

On Thu, Dec 20, 2012 at 10:32 AM, Christoph Rupp <ch...@crupp.de> wrote:
> Thank you, Harsh. I appreciate it.
>
> 2012/12/20 Harsh J <ha...@cloudera.com>
>
>> Hi Christoph,
>>
>> If you use sync/hflush/hsync, the new length of the data is only seen
>> by a new reader, not an existing reader. The "workaround" you've done
>> is exactly how we've implemented the "fs -tail <file>" utility. See
>> the code for that at
>> http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/Tail.java?view=markup
>> (note the looping at around line 74).
>>
>> On Thu, Dec 20, 2012 at 5:51 PM, Christoph Rupp <ch...@crupp.de> wrote:
>> > Hi,
>> >
>> > I am experiencing an unexpected situation where FSDataInputStream.read()
>> > returns -1 while reading data from a file that another process is still
>> > appending to. According to the documentation, read() should never return
>> > -1 but should throw exceptions on errors. In addition, there is more
>> > data available, so read() definitely should not fail.
>> >
>> > The problem gets worse because the FSDataInputStream is not able to
>> > recover from this. Once it returns -1, it will always return -1, even
>> > if the file continues growing.
>> >
>> > If, at the same time, other Java processes read other HDFS files, they
>> > will also return -1 immediately after opening the file. It smells like
>> > this error gets propagated to other client processes as well.
>> >
>> > I found a workaround: close the FSDataInputStream, open it again and
>> > then seek to the previous position. After that, reading works fine.
>> >
>> > Another problem I have seen is that the FSDataInputStream returns -1
>> > when reaching EOF. It never returns 0 (which I would expect when
>> > reaching EOF).
>> >
>> > I use CDH 4.1.2, but I also saw this with CDH 3u5. I have attached
>> > samples to reproduce this.
>> >
>> > My cluster consists of 4 machines: 1 namenode and 3 datanodes. I run
>> > my tests on the namenode machine. There are no other HDFS users, and
>> > the load generated by my tests is fairly low, I would say.
>> >
>> > One process writes to 6 files simultaneously, with a 5-second sleep
>> > between writes. It uses an FSDataOutputStream, and after writing data
>> > it calls sync(). Each write() appends 8 MB; it stops when the file
>> > grows to 100 MB.
>> >
>> > Six processes read files; each process reads one file. At first each
>> > reader loops until the file exists. If it does, it opens the
>> > FSDataInputStream and starts reading. Usually the first process returns
>> > the first 8 MB of the file before it starts returning -1, but the other
>> > processes immediately return -1 without reading any data. I start the
>> > 6 reader processes before I start the writer.
>> >
>> > Search HdfsReader.java for "WORKAROUND" and remove the comments; this
>> > will reopen the FSDataInputStream after -1 is returned, and then
>> > everything works.
>> >
>> > Sources are attached.
>> >
>> > This is a very basic scenario, and I wonder if I am doing anything
>> > wrong or if I have found an HDFS bug.
>> >
>> > bye
>> > Christoph
>> >
>>
>>
>>
>> --
>> Harsh J
>>