This seems to be the case. I don't think there is any specific reason not to read across the block boundary...
Even if HDFS does read across block boundaries, it is still not a good idea to ignore the JavaDoc for read(): it may return fewer bytes than you asked for. If you want all the bytes read, you should use a while loop or one of the readFully() variants. For example, if you later change your code by wrapping a BufferedInputStream around 'in', you could still get partial reads even though HDFS itself read all the data.
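Something like the sketch below, for instance (a minimal example only, not your exact code; the class name is made up, and the seek offset and buffer size are taken from your snippet further down). readFully() is inherited from DataInputStream, and the loop shows the equivalent 'while' style:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadAcrossBlock {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    FSDataInputStream fin = fs.open(new Path(args[0]));
    fin.seek(64L * 1024 * 1024 - 10);

    byte[] buffer = new byte[32 * 1024];

    // Option 1: readFully() fills the whole buffer or throws
    // EOFException if the stream ends first.
    // fin.readFully(buffer);

    // Option 2: loop until the buffer is full or EOF is reached.
    int off = 0;
    while (off < buffer.length) {
      int n = fin.read(buffer, off, buffer.length - off);
      if (n < 0) {
        break; // end of file
      }
      off += n;
    }

    System.out.println(off); // total bytes actually read
    fin.close();
  }
}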
Raghu.

forbbs forbbs wrote:
The Hadoop version is 0.19.0. My file is larger than 64MB, and the block size is 64MB. The output of the code below is '10'. May I read across the block boundary, or should I use 'while (left..){}' style code?

public static void main(String[] args) throws IOException {
  Configuration conf = new Configuration();
  FileSystem fs = FileSystem.get(conf);
  FSDataInputStream fin = fs.open(new Path(args[0]));
  fin.seek(64*1024*1024 - 10);
  byte[] buffer = new byte[32*1024];
  int len = fin.read(buffer);
  //int len = fin.read(buffer, 0, 128);
  System.out.println(len);
  fin.close();
}