HDFS File API should be extended to include positional read
-----------------------------------------------------------

                 Key: HADOOP-519
                 URL: http://issues.apache.org/jira/browse/HADOOP-519
             Project: Hadoop
          Issue Type: New Feature
          Components: dfs
    Affects Versions: 0.6.0
         Environment: All
            Reporter: Milind Bhandarkar
         Assigned To: Milind Bhandarkar
             Fix For: 0.7.0


HDFS input streams should support positional read. A positional read (such as the 
pread syscall on Linux) reads from a specified offset without affecting 
the current file offset. Since the underlying file state is not touched, pread 
can be used efficiently in multi-threaded programs.
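
For comparison, the JDK already offers pread-like semantics on local files
through FileChannel.read(ByteBuffer, long). The sketch below is only an
illustration of the behaviour being asked for, not HDFS code; the class name
PreadDemo and the helper readAt are made up for this example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class: positional read on a local file via java.nio.
class PreadDemo {
    // Reads up to `length` bytes starting at `position`.
    // FileChannel.read(ByteBuffer, long) does not move the channel's own position.
    static byte[] readAt(FileChannel channel, long position, int length) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(length);
        while (buf.hasRemaining()) {
            int n = channel.read(buf, position + buf.position());
            if (n < 0) {
                break; // end of file reached before `length` bytes
            }
        }
        byte[] out = new byte[buf.position()];
        buf.flip();
        buf.get(out);
        return out;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("pread-demo", ".txt");
        try {
            Files.write(tmp, "0123456789".getBytes(StandardCharsets.US_ASCII));
            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                long before = ch.position();
                byte[] data = readAt(ch, 4, 3);
                System.out.println(new String(data, StandardCharsets.US_ASCII)); // prints 456
                System.out.println(ch.position() == before);                     // prints true
            }
        } finally {
            Files.delete(tmp);
        }
    }
}
```

Because the channel position is never modified, several threads can call
readAt on the same channel without coordinating with each other.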

Here is how I plan to implement it.

Provide a PositionedReadable interface with the following methods:

int read(long position, byte[] buffer, int offset, int length);
void readFully(long position, byte[] buffer, int offset, int length);
void readFully(long position, byte[] buffer);

The abstract class FSInputStream would provide a default implementation of the 
above methods using the getPos(), seek() and read() methods. The default 
implementation is inefficient in multi-threaded programs since it locks the 
object while seeking, reading, and restoring the old position.
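
A minimal sketch of what such a default implementation could look like, not
the forthcoming patch. The interface methods are the ones listed above; the
"throws IOException" clauses, the class names SeekableSketchStream and
ByteArraySketchStream, and the in-memory backing array are assumptions made
so the example is self-contained.

```java
import java.io.EOFException;
import java.io.IOException;

// Interface as proposed above; the throws clauses are an assumption.
interface PositionedReadable {
    int read(long position, byte[] buffer, int offset, int length) throws IOException;
    void readFully(long position, byte[] buffer, int offset, int length) throws IOException;
    void readFully(long position, byte[] buffer) throws IOException;
}

// Hypothetical stand-in for an abstract seekable stream such as FSInputStream.
abstract class SeekableSketchStream implements PositionedReadable {
    abstract void seek(long pos) throws IOException;
    abstract long getPos() throws IOException;
    abstract int read(byte[] b, int off, int len) throws IOException;

    // Default positional read: lock, remember position, seek, read, restore.
    // The lock is what serializes concurrent callers.
    public int read(long position, byte[] buffer, int offset, int length) throws IOException {
        synchronized (this) {
            long oldPos = getPos();
            try {
                seek(position);
                return read(buffer, offset, length);
            } finally {
                seek(oldPos);
            }
        }
    }

    // Keeps issuing positional reads until `length` bytes have arrived.
    public void readFully(long position, byte[] buffer, int offset, int length) throws IOException {
        int done = 0;
        while (done < length) {
            int n = read(position + done, buffer, offset + done, length - done);
            if (n < 0) {
                throw new EOFException("End of stream before " + length + " bytes were read");
            }
            done += n;
        }
    }

    public void readFully(long position, byte[] buffer) throws IOException {
        readFully(position, buffer, 0, buffer.length);
    }
}

// Hypothetical in-memory subclass, used only to exercise the default methods.
class ByteArraySketchStream extends SeekableSketchStream {
    private final byte[] data;
    private int pos;

    ByteArraySketchStream(byte[] data) {
        this.data = data;
    }

    synchronized void seek(long p) throws IOException {
        if (p < 0 || p > data.length) {
            throw new IOException("seek out of range: " + p);
        }
        pos = (int) p;
    }

    synchronized long getPos() {
        return pos;
    }

    synchronized int read(byte[] b, int off, int len) {
        if (pos >= data.length) {
            return -1; // end of stream
        }
        int n = Math.min(len, data.length - pos);
        System.arraycopy(data, pos, b, off, n);
        pos += n;
        return n;
    }
}
```

After a positional read the stream's own position is back where it was, but
every caller has to wait for the lock, which is the inefficiency described
above.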

DFSClient.DFSInputStream, which extends FSInputStream, will provide an efficient 
non-synchronized implementation of the above calls.
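
To illustrate why a non-synchronized version is possible, here is a sketch in
which a positional read keeps all of its state in local variables and never
touches a shared cursor. This is not DFSClient code: the classes
StatelessPositionedReader and ConcurrentPreadDemo are hypothetical, and an
immutable byte array stands in for the remote block data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical reader: no shared position, so no lock is needed.
class StatelessPositionedReader {
    private final byte[] data; // stand-in for data fetched from datanodes

    StatelessPositionedReader(byte[] data) {
        this.data = data.clone();
    }

    // Copies up to `length` bytes starting at `position`; returns -1 at end of data.
    int read(long position, byte[] buffer, int offset, int length) {
        if (position < 0) {
            throw new IllegalArgumentException("negative position: " + position);
        }
        if (position >= data.length) {
            return -1;
        }
        int n = (int) Math.min(length, data.length - position);
        System.arraycopy(data, (int) position, buffer, offset, n);
        return n;
    }
}

// Hypothetical driver: four threads read disjoint ranges concurrently.
class ConcurrentPreadDemo {
    static boolean runDemo() throws Exception {
        byte[] data = new byte[1024];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        StatelessPositionedReader reader = new StatelessPositionedReader(data);
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int t = 0; t < 4; t++) {
                final int start = t * 256;
                results.add(pool.submit(() -> {
                    byte[] buf = new byte[256];
                    int n = reader.read(start, buf, 0, buf.length);
                    if (n != 256) {
                        return false;
                    }
                    for (int i = 0; i < n; i++) {
                        if (buf[i] != (byte) (start + i)) {
                            return false;
                        }
                    }
                    return true;
                }));
            }
            boolean ok = true;
            for (Future<Boolean> f : results) {
                ok &= f.get();
            }
            return ok;
        } finally {
            pool.shutdown();
        }
    }
}
```

A real DFS implementation would need per-call connections or buffers instead
of a shared array, but the idea is the same: nothing a positional read
modifies is visible to other callers.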

In addition, FSDataInputStream, which is a wrapper around FSInputStream, will 
provide wrapper methods for the above read methods as well.

Patch forthcoming early next week.


        
