First, this documentation:

    http://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/FSDataInputStream.html

claims that FSDataInputStream has a seek() method, but javap doesn't show one:

    $ javap -classpath [hadoopjars] org.apache.hadoop.fs.FSDataInputStream
    Compiled from "FSDataInputStream.java"
    public class org.apache.hadoop.fs.FSDataInputStream extends java.io.DataInputStream implements org.apache.hadoop.fs.Seekable,org.apache.hadoop.fs.PositionedReadable,java.io.Closeable,org.apache.hadoop.fs.ByteBufferReadable,org.apache.hadoop.fs.HasFileDescriptor,org.apache.hadoop.fs.CanSetDropBehind,org.apache.hadoop.fs.CanSetReadahead {
      public org.apache.hadoop.fs.FSDataInputStream(java.io.InputStream) throws java.io.IOException;
      public synchronized void seek(long) throws java.io.IOException;
      public long getPos() throws java.io.IOException;
      public int read(long, byte[], int, int) throws java.io.IOException;
      public void readFully(long, byte[], int, int) throws java.io.IOException;
      public void readFully(long, byte[]) throws java.io.IOException;
      public boolean seekToNewSource(long) throws java.io.IOException;
      public java.io.InputStream getWrappedStream();
      public int read(java.nio.ByteBuffer) throws java.io.IOException;
      public java.io.FileDescriptor getFileDescriptor() throws java.io.IOException;
      public void setReadahead(java.lang.Long) throws java.io.IOException, java.lang.UnsupportedOperationException;
      public void setDropBehind(java.lang.Boolean) throws java.io.IOException, java.lang.UnsupportedOperationException;
    }

Second, after every call to inputStream.read(position, byteArray, 0, size), the getPos() call returns the same answer. Should it change?

Given the lack of all these things, how is one supposed to call read(ByteBuffer) for random I/O?

john
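
To make the two access patterns in question concrete, here is a minimal sketch. It deliberately uses java.nio.channels.FileChannel from the JDK as a stand-in so that it runs without any Hadoop jars; the mapping onto FSDataInputStream in the comments is an assumption, based on the Hadoop javadoc for PositionedReadable stating that a positional read does not change the current offset of the file. The class name RandomReadSketch and the helper names are hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class RandomReadSketch {

    /** Positional read: the channel's own position is left untouched. */
    static long positionAfterPositionalRead(Path file, long offset, int size) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(size);
            // Assumed analogue of inputStream.read(position, byteArray, 0, size)
            ch.read(buf, offset);
            // Assumed analogue of inputStream.getPos()
            return ch.position();
        }
    }

    /** Seek first, then fill a ByteBuffer from the current position. */
    static String seekThenRead(Path file, long offset, int size) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // Assumed analogue of inputStream.seek(offset)
            ch.position(offset);
            ByteBuffer buf = ByteBuffer.allocate(size);
            // Assumed analogue of inputStream.read(ByteBuffer); a single call
            // may return fewer bytes than requested, so loop until full or EOF.
            while (buf.hasRemaining() && ch.read(buf) >= 0) {
                // keep reading
            }
            buf.flip();
            return StandardCharsets.US_ASCII.decode(buf).toString();
        }
    }

    /** Writes a 16-byte sample file and returns its path. */
    static Path writeSample() throws IOException {
        Path file = Files.createTempFile("random-read", ".txt");
        Files.write(file, "0123456789abcdef".getBytes(StandardCharsets.US_ASCII));
        return file;
    }

    public static void main(String[] args) throws IOException {
        Path file = writeSample();
        System.out.println(positionAfterPositionalRead(file, 10, 4));
        System.out.println(seekThenRead(file, 10, 4));
        Files.delete(file);
    }
}
```

In this stand-in, the position reported after a positional read stays where it was, while the seek-then-read form advances it. If FSDataInputStream follows the same contract, an unchanged getPos() after read(position, ...) would be expected, and random I/O with read(ByteBuffer) would go through seek() first.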