[
https://issues.apache.org/jira/browse/HDFS-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15374535#comment-15374535
]
Ravikumar commented on HDFS-3051:
---------------------------------
How about returning the MappedByteBuffers of all blocks of a file when every
block is local? If any block is non-local, the method can simply return an
empty list.
  public List<ByteBuffer> readFullyScatterGatherLocal(EnumSet<ReadOption> options)
      throws IOException {
    return ((PositionedReadable)in).readFullyScatterGather(options);
  }
A quick sample implementation could look like this:
  public List<ByteBuffer> readFullyScatterGatherLocal(EnumSet<ReadOption> options)
      throws IOException {
    List<LocatedBlock> blockRange = getBlockRange(0, getFileLength());
    if (!allBlocksInLocal(blockRange)) {
      // At least one block is non-local: return empty so the caller can fall back.
      return Collections.emptyList();
    }
    List<ByteBuffer> retval = new LinkedList<ByteBuffer>();
    for (LocatedBlock blk : blockRange) {
      BlockReader blkReader = fetchBlockReader(blk, localDNAddrPair);
      ClientMmap mmap = blkReader.getClientMmap(options);
      // Instruction to cache eviction to avoid unmapping this. Slots, streams
      // and all other resources will be closed.
      mmap.setunmap(false);
      retval.add(mmap.getMappedByteBuffer());
      closeBlockReader(blkReader);
    }
    return retval;
  }
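To make the idea concrete without the HDFS client classes, here is a minimal
sketch using only java.nio. It maps a local file into fixed-size chunks, with
each chunk standing in for one block. ChunkedMmap, mapInChunks and chunkSize
are hypothetical names used for illustration; none of this is HDFS code.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of HDFS. */
final class ChunkedMmap {
  private ChunkedMmap() {}

  /**
   * Maps the whole file read-only, one buffer per chunk of at most chunkSize
   * bytes. Each chunk plays the role of one local block replica.
   */
  static List<ByteBuffer> mapInChunks(Path file, long chunkSize) throws IOException {
    if (chunkSize <= 0 || chunkSize > Integer.MAX_VALUE) {
      throw new IllegalArgumentException("chunkSize must be in [1, Integer.MAX_VALUE]");
    }
    List<ByteBuffer> buffers = new ArrayList<>();
    try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
      long size = ch.size();
      for (long pos = 0; pos < size; pos += chunkSize) {
        long len = Math.min(chunkSize, size - pos);
        buffers.add(ch.map(FileChannel.MapMode.READ_ONLY, pos, len));
      }
    }
    // The channel is closed here, but the mappings stay valid: a mapping does
    // not depend on the channel that created it.
    return buffers;
  }
}
```

The channel is closed before returning while the mapped buffers stay usable,
which is the same shape as closing the block reader but keeping the mapping in
the sample above.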
Apps that open an InputStream only once (HBase?) can call this method and use
the zero-copy buffers for reads if the file is local. If the buffers are not
available, they can fall back to the regular DFSInputStream. Reads can then
avoid synchronization overhead and get the same performance as a local
filesystem.
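On the caller side, a positional read over the returned buffers could look
roughly like the sketch below. It assumes the list holds one buffer per block
in file order and that nobody mutates the shared buffers. LocalBlockReads and
pread are hypothetical names, not an existing HDFS API. An empty list makes
pread return -1, which a caller could treat as the signal to fall back to the
regular stream.

```java
import java.nio.ByteBuffer;
import java.util.List;

/** Hypothetical helper for illustration only; not part of HDFS. */
final class LocalBlockReads {
  private LocalBlockReads() {}

  /**
   * Copies up to len bytes, starting at file offset position, from the
   * per-block buffers into dst. Returns the number of bytes copied, or -1 if
   * len > 0 and position is at or past the end of the data.
   */
  static int pread(List<ByteBuffer> blocks, long position, byte[] dst, int off, int len) {
    if (position < 0 || off < 0 || len < 0 || len > dst.length - off) {
      throw new IndexOutOfBoundsException();
    }
    int copied = 0;
    long blockStart = 0;
    for (ByteBuffer block : blocks) {
      if (copied == len) {
        break;
      }
      long blockEnd = blockStart + block.remaining();
      long cur = position + copied;
      if (cur < blockEnd) {
        // duplicate() shares the bytes but has its own position and limit, so
        // the shared buffer is never modified and readers need no lock.
        ByteBuffer view = block.duplicate();
        view.position(view.position() + (int) (cur - blockStart));
        int n = Math.min(view.remaining(), len - copied);
        view.get(dst, off + copied, n);
        copied += n;
      }
      blockStart = blockEnd;
    }
    return (copied == 0 && len > 0) ? -1 : copied;
  }
}
```

Because every read works on its own duplicate, concurrent readers never touch
shared position state, which is where the saving in synchronization would come
from.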
But I don't know if "leaking" MappedByteBuffers to calling code can have nasty
side-effects.
> A zero-copy ScatterGatherRead api from FSDataInputStream
> --------------------------------------------------------
>
> Key: HDFS-3051
> URL: https://issues.apache.org/jira/browse/HDFS-3051
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs-client, performance
> Reporter: dhruba borthakur
> Assignee: dhruba borthakur
>
> It would be nice to have a new API on FSDataInputStream that allows
> zero-copy reads for HDFS readers.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)