[
https://issues.apache.org/jira/browse/HDFS-8901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15010009#comment-15010009
]
Walter Su commented on HDFS-8901:
---------------------------------
bq. in the worst case, if the user provided read buffer isn't the right buffer
type desired by the erasure coder, then all cells will need one time copy.
I think the buffer given by the caller is always a byte\[\]? There is no
ByteBuffer version of the pread API.
{code}
public int read(long position, byte[] buffer, int offset, int length)
{code}
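As a rough illustration of why the buffer type matters: a caller's byte\[\] can be wrapped as a ByteBuffer without copying, but the result is heap-backed, not direct. If a coder insists on direct buffers (as the issue description says the native coder prefers), one copy is needed. The class and method names below are hypothetical, not part of HDFS.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, for illustration only.
final class HeapVsDirect {
    // Copies a region of the caller's array into a newly allocated direct buffer.
    static ByteBuffer toDirect(byte[] buffer, int offset, int length) {
        ByteBuffer direct = ByteBuffer.allocateDirect(length);
        direct.put(buffer, offset, length); // this is the one-time copy
        direct.flip();                      // make the copied bytes readable
        return direct;
    }

    public static void main(String[] args) {
        byte[] buffer = new byte[8];
        // Wrapping shares the array (no copy), but the buffer is not direct.
        ByteBuffer heap = ByteBuffer.wrap(buffer, 2, 4);
        System.out.println("wrapped isDirect: " + heap.isDirect());
        System.out.println("copied  isDirect: " + toDirect(buffer, 2, 4).isDirect());
    }
}
```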
Besides, even if the caller provides a ByteBuffer, the data copy from the
_byte\[\] buffer_ to _decodeInputs_ is still needed.
Assume the schema is 3+2; then:
{noformat}
byte[] buf      is 64k 64k 64k 64k 64k 64k 64k 64k 64k ...
decodeInputs[0] is 64k 64k 64k ...
decodeInputs[1] is 64k 64k 64k ...
decodeInputs[2] is 64k 64k 64k ...
{noformat}
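A minimal sketch of the copy implied by the layout above, assuming the caller's buffer holds whole stripes of data cells in round-robin order. The class name, method name, and parameters are hypothetical; only the three data units from the layout are modeled, and the real cell size would be 64k, not the toy value used in main.

```java
import java.util.Arrays;

// Hypothetical helper, for illustration only.
final class CellDeinterleave {
    // Copies cell-interleaved data into one contiguous array per data unit.
    static byte[][] toDecodeInputs(byte[] buf, int dataUnits, int cellSize) {
        int stripeSize = dataUnits * cellSize;
        if (buf.length % stripeSize != 0) {
            throw new IllegalArgumentException("buffer must hold whole stripes");
        }
        int stripes = buf.length / stripeSize;
        byte[][] decodeInputs = new byte[dataUnits][stripes * cellSize];
        for (int s = 0; s < stripes; s++) {
            for (int u = 0; u < dataUnits; u++) {
                int src = s * stripeSize + u * cellSize; // cell u of stripe s
                System.arraycopy(buf, src, decodeInputs[u], s * cellSize, cellSize);
            }
        }
        return decodeInputs;
    }

    public static void main(String[] args) {
        // Toy layout: cellSize = 2, two stripes; each byte holds its unit index.
        byte[] buf = {0, 0, 1, 1, 2, 2, 0, 0, 1, 1, 2, 2};
        for (byte[] unit : toDecodeInputs(buf, 3, 2)) {
            System.out.println(Arrays.toString(unit));
        }
    }
}
```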
We can't feed the decoder the original _byte\[\] buffer_ and call decode only
once.
We can decode in a single call if we provide decodeInputs, but then we need a
data copy.
We can also decode multiple times, stripe by stripe as in stateful read, and
then no data copy is needed.
We are not pursuing fewer data copies, since performance is bounded by IO
speed and coding speed.
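The stripe-by-stripe alternative could look roughly like the sketch below: for each stripe, the coder inputs are ByteBuffer views over the caller's array, so nothing is copied, at the cost of one coder call per stripe. The class, method, and parameter names are hypothetical, and the actual coder invocation is left out.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, for illustration only.
final class StripeViews {
    // Returns one view per data cell of the given stripe. Each view shares
    // memory with buf, so no bytes are copied.
    static ByteBuffer[] stripeInputs(byte[] buf, int stripe, int dataUnits, int cellSize) {
        ByteBuffer[] inputs = new ByteBuffer[dataUnits];
        int base = stripe * dataUnits * cellSize;
        for (int u = 0; u < dataUnits; u++) {
            // slice() yields a buffer whose index 0 is the start of this cell.
            inputs[u] = ByteBuffer.wrap(buf, base + u * cellSize, cellSize).slice();
        }
        return inputs;
    }

    public static void main(String[] args) {
        byte[] buf = new byte[12]; // toy sizes: 3 data units, cellSize = 2, two stripes
        for (int stripe = 0; stripe < 2; stripe++) {
            ByteBuffer[] inputs = stripeInputs(buf, stripe, 3, 2);
            // A real implementation would hand `inputs` to the coder here.
            System.out.println("stripe " + stripe + ": " + inputs.length + " views of "
                    + inputs[0].remaining() + " bytes");
        }
    }
}
```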
> Use ByteBuffer in striping positional read
> ------------------------------------------
>
> Key: HDFS-8901
> URL: https://issues.apache.org/jira/browse/HDFS-8901
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Kai Zheng
> Assignee: Kai Zheng
> Attachments: HDFS-8901-v2.patch, initial-poc.patch
>
>
> The native erasure coder prefers direct ByteBuffer for performance
> reasons. To prepare for it, this change uses ByteBuffer throughout the
> code implementing striping positional read. It also avoids
> unnecessary data copying between striping read chunk buffers and decode input
> buffers.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)