[
https://issues.apache.org/jira/browse/HDFS-8901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15455103#comment-15455103
]
SammiChen commented on HDFS-8901:
---------------------------------
Thanks Zhe for take a lot of time review patch and give so many good advices!
{quote}The non-EC part:
In my previous comment I suggested moving the {{readAll}} code to
{{BlockReaderUtil}}. Actually thinking more about it, {{BlockReader}} already
provides {{int read(ByteBuffer buf)}}. Shouldn't the input stream level read
rely on the {{BlockReader}} lever {{ByteBuffer}} read?
{code}
- int nread = reader.readAll(buf, offset, len);
+ int nread = BlockReaderUtil.readAll(reader, buf, len);
{code}
{quote}
I thought the {{ BlockReaderUtil.readAll }} implementation just demonstrates
the idea, that is leverage {{BlockReader#read(ByteBuffer buf)}} to read data
into buf. Please enlighten me If I'm wrong.
bq. Nit: {{targetEnd}} should probably be initialized immediately after
{{bytesToRead}}, before the try clause
Handled in the new patch
bq. {{hedgedFetchBlockByteRange}} currently still allocates a new buffer ({{bb
= ByteBuffer.allocate(len);}}) and copies the result into {{buf}}. As a
follow-on I guess we can think about directly using part of {{buf}}?
Good suggestion, consider do a follow-on later.
bq. Is it necessary to {{slice()}} the {{dup}} buffer in {{pread}}?
Good advice, addressed in the new patch.
{quote}
The EC-specific part:
Thanks for clarifying this. Then shouldn't we do the flip at {{decode}} time?
{quote}
Good advice, addressed in the new patch.
{quote}
I still think they can be unified. In both cases the chunk will actually be
read into a single {{ByteBuffer}}. It's just that stateful read will read a
chunk into a contiguous segment of the {{ByteBuffer}}, but positional read will
read the chunk into multiple segments of the {{ByteBuffer}}. From the
{{StripingChunk}}'s angle, I think we should just give it a list of
{{ByteBuffer}}'s. When reading a {{StripingChunk}}, we fill the
{{ByteBuffer}}'s one by one. In other words, we should treat
{{StripingChunk#byteBuffer}} in this patch as a "slice" in {{chunkBuffer}}. I
think we can do this consolidation as a follow-on, and most of your handlings
of {{chunkBuffer}} are correct.
{quote}
Catch your point now. Will Consider a follow-on later.
{quote} # I think the logic in the below code is not very safe. It works now
because a) {{finalizeDecodeInputs}} is only used by positional read; b) In
positional read, {{public StripingChunk(ByteBuffer buf)}} constructor is only
used for parity and out-of-range chunks (the data won't be in application
buffer anyway). Whether to copy chunk data into {{decodeInputs}} should depend
on whether {{decodeInputs}} is a duplicate or slice {{ByteBuffer}} from the
application buffer. Can we remove this change in this patch and always
{{copyTo}}? After the above follow-on we can re-enable the optimization.
{code}
- chunk.copyTo(decodeInputs[i]);
+ if (chunk.useChunkBuffer()) {
+ chunk.getChunkBuffer().copyTo(decodeInputs[i]);
+ }
{code}
{quote}
I understand your concern. Given currently {{finalizeDecodeInputs}} is only
used by positional read decode process, {{decodeInputs}} buffer has two
sources, one is application buffer, another is internal buffer allocated to
hold the out-of-range chunks and parity. When the buffer is formal type, it
don't need copy, because element of {{decodeInputs}} already point to the
buffer. When the buffer is internal allocated buffer,
{{chunk.useChunkBuffer()}} will be true, and then copy the data. I run the
{{TestDFSStripedInputStream}}, go through each condition branch, make sure the
decode result is correct.
{quote}
If we don't want to affect the position or limit of target, should we just
duplicate it? It could have a non-zero position before entering this call.
{code}
void copyTo(ByteBuffer target) {
for (ByteBuffer slice : slices) {
target.put(slice);
}
target.flip();
}
{code}
{quote}
{{copyTo}} is used to copy a chunk data to the target buffer for next step
decoding. After the call, target buffer is ready for read. So we do want to
affect the position of target buffer.
> Use ByteBuffer in striping positional read
> ------------------------------------------
>
> Key: HDFS-8901
> URL: https://issues.apache.org/jira/browse/HDFS-8901
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Kai Zheng
> Assignee: SammiChen
> Attachments: HDFS-8901-v10.patch, HDFS-8901-v2.patch,
> HDFS-8901-v3.patch, HDFS-8901-v4.patch, HDFS-8901-v5.patch,
> HDFS-8901-v6.patch, HDFS-8901-v7.patch, HDFS-8901-v8.patch,
> HDFS-8901-v9.patch, HDFS-8901.v11.patch, HDFS-8901.v12.patch,
> HDFS-8901.v13.patch, HDFS-8901.v14.patch, HDFS-8901.v15.patch,
> HDFS-8901.v16.patch, initial-poc.patch
>
>
> Native erasure coder prefers to direct ByteBuffer for performance
> consideration. To prepare for it, this change uses ByteBuffer through the
> codes in implementing striping position read. It will also fix avoiding
> unnecessary data copying between striping read chunk buffers and decode input
> buffers.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]