[
https://issues.apache.org/jira/browse/HDFS-8901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15453470#comment-15453470
]
Zhe Zhang commented on HDFS-8901:
---------------------------------
Thanks Sammi for the update! Sorry it's taking me long to review. I think we
are really close.
The non-EC part:
# In my previous comment I suggested moving the {{readAll}} code to
{{BlockReaderUtil}}. Actually thinking more about it, {{BlockReader}} already
provides {{int read(ByteBuffer buf)}}. Shouldn't the input stream level read
rely on the {{BlockReader}} lever {{ByteBuffer}} read?
{code}
- int nread = reader.readAll(buf, offset, len);
+ int nread = BlockReaderUtil.readAll(reader, buf, len);
{code}
# Nit: {{targetEnd}} should probably be initialized immediately after
{{bytesToRead}}, before the try clause
{code}
+ long targetEnd;
+ int bytesToRead = (int) Math.min(remaining,
+ blk.getBlockSize() - targetStart);
try {
+ targetEnd = targetStart + bytesToRead - 1;
{code}
# {{hedgedFetchBlockByteRange}} currently still allocates a new buffer ({{bb =
ByteBuffer.allocate(len);}}) and copies the result into {{buf}}. As a follow-on
I guess we can think about directly using part of {{buf}}?
# Is it necessary to {{slice()}} the {{dup}} buffer in {{pread}}?
The EC-specific part:
# bq. For question 4, why flip is not needed before? The reason is previously,
positional reader use a byte array wrapped buffer as read buffer. Then later
code use byte array to access the fetched data. And for stateful reader,
previously it has already reset the buffer position by {{
decodeInputs[i].position(0) }} before decoding.
Thanks for clarifying this. Then shouldn't we do the flip at {{decode}} time?
# bq. Every StripingChunk will have either a ChunkByteBuffer or a ByteBuffer to
track different buffer source. ChunkByteBuffer will be used when reusing user
input buffer as the striping read buffer during positional read. And ByteBuffer
will be used when read using stateful strip reader.
I still think they can be unified. In both cases the chunk will actually be
read into a single {{ByteBuffer}}. It's just that stateful read will read a
chunk into a contiguous segment of the {{ByteBuffer}}, but positional read will
read the chunk into multiple segments of the {{ByteBuffer}}. From the
{{StripingChunk}}'s angle, I think we should just give it a list of
{{ByteBuffer}}'s. When reading a {{StripingChunk}}, we fill the
{{ByteBuffer}}'s one by one. In other words, we should treat
{{StripingChunk#byteBuffer}} in this patch as a "slice" in {{chunkBuffer}}. I
think we can do this consolidation as a follow-on, and most of your handlings
of {{chunkBuffer}} are correct.
# I think the logic in the below code is not very safe. It works now because a)
{{finalizeDecodeInputs}} is only used by positional read; b) In positional
read, {{public StripingChunk(ByteBuffer buf)}} constructor is only used for
parity and out-of-range chunks (the data won't be in application buffer
anyway). Whether to copy chunk data into {{decodeInputs}} should depend on
whether {{decodeInputs}} is a duplicate or slice {{ByteBuffer}} from the
application buffer. Can we remove this change in this patch and always
{{copyTo}}? After the above follow-on we can re-enable the optimization.
{code}
- chunk.copyTo(decodeInputs[i]);
+ if (chunk.useChunkBuffer()) {
+ chunk.getChunkBuffer().copyTo(decodeInputs[i]);
+ }
{code}
# If we don't want to affect the position or limit of target, should we just
duplicate it? It could have a non-zero position before entering this call.
{code}
void copyTo(ByteBuffer target) {
for (ByteBuffer slice : slices) {
target.put(slice);
}
target.flip();
}
{code}
> Use ByteBuffer in striping positional read
> ------------------------------------------
>
> Key: HDFS-8901
> URL: https://issues.apache.org/jira/browse/HDFS-8901
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Kai Zheng
> Assignee: SammiChen
> Attachments: HDFS-8901-v10.patch, HDFS-8901-v2.patch,
> HDFS-8901-v3.patch, HDFS-8901-v4.patch, HDFS-8901-v5.patch,
> HDFS-8901-v6.patch, HDFS-8901-v7.patch, HDFS-8901-v8.patch,
> HDFS-8901-v9.patch, HDFS-8901.v11.patch, HDFS-8901.v12.patch,
> HDFS-8901.v13.patch, HDFS-8901.v14.patch, HDFS-8901.v15.patch,
> HDFS-8901.v16.patch, initial-poc.patch
>
>
> Native erasure coder prefers to direct ByteBuffer for performance
> consideration. To prepare for it, this change uses ByteBuffer through the
> codes in implementing striping position read. It will also fix avoiding
> unnecessary data copying between striping read chunk buffers and decode input
> buffers.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]