[
https://issues.apache.org/jira/browse/HDFS-8319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14571976#comment-14571976
]
Kai Zheng commented on HDFS-8319:
---------------------------------
bq. Yes, we can avoid this by always allocating direct buffer for parity blocks as well. But unlike the buffer used by data blocks, this (64KB * 3) buffer may never be used if decoding is unnecessary.
I'm not sure about this. The buffer allocation can happen only when it's decided to recover an erasure; we don't need to allocate the buffers initially. It also looks unrelated to the buffer type? I may be wrong here, since I haven't read through the whole code yet.
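Just to illustrate the idea (a hypothetical sketch with assumed names and sizes, not code from the patch), the parity-cell buffers could be created lazily, only once an erasure actually forces a decode:
{code:java}
import java.nio.ByteBuffer;

final class LazyParityBuffers {
  private final int numParity = 3;          // assumed RS(6,3) schema from the discussion above
  private final int cellSize = 64 * 1024;   // 64KB cells
  private ByteBuffer[] parityBufs;          // stays null until a decode is needed

  // Nothing is allocated on the normal read path; the (64KB * 3) parity
  // buffers come into existence only when a missing block forces decoding.
  ByteBuffer[] getParityBuffers() {
    if (parityBufs == null) {
      parityBufs = new ByteBuffer[numParity];
      for (int i = 0; i < numParity; i++) {
        parityBufs[i] = ByteBuffer.allocateDirect(cellSize);
      }
    }
    return parityBufs;
  }
}
{code}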
bq. This means all the input buffers' array() return the same (64KB * 6) byte array, while their positions are totally independent and can all be 0.
I see. Thanks a lot for the detailed explanation. We could also avoid the {{slice}} call and instead use {{duplicate}} with the position and limit properly set. I understand this may not seem that flexible, but we need some tradeoff. Another way to work around this case is to use additional parameters like {{offsets}} and {{lengths}}, which is flexible but a little more complex. Could we spec the constraint in the Javadoc?
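A minimal sketch of the {{duplicate}} idea, using only the plain JDK {{ByteBuffer}} API (illustrative, not code from the patch): with {{slice}} the view's position resets to 0 and the cell offset is only recoverable via {{arrayOffset()}}, while with {{duplicate}} the position itself records where the cell begins in the shared array.
{code:java}
import java.nio.ByteBuffer;

public class DuplicateVsSlice {
  public static void main(String[] args) {
    final int cellSize = 64 * 1024;                         // 64KB cell
    ByteBuffer shared = ByteBuffer.allocate(cellSize * 6);  // one array backing 6 data cells

    // slice(): the new view's position is 0; the cell offset moves into
    // arrayOffset(), which is easy to lose track of downstream.
    ByteBuffer tmp = shared.duplicate();
    tmp.position(cellSize);
    tmp.limit(cellSize * 2);
    ByteBuffer sliced = tmp.slice();
    System.out.println(sliced.position() + " / " + sliced.arrayOffset()); // 0 / 65536

    // duplicate(): shares the same backing array, but position/limit are
    // independent, so position() itself marks where the cell begins.
    ByteBuffer dup = shared.duplicate();
    dup.position(cellSize);
    dup.limit(cellSize * 2);
    System.out.println(dup.position() + " / " + dup.arrayOffset());       // 65536 / 0
  }
}
{code}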
bq. One question is whether the mixed scenario breaks the functionality?
It doesn't break the code in the branch, since the patches for the native coders (HADOOP-11540 and others) are not in yet. With your changes it will break the native coders, because once it's determined to proceed with {{usingDirectBuffer(true)}}, all the input/output buffers will be regarded as direct buffers and passed to the JNI native code. If any of them is not actually a direct buffer, it will core dump. The current Java coders work because, underneath, they treat all input/output buffers uniformly as ByteBuffers (regardless of on-heap or direct), so they suffer a performance loss since no converting-to-array happens.
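For illustration, a hypothetical guard (assumed code, not from the HADOOP-11540 patches) that would fail fast instead of core dumping when a mixed set of buffers reaches a native coder:
{code:java}
import java.nio.ByteBuffer;

final class DirectBufferGuard {
  // Hypothetical sanity check before handing buffers to a JNI coder. In JNI,
  // GetDirectBufferAddress returns NULL for a non-direct buffer, and using
  // that address crashes the process, so we reject heap buffers up front.
  static void checkAllDirect(ByteBuffer[] inputs, ByteBuffer[] outputs) {
    for (ByteBuffer b : inputs) {
      if (b != null && !b.isDirect()) {  // a null input marks an erased unit
        throw new IllegalArgumentException("native coder requires direct input buffers");
      }
    }
    for (ByteBuffer b : outputs) {
      if (!b.isDirect()) {
        throw new IllegalArgumentException("native coder requires direct output buffers");
      }
    }
  }
}
{code}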
As for the long term, I'm not very sure we will support mixing buffer types, because to do so we would have to convert all the buffers to be uniform, either on-heap or direct, before calling into the underlying implementation, where the data bytes are uniformly retrieved, computed, and stored via matrix and vector operations. The conversion would need to copy data, though it's not complex to do.
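A sketch of what such a conversion step could look like (a hypothetical helper, assuming we uniformize toward direct buffers for the native coder):
{code:java}
import java.nio.ByteBuffer;

final class BufferConverter {
  // Hypothetical helper: copy an on-heap buffer into a scratch direct buffer
  // so the underlying implementation sees uniform direct buffers. This is
  // the data copy mentioned above.
  static ByteBuffer ensureDirect(ByteBuffer buf) {
    if (buf == null || buf.isDirect()) {
      return buf;                  // null (erased unit) or already direct: nothing to do
    }
    ByteBuffer direct = ByteBuffer.allocateDirect(buf.remaining());
    direct.put(buf.duplicate());   // duplicate() leaves the caller's position untouched
    direct.flip();                 // make the copy readable by the coder
    return direct;
  }
}
{code}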
> Erasure Coding: support decoding for stateful read
> --------------------------------------------------
>
> Key: HDFS-8319
> URL: https://issues.apache.org/jira/browse/HDFS-8319
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Jing Zhao
> Assignee: Jing Zhao
> Attachments: HDFS-8319.001.patch, HDFS-8319.002.patch,
> HDFS-8319.003.patch
>
>
> HDFS-7678 adds the decoding functionality for pread. This jira plans to add
> decoding to stateful read.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)