[
https://issues.apache.org/jira/browse/HDFS-8319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14571976#comment-14571976
]
Kai Zheng commented on HDFS-8319:
---------------------------------
bq. Yes, we can avoid this by always allocating direct buffer for parity blocks as well. But unlike the buffer used by data blocks, this (64KB * 3) buffer may never be used if decoding is unnecessary.
I'm not sure about this. The buffer allocation can happen only when it's decided to recover an erasure; we don't need to allocate the buffers initially. It also looks unrelated to the buffer type? I may be wrong here, since I haven't read through the whole code yet.
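Just to illustrate the idea (a hypothetical sketch with assumed names and sizes, not code from the patch), the parity-cell buffers could be created lazily, only once an erasure actually forces a decode:
{code:java}
import java.nio.ByteBuffer;

final class LazyParityBuffers {
  private final int numParity = 3;          // assumed RS(6,3) schema from the discussion above
  private final int cellSize = 64 * 1024;   // 64KB cells
  private ByteBuffer[] parityBufs;          // stays null until a decode is needed

  // Nothing is allocated on the normal read path; the (64KB * 3) parity
  // buffers come into existence only when a missing block forces decoding.
  ByteBuffer[] getParityBuffers() {
    if (parityBufs == null) {
      parityBufs = new ByteBuffer[numParity];
      for (int i = 0; i < numParity; i++) {
        parityBufs[i] = ByteBuffer.allocateDirect(cellSize);
      }
    }
    return parityBufs;
  }
}
{code}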
bq. This means all the input buffers' array() return the same (64KB * 6) byte array, while their positions are totally independent and can all be 0.
I see. Thanks a lot for the detailed explanation. We could also avoid the {{slice}} call and instead use {{duplicate}} with the position and limit properly set. I understand this may not seem that flexible, but we need some tradeoff. Another way to work around this case is to use additional parameters like {{offsets}} and {{lengths}}, which is flexible but a little more complex. Could we spec the constraint in the Javadoc?
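A minimal sketch of the {{duplicate}} idea, using only the plain JDK {{ByteBuffer}} API (illustrative, not code from the patch): with {{slice}} the view's position resets to 0 and the cell offset is only recoverable via {{arrayOffset()}}, while with {{duplicate}} the position itself records where the cell begins in the shared array.
{code:java}
import java.nio.ByteBuffer;

public class DuplicateVsSlice {
  public static void main(String[] args) {
    final int cellSize = 64 * 1024;                         // 64KB cell
    ByteBuffer shared = ByteBuffer.allocate(cellSize * 6);  // one array backing 6 data cells

    // slice(): the new view's position is 0; the cell offset moves into
    // arrayOffset(), which is easy to lose track of downstream.
    ByteBuffer tmp = shared.duplicate();
    tmp.position(cellSize);
    tmp.limit(cellSize * 2);
    ByteBuffer sliced = tmp.slice();
    System.out.println(sliced.position() + " / " + sliced.arrayOffset()); // 0 / 65536

    // duplicate(): shares the same backing array, but position/limit are
    // independent, so position() itself marks where the cell begins.
    ByteBuffer dup = shared.duplicate();
    dup.position(cellSize);
    dup.limit(cellSize * 2);
    System.out.println(dup.position() + " / " + dup.arrayOffset());       // 65536 / 0
  }
}
{code}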
bq. One question is whether the mixed scenario breaks the functionality?
It doesn't break the code in the branch, since the patches for the native coders (HADOOP-11540 and others) are not in yet. With your changes it will break the native coders, because once it's determined to proceed with {{usingDirectBuffer(true)}}, all the input/output buffers will be regarded as direct buffers and passed to the JNI native code. If any of them is not actually a direct buffer, it will core dump. The current Java coders work because, underneath, they treat all input/output buffers uniformly as ByteBuffers (regardless of on-heap or direct), so they suffer a performance loss since no converting-to-array happens.
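For illustration, a hypothetical guard (assumed code, not from the HADOOP-11540 patches) that would fail fast instead of core dumping when a mixed set of buffers reaches a native coder:
{code:java}
import java.nio.ByteBuffer;

final class DirectBufferGuard {
  // Hypothetical sanity check before handing buffers to a JNI coder. In JNI,
  // GetDirectBufferAddress returns NULL for a non-direct buffer, and using
  // that address crashes the process, so we reject heap buffers up front.
  static void checkAllDirect(ByteBuffer[] inputs, ByteBuffer[] outputs) {
    for (ByteBuffer b : inputs) {
      if (b != null && !b.isDirect()) {  // a null input marks an erased unit
        throw new IllegalArgumentException("native coder requires direct input buffers");
      }
    }
    for (ByteBuffer b : outputs) {
      if (!b.isDirect()) {
        throw new IllegalArgumentException("native coder requires direct output buffers");
      }
    }
  }
}
{code}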
As for the long term, I'm not very sure we will support mixing buffer types, because to do so we would have to convert all the buffers to be uniform, either on-heap or direct, before calling into the underlying implementation, where the data bytes are uniformly retrieved, computed, and stored via matrix and vector operations. The conversion would need to copy data, though it's not complex to do.
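A sketch of what such a conversion step could look like (a hypothetical helper, assuming we uniformize toward direct buffers for the native coder):
{code:java}
import java.nio.ByteBuffer;

final class BufferConverter {
  // Hypothetical helper: copy an on-heap buffer into a scratch direct buffer
  // so the underlying implementation sees uniform direct buffers. This is
  // the data copy mentioned above.
  static ByteBuffer ensureDirect(ByteBuffer buf) {
    if (buf == null || buf.isDirect()) {
      return buf;                  // null (erased unit) or already direct: nothing to do
    }
    ByteBuffer direct = ByteBuffer.allocateDirect(buf.remaining());
    direct.put(buf.duplicate());   // duplicate() leaves the caller's position untouched
    direct.flip();                 // make the copy readable by the coder
    return direct;
  }
}
{code}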
> Erasure Coding: support decoding for stateful read
> --------------------------------------------------
>
> Key: HDFS-8319
> URL: https://issues.apache.org/jira/browse/HDFS-8319
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Jing Zhao
> Assignee: Jing Zhao
> Attachments: HDFS-8319.001.patch, HDFS-8319.002.patch,
> HDFS-8319.003.patch
>
>
> HDFS-7678 adds the decoding functionality for pread. This jira plans to add
> decoding to stateful read.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)