[ 
https://issues.apache.org/jira/browse/HDFS-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8320:
----------------------------
    Attachment: HDFS-8320-HDFS-7285.01.patch

Many thanks to Jing for the helpful review! Updating the patch to address the 
comments.

# Considering we will remove cellSize from the ECSchema, we can consider 
adding a separate cellSize parameter
HDFS-8408 will potentially bring us a unified {{ErasureCodingInfo}} class to 
represent both the codec schema and {{cellSize}}. How about we make this change 
after HDFS-8408 is in?
# In blockSeekTo, since we only need to get each internal block's start offset, 
calling getRangesForInternalBlocks, which breaks the whole block group into 
cells, may be overkill. 
Good point! Actually it can be even simpler than that, because we don't care 
about the span, only about the start offsets. Let me know if the 
new {{getStartOffsetsForInternalBlocks}} method looks OK.
# Looks like HADOOP-11938 will be ready soon. Please see if you want to update 
the decoding function accordingly in this jira.
I had a quick try but got a content mismatch. Will take some more time to 
address this separately.

Addressed smaller issues (2, 3, 5). Will address #4 separately since the patch 
is already big.
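
To make the start-offset idea in item 2 concrete, here is a minimal sketch. It is an illustration only, not the code in the attached patch: the class {{StartOffsetsSketch}}, the method {{startOffsets}}, and all parameter names are hypothetical. It assumes cells are laid out round-robin across the data blocks, and it ignores parity blocks.

```java
import java.util.Arrays;

/** Hypothetical illustration; not the method from the attached patch. */
final class StartOffsetsSketch {
    private StartOffsetsSketch() {}

    /**
     * For a read starting at rangeStart (a byte offset into the data of a
     * block group), returns for each data block j the offset inside
     * internal block j where reading should start, or -1 if that block
     * holds no data at or after rangeStart.
     * Assumes round-robin cell placement across numDataBlocks blocks.
     */
    static long[] startOffsets(long rangeStart, int cellSize,
                               int numDataBlocks, long blockGroupSize) {
        if (rangeStart < 0 || cellSize <= 0 || numDataBlocks <= 0) {
            throw new IllegalArgumentException("invalid striping parameters");
        }
        long[] offsets = new long[numDataBlocks];
        Arrays.fill(offsets, -1L);
        long firstCell = rangeStart / cellSize;
        // The first cell of the range plus the next (numDataBlocks - 1)
        // cells touch every data block exactly once.
        for (int n = 0; n < numDataBlocks; n++) {
            long cell = firstCell + n;
            long cellStart = cell * cellSize;
            // Only the first cell can start in the middle of a cell.
            long posInGroup = (n == 0) ? rangeStart : cellStart;
            if (posInGroup >= blockGroupSize) {
                break; // past the end of the block group's data
            }
            long stripe = cell / numDataBlocks;        // i
            int block = (int) (cell % numDataBlocks);  // j
            offsets[block] = stripe * cellSize + (posInGroup - cellStart);
        }
        return offsets;
    }

    public static void main(String[] args) {
        // cellSize = 4, 3 data blocks, read starts at byte 6 of the group.
        System.out.println(Arrays.toString(startOffsets(6, 4, 3, 100)));
        // prints [4, 2, 0]
    }
}
```

Only one pass over at most {{numDataBlocks}} cells is needed, which is why this is cheaper than splitting the whole block group into cell ranges.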

> Erasure coding: consolidate striping-related terminologies
> ----------------------------------------------------------
>
>                 Key: HDFS-8320
>                 URL: https://issues.apache.org/jira/browse/HDFS-8320
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-8320-HDFS-7285.00.patch, 
> HDFS-8320-HDFS-7285.01.patch
>
>
> Right now we are doing striping-based I/O in a number of places:
> # Client output stream (HDFS-7889)
> # Client input stream
> #* pread (HDFS-7782, HDFS-7678)
> #* stateful read (HDFS-8033, HDFS-8281, HDFS-8319)
> # DN reconstruction (HDFS-7348)
> In each place we use one or more of the following terms:
> # Cell
> # Stripe
> # Block group
> # Internal block
> # Chunk
> This JIRA aims to systematically define these terms in relation to 
> each other and in the context of the containing file. For example, a cell 
> belonging to stripe _i_ and internal block _j_ can be indexed as {{(i, j)}}, and 
> its logical index _k_ in the file can be calculated.
> With the above consolidation, hopefully we can further consolidate the striping 
> I/O code.
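
The {{(i, j)}} to _k_ mapping mentioned in the description can be sketched as below. This is a hedged illustration under stated assumptions, not the definition the JIRA settles on: it counts data cells only, assumes round-robin placement, and computes _k_ within a single block group (a file-level index would also need to account for the preceding block groups). The class {{CellIndexSketch}} and its method names are hypothetical.

```java
/** Hypothetical illustration of cell indexing within one block group. */
final class CellIndexSketch {
    private CellIndexSketch() {}

    /** Logical index k of the data cell in stripe i, internal block j. */
    static long toLogicalIndex(long stripe, int block, int numDataBlocks) {
        if (stripe < 0 || block < 0 || block >= numDataBlocks) {
            throw new IllegalArgumentException("cell index out of range");
        }
        return stripe * numDataBlocks + block;
    }

    /** Stripe index i of logical cell k. */
    static long stripeOf(long k, int numDataBlocks) {
        return k / numDataBlocks;
    }

    /** Internal block index j of logical cell k. */
    static int blockOf(long k, int numDataBlocks) {
        return (int) (k % numDataBlocks);
    }

    public static void main(String[] args) {
        int numDataBlocks = 6; // assumed schema with 6 data blocks
        long k = toLogicalIndex(2, 3, numDataBlocks);
        System.out.println(k); // prints 15
        System.out.println(stripeOf(k, numDataBlocks) + ", "
            + blockOf(k, numDataBlocks)); // prints 2, 3
    }
}
```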



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
