[
https://issues.apache.org/jira/browse/HDFS-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhe Zhang updated HDFS-8320:
----------------------------
Attachment: HDFS-8320-HDFS-7285.01.patch
Many thanks to Jing for the helpful review! Updating the patch to address the
comments.
# Considering we will remove cellSize out of the ECSchema, we can consider
adding a separate cellSize parameter
HDFS-8408 will potentially bring us a unified {{ErasureCodingInfo}} class to
represent both codec schema and {{cellSize}}. How about we make the change
after it?
# In blockSeekTo, since we only need to get each internal block's start offset,
calling getRangesForInternalBlocks, which breaks the whole block group into
cells, may be overkill.
Good point! Actually it can be even simpler than that, because we don't care
about the spans, only the start offsets. Let me know if the
new {{getStartOffsetsForInternalBlocks}} method looks OK.
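
For readers following along, here is a rough sketch of the start-offset idea discussed above. It is not the code in the attached patch: the class name, method signature, and the choice to cover data blocks only (ignoring parity) are assumptions made for illustration. It assumes cells are laid out round-robin across {{dataBlkNum}} internal blocks and marks blocks with nothing left to read as -1.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the HDFS-8320 patch.
final class StartOffsetsSketch {

    /**
     * For a read starting at offsetInBlockGroup, return the offset inside each
     * data internal block at which reading should begin, or -1 if that block
     * holds no data at or after the start position.
     */
    static long[] startOffsets(int dataBlkNum, int cellSize,
                               long blockGroupSize, long offsetInBlockGroup) {
        long[] offsets = new long[dataBlkNum];
        Arrays.fill(offsets, -1L);

        // Index of the cell (within the block group) containing the start position.
        long firstCell = offsetInBlockGroup / cellSize;

        // Only the next dataBlkNum cells matter: each touches a distinct internal block.
        for (int n = 0; n < dataBlkNum; n++) {
            long cell = firstCell + n;
            if (cell * cellSize >= blockGroupSize) {
                break; // past the end of the block group
            }
            int blockIdx = (int) (cell % dataBlkNum); // internal block j
            long stripeIdx = cell / dataBlkNum;       // stripe i
            // Only the first cell can start mid-cell; later cells start at their beginning.
            long withinCell = (n == 0) ? offsetInBlockGroup % cellSize : 0L;
            offsets[blockIdx] = stripeIdx * cellSize + withinCell;
        }
        return offsets;
    }
}
```

With 3 data blocks and 4-byte cells, a read starting at block-group offset 6 falls 2 bytes into the cell on internal block 1, so the other two blocks begin at the start of their next cells. No per-cell range list is needed, which is the point of the simplification.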
# Looks like HADOOP-11938 will be ready soon. Please see if you want to update
the decoding function accordingly in this jira.
I had a quick try but got a content mismatch. Will take some more time to
address this separately.
Addressed smaller issues (2, 3, 5). Will address #4 separately since the patch
is already big.
> Erasure coding: consolidate striping-related terminologies
> ----------------------------------------------------------
>
> Key: HDFS-8320
> URL: https://issues.apache.org/jira/browse/HDFS-8320
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Zhe Zhang
> Assignee: Zhe Zhang
> Attachments: HDFS-8320-HDFS-7285.00.patch,
> HDFS-8320-HDFS-7285.01.patch
>
>
> Right now we are doing striping-based I/O in a number of places:
> # Client output stream (HDFS-7889)
> # Client input stream
> #* pread (HDFS-7782, HDFS-7678)
> #* stateful read (HDFS-8033, HDFS-8281, HDFS-8319)
> # DN reconstruction (HDFS-7348)
> In each place we use one or more of the following terms:
> # Cell
> # Stripe
> # Block group
> # Internal block
> # Chunk
> This JIRA aims to systematically define these terms in relation to
> each other and in the context of the containing file. For example, a cell
> belonging to stripe _i_ and internal block _j_ can be indexed as {{(i, j)}}, and
> its logical index _k_ in the file can be calculated.
> With the above consolidation, hopefully we can further consolidate the striping
> I/O code.
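
As a rough illustration of the {{(i, j)}} to _k_ mapping mentioned in the description, the sketch below assumes cells are placed round-robin across the data internal blocks and that _k_ is counted from the start of a single block group. The class and method names are hypothetical and do not come from the patch.

```java
// Hypothetical illustration only; names are not taken from HDFS.
final class CellIndexSketch {

    /** Logical cell index k for the cell in stripe i, internal block j. */
    static long logicalIndex(long stripe, int blockIdx, int dataBlkNum) {
        return stripe * dataBlkNum + blockIdx;
    }

    /** Stripe index i of logical cell k. */
    static long stripeOf(long k, int dataBlkNum) {
        return k / dataBlkNum;
    }

    /** Internal block index j of logical cell k. */
    static int blockOf(long k, int dataBlkNum) {
        return (int) (k % dataBlkNum);
    }

    /** Byte offset of the start of logical cell k within the block group. */
    static long logicalByteOffset(long k, int cellSize) {
        return k * cellSize;
    }
}
```

For example, with 6 data blocks, the cell at stripe 2 and internal block 1 has logical index 2 * 6 + 1 = 13. Extending this to a whole file would additionally need the starting offset of each block group, which is left out here.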