[jira] [Commented] (HDDS-3816) Erasure Coding

Uma Maheswara Rao G (Jira) Tue, 02 Mar 2021 11:37:06 -0800


    [ 
https://issues.apache.org/jira/browse/HDDS-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17293946#comment-17293946
 ]


Uma Maheswara Rao G commented on HDDS-3816:
-------------------------------------------

{quote}
I see that we introduced an offset and length for chunks here. That is 
designed only for the FILE_PER_BLOCK layout, right? Under the FILE_PER_BLOCK 
layout, one block has only one chunk file. But will we also support EC chunk 
writes for the other layout, FILE_PER_CHUNK, in the first-phase 
implementation? We have two chunk layouts available now, and the EC chunk 
read/write logic should be different under these two layouts.
{quote}
Thanks Marton for the reply. 
[~linyiqun] Thanks for the questions. Let me add a few more points:
From the datanode's perspective, an EC block and a normal block are written in 
the same way, so FILE_PER_CHUNK or FILE_PER_BLOCK should not be a problem. The 
client should have the intelligence to know how and when to read a specific 
block. 
The idea here is that we can reuse the existing BlockOutputStreams on the 
client side for writing. We can pre-create one BlockOutputStream per block in 
the group (data + parity) ahead of time, then write one full chunk to the first 
stream, the next chunk to the second stream, and so on. Once 3 chunks are 
written in EC:3:2, we generate parity and write it to the last two streams. At 
that point we have finished one full stripe, and it is time to rotate back to 
the first stream (a simple % should do this). Taking the first 
BlockOutputStream as an example, it simply writes the 1st chunk, 4th chunk, 7th 
chunk, etc. For the DN, however, it is simply a contiguous stream. It is up to 
the reader's intelligence to read the chunks back in the same order in which 
they were written. Hope this helps to clarify things. 
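
The rotation described above can be sketched roughly as follows. This is only 
an illustration and not Ozone code: write_striped, read_striped and 
placeholder_parity are hypothetical names, plain Python lists stand in for the 
BlockOutputStreams, all chunks are assumed to be the same size, and the XOR 
"parity" is a stand-in for a real codec such as Reed-Solomon. A trailing 
partial stripe gets no parity here; padding is left out for brevity.

```python
from typing import Callable, List

DATA = 3    # data blocks in EC:3:2
PARITY = 2  # parity blocks in EC:3:2


def placeholder_parity(stripe: List[bytes], parity_count: int) -> List[bytes]:
    """Stand-in for a real codec: XOR of the data chunks, repeated.

    Assumption for illustration only; this cannot recover two lost chunks.
    """
    out = bytearray(len(stripe[0]))
    for chunk in stripe:
        for i, b in enumerate(chunk):
            out[i] ^= b
    return [bytes(out)] * parity_count


def write_striped(
    chunks: List[bytes],
    data: int = DATA,
    parity: int = PARITY,
    encode: Callable[[List[bytes], int], List[bytes]] = placeholder_parity,
) -> List[List[bytes]]:
    """Distribute chunks round-robin over data streams, adding parity per stripe."""
    # One list per pre-created stream: data streams first, then parity streams.
    streams: List[List[bytes]] = [[] for _ in range(data + parity)]
    stripe: List[bytes] = []
    for index, chunk in enumerate(chunks):
        # The "simple %" rotation: chunk i goes to data stream i % data.
        streams[index % data].append(chunk)
        stripe.append(chunk)
        if len(stripe) == data:
            # Full stripe: generate parity and write it to the last streams.
            for p, parity_chunk in enumerate(encode(stripe, parity)):
                streams[data + p].append(parity_chunk)
            stripe = []
    return streams


def read_striped(streams: List[List[bytes]], data: int = DATA) -> List[bytes]:
    """Read chunks back in write order; each stream is contiguous on its own."""
    data_streams = streams[:data]
    total = sum(len(s) for s in data_streams)
    return [data_streams[i % data][i // data] for i in range(total)]
```

With seven chunks, the first stream ends up holding the 1st, 4th and 7th 
chunk, and read_striped recovers the original order purely from the stream 
index and the position inside each stream, which is the "reader's 
intelligence" mentioned above.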


> Erasure Coding
> --------------
>
>                 Key: HDDS-3816
>                 URL: https://issues.apache.org/jira/browse/HDDS-3816
>             Project: Apache Ozone
>          Issue Type: New Feature
>          Components: OM, Ozone Client, Ozone Datanode, SCM
>            Reporter: Uma Maheswara Rao G
>            Assignee: Uma Maheswara Rao G
>            Priority: Major
>              Labels: EC, pull-request-available
>         Attachments: Apache Ozone Erasure Coding-V2.pdf, 
> EC-read-write-path.pdf, Erasure Coding in Apache Hadoop Ozone.pdf, Ozone EC 
> Container groups and instances.pdf, Ozone EC v3.pdf
>
>
> We propose to implement Erasure Coding in Apache Hadoop Ozone to provide 
> efficient storage. With EC in place, Ozone can provide the same or better 
> tolerance with storage space savings of 50% or more. 
> In the HDFS project, we already have native codecs (ISA-L) and Java codecs 
> implemented; we can leverage the same or a similar codec design.
> However, the critical part, the EC data layout design, is still in progress; 
> we will post the design doc soon.



