+1 for merge. Thanks for the great work on this.

Thanks,
Hanisha

> On Apr 7, 2022, at 2:18 AM, jackson yao <jacksonyao...@gmail.com> wrote:
>
> Thanks for the great work! I am +1 for merging (non-binding).
>
> mingchao zhao <captain...@apache.org> wrote on Thu, Apr 7, 2022 at 14:18:
>
>> +1 for the merge. Thanks
>>
>> Mukul Kumar Singh <mksingh.apa...@gmail.com> wrote on Thu, Apr 7, 2022 at 14:05:
>>
>>> +1 for the merge.
>>>
>>> Thanks Lokesh
>>>
>>> On 07/04/22 11:29 am, Lokesh Jain wrote:
>>>> +1 for merge
>>>>
>>>> Thanks
>>>> Lokesh
>>>>
>>>>> On 07-Apr-2022, at 11:06 AM, Tsz Wo Sze <szets...@gmail.com> wrote:
>>>>>
>>>>> +1
>>>>> We should merge it so that more people can try it. We can work on the
>>>>> remaining tasks in the master branch. Thanks a lot!
>>>>>
>>>>> Tsz-Wo
>>>>>
>>>>> On Thu, Apr 7, 2022 at 1:17 PM Aravindan Vijayan
>>>>> <avija...@cloudera.com.invalid> wrote:
>>>>>
>>>>>> +1 for the merge. Thanks for the great work!
>>>>>>
>>>>>> On Wed, Apr 6, 2022 at 9:45 PM Prashant Pogde
>>>>>> <ppo...@cloudera.com.invalid> wrote:
>>>>>>
>>>>>>> +1 for the EC branch merge.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Prashant
>>>>>>>
>>>>>>>> On Apr 6, 2022, at 8:20 PM, Siddharth Wagle <swa...@apache.org> wrote:
>>>>>>>>
>>>>>>>> +1 for the EC branch merge.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Sid
>>>>>>>>
>>>>>>>> On Wed, Apr 6, 2022 at 8:05 PM guimark <guim...@126.com> wrote:
>>>>>>>>
>>>>>>>>> Great news!
>>>>>>>>> +1 to merge.
>>>>>>>>>
>>>>>>>>> At 2022-04-06 22:18:31, "Stephen O'Donnell"
>>>>>>>>> <sodonn...@cloudera.com.INVALID> wrote:
>>>>>>>>>> I have been working on the code on this branch for some time, and I
>>>>>>>>>> believe it is in a good state to merge now. It is mostly new code, and
>>>>>>>>>> if nothing attempts to use EC, none of the EC code paths will be
>>>>>>>>>> executed.
>>>>>>>>>>
>>>>>>>>>> +1 to merge from me.
>>>>>>>>>>
>>>>>>>>>> Stephen.
>>>>>>>>>>
>>>>>>>>>> On Wed, Apr 6, 2022 at 7:11 AM Uma gangumalla <umamah...@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>> ===== Few Edits Below =====
>>>>>>>>>>>
>>>>>>>>>>> Dear Ozone Devs,
>>>>>>>>>>>
>>>>>>>>>>> As you may know, we have been actively developing Ozone Erasure Coding
>>>>>>>>>>> support in a separate branch HDDS-3816-ec.
>>>>>>>>>>>
>>>>>>>>>>> We have finished the development of EC key write and read
>>>>>>>>>>> functionality. The support of offline recovery (recovering replica
>>>>>>>>>>> from node loss) will be part of second phase work.
>>>>>>>>>>>
>>>>>>>>>>> Since the code has already grown and increasingly started seeing merge
>>>>>>>>>>> complications, we would like to merge the current EC branch into
>>>>>>>>>>> master. We filed the new JIRA (HDDS-6462) for the second phase of work
>>>>>>>>>>> and continued the offline recovery work there. (We have uploaded the
>>>>>>>>>>> design doc there.)
>>>>>>>>>>>
>>>>>>>>>>> Details on Changes:
>>>>>>>>>>>
>>>>>>>>>>> - Most of the EC core logic went to newly extended classes. Key changes
>>>>>>>>>>>   went into EC*OutputStream and EC*InputStream classes for write and
>>>>>>>>>>>   read respectively. Based on replication type, ECPipelineProvider will
>>>>>>>>>>>   be chosen for creating EC pipelines.
>>>>>>>>>>>
>>>>>>>>>>> - Since we cannot represent the EC replication in the existing
>>>>>>>>>>>   replication factor, we have introduced ECReplicationConfig. The
>>>>>>>>>>>   ReplicationConfig interface is already pushed to master, so it's not
>>>>>>>>>>>   a new idea coming through this branch merge now. What is newly coming
>>>>>>>>>>>   here is the ECReplicationConfig class which can be used to express EC
>>>>>>>>>>>   replication configuration.
>>>>>>>>>>>
>>>>>>>>>>> - We wanted to provide the support to enable EC at bucket level. To
>>>>>>>>>>>   simplify some complications, we have moved the default replication
>>>>>>>>>>>   configurations from client to server.
>>>>>>>>>>>
>>>>>>>>>>> - Client side replication type and replication factor were removed from
>>>>>>>>>>>   the configuration files, and we introduced
>>>>>>>>>>>   ozone.server.default.replication and
>>>>>>>>>>>   ozone.server.default.replication.type. We would continue to respect a
>>>>>>>>>>>   value that is configured at client side explicitly or passed through
>>>>>>>>>>>   APIs; otherwise server side bucket level properties or server side
>>>>>>>>>>>   default configuration would take effect.
>>>>>>>>>>>
>>>>>>>>>>> - Other than this change, the rest of EC side code should not impact
>>>>>>>>>>>   any of the existing code flows.
>>>>>>>>>>>
>>>>>>>>>>> We have finished documentation JIRA (HDDS-6172) for covering this
>>>>>>>>>>> feature and we will continue to improve further in master.
>>>>>>>>>>>
>>>>>>>>>>> Git Branch Name: HDDS-3816-ec
>>>>>>>>>>>
>>>>>>>>>>> JIRAs: HDDS-3816 and HDDS-5351
>>>>>>>>>>>
>>>>>>>>>>> Completed tasks: ~142
>>>>>>>>>>>
>>>>>>>>>>> + We are covering the following two mandatory JIRAs to come in:
>>>>>>>>>>>
>>>>>>>>>>> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older
>>>>>>>>>>>    server could fail due to the unavailability for client default
>>>>>>>>>>>    replication config
>>>>>>>>>>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
>>>>>>>>>>>
>>>>>>>>>>> PR reviews are in progress and expected to close in a day or two.
>>>>>>>>>>>
>>>>>>>>>>> A few other JIRAs in HDDS-3816 are still open but I believe they're not
>>>>>>>>>>> blockers for merge.
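
The precedence described in the replication-config bullets above (explicit client value first, then the bucket-level property, then the server-side default) might be sketched roughly as follows. This is only an illustration: `ReplicationResolver` and `resolve` are hypothetical names, not Ozone classes, and the string values are placeholders rather than real configuration values.

```java
import java.util.Optional;

/** Hypothetical helper for illustration only; not part of Ozone. */
final class ReplicationResolver {

    /**
     * Picks the replication setting for a new key:
     * an explicit client value wins, then the bucket-level value,
     * and finally the server-side default.
     */
    static String resolve(Optional<String> clientValue,
                          Optional<String> bucketValue,
                          String serverDefault) {
        return clientValue.orElseGet(() -> bucketValue.orElse(serverDefault));
    }

    public static void main(String[] args) {
        // Placeholder values; real deployments would use actual replication configs.
        String serverDefault = "server-default";

        // Client passed a value explicitly: it is respected.
        System.out.println(resolve(Optional.of("client-value"),
                                   Optional.of("bucket-value"), serverDefault));

        // No client value, bucket has one: the bucket-level property applies.
        System.out.println(resolve(Optional.empty(),
                                   Optional.of("bucket-value"), serverDefault));

        // Neither is set: fall back to the server-side default.
        System.out.println(resolve(Optional.empty(),
                                   Optional.empty(), serverDefault));
    }
}
```

The point of the sketch is only the ordering of the fallbacks; where each value actually comes from (client configuration, API arguments, bucket metadata) is not modelled here.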
>>>>>>>>>>>
>>>>>>>>>>> In short what you can do now with this feature:
>>>>>>>>>>>
>>>>>>>>>>> - You can enable EC at bucket level and cluster level.
>>>>>>>>>>>   How to enable it at bucket level? Just create the bucket by passing
>>>>>>>>>>>   the EC replication options.
>>>>>>>>>>>
>>>>>>>>>>> - You can create EC keys and read the same back.
>>>>>>>>>>>
>>>>>>>>>>> - You should be able to continue writing even when chosen nodes are
>>>>>>>>>>>   failing. (Of course, a minimum of Data+Parity live nodes should be
>>>>>>>>>>>   available in the cluster to complete the write.)
>>>>>>>>>>>
>>>>>>>>>>> - You should be able to read the file back even if a few nodes failed
>>>>>>>>>>>   in the same EC block group. (Failures should not be more than the
>>>>>>>>>>>   parity number of nodes.)
>>>>>>>>>>>
>>>>>>>>>>> What is pending? Offline recovery of lost/missing EC containers. As
>>>>>>>>>>> mentioned above, post merge of this branch, I will create a separate
>>>>>>>>>>> JIRA for starting the work for OfflineRecovery.
>>>>>>>>>>>
>>>>>>>>>>> There are automated acceptance test cases already added: HDDS-6231.
>>>>>>>>>>>
>>>>>>>>>>> In addition to that, we have also performed basic Acceptance Testing
>>>>>>>>>>> in a physical cluster:
>>>>>>>>>>>
>>>>>>>>>>> 1. Installed 10 nodes cluster and created EC bucket (3:2).
>>>>>>>>>>>    Uploaded 10GB key.
>>>>>>>>>>>    Downloaded the same key and checked the md5sum.
>>>>>>>>>>>
>>>>>>>>>>> 2. Uploaded 8GB key.
>>>>>>>>>>>    Downloaded the same key and checked the md5sum.
>>>>>>>>>>>
>>>>>>>>>>> 3. Uploaded 3MB key.
>>>>>>>>>>>    Downloaded the same and verified md5sum.
>>>>>>>>>>>
>>>>>>>>>>> 4. Changed bucket to (6:3).
>>>>>>>>>>>    Uploaded 8GB key.
>>>>>>>>>>>    Downloaded the same.
>>>>>>>>>>>
>>>>>>>>>>> Also verified that the new key is in the 6:3 policy and old keys are
>>>>>>>>>>> still 3:2. Verified with several different size key writes and reads.
>>>>>>>>>>>
>>>>>>>>>>> Since the merge discussion thread, we have well stabilized code and
>>>>>>>>>>> fixed several bugs.
>>>>>>>>>>>
>>>>>>>>>>> Merge checklist items assessment is here:
>>>>>>>>>>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
>>>>>>>>>>>
>>>>>>>>>>> Big shoutout to Stephen O'Donnell <sodonn...@cloudera.com>, Istvan
>>>>>>>>>>> Fajth <pi...@cloudera.com> for great efforts in core development and
>>>>>>>>>>> also thanks a lot to Sammi, Mingchao Zhao, Mark Gui, Kaijie, Attila
>>>>>>>>>>> for collaborating on some of the EC tasks.
>>>>>>>>>>>
>>>>>>>>>>> Thanks to Marton for design discussion and on some dev tasks as well.
>>>>>>>>>>> Thanks to many others who were involved in design discussions, Arpit,
>>>>>>>>>>> Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi,
>>>>>>>>>>> Prashanth, Rakesh, Yiqun Lin.
>>>>>>>>>>> Sorry if I miss anyone here, but your efforts are much appreciated.
>>>>>>>>>>> Without your tremendous help, we would have not reached this position
>>>>>>>>>>> yet.
>>>>>>>>>>>
>>>>>>>>>>> To start with, here is my +1
>>>>>>>>>>>
>>>>>>>>>>> The vote will run for 5 days.
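
The two availability conditions in the feature summary above (a write needs at least Data+Parity live nodes; a read tolerates at most Parity failed nodes in a block group) can be written down as simple arithmetic. The sketch below is an assumption-laden illustration, not Ozone code: `EcTolerance`, `canWrite`, and `canRead` are hypothetical names, and the 3:2 and 6:3 figures are the schemes mentioned in the acceptance-test notes.

```java
/** Hypothetical illustration of the EC availability rules; not an Ozone class. */
final class EcTolerance {

    /** A write can complete only if at least data + parity nodes are live. */
    static boolean canWrite(int data, int parity, int liveNodes) {
        return liveNodes >= data + parity;
    }

    /** A block group stays readable while failures do not exceed the parity count. */
    static boolean canRead(int parity, int failedNodes) {
        return failedNodes >= 0 && failedNodes <= parity;
    }

    public static void main(String[] args) {
        // 3:2 scheme on a 10-node cluster: 5 nodes are needed, 10 are live.
        System.out.println(canWrite(3, 2, 10)); // true

        // 6:3 scheme needs 9 live nodes; 8 are not enough.
        System.out.println(canWrite(6, 3, 8));  // false

        // With 2 parity blocks, 2 failures are tolerated but 3 are not.
        System.out.println(canRead(2, 2));      // true
        System.out.println(canRead(2, 3));      // false
    }
}
```

This ignores everything the real read and write paths have to handle (pipeline selection, retries, partial stripes); it only captures the counting rule stated in the summary.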
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Uma
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Apr 5, 2022 at 10:58 PM Uma gangumalla <umamah...@apache.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Dear Ozone Devs,
>>>>>>>>>>>>
>>>>>>>>>>>> As you may know, we have been actively developing Ozone Erasure
>>>>>>>>>>>> Coding support in a separate branch HDDS-3816-ec.
>>>>>>>>>>>>
>>>>>>>>>>>> We have finished the development of EC key write and read
>>>>>>>>>>>> functionality. The support of offline recovery (recovering replica
>>>>>>>>>>>> from node loss) will be part of second phase work.
>>>>>>>>>>>>
>>>>>>>>>>>> Since the code has already grown and increasingly started seeing
>>>>>>>>>>>> merge complications, we would like to propose to merge the current
>>>>>>>>>>>> EC branch into master.
>>>>>>>>>>>>
>>>>>>>>>>>> We filed the new JIRA (HDDS-6462) for the second phase of work and
>>>>>>>>>>>> continued the offline recovery work there.
>>>>>>>>>>>>
>>>>>>>>>>>> Details on Changes:
>>>>>>>>>>>>
>>>>>>>>>>>> - Most of the EC core logic went to newly extended classes. Key
>>>>>>>>>>>>   changes went into EC*OutputStream and EC*InputStream classes for
>>>>>>>>>>>>   write and read respectively. Based on replication type,
>>>>>>>>>>>>   ECPipelineProvider will be chosen for creating EC pipelines.
>>>>>>>>>>>>
>>>>>>>>>>>> - Since we cannot represent the EC replication in the existing
>>>>>>>>>>>>   replication factor, we have introduced ECReplicationConfig. The
>>>>>>>>>>>>   ReplicationConfig interface is already pushed to master, so it's
>>>>>>>>>>>>   not a new idea coming through this branch merge now. What is newly
>>>>>>>>>>>>   coming here is the ECReplicationConfig class which can be used to
>>>>>>>>>>>>   express EC replication configuration.
>>>>>>>>>>>>
>>>>>>>>>>>> - We wanted to provide the support to enable EC at bucket level. To
>>>>>>>>>>>>   simplify some complications, we have moved the default replication
>>>>>>>>>>>>   configurations from client to server.
>>>>>>>>>>>>
>>>>>>>>>>>> - Client side replication type and replication factor were removed
>>>>>>>>>>>>   from the configuration files, and we introduced
>>>>>>>>>>>>   ozone.server.default.replication and
>>>>>>>>>>>>   ozone.server.default.replication.type. We would continue to respect
>>>>>>>>>>>>   a value that is configured at client side explicitly or passed
>>>>>>>>>>>>   through APIs; otherwise server side bucket level properties or
>>>>>>>>>>>>   server side default configuration would take effect.
>>>>>>>>>>>>
>>>>>>>>>>>> - Other than this change, the rest of EC side code should not impact
>>>>>>>>>>>>   any of the existing code flows.
>>>>>>>>>>>>
>>>>>>>>>>>> We have finished documentation JIRA (HDDS-6172) for covering this
>>>>>>>>>>>> feature and we will continue to improve further in master.
>>>>>>>>>>>>
>>>>>>>>>>>> Git Branch Name: HDDS-3816-ec
>>>>>>>>>>>>
>>>>>>>>>>>> JIRAs: HDDS-3816 and HDDS-5351
>>>>>>>>>>>>
>>>>>>>>>>>> Completed tasks: ~142
>>>>>>>>>>>>
>>>>>>>>>>>> + We are covering the following two mandatory JIRAs:
>>>>>>>>>>>>
>>>>>>>>>>>> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older
>>>>>>>>>>>>    server could fail due to the unavailability for client default
>>>>>>>>>>>>    replication config
>>>>>>>>>>>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
>>>>>>>>>>>>
>>>>>>>>>>>> PR reviews are in progress and expected to close in a day or two.
>>>>>>>>>>>>
>>>>>>>>>>>> A few other JIRAs in HDDS-3816 are still open but I believe they're
>>>>>>>>>>>> not blockers for merge.
>>>>>>>>>>>>
>>>>>>>>>>>> In short what you can do now with this feature:
>>>>>>>>>>>>
>>>>>>>>>>>> - You can enable EC at bucket level and cluster level.
>>>>>>>>>>>>   How to enable it at bucket level? Just create the bucket by passing
>>>>>>>>>>>>   the EC replication options.
>>>>>>>>>>>>
>>>>>>>>>>>> - You can create EC keys and read the same back.
>>>>>>>>>>>>
>>>>>>>>>>>> - You should be able to continue writing even when chosen nodes are
>>>>>>>>>>>>   failing. (Of course, a minimum of Data+Parity live nodes should be
>>>>>>>>>>>>   available in the cluster to complete the write.)
>>>>>>>>>>>>
>>>>>>>>>>>> - You should be able to read the file back even if a few nodes failed
>>>>>>>>>>>>   in the same EC block group. (Failures should not be more than the
>>>>>>>>>>>>   parity number of nodes.)
>>>>>>>>>>>>
>>>>>>>>>>>> What is pending? Offline recovery of lost/missing EC containers. As
>>>>>>>>>>>> mentioned above, post merge of this branch, I will create a separate
>>>>>>>>>>>> JIRA for starting the work for OfflineRecovery.
>>>>>>>>>>>>
>>>>>>>>>>>> There are automated acceptance test cases already added: HDDS-6231.
>>>>>>>>>>>>
>>>>>>>>>>>> In addition to that, we have also performed basic Acceptance Testing
>>>>>>>>>>>> in a physical cluster:
>>>>>>>>>>>>
>>>>>>>>>>>> 1. Installed 10 nodes cluster and created EC bucket (3:2).
>>>>>>>>>>>>    Uploaded 10GB key.
>>>>>>>>>>>>    Downloaded the same key and checked the md5sum.
>>>>>>>>>>>>
>>>>>>>>>>>> 2. Uploaded 8GB key.
>>>>>>>>>>>>    Downloaded the same key and checked the md5sum.
>>>>>>>>>>>>
>>>>>>>>>>>> 3. Uploaded 3MB key.
>>>>>>>>>>>>    Downloaded the same and verified md5sum.
>>>>>>>>>>>>
>>>>>>>>>>>> 4. Changed bucket to (6:3).
>>>>>>>>>>>>    Uploaded 8GB key.
>>>>>>>>>>>>    Downloaded the same.
>>>>>>>>>>>>
>>>>>>>>>>>> Also verified that the new key is in the 6:3 policy and old keys are
>>>>>>>>>>>> still 3:2. Verified with several different size key writes and reads.
>>>>>>>>>>>>
>>>>>>>>>>>> Merge checklist items assessment is here:
>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
>>>>>>>>>>>>
>>>>>>>>>>>> Big shoutout to Stephen O'Donnell <sodonn...@cloudera.com>, Istvan
>>>>>>>>>>>> Fajth <pi...@cloudera.com> for great efforts in core development and
>>>>>>>>>>>> also thanks a lot to Sammi, Mingchao Zhao, Mark Gui, Kaijie for
>>>>>>>>>>>> collaborating on some of the EC tasks.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks to Marton for design discussion and on some dev tasks as well.
>>>>>>>>>>>> Thanks to many others who were involved in design discussions, Arpit,
>>>>>>>>>>>> Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi,
>>>>>>>>>>>> Prashanth, Rakesh, Yiqun Lin.
>>>>>>>>>>>> Sorry if I miss anyone here, but your efforts are much appreciated.
>>>>>>>>>>>> Without your tremendous help, we would have not reached this position
>>>>>>>>>>>> yet.
>>>>>>>>>>>>
>>>>>>>>>>>> If there are no objections for the merge, I will start the official
>>>>>>>>>>>> vote later.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>>
>>>>>>>>>>>> EC Branch Devs
>>>>>>>>>>>>
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: dev-unsubscr...@ozone.apache.org
>>>>>>> For additional commands, e-mail: dev-h...@ozone.apache.org
>>>>>>>
>>>>>> --
>>>>>> Thanks & Regards,
>>>>>> Aravindan