Thanks for the great work! I am +1 for merging (non-binding).

mingchao zhao <captain...@apache.org> wrote on Thu, Apr 7, 2022 at 14:18:
> +1 for the merge. Thanks
>
> Mukul Kumar Singh <mksingh.apa...@gmail.com> wrote on Thu, Apr 7, 2022 at 14:05:
>
> > +1 for the merge.
> >
> > Thanks Lokesh
> >
> > On 07/04/22 11:29 am, Lokesh Jain wrote:
> > > +1 for merge
> > >
> > > Thanks
> > > Lokesh
> > >
> > >> On 07-Apr-2022, at 11:06 AM, Tsz Wo Sze <szets...@gmail.com> wrote:
> > >>
> > >> +1
> > >> We should merge it so that more people can try it. We can work on the
> > >> remaining tasks in the master branch. Thanks a lot!
> > >>
> > >> Tsz-Wo
> > >>
> > >> On Thu, Apr 7, 2022 at 1:17 PM Aravindan Vijayan
> > >> <avija...@cloudera.com.invalid> wrote:
> > >>
> > >>> +1 for the merge. Thanks for the great work!
> > >>>
> > >>> On Wed, Apr 6, 2022 at 9:45 PM Prashant Pogde <ppo...@cloudera.com.invalid>
> > >>> wrote:
> > >>>
> > >>>> +1 for the EC branch merge.
> > >>>>
> > >>>> Regards,
> > >>>> Prashant
> > >>>>
> > >>>>> On Apr 6, 2022, at 8:20 PM, Siddharth Wagle <swa...@apache.org> wrote:
> > >>>>>
> > >>>>> +1 for the EC branch merge.
> > >>>>>
> > >>>>> Best,
> > >>>>> Sid
> > >>>>>
> > >>>>> On Wed, Apr 6, 2022 at 8:05 PM guimark <guim...@126.com> wrote:
> > >>>>>
> > >>>>>> Great news!
> > >>>>>> +1 to merge.
> > >>>>>>
> > >>>>>> At 2022-04-06 22:18:31, "Stephen O'Donnell" <sodonn...@cloudera.com.INVALID>
> > >>>>>> wrote:
> > >>>>>>> I have been working on the code on this branch for some time, and I
> > >>>>>>> believe it is in a good state to merge now. It is mostly new code, and if
> > >>>>>>> nothing attempts to use EC, none of the EC code paths will be executed.
> > >>>>>>>
> > >>>>>>> +1 to merge from me.
> > >>>>>>>
> > >>>>>>> Stephen.
> > >>>>>>>
> > >>>>>>> On Wed, Apr 6, 2022 at 7:11 AM Uma gangumalla <umamah...@apache.org>
> > >>>>>>> wrote:
> > >>>>>>>> =====Few Edits Below===================
> > >>>>>>>>
> > >>>>>>>> Dear Ozone Devs,
> > >>>>>>>>
> > >>>>>>>> As you may know, we have been actively developing Ozone Erasure Coding
> > >>>>>>>> support in a separate branch HDDS-3816-ec.
> > >>>>>>>>
> > >>>>>>>> We have finished the development of EC key write and read functionality.
> > >>>>>>>> The support of offline recovery (recovering replicas after node loss)
> > >>>>>>>> will be part of the second phase of work.
> > >>>>>>>>
> > >>>>>>>> Since the code has already grown and we are increasingly seeing merge
> > >>>>>>>> complications, we would like to merge the current EC branch into master.
> > >>>>>>>> We filed a new JIRA (HDDS-6462) for the second phase of work and are
> > >>>>>>>> continuing the offline recovery work there (we have uploaded the design
> > >>>>>>>> doc there).
> > >>>>>>>>
> > >>>>>>>> Details on Changes:
> > >>>>>>>>
> > >>>>>>>> - Most of the EC core logic went into newly extended classes. Key changes
> > >>>>>>>>   went into the EC*OutputStream and EC*InputStream classes for write and
> > >>>>>>>>   read respectively. Based on replication type, ECPipelineProvider will
> > >>>>>>>>   be chosen for creating EC pipelines.
> > >>>>>>>>
> > >>>>>>>> - Since we cannot represent EC replication in the existing replication
> > >>>>>>>>   factor, we have introduced ECReplicationConfig. The ReplicationConfig
> > >>>>>>>>   interface is already pushed to master, so it’s not a new idea coming
> > >>>>>>>>   through this branch merge now. What is newly coming here is the
> > >>>>>>>>   ECReplicationConfig class, which can be used to express EC replication
> > >>>>>>>>   configuration.
> > >>>>>>>>
> > >>>>>>>> - We wanted to provide support for enabling EC at bucket level. To
> > >>>>>>>>   simplify some complications, we have moved the default replication
> > >>>>>>>>   configurations from client to server.
> > >>>>>>>>
> > >>>>>>>> - Client side replication type and replication factor were removed from
> > >>>>>>>>   the configuration files, and ozone.server.default.replication and
> > >>>>>>>>   ozone.server.default.replication.type were introduced. We would
> > >>>>>>>>   continue to respect values configured explicitly at client side or
> > >>>>>>>>   passed through APIs; otherwise server side bucket level properties or
> > >>>>>>>>   the server side default configuration would take effect.
> > >>>>>>>>
> > >>>>>>>> - Other than this change, the rest of the EC side code should not impact
> > >>>>>>>>   any of the existing code flows.
> > >>>>>>>>
> > >>>>>>>> We have finished the documentation JIRA (HDDS-6172) covering this feature
> > >>>>>>>> and we will continue to improve it further in master.
> > >>>>>>>>
> > >>>>>>>> Git Branch Name: HDDS-3816-ec
> > >>>>>>>>
> > >>>>>>>> JIRAs: HDDS-3816 and HDDS-5351
> > >>>>>>>>
> > >>>>>>>> Completed tasks: ~142
> > >>>>>>>>
> > >>>>>>>> + We are covering the following two mandatory JIRAs to come in:
> > >>>>>>>>
> > >>>>>>>> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older
> > >>>>>>>>    server could fail due to the unavailability for client default
> > >>>>>>>>    replication config
> > >>>>>>>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
> > >>>>>>>>
> > >>>>>>>> PR reviews are in progress and expected to close in a day or two.
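
The precedence described in the change list above (an explicit client-side setting first, then bucket-level properties, then the server-side default) can be sketched in a few lines. This is only an illustration under stated assumptions: `resolve_replication` and the string labels are hypothetical and are not part of Ozone's API.

```python
from typing import Optional


def resolve_replication(client_value: Optional[str],
                        bucket_value: Optional[str],
                        server_default: str) -> str:
    """Hypothetical helper: return the first setting that is present.

    Order follows the mail: explicit client setting, then the bucket-level
    property, then the server-side default.
    """
    if client_value is not None:
        return client_value
    if bucket_value is not None:
        return bucket_value
    return server_default


if __name__ == "__main__":
    # Labels are illustrative only; real Ozone values may be formatted differently.
    print(resolve_replication(None, "EC 3:2", "RATIS THREE"))      # bucket-level value is used
    print(resolve_replication("EC 6:3", "EC 3:2", "RATIS THREE"))  # explicit client value is used
    print(resolve_replication(None, None, "RATIS THREE"))          # falls back to server default
```

Keeping the fallback on the server means a client that sets nothing simply inherits whatever the bucket or cluster is configured for.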
> > >>>>>>>>
> > >>>>>>>> A few other JIRAs in HDDS-3816 are still open, but I believe they're not
> > >>>>>>>> blockers for the merge.
> > >>>>>>>>
> > >>>>>>>> In short, what you can do now with this feature:
> > >>>>>>>>
> > >>>>>>>> - You can enable EC at bucket level and cluster level.
> > >>>>>>>>
> > >>>>>>>>   How to enable it at bucket level? Just create the bucket by passing the
> > >>>>>>>>   EC replication options.
> > >>>>>>>>
> > >>>>>>>> - You can create EC keys and read the same back.
> > >>>>>>>> - You should be able to continue writing even when chosen nodes are
> > >>>>>>>>   failing. (Of course, a minimum of Data+Parity live nodes should be
> > >>>>>>>>   available in the cluster to complete the write.)
> > >>>>>>>> - You should be able to read the file back even if a few nodes failed in
> > >>>>>>>>   the same EC block group. (Failures should not be more than the parity
> > >>>>>>>>   number of nodes.)
> > >>>>>>>>
> > >>>>>>>> What is pending? Offline recovery of lost/missing EC containers. As
> > >>>>>>>> mentioned above, post merge of this branch, I will create a separate JIRA
> > >>>>>>>> for starting the work for OfflineRecovery.
> > >>>>>>>>
> > >>>>>>>> There are automated acceptance test cases already added: HDDS-6231
> > >>>>>>>>
> > >>>>>>>> In addition to that, we have also performed basic acceptance testing in a
> > >>>>>>>> physical cluster:
> > >>>>>>>>
> > >>>>>>>> 1. Installed a 10-node cluster and created an EC bucket (3:2).
> > >>>>>>>>    Uploaded a 10GB key.
> > >>>>>>>>    Downloaded the same key and checked the md5sum.
> > >>>>>>>>
> > >>>>>>>> 2. Uploaded an 8GB key.
> > >>>>>>>>    Downloaded the same key and checked the md5sum.
> > >>>>>>>>
> > >>>>>>>> 3. Uploaded a 3MB key.
> > >>>>>>>>    Downloaded the same and verified the md5sum.
> > >>>>>>>>
> > >>>>>>>> 4. Changed the bucket to (6:3).
> > >>>>>>>>    Uploaded an 8GB key.
> > >>>>>>>>    Downloaded the same.
> > >>>>>>>>
> > >>>>>>>> Also verified that the new key is in the 6:3 policy and old keys remain
> > >>>>>>>> 3:2. Verified with several different size key writes and reads.
> > >>>>>>>>
> > >>>>>>>> Since the merge discussion thread, we have stabilized the code well and
> > >>>>>>>> fixed several bugs.
> > >>>>>>>>
> > >>>>>>>> Merge checklist items assessment is here:
> > >>>>>>>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
> > >>>>>>>>
> > >>>>>>>> Big shoutout to Stephen O'Donnell <sodonn...@cloudera.com> and Istvan
> > >>>>>>>> Fajth <pi...@cloudera.com> for great efforts in core development, and
> > >>>>>>>> also thanks a lot to Sammi, Mingchao Zhao, Mark Gui, Kaijie, Attila for
> > >>>>>>>> collaborating on some of the EC tasks.
> > >>>>>>>>
> > >>>>>>>> Thanks to Marton for design discussion and on some dev tasks as well.
> > >>>>>>>> Thanks to many others who were involved in design discussions: Arpit,
> > >>>>>>>> Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi, Prashanth,
> > >>>>>>>> Rakesh, Yiqun Lin.
> > >>>>>>>> Sorry if I miss anyone here, but your efforts are much appreciated.
> > >>>>>>>> Without your tremendous help, we would not have reached this position
> > >>>>>>>> yet.
> > >>>>>>>>
> > >>>>>>>> To start with, here is my +1
> > >>>>>>>>
> > >>>>>>>> The vote will run for 5 days.
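
The write and read tolerance rules stated in the mail (a write needs at least Data+Parity live nodes; a read survives failures up to the parity count within one block group) can be expressed as two small checks. This is a minimal sketch, assuming a hypothetical `ECScheme` class; it models only the arithmetic from the mail, not how Ozone implements it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ECScheme:
    """Hypothetical model of an EC policy such as 3:2 or 6:3 (data:parity)."""
    data: int
    parity: int

    def can_complete_write(self, live_nodes: int) -> bool:
        # A write needs one live node per data block and per parity block.
        return live_nodes >= self.data + self.parity

    def can_read(self, failed_nodes_in_group: int) -> bool:
        # A block group stays readable while failures do not exceed parity.
        return failed_nodes_in_group <= self.parity


if __name__ == "__main__":
    ec_3_2 = ECScheme(data=3, parity=2)
    ec_6_3 = ECScheme(data=6, parity=3)
    print(ec_3_2.can_read(2))             # two failures are within parity
    print(ec_3_2.can_read(3))             # three failures exceed parity
    print(ec_6_3.can_complete_write(10))  # 10 nodes cover 6 + 3
```

For the 10-node test cluster mentioned in the mail, both the 3:2 and 6:3 policies satisfy the write condition.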
> > >>>>>>>>
> > >>>>>>>> Regards,
> > >>>>>>>> Uma
> > >>>>>>>>
> > >>>>>>>> On Tue, Apr 5, 2022 at 10:58 PM Uma gangumalla <umamah...@apache.org>
> > >>>>>>>> wrote:
> > >>>>>>>>
> > >>>>>>>>> Dear Ozone Devs,
> > >>>>>>>>>
> > >>>>>>>>> As you may know, we have been actively developing Ozone Erasure Coding
> > >>>>>>>>> support in a separate branch HDDS-3816-ec.
> > >>>>>>>>>
> > >>>>>>>>> We have finished the development of EC key write and read functionality.
> > >>>>>>>>> The support of offline recovery (recovering replicas after node loss)
> > >>>>>>>>> will be part of the second phase of work.
> > >>>>>>>>>
> > >>>>>>>>> Since the code has already grown and we are increasingly seeing merge
> > >>>>>>>>> complications, we would like to propose to merge the current EC branch
> > >>>>>>>>> into master.
> > >>>>>>>>>
> > >>>>>>>>> We filed a new JIRA (HDDS-6462) for the second phase of work and are
> > >>>>>>>>> continuing the offline recovery work there.
> > >>>>>>>>>
> > >>>>>>>>> Details on Changes:
> > >>>>>>>>>
> > >>>>>>>>> - Most of the EC core logic went into newly extended classes. Key
> > >>>>>>>>>   changes went into the EC*OutputStream and EC*InputStream classes for
> > >>>>>>>>>   write and read respectively. Based on replication type,
> > >>>>>>>>>   ECPipelineProvider will be chosen for creating EC pipelines.
> > >>>>>>>>>
> > >>>>>>>>> - Since we cannot represent EC replication in the existing replication
> > >>>>>>>>>   factor, we have introduced ECReplicationConfig. The ReplicationConfig
> > >>>>>>>>>   interface is already pushed to master, so it’s not a new idea coming
> > >>>>>>>>>   through this branch merge now.
> > >>>>>>>>>   What is newly coming here is the ECReplicationConfig class, which can
> > >>>>>>>>>   be used to express EC replication configuration.
> > >>>>>>>>>
> > >>>>>>>>> - We wanted to provide support for enabling EC at bucket level. To
> > >>>>>>>>>   simplify some complications, we have moved the default replication
> > >>>>>>>>>   configurations from client to server.
> > >>>>>>>>>
> > >>>>>>>>> - Client side replication type and replication factor were removed from
> > >>>>>>>>>   the configuration files, and ozone.server.default.replication and
> > >>>>>>>>>   ozone.server.default.replication.type were introduced. We would
> > >>>>>>>>>   continue to respect values configured explicitly at client side or
> > >>>>>>>>>   passed through APIs; otherwise server side bucket level properties or
> > >>>>>>>>>   the server side default configuration would take effect.
> > >>>>>>>>>
> > >>>>>>>>> - Other than this change, the rest of the EC side code should not impact
> > >>>>>>>>>   any of the existing code flows.
> > >>>>>>>>>
> > >>>>>>>>> We have finished the documentation JIRA (HDDS-6172) covering this
> > >>>>>>>>> feature and we will continue to improve it further in master.
> > >>>>>>>>>
> > >>>>>>>>> Git Branch Name: HDDS-3816-ec
> > >>>>>>>>>
> > >>>>>>>>> JIRAs: HDDS-3816 and HDDS-5351
> > >>>>>>>>>
> > >>>>>>>>> Completed tasks: ~142
> > >>>>>>>>>
> > >>>>>>>>> + We are covering the following two mandatory JIRAs:
> > >>>>>>>>>
> > >>>>>>>>> 1. HDDS-6209: EC: [Forward compatibility issue] New client to older
> > >>>>>>>>>    server could fail due to the unavailability for client default
> > >>>>>>>>>    replication config
> > >>>>>>>>>
> > >>>>>>>>> 2. HDDS-5909: EC: Onboard EC into upgrade framework.
> > >>>>>>>>>
> > >>>>>>>>> PR reviews are in progress and expected to close in a day or two.
> > >>>>>>>>>
> > >>>>>>>>> A few other JIRAs in HDDS-3816 are still open, but I believe they're not
> > >>>>>>>>> blockers for the merge.
> > >>>>>>>>>
> > >>>>>>>>> In short, what you can do now with this feature:
> > >>>>>>>>>
> > >>>>>>>>> - You can enable EC at bucket level and cluster level.
> > >>>>>>>>>
> > >>>>>>>>>   How to enable it at bucket level? Just create the bucket by passing
> > >>>>>>>>>   the EC replication options.
> > >>>>>>>>>
> > >>>>>>>>> - You can create EC keys and read the same back.
> > >>>>>>>>> - You should be able to continue writing even when chosen nodes are
> > >>>>>>>>>   failing. (Of course, a minimum of Data+Parity live nodes should be
> > >>>>>>>>>   available in the cluster to complete the write.)
> > >>>>>>>>> - You should be able to read the file back even if a few nodes failed in
> > >>>>>>>>>   the same EC block group. (Failures should not be more than the parity
> > >>>>>>>>>   number of nodes.)
> > >>>>>>>>>
> > >>>>>>>>> What is pending? Offline recovery of lost/missing EC containers. As
> > >>>>>>>>> mentioned above, post merge of this branch, I will create a separate
> > >>>>>>>>> JIRA for starting the work for OfflineRecovery.
> > >>>>>>>>>
> > >>>>>>>>> There are automated acceptance test cases already added: HDDS-6231
> > >>>>>>>>>
> > >>>>>>>>> In addition to that, we have also performed basic acceptance testing in
> > >>>>>>>>> a physical cluster:
> > >>>>>>>>>
> > >>>>>>>>> 1. Installed a 10-node cluster and created an EC bucket (3:2).
> > >>>>>>>>>    Uploaded a 10GB key.
> > >>>>>>>>>    Downloaded the same key and checked the md5sum.
> > >>>>>>>>>
> > >>>>>>>>> 2. Uploaded an 8GB key.
> > >>>>>>>>>    Downloaded the same key and checked the md5sum.
> > >>>>>>>>>
> > >>>>>>>>> 3. Uploaded a 3MB key.
> > >>>>>>>>>    Downloaded the same and verified the md5sum.
> > >>>>>>>>>
> > >>>>>>>>> 4. Changed the bucket to (6:3).
> > >>>>>>>>>    Uploaded an 8GB key.
> > >>>>>>>>>    Downloaded the same.
> > >>>>>>>>>
> > >>>>>>>>> Also verified that the new key is in the 6:3 policy and old keys remain
> > >>>>>>>>> 3:2. Verified with several different size key writes and reads.
> > >>>>>>>>>
> > >>>>>>>>> Merge checklist items assessment is here:
> > >>>>>>>>> https://cwiki.apache.org/confluence/display/OZONE/Ozone+EC+Branch%28HDDS-3816-ec%29+Phase-1+%3A+Merge+Checklist
> > >>>>>>>>>
> > >>>>>>>>> Big shoutout to Stephen O'Donnell <sodonn...@cloudera.com> and Istvan
> > >>>>>>>>> Fajth <pi...@cloudera.com> for great efforts in core development, and
> > >>>>>>>>> also thanks a lot to Sammi, Mingchao Zhao, Mark Gui, Kaijie for
> > >>>>>>>>> collaborating on some of the EC tasks.
> > >>>>>>>>>
> > >>>>>>>>> Thanks to Marton for design discussion and on some dev tasks as well.
> > >>>>>>>>> Thanks to many others who were involved in design discussions: Arpit,
> > >>>>>>>>> Sidd, Jitendra, Mukul, Sanjay, Karthik, Bharat, Nanda, Shashi,
> > >>>>>>>>> Prashanth, Rakesh, Yiqun Lin.
> > >>>>>>>>> Sorry if I miss anyone here, but your efforts are much appreciated.
> > >>>>>>>>> Without your tremendous help, we would not have reached this position
> > >>>>>>>>> yet.
> > >>>>>>>>>
> > >>>>>>>>> If there are no objections for the merge, I will start the official vote
> > >>>>>>>>> later.
> > >>>>>>>>>
> > >>>>>>>>> Regards,
> > >>>>>>>>>
> > >>>>>>>>> EC Branch Devs
> > >>>>
> > >>>> ---------------------------------------------------------------------
> > >>>> To unsubscribe, e-mail: dev-unsubscr...@ozone.apache.org
> > >>>> For additional commands, e-mail: dev-h...@ozone.apache.org
> > >>>
> > >>> --
> > >>> Thanks & Regards,
> > >>> Aravindan