Re: [VOTE] Release Hadoop-3.1.3-RC0
+1 (binding)

Thanks Zhankun for all of your hard work on this release.

I downloaded and built the source and ran it on an insecure multi-node pseudo cluster.

I performed various YARN manual tests, including creating custom resources, creating queue submission ACLs, and queue refreshes.

One concern is that preemption does not seem to be working when only the custom resources are over the queue capacity, but I don't think this is something introduced with this release.

-Eric

On Thursday, September 12, 2019, 3:04:44 AM CDT, Zhankun Tang wrote:

Hi folks,

Thanks to everyone's help on this release. Special thanks to Rohith, Wei-Chiu, Akira, Sunil, Wangda!

I have created a release candidate (RC0) for Apache Hadoop 3.1.3.

The RC release artifacts are available at:
http://home.apache.org/~ztang/hadoop-3.1.3-RC0/

The maven artifacts are staged at:
https://repository.apache.org/content/repositories/orgapachehadoop-1228/

The RC tag in git is here:
https://github.com/apache/hadoop/tree/release-3.1.3-RC0

And my public key is at:
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS

*This vote will run for 7 days, ending on Sept.19th at 11:59 pm PST.*

For the testing, I have run several Spark and distributed shell jobs in my pseudo cluster.

My +1 (non-binding) to start.

BR,
Zhankun

On Wed, 4 Sep 2019 at 15:56, zhankun tang wrote:

> Hi all,
>
> Thanks for everyone helping in resolving all the blockers targeting Hadoop
> 3.1.3[1]. We've cleaned all the blockers and moved non-blocker issues out
> to 3.1.4.
>
> I'll cut the branch today and call a release vote soon. Thanks!
>
> [1]. https://s.apache.org/5hj5i
>
> BR,
> Zhankun
>
> On Wed, 21 Aug 2019 at 12:38, Zhankun Tang wrote:
>
>> Hi folks,
>>
>> Apache Hadoop 3.1.2 was released in Feb 2019.
>>
>> More than 6 months have passed and there are 246 fixes [1].
>> There are also 2 blocker and 4 critical issues [2].
>>
>> (As Wei-Chiu Chuang mentioned, HDFS-13596 will be another blocker)
>>
>> I propose my plan to do a maintenance release of 3.1.3 in the next few
>> (one or two) weeks.
>>
>> Hadoop 3.1.3 release plan:
>>
>> Code Freezing Date: *25th August 2019 PDT*
>>
>> Release Date: *31st August 2019 PDT*
>>
>> Please feel free to share your insights on this. Thanks!
>>
>> [1] https://s.apache.org/zw8l5
>>
>> [2] https://s.apache.org/fjol5
>>
>> BR,
>>
>> Zhankun

- To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
Re: [VOTE] Release Apache Hadoop 2.10.0 (RC1)
Hi Jonathan,

Thanks very much for all of your work on this release.

I have a concern about cross-queue (inter-queue) preemption in 2.10. In 2.8, on a 6 node pseudo-cluster, preempting from one queue to meet the needs of another queue seems to work as expected. However, with 2.10 in the same pseudo-cluster (with the same config properties), only one container was preempted for the AM and then nothing else.

I don't know how the community feels about holding up the 2.10.0 release for this issue, but we need to get to the bottom of this before we can go to 2.10.x. I am still investigating.

Thanks,
-Eric

On Tuesday, October 22, 2019, 4:55:29 PM CDT, Jonathan Hung wrote:

> Hi folks,
>
> This is the second release candidate for the first release of Apache Hadoop
> 2.10 line. It contains 362 fixes/improvements since 2.9 [1]. It includes
> features such as:
>
> - User-defined resource types
> - Native GPU support as a schedulable resource type
> - Consistent reads from standby node
> - Namenode port based selective encryption
> - Improvements related to rolling upgrade support from 2.x to 3.x
> - Cost based fair call queue
>
> The RC1 artifacts are at: http://home.apache.org/~jhung/hadoop-2.10.0-RC1/
>
> RC tag is release-2.10.0-RC1.
>
> The maven artifacts are hosted here:
> https://repository.apache.org/content/repositories/orgapachehadoop-1243/
>
> My public key is available here:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>
> The vote will run for 5 weekdays, until Tuesday, October 29 at 3:00 pm PDT.
>
> Thanks,
> Jonathan Hung
Re: [VOTE] Release Apache Hadoop 2.10.0 (RC0)
I ran a few compatibility tests between 2.10.0 and 3.3.0 (trunk)

Unfortunately, I ran into the following problem:

Running with 2.10 RM and 3.3.0 (trunk) NM fails attempts with the following error:

2019-10-26 15:44:06,885 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RPC$VersionMismatch): Protocol org.apache.hadoop.mapred.TaskUmbilicalProtocol version mismatch. (client = 19, server = 21)

The AM happened to launch on the 3.3.0 node. Is this a protobuf issue? I thought we addressed that?

-Eric Payne

On Tuesday, October 22, 2019, 8:39:38 PM CDT, Jonathan Hung wrote:

Hi Eric, we've run some basic HDFS commands with a 3.2.1 namenode and 2.10.0 clients and datanodes. Everything worked as expected.

Jonathan Hung

On Tue, Oct 22, 2019 at 3:04 PM Eric Badger wrote:

> Hi Jonathan,
>
> Thanks for putting this RC together. You stated that there are
> improvements related to rolling upgrades from 2.x to 3.x and I know I have
> seen multiple JIRAs getting committed to that effect. Could you describe
> any tests that you have done to verify rolling upgrade compatibility
> for 3.x servers talking to 2.x clients and vice versa?
>
> Thanks,
>
> Eric
>
> On Tue, Oct 22, 2019 at 1:49 PM Jonathan Hung wrote:
>
>> Thanks Konstantin and Zhankun. Unfortunately a feature slipped our radar
>> (HDFS-14667). Since this is the first of a minor release, we would like to
>> get it into 2.10.0.
>>
>> HDFS-14667 has been committed to branch-2.10.0, I will be rolling an RC1
>> shortly.
>>
>> Jonathan Hung
>>
>> On Tue, Oct 22, 2019 at 1:39 AM Zhankun Tang wrote:
>>
>> > Thanks for the effort, Jonathan!
>> >
>> > +1 (non-binding) on RC0.
>> > - Set up a single node cluster with the binary tarball
>> > - Run Spark Pi and pySpark job
>> >
>> > BR,
>> > Zhankun
>> >
>> > On Tue, 22 Oct 2019 at 14:31, Konstantin Shvachko wrote:
>> >
>> >> +1 on RC0.
>> >> - Verified signatures
>> >> - Built from sources
>> >> - Ran unit tests for new features
>> >> - Checked artifacts on Nexus, made sure the sources are present.
>> >>
>> >> Thanks
>> >> --Konstantin
>> >>
>> >> On Wed, Oct 16, 2019 at 6:01 PM Jonathan Hung wrote:
>> >>
>> >> > Hi folks,
>> >> >
>> >> > This is the first release candidate for the first release of Apache Hadoop
>> >> > 2.10 line. It contains 361 fixes/improvements since 2.9 [1]. It includes
>> >> > features such as:
>> >> >
>> >> > - User-defined resource types
>> >> > - Native GPU support as a schedulable resource type
>> >> > - Consistent reads from standby node
>> >> > - Namenode port based selective encryption
>> >> > - Improvements related to rolling upgrade support from 2.x to 3.x
>> >> >
>> >> > The RC0 artifacts are at:
>> >> > http://home.apache.org/~jhung/hadoop-2.10.0-RC0/
>> >> >
>> >> > RC tag is release-2.10.0-RC0.
>> >> >
>> >> > The maven artifacts are hosted here:
>> >> > https://repository.apache.org/content/repositories/orgapachehadoop-1241/
>> >> >
>> >> > My public key is available here:
>> >> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>> >> >
>> >> > The vote will run for 5 weekdays, until Wednesday, October 23 at 6:00 pm
>> >> > PDT.
>> >> >
>> >> > Thanks,
>> >> > Jonathan Hung
>> >> >
>> >> > [1]
>> >> > https://issues.apache.org/jira/issues/?jql=project%20in%20(HDFS%2C%20YARN%2C%20HADOOP%2C%20MAPREDUCE)%20AND%20resolution%20%3D%20Fixed%20AND%20fixVersion%20%3D%202.10.0%20AND%20fixVersion%20not%20in%20(2.9.2%2C%202.9.1%2C%202.9.0)
Re: [VOTE] Release Apache Hadoop 2.10.0 (RC0)
Ah! Yes! That makes sense. I will use the mapredonhdfs framework in my next set of tests.

The other compatibility tests that I ran worked as expected.

-Eric

On Saturday, October 26, 2019, 12:29:54 PM CDT, Jonathan Hung wrote:

Hi Eric,

I took a quick look, are you using mapreduce.application.framework.path to run your MR jobs? If not, this seems like expected behavior if AM and tasks get launched on different NMs with different locally installed hadoop versions?

Jonathan Hung
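Jonathan's explanation above can be sketched as a toy model. This is only an illustration, not Hadoop's RPC code: the function and class names are hypothetical, and the mapping of protocol versions to releases is inferred from the single log line in this thread (client = 19, server = 21, with the AM on the 3.3.0 node).

```python
from typing import Optional

# Versions inferred from the log line in this thread (client = 19, server = 21);
# the mapping to Hadoop releases is an assumption for illustration.
UMBILICAL_VERSION = {"2.10.0": 19, "3.3.0": 21}


class VersionMismatch(Exception):
    """Toy stand-in for the RPC version mismatch reported above."""


def effective_release(node_release: str, framework_release: Optional[str]) -> str:
    # With a job-wide framework configured, AM and tasks use that release;
    # otherwise each uses whatever Hadoop is installed on its node.
    return framework_release or node_release


def task_connects_to_am(am_node: str, task_node: str,
                        framework_release: Optional[str] = None) -> bool:
    # The AM acts as the server side and the task as the client side.
    server = UMBILICAL_VERSION[effective_release(am_node, framework_release)]
    client = UMBILICAL_VERSION[effective_release(task_node, framework_release)]
    if client != server:
        raise VersionMismatch(f"(client = {client}, server = {server})")
    return True


# AM on the 3.3.0 node, task on a 2.10.0 node, no shared framework: mismatch.
try:
    task_connects_to_am("3.3.0", "2.10.0")
    mismatch = None
except VersionMismatch as exc:
    mismatch = str(exc)

# Same placement, but both sides run one framework release: versions agree.
ok = task_connects_to_am("3.3.0", "2.10.0", framework_release="2.10.0")
print(mismatch, ok)
```

In this model the mismatch goes away once both sides resolve to the same release, which is the effect Jonathan attributes to running MR jobs with mapreduce.application.framework.path.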
Re: [VOTE] Release Apache Hadoop 2.10.0 (RC0)
Compatibility testing has gone well for me.

- In a 4-node cluster, I ran YARN rolling upgrade tests between 2.8.5 and 2.10.0
- In a 4-node cluster, I ran YARN rolling upgrade tests between 2.10.0 and trunk
- With one 4-node cluster running 2.10.0 and one 4-node cluster running trunk, I ran a word count job in each cluster whose inputs and outputs were from and to the opposite cluster.
- I verified that HDFS replication works as expected in a trunk cluster that has one 2.10.0 datanode.

Thanks,
-Eric

On Tuesday, October 22, 2019, 8:39:38 PM CDT, Jonathan Hung wrote:

Hi Eric, we've run some basic HDFS commands with a 3.2.1 namenode and 2.10.0 clients and datanodes. Everything worked as expected.

Jonathan Hung
Re: [VOTE] Release Apache Hadoop 2.10.0 (RC0)
Jonathan, I actually did all my testing on RC1. Sorry for the confusion. I'll respond on the RC1 thread.

-Eric

On Monday, October 28, 2019, 8:00:20 PM CDT, Jonathan Hung wrote:

Thanks Eric! I sent out an RC1 earlier last week, not sure if you saw that. The only diff between RC1 and RC0 is HDFS-14667. If RC1 looks good to you then it'd be great to get your testing results on that thread.

Jonathan Hung
Re: [VOTE] Release Apache Hadoop 2.10.0 (RC1)
Compatibility testing has gone well for me.

- In a 4-node cluster, I ran YARN rolling upgrade tests between 2.8.5 and 2.10.0
- In a 4-node cluster, I ran YARN rolling upgrade tests between 2.10.0 and trunk
- With one 4-node cluster running 2.10.0 and one 4-node cluster running trunk, I ran a word count job in each cluster whose inputs and outputs were from and to the opposite cluster.
- I verified that HDFS replication works as expected in a trunk cluster that has one 2.10.0 datanode.

Thanks,
-Eric

> On Tuesday, October 22, 2019, 4:55:29 PM CDT, Jonathan Hung wrote:
>
> Hi folks,
>
> This is the second release candidate for the first release of Apache Hadoop
> 2.10 line. It contains 362 fixes/improvements since 2.9 [1]. It includes
> features such as:
>
> - User-defined resource types
> - Native GPU support as a schedulable resource type
> - Consistent reads from standby node
> - Namenode port based selective encryption
> - Improvements related to rolling upgrade support from 2.x to 3.x
> - Cost based fair call queue
>
> The RC1 artifacts are at: http://home.apache.org/~jhung/hadoop-2.10.0-RC1/
>
> RC tag is release-2.10.0-RC1.
>
> The maven artifacts are hosted here:
> https://repository.apache.org/content/repositories/orgapachehadoop-1243/
>
> My public key is available here:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>
> The vote will run for 5 weekdays, until Tuesday, October 29 at 3:00 pm PDT.
>
> Thanks,
> Jonathan Hung
Re: [DISCUSS] Making 2.10 the last minor 2.x release
Thanks Jonathan for opening the discussion.

I am not in favor of this proposal. 2.10 was very recently released, and moving to 2.10 will take some time for the community. It seems premature to make a decision at this point that there will never be a need for a 2.11 release.

-Eric

On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung wrote:

Hi folks,

Given the release of 2.10.0, and the fact that it's intended to be a bridge release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor release line in branch-2. Currently, the main issue is that there are many fixes going into branch-2 (the theoretical 2.11.0) that are not going into branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will likely never see the light of day unless they are backported to branch-2.10.

To do this, I propose we:

- Delete branch-2.10
- Rename branch-2 to branch-2.10
- Set version in the new branch-2.10 to 2.10.1-SNAPSHOT

This way we get all the current branch-2 fixes into the 2.10.x release line. Then the commit chain will look like:
trunk -> branch-3.2 -> branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8

Thoughts?

Jonathan Hung

[1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
Re: [DISCUSS] Making 2.10 the last minor 2.x release
Hi Konstantin,

Sure, I understand those concerns. On the other hand, I worry about the stability of 2.10, since we will be on it for a couple of years at least. I worry that some committers may want to put new features into a branch 2 release, and without a branch-2, they will go directly into 2.10. Since we don't always catch corner cases or performance problems for some time (usually not until the release is deployed to a busy, 4-thousand node cluster), it may be very difficult to back out those changes.

It sounds like I'm in the minority here, so I'm not nixing the idea, but I do have these reservations.

Thanks,
-Eric

On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko wrote:

Hi Eric,

We had a long discussion on this list regarding making the 2.10 release the last of branch-2 releases. We intended 2.10 as a bridge release between Hadoop 2 and 3. We may have bug-fix releases for 2.10, but 2.11 is not in the picture right now, and many people may object to this idea.

I understand Jonathan's proposal as an attempt to
1. eliminate confusion about which branches people should commit their back-ports to
2. save engineering effort committing to more branches than necessary

"Branches are cheap" as our founder used to say. If we ever decide to release 2.11 we can resurrect the branch.
Until then I am in favor of Jonathan's proposal +1.

Thanks,
--Konstantin

On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung wrote:

> Thanks Eric for the comments - regarding your concerns, I feel the pros
> outweigh the cons. To me, the chances of patch releases on 2.10.x are much
> higher than a new 2.11 minor release. (There didn't seem to be many people
> outside of our company who expressed interest in getting new features to
> branch-2 prior to the 2.10.0 release.) Even now, a few weeks after 2.10.0
> release, there are 29 patches that have gone into branch-2 and 9 in
> branch-2.10, so it's already diverged quite a bit.
>
> In any case, we can always reverse this decision if we really need to, by
> recreating branch-2. But this proposal would reduce a lot of confusion IMO.
>
> Jonathan Hung
[ANNOUNCE] Jim Brennan is a new Hadoop Committer
I am pleased to announce that Jim Brennan has accepted the invitation to become a Hadoop committer focusing on the YARN space. Please reach out to Jim and welcome him in his new role.

Congratulations, Jim! Well-deserved!

-Eric Payne
Re: [E] Re: the v2 commit algorithm
Thanks Steve and Jim for bringing this issue to our attention.

IIUC, Serial commit takes minutes with mrv1, whereas with mrv2 it is very quick. With this kind of performance difference, is it wise to change the default behavior for released versions of Hadoop? Should this be limited to trunk?

Thanks,
-Eric Payne

On Wednesday, September 23, 2020, 2:16:14 PM CDT, Jim Brennan wrote:

I replied in the Jira. The speed up provided by the v2 commit algorithm is very important to us at Verizon Media (Yahoo). Please do not remove it.

I referred to this comment from Jason Lowe on the original Jira:
https://issues.apache.org/jira/browse/MAPREDUCE-4815?focusedCommentId=14271115&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14271115

I think it would be appropriate to better document the limitations of the v2 algorithm and possibly make it not be the default, as long as we can still use it.

On Wed, Sep 23, 2020 at 2:07 PM Igor Dvorzhak wrote:

> What will be the solution for object stores to have fast and correct
> commit algorithms?
>
> On Wed, Sep 23, 2020 at 11:42 AM Steve Loughran wrote:
>
>> I've got a PR up to completely remove the v2 commit algorithm
>>
>> https://github.com/apache/hadoop/pull/2320
>>
>> That may seem overkill, but while *we* know there's a small window of risk
>> (task attempt 1 failing partway through a nonatomic commit), that's not
>> known/appreciated by others.
>>
>> The patch removes the v2 codepath from FileOutputCommitter, making it a lot
>> less complicated, and when v2 is requested, a warning is printed and the
>> option ignored.
>>
>> Overkill? Maybe. But it guarantees correctness
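The trade-off discussed in this thread can be sketched with a toy in-memory model. This is not Hadoop's FileOutputCommitter: the function names are hypothetical, dictionaries stand in for directories, and the description of v1 (task commit as one step, job commit as a serial merge) reflects my understanding of the algorithm rather than anything stated in the thread.

```python
class CommitFailed(Exception):
    """Raised by the toy model when a task attempt dies mid-commit."""


def commit_task_v1(staging, task_id, files):
    # v1-style: publish the whole attempt in one step (modelled as a single
    # dict assignment, standing in for one directory rename).
    staging[task_id] = dict(files)


def commit_job_v1(staging, dest):
    # v1-style job commit: the job committer moves every task's files into
    # the destination one after another, which is the slow serial part.
    for task_files in staging.values():
        dest.update(task_files)


def commit_task_v2(dest, files, fail_after=None):
    # v2-style: each task moves its files straight into the destination,
    # one file at a time, so a crash partway leaves a partial result visible.
    for moved, (name, data) in enumerate(files.items()):
        if fail_after is not None and moved == fail_after:
            raise CommitFailed(f"attempt died after moving {moved} file(s)")
        dest[name] = data


task_files = {"part-00000-a": "x", "part-00000-b": "y"}

# v1: nothing reaches the destination until job commit.
staging, dest_v1 = {}, {}
commit_task_v1(staging, "task_0", task_files)
visible_before_job_commit = dict(dest_v1)
commit_job_v1(staging, dest_v1)

# v2: a failure after the first file leaves that file in the destination.
dest_v2 = {}
try:
    commit_task_v2(dest_v2, task_files, fail_after=1)
except CommitFailed:
    pass

print(visible_before_job_commit, dest_v1, dest_v2)
```

In this model v2 has no serial job-commit step, which corresponds to the speed-up Jim describes, while the leftover file in dest_v2 corresponds to the window of risk Steve describes.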
Re: [ANNOUNCE] Hui Fei is a new Apache Hadoop Committer
Congratulations Hui Fei!

On Wednesday, September 23, 2020, 1:07:11 PM CDT, Wei-Chiu Chuang wrote:

I am pleased to announce that Hui Fei has accepted the invitation to become a Hadoop committer.

He started contributing to the project in October 2016. Over the past 4 years he has contributed a lot in HDFS, especially in Erasure Coding, Hadoop 3 upgrade, RBF and Standby Serving reads.

One of the biggest contributions is Hadoop 2->3 rolling upgrade support. This was a major blocker for any existing Hadoop users to adopt Hadoop 3. The adoption of Hadoop 3 has gone up after this. In the past the community discussed a lot about Hadoop 3 rolling upgrade being a must-have, but no one took the initiative to make it happen. I am personally very grateful for this.

The work on EC is impressive as well. He managed to onboard EC in production at scale, fixing tricky problems. Again, I am impressed and grateful for the contribution in EC.

In addition to code contributions, he invested a lot in the community:

- Apache Hadoop Community 2019 Beijing Meetup
  https://blogs.apache.org/hadoop/entry/hadoop-community-meetup-beijing-aug
  where he discussed the operational experience of RBF in production

- Apache Hadoop Storage Community Sync Online
  https://docs.google.com/document/d/1jXM5Ujvf-zhcyw_5kiQVx6g-HeKe-YGnFS_1-qFXomI/edit#heading=h.irqxw1iy16zo
  where he discussed the Hadoop 3 rolling upgrade support

Let's congratulate Hui for this new role!

Cheers,
Wei-Chiu Chuang (on behalf of the Apache Hadoop PMC)
Re: [DISCUSS] check style changes
I would be fine with a discussion and vote on relaxing some checkstyle restrictions.

Regarding line length, my personal preference is to leave it at 80, but 80 is arbitrary and I would not oppose 100 if that's what people want.

Another one that I think should be relaxed is the limit on number of arguments to a method. I understand that a ton of arguments makes a method messy, but I find it irritating when I add an argument to something that is already over the limit and I get penalized for it. The ones I have seen are all constructor methods.

-Eric

On Thursday, May 13, 2021, 10:10:27 AM CDT, Sean Busbey wrote:

Hi folks!

I’d like to start cleaning up our nightly tests. As a bit of low hanging fruit I’d like to alter some of our check style rules to match what I think we’ve been doing in the community. How would folks prefer I make sure we have consensus on such changes?

As an example, our last nightly run had ~81k check style violations (it’s a big number but it’s not that bad given the size of the repo) and roughly 16% of those were for line lengths in excess of 80 characters but <= 100 characters.

If I wanted to change our line length check to be 100 characters rather than the default of 80, would folks rather I have a DISCUSS thread first? Or would they rather a Jira + PR with the discussion of the merits happening there?

- busbey
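As a back-of-the-envelope check on the figures quoted above, the sketch below estimates how many violations a 100-character limit would clear. The inputs are the approximate numbers from Sean's message (~81k violations, roughly 16% in the 81 to 100 character band), so the result is only an order-of-magnitude estimate; the function name is made up for illustration.

```python
def violations_cleared(total_violations: int, share_in_band: float) -> int:
    """Estimate how many violations disappear if the line-length limit is raised.

    total_violations: approximate total from the nightly run (~81k in the thread).
    share_in_band: fraction of violations that are lines longer than 80 but
                   at most 100 characters (~16% in the thread).
    """
    return round(total_violations * share_in_band)


cleared = violations_cleared(81_000, 0.16)
remaining = 81_000 - cleared
print(f"cleared ~{cleared}, remaining ~{remaining}")
```

On these rough inputs, raising the limit would remove about 13k violations and leave about 68k from other checks.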
Re: [VOTE] Hadoop 3.1.x EOL
+1 (binding) -Eric On Thursday, June 3, 2021, 1:14:51 AM CDT, Akira Ajisaka wrote: Dear Hadoop developers, Given the feedback from the discussion thread [1], I'd like to start an official vote thread for the community to vote and start the 3.1 EOL process. What this entails: (1) an official announcement that no further regular Hadoop 3.1.x releases will be made after 3.1.4. (2) resolve JIRAs that specifically target 3.1.5 as won't fix. This vote will run for 7 days and conclude by June 10th, 16:00 JST [2]. Committers are eligible to cast binding votes. Non-committers are welcome to cast non-binding votes. Here is my vote, +1 [1] https://s.apache.org/w9ilb [2] https://www.timeanddate.com/worldclock/fixedtime.html?msg=4&iso=20210610T16&p1=248 Regards, Akira
Re: [VOTE] Release Apache Hadoop 3.3.1 RC3
+1 (binding) Eric On Tuesday, June 1, 2021, 5:29:49 AM CDT, Wei-Chiu Chuang wrote: Hi community, This is the release candidate RC3 of the Apache Hadoop 3.3.1 line. All blocker issues have been resolved [1] again.
There are 2 additional issues resolved for RC3:
* Revert "MAPREDUCE-7303. Fix TestJobResourceUploader failures after HADOOP-16878"
* Revert "HADOOP-16878. FileUtil.copy() to throw IOException if the source and destination are the same"
There are 4 issues resolved for RC2:
* HADOOP-17666. Update LICENSE for 3.3.1
* MAPREDUCE-7348. TestFrameworkUploader#testNativeIO fails. (#3053)
* Revert "HADOOP-17563. Update Bouncy Castle to 1.68. (#2740)" (#3055)
* HADOOP-17739. Use hadoop-thirdparty 1.1.1. (#3064)
The Hadoop-thirdparty 1.1.1, as previously mentioned, contains two extra fixes compared to hadoop-thirdparty 1.1.0:
* HADOOP-17707. Remove jaeger document from site index.
* HADOOP-17730. Add back error_prone
The RC tag is release-3.3.1-RC3: https://github.com/apache/hadoop/releases/tag/release-3.3.1-RC3
The RC3 artifacts are at: https://home.apache.org/~weichiu/hadoop-3.3.1-RC3/
ARM artifacts: https://home.apache.org/~weichiu/hadoop-3.3.1-RC3-arm/
The maven artifacts are hosted here: https://repository.apache.org/content/repositories/orgapachehadoop-1320/
My public key is available here: https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
Things I've verified:
* all blocker issues targeting 3.3.1 have been resolved.
* stable/evolving API changes between 3.3.0 and 3.3.1 are compatible.
* LICENSE and NOTICE files checked
* RELEASENOTES and CHANGELOG
* rat check passed.
* Built HBase master branch on top of Hadoop 3.3.1 RC2, ran unit tests.
* Built Ozone master on top of Hadoop 3.3.1 RC2, ran unit tests.
* Extra: built 50 other open source projects on top of Hadoop 3.3.1 RC2. Had to patch some of them due to commons-lang migration (Hadoop 3.2.0) and dependency divergence. Issues are being identified but so far nothing is a blocker for Hadoop itself.
Please try the release and vote. The vote will run for 5 days. My +1 to start.
[1] https://issues.apache.org/jira/issues/?filter=12350491
[2] https://github.com/apache/hadoop/compare/release-3.3.1-RC1...release-3.3.1-RC3
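For anyone trying the release candidate, a common first step is checking that a downloaded artifact matches its published checksum. The following is a minimal sketch, assuming a SHA-512 digest is published next to each artifact (the thread does not say which digest files accompany RC3). The function names are hypothetical. It does not verify the GPG signature against the KEYS file, which is a separate step.

```python
import hashlib


def sha512_hex(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large tarballs are not loaded into memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, published_hex):
    """Compare the file's digest with a published hex string.

    Whitespace and letter case in `published_hex` are ignored, since
    published digests are sometimes wrapped or upper-cased (an assumption
    about the digest file's layout, not something stated in the thread).
    """
    normalized = "".join(published_hex.split()).lower()
    return sha512_hex(path) == normalized
```

A caller would pass the local path of the downloaded tarball and the hex string copied from the corresponding digest file; a `False` result means the download should not be trusted.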