Re: [VOTE] Release Hadoop-3.1.3-RC0

2019-09-19 Thread epa...@apache.org



+1 (binding)

Thanks Zhankun for all of your hard work on this release.

I downloaded and built the source and ran it on an insecure multi-node pseudo 
cluster.

I performed various YARN manual tests, including creating custom resources, 
creating queue submission ACLs, and queue refreshes.

One concern is that preemption does not seem to be working when only the custom 
resources are over the queue capacity, but I don't think this is something 
introduced with this release.

-Eric



On Thursday, September 12, 2019, 3:04:44 AM CDT, Zhankun Tang 
 wrote: 





Hi folks,

Thanks for everyone's help on this release. Special thanks to Rohith,
Wei-Chiu, Akira, Sunil, Wangda!

I have created a release candidate (RC0) for Apache Hadoop 3.1.3.

The RC release artifacts are available at:
http://home.apache.org/~ztang/hadoop-3.1.3-RC0/

The maven artifacts are staged at:
https://repository.apache.org/content/repositories/orgapachehadoop-1228/

The RC tag in git is here:
https://github.com/apache/hadoop/tree/release-3.1.3-RC0

And my public key is at:
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS

*This vote will run for 7 days, ending on Sept. 19th at 11:59 pm PST.*

For the testing, I have run several Spark and distributed shell jobs in my
pseudo cluster.

My +1 (non-binding) to start.

BR,
Zhankun

On Wed, 4 Sep 2019 at 15:56, zhankun tang  wrote:

> Hi all,
>
> Thanks to everyone for helping resolve all the blockers targeting Hadoop
> 3.1.3[1]. We've cleared all the blockers and moved the non-blocker issues
> out to 3.1.4.
>
> I'll cut the branch today and call a release vote soon. Thanks!
>
>
> [1]. https://s.apache.org/5hj5i
>
> BR,
> Zhankun
>
>
> On Wed, 21 Aug 2019 at 12:38, Zhankun Tang  wrote:
>
>> Hi folks,
>>
>> Apache Hadoop 3.1.2 was released in Feb 2019.
>>
>> More than 6 months have passed, and there are
>>
>> 246 fixes [1], plus 2 blocker and 4 critical issues [2].
>>
>> (As Wei-Chiu Chuang mentioned, HDFS-13596 will be another blocker.)
>>
>>
>> I propose my plan to do a maintenance release of 3.1.3 in the next few
>> (one or two) weeks.
>>
>> Hadoop 3.1.3 release plan:
>>
>> Code Freeze Date: *25th August 2019 PDT*
>>
>> Release Date: *31st August 2019 PDT*
>>
>>
>> Please feel free to share your insights on this. Thanks!
>>
>>
>> [1] https://s.apache.org/zw8l5
>>
>> [2] https://s.apache.org/fjol5
>>
>>
>> BR,
>>
>> Zhankun
>>
>

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



Re: [VOTE] Release Apache Hadoop 2.10.0 (RC1)

2019-10-23 Thread epa...@apache.org
Hi Jonathan,

Thanks very much for all of your work on this release.

I have a concern about cross-queue (inter-queue) preemption in 2.10.

In 2.8, on a 6-node pseudo-cluster, preempting from one queue to meet the needs 
of another queue seems to work as expected. However, with 2.10 in the same 
pseudo-cluster (with the same config properties), only one container was 
preempted for the AM, and then nothing else.

I don't know how the community feels about holding up the 2.10.0 release for 
this issue, but we need to get to the bottom of this before we can go to 
2.10.x. I am still investigating.

Thanks,
-Eric




 On Tuesday, October 22, 2019, 4:55:29 PM CDT, Jonathan Hung 
 wrote: 
> Hi folks,
> 
> This is the second release candidate for the first release of the Apache
> Hadoop 2.10 line. It contains 362 fixes/improvements since 2.9 [1]. It
> includes features such as:
> 
> - User-defined resource types
> - Native GPU support as a schedulable resource type
> - Consistent reads from standby node
> - Namenode port based selective encryption
> - Improvements related to rolling upgrade support from 2.x to 3.x
> - Cost based fair call queue
> 
> The RC1 artifacts are at: http://home.apache.org/~jhung/hadoop-2.10.0-RC1/
> 
> RC tag is release-2.10.0-RC1.
> 
> The maven artifacts are hosted here:
> https://repository.apache.org/content/repositories/orgapachehadoop-1243/
> 
> My public key is available here:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> 
> The vote will run for 5 weekdays, until Tuesday, October 29 at 3:00 pm PDT.
> 
> Thanks,
> Jonathan Hung




Re: [VOTE] Release Apache Hadoop 2.10.0 (RC0)

2019-10-26 Thread epa...@apache.org
I ran a few compatibility tests between 2.10.0 and 3.3.0 (trunk)

Unfortunately, I ran into the following problem:

With a 2.10 RM and a 3.3.0 (trunk) NM, task attempts fail with the following 
error:

2019-10-26 15:44:06,885 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RPC$VersionMismatch): Protocol org.apache.hadoop.mapred.TaskUmbilicalProtocol version mismatch. (client = 19, server = 21)

The AM happened to launch on the 3.3.0 node.

Is this a protobuf issue? I thought we addressed that?

-Eric Payne



On Tuesday, October 22, 2019, 8:39:38 PM CDT, Jonathan Hung 
 wrote: 





Hi Eric, we've run some basic HDFS commands with a 3.2.1 namenode and
2.10.0 clients and datanodes. Everything worked as expected.

Jonathan Hung


On Tue, Oct 22, 2019 at 3:04 PM Eric Badger 
wrote:

> Hi Jonathan,
>
> Thanks for putting this RC together. You stated that there are
> improvements related to rolling upgrades from 2.x to 3.x and I know I have
> seen multiple JIRAs getting committed to that effect. Could you describe
> any tests that you have done to verify rolling upgrade compatibility
> for 3.x servers talking to 2.x clients and vice versa?
>
> Thanks,
>
> Eric
>
> On Tue, Oct 22, 2019 at 1:49 PM Jonathan Hung 
> wrote:
>
>> Thanks Konstantin and Zhankun. Unfortunately a feature slipped under our
>> radar (HDFS-14667). Since this is the first release of a minor line, we
>> would like to get it into 2.10.0.
>>
>> HDFS-14667 has been committed to branch-2.10.0; I will be rolling an RC1
>> shortly.
>>
>> Jonathan Hung
>>
>>
>> On Tue, Oct 22, 2019 at 1:39 AM Zhankun Tang  wrote:
>>
>> > Thanks for the effort, Jonathan!
>> >
>> > +1 (non-binding) on RC0.
>> >  - Set up a single node cluster with the binary tarball
>> >  - Run Spark Pi and pySpark job
>> >
>> > BR,
>> > Zhankun
>> >
>> > On Tue, 22 Oct 2019 at 14:31, Konstantin Shvachko
>> > wrote:
>> >
>> >> +1 on RC0.
>> >> - Verified signatures
>> >> - Built from sources
>> >> - Ran unit tests for new features
>> >> - Checked artifacts on Nexus, made sure the sources are present.
>> >>
>> >> Thanks
>> >> --Konstantin
>> >>
>> >>
>> >> On Wed, Oct 16, 2019 at 6:01 PM Jonathan Hung 
>> >> wrote:
>> >>
>> >> > Hi folks,
>> >> >
>> >> > This is the first release candidate for the first release of the Apache
>> >> > Hadoop 2.10 line. It contains 361 fixes/improvements since 2.9 [1]. It
>> >> > includes features such as:
>> >> >
>> >> > - User-defined resource types
>> >> > - Native GPU support as a schedulable resource type
>> >> > - Consistent reads from standby node
>> >> > - Namenode port based selective encryption
>> >> > - Improvements related to rolling upgrade support from 2.x to 3.x
>> >> >
>> >> > The RC0 artifacts are at:
>> >> > http://home.apache.org/~jhung/hadoop-2.10.0-RC0/
>> >> >
>> >> > RC tag is release-2.10.0-RC0.
>> >> >
>> >> > The maven artifacts are hosted here:
>> >> > https://repository.apache.org/content/repositories/orgapachehadoop-1241/
>> >> >
>> >> > My public key is available here:
>> >> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>> >> >
>> >> > The vote will run for 5 weekdays, until Wednesday, October 23 at 6:00 pm
>> >> > PDT.
>> >> >
>> >> > Thanks,
>> >> > Jonathan Hung
>> >> >
>> >> > [1]
>> >> > https://issues.apache.org/jira/issues/?jql=project%20in%20(HDFS%2C%20YARN%2C%20HADOOP%2C%20MAPREDUCE)%20AND%20resolution%20%3D%20Fixed%20AND%20fixVersion%20%3D%202.10.0%20AND%20fixVersion%20not%20in%20(2.9.2%2C%202.9.1%2C%202.9.0)
>> >> >
>> >>
>> >
>>
>




Re: [VOTE] Release Apache Hadoop 2.10.0 (RC0)

2019-10-27 Thread epa...@apache.org
 Ah! Yes! That makes sense. I will use the mapredonhdfs framework in my next 
set of tests.
The other compatibility tests that I ran worked as expected.
-Eric

On Saturday, October 26, 2019, 12:29:54 PM CDT, Jonathan Hung 
 wrote:  
 
Hi Eric, I took a quick look. Are you using 
mapreduce.application.framework.path to run your MR jobs? If not, this seems 
like expected behavior when the AM and tasks are launched on different NMs with 
different locally installed Hadoop versions.

Jonathan Hung
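
For readers unfamiliar with the setting mentioned above: mapreduce.application.framework.path points MapReduce jobs at a framework archive stored in a shared filesystem, so the AM and its tasks load the same MapReduce jars regardless of which Hadoop version is installed on each NM. The following is a minimal Python sketch that generates the matching mapred-site.xml entries. The archive URI, the "mrframework" alias, and the directory layout inside the archive are assumptions made for illustration; they do not come from this thread.

```python
import xml.etree.ElementTree as ET


def framework_properties(archive_uri, alias, version_dir):
    """Build (name, value) pairs for running MR from a shared framework archive.

    archive_uri, alias and version_dir are hypothetical placeholders; adjust
    them to wherever the framework tarball was actually uploaded.
    """
    # The classpath must point inside the localized archive, which is
    # exposed in the container working directory under the alias name.
    classpath = ",".join([
        "$HADOOP_CONF_DIR",
        f"$PWD/{alias}/{version_dir}/share/hadoop/mapreduce/*",
        f"$PWD/{alias}/{version_dir}/share/hadoop/mapreduce/lib/*",
    ])
    return [
        # The URL fragment after '#' names the alias for the localized archive.
        ("mapreduce.application.framework.path", f"{archive_uri}#{alias}"),
        ("mapreduce.application.classpath", classpath),
    ]


def to_configuration_xml(props):
    """Render the pairs as a Hadoop-style <configuration> document."""
    root = ET.Element("configuration")
    for name, value in props:
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    props = framework_properties(
        "hdfs:///apps/mapreduce/hadoop-2.10.0.tar.gz",  # hypothetical location
        "mrframework",
        "hadoop-2.10.0",
    )
    print(to_configuration_xml(props))
```

With a setup along these lines, a job submitted in a mixed-version cluster would be expected to use one consistent set of MapReduce jars for the AM and all tasks, which is the condition the TaskUmbilicalProtocol version check needs.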

On Sat, Oct 26, 2019 at 8:55 AM epa...@apache.org  wrote:

I ran a few compatibility tests between 2.10.0 and 3.3.0 (trunk)

Unfortunately, I ran into the following problem:

With a 2.10 RM and a 3.3.0 (trunk) NM, task attempts fail with the following 
error:

2019-10-26 15:44:06,885 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RPC$VersionMismatch): Protocol org.apache.hadoop.mapred.TaskUmbilicalProtocol version mismatch. (client = 19, server = 21)

The AM happened to launch on the 3.3.0 node.

Is this a protobuf issue? I thought we addressed that?

-Eric Payne




  

Re: [VOTE] Release Apache Hadoop 2.10.0 (RC0)

2019-10-28 Thread epa...@apache.org
Compatibility testing has gone well for me.

- In a 4-node cluster, I ran YARN rolling upgrade tests between 2.8.5 and 2.10.0
- In a 4-node cluster, I ran YARN rolling upgrade tests between 2.10.0 and trunk
- With one 4-node cluster running 2.10.0 and one 4-node cluster running trunk, 
I ran a word count job in each cluster whose inputs and outputs were from and 
to the opposite cluster.
- I verified that HDFS replication works as expected in a trunk cluster that 
has one 2.10.0 datanode.

Thanks,
-Eric





Re: [VOTE] Release Apache Hadoop 2.10.0 (RC0)

2019-10-29 Thread epa...@apache.org
Jonathan,

I actually did all my testing on RC1. Sorry for the confusion. I'll respond on 
the RC1 thread.

-Eric

 On Monday, October 28, 2019, 8:00:20 PM CDT, Jonathan Hung 
 wrote: 

Thanks Eric! I sent out an RC1 earlier last week, not sure if you saw that. The 
only diff between RC1 and RC0 is HDFS-14667. If RC1 looks good to you then it'd 
be great to get your testing results on that thread.

Jonathan Hung


On Mon, Oct 28, 2019 at 1:06 PM epa...@apache.org  wrote:
> Compatibility testing has gone well for me.
> 
> - In a 4-node cluster, I ran YARN rolling upgrade tests between 2.8.5 and 
> 2.10.0
> - In a 4-node cluster, I ran YARN rolling upgrade tests between 2.10.0 and 
> trunk
> - With one 4-node cluster running 2.10.0 and one 4-node cluster running 
> trunk, I ran a word count job in each cluster whose inputs and outputs were 
> from and to the opposite cluster.
> - I verified that HDFS replication works as expected in a trunk cluster that 
> has one 2.10.0 datanode.
> 
> Thanks,
> -Eric
> 




Re: [VOTE] Release Apache Hadoop 2.10.0 (RC1)

2019-10-29 Thread epa...@apache.org
Compatibility testing has gone well for me.

- In a 4-node cluster, I ran YARN rolling upgrade tests between 2.8.5 and 2.10.0
- In a 4-node cluster, I ran YARN rolling upgrade tests between 2.10.0 and trunk
- With one 4-node cluster running 2.10.0 and one 4-node cluster running trunk, 
I ran a word count job in each cluster whose inputs and outputs were from and 
to the opposite cluster.
- I verified that HDFS replication works as expected in a trunk cluster that 
has one 2.10.0 datanode.

Thanks,
-Eric


> On Tuesday, October 22, 2019, 4:55:29 PM CDT, Jonathan Hung 
>  wrote: 
> Hi folks,
> 
> This is the second release candidate for the first release of the Apache
> Hadoop 2.10 line. It contains 362 fixes/improvements since 2.9 [1]. It
> includes features such as:
>
> - User-defined resource types
> - Native GPU support as a schedulable resource type
> - Consistent reads from standby node
> - Namenode port based selective encryption
> - Improvements related to rolling upgrade support from 2.x to 3.x
> - Cost based fair call queue
> 
> The RC1 artifacts are at: http://home.apache.org/~jhung/hadoop-2.10.0-RC1/
> 
> RC tag is release-2.10.0-RC1.
> 
> The maven artifacts are hosted here:
> https://repository.apache.org/content/repositories/orgapachehadoop-1243/
> 
> My public key is available here:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> 
> The vote will run for 5 weekdays, until Tuesday, October 29 at 3:00 pm PDT.
> 
> Thanks,
> Jonathan Hung
 




Re: [DISCUSS] Making 2.10 the last minor 2.x release

2019-11-15 Thread epa...@apache.org
Thanks Jonathan for opening the discussion.

I am not in favor of this proposal. 2.10 was very recently released, and moving 
to 2.10 will take some time for the community. It seems premature to make a 
decision at this point that there will never be a need for a 2.11 release.

-Eric


 On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung 
 wrote: 

Hi folks,

Given the release of 2.10.0, and the fact that it's intended to be a bridge
release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
release line in branch-2. Currently, the main issue is that there are many
fixes going into branch-2 (the theoretical 2.11.0) that are not going into
branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
likely never see the light of day unless they are backported to branch-2.10.

To do this, I propose we:

  - Delete branch-2.10
  - Rename branch-2 to branch-2.10
  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT

This way we get all the current branch-2 fixes into the 2.10.x release
line. Then the commit chain will look like: trunk -> branch-3.2 ->
branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
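
The effect of the proposed rename on day-to-day backporting can be sketched as follows. This is a small Python illustration, assuming fixes land on trunk first and are then cherry-picked down the chain in order until the oldest branch that needs them. The chain "before" the rename (with branch-2 sitting between branch-3.1 and branch-2.10) is an assumption based on the description above, and the helper name is hypothetical.

```python
# Commit chain as proposed in this message (after branch-2 becomes branch-2.10).
CHAIN_AFTER = [
    "trunk", "branch-3.2", "branch-3.1",
    "branch-2.10", "branch-2.9", "branch-2.8",
]

# Assumed chain before the rename: branch-2 is an extra stop for every 2.x fix.
CHAIN_BEFORE = [
    "trunk", "branch-3.2", "branch-3.1",
    "branch-2", "branch-2.10", "branch-2.9", "branch-2.8",
]


def backport_targets(chain, oldest_branch):
    """Return every branch a fix must be committed to, newest first,
    so that it reaches oldest_branch without skipping a release line."""
    if oldest_branch not in chain:
        raise ValueError(f"unknown branch: {oldest_branch}")
    return chain[: chain.index(oldest_branch) + 1]


if __name__ == "__main__":
    print(backport_targets(CHAIN_BEFORE, "branch-2.10"))
    print(backport_targets(CHAIN_AFTER, "branch-2.10"))
```

Under these assumptions, a fix destined for the 2.10.x line needs one commit fewer after the rename, and there is no longer a branch whose fixes can be stranded outside any planned release.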

Thoughts?

Jonathan Hung

[1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html




Re: [DISCUSS] Making 2.10 the last minor 2.x release

2019-11-19 Thread epa...@apache.org
Hi Konstantin,

Sure, I understand those concerns. On the other hand, I worry about the
stability of 2.10, since we will be on it for a couple of years at least. I
worry that some committers may want to put new features into a branch 2 release,
and without a branch-2, they will go directly into 2.10. Since we don't always
catch corner cases or performance problems for some time (usually not until
the release is deployed to a busy, 4-thousand node cluster), it may be very
difficult to back out those changes.

It sounds like I'm in the minority here, so I'm not nixing the idea, but I do
have these reservations.

Thanks,
-Eric



On Tuesday, November 19, 2019, 1:04:15 AM CST, Konstantin Shvachko 
 wrote: 
Hi Eric,

We had a long discussion on this list regarding making the 2.10 release the
last of the branch-2 releases. We intended 2.10 as a bridge release between
Hadoop 2 and 3. We may have bug-fix releases of 2.10, but 2.11 is not in
the picture right now, and many people may object to this idea.

I understand Jonathan's proposal as an attempt to
1. eliminate confusion about which branches people should commit their
back-ports to
2. save engineering effort committing to more branches than necessary

"Branches are cheap" as our founder used to say. If we ever decide to
release 2.11 we can resurrect the branch.
Until then I am in favor of Jonathan's proposal +1.

Thanks,
--Konstantin


On Mon, Nov 18, 2019 at 10:41 AM Jonathan Hung  wrote:

> Thanks Eric for the comments - regarding your concerns, I feel the pros
> outweigh the cons. To me, the chances of patch releases on 2.10.x are much
> higher than a new 2.11 minor release. (There didn't seem to be many people
> outside of our company who expressed interest in getting new features to
> branch-2 prior to the 2.10.0 release.) Even now, a few weeks after the 2.10.0
> release, there are 29 patches that have gone into branch-2 and 9 in
> branch-2.10, so it's already diverged quite a bit.
>
> In any case, we can always reverse this decision if we really need to, by
> recreating branch-2. But this proposal would reduce a lot of confusion IMO.
>
> Jonathan Hung
>
>
> On Fri, Nov 15, 2019 at 11:41 AM epa...@apache.org 
> wrote:
>
> > Thanks Jonathan for opening the discussion.
> >
> > I am not in favor of this proposal. 2.10 was very recently released, and
> > moving to 2.10 will take some time for the community. It seems premature to
> > make a decision at this point that there will never be a need for a 2.11
> > release.
> >
> > -Eric
> >
> >
> >  On Thursday, November 14, 2019, 8:51:59 PM CST, Jonathan Hung <
> > jyhung2...@gmail.com> wrote:
> >
> > Hi folks,
> >
> > Given the release of 2.10.0, and the fact that it's intended to be a bridge
> > release to Hadoop 3.x [1], I'm proposing we make 2.10.x the last minor
> > release line in branch-2. Currently, the main issue is that there are many
> > fixes going into branch-2 (the theoretical 2.11.0) that are not going into
> > branch-2.10 (which will become 2.10.1), so the fixes in branch-2 will
> > likely never see the light of day unless they are backported to
> > branch-2.10.
> >
> > To do this, I propose we:
> >
> >  - Delete branch-2.10
> >  - Rename branch-2 to branch-2.10
> >  - Set version in the new branch-2.10 to 2.10.1-SNAPSHOT
> >
> > This way we get all the current branch-2 fixes into the 2.10.x release
> > line. Then the commit chain will look like: trunk -> branch-3.2 ->
> > branch-3.1 -> branch-2.10 -> branch-2.9 -> branch-2.8
> >
> > Thoughts?
> >
> > Jonathan Hung
> >
> > [1] https://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg29479.html
> >
>




[ANNOUNCE] Jim Brennan is a new Hadoop Committer

2020-08-03 Thread epa...@apache.org
I am pleased to announce that Jim Brennan has accepted the invitation to become 
a Hadoop committer focusing on the YARN space.

Please reach out to Jim and welcome him in his new role.

Congratulations, Jim! Well-deserved!

-Eric Payne




Re: [E] Re: the v2 commit algorithm

2020-09-24 Thread epa...@apache.org
Thanks Steve and Jim for bringing this issue to our attention.

IIUC, serial commit takes minutes with mrv1, whereas with mrv2 it is very 
quick. With this kind of performance difference, is it wise to change the 
default behavior for released versions of Hadoop? Should this be limited to 
trunk?

Thanks,
-Eric Payne


On Wednesday, September 23, 2020, 2:16:14 PM CDT, Jim Brennan 
 wrote: 

I replied in the Jira.  The speed up provided by the v2 commit algorithm
is very important to us at Verizon Media (Yahoo).  Please do not remove it.
I referred to this comment from Jason Lowe on the original Jira:
https://issues.apache.org/jira/browse/MAPREDUCE-4815?focusedCommentId=14271115&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14271115

I think it would be appropriate to better document the limitations of the
v2 algorithm and possibly make it not be the default, as long as we can
still use it.

On Wed, Sep 23, 2020 at 2:07 PM Igor Dvorzhak 
wrote:

> What will be the solution for object stores to have fast and correct
> commit algorithms?
>
> On Wed, Sep 23, 2020 at 11:42 AM Steve Loughran
>  wrote:
>
>> I've got a PR up to completely remove the v2 commit algorithm
>>
>> https://github.com/apache/hadoop/pull/2320
>>
>> That may seem overkill, but while *we* know there's a small window of risk
>> (task attempt 1 failing partway through a nonatomic commit), that's not
>> known/appreciated by others.
>>
>> The patch removes the v2 codepath from FileOutputCommitter, making it a
>> lot less complicated, and when v2 is requested, a warning is printed and
>> the option ignored.
>>
>> Overkill? Maybe. But it guarantees correctness
>>
>
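
The trade-off discussed in this thread can be illustrated with a toy model on a local filesystem. The Python sketch below is not FileOutputCommitter itself; the directory names and function names are hypothetical. It assumes the commonly described behavior: with v1, task commit is a single directory rename into a pending area and job commit then moves every file into the destination serially; with v2, task commit moves files straight into the destination, one file at a time, so a task attempt that dies partway through leaves partial output visible.

```python
import os
import shutil
import tempfile


def write_task_output(attempt_dir, files):
    """Create a task attempt's working directory with its output files."""
    os.makedirs(attempt_dir, exist_ok=True)
    for name, data in files.items():
        with open(os.path.join(attempt_dir, name), "w") as handle:
            handle.write(data)


def commit_task_v1(attempt_dir, job_tmp, task_id):
    """v1-style task commit: one directory rename into the job's pending area."""
    os.rename(attempt_dir, os.path.join(job_tmp, task_id))


def commit_job_v1(job_tmp, dest):
    """v1-style job commit: move every committed file into dest, serially.
    This loop is the part that grows with the number of tasks and files."""
    for task_id in sorted(os.listdir(job_tmp)):
        task_dir = os.path.join(job_tmp, task_id)
        for name in sorted(os.listdir(task_dir)):
            os.replace(os.path.join(task_dir, name), os.path.join(dest, name))
    shutil.rmtree(job_tmp)


def commit_task_v2(attempt_dir, dest, fail_after=None):
    """v2-style task commit: move files directly into dest, file by file.
    fail_after simulates the attempt dying after that many files moved."""
    moved = 0
    for name in sorted(os.listdir(attempt_dir)):
        if fail_after is not None and moved == fail_after:
            raise RuntimeError("task attempt died mid-commit")
        os.replace(os.path.join(attempt_dir, name), os.path.join(dest, name))
        moved += 1


def demo():
    """Return (v1 dest before job commit, v1 dest after, v2 dest after a failure)."""
    root = tempfile.mkdtemp()
    try:
        files = {"part-0-a": "x", "part-0-b": "y"}

        dest1 = os.path.join(root, "v1", "dest")
        job_tmp = os.path.join(root, "v1", "job_tmp")
        os.makedirs(dest1)
        os.makedirs(job_tmp)
        attempt1 = os.path.join(root, "v1", "attempts", "attempt_0")
        write_task_output(attempt1, files)
        commit_task_v1(attempt1, job_tmp, "task_0")
        before = sorted(os.listdir(dest1))
        commit_job_v1(job_tmp, dest1)
        after = sorted(os.listdir(dest1))

        dest2 = os.path.join(root, "v2", "dest")
        os.makedirs(dest2)
        attempt2 = os.path.join(root, "v2", "attempts", "attempt_0")
        write_task_output(attempt2, files)
        try:
            commit_task_v2(attempt2, dest2, fail_after=1)
        except RuntimeError:
            pass  # the attempt failed; inspect what it left behind
        partial = sorted(os.listdir(dest2))
        return before, after, partial
    finally:
        shutil.rmtree(root)


if __name__ == "__main__":
    print(demo())
```

In this model the v1 destination stays empty until job commit, while the failed v2 attempt leaves one of its two files in the destination. That is the "small window of risk" described above, and the serial loop in the v1 job commit is the cost that makes v2 attractive for large jobs.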




Re: [ANNOUNCE] Hui Fei is a new Apache Hadoop Committer

2020-09-24 Thread epa...@apache.org
Congratulations Hui Fei!

On Wednesday, September 23, 2020, 1:07:11 PM CDT, Wei-Chiu Chuang 
 wrote: 
I am pleased to announce that Hui Fei has accepted the invitation to become
a Hadoop committer.

He started contributing to the project in October 2016. Over the past 4
years he has contributed a lot in HDFS, especially in Erasure Coding,
Hadoop 3 upgrade, RBF and Standby Serving reads.

One of the biggest contributions is Hadoop 2->3 rolling upgrade support.
This was a major blocker for any existing Hadoop users to adopt Hadoop 3.
The adoption of Hadoop 3 has gone up after this. In the past the community
discussed a lot about Hadoop 3 rolling upgrade being a must-have, but no
one took the initiative to make it happen. I am personally very grateful
for this.

The work on EC is impressive as well. He managed to onboard EC in
production at scale, fixing tricky problems. Again, I am impressed and
grateful for the contribution in EC.

In addition to code contributions, he invested a lot in the community:

   - Apache Hadoop Community 2019 Beijing Meetup
   https://blogs.apache.org/hadoop/entry/hadoop-community-meetup-beijing-aug
   where he discussed the operational experience of RBF in production

   - Apache Hadoop Storage Community Sync Online
   https://docs.google.com/document/d/1jXM5Ujvf-zhcyw_5kiQVx6g-HeKe-YGnFS_1-qFXomI/edit#heading=h.irqxw1iy16zo
   where he discussed the Hadoop 3 rolling upgrade support

Let's congratulate Hui for this new role!

Cheers,
Wei-Chiu Chuang (on behalf of the Apache Hadoop PMC)




Re: [DISCUSS] check style changes

2021-05-15 Thread epa...@apache.org
I would be fine with a discussion and vote on relaxing some checkstyle 
restrictions.

Regarding line length, my personal preference is to leave it at 80, but 80 is 
arbitrary and I would not oppose 100 if that's what people want.

Another one that I think should be relaxed is the limit on number of arguments 
to a method. I understand that a ton of arguments makes a method messy, but I 
find it irritating when I add an argument to something that is already over the 
limit and I get penalized for it. The ones I have seen are all constructor 
methods.

-Eric







On Thursday, May 13, 2021, 10:10:27 AM CDT, Sean Busbey wrote:

Hi folks!

I’d like to start cleaning up our nightly tests. As a bit of low hanging fruit 
I’d like to alter some of our check style rules to match what I think we’ve 
been doing in the community. How would folks prefer I make sure we have 
consensus on such changes?

As an example, our last nightly run had ~81k check style violations (it’s a big 
number but it’s not that bad given the size of the repo) and roughly 16% of 
those were for line lengths in excess of 80 characters but <= 100 characters.

If I wanted to change our line length check to be 100 characters rather than 
the default of 80, would folks rather I have a DISCUSS thread first? Or would 
they rather a Jira + PR with the discussion of the merits happening there?

—
busbey






Re: [VOTE] Hadoop 3.1.x EOL

2021-06-07 Thread epa...@apache.org
+1 (binding)
-Eric


On Thursday, June 3, 2021, 1:14:51 AM CDT, Akira Ajisaka wrote:

Dear Hadoop developers,

Given the feedback from the discussion thread [1], I'd like to start an
official vote thread for the community to vote and start the 3.1 EOL process.

What this entails:

(1) an official announcement that no further regular Hadoop 3.1.x releases
will be made after 3.1.4.
(2) resolve JIRAs that specifically target 3.1.5 as won't fix.

This vote will run for 7 days and conclude by June 10th, 16:00 JST [2].

Committers are eligible to cast binding votes. Non-committers are welcome
to cast non-binding votes.

Here is my vote, +1

[1] https://s.apache.org/w9ilb
[2] 
https://www.timeanddate.com/worldclock/fixedtime.html?msg=4&iso=20210610T16&p1=248

Regards,
Akira


  

Re: [VOTE] Release Apache Hadoop 3.3.1 RC3

2021-06-11 Thread epa...@apache.org
+1 (binding)
Eric


On Tuesday, June 1, 2021, 5:29:49 AM CDT, Wei-Chiu Chuang wrote:

Hi community,

This is release candidate RC3 of the Apache Hadoop 3.3.1 line. All blocker
issues have been resolved [1] again.

There are 2 additional issues resolved for RC3:
* Revert "MAPREDUCE-7303. Fix TestJobResourceUploader failures after
HADOOP-16878"
* Revert "HADOOP-16878. FileUtil.copy() to throw IOException if the source
and destination are the same"

There are 4 issues resolved for RC2:
* HADOOP-17666. Update LICENSE for 3.3.1
* MAPREDUCE-7348. TestFrameworkUploader#testNativeIO fails. (#3053)
* Revert "HADOOP-17563. Update Bouncy Castle to 1.68. (#2740)" (#3055)
* HADOOP-17739. Use hadoop-thirdparty 1.1.1. (#3064)

The Hadoop-thirdparty 1.1.1, as previously mentioned, contains two extra
fixes compared to hadoop-thirdparty 1.1.0:
* HADOOP-17707. Remove jaeger document from site index.
* HADOOP-17730. Add back error_prone

*RC tag is release-3.3.1-RC3*
https://github.com/apache/hadoop/releases/tag/release-3.3.1-RC3

*The RC3 artifacts are at*:
https://home.apache.org/~weichiu/hadoop-3.3.1-RC3/
ARM artifacts: https://home.apache.org/~weichiu/hadoop-3.3.1-RC3-arm/

*The maven artifacts are hosted here:*
https://repository.apache.org/content/repositories/orgapachehadoop-1320/

*My public key is available here:*
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS


Things I've verified:
* all blocker issues targeting 3.3.1 have been resolved.
* stable/evolving API changes between 3.3.0 and 3.3.1 are compatible.
* LICENSE and NOTICE files checked
* RELEASENOTES and CHANGELOG
* rat check passed.
* Built HBase master branch on top of Hadoop 3.3.1 RC2, ran unit tests.
* Built Ozone master on top of Hadoop 3.3.1 RC2, ran unit tests.
* Extra: built 50 other open source projects on top of Hadoop 3.3.1 RC2.
Had to patch some of them due to commons-lang migration (Hadoop 3.2.0) and
dependency divergence. Issues are being identified, but so far none is a
blocker for Hadoop itself.

Please try the release and vote. The vote will run for 5 days.

My +1 to start,

[1] https://issues.apache.org/jira/issues/?filter=12350491
[2]
https://github.com/apache/hadoop/compare/release-3.3.1-RC1...release-3.3.1-RC3