I was waiting for Daniel to post the minutes from the YARN meetup to talk about
this. Anyway, in that discussion, we identified a bunch of key upgrade-related
scenarios that no one seems to have validated - at least going by the
representation at the YARN meetup. I'm going to create a wiki page listing all
these scenarios.

But back to the bug that Junping raised. At this point, we don't have a clear 
path towards running 2.x applications on 3.0.0 clusters. So, our claim of 
rolling-upgrades already working is not accurate.

One of the two options that Junping proposed should be pursued before we close
the release. I'm in favor of calling out that rolling-upgrade support is
withdrawn or caveated, and pushing for progress instead of blocking the release.

Thanks
+Vinod

> On Dec 12, 2017, at 5:44 PM, Junping Du <j...@hortonworks.com> wrote:
> 
> Thanks Andrew for pushing the new RC for 3.0.0. I was out last week and just
> got a chance to validate the new RC now.
> 
> Basically, I found two critical issues in the same rolling upgrade scenario
> where HADOOP-15059 was found previously:
> HDFS-12920: we changed the value format for some HDFS configurations, which
> the old version MR client doesn't understand when fetching these
> configurations. A quick workaround is to add the old values (without time
> unit) in hdfs-site.xml to override the new default values, but this will
> generate many annoying warnings. I have already provided my fix suggestions on
> the JIRA for more discussion.
> The other one is YARN-7646. After we work around HDFS-12920, we hit the
> issue that the old version MR AppMaster cannot communicate with the new
> version of the YARN RM - this could be related to resource profile changes on
> the YARN side, but the root cause is still under investigation.
> 
> The first issue may not be a blocker given we can work around it
> without a code change. I am not sure yet if we can work around the 2nd issue.
> If not, we may have to fix this or compromise by withdrawing support of
> rolling upgrade or calling it a stable release.
> 
> 
> Thanks,
> 
> Junping
> 
> ________________________________________
> From: Robert Kanter <rkan...@cloudera.com>
> Sent: Tuesday, December 12, 2017 3:10 PM
> To: Arun Suresh
> Cc: Andrew Wang; Lei Xu; Wei-Chiu Chuang; Ajay Kumar; Xiao Chen; Aaron T. 
> Myers; common-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; 
> yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org
> Subject: Re: [VOTE] Release Apache Hadoop 3.0.0 RC1
> 
> +1 (binding)
> 
> + Downloaded the binary release
> + Deployed on a 3 node cluster on CentOS 7.3
> + Ran some MR jobs, clicked around the UI, etc
> + Ran some CLI commands (yarn logs, etc)
> 
> Good job everyone on Hadoop 3!
> 
> 
> - Robert
> 
> On Tue, Dec 12, 2017 at 1:56 PM, Arun Suresh <asur...@apache.org> wrote:
> 
>> +1 (binding)
>> 
>> - Verified signatures of the source tarball.
>> - built from source - using the docker build environment.
>> - set up a pseudo-distributed test cluster.
>> - ran basic HDFS commands
>> - ran some basic MR jobs
>> 
>> Cheers
>> -Arun
>> 
>> On Tue, Dec 12, 2017 at 1:52 PM, Andrew Wang <andrew.w...@cloudera.com>
>> wrote:
>> 
>>> Hi everyone,
>>> 
>>> As a reminder, this vote closes tomorrow at 12:31pm, so please give it a
>>> whack if you have time. There are already enough binding +1s to pass this
>>> vote, but it'd be great to get additional validation.
>>> 
>>> Thanks to everyone who's voted thus far!
>>> 
>>> Best,
>>> Andrew
>>> 
>>> 
>>> 
>>> On Tue, Dec 12, 2017 at 11:08 AM, Lei Xu <l...@cloudera.com> wrote:
>>> 
>>>> +1 (binding)
>>>> 
>>>> * Verified src tarball and bin tarball, verified md5 of each.
>>>> * Build source with -Pdist,native
>>>> * Started a pseudo cluster
>>>> * Run ec -listPolicies / -getPolicy / -setPolicy on /  , and run hdfs
>>>> dfs put/get/cat on "/" with XOR-2-1 policy.
>>>> 
>>>> Thanks Andrew for this great effort!
>>>> 
>>>> Best,
>>>> 
>>>> 
>>>> On Tue, Dec 12, 2017 at 9:55 AM, Andrew Wang <andrew.w...@cloudera.com>
>>>> wrote:
>>>>> Hi Wei-Chiu,
>>>>> 
>>>>> The patchprocess directory is left over from the create-release process,
>>>>> and it looks empty to me. We should still file a create-release JIRA to fix
>>>>> this, but I think this is not a blocker. Would you agree?
>>>>> 
>>>>> Best,
>>>>> Andrew
>>>>> 
>>>>> On Tue, Dec 12, 2017 at 9:44 AM, Wei-Chiu Chuang <weic...@cloudera.com>
>>>>> wrote:
>>>>> 
>>>>>> Hi Andrew, thanks for the tremendous effort.
>>>>>> I found an empty "patchprocess" directory in the source tarball that is
>>>>>> not there if you clone from github. Any chance you might have some leftover
>>>>>> trash when you made the tarball?
>>>>>> Not wanting to nitpick, but you might want to double check so we don't
>>>>>> ship anything private to you in public :)
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Tue, Dec 12, 2017 at 7:48 AM, Ajay Kumar <ajay.ku...@hortonworks.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> +1 (non-binding)
>>>>>>> Thanks for driving this, Andrew Wang!!
>>>>>>> 
>>>>>>> - downloaded the src tarball and verified md5 checksum
>>>>>>> - built from source with jdk 1.8.0_111-b14
>>>>>>> - brought up a pseudo distributed cluster
>>>>>>> - did basic file system operations (mkdir, list, put, cat) and
>>>>>>> confirmed that everything was working
>>>>>>> - Run word count, pi and DFSIOTest
>>>>>>> - run hdfs and yarn, confirmed that the NN, RM web UI worked
>>>>>>> 
>>>>>>> Cheers,
>>>>>>> Ajay
>>>>>>> 
>>>>>>> On 12/11/17, 9:35 PM, "Xiao Chen" <x...@cloudera.com> wrote:
>>>>>>> 
>>>>>>>    +1 (binding)
>>>>>>> 
>>>>>>>    - downloaded src tarball, verified md5
>>>>>>>    - built from source with jdk1.8.0_112
>>>>>>>    - started a pseudo cluster with hdfs and kms
>>>>>>>    - sanity checked encryption related operations working
>>>>>>>    - sanity checked webui and logs.
>>>>>>> 
>>>>>>>    -Xiao
>>>>>>> 
>>>>>>>    On Mon, Dec 11, 2017 at 6:10 PM, Aaron T. Myers <a...@apache.org>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> +1 (binding)
>>>>>>>> 
>>>>>>>> - downloaded the src tarball and built the source (-Pdist -Pnative)
>>>>>>>> - verified the checksum
>>>>>>>> - brought up a secure pseudo distributed cluster
>>>>>>>> - did some basic file system operations (mkdir, list, put, cat) and
>>>>>>>> confirmed that everything was working
>>>>>>>> - confirmed that the web UI worked
>>>>>>>> 
>>>>>>>> Best,
>>>>>>>> Aaron
>>>>>>>> 
>>>>>>>> On Fri, Dec 8, 2017 at 12:31 PM, Andrew Wang <andrew.w...@cloudera.com>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Hi all,
>>>>>>>>> 
>>>>>>>>> Let me start, as always, by thanking the efforts of all the contributors
>>>>>>>>> who contributed to this release, especially those who jumped on the issues
>>>>>>>>> found in RC0.
>>>>>>>>> 
>>>>>>>>> I've prepared RC1 for Apache Hadoop 3.0.0. This release incorporates 302
>>>>>>>>> fixed JIRAs since the previous 3.0.0-beta1 release.
>>>>>>>>> 
>>>>>>>>> You can find the artifacts here:
>>>>>>>>> 
>>>>>>>>> http://home.apache.org/~wang/3.0.0-RC1/
>>>>>>>>> 
>>>>>>>>> I've done the traditional testing of building from the source tarball and
>>>>>>>>> running a Pi job on a single node cluster. I also verified that the shaded
>>>>>>>>> jars are not empty.
>>>>>>>>> 
>>>>>>>>> Found one issue that create-release (probably due to the mvn deploy change)
>>>>>>>>> didn't sign the artifacts, but I fixed that by calling mvn one more time.
>>>>>>>>> Available here:
>>>>>>>>> 
>>>>>>>>> https://repository.apache.org/content/repositories/orgapachehadoop-1075/
>>>>>>>>> 
>>>>>>>>> This release will run the standard 5 days, closing on Dec 13th at 12:31pm
>>>>>>>>> Pacific. My +1 to start.
>>>>>>>>> 
>>>>>>>>> Best,
>>>>>>>>> Andrew
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>>>>>>> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Lei (Eddy) Xu
>>>> Software Engineer, Cloudera
>>>> 
>>> 
>> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
