Re: [VOTE] Release Apache Hadoop 2.9.0 (RC0)

2017-11-06 Thread Rohith Sharma K S
Thanks Subru/Arun for the great work!

Downloaded source and built from it. Deployed RM HA non-secured cluster
along with new YARN UI and ATSv2.

I am facing a basic RM HA switch issue after the first successful start. *Is
anyone else facing this issue?*

When the RM is switched from ACTIVE to STANDBY to ACTIVE, it never switches
to active successfully. The exception trace I see in the log is

2017-11-07 12:35:56,540 WARN org.apache.hadoop.ha.ActiveStandbyElector: Exception handling the winning of election
org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:146)
at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:894)
at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:473)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
Caused by: org.apache.hadoop.ha.ServiceFailedException: Error when transitioning to Active mode
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:325)
at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144)
... 4 more
Caused by: org.apache.hadoop.service.ServiceStateException: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth
at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:205)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1131)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1171)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1167)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1886)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1167)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:320)
... 5 more
Caused by: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth
at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:949)
at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:915)
at org.apache.curator.framework.imps.CuratorTransactionImpl.doOperation(CuratorTransactionImpl.java:159)
at org.apache.curator.framework.imps.CuratorTransactionImpl.access$200(CuratorTransactionImpl.java:44)
at org.apache.curator.framework.imps.CuratorTransactionImpl$2.call(CuratorTransactionImpl.java:129)
at org.apache.curator.framework.imps.CuratorTransactionImpl$2.call(CuratorTransactionImpl.java:125)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
at org.apache.curator.framework.imps.CuratorTransactionImpl.commit(CuratorTransactionImpl.java:122)
at org.apache.hadoop.util.curator.ZKCuratorManager$SafeTransaction.commit(ZKCuratorManager.java:403)
at org.apache.hadoop.util.curator.ZKCuratorManager.safeSetData(ZKCuratorManager.java:372)
at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.getAndIncrementEpoch(ZKRMStateStore.java:493)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:754)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
... 13 more
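
Reading the trace from the bottom up, the innermost cause is the ZooKeeper
NoAuth error raised while ZKRMStateStore.getAndIncrementEpoch commits a write.
As a minimal sketch (not part of Hadoop; the helper name cause_chain and the
abbreviated sample text are illustrative assumptions), the "Caused by" chain
can be pulled out of a saved log like this:

```python
import re

def cause_chain(trace):
    """Return exception class names from outermost to innermost cause."""
    # Collapse whitespace so wrapped lines do not split a "Caused by:" marker.
    flat = " ".join(trace.split())
    # Capture the class name that follows each "Caused by:" marker.
    return re.findall(r"Caused by: ([\w.$]+)", flat)

# Abbreviated copy of the trace above (stack frames omitted).
sample = """
org.apache.hadoop.ha.ServiceFailedException: RM could not transition to
Active
Caused by: org.apache.hadoop.ha.ServiceFailedException: Error when
transitioning to Active mode
Caused by: org.apache.hadoop.service.ServiceStateException:
org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode =
NoAuth
Caused by: org.apache.zookeeper.KeeperException$NoAuthException:
KeeperErrorCode = NoAuth
"""

chain = cause_chain(sample)
root_cause = chain[-1]  # the last "Caused by" is the innermost failure
print(root_cause)
```

The last entry is the one worth chasing first; here it points at ZooKeeper
permissions on the state store path rather than at the elector itself.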

Thanks & Regards
Rohith Sharma K S

On 4 November 2017 at 04:20, Arun Suresh  wrote:

> Hi folks,
>
>  Apache Hadoop 2.9.0 is the first stable release of the Hadoop 2.9 line and
> will be the latest stable/production release for Apache Hadoop - it
> includes 30 new features with 500+ subtasks, 407 improvements and 787 bug
> fixes since 2.8.2.
>
>   More information about the 2.9.0 release plan can be found here:
> https://cwiki.apache.org/confluence/display/HADOOP/Roadmap#Roadmap-Version2.9
>
>   New RC is available at:
> http://home.apache.org/~asuresh/hadoop-2.9.0-RC0/
>
>   The RC tag in git is: release-2.9.0-RC0, and the latest commit id is:
> 6697f0c18b12f1bdb99cbdf81394091f4fef1f0a
>
>   The maven artifacts are available via repository.apache.org at:
> *https://repository.apache.org/content/repositories/orgapachehadoop-1065/
> 

RE: [DISCUSS] A final minor release off branch-2?

2017-11-06 Thread Zheng, Kai
Thanks Vinod.

>> Off the top of my head, one of the biggest areas is application 
>> compatibility. When folks move from 2.x to 3.x, are their apps binary 
>> compatible? Source compatible? Or do they need changes?
I think these are good concerns from an overall perspective. On the other hand, 
I've talked with quite a few potential 3.0 users, and most of them seem to be 
interested in the erasure coding feature; a major scenario for them is backing 
up large volumes of data to save storage cost. They might run analytics 
workloads using Hive, Spark, Impala and Kylin on the new cluster based on that 
version, but it's not a must at first. They understand there might be some 
gaps, so they'd migrate their workloads incrementally. For the major analytics 
workloads, we've performed lots of benchmark and integration tests, as I 
believe others have too; we did find some issues, but they should be fixed in 
downstream projects. I think the GA release will accelerate the progress and 
expose issues, if any. We can't wait for it to fully mature first. Nothing is 
perfect.

>> The main goal of the bridging release is to ease transition on stuff that is 
>> guaranteed to be broken.
This sounds like a good consideration. I'm wondering: if I'm a Hadoop user on, 
for example, 2.7.4 or 2.8.2 or whatever 2.x version, would I first upgrade to 
this bridging release and then use the bridge support to upgrade to a 3.x 
version? I'm not sure. On the other hand, I might tend to look for guides or 
support in the 3.x docs on how to upgrade from 2.7 to 3.x. 
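
For reference, the staged path in question, as assumed earlier in this thread 
(2.x to the bridging release, then to 3.x, with rollback from 3.x stopping at 
the bridging release), can be written down as a tiny transition table. This is 
only an illustration of that assumption, not an agreed policy; the labels 
"2.x", "bridging" and "3.x" and the helper is_allowed are placeholders.

```python
# Allowed moves under the assumption discussed in this thread (hypothetical model).
ALLOWED = {
    ("2.x", "bridging"),   # upgrade onto the bridging release first
    ("bridging", "3.x"),   # then upgrade to 3.x
    ("3.x", "bridging"),   # rollback from 3.x stops at the bridging release
}

def is_allowed(path):
    """True if every consecutive step in the path is an allowed move."""
    return all(step in ALLOWED for step in zip(path, path[1:]))

staged = is_allowed(["2.x", "bridging", "3.x"])   # staged upgrade
direct = is_allowed(["2.x", "3.x"])               # direct jump, skipping the bridge
deep_rollback = is_allowed(["3.x", "2.x"])        # rollback past the bridge
print(staged, direct, deep_rollback)
```

Under this model only the staged path passes, which is exactly the extra hop a 
2.7 or 2.8 user would have to accept.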

Frankly speaking, working on a bridging release that doesn't target any 
feature isn't so attractive to me as a contributor. Overall, a final minor 
release off branch-2 is good, but we should also give 3.x more time to evolve 
and mature, so it looks to me like we would have to work on two release lines 
in parallel for some time. I'd like option C), and suggest we focus on the 
recent releases.

Just some thoughts.

Regards,
Kai

-----Original Message-----
From: Vinod Kumar Vavilapalli [mailto:vino...@apache.org] 
Sent: Tuesday, November 07, 2017 9:43 AM
To: Andrew Wang 
Cc: Arun Suresh ; common-...@hadoop.apache.org; 
yarn-...@hadoop.apache.org; Hdfs-dev ; 
mapreduce-dev@hadoop.apache.org
Subject: Re: [DISCUSS] A final minor release off branch-2?

The main goal of the bridging release is to ease transition on stuff that is 
guaranteed to be broken.

Off the top of my head, one of the biggest areas is application compatibility. 
When folks move from 2.x to 3.x, are their apps binary compatible? Source 
compatible? Or do they need changes?

In the 1.x -> 2.x upgrade, we did a bunch of work to at least make old apps 
source compatible. This means re-examining API compatibility in 3.x and its 
impact on migrating applications. We will have to revisit and un-deprecate 
old APIs, un-delete old APIs and write documentation on how apps can be 
migrated.

Most of this work will be in the 3.x line. The bridging release, on the other 
hand, will have deprecations for APIs that cannot be undeleted. This may 
already have been done in many places, but we need to make sure and fill gaps, 
if any.

Other areas that I can recall from the old days
 - Config migration: Many configs are deprecated or deleted. We need 
documentation to help folks move. We also need deprecations in the bridging 
release for configs that cannot be undeleted.
 - You mentioned rolling-upgrades: It will be good to outline exactly the type 
of testing. For example, the rolling-upgrade orchestration order has direct 
implications for the testing done.
 - Story for downgrades?
 - Copying data between 2.x clusters and 3.x clusters: Does this work already? 
Is it broken anywhere that we cannot fix? Do we need bridging features for this 
to work?

+Vinod

> On Nov 6, 2017, at 12:49 PM, Andrew Wang  wrote:
> 
> What are the known gaps that need bridging between 2.x and 3.x?
> 
> From an HDFS perspective, we've tested wire compat, rolling upgrade, 
> and rollback.
> 
> From a YARN perspective, we've tested wire compat and rolling upgrade. 
> Arun just mentioned an NM rollback issue that I'm not familiar with.
> 
> Anything else? External to this discussion, these should be documented 
> as known issues for 3.0.
> 
> Best.
> Andrew
> 
> On Sun, Nov 5, 2017 at 1:46 PM, Arun Suresh  wrote:
> 
>> Thanks for starting this discussion Vinod.
>> 
>> I agree (C) is a bad idea.
>> I would prefer (A) given that ATM, branch-2 is still very close to
>> branch-2.9 - and it is a good time to make a collective decision to 
>> lock down commits to branch-2.
>> 
>> I think we should also clearly define what the 'bridging' release 
>> should be.
>> I assume it means the following:
>> * Any 2.x user wanting to move to 3.x must first upgrade to the 
>> bridging release and then upgrade to the 3.x release.
>> * With 

Re: [DISCUSS] A final minor release off branch-2?

2017-11-06 Thread Vinod Kumar Vavilapalli
The main goal of the bridging release is to ease transition on stuff that is 
guaranteed to be broken.

Off the top of my head, one of the biggest areas is application compatibility. 
When folks move from 2.x to 3.x, are their apps binary compatible? Source 
compatible? Or do they need changes?

In the 1.x -> 2.x upgrade, we did a bunch of work to at least make old apps 
source compatible. This means re-examining API compatibility in 3.x and its 
impact on migrating applications. We will have to revisit and un-deprecate 
old APIs, un-delete old APIs and write documentation on how apps can be 
migrated.

Most of this work will be in the 3.x line. The bridging release, on the other 
hand, will have deprecations for APIs that cannot be undeleted. This may 
already have been done in many places, but we need to make sure and fill gaps, 
if any.

Other areas that I can recall from the old days
 - Config migration: Many configs are deprecated or deleted. We need 
documentation to help folks move. We also need deprecations in the bridging 
release for configs that cannot be undeleted.
 - You mentioned rolling-upgrades: It will be good to outline exactly the type 
of testing. For example, the rolling-upgrade orchestration order has direct 
implications for the testing done.
 - Story for downgrades?
 - Copying data between 2.x clusters and 3.x clusters: Does this work already? 
Is it broken anywhere that we cannot fix? Do we need bridging features for this 
to work?
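
To make the config-migration item concrete, here is a rough standalone sketch 
of the kind of mapping the documentation would need to spell out. Hadoop's own 
Configuration class already handles deprecated keys internally; this is only an 
illustration outside Hadoop. The three key pairs are examples from the 
published deprecated-properties list, and the helper name migrate is a 
placeholder.

```python
# Example old -> new key pairs (a small sample, not a complete list).
RENAMED = {
    "fs.default.name": "fs.defaultFS",
    "dfs.name.dir": "dfs.namenode.name.dir",
    "mapred.job.name": "mapreduce.job.name",
}

def migrate(conf):
    """Return (new_conf, notes) with deprecated keys rewritten to their replacements."""
    new_conf, notes = {}, []
    for key, value in conf.items():
        target = RENAMED.get(key, key)
        if target != key:
            notes.append(f"{key} is deprecated; use {target}")
            # A value already set under the new-style key wins over the old one.
            if target in new_conf:
                continue
        new_conf[target] = value
    return new_conf, notes

old = {"fs.default.name": "hdfs://nn:8020", "dfs.replication": "3"}
new, notes = migrate(old)
print(new)
print(notes)
```

Keys that were deleted outright, with no replacement, cannot be handled by a 
table like this; those are the ones that would need a deprecation warning in 
the bridging release.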

+Vinod

> On Nov 6, 2017, at 12:49 PM, Andrew Wang  wrote:
> 
> What are the known gaps that need bridging between 2.x and 3.x?
> 
> From an HDFS perspective, we've tested wire compat, rolling upgrade, and
> rollback.
> 
> From a YARN perspective, we've tested wire compat and rolling upgrade. Arun
> just mentioned an NM rollback issue that I'm not familiar with.
> 
> Anything else? External to this discussion, these should be documented as
> known issues for 3.0.
> 
> Best.
> Andrew
> 
> On Sun, Nov 5, 2017 at 1:46 PM, Arun Suresh  wrote:
> 
>> Thanks for starting this discussion Vinod.
>> 
>> I agree (C) is a bad idea.
>> I would prefer (A) given that ATM, branch-2 is still very close to
>> branch-2.9 - and it is a good time to make a collective decision to lock
>> down commits to branch-2.
>> 
>> I think we should also clearly define what the 'bridging' release should
>> be.
>> I assume it means the following:
>> * Any 2.x user wanting to move to 3.x must first upgrade to the bridging
>> release and then upgrade to the 3.x release.
>> * With regard to state store upgrades (at least NM state stores) the
>> bridging state stores should be aware of all new 3.x keys so the implicit
>> assumption would be that a user can only rollback from the 3.x release to
>> the bridging release and not to the old 2.x release.
>> * Use the opportunity to clean up deprecated API ?
>> * Do we even want to consider a separate bridging release for 2.7, 2.8 and
>> 2.9 lines ?
>> 
>> Cheers
>> -Arun
>> 
>> On Fri, Nov 3, 2017 at 5:07 PM, Vinod Kumar Vavilapalli <
>> vino...@apache.org>
>> wrote:
>> 
>>> Hi all,
>>> 
>>> With 3.0.0 GA around the corner (tx for the push, Andrew!), 2.9.0 RC out
>>> (tx Arun / Subru!) and 2.8.2 (tx Junping!), I think it's high time we have
>>> a discussion on how we manage our developmental bandwidth between 2.x line
>>> and 3.x lines.
>>> 
>>> Once 3.0 GA goes out, we will have two parallel and major release lines.
>>> The last time we were in this situation was back when we did 1.x -> 2.x
>>> jump.
>>> 
>>> The parallel releases imply overhead of decisions, branch-merges and
>>> back-ports. Right now we already do backports for 2.7.5, 2.8.2, 2.9.1,
>>> 3.0.1 and potentially a 3.1.0 in a few months after 3.0.0 GA. And many of
>>> these lines - for e.g 2.8, 2.9 - are going to be used for a while at a
>>> bunch of large sites! At the same time, our users won't migrate to 3.0 GA
>>> overnight - so we do have to support two parallel lines.
>>> 
>>> I propose we start thinking of the fate of branch-2. The idea is to have
>>> one final release that helps our users migrate from 2.x to 3.x. This
>>> includes any changes on the older line to bridge compatibility issues,
>>> upgrade issues, layout changes, tooling etc.
>>> 
>>> We have a few options I think
>>> (A)
>>>-- Make 2.9.x the last minor release off branch-2
>>>-- Have a maintenance release that bridges 2.9 to 3.x
>>>-- Continue to make more maintenance releases on 2.8 and 2.9 as
>>> necessary
>>>-- All new features obviously only go into the 3.x line as no features
>>> can go into the maint line.
>>> 
>>> (B)
>>>-- Create a new 2.10 release which doesn't have any new features, but
>>> as a bridging release
>>>-- Continue to make more maintenance releases on 2.8, 2.9 and 2.10 as
>>> necessary
>>>-- All new features, other than the bridging changes, go into the 3.x
>>> line
>>> 
>>> (C)
>>>-- Continue making branch-2 

Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2017-11-06 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/582/

[Nov 6, 2017 7:28:38 AM] (naganarasimha_gr) MAPREDUCE-6975. Logging task 
counters. Contributed by Prabhu Joseph.




-1 overall


The following subsystems voted -1:
asflicense unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

Unreaped Processes :

   hadoop-mapreduce-client-jobclient:1 

Failed junit tests :

   hadoop.hdfs.TestReadStripedFileWithMissingBlocks 
   hadoop.mapreduce.v2.TestMRJobs 
   hadoop.fs.TestDFSIO 
   hadoop.mapreduce.v2.TestMiniMRProxyUser 
   hadoop.mapreduce.v2.TestSpeculativeExecution 
   hadoop.mapreduce.lib.join.TestJoinDatamerge 
   hadoop.mapred.lib.TestDelegatingInputFormat 
   hadoop.mapreduce.lib.input.TestDelegatingInputFormat 
   hadoop.mapreduce.security.ssl.TestEncryptedShuffle 
   hadoop.mapred.join.TestDatamerge 
   hadoop.mapreduce.v2.TestUberAM 
   hadoop.mapreduce.lib.join.TestJoinProperties 
   hadoop.mapreduce.security.TestBinaryTokenFile 
   hadoop.streaming.TestMultipleArchiveFiles 
   hadoop.streaming.TestMultipleCachefiles 
   hadoop.streaming.TestSymLink 
   hadoop.contrib.utils.join.TestDataJoin 
   hadoop.yarn.sls.nodemanager.TestNMSimulator 

Timed out junit tests :

   org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA 
   org.apache.hadoop.yarn.server.resourcemanager.TestReservationSystemWithRMHA 
   org.apache.hadoop.yarn.server.resourcemanager.TestKillApplicationWithRMHA 
   org.apache.hadoop.mapred.pipes.TestPipeApplication 


   cc:

   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/582/artifact/out/diff-compile-cc-root.txt  [4.0K]

   javac:

   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/582/artifact/out/diff-compile-javac-root.txt  [280K]

   checkstyle:

   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/582/artifact/out/diff-checkstyle-root.txt  [17M]

   pylint:

   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/582/artifact/out/diff-patch-pylint.txt  [20K]

   shellcheck:

   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/582/artifact/out/diff-patch-shellcheck.txt  [20K]

   shelldocs:

   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/582/artifact/out/diff-patch-shelldocs.txt  [12K]

   whitespace:

   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/582/artifact/out/whitespace-eol.txt  [8.8M]
   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/582/artifact/out/whitespace-tabs.txt  [288K]

   javadoc:

   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/582/artifact/out/diff-javadoc-javadoc-root.txt  [760K]

   UnreapedProcessesLog:

   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/582/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient-reaper.txt  [4.0K]

   unit:

   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/582/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt  [232K]
   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/582/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt  [64K]
   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/582/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt  [212K]
   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/582/artifact/out/patch-unit-hadoop-tools_hadoop-streaming.txt  [32K]
   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/582/artifact/out/patch-unit-hadoop-tools_hadoop-datajoin.txt  [780K]
   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/582/artifact/out/patch-unit-hadoop-tools_hadoop-sls.txt  [16K]

   asflicense:

   https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/582/artifact/out/patch-asflicense-problems.txt  [4.0K]

Powered by Apache Yetus 0.7.0-SNAPSHOT   http://yetus.apache.org

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org

Re: [VOTE] Merge yarn-native-services branch into trunk

2017-11-06 Thread Vinod Kumar Vavilapalli
Congratulations to all the contributors involved, this is a great step forward!

+Vinod

> On Nov 6, 2017, at 2:40 PM, Jian He  wrote:
> 
> Okay, I just merged the branch to trunk (108 commits in total!)
> Again, thanks to all who contributed to this feature!
> 
> Jian
> 
> On Nov 6, 2017, at 1:26 PM, Jian He wrote:
> 
> Here’s my +1.
> The vote passes with 7 binding +1s and 2 non-binding +1s.
> 
> Thanks to all who voted. I’ll merge to trunk by the end of today.
> 
> Jian
> 
> On Nov 6, 2017, at 8:38 AM, Billie Rinaldi wrote:
> 
> +1 (binding)
> 
> On Mon, Oct 30, 2017 at 1:06 PM, Jian He wrote:
> Hi All,
> 
> I would like to restart the vote for merging yarn-native-services to trunk.
> Since the last vote, we have been working on several issues in documentation, 
> DNS, CLI modifications etc. We believe the feature is now in much better 
> shape.
> 
> Some background:
> At a high level, the following are the key features implemented.
> - YARN-5079[1]. A native YARN framework (ApplicationMaster) to orchestrate 
> existing services on YARN, either docker or non-docker based.
> - YARN-4793[2]. A REST API service embedded in the RM (optional) for users to 
> deploy a service via a simple JSON spec
> - YARN-4757[3]. Extending today's service registry with a simple DNS service 
> to enable users to discover services deployed on YARN via standard DNS lookup
> - YARN-6419[4]. UI support for native-services on the new YARN UI
> All these new services are optional and sit outside of the existing 
> system, and have no impact on the existing system if disabled.
> 
> Special thanks to the team of folks who worked hard towards this: Billie 
> Rinaldi, Gour Saha, Vinod Kumar Vavilapalli, Jonathan Maron, Rohith Sharma K 
> S, Sunil G, Akhil PB, Eric Yang. This effort would not have been possible 
> without their ideas and hard work.
> Thanks also to Allen for some review and verification.
> 
> Thanks,
> Jian
> 
> [1] https://issues.apache.org/jira/browse/YARN-5079
> [2] https://issues.apache.org/jira/browse/YARN-4793
> [3] https://issues.apache.org/jira/browse/YARN-4757
> [4] https://issues.apache.org/jira/browse/YARN-6419
> 
> 
> 





Re: [VOTE] Merge yarn-native-services branch into trunk

2017-11-06 Thread Jian He
Okay, I just merged the branch to trunk (108 commits in total!)
Again, thanks to all who contributed to this feature!

Jian

On Nov 6, 2017, at 1:26 PM, Jian He wrote:

Here’s my +1.
The vote passes with 7 binding +1s and 2 non-binding +1s.

Thanks to all who voted. I’ll merge to trunk by the end of today.

Jian

On Nov 6, 2017, at 8:38 AM, Billie Rinaldi wrote:

+1 (binding)

On Mon, Oct 30, 2017 at 1:06 PM, Jian He wrote:
Hi All,

I would like to restart the vote for merging yarn-native-services to trunk.
Since the last vote, we have been working on several issues in documentation, 
DNS, CLI modifications etc. We believe the feature is now in much better shape.

Some background:
At a high level, the following are the key features implemented.
- YARN-5079[1]. A native YARN framework (ApplicationMaster) to orchestrate 
existing services on YARN, either docker or non-docker based.
- YARN-4793[2]. A REST API service embedded in the RM (optional) for users to 
deploy a service via a simple JSON spec
- YARN-4757[3]. Extending today's service registry with a simple DNS service to 
enable users to discover services deployed on YARN via standard DNS lookup
- YARN-6419[4]. UI support for native-services on the new YARN UI
All these new services are optional and sit outside of the existing 
system, and have no impact on the existing system if disabled.

Special thanks to the team of folks who worked hard towards this: Billie 
Rinaldi, Gour Saha, Vinod Kumar Vavilapalli, Jonathan Maron, Rohith Sharma K S, 
Sunil G, Akhil PB, Eric Yang. This effort would not have been possible without 
their ideas and hard work.
Thanks also to Allen for some review and verification.

Thanks,
Jian

[1] https://issues.apache.org/jira/browse/YARN-5079
[2] https://issues.apache.org/jira/browse/YARN-4793
[3] https://issues.apache.org/jira/browse/YARN-4757
[4] https://issues.apache.org/jira/browse/YARN-6419





Re: [VOTE] Merge yarn-native-services branch into trunk

2017-11-06 Thread Jian He
Here’s my +1.
The vote passes with 7 binding +1s and 2 non-binding +1s.

Thanks to all who voted. I’ll merge to trunk by the end of today.

Jian

On Nov 6, 2017, at 8:38 AM, Billie Rinaldi wrote:

+1 (binding)

On Mon, Oct 30, 2017 at 1:06 PM, Jian He wrote:
Hi All,

I would like to restart the vote for merging yarn-native-services to trunk.
Since the last vote, we have been working on several issues in documentation, 
DNS, CLI modifications etc. We believe the feature is now in much better shape.

Some background:
At a high level, the following are the key features implemented.
- YARN-5079[1]. A native YARN framework (ApplicationMaster) to orchestrate 
existing services on YARN, either docker or non-docker based.
- YARN-4793[2]. A REST API service embedded in the RM (optional) for users to 
deploy a service via a simple JSON spec
- YARN-4757[3]. Extending today's service registry with a simple DNS service to 
enable users to discover services deployed on YARN via standard DNS lookup
- YARN-6419[4]. UI support for native-services on the new YARN UI
All these new services are optional and sit outside of the existing 
system, and have no impact on the existing system if disabled.

Special thanks to the team of folks who worked hard towards this: Billie 
Rinaldi, Gour Saha, Vinod Kumar Vavilapalli, Jonathan Maron, Rohith Sharma K S, 
Sunil G, Akhil PB, Eric Yang. This effort would not have been possible without 
their ideas and hard work.
Thanks also to Allen for some review and verification.

Thanks,
Jian

[1] https://issues.apache.org/jira/browse/YARN-5079
[2] https://issues.apache.org/jira/browse/YARN-4793
[3] https://issues.apache.org/jira/browse/YARN-4757
[4] https://issues.apache.org/jira/browse/YARN-6419




Re: [DISCUSS] A final minor release off branch-2?

2017-11-06 Thread Andrew Wang
What are the known gaps that need bridging between 2.x and 3.x?

From an HDFS perspective, we've tested wire compat, rolling upgrade, and
rollback.

From a YARN perspective, we've tested wire compat and rolling upgrade. Arun
just mentioned an NM rollback issue that I'm not familiar with.

Anything else? External to this discussion, these should be documented as
known issues for 3.0.

Best.
Andrew

On Sun, Nov 5, 2017 at 1:46 PM, Arun Suresh  wrote:

> Thanks for starting this discussion Vinod.
>
> I agree (C) is a bad idea.
> I would prefer (A) given that ATM, branch-2 is still very close to
> branch-2.9 - and it is a good time to make a collective decision to lock
> down commits to branch-2.
>
> I think we should also clearly define what the 'bridging' release should
> be.
> I assume it means the following:
> * Any 2.x user wanting to move to 3.x must first upgrade to the bridging
> release and then upgrade to the 3.x release.
> * With regard to state store upgrades (at least NM state stores) the
> bridging state stores should be aware of all new 3.x keys so the implicit
> assumption would be that a user can only rollback from the 3.x release to
> the bridging release and not to the old 2.x release.
> * Use the opportunity to clean up deprecated API ?
> * Do we even want to consider a separate bridging release for 2.7, 2.8 and
> 2.9 lines ?
>
> Cheers
> -Arun
>
> On Fri, Nov 3, 2017 at 5:07 PM, Vinod Kumar Vavilapalli <
> vino...@apache.org>
> wrote:
>
> > Hi all,
> >
> > With 3.0.0 GA around the corner (tx for the push, Andrew!), 2.9.0 RC out
> > (tx Arun / Subru!) and 2.8.2 (tx Junping!), I think it's high time we have
> > a discussion on how we manage our developmental bandwidth between 2.x line
> > and 3.x lines.
> >
> > Once 3.0 GA goes out, we will have two parallel and major release lines.
> > The last time we were in this situation was back when we did 1.x -> 2.x
> > jump.
> >
> > The parallel releases imply overhead of decisions, branch-merges and
> > back-ports. Right now we already do backports for 2.7.5, 2.8.2, 2.9.1,
> > 3.0.1 and potentially a 3.1.0 in a few months after 3.0.0 GA. And many of
> > these lines - for e.g 2.8, 2.9 - are going to be used for a while at a
> > bunch of large sites! At the same time, our users won't migrate to 3.0 GA
> > overnight - so we do have to support two parallel lines.
> >
> > I propose we start thinking of the fate of branch-2. The idea is to have
> > one final release that helps our users migrate from 2.x to 3.x. This
> > includes any changes on the older line to bridge compatibility issues,
> > upgrade issues, layout changes, tooling etc.
> >
> > We have a few options I think
> >  (A)
> > -- Make 2.9.x the last minor release off branch-2
> > -- Have a maintenance release that bridges 2.9 to 3.x
> > -- Continue to make more maintenance releases on 2.8 and 2.9 as
> > necessary
> > -- All new features obviously only go into the 3.x line as no features
> > can go into the maint line.
> >
> >  (B)
> > -- Create a new 2.10 release which doesn't have any new features, but
> > as a bridging release
> > -- Continue to make more maintenance releases on 2.8, 2.9 and 2.10 as
> > necessary
> > -- All new features, other than the bridging changes, go into the 3.x
> > line
> >
> >  (C)
> > -- Continue making branch-2 releases and postpone this discussion for
> > later
> >
> > I'm leaning towards (A) or to a lesser extent (B). Willing to hear
> > otherwise.
> >
> > Now, this obviously doesn't mean blocking of any more minor releases on
> > branch-2. Obviously, any interested committer / PMC can roll up his/her
> > sleeves, create a release plan and release, but we all need to acknowledge
> > that versions are not cheap and figure out how the community bandwidth is
> > split overall.
> >
> > Thanks
> > +Vinod
> > PS: The proposal is obviously not to force everyone to go in one direction
> > but more of a nudge for the community to figure out if we can focus a major
> > part of our bandwidth on one line. I had a similar concern when we were
> > doing 2.8 and 3.0 in parallel, but the impending possibility of spreading
> > too thin is much worse IMO.
> > PPS: (C) is a bad choice. With 2.8 and 2.9 we are already seeing user
> > adoption splintering between two lines. With 2.10, 2.11 etc coexisting with
> > 3.0, 3.1 etc, we will revisit the mad phase years ago when we had 0.20.x,
> > 0.20-security coexisting with 0.21, 0.22 etc.
>


Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86

2017-11-06 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/32/

[Nov 6, 2017 7:40:41 AM] (naganarasimha_gr) MAPREDUCE-6975. Logging task 
counters. Contributed by Prabhu Joseph.


[Error replacing 'FILE' - Workspace is not accessible]


Re: [VOTE] Release Apache Hadoop 2.9.0 (RC0)

2017-11-06 Thread Carlo Aldo Curino
+1 from me.

I have:
1) set up a small cluster
2) enabled the reservation system
3) submitted a reservation through REST
4) ran a job within the reservation
5) did a few negative tests (what happens if the res system is disabled and you
try to reserve, what happens if a job is submitted with an invalid res id,
etc.)

Everything I tested worked as expected.

Cheers,
Carlo

On Nov 6, 2017 7:06 AM, "Arun Suresh"  wrote:

> Here is my +1 to start.
>
> - Set up a small 4 node cluster
> - Verified some basic HDFS commands
> - Ran Pi / sleep jobs (with some mix of Opportunistic containers - both
> distributed and centralized scheduling)
>
> Cheers
> -Arun
>
>
> On Fri, Nov 3, 2017 at 4:38 PM, Arun Suresh  wrote:
>
> > Hey Vinod,
> >
> > I've cleaned up the RC directory as you requested.
> >
> > Cheers
> > -Arun
> >
> > On Fri, Nov 3, 2017 at 4:09 PM, Vinod Kumar Vavilapalli <
> > vino...@apache.org> wrote:
> >
> >> Arun / Subru,
> >>
> >> Thanks for the great work!
> >>
> >> Few quick comments
> >>  - Can you clean up the RC folder to only have tar.gz and src.tar.gz and
> >> their signatures and delete everything else? So that it's easy to pick up
> >> the important bits for the voters. For e.g, like this
> >> http://people.apache.org/~vinodkv/hadoop-2.8.1-RC3/
> >>  - Can you put the generated CHANGES.html and releasenotes.html instead
> >> of the md files, for quicker perusal?
> >>
> >> Thanks
> >> +Vinod
> >>
> >> On Nov 3, 2017, at 3:50 PM, Arun Suresh  wrote:
> >>
> >> Hi folks,
> >>
> > >> Apache Hadoop 2.9.0 is the first stable release of Hadoop 2.9 line and
> > >> will be the latest stable/production release for Apache Hadoop - it
> > >> includes 30 New Features with 500+ subtasks, 407 Improvements, 787 Bug
> > >> fixes new fixed issues since 2.8.2.
> >>
> >>  More information about the 2.9.0 release plan can be found here:
> > >> https://cwiki.apache.org/confluence/display/HADOOP/Roadmap#Roadmap-Version2.9
> >>
> >>  New RC is available at:
> >> http://home.apache.org/~asuresh/hadoop-2.9.0-RC0/
> >>
> > >>  The RC tag in git is: release-2.9.0-RC0, and the latest commit id is:
> > >> 6697f0c18b12f1bdb99cbdf81394091f4fef1f0a
> >>
> >>  The maven artifacts are available via repository.apache.org at:
> > >> https://repository.apache.org/content/repositories/orgapachehadoop-1065/
> >>
> >>  Please try the release and vote; the vote will run for the usual 5
> >> days, ending on 11/10/2017 4pm PST time.
> >>
> >> Thanks,
> >>
> >> Arun/Subru
> >>
> >>
> >>
> >
>


Re: [VOTE] Merge yarn-native-services branch into trunk

2017-11-06 Thread Billie Rinaldi
+1 (binding)

On Mon, Oct 30, 2017 at 1:06 PM, Jian He  wrote:

> Hi All,
>
> I would like to restart the vote for merging yarn-native-services to trunk.
> Since the last vote, we have been working on several issues in documentation,
> DNS, CLI modifications etc. We believe the feature is now in much better
> shape.
>
> Some background:
> At a high level, the following are the key features implemented.
> - YARN-5079[1]. A native YARN framework (ApplicationMaster) to orchestrate
> existing services on YARN, either docker or non-docker based.
> - YARN-4793[2]. A REST API service embedded in the RM (optional) for users to
> deploy a service via a simple JSON spec
> - YARN-4757[3]. Extending today's service registry with a simple DNS
> service to enable users to discover services deployed on YARN via standard
> DNS lookup
> - YARN-6419[4]. UI support for native-services on the new YARN UI
> All these new services are optional and are sitting outside of the
> existing system, and have no impact on the existing system if disabled.
>
> Special thanks to a team of folks who worked hard towards this: Billie
> Rinaldi, Gour Saha, Vinod Kumar Vavilapalli, Jonathan Maron, Rohith Sharma
> K S, Sunil G, Akhil PB, Eric Yang. This effort would not have been possible
> without their ideas and hard work.
> Also thanks to Allen for some review and verifications.
>
> Thanks,
> Jian
>
> [1] https://issues.apache.org/jira/browse/YARN-5079
> [2] https://issues.apache.org/jira/browse/YARN-4793
> [3] https://issues.apache.org/jira/browse/YARN-4757
> [4] https://issues.apache.org/jira/browse/YARN-6419
>


Re: [VOTE] Merge yarn-native-services branch into trunk

2017-11-06 Thread Gour Saha
+1 (non-binding)


-Gour

On 10/30/17, 1:49 PM, "Jian He"  wrote:

>Few more things:
>
>This is the document for trying a non-docker service on YARN.
>https://github.com/apache/hadoop/blob/yarn-native-services/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/QuickStart.md
>
>And the document for a docker based service
>https://github.com/apache/hadoop/blob/yarn-native-services/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/yarn-service/Examples.md
>
>And the vote lasts 7 days as usual.
>
>Thanks,
>Jian
>
>On Oct 30, 2017, at 1:06 PM, Jian He wrote:
>
>Hi All,
>
>I would like to restart the vote for merging yarn-native-services to
>trunk.
>Since the last vote, we have been working on several issues in documentation,
>DNS, CLI modifications etc. We believe the feature is now in much
>better shape.
>
>Some background:
>At a high level, the following are the key features implemented.
>- YARN-5079[1]. A native YARN framework (ApplicationMaster) to
>orchestrate existing services on YARN, either docker or non-docker based.
>- YARN-4793[2]. A REST API service embedded in the RM (optional) for users to
>deploy a service via a simple JSON spec
>- YARN-4757[3]. Extending today's service registry with a simple DNS
>service to enable users to discover services deployed on YARN via
>standard DNS lookup
>- YARN-6419[4]. UI support for native-services on the new YARN UI
>All these new services are optional and are sitting outside of the
>existing system, and have no impact on the existing system if disabled.
>
>Special thanks to a team of folks who worked hard towards this: Billie
>Rinaldi, Gour Saha, Vinod Kumar Vavilapalli, Jonathan Maron, Rohith
>Sharma K S, Sunil G, Akhil PB, Eric Yang. This effort would not have been
>possible without their ideas and hard work.
>Also thanks to Allen for some review and verifications.
>
>Thanks,
>Jian
>
>[1] https://issues.apache.org/jira/browse/YARN-5079
>[2] https://issues.apache.org/jira/browse/YARN-4793
>[3] https://issues.apache.org/jira/browse/YARN-4757
>[4] https://issues.apache.org/jira/browse/YARN-6419
>




Re: [VOTE] Release Apache Hadoop 2.9.0 (RC0)

2017-11-06 Thread Arun Suresh
Here is my +1 to start.

- Set up a small 4-node cluster
- Verified some basic HDFS commands
- Ran Pi / sleep jobs (with some mix of Opportunistic containers - both
distributed and centralized scheduling)

Cheers
-Arun


On Fri, Nov 3, 2017 at 4:38 PM, Arun Suresh  wrote:

> Hey Vinod,
>
> I've cleaned up the RC directory as you requested.
>
> Cheers
> -Arun
>
> On Fri, Nov 3, 2017 at 4:09 PM, Vinod Kumar Vavilapalli <
> vino...@apache.org> wrote:
>
>> Arun / Subru,
>>
>> Thanks for the great work!
>>
>> Few quick comments
>>  - Can you cleanup the RC folder to only have tar.gz and src.tar.gz and
>> their signatures and delete everything else? So that it's easy to pick up
>> the important bits for the voters. For e.g, like this
>> http://people.apache.org/~vinodkv/hadoop-2.8.1-RC3/
>>  - Can you put the generated CHANGES.html and releasenotes.html instead
>> of the md files, for quicker perusal?
>>
>> Thanks
>> +Vinod
>>
>> On Nov 3, 2017, at 3:50 PM, Arun Suresh  wrote:
>>
>> Hi folks,
>>
>> Apache Hadoop 2.9.0 is the first stable release of Hadoop 2.9 line and
>> will be the latest stable/production release for Apache Hadoop - it
>> includes 30 New Features with 500+ subtasks, 407 Improvements, 787 Bug
>> fixes new fixed issues since 2.8.2 .
>>
>>  More information about the 2.9.0 release plan can be found here:
>> https://cwiki.apache.org/confluence/display/HADOOP/Roadmap#Roadmap-Version2.9
>>
>>  New RC is available at:
>> http://home.apache.org/~asuresh/hadoop-2.9.0-RC0/
>>
>>  The RC tag in git is: release-2.9.0-RC0, and the latest commit id is:
>> 6697f0c18b12f1bdb99cbdf81394091f4fef1f0a
>>
>>  The maven artifacts are available via repository.apache.org at:
>> https://repository.apache.org/content/repositories/orgapachehadoop-1065/
>>
>>  Please try the release and vote; the vote will run for the usual 5
>> days, ending on 11/10/2017 4pm PST time.
>>
>> Thanks,
>>
>> Arun/Subru
>>
>>
>>
>


Re: [VOTE] Merge yarn-native-services branch into trunk

2017-11-06 Thread Rohith Sharma K S
+1 (binding). Thanks Jian for all the great work!

Built from the branch and deployed; able to bring up services with ATSv2
enabled and the new YARN UI integration.
Tried flex, start, and stop operations using the REST APIs.

- Rohith Sharma K S

On 31 October 2017 at 01:36, Jian He  wrote:

> Hi All,
>
> I would like to restart the vote for merging yarn-native-services to trunk.
> Since the last vote, we have been working on several issues in documentation,
> DNS, CLI modifications etc. We believe the feature is now in much better
> shape.
>
> Some background:
> At a high level, the following are the key features implemented.
> - YARN-5079[1]. A native YARN framework (ApplicationMaster) to orchestrate
> existing services on YARN, either docker or non-docker based.
> - YARN-4793[2]. A REST API service embedded in the RM (optional) for users to
> deploy a service via a simple JSON spec
> - YARN-4757[3]. Extending today's service registry with a simple DNS
> service to enable users to discover services deployed on YARN via standard
> DNS lookup
> - YARN-6419[4]. UI support for native-services on the new YARN UI
> All these new services are optional and are sitting outside of the
> existing system, and have no impact on the existing system if disabled.
>
> Special thanks to a team of folks who worked hard towards this: Billie
> Rinaldi, Gour Saha, Vinod Kumar Vavilapalli, Jonathan Maron, Rohith Sharma
> K S, Sunil G, Akhil PB, Eric Yang. This effort would not have been possible
> without their ideas and hard work.
> Also thanks to Allen for some review and verifications.
>
> Thanks,
> Jian
>
> [1] https://issues.apache.org/jira/browse/YARN-5079
> [2] https://issues.apache.org/jira/browse/YARN-4793
> [3] https://issues.apache.org/jira/browse/YARN-4757
> [4] https://issues.apache.org/jira/browse/YARN-6419
>


[jira] [Created] (MAPREDUCE-7003) Indefinite retries of getJobSummary() if a job summary file is corrupt

2017-11-06 Thread Attila Sasvari (JIRA)
Attila Sasvari created MAPREDUCE-7003:
-

Summary: Indefinite retries of getJobSummary() if a job summary file is corrupt
Key: MAPREDUCE-7003
URL: https://issues.apache.org/jira/browse/MAPREDUCE-7003
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: jobhistoryserver
Reporter: Attila Sasvari


Having a corrupt job summary file in the {{/user/history/done_intermediate}} 
directory in HDFS, e.g. 
{{/user/history/done_intermediate/oozie/job_1_11.summary}} 
before moving it to {{/user/history/done}}, results in indefinite retries of 
{{org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.getJobSummary()}}. JHS 
will log recurring exceptions like:
{code}
2017-11-03 01:01:01,124 WARN org.apache.hadoop.hdfs.BlockReaderFactory: I/O error constructing remote block reader.
java.io.IOException: Got error for OP_READ_BLOCK, status=ERROR, self=/ABC.DEF.GHI:JKLMN, remote=/ABC.DEF.GHI:JKLMN, for file /user/history/done_intermediate/admin/job_1_.summary, for pool XX-9-ABC.DEF.GHI-1 block 11_2
    at org.apache.hadoop.hdfs.RemoteBlockReader2.checkSuccess(RemoteBlockReader2.java:467)
    at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:432)
    at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReader(BlockReaderFactory.java:881)
    at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:759)
    at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:376)
    at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:652)
    at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:879)
    at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:932)
    at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:732)
    at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:337)
    at java.io.DataInputStream.readUTF(DataInputStream.java:589)
    at java.io.DataInputStream.readUTF(DataInputStream.java:564)
    at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.getJobSummary(HistoryFileManager.java:1059)
{code}
(INFO and ERROR logs are omitted)

To reproduce it:
- start JHS in debug mode (use JVM parameter 
{{-agentlib:jdwp=transport=dt_socket,server=y,address=4,suspend=n}} when 
starting it)
- attach a debugger to the process and add a breakpoint to stop in 
{{org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.getJobSummary()}}
- start a MapReduce job and wait until the breakpoint is hit
- delete or rename the physical block on the datanode(s) for the job summary file 
(e.g. use {{hdfs fsck 
/user/history/done_intermediate/oozie/job_1_11.summary -blocks 
-locations -files}} to get the block name; search for the block on the 
datanode(s) and remove or rename it)
- detach debugger
- examine JHS log files
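
For illustration only, here is a minimal sketch of what a bounded read of the summary could look like, so that a corrupt file is given up on after a fixed number of attempts instead of being retried forever. This is not the HistoryFileManager code and not a proposed patch: the class BoundedSummaryRead, the StreamOpener interface and the maxAttempts parameter are hypothetical names made up for this example. The only detail taken from the report is that the summary is read with DataInputStream.readUTF(), as the stack trace shows.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BoundedSummaryRead {

    /** Hypothetical stand-in for opening the job summary file. */
    public interface StreamOpener {
        InputStream open() throws IOException;
    }

    /**
     * Reads one UTF string from the stream, trying at most maxAttempts times.
     * After the last failed attempt the caller gets an IOException and can
     * decide what to do with the file (skip it, move it aside, and so on).
     */
    public static String readSummary(StreamOpener opener, int maxAttempts)
            throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            // A fresh stream is opened for every attempt and always closed.
            try (DataInputStream in = new DataInputStream(opener.open())) {
                return in.readUTF();
            } catch (IOException e) {
                lastFailure = e; // remember the cause and try again
            }
        }
        throw new IOException(
            "Giving up on job summary after " + maxAttempts + " attempts",
            lastFailure);
    }
}
```

With a helper of this shape, the caller sees a single IOException once the attempts are exhausted, and handling of the unreadable file becomes an explicit decision rather than an endless loop.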



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
