[jira] [Created] (YARN-6106) Add doc for tag 'allowPreemptionFrom' in Fair Scheduler

2017-01-17 Thread Yufei Gu (JIRA)
Yufei Gu created YARN-6106:
--

 Summary: Add doc for tag 'allowPreemptionFrom' in Fair Scheduler
 Key: YARN-6106
 URL: https://issues.apache.org/jira/browse/YARN-6106
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Reporter: Yufei Gu
Assignee: Yufei Gu
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org



[jira] [Created] (YARN-6105) Support for new REST end point /clusterids

2017-01-17 Thread Rohith Sharma K S (JIRA)
Rohith Sharma K S created YARN-6105:
---

 Summary: Support for new REST end point /clusterids
 Key: YARN-6105
 URL: https://issues.apache.org/jira/browse/YARN-6105
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Rohith Sharma K S


As discussed in YARN-5378 and YARN-6095, a */clusterids* endpoint that 
returns the list of cluster IDs known to the back end would be useful. 

Use case: In the cloud, clusters are arbitrarily spun up and destroyed. Each 
cluster has its own clusterId, which the UI never knows about. All of those 
newly spun-up clusters use the same ATS server and the same web UI. 
An admin can select a clusterId and navigate to any page. So it is worth 
listing the clusterIds from ATS.






Re: [VOTE] Release cadence and EOL

2017-01-17 Thread Karthik Kambatla
+1

I would also like to see some process guidelines. I should have brought
this up on the discussion thread, but I thought of them only now :(

   - Is an RM responsible for all maintenance releases against a minor
   release, or finding another RM to drive a maintenance release? In the past,
   this hasn't been a major issue.
   - When do we pick/volunteer to RM a minor release? IMO, this should be
   right after the previous release goes out. For example, Junping is driving
   2.8.0 now. As soon as that is done, we need to find a volunteer to RM 2.9.0
   6 months after.
   - The release process has multiple steps, based on
   major/minor/maintenance. It would be nice to capture/track how long each
   step takes so the RM can be prepared. e.g. herding the cats for an RC takes
   x weeks, compatibility checks take y days of work.


On Tue, Jan 17, 2017 at 10:05 AM, Sangjin Lee wrote:

> Thanks for correcting me! I left out a sentence by mistake. :)
>
> To correct the proposal we're voting for:
>
> A minor release on the latest major line should be every 6 months, and a
> maintenance release on a minor release (as there may be concurrently
> maintained minor releases) every 2 months.
>
> A minor release line is end-of-lifed 2 years after it is released or there
> are 2 newer minor releases, whichever is sooner. The community reserves the
> right to extend or shorten the life of a release line if there is a good
> reason to do so.
>
> Sorry for the snafu.
>
> Regards,
> Sangjin
>
> On Tue, Jan 17, 2017 at 9:58 AM, Daniel Templeton wrote:
>
> > Thanks for driving this, Sangjin. Quick question, though: the subject line
> > is "Release cadence and EOL," but I don't see anything about cadence in the
> > proposal.  Did I miss something?
> >
> > Daniel
> >
> >
> > On 1/17/17 8:35 AM, Sangjin Lee wrote:
> >
> >> Following up on the discussion thread on this topic (
> >> https://s.apache.org/eFOf), I'd like to put the proposal for a vote for the
> >> release cadence and EOL. The proposal is as follows:
> >>
> >> "A minor release line is end-of-lifed 2 years after it is released or there
> >> are 2 newer minor releases, whichever is sooner. The community reserves the
> >> right to extend or shorten the life of a release line if there is a good
> >> reason to do so."
> >>
> >> This also entails that we the Hadoop community commit to following this
> >> practice and solving challenges to make it possible. Andrew Wang laid out
> >> some of those challenges and what can be done in the discussion thread
> >> mentioned above.
> >>
> >> I'll set the voting period to 7 days. I understand a majority rule would
> >> apply in this case. Your vote is greatly appreciated, and so are
> >> suggestions!
> >>
> >> Thanks,
> >> Sangjin
> >>
> >>
> >
> > -
> > To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> > For additional commands, e-mail: common-dev-h...@hadoop.apache.org
> >
> >
>


Re: [VOTE] Release cadence and EOL

2017-01-17 Thread Sangjin Lee
Thanks for correcting me! I left out a sentence by mistake. :)

To correct the proposal we're voting for:

A minor release on the latest major line should be every 6 months, and a
maintenance release on a minor release (as there may be concurrently
maintained minor releases) every 2 months.

A minor release line is end-of-lifed 2 years after it is released or there
are 2 newer minor releases, whichever is sooner. The community reserves the
right to extend or shorten the life of a release line if there is a good
reason to do so.

Sorry for the snafu.

Regards,
Sangjin

On Tue, Jan 17, 2017 at 9:58 AM, Daniel Templeton wrote:

> Thanks for driving this, Sangjin. Quick question, though: the subject line
> is "Release cadence and EOL," but I don't see anything about cadence in the
> proposal.  Did I miss something?
>
> Daniel
>
>
> On 1/17/17 8:35 AM, Sangjin Lee wrote:
>
>> Following up on the discussion thread on this topic (
>> https://s.apache.org/eFOf), I'd like to put the proposal for a vote for the
>> release cadence and EOL. The proposal is as follows:
>>
>> "A minor release line is end-of-lifed 2 years after it is released or there
>> are 2 newer minor releases, whichever is sooner. The community reserves the
>> right to extend or shorten the life of a release line if there is a good
>> reason to do so."
>>
>> This also entails that we the Hadoop community commit to following this
>> practice and solving challenges to make it possible. Andrew Wang laid out
>> some of those challenges and what can be done in the discussion thread
>> mentioned above.
>>
>> I'll set the voting period to 7 days. I understand a majority rule would
>> apply in this case. Your vote is greatly appreciated, and so are
>> suggestions!
>>
>> Thanks,
>> Sangjin
>>
>>
>
>
>


Re: [VOTE] Release cadence and EOL

2017-01-17 Thread Daniel Templeton
Thanks for driving this, Sangjin. Quick question, though: the subject 
line is "Release cadence and EOL," but I don't see anything about 
cadence in the proposal.  Did I miss something?


Daniel

On 1/17/17 8:35 AM, Sangjin Lee wrote:

Following up on the discussion thread on this topic (
https://s.apache.org/eFOf), I'd like to put the proposal for a vote for the
release cadence and EOL. The proposal is as follows:

"A minor release line is end-of-lifed 2 years after it is released or there
are 2 newer minor releases, whichever is sooner. The community reserves the
right to extend or shorten the life of a release line if there is a good
reason to do so."

This also entails that we the Hadoop community commit to following this
practice and solving challenges to make it possible. Andrew Wang laid out
some of those challenges and what can be done in the discussion thread
mentioned above.

I'll set the voting period to 7 days. I understand a majority rule would
apply in this case. Your vote is greatly appreciated, and so are
suggestions!

Thanks,
Sangjin







[jira] [Created] (YARN-6104) RegistrySecurity overrides zookeeper sasl system properties

2017-01-17 Thread Billie Rinaldi (JIRA)
Billie Rinaldi created YARN-6104:


 Summary: RegistrySecurity overrides zookeeper sasl system properties
 Key: YARN-6104
 URL: https://issues.apache.org/jira/browse/YARN-6104
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Billie Rinaldi
Assignee: Billie Rinaldi


If the RM is configured with JAVA_OPTS setting the 
zookeeper.sasl.client.username and zookeeper.sasl.clientconfig properties, 
these are ignored and overwritten by RegistrySecurity in 
setZKSaslClientProperties.
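
One behavior that would avoid the overwrite is to set each property only when it has not already been supplied on the JVM command line. The following is a minimal sketch of that idea, not the actual change made for this issue: the class name, the `setIfAbsent` helper, and the example values are hypothetical, and only the two property names come from the description above.

```java
public class SaslPropertySketch {

  /**
   * Sets a system property only if it is not already defined
   * (for example via -D in JAVA_OPTS). Returns the effective value.
   * Hypothetical helper, not part of RegistrySecurity.
   */
  static String setIfAbsent(String key, String fallback) {
    String existing = System.getProperty(key);
    if (existing != null) {
      return existing; // keep the value the operator configured
    }
    System.setProperty(key, fallback);
    return fallback;
  }

  public static void main(String[] args) {
    // Example values are placeholders chosen for illustration only.
    String user = setIfAbsent("zookeeper.sasl.client.username", "example-user");
    String ctx = setIfAbsent("zookeeper.sasl.clientconfig", "ExampleClient");
    System.out.println("username=" + user + ", clientconfig=" + ctx);
  }
}
```

Run with `-Dzookeeper.sasl.client.username=...` to see the pre-configured value preserved.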






[jira] [Created] (YARN-6103) Logging update for ZKRMStateStore

2017-01-17 Thread Bibin A Chundatt (JIRA)
Bibin A Chundatt created YARN-6103:
--

 Summary: Logging update for ZKRMStateStore
 Key: YARN-6103
 URL: https://issues.apache.org/jira/browse/YARN-6103
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Bibin A Chundatt
Priority: Trivial


{code}
  LOG.debug(appId + " znode didn't exist. Created a new znode to"
  + " update the application state.");
{code}
This debug call is not guarded; it should first check whether debug logging is enabled.

{code}
if (LOG.isDebugEnabled()) {
  LOG.debug((isUpdate ? "Storing " : "Updating ")
  + dtSequenceNumberPath + ". SequenceNumber: "
  + rmDTIdentifier.getSequenceNumber());
}
{code}
Here, isUpdate is always false.
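
To illustrate why the guard matters, here is a small self-contained sketch. It uses a toy logger (`ToyLog`) rather than the real logging API used by ZKRMStateStore, and the counter exists only to show that the unguarded call builds its message string even when debug output is off. All class and method names in it are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class GuardedDebugSketch {

  /** Minimal stand-in for a logger; not the Hadoop or commons-logging API. */
  static class ToyLog {
    private final boolean debugEnabled;
    final StringBuilder out = new StringBuilder();

    ToyLog(boolean debugEnabled) {
      this.debugEnabled = debugEnabled;
    }

    boolean isDebugEnabled() {
      return debugEnabled;
    }

    void debug(String msg) {
      if (debugEnabled) {
        out.append(msg).append('\n');
      }
    }
  }

  /** Counts how many times a log message string was actually built. */
  static final AtomicInteger messagesBuilt = new AtomicInteger();

  static String buildMessage(String appId) {
    messagesBuilt.incrementAndGet();
    return appId + " znode didn't exist. Created a new znode to"
        + " update the application state.";
  }

  /** Guarded form: the message is built only when debug is enabled. */
  static void logGuarded(ToyLog log, String appId) {
    if (log.isDebugEnabled()) {
      log.debug(buildMessage(appId));
    }
  }

  /** Unguarded form: the message is always built, then possibly discarded. */
  static void logUnguarded(ToyLog log, String appId) {
    log.debug(buildMessage(appId));
  }

  public static void main(String[] args) {
    ToyLog quiet = new ToyLog(false);
    logGuarded(quiet, "app_1");
    logUnguarded(quiet, "app_1");
    System.out.println("messages built with debug off: " + messagesBuilt.get());
  }
}
```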






[VOTE] Release cadence and EOL

2017-01-17 Thread Sangjin Lee
Following up on the discussion thread on this topic (
https://s.apache.org/eFOf), I'd like to put the proposal for a vote for the
release cadence and EOL. The proposal is as follows:

"A minor release line is end-of-lifed 2 years after it is released or there
are 2 newer minor releases, whichever is sooner. The community reserves the
right to extend or shorten the life of a release line if there is a good
reason to do so."

This also entails that we the Hadoop community commit to following this
practice and solving challenges to make it possible. Andrew Wang laid out
some of those challenges and what can be done in the discussion thread
mentioned above.

I'll set the voting period to 7 days. I understand a majority rule would
apply in this case. Your vote is greatly appreciated, and so are
suggestions!

Thanks,
Sangjin


Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2017-01-17 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/289/

[Jan 16, 2017 7:11:53 AM] (lei) Revert "HDFS-11259. Update fsck to display maintenance state info.
[Jan 16, 2017 9:45:22 PM] (arp) HDFS-11342. Fix FileInputStream leak in loadLastPartialChunkChecksum.
[Jan 16, 2017 10:43:29 PM] (arp) HDFS-11339. Support File IO sampling for Datanode IO profiling hooks.
[Jan 16, 2017 10:53:53 PM] (jitendra) HDFS-11307. The rpc to portmap service for NFS has hardcoded timeout.
[Jan 17, 2017 12:20:24 AM] (junping_du) YARN-6011. Add a new web service to list the files on a container in
[Jan 17, 2017 1:10:23 AM] (aajisaka) HADOOP-13933. Add haadmin -getAllServiceState option to get the HA state




-1 overall


The following subsystems voted -1:
asflicense unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

Failed junit tests :

   hadoop.yarn.server.timeline.webapp.TestTimelineWebServices 
   hadoop.yarn.server.TestContainerManagerSecurity 
   hadoop.yarn.server.TestMiniYarnClusterNodeUtilization 
  

   cc:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/289/artifact/out/diff-compile-cc-root.txt  [4.0K]

   javac:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/289/artifact/out/diff-compile-javac-root.txt  [168K]

   checkstyle:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/289/artifact/out/diff-checkstyle-root.txt  [16M]

   pylint:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/289/artifact/out/diff-patch-pylint.txt  [20K]

   shellcheck:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/289/artifact/out/diff-patch-shellcheck.txt  [24K]

   shelldocs:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/289/artifact/out/diff-patch-shelldocs.txt  [16K]

   whitespace:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/289/artifact/out/whitespace-eol.txt  [11M]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/289/artifact/out/whitespace-tabs.txt  [1.3M]

   javadoc:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/289/artifact/out/diff-javadoc-javadoc-root.txt  [2.2M]

   unit:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/289/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-applicationhistoryservice.txt  [12K]
       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/289/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-tests.txt  [324K]

   asflicense:

       https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/289/artifact/out/patch-asflicense-problems.txt  [4.0K]

Powered by Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org




[jira] [Created] (YARN-6102) On failover RM can crash due to unregistered event to AsyncDispatcher

2017-01-17 Thread Ajith S (JIRA)
Ajith S created YARN-6102:
-

 Summary: On failover RM can crash due to unregistered event to AsyncDispatcher
 Key: YARN-6102
 URL: https://issues.apache.org/jira/browse/YARN-6102
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Ajith S
Assignee: Ajith S
Priority: Critical


{code}2017-01-17 16:42:17,911 FATAL [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:dispatch(200)) - Error in dispatcher thread
java.lang.Exception: No handler for registered for class org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeEventType
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:196)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:120)
    at java.lang.Thread.run(Thread.java:745)
2017-01-17 16:42:17,914 INFO  [AsyncDispatcher ShutDown handler] event.AsyncDispatcher (AsyncDispatcher.java:run(303)) - Exiting, bbye..{code}

The same stack trace was also noticed when {{TestResourceTrackerOnHA}} exits 
abnormally. After some analysis, I was able to reproduce it.

Once the node heartbeat is sent to the RM, inside 
{{org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.nodeHeartbeat(NodeHeartbeatRequest)}}, 
if an RM failover is triggered before the event is passed to the dispatcher through 
{{this.rmContext.getDispatcher().getEventHandler().handle(nodeStatusEvent);}}, 
the dispatcher is reset. 
The new dispatcher, however, is started first, and only then are the event 
handlers registered, in 
{{org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.reinitialize(boolean)}}.

So the order of events looks like this:
1. A node heartbeat is sent to {{ResourceTrackerService}}.
2. In {{ResourceTrackerService.nodeHeartbeat}}, before the event is passed to 
the dispatcher, an RM failover is triggered.
3. During the failover, the current active RM resets the dispatcher in 
reinitialize, i.e. ( {{resetDispatcher();}} + {{createAndInitActiveServices();}} ).

Now, between {{resetDispatcher();}} and {{createAndInitActiveServices();}}, 
{{ResourceTrackerService.nodeHeartbeat}} invokes the dispatcher.

This causes the error above: at the point when the {{STATUS_UPDATE}} event is 
handed to the dispatcher in {{ResourceTrackerService}}, the new dispatcher 
(from the failover) may have been started but has not yet registered its event 
handlers.
Using the same steps (pausing the JVM in a debugger), I was also able to 
reproduce this in a production cluster: for the {{STATUS_UPDATE}} active-service 
event, the service has yet to forward the event to the RM dispatcher when a 
failover is triggered, and the dispatcher reset falls between 
{{resetDispatcher();}} and {{createAndInitActiveServices();}}.
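
The window described above can be sketched with a toy dispatcher. This is a simplified, single-threaded illustration and not the real {{AsyncDispatcher}}: the class names, the `tryDispatch` helper, and the stand-in `NodeEventType` enum are all hypothetical. It only shows that an event arriving after the dispatcher is replaced but before handlers are registered has nowhere to go.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

public class DispatcherRaceSketch {

  /** Hypothetical event type standing in for RMNodeEventType. */
  enum NodeEventType { STATUS_UPDATE }

  /** Toy dispatcher: rejects events whose type has no registered handler. */
  static class ToyDispatcher {
    private final Map<Class<?>, Consumer<Enum<?>>> handlers = new HashMap<>();

    void register(Class<?> eventTypeClass, Consumer<Enum<?>> handler) {
      handlers.put(eventTypeClass, handler);
    }

    void dispatch(Enum<?> event) {
      Consumer<Enum<?>> handler = handlers.get(event.getDeclaringClass());
      if (handler == null) {
        throw new IllegalStateException(
            "No handler registered for " + event.getDeclaringClass());
      }
      handler.accept(event);
    }
  }

  /** Returns true if the event was handled, false if no handler existed. */
  static boolean tryDispatch(ToyDispatcher dispatcher, Enum<?> event) {
    try {
      dispatcher.dispatch(event);
      return true;
    } catch (IllegalStateException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // Step 1: the dispatcher is replaced (analogous to the reset step).
    ToyDispatcher fresh = new ToyDispatcher();

    // Step 2: a heartbeat event arrives inside the window.
    boolean duringWindow = tryDispatch(fresh, NodeEventType.STATUS_UPDATE);

    // Step 3: handlers are registered (analogous to re-creating services).
    fresh.register(NodeEventType.class, e -> { });
    boolean afterRegistration = tryDispatch(fresh, NodeEventType.STATUS_UPDATE);

    System.out.println("handled during window: " + duringWindow);
    System.out.println("handled after registration: " + afterRegistration);
  }
}
```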






[jira] [Created] (YARN-6101) Delay scheduling for node resource balance

2017-01-17 Thread He Tianyi (JIRA)
He Tianyi created YARN-6101:
---

 Summary: Delay scheduling for node resource balance
 Key: YARN-6101
 URL: https://issues.apache.org/jira/browse/YARN-6101
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Reporter: He Tianyi
Priority: Minor


We have observed that Spark usage in today's clusters has increased 
dramatically. 
This introduces a new issue: CPU/memory utilization on a single node may become 
imbalanced, because Spark is generally more memory intensive. For example, 
a node with capability (48 cores, 192 GB memory) cannot satisfy a (1 core, 2 GB 
memory) request if its currently used resource is (20 cores, 190 GB memory), even 
with plenty of total available resource across the whole cluster.
One way to avoid this situation is to introduce some strategy during 
scheduling.
This JIRA proposes a delay-scheduling-like approach to achieve a better balance 
between the different types of resources on each node.
The basic idea is to consider the dominant resource of each node: when a 
scheduling opportunity on a particular node is offered to a resource request, 
make sure the allocation changes the dominant resource of the node, or, 
in the worst case, allocate at once when the number of offered scheduling 
opportunities exceeds a certain number.
With YARN SLS and a simulation file with a hybrid workload (MR+Spark), the 
approach improved cluster resource usage by nearly 5%. After deploying it to 
production, we observed an 8% improvement.
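
The idea can be sketched roughly as follows. This is one possible reading of the proposal, not the patch itself: it assumes "dominant resource" means the resource with the highest used/capacity ratio on the node, and that "changing the dominant resource" means the dominant resource differs before and after the allocation. The class, method names, and the `maxMissed` threshold are hypothetical.

```java
public class NodeBalanceSketch {

  enum Dominant { CPU, MEMORY }

  /** Dominant resource = the one with the larger used/capacity share (assumption). */
  static Dominant dominant(int capCores, int capMemGb, int usedCores, int usedMemGb) {
    double cpuShare = (double) usedCores / capCores;
    double memShare = (double) usedMemGb / capMemGb;
    return memShare > cpuShare ? Dominant.MEMORY : Dominant.CPU;
  }

  static boolean fits(int capCores, int capMemGb, int usedCores, int usedMemGb,
                      int reqCores, int reqMemGb) {
    return usedCores + reqCores <= capCores && usedMemGb + reqMemGb <= capMemGb;
  }

  /**
   * Assumed rule: allocate when the request flips the node's dominant
   * resource, or when the request has already been passed over more than
   * maxMissed times; otherwise skip this scheduling opportunity.
   */
  static boolean shouldAllocate(int capCores, int capMemGb, int usedCores, int usedMemGb,
                                int reqCores, int reqMemGb,
                                int missedOpportunities, int maxMissed) {
    if (!fits(capCores, capMemGb, usedCores, usedMemGb, reqCores, reqMemGb)) {
      return false;
    }
    Dominant before = dominant(capCores, capMemGb, usedCores, usedMemGb);
    Dominant after = dominant(capCores, capMemGb,
        usedCores + reqCores, usedMemGb + reqMemGb);
    return before != after || missedOpportunities > maxMissed;
  }

  public static void main(String[] args) {
    // Node from the description: (48 cores, 192 GB) with (20 cores, 190 GB) used.
    System.out.println("dominant: " + dominant(48, 192, 20, 190));
    // Illustrative requests against a node with (20 cores, 100 GB) used.
    System.out.println("cpu-heavy request: "
        + shouldAllocate(48, 192, 20, 100, 10, 2, 0, 3));
    System.out.println("memory-heavy request: "
        + shouldAllocate(48, 192, 20, 100, 1, 20, 0, 3));
  }
}
```

In this sketch a CPU-heavy request is placed on a memory-dominant node right away, while a memory-heavy one waits until it has been skipped more than `maxMissed` times.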


