[jira] [Commented] (YARN-6111) Rumen input doesn't work in SLS

2017-05-23 Thread YuJie Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022299#comment-16022299
 ] 

YuJie Huang commented on YARN-6111:
---

Is this patch YARN-6111.001.patch?

> Rumen input doesn't work in SLS
> --
>
> Key: YARN-6111
> URL: https://issues.apache.org/jira/browse/YARN-6111
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler-load-simulator
>Affects Versions: 2.6.0, 2.7.3, 3.0.0-alpha2
> Environment: ubuntu14.0.4 os
>Reporter: YuJie Huang
>Assignee: Yufei Gu
>  Labels: test
> Fix For: 3.0.0-alpha3
>
> Attachments: YARN-6111.001.patch
>
>
> Hi guys,
> I am trying to learn the use of SLS.
> I would like to get the file realtimetrack.json, but it only 
> contains "[]" at the end of a simulation. This is the command I use to 
> run the instance:
> HADOOP_HOME $ bin/slsrun.sh --input-rumen=sample-data/2jobsmin-rumen-jh.json 
> --output-dir=sample-data 
> All other files, including metrics, appear to be properly populated. I can 
> also trace the simulation on the web UI at http://localhost:10001/simulate
> Can someone help?
> Thanks






[jira] [Commented] (YARN-6637) Deadlock in NativeIO

2017-05-23 Thread Ajith S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022273#comment-16022273
 ] 

Ajith S commented on YARN-6637:
---

A flavor of https://bugs.openjdk.java.net/browse/JDK-8037567, but not the same; 
this one is rather caused by application code.

> Deadlock in NativeIO
> 
>
> Key: YARN-6637
> URL: https://issues.apache.org/jira/browse/YARN-6637
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ajith S
>Assignee: Ajith S
>Priority: Critical
>







[jira] [Commented] (YARN-6637) Deadlock in NativeIO

2017-05-23 Thread Ajith S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022270#comment-16022270
 ] 

Ajith S commented on YARN-6637:
---

Below are the relevant bits from the stack traces:

Thread1
{code}
   java.lang.Thread.State: RUNNABLE
    at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:739)
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:224)
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:208)
{code}

Thread2
{code}
   java.lang.Thread.State: RUNNABLE
    at org.apache.hadoop.mapred.FadvisedFileRegion.transferSuccessful(FadvisedFileRegion.java:160)
    at org.apache.hadoop.mapred.ShuffleHandler$Shuffle$1.operationComplete(ShuffleHandler.java:1166)
    at org.jboss.netty.channel.DefaultChannelFuture.notifyListener(DefaultChannelFuture.java:427)
    at org.jboss.netty.channel.DefaultChannelFuture.notifyListeners(DefaultChannelFuture.java:413)
{code}


The threads above look to be blocked by the two stacks below.

Stack1:
{code}
"New I/O worker #1" #135 prio=5 os_prio=0 tid=0x7f1f60817800 nid=0x697d in Object.wait() [0x7f1f4429a000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.hadoop.io.nativeio.NativeIO$POSIX.<clinit>(NativeIO.java:184)
    at org.apache.hadoop.mapred.FadvisedFileRegion.transferSuccessful(FadvisedFileRegion.java:160)
    at org.apache.hadoop.mapred.ShuffleHandler$Shuffle$1.operationComplete(ShuffleHandler.java:1166)
{code}

Stack2:
{code}
"ContainersLauncher #16" #365 prio=5 os_prio=0 tid=0x7f1f49c8a800 nid=0x7cd0 in Object.wait() [0x7f1f32891000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.hadoop.io.nativeio.NativeIO.initNative(Native Method)
    at org.apache.hadoop.io.nativeio.NativeIO.<clinit>(NativeIO.java:645)
    at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:739)
{code}

*Stack1* is blocked by *Stack2*: the *Stack1* thread needs *NativeIO* class 
initialization to finish, so the problematic stack looks to be Stack 2.
Next, in Stack 2 at NativeIO.java:645, initNative is a native call that tries 
to initialize the native-hadoop library.

Stack 2 then does this in NativeIO.c, via 
*Java_org_apache_hadoop_io_nativeio_NativeIO_initNative*:
{code}
#define NATIVE_IO_POSIX_CLASS "org/apache/hadoop/io/nativeio/NativeIO$POSIX"

static void consts_init(JNIEnv *env) {
  // i.e. load (and thereby initialize) the class
  // org.apache.hadoop.io.nativeio.NativeIO$POSIX
  jclass clazz = (*env)->FindClass(env, NATIVE_IO_POSIX_CLASS);
{code}
but *Stack1* is already inside 
{{org.apache.hadoop.io.nativeio.NativeIO$POSIX.<clinit>}}, 
so the two class initializations deadlock and all dependent threads hang.
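
This failure mode is easy to reproduce outside Hadoop. A minimal sketch of the 
same class-initialization deadlock (hypothetical classes, not Hadoop code):
{code}
// Two classes whose static initializers trigger each other. With unlucky
// timing, t1 starts initializing A while t2 starts initializing B; each
// <clinit> then needs the other class, and both park forever on the JVM's
// per-class initialization locks -- the same shape as Stack1/Stack2 above.
public class InitDeadlock {
  static class A {
    static { B.ping(); }   // A.<clinit> triggers initialization of B
    static void ping() { }
  }
  static class B {
    static { A.ping(); }   // B.<clinit> triggers initialization of A
    static void ping() { }
  }
  public static void main(String[] args) throws InterruptedException {
    Thread t1 = new Thread(A::ping);
    Thread t2 = new Thread(B::ping);
    t1.start();
    t2.start();
    t1.join();             // never returns once the deadlock hits
  }
}
{code}
Note that the JVM reports such threads as RUNNABLE even while they are parked 
on the class-init lock, which is why the stacks above show "in Object.wait()" 
with State: RUNNABLE.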

> Deadlock in NativeIO
> 
>
> Key: YARN-6637
> URL: https://issues.apache.org/jira/browse/YARN-6637
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Ajith S
>Assignee: Ajith S
>Priority: Critical
>







[jira] [Created] (YARN-6637) Deadlock in NativeIO

2017-05-23 Thread Ajith S (JIRA)
Ajith S created YARN-6637:
-

 Summary: Deadlock in NativeIO
 Key: YARN-6637
 URL: https://issues.apache.org/jira/browse/YARN-6637
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Ajith S
Assignee: Ajith S
Priority: Critical









[jira] [Updated] (YARN-6630) Container worker dir could not recover when NM restart

2017-05-23 Thread Yang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Wang updated YARN-6630:

Description: 
When yarn.nodemanager.recovery.enabled is true and ContainerRetryPolicy is 
NEVER_RETRY, container worker dir will not be saved in NM state store. 

{code:title=ContainerLaunch.java}
...
  private void recordContainerWorkDir(ContainerId containerId,
  String workDir) throws IOException{
container.setWorkDir(workDir);
if (container.isRetryContextSet()) {
  context.getNMStateStore().storeContainerWorkDir(containerId, workDir);
}
  }
{code}

Then, when the NM restarts, container.workDir is null, which may cause other exceptions.

{code:title=ContainerImpl.java}
  static class ResourceLocalizedWhileRunningTransition
  extends ContainerTransition {
...
  String linkFile = new Path(container.workDir, link).toString();
...
{code}

{code}
java.lang.IllegalArgumentException: Can not create a Path from a null string
at org.apache.hadoop.fs.Path.checkPathArg(Path.java:159)
at org.apache.hadoop.fs.Path.<init>(Path.java:175)
at org.apache.hadoop.fs.Path.<init>(Path.java:110)
... ...
{code}

  was:
When ContainerRetryPolicy is NEVER_RETRY, container worker dir will not be 
saved in NM state store. Then NM restarts, container.workDir is null, and may 
cause other exceptions.

{code:title=ContainerLaunch.java}
...
  private void recordContainerWorkDir(ContainerId containerId,
  String workDir) throws IOException{
container.setWorkDir(workDir);
if (container.isRetryContextSet()) {
  context.getNMStateStore().storeContainerWorkDir(containerId, workDir);
}
  }
{code}

{code:title=ContainerImpl.java}
  static class ResourceLocalizedWhileRunningTransition
  extends ContainerTransition {
...
  String linkFile = new Path(container.workDir, link).toString();
...
{code}


> Container worker dir could not recover when NM restart
> --
>
> Key: YARN-6630
> URL: https://issues.apache.org/jira/browse/YARN-6630
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yang Wang
>
> When yarn.nodemanager.recovery.enabled is true and ContainerRetryPolicy is 
> NEVER_RETRY, container worker dir will not be saved in NM state store. 
> {code:title=ContainerLaunch.java}
> ...
>   private void recordContainerWorkDir(ContainerId containerId,
>   String workDir) throws IOException{
> container.setWorkDir(workDir);
> if (container.isRetryContextSet()) {
>   context.getNMStateStore().storeContainerWorkDir(containerId, workDir);
> }
>   }
> {code}
> Then, when the NM restarts, container.workDir is null, which may cause other exceptions.
> {code:title=ContainerImpl.java}
>   static class ResourceLocalizedWhileRunningTransition
>   extends ContainerTransition {
> ...
>   String linkFile = new Path(container.workDir, link).toString();
> ...
> {code}
> {code}
> java.lang.IllegalArgumentException: Can not create a Path from a null string
> at org.apache.hadoop.fs.Path.checkPathArg(Path.java:159)
> at org.apache.hadoop.fs.Path.<init>(Path.java:175)
> at org.apache.hadoop.fs.Path.<init>(Path.java:110)
> ... ...
> {code}






[jira] [Commented] (YARN-6634) [API] Define an API for ResourceManager WebServices

2017-05-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022241#comment-16022241
 ] 

Hadoop QA commented on YARN-6634:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 26s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 67 new + 4 unchanged - 36 fixed = 71 total (was 40) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 48 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
23s{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 182 new + 852 unchanged - 22 fixed = 1034 total (was 874) {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 40m 40s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 66m 59s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesAppsModification |
|   | 
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesSchedulerActivities |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-6634 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12869560/YARN-6634.v1.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 20c076d5 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 
14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 52661e0 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/16003/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| whitespace | 

[jira] [Commented] (YARN-6607) YARN Resource Manager quits with the exception java.util.concurrent.RejectedExecutionException:

2017-05-23 Thread Feng Yuan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1605#comment-1605
 ] 

Feng Yuan commented on YARN-6607:
-

Is there any stacktrace?
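
For reference, the RejectedExecutionException in the quoted trace below is what 
java.util.concurrent.ThreadPoolExecutor throws when a task is submitted after 
shutdown has begun; a minimal sketch of that failure mode (hypothetical pool 
setup, not RM code):
{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class RejectedSubmitDemo {
  public static void main(String[] args) {
    // Hypothetical 16-thread pool, mirroring "pool size = 16" in the log.
    ExecutorService pool = Executors.newFixedThreadPool(16);
    pool.shutdown();        // pool enters the "Shutting down" state
    // The default AbortPolicy refuses any task submitted after shutdown:
    pool.submit(() -> { }); // throws RejectedExecutionException
  }
}
{code}
This matches the "Shutting down, pool size = 16" executor state in the log, so 
the question is why the registry service's executor was already shutting down 
while the dispatcher was still delivering events.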

> YARN Resource Manager quits with the exception 
> java.util.concurrent.RejectedExecutionException: 
> 
>
> Key: YARN-6607
> URL: https://issues.apache.org/jira/browse/YARN-6607
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.1
>Reporter: Anandhaprabhu
>
> ResourceManager goes down frequently with the below exception
> 2017-05-16 03:32:36,897 FATAL event.AsyncDispatcher 
> (AsyncDispatcher.java:dispatch(189)) - Error in dispatcher thread
> java.util.concurrent.RejectedExecutionException: Task 
> java.util.concurrent.FutureTask@9efeac9 rejected from 
> java.util.concurrent.ThreadPoolExecutor@42ab30[Shutting down, pool size = 16, 
> active threads = 0, queued tasks = 0, completed tasks = 223337]
> at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
> at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
> at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
> at 
> java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:134)
> at 
> org.apache.hadoop.registry.server.services.RegistryAdminService.submit(RegistryAdminService.java:176)
> at 
> org.apache.hadoop.registry.server.integration.RMRegistryOperationsService.purgeRecordsAsync(RMRegistryOperationsService.java:200)
> at 
> org.apache.hadoop.registry.server.integration.RMRegistryOperationsService.purgeRecordsAsync(RMRegistryOperationsService.java:170)
> at 
> org.apache.hadoop.registry.server.integration.RMRegistryOperationsService.onContainerFinished(RMRegistryOperationsService.java:146)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.registry.RMRegistryService.handleAppAttemptEvent(RMRegistryService.java:151)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.registry.RMRegistryService$AppEventHandler.handle(RMRegistryService.java:183)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.registry.RMRegistryService$AppEventHandler.handle(RMRegistryService.java:177)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$MultiListenerHandler.handle(AsyncDispatcher.java:276)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
> at java.lang.Thread.run(Thread.java:745)
> 2017-05-16 03:32:36,898 INFO  zookeeper.ClientCnxn (ClientCnxn.java:run(524)) 
> - EventThread shut down
> 2017-05-16 03:32:36,898 INFO  zookeeper.ZooKeeper (ZooKeeper.java:close(684)) 
> - Session: 0x15b8703e986b750 closed
> 2017-05-16 03:32:36,898 INFO  capacity.ParentQueue 
> (ParentQueue.java:completedContainer(623)) - completedContainer queue=high 
> usedCapacity=0.41496983 absoluteUsedCapacity=0.29047886 used= vCores:847> cluster=
> 2017-05-16 03:32:36,905 INFO  capacity.ParentQueue 
> (ParentQueue.java:completedContainer(640)) - Re-sorting completed queue: 
> root.high.lawful stats: lawful: capacity=0.3, absoluteCapacity=0.2101, 
> usedResources=, usedCapacity=0.16657583, 
> absoluteUsedCapacity=0.034980923, numApps=19, numContainers=102
> 2017-05-16 03:32:36,905 INFO  capacity.ParentQueue 
> (ParentQueue.java:completedContainer(623)) - completedContainer queue=root 
> usedCapacity=0.41565567 absoluteUsedCapacity=0.41565567 used= vCores:1212> cluster=
> 2017-05-16 03:32:36,906 INFO  capacity.ParentQueue 
> (ParentQueue.java:completedContainer(640)) - Re-sorting completed queue: 
> root.high stats: high: numChildQueue= 4, capacity=0.7, absoluteCapacity=0.7, 
> usedResources=usedCapacity=0.41496983, 
> numApps=61, numContainers=847
> 2017-05-16 03:32:36,906 INFO  capacity.CapacityScheduler 
> (CapacityScheduler.java:completedContainer(1562)) - Application attempt 
> appattempt_1494886223429_7023_01 released container 
> container_e43_1494886223429_7023_01_43 on node: host: 
> r13d8.hadoop.log10.blackberry:45454 #containers=1 available= vCores:23> used= with event: FINISHED






[jira] [Commented] (YARN-6630) Container worker dir could not recover when NM restart

2017-05-23 Thread Feng Yuan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1603#comment-1603
 ] 

Feng Yuan commented on YARN-6630:
-

IMO, by default when the NM starts it will clear all work dirs. Should we skip 
the work dirs of containers that are being recovered?
Any ideas?
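
One possible direction, as a sketch only (based on the recordContainerWorkDir 
method quoted in the description below, not a committed patch): persist the 
work dir unconditionally, so it can be recovered regardless of the retry 
policy.
{code}
// Sketch of a possible fix in ContainerLaunch.java (not a committed patch):
// store the work dir for every container, not only when a retry context is
// set, so container.workDir survives an NM restart with recovery enabled.
private void recordContainerWorkDir(ContainerId containerId,
    String workDir) throws IOException {
  container.setWorkDir(workDir);
  context.getNMStateStore().storeContainerWorkDir(containerId, workDir);
}
{code}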

> Container worker dir could not recover when NM restart
> --
>
> Key: YARN-6630
> URL: https://issues.apache.org/jira/browse/YARN-6630
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yang Wang
>
> When ContainerRetryPolicy is NEVER_RETRY, the container work dir will not be 
> saved in the NM state store. Then, when the NM restarts, container.workDir is 
> null, which may cause other exceptions.
> {code:title=ContainerLaunch.java}
> ...
>   private void recordContainerWorkDir(ContainerId containerId,
>   String workDir) throws IOException{
> container.setWorkDir(workDir);
> if (container.isRetryContextSet()) {
>   context.getNMStateStore().storeContainerWorkDir(containerId, workDir);
> }
>   }
> {code}
> {code:title=ContainerImpl.java}
>   static class ResourceLocalizedWhileRunningTransition
>   extends ContainerTransition {
> ...
>   String linkFile = new Path(container.workDir, link).toString();
> ...
> {code}






[jira] [Commented] (YARN-2497) Changes for fair scheduler to support allocate resource respect labels

2017-05-23 Thread Feng Yuan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1600#comment-1600
 ] 

Feng Yuan commented on YARN-2497:
-

Could someone break this down into subtasks or attach a design doc?

> Changes for fair scheduler to support allocate resource respect labels
> --
>
> Key: YARN-2497
> URL: https://issues.apache.org/jira/browse/YARN-2497
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Wangda Tan
>







[jira] [Commented] (YARN-6555) Enable flow context read (& corresponding write) for recovering application with NM restart

2017-05-23 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022203#comment-16022203
 ] 

Rohith Sharma K S commented on YARN-6555:
-

bq. Do you think we should preserve as much flow context information as 
possible? The patch only stores flow context in the state store only if all 
three fields of flow context is present. We could sanitize the flow context and 
fill in default values for whatever field is missing and then just check if 
flowcontext !=null before storing application state
Here are my 2 cents. 
# IMO, we should NOT set default values for the flow context. There are 2 cases: 
## Master container launched: the RM sets the flow context in the container 
launch context and starts it. This needs to be recovered during NM restart. 
## AM launches containers: flow context details are not set, so there is 
nothing to store and recover during NM restart, and it would be of no use anyway. 
# The additional null check for strings before creating the proto is there 
because the proto setter methods for strings throw an NPE if flowName or 
flowVersion is null. 

bq. FlowContext.toString(). Can we do something like {k1=v1, k2=v2, k3=v3} for 
better readability in the log?
Makes sense; I will change it in the next patch after Vrushali reviews it. 
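
On point 2, that behavior is general to protobuf-generated builders: the string 
setters throw an NPE when passed null, so the guard has to live in the caller. 
A tiny sketch (FlowContextProto is a hypothetical generated class; the real 
names depend on the .proto file):
{code}
// Protobuf string setters reject null with a NullPointerException,
// hence the explicit null checks before building the proto.
FlowContextProto.Builder builder = FlowContextProto.newBuilder();
if (flowName != null) {
  builder.setFlowName(flowName);
}
if (flowVersion != null) {
  builder.setFlowVersion(flowVersion);
}
FlowContextProto proto = builder.build();
{code}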


> Enable flow context read (& corresponding write) for recovering application 
> with NM restart 
> 
>
> Key: YARN-6555
> URL: https://issues.apache.org/jira/browse/YARN-6555
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-5355, YARN-5355-branch-2, 3.0.0-alpha3
>Reporter: Vrushali C
>Assignee: Rohith Sharma K S
> Attachments: YARN-6555.001.patch, YARN-6555.002.patch
>
>
> If timeline service v2 is enabled and the NM is restarted with recovery 
> enabled, then the NM fails to start, throwing the error "flow context can't 
> be null".
> This is happening because the flow context did not exist before, but now that 
> timeline service v2 is enabled, ApplicationImpl expects it to exist. 
> This would also happen even if the flow context existed before: since we are 
> not persisting/reading it during ContainerManagerImpl#recoverApplication, it 
> does not get passed in to ApplicationImpl.
> full stack trace
> {code}
> 2017-05-03 21:51:52,178 FATAL 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting 
> NodeManager
> java.lang.IllegalArgumentException: flow context cannot be null
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.<init>(ApplicationImpl.java:104)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.<init>(ApplicationImpl.java:90)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recoverApplication(ContainerManagerImpl.java:318)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:280)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:267)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:276)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:588)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:649)
> {code}






[jira] [Commented] (YARN-6575) Support global configuration mutation in MutableConfProvider

2017-05-23 Thread Jonathan Hung (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022189#comment-16022189
 ] 

Jonathan Hung commented on YARN-6575:
-

Thanks for the comments [~leftnoteasy],
bq. could we rename queues changes (in xml) to added-queues, removed-queues, 
updated-queues
How about add-queue, remove-queue, update-queue, since each xml object will be 
for a single queue?
bq. We should add support of changing preemption configs via refreshQueues 
shortly.
Do you mean in this feature, or a separate feature? I'm not sure 
preemption-related configs are in the scope of this feature, since the 
preemption configs are set on monitor initialization so can only be changed on 
RM restart, while this feature is for scheduler restart. We'd have to add extra 
support apart from calling CS.reinitialize to change preemption configs at 
runtime.

Not sure if there are any other configs without the yarn.scheduler.capacity 
prefix, as far as I can tell from looking at CapacitySchedulerConfiguration 
(will need to double-check this).

> Support global configuration mutation in MutableConfProvider
> 
>
> Key: YARN-6575
> URL: https://issues.apache.org/jira/browse/YARN-6575
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
> Attachments: YARN-6575-YARN-5734.001.patch
>
>
> Right now mutating configs assumes they are only queue configs. Support 
> should be added to mutate global scheduler configs.






[jira] [Updated] (YARN-6634) [API] Define an API for ResourceManager WebServices

2017-05-23 Thread Giovanni Matteo Fumarola (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giovanni Matteo Fumarola updated YARN-6634:
---
Attachment: YARN-6634.v1.patch

> [API] Define an API for ResourceManager WebServices
> ---
>
> Key: YARN-6634
> URL: https://issues.apache.org/jira/browse/YARN-6634
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.8.0
>Reporter: Subru Krishnan
>Assignee: Giovanni Matteo Fumarola
>Priority: Critical
> Attachments: YARN-6634.proto.patch, YARN-6634.v1.patch
>
>
> The RM exposes a few REST queries, but there's no clear API interface defined. 
> This makes it painful to build either clients or extension components like 
> Router (YARN-5412) that expose REST interfaces themselves. This jira proposes 
> adding a RM WebServices protocol similar to the one we have for RPC, i.e. 
> {{ApplicationClientProtocol}}.






[jira] [Commented] (YARN-6471) Support to add min/max resource configuration for a queue

2017-05-23 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022164#comment-16022164
 ] 

Wangda Tan commented on YARN-6471:
--

Thanks [~sunilg],

Questions/comments:

1. CSQueue:
- Naming is not consistent: should we add "normalize-up/down" to all 
"get-effective-*", or should we remove all "normalize*" but keep it in the 
javadocs? 

2. What is the story of non-exclusive node labels?

3. PriorityUtilizationQueueOrderingPolicy
{code}
Resource minEffRes1 = q1.getQueueResourceQuotas()
.getEffectiveMinResource(p);
Resource minEffRes2 = q2.getQueueResourceQuotas()
.getEffectiveMinResource(p);
if (!minEffRes1.equals(Resources.none())
&& !minEffRes2.equals(Resources.none())) {
  return minEffRes2.compareTo(minEffRes1);
}
{code}
Should we compare configured-resource instead of effective-resource? And should 
we use the flag which indicates whether it is a percentage or an absolute value?

4. Does this include changes to preemption, or does it just try not to break 
existing preemption logic when percentage capacity is used?

> Support to add min/max resource configuration for a queue
> -
>
> Key: YARN-6471
> URL: https://issues.apache.org/jira/browse/YARN-6471
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: YARN-6471.001.patch, YARN-6471.002.patch, 
> YARN-6471.003.patch, YARN-6471.004.patch, YARN-6471.005.patch, 
> YARN-6471.006.patch
>
>
> This jira will track the new configurations which are needed to configure min 
> resource and max resource of various resource types in a queue.
> For eg: 
> {noformat}
> yarn.scheduler.capacity.root.default.memory.min-resource
> yarn.scheduler.capacity.root.default.memory.max-resource
> yarn.scheduler.capacity.root.default.vcores.min-resource
> yarn.scheduler.capacity.root.default.vcores.max-resource
> {noformat}
> Uploading a patch soon






[jira] [Commented] (YARN-6575) Support global configuration mutation in MutableConfProvider

2017-05-23 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022154#comment-16022154
 ] 

Wangda Tan commented on YARN-6575:
--

Thanks [~jhung], 

Comments:

1) For SchedConfUpdateInfo, could we rename queues changes (in xml) to 
added-queues, removed-queues, updated-queues. And "global" to "global-updates".

2) Probably we should not add CapacitySchedulerConfiguration.PREFIX to given 
keys. For example, preemption-related configs are not start with 
CapacitySchedulerConfiguration.PREFIX. We should add support of changing 
preemption configs via refreshQueues shortly.

> Support global configuration mutation in MutableConfProvider
> 
>
> Key: YARN-6575
> URL: https://issues.apache.org/jira/browse/YARN-6575
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
> Attachments: YARN-6575-YARN-5734.001.patch
>
>
> Right now mutating configs assumes they are only queue configs. Support 
> should be added to mutate global scheduler configs.






[jira] [Commented] (YARN-6575) Support global configuration mutation in MutableConfProvider

2017-05-23 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022155#comment-16022155
 ] 

Wangda Tan commented on YARN-6575:
--

[~xgong] could you take a look at the patch as well?

> Support global configuration mutation in MutableConfProvider
> 
>
> Key: YARN-6575
> URL: https://issues.apache.org/jira/browse/YARN-6575
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
> Attachments: YARN-6575-YARN-5734.001.patch
>
>
> Right now mutating configs assumes they are only queue configs. Support 
> should be added to mutate global scheduler configs.






[jira] [Commented] (YARN-5892) Capacity Scheduler: Support user-specific minimum user limit percent

2017-05-23 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022151#comment-16022151
 ] 

Wangda Tan commented on YARN-5892:
--

Thanks [~eepayne], one minor comment:

1) Could you move CapacitySchedulerQueueManager#updateUserWeights to 
LeafQueue#setupQueueConfigs?

> Capacity Scheduler: Support user-specific minimum user limit percent
> 
>
> Key: YARN-5892
> URL: https://issues.apache.org/jira/browse/YARN-5892
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Reporter: Eric Payne
>Assignee: Eric Payne
> Attachments: Active users highlighted.jpg, YARN-5892.001.patch, 
> YARN-5892.002.patch, YARN-5892.003.patch, YARN-5892.004.patch, 
> YARN-5892.005.patch, YARN-5892.006.patch, YARN-5892.007.patch, 
> YARN-5892.008.patch, YARN-5892.009.patch, YARN-5892.010.patch, 
> YARN-5892.012.patch, YARN-5892.013.patch
>
>
> Currently, in the capacity scheduler, the {{minimum-user-limit-percent}} 
> property is per queue. A cluster admin should be able to set the minimum user 
> limit percent on a per-user basis within the queue.
> This functionality is needed so that when intra-queue preemption is enabled 
> (YARN-4945 / YARN-2113), some users can be deemed as more important than 
> other users, and resources from VIP users won't be as likely to be preempted.
> For example, if the {{getstuffdone}} queue has a MULP of 25 percent, but user 
> {{jane}} is a power user of queue {{getstuffdone}} and needs to be guaranteed 
> 75 percent, the properties for {{getstuffdone}} and {{jane}} would look like 
> this:
> {code}
> <property>
>   <name>yarn.scheduler.capacity.root.getstuffdone.minimum-user-limit-percent</name>
>   <value>25</value>
> </property>
> <property>
>   <name>yarn.scheduler.capacity.root.getstuffdone.jane.minimum-user-limit-percent</name>
>   <value>75</value>
> </property>
> {code}






[jira] [Commented] (YARN-6593) [API] Introduce Placement Constraint object

2017-05-23 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022072#comment-16022072
 ] 

Wangda Tan commented on YARN-6593:
--

bq. Let's avoid bold in the JIRAs without a reason guys 
I don't think so; next time I will put the reasons in bold / large / red font 
so you won't miss them :).

bq. We have a different approach in mind, but that's OK. Let's try to see the 
other person's point of view.
Yes, agreed; looping in [~jianhe]/[~sunilg]/[~vinodkv] for suggestions.

bq. So, Wangda Tan, you would prefer different subclasses for each type of 
constraint. Therefore, we will have one for the target, one for the 
cardinality, and one that combines both, right?
Yes, that's correct. 

> [API] Introduce Placement Constraint object
> ---
>
> Key: YARN-6593
> URL: https://issues.apache.org/jira/browse/YARN-6593
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Attachments: YARN-6593.001.patch, YARN-6593.002.patch
>
>
> This JIRA introduces an object for defining placement constraints.






[jira] [Commented] (YARN-6555) Enable flow context read (& corresponding write) for recovering application with NM restart

2017-05-23 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022046#comment-16022046
 ] 

Vrushali C commented on YARN-6555:
--

Thanks for picking this up Rohith, I will take a look at the patch and get back 
shortly. 

> Enable flow context read (& corresponding write) for recovering application 
> with NM restart 
> 
>
> Key: YARN-6555
> URL: https://issues.apache.org/jira/browse/YARN-6555
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-5355, YARN-5355-branch-2, 3.0.0-alpha3
>Reporter: Vrushali C
>Assignee: Rohith Sharma K S
> Attachments: YARN-6555.001.patch, YARN-6555.002.patch
>
>
> If timeline service v2 is enabled and the NM is restarted with recovery 
> enabled, then the NM fails to start, throwing the error "flow context can't 
> be null".
> This is happening because the flow context did not exist before, but now that 
> timeline service v2 is enabled, ApplicationImpl expects it to exist. 
> This would also happen even if the flow context existed before: since we are 
> not persisting/reading it during ContainerManagerImpl#recoverApplication, it 
> does not get passed in to ApplicationImpl.
> full stack trace
> {code}
> 2017-05-03 21:51:52,178 FATAL 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting 
> NodeManager
> java.lang.IllegalArgumentException: flow context cannot be null
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.<init>(ApplicationImpl.java:104)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.<init>(ApplicationImpl.java:90)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recoverApplication(ContainerManagerImpl.java:318)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:280)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:267)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:276)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:588)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:649)
> {code}






[jira] [Commented] (YARN-6316) Provide help information and documentation for TimelineSchemaCreator

2017-05-23 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022044#comment-16022044
 ] 

Vrushali C commented on YARN-6316:
--

Thanks [~haibo.chen]! The patch 00 looks good. 

I will wait for a day before committing this in.

> Provide help information and documentation for TimelineSchemaCreator
> 
>
> Key: YARN-6316
> URL: https://issues.apache.org/jira/browse/YARN-6316
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Li Lu
>Assignee: Haibo Chen
> Attachments: YARN-6316.00.patch, YARN-6316.prelim.patch
>
>
> Right now there is no help information for the timeline schema creator. We 
> probably want to provide an option to print help. Also, ideally, if users 
> pass in no arguments, we may want to print the help text instead of directly 
> creating the tables. This will simplify cluster operations and timeline v2 
> deployments. 






[jira] [Commented] (YARN-6047) Documentation updates for TimelineService v2

2017-05-23 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022043#comment-16022043
 ] 

Vrushali C commented on YARN-6047:
--

Thanks [~rohithsharma] for the patch.

I am wondering if we might want to say "records" instead of "flows" in the 
wording: "retrieve the next set of apps from the given fromid." 

Specifically for lines like 823, 954, 1157, the records returned are not 
actually flows but apps or entities. The fromid belongs to the flow, but it is 
basically like a row key prefix, if I understand correctly. 

At line 1299, there is:

{code}
If userid, flowname and flowrunid are not specified, we would have to fetch 
flow context information based on cluster and appid while executing the query.
{code}
Can this be reworded to say who would have to fetch? "We" meaning ATS will 
automatically fetch it, is that right? Maybe we can say: if these parameters 
are not specified, they will be retrieved based on cluster + app id and used to 
look up the information.


> Documentation updates for TimelineService v2
> 
>
> Key: YARN-6047
> URL: https://issues.apache.org/jira/browse/YARN-6047
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: documentation, timelineserver
>Reporter: Varun Saxena
>Assignee: Rohith Sharma K S
>  Labels: yarn-5355-merge-blocker
> Attachments: YARN-6047-YARN-5355.001.patch
>
>







[jira] [Commented] (YARN-6593) [API] Introduce Placement Constraint object

2017-05-23 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022030#comment-16022030
 ] 

Konstantinos Karanasos commented on YARN-6593:
--

Let's avoid bold in the JIRAs without a reason guys :)
We have a different approach in mind, but that's OK. Let's try to see the other 
person's point of view; it will help the end users too :)

The difference is minor; we have a POC that does not require differentiating 
constraint classes, so that will break the symmetry from the other side, but 
that's OK, if it will help us converge. I do find it a pity to add unnecessary 
subclasses, but I want to move forward.

So, [~leftnoteasy], you would prefer different subclasses for each type of 
constraint. Therefore, we will have one for the target, one for the 
cardinality, and one that combines both, right?

> [API] Introduce Placement Constraint object
> ---
>
> Key: YARN-6593
> URL: https://issues.apache.org/jira/browse/YARN-6593
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Attachments: YARN-6593.001.patch, YARN-6593.002.patch
>
>
> This JIRA introduces an object for defining placement constraints.






[jira] [Commented] (YARN-6593) [API] Introduce Placement Constraint object

2017-05-23 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16022006#comment-16022006
 ] 

Wangda Tan commented on YARN-6593:
--

bq. I don't see why we should not expose SimplePlacementConstraint externally 
as well.
That's because it's not understandable by users. We should not expose it as 
part of the user-facing API.

bq. In any case, we can have Builders/Validators that can take care of this 
internally right ?
bq. only once we get convinced that they will lead to more efficient 
implementation
As I have emphasized many times, the builder only solves the setting side; 
*we need a symmetric getter* as well.

bq. Since this feature is still nascent and in the process of development, we 
should probably err on the side of flexibility rather than stricter syntax. All 
dependent code (YARN native services) work can go on in parallel. One we have a 
working implementation, and we are close to a release, we can "late-bind" to a 
more restrictive API.
This is the foundation of the feature; I don't think we should move ahead 
before we reach consensus. I'm fine with delaying the string-based API and 
other fancier builders.

> [API] Introduce Placement Constraint object
> ---
>
> Key: YARN-6593
> URL: https://issues.apache.org/jira/browse/YARN-6593
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Attachments: YARN-6593.001.patch, YARN-6593.002.patch
>
>
> This JIRA introduces an object for defining placement constraints.






[jira] [Commented] (YARN-6593) [API] Introduce Placement Constraint object

2017-05-23 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021971#comment-16021971
 ] 

Konstantinos Karanasos commented on YARN-6593:
--

bq. If the scheduler can handle a single class efficiently, we don't need a 
separate representation in scheduler side. However as I stated, we don't know 
this yet.
We should then create subclasses (that will not be consistent with the 
protobufs) only once we are convinced that they will lead to a more efficient 
implementation. Why add subclasses without knowing this is true?

The users will see the different constraint types through the builder, as in 
{{addTargetConstraint}}, {{addCardinalityConstraint}}. If we see that adding 
them as subclasses is important for the implementation, we can certainly go 
ahead and add them later. The API in the builders will not need to change.

> [API] Introduce Placement Constraint object
> ---
>
> Key: YARN-6593
> URL: https://issues.apache.org/jira/browse/YARN-6593
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Attachments: YARN-6593.001.patch, YARN-6593.002.patch
>
>
> This JIRA introduces an object for defining placement constraints.






[jira] [Comment Edited] (YARN-6593) [API] Introduce Placement Constraint object

2017-05-23 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021954#comment-16021954
 ] 

Arun Suresh edited comment on YARN-6593 at 5/23/17 10:09 PM:
-

Given that what ideally should be perceived as the API is the proto 
definitions, and since we agreed to merge both TargetConstraint and 
CardinalityConstraint as a single proto struct, I don't see why we should not 
expose SimplePlacementConstraint externally as well.

In any case, we can have Builders/Validators that can take care of this 
internally right ? 

Since this feature is still nascent and under development, we should probably 
err on the side of flexibility rather than stricter syntax. All dependent work 
(YARN native services) can go on in parallel. Once we have a working 
implementation and we are close to a release, we can "late-bind" to a more 
restrictive API.


was (Author: asuresh):
Given that what ideally should be perceived as the API is the proto 
definitions, and since we agreed to merge both TargetConstraint and 
CardinalityConstraint as a single proto struct, I don't see why we should not 
expose SimplePlacementConstraint externally as well.

In any case, we can have Builders/Validators that can take care of this 
internally right ? 

> [API] Introduce Placement Constraint object
> ---
>
> Key: YARN-6593
> URL: https://issues.apache.org/jira/browse/YARN-6593
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Attachments: YARN-6593.001.patch, YARN-6593.002.patch
>
>
> This JIRA introduces an object for defining placement constraints.






[jira] [Commented] (YARN-6593) [API] Introduce Placement Constraint object

2017-05-23 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021954#comment-16021954
 ] 

Arun Suresh commented on YARN-6593:
---

Given that what ideally should be perceived as the API is the proto 
definitions, and since we agreed to merge both TargetConstraint and 
CardinalityConstraint as a single proto struct, I don't see why we should not 
expose SimplePlacementConstraint externally as well.

In any case, we can have Builders/Validators that can take care of this 
internally right ? 

> [API] Introduce Placement Constraint object
> ---
>
> Key: YARN-6593
> URL: https://issues.apache.org/jira/browse/YARN-6593
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Attachments: YARN-6593.001.patch, YARN-6593.002.patch
>
>
> This JIRA introduces an object for defining placement constraints.






[jira] [Commented] (YARN-3409) Add constraint node labels

2017-05-23 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021942#comment-16021942
 ] 

Daniel Templeton commented on YARN-3409:


Sorry for coming late to the conversation.  Last week I had a quick chat with 
[~Naganarasimha] offline about the plans, and I wanted to share an alternate 
perspective.

If you go look at the way HPC job schedulers (like Grid Engine et al) handle 
this requirement, it's an extension of resources.  The work that [~vvasudev] 
has done on resource types opens up a natural path to add "static" resource 
types with the characteristics described here.  The advantage is that the 
plumbing for resources is already very mature, and extending it to support 
static resources would not introduce much in the way of new logic.  The 
implementation of constraints then naturally becomes a superset of resource 
matching for the consumable resources.  The disadvantage that [~Naganarasimha] 
pointed out is that users would have to understand that resources can be static 
or consumable, which is a higher bar than just asserting that all resources are 
consumable. Given that all the major HPC job schedulers have been using static 
resources for this purpose successfully for decades, I don't see that being a 
major issue.

To add a little more detail, here's what Grid Engine does (the parts relevant 
to us).  (See 
http://gridscheduler.sourceforge.net/htmlman/htmlman5/complex.html)
* All resources have a type, e.g. string, double, boolean, etc.
* All resources have an associated relational operator.  For example, the 
memory resource has >= as its relational operator, meaning that a request for 
4GB of memory is treated as >= 4GB of memory.  In general, resources can only 
be meaningfully compared in one direction.
* All resources are either consumable or static.  Only numeric resources can be 
consumable.
* Memory and CPU (and a couple others) are provided implicitly by the system.
* It's possible to configure the agents to run scripts periodically to 
programmatically determine values for any resources. Consumable resources 
decrement from that value.
* The scheduler uses the relational operator for all resources to determine 
whether resource requests fit a destination queue/host.

Putting static resources and consumables in the same boat saves a fair bit of 
logic duplication in implementing things like programmatically determined 
values.
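
A minimal sketch of the matching rule described above (hypothetical types; 
neither Grid Engine nor YARN code): every resource definition carries a 
relational operator, and the same matching path serves both static and 
consumable resources.
{code}
// Hypothetical model of Grid Engine-style resources.
enum RelOp { GE, LE, EQ }

class ResourceDef {
  final String name;
  final RelOp op;            // e.g. memory uses GE: requesting 4GB means ">= 4GB"
  final boolean consumable;  // only numeric resources may be consumable

  ResourceDef(String name, RelOp op, boolean consumable) {
    this.name = name;
    this.op = op;
    this.consumable = consumable;
  }

  // Does a host offering 'available' satisfy a request for 'requested'?
  boolean fits(double available, double requested) {
    switch (op) {
      case GE: return available >= requested;
      case LE: return available <= requested;
      default: return available == requested;  // EQ: static match, e.g. CPU type
    }
  }
}
{code}
A consumable resource like memory would additionally be decremented on 
allocation, while a static resource such as CPU type or OS only participates in 
the comparison.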

> Add constraint node labels
> --
>
> Key: YARN-3409
> URL: https://issues.apache.org/jira/browse/YARN-3409
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, capacityscheduler, client
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: Constraint-Node-Labels-Requirements-Design-doc_v1.pdf, 
> YARN-3409.WIP.001.patch
>
>
> Specifying only one label for each node (in other words, partitioning a 
> cluster) is a way to determine how the resources of a special set of nodes 
> can be shared by a group of entities (like teams, departments, etc.). 
> Partitions of a cluster have the following characteristics:
> - Cluster divided into several disjoint sub-clusters.
> - ACL/priority can apply on a partition (only the market team has priority to 
> use the partition).
> - Percentage of capacities can apply on a partition (the market team has 40% 
> minimum capacity and the dev team has 60% minimum capacity of the partition).
> Constraints are orthogonal to partitions; they describe attributes of a 
> node's hardware/software, just for affinity. Some examples of constraints:
> - glibc version
> - JDK version
> - Type of CPU (x86_64/i686)
> - Type of OS (windows, linux, etc.)
> With this, an application can ask for resources that satisfy (glibc.version 
> >= 2.20 && JDK.version >= 8u20 && x86_64).






[jira] [Commented] (YARN-6634) [API] Define an API for ResourceManager WebServices

2017-05-23 Thread Giovanni Matteo Fumarola (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021927#comment-16021927
 ] 

Giovanni Matteo Fumarola commented on YARN-6634:


Thanks [~leftnoteasy] for the quick review. I agree with you for the three 
comments.

> [API] Define an API for ResourceManager WebServices
> ---
>
> Key: YARN-6634
> URL: https://issues.apache.org/jira/browse/YARN-6634
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.8.0
>Reporter: Subru Krishnan
>Assignee: Giovanni Matteo Fumarola
>Priority: Critical
> Attachments: YARN-6634.proto.patch
>
>
> The RM exposes a few REST queries, but there's no clear API interface defined. 
> This makes it painful to build either clients or extension components like 
> Router (YARN-5412) that expose REST interfaces themselves. This jira proposes 
> adding a RM WebServices protocol similar to the one we have for RPC, i.e. 
> {{ApplicationClientProtocol}}.






[jira] [Commented] (YARN-6593) [API] Introduce Placement Constraint object

2017-05-23 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021919#comment-16021919
 ] 

Wangda Tan commented on YARN-6593:
--

[~arun.sur...@gmail.com], 

Regardless of implementation, I think we should clearly define it before moving 
forward. I'm OK with deferring some language-sugar APIs like the string-based 
one or some fancier builder patterns, etc. However, 
TargetConstraint/CardinalityConstraint are what we clearly defined in the 
design doc, and we should follow that. This API is targeted at YARN app 
developers for future use; if we cannot converge among the few of us here, how 
can we make it widely used by YARN community developers?

> [API] Introduce Placement Constraint object
> ---
>
> Key: YARN-6593
> URL: https://issues.apache.org/jira/browse/YARN-6593
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Attachments: YARN-6593.001.patch, YARN-6593.002.patch
>
>
> This JIRA introduces an object for defining placement constraints.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6593) [API] Introduce Placement Constraint object

2017-05-23 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021902#comment-16021902
 ] 

Arun Suresh commented on YARN-6593:
---

Guys, I feel this discussion is not tending toward convergence.

To move this forward, I recommend that for this JIRA we just stick to the 
SimplePlacementConstraint class, thereby keeping it as close as possible to the 
.proto definitions.
[~leftnoteasy], I see your argument about how specializing 
SimplePlacementConstraint into subclasses might enable implementation 
optimizations, but let us perhaps track that in a separate JIRA and tackle it 
in conjunction with the implementation work?

> [API] Introduce Placement Constraint object
> ---
>
> Key: YARN-6593
> URL: https://issues.apache.org/jira/browse/YARN-6593
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Attachments: YARN-6593.001.patch, YARN-6593.002.patch
>
>
> This JIRA introduces an object for defining placement constraints.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6593) [API] Introduce Placement Constraint object

2017-05-23 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021880#comment-16021880
 ] 

Wangda Tan commented on YARN-6593:
--

[~kkaranasos],
bq. You see my point?
I see your point; however, I don't think we should assume scheduler changes 
before we start writing code. I can still see value in handling separate 
constraints (e.g., to optimize logic) instead of operating on a three-field one.

bq. The end user API can still evolve, as long as it remains backwards 
compatible.
I don't think my suggestion changes how we handle backward compatibility 
either, since my suggested approach doesn't change the structures defined in 
the proto file. 

bq. If the scheduler does not need to do any if statement (as I explain above), 
what is the advantage of having separate classes for each constraint type 
(target, cardinality, and targetCardinality/admin)? Is there an implementation 
concern?
If the scheduler can handle a single class efficiently, we don't need a 
separate representation on the scheduler side. However, as I stated, we don't 
know this yet. 

Also, as I mentioned several times, users can access/understand constraints 
more easily with separate classes.



> [API] Introduce Placement Constraint object
> ---
>
> Key: YARN-6593
> URL: https://issues.apache.org/jira/browse/YARN-6593
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Attachments: YARN-6593.001.patch, YARN-6593.002.patch
>
>
> This JIRA introduces an object for defining placement constraints.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6628) Unexpected jackson-core-2.2.3 dependency introduced

2017-05-23 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated YARN-6628:
--
Attachment: YARN-6628.2.patch

> Unexpected jackson-core-2.2.3 dependency introduced
> ---
>
> Key: YARN-6628
> URL: https://issues.apache.org/jira/browse/YARN-6628
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.8.1
>Reporter: Jason Lowe
>Assignee: Jonathan Eagles
>Priority: Blocker
> Attachments: YARN-6628.1.patch
>
>
> The change in YARN-5894 caused jackson-core-2.2.3.jar to be added in 
> share/hadoop/yarn/lib/. This added dependency seems to be incompatible with 
> jackson-core-asl-1.9.13.jar which is also shipped as a dependency.  This new 
> jackson-core jar ends up breaking jobs that ran fine on 2.8.0.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6628) Unexpected jackson-core-2.2.3 dependency introduced

2017-05-23 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated YARN-6628:
--
Attachment: (was: YARN-6628.2.patch)

> Unexpected jackson-core-2.2.3 dependency introduced
> ---
>
> Key: YARN-6628
> URL: https://issues.apache.org/jira/browse/YARN-6628
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.8.1
>Reporter: Jason Lowe
>Assignee: Jonathan Eagles
>Priority: Blocker
> Attachments: YARN-6628.1.patch
>
>
> The change in YARN-5894 caused jackson-core-2.2.3.jar to be added in 
> share/hadoop/yarn/lib/. This added dependency seems to be incompatible with 
> jackson-core-asl-1.9.13.jar which is also shipped as a dependency.  This new 
> jackson-core jar ends up breaking jobs that ran fine on 2.8.0.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6634) [API] Define an API for ResourceManager WebServices

2017-05-23 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021870#comment-16021870
 ] 

Wangda Tan commented on YARN-6634:
--

Thanks [~giovanni.fumarola].

I talked to [~subru] offline; in general, the approach looks good to me. 

A few thoughts:

1) IIUC, ClientWebServiceProtocol is for RMWebServices, so do you think it is 
better to call it RMWebServicesProtocol?

2) REST API compatibility is described in 
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html#REST_APIs.
So instead of marking {{ClientWebServiceProtocol}} as {{public/stable}}, I 
suggest making ClientWebServiceProtocol an internal API which will be shared 
by RMWebServices implementations.

3) For ClientWebServiceProtocolUtil, is it better to rename it to 
{$class-name}Constants, e.g. {{RMWebServicesProtocolConstants}}?
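
For illustration, a minimal sketch of what such a protocol interface could 
look like, using the RMWebServicesProtocol name suggested in (1). The method 
set and the plain String return types are assumptions for brevity, not the 
actual patch; the commented REST paths are the existing RM endpoints:

{code}
public interface RMWebServicesProtocol {
  /** GET /ws/v1/cluster/info */
  String getClusterInfo();

  /** GET /ws/v1/cluster/apps?state=...&user=... */
  String getApps(String stateQuery, String userQuery);

  /** PUT /ws/v1/cluster/apps/{appid}/state */
  String updateAppState(String appId, String targetState);
}
{code}

Both RMWebServices and a component like the Router could then implement this 
one interface, which is the point of the proposal.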

> [API] Define an API for ResourceManager WebServices
> ---
>
> Key: YARN-6634
> URL: https://issues.apache.org/jira/browse/YARN-6634
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.8.0
>Reporter: Subru Krishnan
>Assignee: Giovanni Matteo Fumarola
>Priority: Critical
> Attachments: YARN-6634.proto.patch
>
>
> The RM exposes a few REST queries, but there's no clear API interface 
> defined. This makes it painful to build either clients or extension 
> components like Router (YARN-5412) that expose REST interfaces themselves. 
> This JIRA proposes adding an RM WebServices protocol similar to the one we 
> have for RPC, i.e. {{ApplicationClientProtocol}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-6634) [API] Define an API for ResourceManager WebServices

2017-05-23 Thread Giovanni Matteo Fumarola (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021851#comment-16021851
 ] 

Giovanni Matteo Fumarola edited comment on YARN-6634 at 5/23/17 9:00 PM:
-

Thanks [~subru] for creating this JIRA. I attached a prototype of a possible 
solution. I still have to add some missing javadocs in the protocol class and 
check all the JUnits to decrease the number of hard-coded strings. By applying 
hadoop-format.xml, RMWebServices.java changed a bit.


was (Author: giovanni.fumarola):
Thanks [~subru] for creating this JIRA. I attached a prototype of a possible 
solution. I still have to add some missing javadocs in the protocol class and 
check all the JUnits to decrease the number of hard-coded strings.  

> [API] Define an API for ResourceManager WebServices
> ---
>
> Key: YARN-6634
> URL: https://issues.apache.org/jira/browse/YARN-6634
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.8.0
>Reporter: Subru Krishnan
>Assignee: Giovanni Matteo Fumarola
>Priority: Critical
> Attachments: YARN-6634.proto.patch
>
>
> The RM exposes a few REST queries, but there's no clear API interface 
> defined. This makes it painful to build either clients or extension 
> components like Router (YARN-5412) that expose REST interfaces themselves. 
> This JIRA proposes adding an RM WebServices protocol similar to the one we 
> have for RPC, i.e. {{ApplicationClientProtocol}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6634) [API] Define an API for ResourceManager WebServices

2017-05-23 Thread Giovanni Matteo Fumarola (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021851#comment-16021851
 ] 

Giovanni Matteo Fumarola commented on YARN-6634:


Thanks [~subru] for creating this JIRA. I attached a prototype of a possible 
solution. I still have to add some missing javadocs in the protocol class and 
check all the JUnits to decrease the number of hard-coded strings.  

> [API] Define an API for ResourceManager WebServices
> ---
>
> Key: YARN-6634
> URL: https://issues.apache.org/jira/browse/YARN-6634
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.8.0
>Reporter: Subru Krishnan
>Assignee: Giovanni Matteo Fumarola
>Priority: Critical
> Attachments: YARN-6634.proto.patch
>
>
> The RM exposes a few REST queries, but there's no clear API interface 
> defined. This makes it painful to build either clients or extension 
> components like Router (YARN-5412) that expose REST interfaces themselves. 
> This JIRA proposes adding an RM WebServices protocol similar to the one we 
> have for RPC, i.e. {{ApplicationClientProtocol}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6634) [API] Define an API for ResourceManager WebServices

2017-05-23 Thread Giovanni Matteo Fumarola (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giovanni Matteo Fumarola updated YARN-6634:
---
Attachment: YARN-6634.proto.patch

> [API] Define an API for ResourceManager WebServices
> ---
>
> Key: YARN-6634
> URL: https://issues.apache.org/jira/browse/YARN-6634
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.8.0
>Reporter: Subru Krishnan
>Assignee: Giovanni Matteo Fumarola
>Priority: Critical
> Attachments: YARN-6634.proto.patch
>
>
> The RM exposes a few REST queries, but there's no clear API interface 
> defined. This makes it painful to build either clients or extension 
> components like Router (YARN-5412) that expose REST interfaces themselves. 
> This JIRA proposes adding an RM WebServices protocol similar to the one we 
> have for RPC, i.e. {{ApplicationClientProtocol}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6593) [API] Introduce Placement Constraint object

2017-05-23 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021836#comment-16021836
 ] 

Konstantinos Karanasos commented on YARN-6593:
--

[~leftnoteasy], what I am trying to say is that the scheduler does not need to 
do any if checks.
It will internally handle one type of constraint that includes cardinality, 
scope, and target. If one of them happens to be a default value (self for 
target, 0 for min cardinality, inf for max cardinality), it does not change 
anything from the scheduler's perspective. After all, the scheduler has to deal 
with the constraint that specifies all three fields (what we called operator 
constraints in the doc), so we would just duplicate code if we had three ifs.
Do you see my point?

The end user API can still evolve, as long as it remains backwards compatible. 
For instance, when we add the admin constraints, we can simply add an 
additional builder utility method, without requiring changes in any other part 
of the code. Similarly for the string representation of the constraints.

If the scheduler does not need to do any if statement (as I explain above), 
what is the advantage of having separate classes for each constraint type 
(target, cardinality, and targetCardinality/admin)? Is there an implementation 
concern?
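
For illustration, a minimal sketch of the single-constraint representation 
described above, where default field values make it degenerate into a pure 
target or a pure cardinality constraint (class and method names are 
hypothetical, not the patch's):

{code}
// One class covers target, cardinality and operator constraints.
final class SimpleConstraint {
  final String scope;       // e.g. "node" or "rack"
  final String targetTag;   // "self" for pure cardinality constraints
  final int minCardinality; // 0 => no lower bound
  final int maxCardinality; // Integer.MAX_VALUE => "infinite"

  SimpleConstraint(String scope, String targetTag, int min, int max) {
    this.scope = scope;
    this.targetTag = targetTag;
    this.minCardinality = min;
    this.maxCardinality = max;
  }

  // Pure target constraint: cardinality stays at its defaults (0, infinite).
  static SimpleConstraint target(String scope, String tag) {
    return new SimpleConstraint(scope, tag, 0, Integer.MAX_VALUE);
  }

  // Pure cardinality constraint: target defaults to "self".
  static SimpleConstraint cardinality(String scope, int min, int max) {
    return new SimpleConstraint(scope, "self", min, max);
  }
}
{code}

The scheduler would always evaluate the same four fields, which is the 
"no if statements" property argued for above.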

> [API] Introduce Placement Constraint object
> ---
>
> Key: YARN-6593
> URL: https://issues.apache.org/jira/browse/YARN-6593
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Attachments: YARN-6593.001.patch, YARN-6593.002.patch
>
>
> This JIRA introduces an object for defining placement constraints.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6593) [API] Introduce Placement Constraint object

2017-05-23 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021765#comment-16021765
 ] 

Wangda Tan commented on YARN-6593:
--

[~kkaranasos], 

bq. That's exactly the advantage of having a single simple constraint. When it 
is a target constraint, min cardinality is 0 and max cardinality is infinite. 
When it is a cardinality constraint, the target is yourself. This way the 
scheduler has to deal with a single placement constraint. No need for checking 
constraint types at any point.
This is what I want to avoid: implicitly defined internal fields make it harder 
for the scheduler to handle and for users to understand.

Pseudo code of what you were describing:
{code}
SimplePlacementConstraint c = ...;

if (c.getMinCardinality() == 0 && c.getMaxCardinality() == INF) {
  // this is a target constraint
  TargetExpression t = c.getTargetExpression();
} else if (c.getTarget() == null /* or self */) {
  int minCardinality = c.getMinCardinality();
  int maxCardinality = c.getMaxCardinality();
} else {
  // throw an exception; setting all 3 fields is not allowed
}

// Wherever else we need the target / cardinality view, we have to copy the
// above logic.
{code}

Instead, what we can do is:
{code}
SimplePlacementConstraint c = ...;
if (c.getType() == ConstraintType.TARGET) {
  TargetConstraint tc = (TargetConstraint) c;
} else if (c.getType() == ConstraintType.CARDINALITY) {
  CardinalityConstraint cc = (CardinalityConstraint) c;
}

// tc and cc can be saved; we don't have to do the cast every time.
{code}

bq. We keep it simple for the applications to express constraints through the 
utility methods, and we keep it simple for the scheduler to deal with a single 
constraint type.
But it does not let applications easily access the constraints they have added.

bq. Moreover, if we have multiple classes for simple constraints, when building 
the objects from the protobufs on the scheduler side, we will have to cast to 
each constraint class. And if we decide to change those classes in the future 
(to expose a different way for users to write constraints), we will have to 
change the way the scheduler deals with constraints, which should really be 
avoided.

This is an end-user API; I don't think we should change it rapidly. YARN 
applications (like native services) will consume it once it is available. That 
is why I want to keep it stable and easy to use from the beginning.

> [API] Introduce Placement Constraint object
> ---
>
> Key: YARN-6593
> URL: https://issues.apache.org/jira/browse/YARN-6593
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Attachments: YARN-6593.001.patch, YARN-6593.002.patch
>
>
> This JIRA introduces an object for defining placement constraints.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6636) Fair Scheduler: respect node labels at resource request level

2017-05-23 Thread Ashwin Shankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashwin Shankar updated YARN-6636:
-
Description: This ticket is to track changes to fair scheduler to respect 
node labels at resource request level. When the client sets labels at resource 
request level, the scheduler must schedule those containers only on those nodes 
with that label.   (was: This ticket is to track changes to fair scheduler to 
respect node labels at resource request level. When the client sets labels at 
resource request level, the scheduler must schedule those containers only on 
those nodes.)

> Fair Scheduler: respect node labels at resource request level
> -
>
> Key: YARN-6636
> URL: https://issues.apache.org/jira/browse/YARN-6636
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Ashwin Shankar
>Assignee: Ashwin Shankar
>
> This ticket is to track changes to fair scheduler to respect node labels at 
> resource request level. When the client sets labels at resource request 
> level, the scheduler must schedule those containers only on those nodes with 
> that label. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6636) Fair Scheduler: respect node labels at resource request level

2017-05-23 Thread Ashwin Shankar (JIRA)
Ashwin Shankar created YARN-6636:


 Summary: Fair Scheduler: respect node labels at resource request 
level
 Key: YARN-6636
 URL: https://issues.apache.org/jira/browse/YARN-6636
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: fairscheduler
Reporter: Ashwin Shankar
Assignee: Ashwin Shankar


This ticket is to track changes to fair scheduler to respect node labels at 
resource request level. When the client sets labels at resource request level, 
the scheduler must schedule those containers only on those nodes.
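
For context, the client-side API for per-request labels already exists; e.g. 
(the "gpu" label is just an example):

{code}
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.api.records.ResourceRequest;

// Ask for 2 containers of 1 GB / 1 vcore anywhere in the cluster,
// restricted to nodes labeled "gpu".
ResourceRequest req = ResourceRequest.newInstance(
    Priority.newInstance(1), ResourceRequest.ANY,
    Resource.newInstance(1024, 1), 2);
req.setNodeLabelExpression("gpu");
{code}

This JIRA is about making FairScheduler honor that expression when placing the 
containers.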



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5531) UnmanagedAM pool manager for federating application across clusters

2017-05-23 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021664#comment-16021664
 ] 

Karthik Kambatla commented on YARN-5531:


Sorry for the delay on this, Botong. Will take a look today if nothing urgent 
comes up at work. 

> UnmanagedAM pool manager for federating application across clusters
> ---
>
> Key: YARN-5531
> URL: https://issues.apache.org/jira/browse/YARN-5531
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Botong Huang
> Attachments: YARN-5531-YARN-2915.v10.patch, 
> YARN-5531-YARN-2915.v11.patch, YARN-5531-YARN-2915.v1.patch, 
> YARN-5531-YARN-2915.v2.patch, YARN-5531-YARN-2915.v3.patch, 
> YARN-5531-YARN-2915.v4.patch, YARN-5531-YARN-2915.v5.patch, 
> YARN-5531-YARN-2915.v6.patch, YARN-5531-YARN-2915.v7.patch, 
> YARN-5531-YARN-2915.v8.patch, YARN-5531-YARN-2915.v9.patch
>
>
> One of the main tenets of YARN Federation is to *transparently* scale 
> applications across multiple clusters. This is achieved by running UAMs on 
> behalf of the application on other clusters. This JIRA tracks the addition of 
> an UnmanagedAM pool manager for federating applications across clusters, 
> which will be used by the FederationInterceptor (YARN-3666) that is part of 
> the AMRMProxy pipeline introduced in YARN-2884.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-6555) Enable flow context read (& corresponding write) for recovering application with NM restart

2017-05-23 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021658#comment-16021658
 ] 

Haibo Chen edited comment on YARN-6555 at 5/23/17 6:51 PM:
---

Thanks [~rohithsharma] for the patch! A few comments.
1) Do you think we should preserve as much flow context information as 
possible? The patch stores the flow context in the state store only if all 
three fields of the flow context are present. We could sanitize the flow 
context, fill in default values for whatever fields are missing, and then just 
check flowContext != null before storing the application state. Thoughts?
2) FlowContext.toString(). Can we do something like \{k1=v1, k2=v2, k3=v3\} for 
better readability in the log?


was (Author: haibochen):
Thanks [~rohithsharma] for the patch! A few comments.
1) Do you think we should preserve as much flow context information as 
possible? The patch stores the flow context in the state store only if all 
three fields of the flow context are present. We could sanitize the flow 
context, fill in default values for whatever fields are missing, and then just 
check flowContext != null before storing the application state. Thoughts?
2) FlowContext.toString(). Can we do something like {k1=v1, k2=v2, k3=v3} \for 
better readability in the log?

> Enable flow context read (& corresponding write) for recovering application 
> with NM restart 
> 
>
> Key: YARN-6555
> URL: https://issues.apache.org/jira/browse/YARN-6555
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-5355, YARN-5355-branch-2, 3.0.0-alpha3
>Reporter: Vrushali C
>Assignee: Rohith Sharma K S
> Attachments: YARN-6555.001.patch, YARN-6555.002.patch
>
>
> If timeline service v2 is enabled and the NM is restarted with recovery 
> enabled, then the NM fails to start and throws the error "flow context can't 
> be null".
> This happens because the flow context did not exist before, but now that 
> timeline service v2 is enabled, ApplicationImpl expects it to exist. 
> This would also happen even if the flow context existed before, because we 
> are not persisting/reading it during 
> ContainerManagerImpl#recoverApplication, so it does not get passed in to 
> ApplicationImpl.
> full stack trace
> {code}
> 2017-05-03 21:51:52,178 FATAL 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting 
> NodeManager
> java.lang.IllegalArgumentException: flow context cannot be null
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.<init>(ApplicationImpl.java:104)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.<init>(ApplicationImpl.java:90)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recoverApplication(ContainerManagerImpl.java:318)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:280)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:267)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:276)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:588)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:649)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-6555) Enable flow context read (& corresponding write) for recovering application with NM restart

2017-05-23 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021658#comment-16021658
 ] 

Haibo Chen edited comment on YARN-6555 at 5/23/17 6:51 PM:
---

Thanks [~rohithsharma] for the patch! A few comments.
1) Do you think we should preserve as much flow context information as 
possible? The patch stores the flow context in the state store only if all 
three fields of the flow context are present. We could sanitize the flow 
context, fill in default values for whatever fields are missing, and then just 
check flowContext != null before storing the application state. Thoughts?
2) FlowContext.toString(). Can we do something like {k1=v1, k2=v2, k3=v3} \for 
better readability in the log?


was (Author: haibochen):
Thanks [~rohithsharma] for the patch! A few comments.
1) Do you think we should preserve as much flow context information as 
possible? The patch stores the flow context in the state store only if all 
three fields of the flow context are present. We could sanitize the flow 
context, fill in default values for whatever fields are missing, and then just 
check flowContext != null before storing the application state. Thoughts?
2) FlowContext.toString(). Can we do something like {k1=v1, k2=v2, k3=v3} for 
better readability in the log?

> Enable flow context read (& corresponding write) for recovering application 
> with NM restart 
> 
>
> Key: YARN-6555
> URL: https://issues.apache.org/jira/browse/YARN-6555
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-5355, YARN-5355-branch-2, 3.0.0-alpha3
>Reporter: Vrushali C
>Assignee: Rohith Sharma K S
> Attachments: YARN-6555.001.patch, YARN-6555.002.patch
>
>
> If timeline service v2 is enabled and the NM is restarted with recovery 
> enabled, then the NM fails to start and throws the error "flow context can't 
> be null".
> This happens because the flow context did not exist before, but now that 
> timeline service v2 is enabled, ApplicationImpl expects it to exist. 
> This would also happen even if the flow context existed before, because we 
> are not persisting/reading it during 
> ContainerManagerImpl#recoverApplication, so it does not get passed in to 
> ApplicationImpl.
> full stack trace
> {code}
> 2017-05-03 21:51:52,178 FATAL 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting 
> NodeManager
> java.lang.IllegalArgumentException: flow context cannot be null
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.<init>(ApplicationImpl.java:104)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.<init>(ApplicationImpl.java:90)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recoverApplication(ContainerManagerImpl.java:318)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:280)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:267)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:276)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:588)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:649)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6555) Enable flow context read (& corresponding write) for recovering application with NM restart

2017-05-23 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021658#comment-16021658
 ] 

Haibo Chen commented on YARN-6555:
--

Thanks [~rohithsharma] for the patch! A few comments.
1) Do you think we should preserve as much flow context information as 
possible? The patch stores the flow context in the state store only if all 
three fields of the flow context are present. We could sanitize the flow 
context, fill in default values for whatever fields are missing, and then just 
check flowContext != null before storing the application state. Thoughts?
2) FlowContext.toString(). Can we do something like {k1=v1, k2=v2, k3=v3} for 
better readability in the log?
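
For illustration, a self-contained sketch of the sanitization in (1) and the 
k=v rendering in (2); the class and field names are hypothetical, not the NM's 
actual FlowContext:

{code}
final class FlowCtx {
  final String name;
  final String version;
  final long runId;

  FlowCtx(String name, String version, long runId) {
    this.name = name;
    this.version = version;
    this.runId = runId;
  }

  // (1) Fill in defaults for missing fields so the context can always be
  // persisted, instead of being dropped when any field is absent.
  static FlowCtx sanitize(FlowCtx c, String appId, long submitTime) {
    String name = (c != null && c.name != null) ? c.name : appId;
    String version = (c != null && c.version != null) ? c.version : "1";
    long runId = (c != null && c.runId != 0) ? c.runId : submitTime;
    return new FlowCtx(name, version, runId);
  }

  // (2) {k1=v1, k2=v2, k3=v3} rendering for readable logs.
  @Override
  public String toString() {
    return "{flowName=" + name + ", flowVersion=" + version
        + ", flowRunId=" + runId + "}";
  }
}
{code}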

> Enable flow context read (& corresponding write) for recovering application 
> with NM restart 
> 
>
> Key: YARN-6555
> URL: https://issues.apache.org/jira/browse/YARN-6555
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-5355, YARN-5355-branch-2, 3.0.0-alpha3
>Reporter: Vrushali C
>Assignee: Rohith Sharma K S
> Attachments: YARN-6555.001.patch, YARN-6555.002.patch
>
>
> If timeline service v2 is enabled and the NM is restarted with recovery 
> enabled, then the NM fails to start and throws the error "flow context can't 
> be null".
> This happens because the flow context did not exist before, but now that 
> timeline service v2 is enabled, ApplicationImpl expects it to exist. 
> This would also happen even if the flow context existed before, because we 
> are not persisting/reading it during 
> ContainerManagerImpl#recoverApplication, so it does not get passed in to 
> ApplicationImpl.
> full stack trace
> {code}
> 2017-05-03 21:51:52,178 FATAL 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting 
> NodeManager
> java.lang.IllegalArgumentException: flow context cannot be null
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.<init>(ApplicationImpl.java:104)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.<init>(ApplicationImpl.java:90)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recoverApplication(ContainerManagerImpl.java:318)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:280)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:267)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:276)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:588)
> at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:649)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6593) [API] Introduce Placement Constraint object

2017-05-23 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021657#comment-16021657
 ] 

Konstantinos Karanasos commented on YARN-6593:
--

bq. otherwise we have to check constraint-type everywhere
That's exactly the advantage of having a single simple constraint. When it is a 
target constraint, min cardinality is 0 and max cardinality is infinite. When 
it is a cardinality constraint, the target is yourself. This way the scheduler 
has to deal with a single placement constraint. No need for checking 
constraint types at any point.

We keep it simple for the applications to express constraints through the 
utility methods, and we keep it simple for the scheduler to deal with a single 
constraint type.

Moreover, if we have multiple classes for simple constraints, when building 
the objects from the protobufs on the scheduler side, we will have to cast to 
each constraint class. And if we decide to change those classes in the future 
(to expose a different way for users to write constraints), we will have to 
change the way the scheduler deals with constraints, which should really be 
avoided.

> [API] Introduce Placement Constraint object
> ---
>
> Key: YARN-6593
> URL: https://issues.apache.org/jira/browse/YARN-6593
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Attachments: YARN-6593.001.patch, YARN-6593.002.patch
>
>
> This JIRA introduces an object for defining placement constraints.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (YARN-5531) UnmanagedAM pool manager for federating application across clusters

2017-05-23 Thread Botong Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-5531:
---
Comment: was deleted

(was: Hi Karthik,

Can you please take another look when you have time? Thanks in advance!

Best,
Botong)

> UnmanagedAM pool manager for federating application across clusters
> ---
>
> Key: YARN-5531
> URL: https://issues.apache.org/jira/browse/YARN-5531
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Botong Huang
> Attachments: YARN-5531-YARN-2915.v10.patch, 
> YARN-5531-YARN-2915.v11.patch, YARN-5531-YARN-2915.v1.patch, 
> YARN-5531-YARN-2915.v2.patch, YARN-5531-YARN-2915.v3.patch, 
> YARN-5531-YARN-2915.v4.patch, YARN-5531-YARN-2915.v5.patch, 
> YARN-5531-YARN-2915.v6.patch, YARN-5531-YARN-2915.v7.patch, 
> YARN-5531-YARN-2915.v8.patch, YARN-5531-YARN-2915.v9.patch
>
>
> One of the main tenets of YARN Federation is to *transparently* scale 
> applications across multiple clusters. This is achieved by running UAMs on 
> behalf of the application on other clusters. This JIRA tracks the addition of 
> an UnmanagedAM pool manager for federating applications across clusters, 
> which will be used by the FederationInterceptor (YARN-3666) that is part of 
> the AMRMProxy pipeline introduced in YARN-2884.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6593) [API] Introduce Placement Constraint object

2017-05-23 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021590#comment-16021590
 ] 

Wangda Tan commented on YARN-6593:
--

[~kkaranasos], 

bq. What I realized is that we are trying to define at the same time the 
internal representation that will be used by the scheduler and the one that 
will be user-friendly.

Exactly. 

bq. I think we should split the two. I suggest to keep the java classes 
implementing the PBImpls to be in sync with them, and then add a utility class 
that allows users to create constraints in a more intuitive way. 

I still want to add my thoughts to this part. I agree that we should try to 
make Java PB implementation sync with .proto. But instead of a utility class to 
create, I think we should make user-facing API as simple as possible. I think 
it is also important to let user can access added constraints.

I'm open to the suggestions to make user can specify a string-based constraints 
or fluent style APIs. But my comment is, if an API is marked to {{@Public}}, it 
has to be simple and easy to be understood by YARN users. The 
SimplePlacementConstraint with 3 fields is not simple to be understand.

bq. BTW, the reason I have not created separate classes for target and 
cardinality constraints is that I we also have the more general constraint (the 
one we mention as "cluster operator constraint" in the document, such as "don't 
allow more than 5 ZooKeeper containers per rack") that includes all three.

I'm not quite agree with this part, in our implementation to support 
constraints can also benefit from split constraints, otherwise we have to check 
constraint-type everywhere. The syntax of cluster operator constraint is quite 
different from TargetConstraint/CardinalityConstraint (cardinality to myself 
v.s. cardinality to others). I'm fine with adding a ClusterOperatorConstraint 
based on SimplePlacementConstraint if needed.

> [API] Introduce Placement Constraint object
> ---
>
> Key: YARN-6593
> URL: https://issues.apache.org/jira/browse/YARN-6593
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Attachments: YARN-6593.001.patch, YARN-6593.002.patch
>
>
> This JIRA introduces an object for defining placement constraints.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6593) [API] Introduce Placement Constraint object

2017-05-23 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021546#comment-16021546
 ] 

Arun Suresh commented on YARN-6593:
---

+1 on separating the Builder functionality into separate util classes. Apart 
from the reasons provided by [~kkaranasos], I also feel simple Builders (that 
promote the [fluent|https://www.martinfowler.com/bliki/FluentInterface.html] 
style of coding) might not really be super useful for nested structures.

> [API] Introduce Placement Constraint object
> ---
>
> Key: YARN-6593
> URL: https://issues.apache.org/jira/browse/YARN-6593
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Attachments: YARN-6593.001.patch, YARN-6593.002.patch
>
>
> This JIRA introduces an object for defining placement constraints.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6599) Support rich placement constraints in scheduler

2017-05-23 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021536#comment-16021536
 ] 

Wangda Tan commented on YARN-6599:
--

[~curino],

Apologies for the late response; somehow I missed your last comment.

IIRC, the only thing we decided is to try our best to make the placement logic 
shareable between schedulers. This includes, but is not limited to, finding a 
node to place a container, scoring nodes according to a constraint, etc. To me, 
placing a container should be common logic across schedulers.

However, this doesn't mean nothing will be added to the scheduler. As I 
mentioned to [~kkaranasos]/[~asuresh] offline, the reason we cannot put this 
logic completely outside the scheduler is that how and when to do placement for 
requests depends on internal state, such as the limits/ordering, etc. of 
queues/apps/users. These are tied to the individual scheduler implementations, 
and I don't think logic outside the scheduler can access that state without 
duplicating a large part of the scheduler. It is just like the preemption logic 
of the CS: it is completely outside the scheduler, but we have to duplicate a 
lot of logic to keep its behavior in sync with how the scheduler behaves, which 
is also expensive to maintain; that's my experience from the last two years.

To me, the solution should be something in between: not all inside the 
scheduler, and not all outside of it. I don't think we're ready for an answer 
now; I suggest doing some investigation/POC before making a decision.

Thoughts?

> Support rich placement constraints in scheduler
> ---
>
> Key: YARN-6599
> URL: https://issues.apache.org/jira/browse/YARN-6599
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6593) [API] Introduce Placement Constraint object

2017-05-23 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021533#comment-16021533
 ] 

Konstantinos Karanasos commented on YARN-6593:
--

Thanks, [~leftnoteasy].
What I realized is that we are trying to define at the same time the internal 
representation that will be used by the scheduler and the one that will be 
user-friendly.
I think we should split the two. I suggest keeping the Java classes 
implementing the PBImpls in sync with them, and then adding a utility class 
that allows users to create constraints in a more intuitive way. The utility 
class can expose all the constraint creation methods and hide all the protobuf 
details (e.g., the fact that we have a PlacementConstraint class internally). 
This will allow us to evolve the way users specify constraints separately, 
without needing to change the PBImpl classes or their subclasses, etc.
Moreover, we can create a string utility method that can parse a string 
representation of the constraints and create PBImpl objects, which I think will 
be really useful too.

Let me know what you guys think.

BTW, the reason I have not created separate classes for target and cardinality 
constraints is that we also have the more general constraint (the one we 
mention as "cluster operator constraint" in the document, such as "don't allow 
more than 5 ZooKeeper containers per rack") that includes all three. So I don't 
see the use of adding three different classes for this. Especially if we have 
the utility class I mentioned above.

PS: I forgot to reply to [~pg1...@imperial.ac.uk], since we chatted offline, 
but I am adding it here for completeness. Unfortunately protobuf 2.5, which is 
still the required version in Hadoop, does not support the {{extends}} 
construct to define sub/superclasses in the protobufs.
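
For illustration, a minimal sketch of such a utility class (method names are 
hypothetical; SimpleConstraint refers to the three-field sketch shown earlier 
in this digest):

{code}
// Hypothetical utility: users build constraints through static helpers and
// never touch the PBImpl classes directly.
final class PlacementConstraintsUtil {
  // Affinity to containers carrying a tag, within a scope ("node"/"rack").
  static SimpleConstraint affinityTo(String scope, String tag) {
    return SimpleConstraint.target(scope, tag);
  }

  // Anti-affinity: zero occurrences of the tag allowed in the scope.
  static SimpleConstraint antiAffinityTo(String scope, String tag) {
    return new SimpleConstraint(scope, tag, 0, 0);
  }

  // "Cluster operator constraint" from the doc, e.g. at most 5 ZooKeeper
  // containers per rack: maxPerScope("rack", "zookeeper", 5).
  static SimpleConstraint maxPerScope(String scope, String tag, int max) {
    return new SimpleConstraint(scope, tag, 0, max);
  }
}
{code}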

> [API] Introduce Placement Constraint object
> ---
>
> Key: YARN-6593
> URL: https://issues.apache.org/jira/browse/YARN-6593
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Attachments: YARN-6593.001.patch, YARN-6593.002.patch
>
>
> This JIRA introduces an object for defining placement constraints.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6635) Merging refactored YARN UI changes from yarn-native-services branch

2017-05-23 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton updated YARN-6635:
---
Summary: Merging refactored YARN UI changes from yarn-native-services 
branch  (was: Merging refactored changes from yarn-native-services branch)

> Merging refactored YARN UI changes from yarn-native-services branch
> ---
>
> Key: YARN-6635
> URL: https://issues.apache.org/jira/browse/YARN-6635
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Akhil PB
>Assignee: Akhil PB
> Attachments: YARN-6635.001.patch
>
>
> There is some refactoring done for the yarn-app pages in the new YARN UI 
> codebase on the yarn-native-services branch. This ticket intends to bring the 
> refactored UI changes from the yarn-native-services branch into trunk.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6125) The application attempt's diagnostic message should have a maximum size

2017-05-23 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021458#comment-16021458
 ] 

Daniel Templeton commented on YARN-6125:


Because {{BoundedAppender}} has no dependency on {{RMAppAttemptImpl}}.  It's an 
inner class because it's a bespoke class that makes no attempt to account for 
use cases outside the needs of {{RMAppAttemptImpl}}.
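
For context, a minimal sketch of a size-bounded diagnostics buffer that 
truncates the head and keeps the newest output (option 2 in the list in the 
description below); illustrative only, not the actual {{BoundedAppender}} 
source:

{code}
final class BoundedDiagnostics {
  private final int limit;
  private final StringBuilder buf = new StringBuilder();

  BoundedDiagnostics(int limit) {
    this.limit = limit;
  }

  void append(CharSequence msg) {
    buf.append(msg);
    int overflow = buf.length() - limit;
    if (overflow > 0) {
      // Truncate the head: keep only the most recent 'limit' characters.
      buf.delete(0, overflow);
    }
  }

  @Override
  public String toString() {
    return buf.toString();
  }
}
{code}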

> The application attempt's diagnostic message should have a maximum size
> ---
>
> Key: YARN-6125
> URL: https://issues.apache.org/jira/browse/YARN-6125
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: Daniel Templeton
>Assignee: Andras Piros
>Priority: Critical
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: YARN-6125.000.patch, YARN-6125.001.patch, 
> YARN-6125.002.patch, YARN-6125.003.patch, YARN-6125.004.patch, 
> YARN-6125.005.patch, YARN-6125.006.patch, YARN-6125.007.patch, 
> YARN-6125.008.patch, YARN-6125.009.patch
>
>
> We've found through experience that the diagnostic message can grow 
> unbounded.  I've seen attempts that have diagnostic messages over 1MB.  Since 
> the message is stored in the state store, it's a bad idea to allow the 
> message to grow unbounded.  Instead, there should be a property that sets a 
> maximum size on the message.
> I suspect that some of the ZK state store issues we've seen in the past were 
> due to the size of the diagnostic messages and not to the size of the 
> classpath, as is the current prevailing opinion.
> An open question is how best to prune the message once it grows too large.  
> Should we
> # truncate the tail,
> # truncate the head,
> # truncate the middle,
> # add another property to make the behavior selectable, or
> # none of the above?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6111) Rumen input does't work in SLS

2017-05-23 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021430#comment-16021430
 ] 

Yufei Gu commented on YARN-6111:


Combining my comments with the patch, you will get the idea.

> Rumen input does't work in SLS
> --
>
> Key: YARN-6111
> URL: https://issues.apache.org/jira/browse/YARN-6111
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler-load-simulator
>Affects Versions: 2.6.0, 2.7.3, 3.0.0-alpha2
> Environment: ubuntu14.0.4 os
>Reporter: YuJie Huang
>Assignee: Yufei Gu
>  Labels: test
> Fix For: 3.0.0-alpha3
>
> Attachments: YARN-6111.001.patch
>
>
> Hi guys,
> I am trying to learn the use of SLS.
> I would like to get the file realtimetrack.json, but it only 
> contains "[]" at the end of a simulation. This is the command I use to 
> run the instance:
> HADOOP_HOME $ bin/slsrun.sh --input-rumen=sample-data/2jobsmin-rumen-jh.json 
> --output-dir=sample-data 
> All other files, including metrics, appear to be properly populated. I can 
> also trace it on the web at http://localhost:10001/simulate
> Can someone help?
> Thanks



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5892) Capacity Scheduler: Support user-specific minimum user limit percent

2017-05-23 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021343#comment-16021343
 ] 

Sunil G commented on YARN-5892:
---

Yes [~eepayne]. I will do a round of review and some tests today and share my 
experience. Thank You.

> Capacity Scheduler: Support user-specific minimum user limit percent
> 
>
> Key: YARN-5892
> URL: https://issues.apache.org/jira/browse/YARN-5892
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Reporter: Eric Payne
>Assignee: Eric Payne
> Attachments: Active users highlighted.jpg, YARN-5892.001.patch, 
> YARN-5892.002.patch, YARN-5892.003.patch, YARN-5892.004.patch, 
> YARN-5892.005.patch, YARN-5892.006.patch, YARN-5892.007.patch, 
> YARN-5892.008.patch, YARN-5892.009.patch, YARN-5892.010.patch, 
> YARN-5892.012.patch, YARN-5892.013.patch
>
>
> Currently, in the capacity scheduler, the {{minimum-user-limit-percent}} 
> property is per queue. A cluster admin should be able to set the minimum user 
> limit percent on a per-user basis within the queue.
> This functionality is needed so that when intra-queue preemption is enabled 
> (YARN-4945 / YARN-2113), some users can be deemed as more important than 
> other users, and resources from VIP users won't be as likely to be preempted.
> For example, if the {{getstuffdone}} queue has a MULP of 25 percent, but user 
> {{jane}} is a power user of queue {{getstuffdone}} and needs to be guaranteed 
> 75 percent, the properties for {{getstuffdone}} and {{jane}} would look like 
> this:
> {code}
> <property>
>   <name>yarn.scheduler.capacity.root.getstuffdone.minimum-user-limit-percent</name>
>   <value>25</value>
> </property>
> <property>
>   <name>yarn.scheduler.capacity.root.getstuffdone.jane.minimum-user-limit-percent</name>
>   <value>75</value>
> </property>
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5892) Capacity Scheduler: Support user-specific minimum user limit percent

2017-05-23 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021209#comment-16021209
 ] 

Eric Payne commented on YARN-5892:
--

[~jlowe], [~leftnoteasy], [~sunilg], just checking to see if you have had any 
progress reviewing this patch. Thanks.

> Capacity Scheduler: Support user-specific minimum user limit percent
> 
>
> Key: YARN-5892
> URL: https://issues.apache.org/jira/browse/YARN-5892
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Reporter: Eric Payne
>Assignee: Eric Payne
> Attachments: Active users highlighted.jpg, YARN-5892.001.patch, 
> YARN-5892.002.patch, YARN-5892.003.patch, YARN-5892.004.patch, 
> YARN-5892.005.patch, YARN-5892.006.patch, YARN-5892.007.patch, 
> YARN-5892.008.patch, YARN-5892.009.patch, YARN-5892.010.patch, 
> YARN-5892.012.patch, YARN-5892.013.patch
>
>
> Currently, in the capacity scheduler, the {{minimum-user-limit-percent}} 
> property is per queue. A cluster admin should be able to set the minimum user 
> limit percent on a per-user basis within the queue.
> This functionality is needed so that when intra-queue preemption is enabled 
> (YARN-4945 / YARN-2113), some users can be deemed as more important than 
> other users, and resources from VIP users won't be as likely to be preempted.
> For example, if the {{getstuffdone}} queue has a MULP of 25 percent, but user 
> {{jane}} is a power user of queue {{getstuffdone}} and needs to be guaranteed 
> 75 percent, the properties for {{getstuffdone}} and {{jane}} would look like 
> this:
> {code}
> <property>
>   <name>yarn.scheduler.capacity.root.getstuffdone.minimum-user-limit-percent</name>
>   <value>25</value>
> </property>
> <property>
>   <name>yarn.scheduler.capacity.root.getstuffdone.jane.minimum-user-limit-percent</name>
>   <value>75</value>
> </property>
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5006) ResourceManager quit due to ApplicationStateData exceed the limit size of znode in zk

2017-05-23 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021154#comment-16021154
 ] 

Bibin A Chundatt commented on YARN-5006:


[~imstefanlee]
The aim of this JIRA is to protect the RM when the ApplicationStateData size is 
more than the ZK znode limit. 
{quote}
Add 1 file into DistributedCache
{quote}
We can consider that as a way to simulate the issue. 
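
For illustration, a sketch of the kind of guard this implies: check the 
serialized size before writing and fail only the offending application rather 
than the whole RM. The 1 MB figure matches ZooKeeper's default jute.maxbuffer; 
the names are illustrative, not the patch:

{code}
import java.io.IOException;

final class ZnodeSizeGuard {
  // ZooKeeper's default jute.maxbuffer is about 1 MB.
  private static final int ZNODE_LIMIT_BYTES = 1024 * 1024;

  static void checkSize(String appId, byte[] serializedState)
      throws IOException {
    if (serializedState.length > ZNODE_LIMIT_BYTES) {
      // Fail this application's store operation; the RM keeps running.
      throw new IOException("App " + appId + ": state of "
          + serializedState.length + " bytes exceeds the znode limit of "
          + ZNODE_LIMIT_BYTES + " bytes");
    }
  }
}
{code}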


> ResourceManager quit due to ApplicationStateData exceed the limit  size of 
> znode in zk
> --
>
> Key: YARN-5006
> URL: https://issues.apache.org/jira/browse/YARN-5006
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0, 2.7.2
>Reporter: dongtingting
>Assignee: Bibin A Chundatt
>Priority: Critical
> Attachments: YARN-5006.001.patch, YARN-5006.002.patch
>
>
> A client submits a job, and this job adds 1 file into the DistributedCache. 
> When the job is submitted, the ResourceManager stores ApplicationStateData 
> into ZK. The ApplicationStateData exceeds the znode size limit, and the RM 
> exits with code 1. 
> The related code in RMStateStore.java:
> {code}
>   private static class StoreAppTransition
>   implements SingleArcTransition<RMStateStore, RMStateStoreEvent> {
> @Override
> public void transition(RMStateStore store, RMStateStoreEvent event) {
>   if (!(event instanceof RMStateStoreAppEvent)) {
> // should never happen
> LOG.error("Illegal event type: " + event.getClass());
> return;
>   }
>   ApplicationState appState = ((RMStateStoreAppEvent) 
> event).getAppState();
>   ApplicationId appId = appState.getAppId();
>   ApplicationStateData appStateData = ApplicationStateData
>   .newInstance(appState);
>   LOG.info("Storing info for app: " + appId);
>   try {  
> store.storeApplicationStateInternal(appId, appStateData);  //store 
> the appStateData
> store.notifyApplication(new RMAppEvent(appId,
>RMAppEventType.APP_NEW_SAVED));
>   } catch (Exception e) {
> LOG.error("Error storing app: " + appId, e);
> store.notifyStoreOperationFailed(e);   //handle fail event, system 
> exit 
>   }
> };
>   }
> {code}
> The Exception log:
> {code}
>  ...
> 2016-04-20 11:26:35,732 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore 
> AsyncDispatcher event handler: Maxed out ZK retries. Giving up!
> 2016-04-20 11:26:35,732 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore 
> AsyncDispatcher event handler: Error storing app: 
> application_1461061795989_17671
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss
> at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
> at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:931)
> at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:911)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:936)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:933)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1075)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1096)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doMultiWithRetries(ZKRMStateStore.java:933)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doMultiWithRetries(ZKRMStateStore.java:947)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createWithRetries(ZKRMStateStore.java:956)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.storeApplicationStateInternal(ZKRMStateStore.java:626)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:138)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:123)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> 

[jira] [Commented] (YARN-6601) Allow service to be started as System Services during serviceapi start up

2017-05-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021081#comment-16021081
 ] 

Hadoop QA commented on YARN-6601:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 15m 
45s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
57s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
37s{color} | {color:green} yarn-native-services passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
14s{color} | {color:green} yarn-native-services passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
57s{color} | {color:green} yarn-native-services passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
34s{color} | {color:green} yarn-native-services passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
 5s{color} | {color:green} yarn-native-services passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m  
4s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common in 
yarn-native-services has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
21s{color} | {color:green} yarn-native-services passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m 
12s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 55s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 2 new + 206 unchanged - 1 fixed = 208 total (was 207) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
34s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
28s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
24s{color} | {color:green} hadoop-yarn-services-api in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 79m 37s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ac17dc |
| JIRA Issue | YARN-6601 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12869428/YARN-6601-yarn-native-services.002.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  xml  |
| uname | Linux 0887a747e927 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 
14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | yarn-native-services / 8c3b3db |
| Default Java | 

[jira] [Comment Edited] (YARN-5006) ResourceManager quit due to ApplicationStateData exceed the limit size of znode in zk

2017-05-23 Thread stefanlee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021078#comment-16021078
 ] 

stefanlee edited comment on YARN-5006 at 5/23/17 11:39 AM:
---

[~bibinchundatt] thanks, but why can "add 1 file into DistributedCache" 
cause *ApplicationStateData* to exceed *1M*?


was (Author: imstefanlee):
thanks, but  why  "add 1 file into DistributedCache" can due to 
ApplicationStateData exceed 1M?

> ResourceManager quit due to ApplicationStateData exceed the limit  size of 
> znode in zk
> --
>
> Key: YARN-5006
> URL: https://issues.apache.org/jira/browse/YARN-5006
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0, 2.7.2
>Reporter: dongtingting
>Assignee: Bibin A Chundatt
>Priority: Critical
> Attachments: YARN-5006.001.patch, YARN-5006.002.patch
>
>
> Client submits a job, and this job adds 1 file into the DistributedCache. 
> When the job is submitted, the ResourceManager stores ApplicationStateData 
> into zk. The ApplicationStateData exceeds the limit size of the znode, and 
> the RM exits with code 1.
> The related code in RMStateStore.java :
> {code}
>   private static class StoreAppTransition
>   implements SingleArcTransition<RMStateStore, RMStateStoreEvent> {
> @Override
> public void transition(RMStateStore store, RMStateStoreEvent event) {
>   if (!(event instanceof RMStateStoreAppEvent)) {
> // should never happen
> LOG.error("Illegal event type: " + event.getClass());
> return;
>   }
>   ApplicationState appState = ((RMStateStoreAppEvent) 
> event).getAppState();
>   ApplicationId appId = appState.getAppId();
>   ApplicationStateData appStateData = ApplicationStateData
>   .newInstance(appState);
>   LOG.info("Storing info for app: " + appId);
>   try {  
> store.storeApplicationStateInternal(appId, appStateData);  //store 
> the appStateData
> store.notifyApplication(new RMAppEvent(appId,
>RMAppEventType.APP_NEW_SAVED));
>   } catch (Exception e) {
> LOG.error("Error storing app: " + appId, e);
> store.notifyStoreOperationFailed(e);   //handle fail event, system 
> exit 
>   }
> };
>   }
> {code}
> The Exception log:
> {code}
>  ...
> 2016-04-20 11:26:35,732 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore 
> AsyncDispatcher event handler: Maxed out ZK retries. Giving up!
> 2016-04-20 11:26:35,732 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore 
> AsyncDispatcher event handler: Error storing app: 
> application_1461061795989_17671
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss
> at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
> at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:931)
> at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:911)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:936)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:933)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1075)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1096)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doMultiWithRetries(ZKRMStateStore.java:933)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doMultiWithRetries(ZKRMStateStore.java:947)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createWithRetries(ZKRMStateStore.java:956)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.storeApplicationStateInternal(ZKRMStateStore.java:626)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:138)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:123)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> 

[jira] [Commented] (YARN-5006) ResourceManager quit due to ApplicationStateData exceed the limit size of znode in zk

2017-05-23 Thread stefanlee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021078#comment-16021078
 ] 

stefanlee commented on YARN-5006:
-

thanks, but why can "add 1 file into DistributedCache" cause 
ApplicationStateData to exceed 1M?

> ResourceManager quit due to ApplicationStateData exceed the limit  size of 
> znode in zk
> --
>
> Key: YARN-5006
> URL: https://issues.apache.org/jira/browse/YARN-5006
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.6.0, 2.7.2
>Reporter: dongtingting
>Assignee: Bibin A Chundatt
>Priority: Critical
> Attachments: YARN-5006.001.patch, YARN-5006.002.patch
>
>
> Client submits a job, and this job adds 1 file into the DistributedCache. 
> When the job is submitted, the ResourceManager stores ApplicationStateData 
> into zk. The ApplicationStateData exceeds the limit size of the znode, and 
> the RM exits with code 1.
> The related code in RMStateStore.java :
> {code}
>   private static class StoreAppTransition
>   implements SingleArcTransition<RMStateStore, RMStateStoreEvent> {
> @Override
> public void transition(RMStateStore store, RMStateStoreEvent event) {
>   if (!(event instanceof RMStateStoreAppEvent)) {
> // should never happen
> LOG.error("Illegal event type: " + event.getClass());
> return;
>   }
>   ApplicationState appState = ((RMStateStoreAppEvent) 
> event).getAppState();
>   ApplicationId appId = appState.getAppId();
>   ApplicationStateData appStateData = ApplicationStateData
>   .newInstance(appState);
>   LOG.info("Storing info for app: " + appId);
>   try {  
> store.storeApplicationStateInternal(appId, appStateData);  //store 
> the appStateData
> store.notifyApplication(new RMAppEvent(appId,
>RMAppEventType.APP_NEW_SAVED));
>   } catch (Exception e) {
> LOG.error("Error storing app: " + appId, e);
> store.notifyStoreOperationFailed(e);   //handle fail event, system 
> exit 
>   }
> };
>   }
> {code}
> The Exception log:
> {code}
>  ...
> 2016-04-20 11:26:35,732 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore 
> AsyncDispatcher event handler: Maxed out ZK retries. Giving up!
> 2016-04-20 11:26:35,732 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore 
> AsyncDispatcher event handler: Error storing app: 
> application_1461061795989_17671
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode 
> = ConnectionLoss
> at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
> at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:931)
> at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:911)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:936)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:933)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1075)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1096)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doMultiWithRetries(ZKRMStateStore.java:933)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doMultiWithRetries(ZKRMStateStore.java:947)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.createWithRetries(ZKRMStateStore.java:956)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.storeApplicationStateInternal(ZKRMStateStore.java:626)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:138)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$StoreAppTransition.transition(RMStateStore.java:123)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:806)
> at 
> 

[jira] [Commented] (YARN-6458) New Yarn UI: Lock down dependency versions

2017-05-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021035#comment-16021035
 ] 

Hadoop QA commented on YARN-6458:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
5s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
51s{color} | {color:green} hadoop-yarn-ui in the patch passed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
26s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 28m 51s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-6458 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12869432/YARN-6458.2.patch |
| Optional Tests |  asflicense  mvnsite  compile  javac  javadoc  mvninstall  
unit  xml  |
| uname | Linux 9023f819e22c 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / d0f346a |
| Default Java | 1.8.0_131 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/16001/testReport/ |
| asflicense | 
https://builds.apache.org/job/PreCommit-YARN-Build/16001/artifact/patchprocess/patch-asflicense-problems.txt
 |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/16001/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> New Yarn UI: Lock down dependency versions
> --
>
> Key: YARN-6458
> URL: https://issues.apache.org/jira/browse/YARN-6458
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: YARN-6458.1.patch, YARN-6458.2.patch
>
>
> As we use semver to denote dependency version, every time a new build is 
> made, the latest available version of the dependency would be downloaded. 
> This affects the reliability of the UI build. Hence we must lockdown the 
> dependencies.
> Lockdown must happen in both the package managers used by the UI - NPM & 
> Bower.
> Yarn:
> Replace NPM with Yarn. Yarn is a package manager developed to solve this 
> issue and many more. It also enables offline build.
> Bower: 
> Bower shrinkwrap resolver plugin can be used to lock the dependency versions.

[jira] [Updated] (YARN-6047) Documentation updates for TimelineService v2

2017-05-23 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-6047:

Attachment: YARN-6047-YARN-5355.001.patch

Updating the documentation patch, covering the from_id query parameter and 
the new REST API entity-types.
cc: [~vrushalic] [~haibo.chen]
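For context, the pieces being documented look roughly like the following 
(paths and parameter spelling here are illustrative; the patch is 
authoritative for the exact endpoints):
{code}
# List the entity types stored for an application (illustrative path):
GET /ws/v2/timeline/apps/{app-id}/entity-types

# Page through entities using the from_id cursor (illustrative):
GET /ws/v2/timeline/apps/{app-id}/entities/{entity-type}?fromid={entity-id}
{code}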

> Documentation updates for TimelineService v2
> 
>
> Key: YARN-6047
> URL: https://issues.apache.org/jira/browse/YARN-6047
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: documentation, timelineserver
>Reporter: Varun Saxena
>Assignee: Rohith Sharma K S
>  Labels: yarn-5355-merge-blocker
> Attachments: YARN-6047-YARN-5355.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6458) New Yarn UI: Lock down dependency versions

2017-05-23 Thread Sreenath Somarajapuram (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath Somarajapuram updated YARN-6458:
-
Attachment: YARN-6458.2.patch

[~sunilg] Attaching a fresh patch with ember-truth-helpers version upgraded.

> New Yarn UI: Lock down dependency versions
> --
>
> Key: YARN-6458
> URL: https://issues.apache.org/jira/browse/YARN-6458
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: YARN-6458.1.patch, YARN-6458.2.patch
>
>
> As we use semver to denote dependency version, every time a new build is 
> made, the latest available version of the dependency would be downloaded. 
> This affects the reliability of the UI build. Hence we must lockdown the 
> dependencies.
> Lockdown must happen in both the package managers used by the UI - NPM & 
> Bower.
> Yarn:
> Replace NPM with Yarn. Yarn is a package manager developed to solve this 
> issue and many more. It also enables offline build.
> Bower: 
> Bower shrinkwrap resolver plugin can be used to lock the dependency versions.
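> For illustration, the lockdown could look like the following (the 
> bower-shrinkwrap-resolver-ext plugin name and its generated file are 
> assumptions here; the Yarn flags are standard):
> {code}
> # Yarn: commit the generated yarn.lock, then install reproducibly/offline:
> yarn install --frozen-lockfile --offline
>
> # Bower: enable the shrinkwrap resolver in .bowerrc so versions are pinned
> # to the generated bower-shrinkwrap.json:
> {
>   "resolvers": ["bower-shrinkwrap-resolver-ext"]
> }
> {code}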



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6601) Allow service to be started as System Services during serviceapi start up

2017-05-23 Thread Lokesh Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Jain updated YARN-6601:
--
Attachment: YARN-6601-yarn-native-services.002.patch

> Allow service to be started as System Services during serviceapi start up
> -
>
> Key: YARN-6601
> URL: https://issues.apache.org/jira/browse/YARN-6601
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Rohith Sharma K S
> Attachments: SystemServices.pdf, 
> YARN-6601-yarn-native-services.001.patch, 
> YARN-6601-yarn-native-services.002.patch
>
>
> This is extended from YARN-1593 focusing only on system services. This 
> particular JIRA focusing on starting the system services during 
> native-service-api start up. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6458) New Yarn UI: Lock down dependency versions

2017-05-23 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16020935#comment-16020935
 ] 

Sunil G commented on YARN-6458:
---

Thanks [~ssomarajapu...@hortonworks.com]

I tested it on Ubuntu, and it works fine. I think we can commit this change 
now. Any more thoughts?

> New Yarn UI: Lock down dependency versions
> --
>
> Key: YARN-6458
> URL: https://issues.apache.org/jira/browse/YARN-6458
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sreenath Somarajapuram
>Assignee: Sreenath Somarajapuram
> Attachments: YARN-6458.1.patch
>
>
> As we use semver to denote dependency version, every time a new build is 
> made, the latest available version of the dependency would be downloaded. 
> This affects the reliability of the UI build. Hence we must lockdown the 
> dependencies.
> Lockdown must happen in both the package managers used by the UI - NPM & 
> Bower.
> Yarn:
> Replace NPM with Yarn. Yarn is a package manager developed to solve this 
> issue and many more. It also enables offline build.
> Bower: 
> Bower shrinkwrap resolver plugin can be used to lock the dependency versions.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6125) The application attempt's diagnostic message should have a maximum size

2017-05-23 Thread stefanlee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16020819#comment-16020819
 ] 

stefanlee commented on YARN-6125:
-

thanks for this jira, I have a question: why is the class *BoundedAppender* 
declared *static*? [~templedf] [~andras.piros]

> The application attempt's diagnostic message should have a maximum size
> ---
>
> Key: YARN-6125
> URL: https://issues.apache.org/jira/browse/YARN-6125
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: Daniel Templeton
>Assignee: Andras Piros
>Priority: Critical
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: YARN-6125.000.patch, YARN-6125.001.patch, 
> YARN-6125.002.patch, YARN-6125.003.patch, YARN-6125.004.patch, 
> YARN-6125.005.patch, YARN-6125.006.patch, YARN-6125.007.patch, 
> YARN-6125.008.patch, YARN-6125.009.patch
>
>
> We've found through experience that the diagnostic message can grow 
> unbounded.  I've seen attempts that have diagnostic messages over 1MB.  Since 
> the message is stored in the state store, it's a bad idea to allow the 
> message to grow unbounded.  Instead, there should be a property that sets a 
> maximum size on the message.
> I suspect that some of the ZK state store issues we've seen in the past were 
> due to the size of the diagnostic messages and not to the size of the 
> classpath, as is the current prevailing opinion.
> An open question is how best to prune the message once it grows too large.  
> Should we
> # truncate the tail,
> # truncate the head,
> # truncate the middle,
> # add another property to make the behavior selectable, or
> # none of the above?
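> As a sketch of one option (truncate the head so the most recent diagnostics 
> survive), a bounded appender could look like the code below. This is 
> illustrative only, not necessarily the shipped implementation:
> {code}
> // Keeps at most 'limit' characters, dropping the oldest content first.
> static final class BoundedAppender {
>   private final int limit;
>   private final StringBuilder buf = new StringBuilder();
>
>   BoundedAppender(int limit) {
>     this.limit = limit;
>   }
>
>   BoundedAppender append(CharSequence msg) {
>     buf.append(msg);
>     if (buf.length() > limit) {
>       buf.delete(0, buf.length() - limit);  // truncate the head
>     }
>     return this;
>   }
>
>   @Override
>   public String toString() {
>     return buf.toString();
>   }
> }
> {code}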



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6635) Merging refactored changes from yarn-native-services branch

2017-05-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16020721#comment-16020721
 ] 

Hadoop QA commented on YARN-6635:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}  1m 18s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | YARN-6635 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12869402/YARN-6635.001.patch |
| Optional Tests |  asflicense  |
| uname | Linux dd0d69d03ff0 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / d0f346a |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/15999/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Merging refactored changes from yarn-native-services branch
> ---
>
> Key: YARN-6635
> URL: https://issues.apache.org/jira/browse/YARN-6635
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Akhil PB
>Assignee: Akhil PB
> Attachments: YARN-6635.001.patch
>
>
> There are some refactoring done for yarn-app pages in new YARN UI codebase in 
> yarn-native-services branch. This ticket intends to bring the refactored 
> changes done in UI code into trunk from yarn-native-services branch.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6635) Merging refactored changes from yarn-native-services branch

2017-05-23 Thread Akhil PB (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akhil PB updated YARN-6635:
---
Attachment: YARN-6635.001.patch

Adding v1 patch.
Hi [~sunilg], please help test and review the patch. 

> Merging refactored changes from yarn-native-services branch
> ---
>
> Key: YARN-6635
> URL: https://issues.apache.org/jira/browse/YARN-6635
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Akhil PB
>Assignee: Akhil PB
> Attachments: YARN-6635.001.patch
>
>
> There are some refactoring done for yarn-app pages in new YARN UI codebase in 
> yarn-native-services branch. This ticket intends to bring the refactored 
> changes done in UI code into trunk from yarn-native-services branch.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6635) Merging refactored changes from yarn-native-services branch

2017-05-23 Thread Akhil PB (JIRA)
Akhil PB created YARN-6635:
--

 Summary: Merging refactored changes from yarn-native-services 
branch
 Key: YARN-6635
 URL: https://issues.apache.org/jira/browse/YARN-6635
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Akhil PB
Assignee: Akhil PB


There are some refactoring done for yarn-app pages in new YARN UI codebase in 
yarn-native-services branch. This ticket intends to bring the refactored 
changes done in UI code into trunk from yarn-native-services branch.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org