[jira] [Updated] (YARN-7673) ClassNotFoundException: org.apache.hadoop.yarn.server.api.DistributedSchedulingAMProtocol when using hadoop-client-minicluster

2017-12-19 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-7673:
-
Affects Version/s: 3.0.0

> ClassNotFoundException: 
> org.apache.hadoop.yarn.server.api.DistributedSchedulingAMProtocol when using 
> hadoop-client-minicluster
> --
>
> Key: YARN-7673
> URL: https://issues.apache.org/jira/browse/YARN-7673
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jeff Zhang
>
> I'd like to use hadoop-client-minicluster for a Hadoop downstream project, but 
> I encounter the following exception when starting the Hadoop minicluster.  I 
> checked hadoop-client-minicluster, and it indeed does not contain this class. Is 
> this something that was missed when packaging the published jar?
> {code}
> java.lang.NoClassDefFoundError: 
> org/apache/hadoop/yarn/server/api/DistributedSchedulingAMProtocol
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>   at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.createResourceManager(MiniYARNCluster.java:851)
>   at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.serviceInit(MiniYARNCluster.java:285)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
> {code}
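A quick way to confirm that the published shaded jar really lacks the class (a
hypothetical local check; the repository path and the 3.0.0 version below are
illustrative and assume a default local Maven layout):

{code}
# List the shaded jar's entries and look for the missing protocol class.
jar tf ~/.m2/repository/org/apache/hadoop/hadoop-client-minicluster/3.0.0/hadoop-client-minicluster-3.0.0.jar \
  | grep DistributedSchedulingAMProtocol \
  || echo "DistributedSchedulingAMProtocol not packaged in the jar"
{code}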



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7673) ClassNotFoundException: org.apache.hadoop.yarn.server.api.DistributedSchedulingAMProtocol when using hadoop-client-minicluster

2017-12-18 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16296390#comment-16296390
 ] 

Jeff Zhang commented on YARN-7673:
--

\cc [~djp]

> ClassNotFoundException: 
> org.apache.hadoop.yarn.server.api.DistributedSchedulingAMProtocol when using 
> hadoop-client-minicluster
> --
>
> Key: YARN-7673
> URL: https://issues.apache.org/jira/browse/YARN-7673
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jeff Zhang
>
> I'd like to use hadoop-client-minicluster for a Hadoop downstream project, but 
> I encounter the following exception when starting the Hadoop minicluster.  I 
> checked hadoop-client-minicluster, and it indeed does not contain this class. Is 
> this something that was missed when packaging the published jar?
> {code}
> java.lang.NoClassDefFoundError: 
> org/apache/hadoop/yarn/server/api/DistributedSchedulingAMProtocol
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>   at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.createResourceManager(MiniYARNCluster.java:851)
>   at 
> org.apache.hadoop.yarn.server.MiniYARNCluster.serviceInit(MiniYARNCluster.java:285)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
> {code}






[jira] [Created] (YARN-7673) ClassNotFoundException: org.apache.hadoop.yarn.server.api.DistributedSchedulingAMProtocol when using hadoop-client-minicluster

2017-12-18 Thread Jeff Zhang (JIRA)
Jeff Zhang created YARN-7673:


 Summary: ClassNotFoundException: 
org.apache.hadoop.yarn.server.api.DistributedSchedulingAMProtocol when using 
hadoop-client-minicluster
 Key: YARN-7673
 URL: https://issues.apache.org/jira/browse/YARN-7673
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jeff Zhang


I'd like to use hadoop-client-minicluster for a Hadoop downstream project, but I 
encounter the following exception when starting the Hadoop minicluster.  I 
checked hadoop-client-minicluster, and it indeed does not contain this class. Is 
this something that was missed when packaging the published jar?

{code}
java.lang.NoClassDefFoundError: 
org/apache/hadoop/yarn/server/api/DistributedSchedulingAMProtocol

at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
at 
java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at 
org.apache.hadoop.yarn.server.MiniYARNCluster.createResourceManager(MiniYARNCluster.java:851)
at 
org.apache.hadoop.yarn.server.MiniYARNCluster.serviceInit(MiniYARNCluster.java:285)
at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
{code}






[jira] [Commented] (YARN-6364) How to set the resource queue when start spark job running on yarn

2017-03-20 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15932404#comment-15932404
 ] 

Jeff Zhang commented on YARN-6364:
--

Set spark.yarn.queue in the Zeppelin interpreter settings. 
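For anyone landing here, a sketch of the equivalent setting outside Zeppelin
(the queue name "my_queue", the class, and the jar are placeholders):

{code}
# Zeppelin: in the Spark interpreter settings, add the property
#   spark.yarn.queue = my_queue
# Plain spark-submit in yarn-client mode sets the same property via --queue:
spark-submit --master yarn-client --queue my_queue \
  --class com.example.MyApp my-app.jar
{code}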

> How to set the resource queue  when start spark job running on yarn
> ---
>
> Key: YARN-6364
> URL: https://issues.apache.org/jira/browse/YARN-6364
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: sydt
>
> As we all know, YARN takes charge of resource management for Hadoop. When 
> Zeppelin starts a Spark job in yarn-client mode, how do we set the designated 
> resource queue on YARN so that Spark applications belonging to different users 
> run in their respective YARN resource queues?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)




[jira] [Resolved] (YARN-6364) How to set the resource queue when start spark job running on yarn

2017-03-20 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang resolved YARN-6364.
--
Resolution: Invalid

> How to set the resource queue  when start spark job running on yarn
> ---
>
> Key: YARN-6364
> URL: https://issues.apache.org/jira/browse/YARN-6364
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: sydt
>
> As we all know, YARN takes charge of resource management for Hadoop. When 
> Zeppelin starts a Spark job in yarn-client mode, how do we set the designated 
> resource queue on YARN so that Spark applications belonging to different users 
> run in their respective YARN resource queues?






[jira] [Commented] (YARN-3603) Application Attempts page confusing

2015-09-23 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14904342#comment-14904342
 ] 

Jeff Zhang commented on YARN-3603:
--

I still feel that excluding failed containers doesn't make sense. If some of my 
containers fail while the job is running, how can I check their logs through the 
web UI if I don't want to kill the job? 

> Application Attempts page confusing
> ---
>
> Key: YARN-3603
> URL: https://issues.apache.org/jira/browse/YARN-3603
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 2.8.0
>Reporter: Thomas Graves
>Assignee: Sunil G
> Attachments: 0001-YARN-3603.patch, 0002-YARN-3603.patch, 
> 0003-YARN-3603.patch, ahs1.png
>
>
> The application attempts page 
> (http://RM:8088/cluster/appattempt/appattempt_1431101480046_0003_01)
> is a bit confusing about what is going on.  I think the table of containers 
> there is only for running containers, and when the app is completed or killed 
> it is empty.  The table should have a label on it stating so.  
> Also, the "AM Container" field is a link when the app is running but not when 
> it is killed.  That might be confusing.
> There is no link to the logs on this page, but there is one in the app attempt 
> table when looking at 
> http://rm:8088/cluster/app/application_1431101480046_0003



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-4182) Killed containers disappear on app attempt page

2015-09-18 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang resolved YARN-4182.
--
Resolution: Duplicate

> Killed containers disappear on app attempt page
> ---
>
> Key: YARN-4182
> URL: https://issues.apache.org/jira/browse/YARN-4182
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: Jeff Zhang
> Attachments: 2015-09-18_1601.png
>
>






[jira] [Commented] (YARN-4182) Killed containers disappear on app attempt page

2015-09-18 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14805223#comment-14805223
 ] 

Jeff Zhang commented on YARN-4182:
--

Thanks [~bibinchundatt]. Closing it as a duplicate and putting a comment in YARN-3603.

> Killed containers disappear on app attempt page
> ---
>
> Key: YARN-4182
> URL: https://issues.apache.org/jira/browse/YARN-4182
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: Jeff Zhang
> Attachments: 2015-09-18_1601.png
>
>






[jira] [Commented] (YARN-3603) Application Attempts page confusing

2015-09-18 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14805222#comment-14805222
 ] 

Jeff Zhang commented on YARN-3603:
--

[~sunilg] If it is "Running Container ID", is it still necessary to include 
"Container Exit Status"?  And are there any reasons to exclude the killed 
containers here?  I think it would be helpful for diagnosis if all the 
containers were included. 


> Application Attempts page confusing
> ---
>
> Key: YARN-3603
> URL: https://issues.apache.org/jira/browse/YARN-3603
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 2.8.0
>Reporter: Thomas Graves
>Assignee: Sunil G
> Attachments: 0001-YARN-3603.patch, 0002-YARN-3603.patch, 
> 0003-YARN-3603.patch, ahs1.png
>
>
> The application attempts page 
> (http://RM:8088/cluster/appattempt/appattempt_1431101480046_0003_01)
> is a bit confusing about what is going on.  I think the table of containers 
> there is only for running containers, and when the app is completed or killed 
> it is empty.  The table should have a label on it stating so.  
> Also, the "AM Container" field is a link when the app is running but not when 
> it is killed.  That might be confusing.
> There is no link to the logs on this page, but there is one in the app attempt 
> table when looking at 
> http://rm:8088/cluster/app/application_1431101480046_0003





[jira] [Updated] (YARN-4182) Killed containers disappear on app attempt page

2015-09-18 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-4182:
-
Attachment: 2015-09-18_1601.png

> Killed containers disappear on app attempt page
> ---
>
> Key: YARN-4182
> URL: https://issues.apache.org/jira/browse/YARN-4182
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Reporter: Jeff Zhang
> Attachments: 2015-09-18_1601.png
>
>






[jira] [Created] (YARN-4182) Killed containers disappear on app attempt page

2015-09-18 Thread Jeff Zhang (JIRA)
Jeff Zhang created YARN-4182:


 Summary: Killed containers disappear on app attempt page
 Key: YARN-4182
 URL: https://issues.apache.org/jira/browse/YARN-4182
 Project: Hadoop YARN
  Issue Type: Bug
  Components: webapp
Reporter: Jeff Zhang








[jira] [Updated] (YARN-4154) Tez Build with hadoop 2.6.1 fails due to MiniYarnCluster change

2015-09-13 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-4154:
-
Affects Version/s: 2.6.1
 Target Version/s: 2.6.1
 Priority: Blocker  (was: Major)

> Tez Build with hadoop 2.6.1 fails due to MiniYarnCluster change
> ---
>
> Key: YARN-4154
> URL: https://issues.apache.org/jira/browse/YARN-4154
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.1
>Reporter: Jeff Zhang
>Priority: Blocker
>
> {code}
> [ERROR] 
> /mnt/nfs0/jzhang/tez-autobuild/tez/tez-plugins/tez-yarn-timeline-history/src/test/java/org/apache/tez/tests/MiniTezClusterWithTimeline.java:[92,5]
>  no suitable constructor found for 
> MiniYARNCluster(java.lang.String,int,int,int,int,boolean)
> constructor 
> org.apache.hadoop.yarn.server.MiniYARNCluster.MiniYARNCluster(java.lang.String,int,int,int,int)
>  is not applicable
>   (actual and formal argument lists differ in length)
> constructor 
> org.apache.hadoop.yarn.server.MiniYARNCluster.MiniYARNCluster(java.lang.String,int,int,int)
>  is not applicable
>   (actual and formal argument lists differ in length)
> {code}
> MR might have the same issue.
> \cc [~vinodkv]
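A downstream project that must build against both old and new Hadoop lines can
probe for the newer six-argument constructor at runtime; a hedged sketch (the
argument values are illustrative, and the boolean's meaning is inferred from the
javac error above rather than from the MiniYARNCluster source):

{code}
// Try the 6-arg constructor first; fall back to the 5-arg one on 2.6.1.
Class<?> clazz = Class.forName("org.apache.hadoop.yarn.server.MiniYARNCluster");
Object cluster;
try {
  cluster = clazz.getConstructor(String.class, int.class, int.class, int.class,
      int.class, boolean.class).newInstance("test", 1, 1, 1, 1, false);
} catch (NoSuchMethodException e) {
  cluster = clazz.getConstructor(String.class, int.class, int.class, int.class,
      int.class).newInstance("test", 1, 1, 1, 1);
}
{code}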





[jira] [Updated] (YARN-4154) Tez Build with hadoop 2.6.1 fails due to MiniYarnCluster change

2015-09-13 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-4154:
-
Description: 
{code}
[ERROR] 
/mnt/nfs0/jzhang/tez-autobuild/tez/tez-plugins/tez-yarn-timeline-history/src/test/java/org/apache/tez/tests/MiniTezClusterWithTimeline.java:[92,5]
 no suitable constructor found for 
MiniYARNCluster(java.lang.String,int,int,int,int,boolean)
constructor 
org.apache.hadoop.yarn.server.MiniYARNCluster.MiniYARNCluster(java.lang.String,int,int,int,int)
 is not applicable
  (actual and formal argument lists differ in length)
constructor 
org.apache.hadoop.yarn.server.MiniYARNCluster.MiniYARNCluster(java.lang.String,int,int,int)
 is not applicable
  (actual and formal argument lists differ in length)

{code}

MR might have the same issue.

\cc [~vinodkv]



  was:
{code}
[ERROR] 
/mnt/nfs0/jzhang/tez-autobuild/tez/tez-plugins/tez-yarn-timeline-history/src/test/java/org/apache/tez/tests/MiniTezClusterWithTimeline.java:[92,5]
 no suitable constructor found for 
MiniYARNCluster(java.lang.String,int,int,int,int,boolean)
constructor 
org.apache.hadoop.yarn.server.MiniYARNCluster.MiniYARNCluster(java.lang.String,int,int,int,int)
 is not applicable
  (actual and formal argument lists differ in length)
constructor 
org.apache.hadoop.yarn.server.MiniYARNCluster.MiniYARNCluster(java.lang.String,int,int,int)
 is not applicable
  (actual and formal argument lists differ in length)

{code}

\cc [~vinodkv]


> Tez Build with hadoop 2.6.1 fails due to MiniYarnCluster change
> ---
>
> Key: YARN-4154
> URL: https://issues.apache.org/jira/browse/YARN-4154
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jeff Zhang
>
> {code}
> [ERROR] 
> /mnt/nfs0/jzhang/tez-autobuild/tez/tez-plugins/tez-yarn-timeline-history/src/test/java/org/apache/tez/tests/MiniTezClusterWithTimeline.java:[92,5]
>  no suitable constructor found for 
> MiniYARNCluster(java.lang.String,int,int,int,int,boolean)
> constructor 
> org.apache.hadoop.yarn.server.MiniYARNCluster.MiniYARNCluster(java.lang.String,int,int,int,int)
>  is not applicable
>   (actual and formal argument lists differ in length)
> constructor 
> org.apache.hadoop.yarn.server.MiniYARNCluster.MiniYARNCluster(java.lang.String,int,int,int)
>  is not applicable
>   (actual and formal argument lists differ in length)
> {code}
> MR might have the same issue.
> \cc [~vinodkv]





[jira] [Updated] (YARN-4154) Tez Build with hadoop 2.6.1 fails due to MiniYarnCluster change

2015-09-13 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-4154:
-
Description: 
{code}
[ERROR] 
/mnt/nfs0/jzhang/tez-autobuild/tez/tez-plugins/tez-yarn-timeline-history/src/test/java/org/apache/tez/tests/MiniTezClusterWithTimeline.java:[92,5]
 no suitable constructor found for 
MiniYARNCluster(java.lang.String,int,int,int,int,boolean)
constructor 
org.apache.hadoop.yarn.server.MiniYARNCluster.MiniYARNCluster(java.lang.String,int,int,int,int)
 is not applicable
  (actual and formal argument lists differ in length)
constructor 
org.apache.hadoop.yarn.server.MiniYARNCluster.MiniYARNCluster(java.lang.String,int,int,int)
 is not applicable
  (actual and formal argument lists differ in length)

{code}

\cc [~vinodkv]

  was:
{code}
[ERROR] 
/mnt/nfs0/jzhang/tez-autobuild/tez/tez-plugins/tez-yarn-timeline-history/src/test/java/org/apache/tez/tests/MiniTezClusterWithTimeline.java:[92,5]
 no suitable constructor found for 
MiniYARNCluster(java.lang.String,int,int,int,int,boolean)
constructor 
org.apache.hadoop.yarn.server.MiniYARNCluster.MiniYARNCluster(java.lang.String,int,int,int,int)
 is not applicable
  (actual and formal argument lists differ in length)
constructor 
org.apache.hadoop.yarn.server.MiniYARNCluster.MiniYARNCluster(java.lang.String,int,int,int)
 is not applicable
  (actual and formal argument lists differ in length)

{code}


> Tez Build with hadoop 2.6.1 fails due to MiniYarnCluster change
> ---
>
> Key: YARN-4154
> URL: https://issues.apache.org/jira/browse/YARN-4154
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jeff Zhang
>
> {code}
> [ERROR] 
> /mnt/nfs0/jzhang/tez-autobuild/tez/tez-plugins/tez-yarn-timeline-history/src/test/java/org/apache/tez/tests/MiniTezClusterWithTimeline.java:[92,5]
>  no suitable constructor found for 
> MiniYARNCluster(java.lang.String,int,int,int,int,boolean)
> constructor 
> org.apache.hadoop.yarn.server.MiniYARNCluster.MiniYARNCluster(java.lang.String,int,int,int,int)
>  is not applicable
>   (actual and formal argument lists differ in length)
> constructor 
> org.apache.hadoop.yarn.server.MiniYARNCluster.MiniYARNCluster(java.lang.String,int,int,int)
>  is not applicable
>   (actual and formal argument lists differ in length)
> {code}
> \cc [~vinodkv]





[jira] [Updated] (YARN-4154) Tez Build with hadoop 2.6.1 fails due to MiniYarnCluster change

2015-09-13 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-4154:
-
Description: 
{code}
[ERROR] 
/mnt/nfs0/jzhang/tez-autobuild/tez/tez-plugins/tez-yarn-timeline-history/src/test/java/org/apache/tez/tests/MiniTezClusterWithTimeline.java:[92,5]
 no suitable constructor found for 
MiniYARNCluster(java.lang.String,int,int,int,int,boolean)
constructor 
org.apache.hadoop.yarn.server.MiniYARNCluster.MiniYARNCluster(java.lang.String,int,int,int,int)
 is not applicable
  (actual and formal argument lists differ in length)
constructor 
org.apache.hadoop.yarn.server.MiniYARNCluster.MiniYARNCluster(java.lang.String,int,int,int)
 is not applicable
  (actual and formal argument lists differ in length)

{code}

> Tez Build with hadoop 2.6.1 fails due to MiniYarnCluster change
> ---
>
> Key: YARN-4154
> URL: https://issues.apache.org/jira/browse/YARN-4154
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jeff Zhang
>
> {code}
> [ERROR] 
> /mnt/nfs0/jzhang/tez-autobuild/tez/tez-plugins/tez-yarn-timeline-history/src/test/java/org/apache/tez/tests/MiniTezClusterWithTimeline.java:[92,5]
>  no suitable constructor found for 
> MiniYARNCluster(java.lang.String,int,int,int,int,boolean)
> constructor 
> org.apache.hadoop.yarn.server.MiniYARNCluster.MiniYARNCluster(java.lang.String,int,int,int,int)
>  is not applicable
>   (actual and formal argument lists differ in length)
> constructor 
> org.apache.hadoop.yarn.server.MiniYARNCluster.MiniYARNCluster(java.lang.String,int,int,int)
>  is not applicable
>   (actual and formal argument lists differ in length)
> {code}





[jira] [Created] (YARN-4154) Tez Build with hadoop 2.6.1 fails due to MiniYarnCluster change

2015-09-13 Thread Jeff Zhang (JIRA)
Jeff Zhang created YARN-4154:


 Summary: Tez Build with hadoop 2.6.1 fails due to MiniYarnCluster 
change
 Key: YARN-4154
 URL: https://issues.apache.org/jira/browse/YARN-4154
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jeff Zhang








[jira] [Commented] (YARN-2262) Few fields displaying wrong values in Timeline server after RM restart

2015-08-18 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700867#comment-14700867
 ] 

Jeff Zhang commented on YARN-2262:
--

And the documentation needs to be updated for the deprecation of 
FileSystemApplicationHistoryStore.

http://hadoop.apache.org/docs/r2.6.0/hadoop-yarn/hadoop-yarn-site/TimelineServer.html



> Few fields displaying wrong values in Timeline server after RM restart
> --
>
> Key: YARN-2262
> URL: https://issues.apache.org/jira/browse/YARN-2262
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: 2.4.0
>Reporter: Nishan Shetty
>Assignee: Naganarasimha G R
> Attachments: Capture.PNG, Capture1.PNG, 
> yarn-testos-historyserver-HOST-10-18-40-95.log, 
> yarn-testos-resourcemanager-HOST-10-18-40-84.log, 
> yarn-testos-resourcemanager-HOST-10-18-40-95.log
>
>
> Few fields displaying wrong values in Timeline server after RM restart
> State:null
> FinalStatus:  UNDEFINED
> Started:  8-Jul-2014 14:58:08
> Elapsed:  2562047397789hrs, 44mins, 47sec 
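Not traced to the actual code path, but as an arithmetic sketch: an "Elapsed"
value of this magnitude is exactly what falls out if the elapsed time is
computed against an unset finish-time sentinel of Long.MAX_VALUE (class and
method names below are hypothetical, for illustration only):

{code}
public class ElapsedSentinel {
    // Hours between a start and finish timestamp, both in epoch milliseconds.
    static long elapsedHours(long startMillis, long finishMillis) {
        return (finishMillis - startMillis) / 3_600_000L;
    }

    public static void main(String[] args) {
        long start = 1404831488000L;  // roughly 8-Jul-2014 14:58:08 UTC
        long finish = Long.MAX_VALUE; // a plausible "not finished" sentinel
        // Prints a value on the order of 2562047397784 hours -- the same
        // magnitude as the broken "Elapsed" field above.
        System.out.println(elapsedHours(start, finish));
    }
}
{code}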





[jira] [Commented] (YARN-2262) Few fields displaying wrong values in Timeline server after RM restart

2015-08-18 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700861#comment-14700861
 ] 

Jeff Zhang commented on YARN-2262:
--

But my app still cannot be recovered. Does that mean YARN cannot recover a 
running app?
{code}
2015-08-18 15:18:35,270 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
Recovering attempt: appattempt_1439882258172_0001_01 with final state: null
2015-08-18 15:18:35,270 INFO 
org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: 
Create AMRMToken for ApplicationAttempt: appattempt_1439882258172_0001_01
2015-08-18 15:18:35,273 INFO 
org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: 
Creating password for appattempt_1439882258172_0001_01
2015-08-18 15:18:35,277 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
Application added - appId: application_1439882258172_0001 user: jzhang 
leaf-queue of parent: root #applications: 1
2015-08-18 15:18:35,278 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
 Accepted application application_1439882258172_0001 from user: jzhang, in 
queue: default
2015-08-18 15:18:35,278 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
appattempt_1439882258172_0001_01 State change from NEW to LAUNCHED
2015-08-18 15:18:35,278 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: 
application_1439882258172_0001 State change from NEW to ACCEPTED
{code}

{code}
2015-08-18 15:18:36,305 ERROR 
org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: 
Application attempt appattempt_1439882258172_0001_01 doesn't exist in 
ApplicationMasterService cache.
2015-08-18 15:18:36,306 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 
on 8030, call org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB.allocate 
from 192.168.3.3:56241 Call#56 Retry#0
org.apache.hadoop.yarn.exceptions.ApplicationAttemptNotFoundException: 
Application attempt appattempt_1439882258172_0001_01 doesn't exist in 
ApplicationMasterService cache.
at 
org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:436)
at 
org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
at 
org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
2015-08-18 15:18:37,298 INFO org.apache.hadoop.yarn.util.RackResolver: Resolved 
192.168.3.3 to /default-rack
{code}

> Few fields displaying wrong values in Timeline server after RM restart
> --
>
> Key: YARN-2262
> URL: https://issues.apache.org/jira/browse/YARN-2262
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: 2.4.0
>Reporter: Nishan Shetty
>Assignee: Naganarasimha G R
> Attachments: Capture.PNG, Capture1.PNG, 
> yarn-testos-historyserver-HOST-10-18-40-95.log, 
> yarn-testos-resourcemanager-HOST-10-18-40-84.log, 
> yarn-testos-resourcemanager-HOST-10-18-40-95.log
>
>
> Few fields displaying wrong values in Timeline server after RM restart
> State:null
> FinalStatus:  UNDEFINED
> Started:  8-Jul-2014 14:58:08
> Elapsed:  2562047397789hrs, 44mins, 47sec 





[jira] [Commented] (YARN-2262) Few fields displaying wrong values in Timeline server after RM restart

2015-08-18 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700845#comment-14700845
 ] 

Jeff Zhang commented on YARN-2262:
--

Please ignore my last comment; I finally found out how to use the ATS for 
storing history data.

{code}
private ApplicationHistoryManager createApplicationHistoryManager(
  Configuration conf) {
// Backward compatibility:
// APPLICATION_HISTORY_STORE is neither null nor empty, it means that the
// user has enabled it explicitly.
if (conf.get(YarnConfiguration.APPLICATION_HISTORY_STORE) == null ||
conf.get(YarnConfiguration.APPLICATION_HISTORY_STORE).length() == 0 ||
conf.get(YarnConfiguration.APPLICATION_HISTORY_STORE).equals(
NullApplicationHistoryStore.class.getName())) {
  return new ApplicationHistoryManagerOnTimelineStore(
  timelineDataManager, aclsManager);
} else {
  LOG.warn("The filesystem based application history store is deprecated.");
  return new ApplicationHistoryManagerImpl();
}
  }
{code}
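Based on that branch logic, leaving the store class unset (or set to
NullApplicationHistoryStore) selects the timeline-backed manager. A minimal
yarn-site.xml sketch (property names taken from 2.6-era defaults and worth
double-checking against your release):

{code}
<!-- Enable the timeline service and timeline-backed generic history. -->
<property>
  <name>yarn.timeline-service.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.timeline-service.generic-application-history.enabled</name>
  <value>true</value>
</property>
<!-- Have the RM publish app events/history to the timeline service. -->
<property>
  <name>yarn.resourcemanager.system-metrics-publisher.enabled</name>
  <value>true</value>
</property>
{code}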

> Few fields displaying wrong values in Timeline server after RM restart
> --
>
> Key: YARN-2262
> URL: https://issues.apache.org/jira/browse/YARN-2262
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: 2.4.0
>Reporter: Nishan Shetty
>Assignee: Naganarasimha G R
> Attachments: Capture.PNG, Capture1.PNG, 
> yarn-testos-historyserver-HOST-10-18-40-95.log, 
> yarn-testos-resourcemanager-HOST-10-18-40-84.log, 
> yarn-testos-resourcemanager-HOST-10-18-40-95.log
>
>
> Few fields displaying wrong values in Timeline server after RM restart
> State:null
> FinalStatus:  UNDEFINED
> Started:  8-Jul-2014 14:58:08
> Elapsed:  2562047397789hrs, 44mins, 47sec 





[jira] [Commented] (YARN-2262) Few fields displaying wrong values in Timeline server after RM restart

2015-08-18 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700830#comment-14700830
 ] 

Jeff Zhang commented on YARN-2262:
--

bq. No longer maintain FS based generic history store.
I can reproduce this issue easily by restarting the RM while an app is running. 
Checking YARN-2033, I do see that app history data can now be stored in the 
Timeline service. But it looks like there's no ATS implementation of 
ApplicationHistoryStore; FileSystemApplicationHistoryStore is still the only 
feasible one for RM recovery, so does it make sense to stop maintaining it? Or 
am I missing something? [~zjshen]  [~djp]


> Few fields displaying wrong values in Timeline server after RM restart
> --
>
> Key: YARN-2262
> URL: https://issues.apache.org/jira/browse/YARN-2262
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: 2.4.0
>Reporter: Nishan Shetty
>Assignee: Naganarasimha G R
> Attachments: Capture.PNG, Capture1.PNG, 
> yarn-testos-historyserver-HOST-10-18-40-95.log, 
> yarn-testos-resourcemanager-HOST-10-18-40-84.log, 
> yarn-testos-resourcemanager-HOST-10-18-40-95.log
>
>
> Few fields displaying wrong values in Timeline server after RM restart
> State:null
> FinalStatus:  UNDEFINED
> Started:  8-Jul-2014 14:58:08
> Elapsed:  2562047397789hrs, 44mins, 47sec 





[jira] [Updated] (YARN-3763) Support fuzzy search in ATS

2015-06-02 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-3763:
-
Description: Currently ATS only supports exact match. Sometimes fuzzy matching 
may be helpful when entities in the ATS share a common prefix or suffix.   
Link with TEZ-2531  (was: Currently ATS only support exact match. Sometimes 
fuzzy match may be helpful when the entities in the ATS has some common prefix 
or suffix.  )

> Support fuzzy search in ATS
> ---
>
> Key: YARN-3763
> URL: https://issues.apache.org/jira/browse/YARN-3763
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: timelineserver
>Affects Versions: 2.7.0
>Reporter: Jeff Zhang
>
> Currently ATS only supports exact match. Sometimes fuzzy matching may be 
> helpful when entities in the ATS share a common prefix or suffix.   Link with 
> TEZ-2531
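
The exact-vs-fuzzy matching described above can be sketched as follows. This is a hypothetical, minimal illustration of the requested improvement, not the actual ATS filter API; the function and entity names are invented for the example:

```python
# Hypothetical sketch of exact vs. fuzzy (prefix/suffix) entity filtering,
# illustrating the improvement requested for ATS. Not real ATS code.

def exact_match(entities, query):
    """Return entity IDs that equal the query exactly."""
    return [e for e in entities if e == query]

def fuzzy_match(entities, pattern):
    """Return entity IDs that start or end with the pattern."""
    return [e for e in entities
            if e.startswith(pattern) or e.endswith(pattern)]

entities = ["tez_dag_001", "tez_dag_002", "mr_job_003"]

# Exact match finds nothing unless the full ID is known.
print(exact_match(entities, "tez_dag"))   # []
# Fuzzy match finds all entities sharing the common prefix.
print(fuzzy_match(entities, "tez_dag"))   # ['tez_dag_001', 'tez_dag_002']
```

With IDs that share a common prefix (as Tez DAG entities do), a prefix/suffix match is enough to list related entities without knowing their full IDs.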





[jira] [Created] (YARN-3763) Support for fuzzy search in ATS

2015-06-02 Thread Jeff Zhang (JIRA)
Jeff Zhang created YARN-3763:


 Summary: Support for fuzzy search in ATS
 Key: YARN-3763
 URL: https://issues.apache.org/jira/browse/YARN-3763
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: timelineserver
Affects Versions: 2.7.0
Reporter: Jeff Zhang


Currently ATS only supports exact match. Sometimes fuzzy matching may be helpful 
when entities in the ATS share a common prefix or suffix.  





[jira] [Updated] (YARN-3763) Support fuzzy search in ATS

2015-06-02 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-3763:
-
Summary: Support fuzzy search in ATS  (was: Support for fuzzy search in ATS)

> Support fuzzy search in ATS
> ---
>
> Key: YARN-3763
> URL: https://issues.apache.org/jira/browse/YARN-3763
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: timelineserver
>Affects Versions: 2.7.0
>Reporter: Jeff Zhang
>
> Currently ATS only supports exact match. Sometimes fuzzy matching may be 
> helpful when entities in the ATS share a common prefix or suffix.  





[jira] [Commented] (YARN-3755) Log the command of launching containers

2015-06-02 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570276#comment-14570276
 ] 

Jeff Zhang commented on YARN-3755:
--

Closing it as Won't Fix.

> Log the command of launching containers
> ---
>
> Key: YARN-3755
> URL: https://issues.apache.org/jira/browse/YARN-3755
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.7.0
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
> Attachments: YARN-3755-1.patch, YARN-3755-2.patch
>
>
> In the ResourceManager log, YARN logs the command for launching the AM, which 
> is very useful. But there is no such log in the NM log for launching 
> containers, which makes it difficult to diagnose containers that fail to 
> launch due to some issue in the commands. Although users can look at the 
> commands in the container launch script file, that is an internal detail of 
> YARN that users usually don't know about. From the user's perspective, they 
> only know the commands they specify when building a YARN application. 
> {code}
> 2015-06-01 16:06:42,245 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command 
> to launch container container_1433145984561_0001_01_01 : 
> $JAVA_HOME/bin/java -server -Djava.net.preferIPv4Stack=true 
> -Dhadoop.metrics.log.level=WARN  -Xmx1024m  
> -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator 
> -Dlog4j.configuration=tez-container-log4j.properties 
> -Dyarn.app.container.log.dir= -Dtez.root.logger=info,CLA 
> -Dsun.nio.ch.bugLevel='' org.apache.tez.dag.app.DAGAppMaster 
> 1>/stdout 2>/stderr
> {code}
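
The kind of logging requested above — joining the container's command list into one line and logging it at launch time, the way the RM already does for the AM — can be sketched as below. This is an illustrative stand-in, not the actual AMLauncher/NodeManager code; the function name and command list are hypothetical:

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("ContainerLaunch")

def log_launch_command(container_id, commands):
    """Join the launch command list into one line and log it, so container
    launch failures can be diagnosed from the daemon log."""
    cmd_line = " ".join(commands)
    log.info("Command to launch container %s : %s", container_id, cmd_line)
    return cmd_line

# Hypothetical command list, modeled on the Tez example above.
log_launch_command(
    "container_1433145984561_0001_01_01",
    ["$JAVA_HOME/bin/java", "-Xmx1024m",
     "org.apache.tez.dag.app.DAGAppMaster",
     "1>/stdout", "2>/stderr"])
```

Returning the joined line also lets the caller attach it to diagnostics if the launch fails.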





[jira] [Commented] (YARN-3755) Log the command of launching containers

2015-06-02 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570275#comment-14570275
 ] 

Jeff Zhang commented on YARN-3755:
--

bq. How about we let individual frameworks like MapReduce/Tez log them as 
needed? That seems like the right place for debugging too - app developers 
don't always get access to the daemon logs.
Makes sense. 

> Log the command of launching containers
> ---
>
> Key: YARN-3755
> URL: https://issues.apache.org/jira/browse/YARN-3755
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.7.0
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
> Attachments: YARN-3755-1.patch, YARN-3755-2.patch
>
>
> In the ResourceManager log, YARN logs the command for launching the AM, which 
> is very useful. But there is no such log in the NM log for launching 
> containers, which makes it difficult to diagnose containers that fail to 
> launch due to some issue in the commands. Although users can look at the 
> commands in the container launch script file, that is an internal detail of 
> YARN that users usually don't know about. From the user's perspective, they 
> only know the commands they specify when building a YARN application. 
> {code}
> 2015-06-01 16:06:42,245 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command 
> to launch container container_1433145984561_0001_01_01 : 
> $JAVA_HOME/bin/java -server -Djava.net.preferIPv4Stack=true 
> -Dhadoop.metrics.log.level=WARN  -Xmx1024m  
> -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator 
> -Dlog4j.configuration=tez-container-log4j.properties 
> -Dyarn.app.container.log.dir= -Dtez.root.logger=info,CLA 
> -Dsun.nio.ch.bugLevel='' org.apache.tez.dag.app.DAGAppMaster 
> 1>/stdout 2>/stderr
> {code}





[jira] [Updated] (YARN-3755) Log the command of launching containers

2015-06-02 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-3755:
-
Attachment: YARN-3755-2.patch

Uploaded a new patch to address the checkstyle issue. 

> Log the command of launching containers
> ---
>
> Key: YARN-3755
> URL: https://issues.apache.org/jira/browse/YARN-3755
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.7.0
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
> Attachments: YARN-3755-1.patch, YARN-3755-2.patch
>
>
> In the ResourceManager log, YARN logs the command for launching the AM, which 
> is very useful. But there is no such log in the NM log for launching 
> containers, which makes it difficult to diagnose containers that fail to 
> launch due to some issue in the commands. Although users can look at the 
> commands in the container launch script file, that is an internal detail of 
> YARN that users usually don't know about. From the user's perspective, they 
> only know the commands they specify when building a YARN application. 
> {code}
> 2015-06-01 16:06:42,245 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command 
> to launch container container_1433145984561_0001_01_01 : 
> $JAVA_HOME/bin/java -server -Djava.net.preferIPv4Stack=true 
> -Dhadoop.metrics.log.level=WARN  -Xmx1024m  
> -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator 
> -Dlog4j.configuration=tez-container-log4j.properties 
> -Dyarn.app.container.log.dir= -Dtez.root.logger=info,CLA 
> -Dsun.nio.ch.bugLevel='' org.apache.tez.dag.app.DAGAppMaster 
> 1>/stdout 2>/stderr
> {code}





[jira] [Updated] (YARN-3755) Log the command of launching containers

2015-06-01 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-3755:
-
Target Version/s: 2.7.1

> Log the command of launching containers
> ---
>
> Key: YARN-3755
> URL: https://issues.apache.org/jira/browse/YARN-3755
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.7.0
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
> Attachments: YARN-3755-1.patch
>
>
> In the ResourceManager log, YARN logs the command for launching the AM, which 
> is very useful. But there is no such log in the NM log for launching 
> containers, which makes it difficult to diagnose containers that fail to 
> launch due to some issue in the commands. Although users can look at the 
> commands in the container launch script file, that is an internal detail of 
> YARN that users usually don't know about. From the user's perspective, they 
> only know the commands they specify when building a YARN application. 
> {code}
> 2015-06-01 16:06:42,245 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command 
> to launch container container_1433145984561_0001_01_01 : 
> $JAVA_HOME/bin/java -server -Djava.net.preferIPv4Stack=true 
> -Dhadoop.metrics.log.level=WARN  -Xmx1024m  
> -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator 
> -Dlog4j.configuration=tez-container-log4j.properties 
> -Dyarn.app.container.log.dir= -Dtez.root.logger=info,CLA 
> -Dsun.nio.ch.bugLevel='' org.apache.tez.dag.app.DAGAppMaster 
> 1>/stdout 2>/stderr
> {code}





[jira] [Updated] (YARN-3755) Log the command of launching containers

2015-06-01 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-3755:
-
Attachment: YARN-3755-1.patch

> Log the command of launching containers
> ---
>
> Key: YARN-3755
> URL: https://issues.apache.org/jira/browse/YARN-3755
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
> Attachments: YARN-3755-1.patch
>
>
> In the ResourceManager log, YARN logs the command for launching the AM, which 
> is very useful. But there is no such log in the NM log for launching 
> containers, which makes it difficult to diagnose containers that fail to 
> launch due to some issue in the commands. Although users can look at the 
> commands in the container launch script file, that is an internal detail of 
> YARN that users usually don't know about. From the user's perspective, they 
> only know the commands they specify when building a YARN application. 
> {code}
> 2015-06-01 16:06:42,245 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command 
> to launch container container_1433145984561_0001_01_01 : 
> $JAVA_HOME/bin/java -server -Djava.net.preferIPv4Stack=true 
> -Dhadoop.metrics.log.level=WARN  -Xmx1024m  
> -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator 
> -Dlog4j.configuration=tez-container-log4j.properties 
> -Dyarn.app.container.log.dir= -Dtez.root.logger=info,CLA 
> -Dsun.nio.ch.bugLevel='' org.apache.tez.dag.app.DAGAppMaster 
> 1>/stdout 2>/stderr
> {code}





[jira] [Updated] (YARN-3755) Log the command of launching containers

2015-06-01 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-3755:
-
Description: 
In the ResourceManager log, YARN logs the command for launching the AM, which is 
very useful. But there is no such log in the NM log for launching containers, 
which makes it difficult to diagnose containers that fail to launch due to some 
issue in the commands. Although users can look at the commands in the container 
launch script file, that is an internal detail of YARN that users usually don't 
know about. From the user's perspective, they only know the commands they 
specify when building a YARN application. 

{code}
2015-06-01 16:06:42,245 INFO 
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command to 
launch container container_1433145984561_0001_01_01 : $JAVA_HOME/bin/java 
-server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN  
-Xmx1024m  -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator 
-Dlog4j.configuration=tez-container-log4j.properties 
-Dyarn.app.container.log.dir= -Dtez.root.logger=info,CLA 
-Dsun.nio.ch.bugLevel='' org.apache.tez.dag.app.DAGAppMaster 1>/stdout 
2>/stderr
{code}

  was:
In the resource manager, yarn would log the command for launching AM, this is 
very useful. But there's no such log in the NN log for launching containers. It 
would be difficult to diagnose when containers fails to launch due to some 
issue in the commands. Although use can look at the commands in the container 
launch script file, this is an internal things of yarn, usually user don't know 
that. In user's perspective, they only know what command they specify when 
building yarn application. 

{code}
2015-06-01 16:06:42,245 INFO 
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command to 
launch container container_1433145984561_0001_01_01 : $JAVA_HOME/bin/java 
-server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN  
-Xmx1024m  -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator 
-Dlog4j.configuration=tez-container-log4j.properties 
-Dyarn.app.container.log.dir= -Dtez.root.logger=info,CLA 
-Dsun.nio.ch.bugLevel='' org.apache.tez.dag.app.DAGAppMaster 1>/stdout 
2>/stderr
{code}


> Log the command of launching containers
> ---
>
> Key: YARN-3755
> URL: https://issues.apache.org/jira/browse/YARN-3755
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
> Attachments: YARN-3755-1.patch
>
>
> In the ResourceManager log, YARN logs the command for launching the AM, which 
> is very useful. But there is no such log in the NM log for launching 
> containers, which makes it difficult to diagnose containers that fail to 
> launch due to some issue in the commands. Although users can look at the 
> commands in the container launch script file, that is an internal detail of 
> YARN that users usually don't know about. From the user's perspective, they 
> only know the commands they specify when building a YARN application. 
> {code}
> 2015-06-01 16:06:42,245 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command 
> to launch container container_1433145984561_0001_01_01 : 
> $JAVA_HOME/bin/java -server -Djava.net.preferIPv4Stack=true 
> -Dhadoop.metrics.log.level=WARN  -Xmx1024m  
> -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator 
> -Dlog4j.configuration=tez-container-log4j.properties 
> -Dyarn.app.container.log.dir= -Dtez.root.logger=info,CLA 
> -Dsun.nio.ch.bugLevel='' org.apache.tez.dag.app.DAGAppMaster 
> 1>/stdout 2>/stderr
> {code}





[jira] [Created] (YARN-3755) Log the command of launching containers

2015-06-01 Thread Jeff Zhang (JIRA)
Jeff Zhang created YARN-3755:


 Summary: Log the command of launching containers
 Key: YARN-3755
 URL: https://issues.apache.org/jira/browse/YARN-3755
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jeff Zhang


In the ResourceManager log, YARN logs the command for launching the AM, which is 
very useful. But there is no such log in the NM log for launching containers, 
which makes it difficult to diagnose containers that fail to launch due to some 
issue in the commands. Although users can look at the commands in the container 
launch script file, that is an internal detail of YARN that users usually don't 
know about. From the user's perspective, they only know the commands they 
specify when building a YARN application. 

{code}
2015-06-01 16:06:42,245 INFO 
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command to 
launch container container_1433145984561_0001_01_01 : $JAVA_HOME/bin/java 
-server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN  
-Xmx1024m  -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator 
-Dlog4j.configuration=tez-container-log4j.properties 
-Dyarn.app.container.log.dir= -Dtez.root.logger=info,CLA 
-Dsun.nio.ch.bugLevel='' org.apache.tez.dag.app.DAGAppMaster 1>/stdout 
2>/stderr
{code}





[jira] [Assigned] (YARN-3755) Log the command of launching containers

2015-06-01 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang reassigned YARN-3755:


Assignee: Jeff Zhang

> Log the command of launching containers
> ---
>
> Key: YARN-3755
> URL: https://issues.apache.org/jira/browse/YARN-3755
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
>
> In the ResourceManager log, YARN logs the command for launching the AM, which 
> is very useful. But there is no such log in the NM log for launching 
> containers, which makes it difficult to diagnose containers that fail to 
> launch due to some issue in the commands. Although users can look at the 
> commands in the container launch script file, that is an internal detail of 
> YARN that users usually don't know about. From the user's perspective, they 
> only know the commands they specify when building a YARN application. 
> {code}
> 2015-06-01 16:06:42,245 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command 
> to launch container container_1433145984561_0001_01_01 : 
> $JAVA_HOME/bin/java -server -Djava.net.preferIPv4Stack=true 
> -Dhadoop.metrics.log.level=WARN  -Xmx1024m  
> -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator 
> -Dlog4j.configuration=tez-container-log4j.properties 
> -Dyarn.app.container.log.dir= -Dtez.root.logger=info,CLA 
> -Dsun.nio.ch.bugLevel='' org.apache.tez.dag.app.DAGAppMaster 
> 1>/stdout 2>/stderr
> {code}





[jira] [Updated] (YARN-1609) Add Service Container type to NodeManager in YARN

2015-03-09 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-1609:
-
Assignee: (was: Jeff Zhang)

> Add Service Container type to NodeManager in YARN
> -
>
> Key: YARN-1609
> URL: https://issues.apache.org/jira/browse/YARN-1609
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.2.0
>Reporter: Wangda Tan (No longer used)
> Attachments: Add Service Container type to NodeManager in YARN-V1.pdf
>
>
> From our work to support running OpenMPI on YARN (MAPREDUCE-2911), we found 
> that it’s important to have a framework-specific daemon process manage the 
> tasks on each node directly. The daemon process, most likely similar in other 
> frameworks as well, provides critical services to tasks running on that node 
> (for example “wireup”, spawning user processes in large numbers at once, etc.). 
> In YARN, it’s hard, if not impossible, to have those processes managed by 
> YARN. 
> We propose to extend the container model on the NodeManager side to support a 
> “Service Container” to run/manage such framework daemon/service processes. We 
> believe this would be very useful to other application framework developers as well.





[jira] [Commented] (YARN-3171) Sort by application id doesn't work in ATS web ui

2015-02-10 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315680#comment-14315680
 ] 

Jeff Zhang commented on YARN-3171:
--

[~Naganarasimha] Please go ahead.

> Sort by application id doesn't work in ATS web ui
> -
>
> Key: YARN-3171
> URL: https://issues.apache.org/jira/browse/YARN-3171
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.6.0
>Reporter: Jeff Zhang
>Assignee: Naganarasimha G R
>Priority: Minor
> Attachments: ats_webui.png
>
>
> The order doesn't change when I click the column header





[jira] [Updated] (YARN-3171) Sort by application id doesn't work in ATS web ui

2015-02-10 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-3171:
-
Attachment: ats_webui.png

Attached a screenshot.

> Sort by application id doesn't work in ATS web ui
> -
>
> Key: YARN-3171
> URL: https://issues.apache.org/jira/browse/YARN-3171
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.6.0
>Reporter: Jeff Zhang
>Assignee: Naganarasimha G R
> Attachments: ats_webui.png
>
>
> The order doesn't change when I click the column header





[jira] [Updated] (YARN-3171) Sort by application id doesn't work in ATS web ui

2015-02-10 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-3171:
-
Priority: Minor  (was: Major)

> Sort by application id doesn't work in ATS web ui
> -
>
> Key: YARN-3171
> URL: https://issues.apache.org/jira/browse/YARN-3171
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.6.0
>Reporter: Jeff Zhang
>Assignee: Naganarasimha G R
>Priority: Minor
> Attachments: ats_webui.png
>
>
> The order doesn't change when I click the column header





[jira] [Updated] (YARN-3171) Sort by application id doesn't work in ATS web ui

2015-02-10 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-3171:
-
Summary: Sort by application id doesn't work in ATS web ui  (was: Sort by 
application id don't work in ATS web ui)

> Sort by application id doesn't work in ATS web ui
> -
>
> Key: YARN-3171
> URL: https://issues.apache.org/jira/browse/YARN-3171
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.6.0
>Reporter: Jeff Zhang
>
> The order doesn't change when I click the column header





[jira] [Created] (YARN-3171) Sort by application id don't work in ATS web ui

2015-02-10 Thread Jeff Zhang (JIRA)
Jeff Zhang created YARN-3171:


 Summary: Sort by application id don't work in ATS web ui
 Key: YARN-3171
 URL: https://issues.apache.org/jira/browse/YARN-3171
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineserver
Affects Versions: 2.6.0
Reporter: Jeff Zhang


The order doesn't change when I click the column header






[jira] [Updated] (YARN-1743) Decorate event transitions and the event-types with their behaviour

2015-01-26 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-1743:
-
Attachment: YARN-1743-3.patch

[~leftnoteasy] Attached the updated patch with the Apache license header added.


> Decorate event transitions and the event-types with their behaviour
> ---
>
> Key: YARN-1743
> URL: https://issues.apache.org/jira/browse/YARN-1743
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Jeff Zhang
>  Labels: documentation
> Attachments: NodeManager.gv, NodeManager.pdf, YARN-1743-2.patch, 
> YARN-1743-3.patch, YARN-1743.patch
>
>
> Helps to annotate the transitions with (start-state, end-state) pair and the 
> events with (source, destination) pair.
> Not just readability, we may also use them to generate the event diagrams 
> across components.
> Not a blocker for 0.23, but let's see.





[jira] [Commented] (YARN-1743) Decorate event transitions and the event-types with their behaviour

2015-01-22 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287192#comment-14287192
 ] 

Jeff Zhang commented on YARN-1743:
--

[~leftnoteasy] Uploaded a new patch:
* Changed the annotation type to Class
* Added more javadoc to explain the usage of the 2 annotations
* The patch only uses the annotations on ApplicationEventType; for the other 
events we can create follow-up JIRAs.
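
The idea of decorating transitions with their (start-state, end-state) pair so that event diagrams can be generated from the metadata can be sketched with a decorator-style registry. This is a conceptual illustration only, not the YARN-1743 patch; all names below are hypothetical:

```python
# Conceptual sketch: annotate each transition handler with its
# (start-state, end-state) pair and collect them in a registry,
# which can then drive diagram generation. Hypothetical names.

TRANSITIONS = []

def transition(start, end):
    """Decorator that tags a transition handler with its state pair
    and registers it for later inspection."""
    def wrap(fn):
        fn.states = (start, end)
        TRANSITIONS.append((start, end, fn.__name__))
        return fn
    return wrap

@transition("NEW", "INITING")
def app_init(event):
    pass

@transition("INITING", "RUNNING")
def app_start(event):
    pass

# The registry can now be turned into e.g. Graphviz edges:
edges = ["%s -> %s  // %s" % t for t in TRANSITIONS]
print("\n".join(edges))
```

The same registry that improves readability also makes the event diagram a generated artifact rather than hand-maintained documentation, which is the motivation stated in the issue.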

> Decorate event transitions and the event-types with their behaviour
> ---
>
> Key: YARN-1743
> URL: https://issues.apache.org/jira/browse/YARN-1743
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Jeff Zhang
>  Labels: documentation
> Attachments: NodeManager.gv, NodeManager.pdf, YARN-1743-2.patch, 
> YARN-1743.patch
>
>
> Helps to annotate the transitions with (start-state, end-state) pair and the 
> events with (source, destination) pair.
> Not just readability, we may also use them to generate the event diagrams 
> across components.
> Not a blocker for 0.23, but let's see.





[jira] [Updated] (YARN-1743) Decorate event transitions and the event-types with their behaviour

2015-01-22 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-1743:
-
Attachment: YARN-1743-2.patch

> Decorate event transitions and the event-types with their behaviour
> ---
>
> Key: YARN-1743
> URL: https://issues.apache.org/jira/browse/YARN-1743
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Jeff Zhang
>  Labels: documentation
> Attachments: NodeManager.gv, NodeManager.pdf, YARN-1743-2.patch, 
> YARN-1743.patch
>
>
> Helps to annotate the transitions with (start-state, end-state) pair and the 
> events with (source, destination) pair.
> Not just readability, we may also use them to generate the event diagrams 
> across components.
> Not a blocker for 0.23, but let's see.





[jira] [Commented] (YARN-3000) YARN_PID_DIR should be visible in yarn-env.sh

2015-01-05 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265426#comment-14265426
 ] 

Jeff Zhang commented on YARN-3000:
--

[~aw] Thanks for the clarification; then it makes sense to deprecate 
YARN_PID_DIR. 

> YARN_PID_DIR should be visible in yarn-env.sh
> -
>
> Key: YARN-3000
> URL: https://issues.apache.org/jira/browse/YARN-3000
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 2.6.0
>Reporter: Jeff Zhang
>Assignee: Rohith
>Priority: Minor
> Attachments: 0001-YARN-3000.patch
>
>
> Currently YARN_PID_DIR only shows up in yarn-daemon.sh, which is not the 
> place for users to set environment variables. IMO, yarn-env.sh is the place 
> for users to set environment variables, just like hadoop-env.sh, so it's 
> better to put YARN_PID_DIR into yarn-env.sh (it can be added as a comment, 
> just like YARN_RESOURCEMANAGER_HEAPSIZE).
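
What the request amounts to is a documented entry in yarn-env.sh, mirroring how YARN_RESOURCEMANAGER_HEAPSIZE is presented there. The snippet below is a sketch of such an entry, not the shipped file contents; the default path is invented for illustration:

```shell
# Sketch of a possible yarn-env.sh entry (illustrative only).
# Directory where YARN daemon pid files are stored. Uncomment to override;
# the parameter-expansion default keeps any value already set in the env.
export YARN_PID_DIR="${YARN_PID_DIR:-/tmp/yarn-pids}"
echo "YARN_PID_DIR=$YARN_PID_DIR"
```

Keeping the variable visible (even commented out) in yarn-env.sh means users discover it without reading yarn-daemon.sh internals.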





[jira] [Commented] (YARN-3000) YARN_PID_DIR should be visible in yarn-env.sh

2015-01-05 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14264373#comment-14264373
 ] 

Jeff Zhang commented on YARN-3000:
--

BTW, what's the JIRA for making HADOOP_PID_DIR replace YARN_PID_DIR? As far as 
I know, hadoop-2.6 didn't do that. 

> YARN_PID_DIR should be visible in yarn-env.sh
> -
>
> Key: YARN-3000
> URL: https://issues.apache.org/jira/browse/YARN-3000
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 2.6.0
>Reporter: Jeff Zhang
>Assignee: Rohith
>Priority: Minor
> Attachments: 0001-YARN-3000.patch
>
>
> Currently YARN_PID_DIR only shows up in yarn-daemon.sh, which is not the 
> place for users to set environment variables. IMO, yarn-env.sh is the place 
> for users to set environment variables, just like hadoop-env.sh, so it's 
> better to put YARN_PID_DIR into yarn-env.sh (it can be added as a comment, 
> just like YARN_RESOURCEMANAGER_HEAPSIZE).





[jira] [Commented] (YARN-3000) YARN_PID_DIR should be visible in yarn-env.sh

2015-01-05 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14264371#comment-14264371
 ] 

Jeff Zhang commented on YARN-3000:
--

It makes sense to mark YARN_PID_DIR as deprecated if HADOOP_PID_DIR is used for 
YARN in trunk. 

> YARN_PID_DIR should be visible in yarn-env.sh
> -
>
> Key: YARN-3000
> URL: https://issues.apache.org/jira/browse/YARN-3000
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 2.6.0
>Reporter: Jeff Zhang
>Assignee: Rohith
>Priority: Minor
> Attachments: 0001-YARN-3000.patch
>
>
> Currently YARN_PID_DIR only shows up in yarn-daemon.sh, which is not the 
> place for users to set environment variables. IMO, yarn-env.sh is the place 
> for users to set environment variables, just like hadoop-env.sh, so it's 
> better to put YARN_PID_DIR into yarn-env.sh (it can be added as a comment, 
> just like YARN_RESOURCEMANAGER_HEAPSIZE).





[jira] [Commented] (YARN-3000) YARN_PID_DIR should be visible in yarn-env.sh

2014-12-31 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14262022#comment-14262022
 ] 

Jeff Zhang commented on YARN-3000:
--

Sure, please take it over. 

> YARN_PID_DIR should be visible in yarn-env.sh
> -
>
> Key: YARN-3000
> URL: https://issues.apache.org/jira/browse/YARN-3000
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 2.6.0
>Reporter: Jeff Zhang
>Priority: Minor
>
> Currently YARN_PID_DIR only shows up in yarn-daemon.sh, which is not the 
> place for users to set environment variables. IMO, yarn-env.sh is the place 
> for users to set environment variables, just like hadoop-env.sh, so it's 
> better to put YARN_PID_DIR into yarn-env.sh (it can be added as a comment, 
> just like YARN_RESOURCEMANAGER_HEAPSIZE).





[jira] [Commented] (YARN-3000) YARN_PID_DIR should be visible in yarn-env.sh

2014-12-31 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14262021#comment-14262021
 ] 

Jeff Zhang commented on YARN-3000:
--

Sure, please take it over. 

> YARN_PID_DIR should be visible in yarn-env.sh
> -
>
> Key: YARN-3000
> URL: https://issues.apache.org/jira/browse/YARN-3000
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 2.6.0
>Reporter: Jeff Zhang
>Priority: Minor
>
> Currently YARN_PID_DIR only shows up in yarn-daemon.sh, which is not the 
> place for users to set environment variables. IMO, yarn-env.sh is the place 
> for users to set environment variables, just like hadoop-env.sh, so it's 
> better to put YARN_PID_DIR into yarn-env.sh (it can be added as a comment, 
> just like YARN_RESOURCEMANAGER_HEAPSIZE).





[jira] [Created] (YARN-3000) YARN_PID_DIR should be visible in yarn-env.sh

2014-12-30 Thread Jeff Zhang (JIRA)
Jeff Zhang created YARN-3000:


 Summary: YARN_PID_DIR should be visible in yarn-env.sh
 Key: YARN-3000
 URL: https://issues.apache.org/jira/browse/YARN-3000
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scripts
Affects Versions: 2.6.0
Reporter: Jeff Zhang
Priority: Minor


Currently I see YARN_PID_DIR only showing up in yarn-daemon.sh, which is not 
the place for users to set environment variables. IMO, yarn-env.sh is the place 
for users to set environment variables, just like hadoop-env.sh, so it's better 
to put YARN_PID_DIR into yarn-env.sh (it can be added as a comment, just like 
YARN_RESOURCEMANAGER_HEAPSIZE).
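As a sketch of what the report asks for, yarn-env.sh could document the variable as a commented-out default, mirroring how YARN_RESOURCEMANAGER_HEAPSIZE is presented (the path below is an illustrative assumption, not a real default):

```shell
# Directory in which the pid files for the YARN daemons are stored.
# Uncomment and adjust to override; the path here is an illustrative example.
# export YARN_PID_DIR=/var/run/yarn
```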





[jira] [Commented] (YARN-2560) Diagnostics is delayed to passed to ApplicationReport

2014-12-30 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261855#comment-14261855
 ] 

Jeff Zhang commented on YARN-2560:
--

Checked RMAppImpl & RMAppAttemptImpl and found that FinalApplicationStatus is 
retrieved from the currentAttempt, while diagnostics is retrieved from the 
RMApp, which is not updated until it receives the AttemptFinishedEvent. This is 
the root cause that sometimes makes the FinalApplicationStatus and diagnostics 
inconsistent.
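Given that root cause, a client cannot assume the first FAILED report already carries diagnostics. The sketch below is a hypothetical client-side workaround (the helper names are mine; `Report` stands in for ApplicationReport, and in a real client each poll would come from `yarnClient.getApplicationReport(appId)`): keep polling until the terminal status arrives together with non-empty diagnostics.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class DiagnosticsPoll {
    // Stand-in for ApplicationReport: only the two fields discussed above.
    static class Report {
        final String status;
        final String diagnostics;
        Report(String status, String diagnostics) {
            this.status = status;
            this.diagnostics = diagnostics;
        }
    }

    // Returns the first report whose FAILED status carries non-empty
    // diagnostics, or the last report seen if none does within maxAttempts.
    static Report awaitFinalReport(Iterator<Report> polls, int maxAttempts) {
        Report last = null;
        for (int i = 0; i < maxAttempts && polls.hasNext(); i++) {
            last = polls.next();
            if ("FAILED".equals(last.status) && !last.diagnostics.isEmpty()) {
                return last;
            }
        }
        return last;
    }

    public static void main(String[] args) {
        // Simulate the race described above: status flips to FAILED one
        // poll before diagnostics shows up.
        List<Report> seen = Arrays.asList(
            new Report("RUNNING", ""),
            new Report("FAILED", ""),
            new Report("FAILED", "AM container exited"));
        Report r = awaitFinalReport(seen.iterator(), 10);
        System.out.println(r.status + " / " + r.diagnostics);  // FAILED / AM container exited
    }
}
```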

> Diagnostics is delayed to passed to ApplicationReport
> -
>
> Key: YARN-2560
> URL: https://issues.apache.org/jira/browse/YARN-2560
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jeff Zhang
>
> The diagnostics of an application may be passed to ApplicationReport with a 
> delay. Here's one example where the ApplicationStatus has changed to FAILED, 
> but the diagnostics is still empty; the next call to getApplicationReport 
> gets the diagnostics.
> {code}
> while (true) {
>     appReport = yarnClient.getApplicationReport(appId);
>     Thread.sleep(1000);
>     LOG.info("AppStatus:" + appReport.getFinalApplicationStatus());
>     LOG.info("Diagnostics:" + appReport.getDiagnostics());
> }
> {code}
> *Output:*
> {code}
> AppStatus:FAILED
> Diagnostics: // empty
> // get diagnostics for the next getApplicationReport
> AppStatus:FAILED
> Diagnostics: // diagnostics info here
> {code}





[jira] [Updated] (YARN-1197) Support changing resources of an allocated container

2014-12-10 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-1197:
-
Assignee: (was: Jeff Zhang)

> Support changing resources of an allocated container
> 
>
> Key: YARN-1197
> URL: https://issues.apache.org/jira/browse/YARN-1197
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: api, nodemanager, resourcemanager
>Affects Versions: 2.1.0-beta
>Reporter: Wangda Tan
> Attachments: mapreduce-project.patch.ver.1, 
> tools-project.patch.ver.1, yarn-1197-scheduler-v1.pdf, yarn-1197-v2.pdf, 
> yarn-1197-v3.pdf, yarn-1197-v4.pdf, yarn-1197-v5.pdf, yarn-1197.pdf, 
> yarn-api-protocol.patch.ver.1, yarn-pb-impl.patch.ver.1, 
> yarn-server-common.patch.ver.1, yarn-server-nodemanager.patch.ver.1, 
> yarn-server-resourcemanager.patch.ver.1
>
>
> The current YARN resource management logic assumes the resources allocated to 
> a container are fixed during its lifetime. When users want to change the 
> resources of an allocated container, the only way is to release it and 
> allocate a new container with the expected size.
> Allowing run-time changes to the resources of an allocated container will 
> give us better control of resource usage on the application side.





[jira] [Updated] (YARN-2750) Allow StateMachine has callback when transition fail

2014-10-27 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-2750:
-
Attachment: YARN-2750-2.patch

> Allow StateMachine has callback when transition fail
> 
>
> Key: YARN-2750
> URL: https://issues.apache.org/jira/browse/YARN-2750
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.5.1
>Reporter: Jeff Zhang
> Attachments: YARN-2750-2.patch, YARN-2750.patch
>
>
> We have a situation where a Transition may sometimes fail, but we don't want 
> to handle the failure in each Transition; we'd like to handle it in one 
> centralized place. Allowing the StateMachine to have a callback would be good 
> for us.
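The idea can be sketched as follows. This is a hypothetical miniature, not the Hadoop `StateMachineFactory` API (the class and method names are illustrative): every transition failure is routed to one centralized callback supplied at construction time instead of being handled inside each Transition.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;

public class CallbackStateMachine {
    interface Transition {
        String apply(String state, String event) throws Exception;
    }

    private String state;
    private final Map<String, Transition> transitions = new HashMap<>();
    private final BiConsumer<String, Exception> onFailure;

    public CallbackStateMachine(String initial,
                                BiConsumer<String, Exception> onFailure) {
        this.state = initial;
        this.onFailure = onFailure;
    }

    public String state() {
        return state;
    }

    public void addTransition(String from, String event, Transition t) {
        transitions.put(from + "/" + event, t);
    }

    // On success the state advances; on any exception the centralized
    // callback fires and the state is left unchanged.
    public String handle(String event) {
        Transition t = transitions.get(state + "/" + event);
        try {
            if (t == null) {
                throw new IllegalStateException(
                    "invalid event " + event + " in state " + state);
            }
            state = t.apply(state, event);
        } catch (Exception e) {
            onFailure.accept(event, e);
        }
        return state;
    }

    public static void main(String[] args) {
        StringBuilder failed = new StringBuilder();
        CallbackStateMachine sm =
            new CallbackStateMachine("NEW", (ev, e) -> failed.append(ev));
        sm.addTransition("NEW", "START", (s, ev) -> "RUNNING");
        sm.addTransition("RUNNING", "CRASH", (s, ev) -> {
            throw new RuntimeException("transition failed");
        });
        sm.handle("START");   // NEW -> RUNNING
        sm.handle("CRASH");   // throws; handled by the one callback
        System.out.println(sm.state() + " " + failed);  // RUNNING CRASH
    }
}
```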





[jira] [Updated] (YARN-2750) Allow StateMachine has callback when transition fail

2014-10-27 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-2750:
-
Attachment: YARN-2750.patch

Attached a patch for initial review. 

> Allow StateMachine has callback when transition fail
> 
>
> Key: YARN-2750
> URL: https://issues.apache.org/jira/browse/YARN-2750
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.5.1
>Reporter: Jeff Zhang
> Attachments: YARN-2750.patch
>
>
> We have a situation where a Transition may sometimes fail, but we don't want 
> to handle the failure in each Transition; we'd like to handle it in one 
> centralized place. Allowing the StateMachine to have a callback would be good 
> for us.





[jira] [Updated] (YARN-2750) Allow StateMachine has callback when transition fail

2014-10-27 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-2750:
-
Affects Version/s: 2.5.1

> Allow StateMachine has callback when transition fail
> 
>
> Key: YARN-2750
> URL: https://issues.apache.org/jira/browse/YARN-2750
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.5.1
>Reporter: Jeff Zhang
>
> We have a situation where a Transition may sometimes fail, but we don't want 
> to handle the failure in each Transition; we'd like to handle it in one 
> centralized place. Allowing the StateMachine to have a callback would be good 
> for us.





[jira] [Updated] (YARN-2750) Allow StateMachine has callback when transition fail

2014-10-27 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-2750:
-
Description: We have a situation where a Transition may sometimes fail, but we 
don't want to handle the failure in each Transition; we'd like to handle it in 
one centralized place. Allowing the StateMachine to have a callback would be 
good for us.

> Allow StateMachine has callback when transition fail
> 
>
> Key: YARN-2750
> URL: https://issues.apache.org/jira/browse/YARN-2750
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jeff Zhang
>
> We have a situation where a Transition may sometimes fail, but we don't want 
> to handle the failure in each Transition; we'd like to handle it in one 
> centralized place. Allowing the StateMachine to have a callback would be good 
> for us.





[jira] [Created] (YARN-2750) Allow StateMachine has callback when transition fail

2014-10-27 Thread Jeff Zhang (JIRA)
Jeff Zhang created YARN-2750:


 Summary: Allow StateMachine has callback when transition fail
 Key: YARN-2750
 URL: https://issues.apache.org/jira/browse/YARN-2750
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Jeff Zhang








[jira] [Created] (YARN-2560) Diagnostics is delayed to passed to ApplicationReport

2014-09-16 Thread Jeff Zhang (JIRA)
Jeff Zhang created YARN-2560:


 Summary: Diagnostics is delayed to passed to ApplicationReport
 Key: YARN-2560
 URL: https://issues.apache.org/jira/browse/YARN-2560
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jeff Zhang


The diagnostics of an application may be passed to ApplicationReport with a 
delay. Here's one example where the ApplicationStatus has changed to FAILED but 
the diagnostics is still empty; the next call to getApplicationReport gets the 
diagnostics.
{code}
while (true) {
    appReport = yarnClient.getApplicationReport(appId);
    Thread.sleep(1000);
    LOG.info("AppStatus:" + appReport.getFinalApplicationStatus());
    LOG.info("Diagnostics:" + appReport.getDiagnostics());
}
{code}

*Output:*
{code}
AppStatus:FAILED
Diagnostics: // empty

// get diagnostics for the next getApplicationReport
AppStatus:FAILED
Diagnostics: // diagnostics info here
{code}





[jira] [Commented] (YARN-1743) Decorate event transitions and the event-types with their behaviour

2014-07-09 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056976#comment-14056976
 ] 

Jeff Zhang commented on YARN-1743:
--

BTW, this patch is based on branch-2.4.0

> Decorate event transitions and the event-types with their behaviour
> ---
>
> Key: YARN-1743
> URL: https://issues.apache.org/jira/browse/YARN-1743
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Jeff Zhang
> Attachments: NodeManager.gv, NodeManager.pdf, YARN-1743.patch
>
>
> Helps to annotate the transitions with (start-state, end-state) pair and the 
> events with (source, destination) pair.
> Not just readability, we may also use them to generate the event diagrams 
> across components.
> Not a blocker for 0.23, but let's see.





[jira] [Assigned] (YARN-1743) Decorate event transitions and the event-types with their behaviour

2014-07-09 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang reassigned YARN-1743:


Assignee: Jeff Zhang

> Decorate event transitions and the event-types with their behaviour
> ---
>
> Key: YARN-1743
> URL: https://issues.apache.org/jira/browse/YARN-1743
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Jeff Zhang
> Attachments: NodeManager.gv, NodeManager.pdf, YARN-1743.patch
>
>
> Helps to annotate the transitions with (start-state, end-state) pair and the 
> events with (source, destination) pair.
> Not just readability, we may also use them to generate the event diagrams 
> across components.
> Not a blocker for 0.23, but let's see.





[jira] [Commented] (YARN-1743) Decorate event transitions and the event-types with their behaviour

2014-07-09 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056931#comment-14056931
 ] 

Jeff Zhang commented on YARN-1743:
--

Attached an initial patch. It may need refinement later; this patch is just to 
illustrate my basic idea of generating the event flow between different 
entities. Each event has a source and a destination, and this information is 
used to generate the Graphviz file. The state-machine diagram describes the 
internal transitions of one entity, while this patch describes the interaction 
between different entities. Any comments and feedback are welcome.  
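The approach can be sketched roughly like this (the class and names below are illustrative, not taken from the actual patch): collect (source, event, destination) triples and emit a Graphviz file such as the attached NodeManager.gv.

```java
import java.util.Arrays;
import java.util.List;

public class EventGraph {
    // One observed event edge: which entity sent which event to which entity.
    static class Edge {
        final String source, event, dest;
        Edge(String source, String event, String dest) {
            this.source = source;
            this.event = event;
            this.dest = dest;
        }
    }

    // Render the edges as a Graphviz digraph; the event name becomes the
    // edge label, so `dot -Tpdf` can produce a diagram like NodeManager.pdf.
    static String toDot(List<Edge> edges) {
        StringBuilder sb = new StringBuilder("digraph events {\n");
        for (Edge e : edges) {
            sb.append("  \"").append(e.source).append("\" -> \"")
              .append(e.dest).append("\" [label=\"").append(e.event)
              .append("\"];\n");
        }
        return sb.append("}\n").toString();
    }

    public static void main(String[] args) {
        System.out.print(toDot(Arrays.asList(
            new Edge("ResourceManager", "LAUNCH_CONTAINER", "NodeManager"),
            new Edge("NodeManager", "CONTAINER_FINISHED", "ResourceManager"))));
    }
}
```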



> Decorate event transitions and the event-types with their behaviour
> ---
>
> Key: YARN-1743
> URL: https://issues.apache.org/jira/browse/YARN-1743
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
> Attachments: NodeManager.gv, NodeManager.pdf, YARN-1743.patch
>
>
> Helps to annotate the transitions with (start-state, end-state) pair and the 
> events with (source, destination) pair.
> Not just readability, we may also use them to generate the event diagrams 
> across components.
> Not a blocker for 0.23, but let's see.





[jira] [Updated] (YARN-1743) Decorate event transitions and the event-types with their behaviour

2014-07-09 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-1743:
-

Attachment: NodeManager.pdf

> Decorate event transitions and the event-types with their behaviour
> ---
>
> Key: YARN-1743
> URL: https://issues.apache.org/jira/browse/YARN-1743
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
> Attachments: NodeManager.gv, NodeManager.pdf, YARN-1743.patch
>
>
> Helps to annotate the transitions with (start-state, end-state) pair and the 
> events with (source, destination) pair.
> Not just readability, we may also use them to generate the event diagrams 
> across components.
> Not a blocker for 0.23, but let's see.





[jira] [Updated] (YARN-1743) Decorate event transitions and the event-types with their behaviour

2014-07-09 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-1743:
-

Attachment: NodeManager.gv
YARN-1743.patch

> Decorate event transitions and the event-types with their behaviour
> ---
>
> Key: YARN-1743
> URL: https://issues.apache.org/jira/browse/YARN-1743
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
> Attachments: NodeManager.gv, NodeManager.pdf, YARN-1743.patch
>
>
> Helps to annotate the transitions with (start-state, end-state) pair and the 
> events with (source, destination) pair.
> Not just readability, we may also use them to generate the event diagrams 
> across components.
> Not a blocker for 0.23, but let's see.





[jira] [Created] (YARN-2184) ResourceManager may fail due to name node in safe mode

2014-06-19 Thread Jeff Zhang (JIRA)
Jeff Zhang created YARN-2184:


 Summary: ResourceManager may fail due to name node in safe mode
 Key: YARN-2184
 URL: https://issues.apache.org/jira/browse/YARN-2184
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Jeff Zhang
Assignee: Jeff Zhang


If the history service is enabled in the resourcemanager, it will try to mkdir 
when the service is initialized. At that time the name node may still be in 
safe mode, which can cause the history service to fail and then the 
resourcemanager to fail. This is quite likely when the cluster is restarted, 
since the namenode can stay in safe mode for a long time.

Here's the error logs:

{code}
Caused by: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException):
 Cannot create directory 
/Users/jzhang/Java/lib/hadoop-2.4.0/logs/yarn/system/history/ApplicationHistoryDataRoot.
 Name node is in safe mode.
The reported blocks 85 has reached the threshold 0.9990 of total blocks 85. The 
number of live datanodes 1 has reached the minimum number 0. In safe mode 
extension. Safe mode will be turned off automatically in 19 seconds.
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1195)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:3564)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3540)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:754)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:558)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

at org.apache.hadoop.ipc.Client.call(Client.java:1410)
at org.apache.hadoop.ipc.Client.call(Client.java:1363)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy14.mkdirs(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
at com.sun.proxy.$Proxy14.mkdirs(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:500)
at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2553)
at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2524)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:827)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:823)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:823)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:816)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1815)
at 
org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.serviceInit(FileSystemApplicationHistoryStore.java:120)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
... 10 more
2014-06-20 11:06:25,220 INFO 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: SHUTDOWN_MSG:
/
SHUTDOWN_MSG: Shutting down ResourceManager at jzhangMBPr.local/192.168.100.152
{code}
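One possible mitigation is to retry the directory creation with a delay instead of letting a transient SafeModeException abort service init. The sketch below is hypothetical, not the fix actually adopted; `action` stands in for the `fs.mkdirs(...)` call in FileSystemApplicationHistoryStore.serviceInit.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class SafeModeRetry {
    // Run the action, sleeping and retrying on failure up to maxAttempts
    // times; rethrow the last exception if every attempt fails.
    static <T> T retry(Callable<T> action, int maxAttempts, long sleepMs)
            throws Exception {
        Exception last = null;
        for (int i = 0; i < maxAttempts; i++) {
            try {
                return action.call();
            } catch (Exception e) {
                // Real code would retry only on SafeModeException and
                // rethrow everything else immediately.
                last = e;
                Thread.sleep(sleepMs);
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulate a namenode that leaves safe mode on the third attempt.
        boolean ok = retry(() -> {
            if (++calls[0] < 3) {
                throw new IOException("Name node is in safe mode");
            }
            return true;
        }, 5, 10);
        System.out.println(ok + " after " + calls[0] + " attempts");  // true after 3 attempts
    }
}
```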





[jira] [Assigned] (YARN-1197) Support changing resources of an allocated container

2014-03-14 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang reassigned YARN-1197:


Assignee: Jeff Zhang

> Support changing resources of an allocated container
> 
>
> Key: YARN-1197
> URL: https://issues.apache.org/jira/browse/YARN-1197
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: api, nodemanager, resourcemanager
>Affects Versions: 2.1.0-beta
>Reporter: Wangda Tan
>Assignee: Jeff Zhang
> Attachments: mapreduce-project.patch.ver.1, 
> tools-project.patch.ver.1, yarn-1197-scheduler-v1.pdf, yarn-1197-v2.pdf, 
> yarn-1197-v3.pdf, yarn-1197-v4.pdf, yarn-1197-v5.pdf, yarn-1197.pdf, 
> yarn-api-protocol.patch.ver.1, yarn-pb-impl.patch.ver.1, 
> yarn-server-common.patch.ver.1, yarn-server-nodemanager.patch.ver.1, 
> yarn-server-resourcemanager.patch.ver.1
>
>
> The current YARN resource management logic assumes the resources allocated to 
> a container are fixed during its lifetime. When users want to change the 
> resources of an allocated container, the only way is to release it and 
> allocate a new container with the expected size.
> Allowing run-time changes to the resources of an allocated container will 
> give us better control of resource usage on the application side.





[jira] [Assigned] (YARN-1609) Add Service Container type to NodeManager in YARN

2014-03-14 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang reassigned YARN-1609:


Assignee: Jeff Zhang

> Add Service Container type to NodeManager in YARN
> -
>
> Key: YARN-1609
> URL: https://issues.apache.org/jira/browse/YARN-1609
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.2.0
>Reporter: Wangda Tan
>Assignee: Jeff Zhang
> Attachments: Add Service Container type to NodeManager in YARN-V1.pdf
>
>
> From our work to support running OpenMPI on YARN (MAPREDUCE-2911), we found 
> that it's important to have a framework-specific daemon process manage the 
> tasks on each node directly. The daemon process, most likely similar in other 
> frameworks as well, provides critical services to the tasks running on that 
> node (for example "wireup", spawning user processes in large numbers at once, 
> etc.). In YARN, it's hard, if not impossible, to have those processes be 
> managed by YARN. 
> We propose to extend the container model on the NodeManager side to support a 
> "Service Container" to run/manage such framework daemon/service processes. We 
> believe this would be very useful to other application framework developers as 
> well.





[jira] [Updated] (YARN-1754) Container process is not really killed

2014-02-23 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated YARN-1754:
-

Description: 
I tested the following distributed shell example on my Mac:

hadoop jar 
share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.2.0.jar -appname 
shell -jar 
share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.2.0.jar 
-shell_command=sleep -shell_args=10 -num_containers=1

It starts two processes for one container: one is the shell process, the other 
is the real command I execute (here, "sleep 10"). 

I then kill this application by running "yarn application -kill app_id".

It kills the shell process but does not kill the real command process. The 
reason is that YARN uses the kill command, which does not kill the child 
processes; using pkill could resolve this issue.

IMHO, this is a very important case: it makes resource usage inconsistent and 
is a potential security problem. 


  was:
I tested the following distributed shell example on my Mac:

hadoop jar 
share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.2.0.jar -appname 
shell -jar 
share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.2.0.jar 
-shell_command=sleep -shell_args=10 -num_containers=1

It starts two processes for one container: one is the shell process, the other 
is the real command I execute (here, "sleep 10"). 

I then kill this application by running "yarn application -kill app_id".

It kills the shell process but does not kill the real command process. The 
reason is that YARN uses the kill command, which does not kill the child 
processes; using pkill could resolve this issue.

I also verified this case on CentOS, which behaves the same as Mac. IMHO, this 
is a very important case: it makes resource usage inconsistent and is a 
potential security problem. 



> Container process is not really killed
> --
>
> Key: YARN-1754
> URL: https://issues.apache.org/jira/browse/YARN-1754
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.2.0
> Environment: Mac
>Reporter: Jeff Zhang
>
> I tested the following distributed shell example on my Mac:
> hadoop jar 
> share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.2.0.jar 
> -appname shell -jar 
> share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.2.0.jar 
> -shell_command=sleep -shell_args=10 -num_containers=1
> It starts two processes for one container: one is the shell process, the 
> other is the real command I execute (here, "sleep 10"). 
> I then kill this application by running "yarn application -kill app_id".
> It kills the shell process but does not kill the real command process. The 
> reason is that YARN uses the kill command, which does not kill the child 
> processes; using pkill could resolve this issue.
> IMHO, this is a very important case: it makes resource usage inconsistent and 
> is a potential security problem. 
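A process-tree kill can be sketched in plain Java with the JDK 9+ `ProcessHandle` API. This is illustrative, not the NodeManager's actual container executor: destroying the descendants before the wrapper shell reaps the whole tree, which a bare `kill <pid>` on the shell does not.

```java
import java.io.IOException;

public class KillTree {
    // Destroy a process together with every descendant it spawned.
    static void destroyTree(Process p) {
        p.descendants().forEach(ProcessHandle::destroy);  // children first
        p.destroy();                                      // then the wrapper
    }

    public static void main(String[] args)
            throws IOException, InterruptedException {
        // The sh wrapper mirrors the launch script YARN generates around
        // the real command.
        Process p = new ProcessBuilder("sh", "-c", "sleep 30").start();
        destroyTree(p);
        p.waitFor();
        System.out.println("alive=" + p.isAlive());  // alive=false
    }
}
```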





[jira] [Commented] (YARN-1754) Container process is not really killed

2014-02-23 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13910006#comment-13910006
 ] 

Jeff Zhang commented on YARN-1754:
--

It looks like this only happens on Mac; it works normally on Linux.

> Container process is not really killed
> --
>
> Key: YARN-1754
> URL: https://issues.apache.org/jira/browse/YARN-1754
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.2.0
> Environment: Mac
>Reporter: Jeff Zhang
>
> I tested the following distributed shell example on my Mac:
> hadoop jar 
> share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.2.0.jar 
> -appname shell -jar 
> share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.2.0.jar 
> -shell_command=sleep -shell_args=10 -num_containers=1
> It starts two processes for one container: one is the shell process, the 
> other is the real command I execute (here, "sleep 10"). 
> I then kill this application by running "yarn application -kill app_id".
> It kills the shell process but does not kill the real command process. The 
> reason is that YARN uses the kill command, which does not kill the child 
> processes; using pkill could resolve this issue.
> IMHO, this is a very important case: it makes resource usage inconsistent and 
> is a potential security problem. 





[jira] [Created] (YARN-1754) Container process is not really killed

2014-02-23 Thread Jeff Zhang (JIRA)
Jeff Zhang created YARN-1754:


 Summary: Container process is not really killed
 Key: YARN-1754
 URL: https://issues.apache.org/jira/browse/YARN-1754
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.2.0
 Environment: Mac
Reporter: Jeff Zhang


I tested the following distributed shell example on my Mac:

hadoop jar 
share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.2.0.jar -appname 
shell -jar 
share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.2.0.jar 
-shell_command=sleep -shell_args=10 -num_containers=1

It starts two processes for one container: one is the shell process, the other 
is the real command I execute (here, "sleep 10"). 

I then kill this application by running "yarn application -kill app_id".

It kills the shell process but does not kill the real command process. The 
reason is that YARN uses the kill command, which does not kill the child 
processes; using pkill could resolve this issue.

I also verified this case on CentOS, which behaves the same as Mac. IMHO, this 
is a very important case: it makes resource usage inconsistent and is a 
potential security problem. 






[jira] [Commented] (YARN-321) Generic application history service

2013-12-03 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838492#comment-13838492
 ] 

Jeff Zhang commented on YARN-321:
-

Thanks Shen. Is there an estimate for the 2.4 release?

> Generic application history service
> ---
>
> Key: YARN-321
> URL: https://issues.apache.org/jira/browse/YARN-321
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Luke Lu
>Assignee: Vinod Kumar Vavilapalli
> Attachments: AHS Diagram.pdf, ApplicationHistoryServiceHighLevel.pdf, 
> HistoryStorageDemo.java
>
>
> The mapreduce job history server currently needs to be deployed as a trusted 
> server in sync with the mapreduce runtime. Every new application would need a 
> similar application history server. Having to deploy O(T*V) (where T is 
> number of type of application, V is number of version of application) trusted 
> servers is clearly not scalable.
> Job history storage handling itself is pretty generic: move the logs and 
> history data into a particular directory for later serving. Job history data 
> is already stored as json (or binary avro). I propose that we create only one 
> trusted application history server, which can have a generic UI (display json 
> as a tree of strings) as well. Specific application/version can deploy 
> untrusted webapps (a la AMs) to query the application history server and 
> interpret the json for its specific UI and/or analytics.





[jira] [Commented] (YARN-321) Generic application history service

2013-12-02 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837281#comment-13837281
 ] 

Jeff Zhang commented on YARN-321:
-

Another question about this jira: I found that the container logURL is 
hard-coded there, so users still cannot see the logs of each container (stdout, 
stderr). Is letting users see the logs on the roadmap, and which jira is 
tracking it? Thanks.

> Generic application history service
> ---
>
> Key: YARN-321
> URL: https://issues.apache.org/jira/browse/YARN-321
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Luke Lu
>Assignee: Vinod Kumar Vavilapalli
> Attachments: AHS Diagram.pdf, ApplicationHistoryServiceHighLevel.pdf, 
> HistoryStorageDemo.java
>
>
> The mapreduce job history server currently needs to be deployed as a trusted 
> server in sync with the mapreduce runtime. Every new application would need a 
> similar application history server. Having to deploy O(T*V) (where T is 
> number of type of application, V is number of version of application) trusted 
> servers is clearly not scalable.
> Job history storage handling itself is pretty generic: move the logs and 
> history data into a particular directory for later serving. Job history data 
> is already stored as json (or binary avro). I propose that we create only one 
> trusted application history server, which can have a generic UI (display json 
> as a tree of strings) as well. Specific application/version can deploy 
> untrusted webapps (a la AMs) to query the application history server and 
> interpret the json for its specific UI and/or analytics.





[jira] [Commented] (YARN-321) Generic application history service

2013-12-02 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837276#comment-13837276
 ] 

Jeff Zhang commented on YARN-321:
-

Will this jira be included in the next release?

> Generic application history service
> ---
>
> Key: YARN-321
> URL: https://issues.apache.org/jira/browse/YARN-321
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Luke Lu
>Assignee: Vinod Kumar Vavilapalli
> Attachments: AHS Diagram.pdf, ApplicationHistoryServiceHighLevel.pdf, 
> HistoryStorageDemo.java
>
>
> The mapreduce job history server currently needs to be deployed as a trusted 
> server in sync with the mapreduce runtime. Every new application would need a 
> similar application history server. Having to deploy O(T*V) (where T is the 
> number of application types and V is the number of application versions) 
> trusted servers is clearly not scalable.
> Job history storage handling itself is pretty generic: move the logs and 
> history data into a particular directory for later serving. Job history data 
> is already stored as json (or binary avro). I propose that we create only one 
> trusted application history server, which can have a generic UI (display json 
> as a tree of strings) as well. Specific application/version can deploy 
> untrusted webapps (a la AMs) to query the application history server and 
> interpret the json for its specific UI and/or analytics.





[jira] [Commented] (YARN-1440) Yarn aggregated logs are difficult for external tools to understand

2013-11-25 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13831275#comment-13831275
 ] 

Jeff Zhang commented on YARN-1440:
--

@ledion, the current implementation writes one TFile per application, while 
your method would create one TFile per container, generating many more files. 
I guess the reason the original author adopted TFile is that TFile has an 
index block that lets a user quickly look up a value by key. This way, a user 
can quickly find one container's log within an application's aggregated file.
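A minimal sketch of that trade-off, making no assumptions about Hadoop's actual TFile API: the hypothetical `AggregatedLogIndex` class below uses a `TreeMap` as a stand-in for TFile's sorted index block, keying one per-application file by container id so a single container's log can be fetched without scanning the rest.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

public class AggregatedLogIndex {
    // TreeMap stands in for TFile's sorted index block: container-id keys
    // map to log payloads inside a single per-application file.
    private final NavigableMap<String, String> index = new TreeMap<>();

    void append(String containerId, String log) {
        index.put(containerId, log);
    }

    // Keyed lookup: jump straight to one container's log
    // without scanning every other container's output.
    String logFor(String containerId) {
        return index.get(containerId);
    }

    public static void main(String[] args) {
        AggregatedLogIndex appLogs = new AggregatedLogIndex();
        appLogs.append("container_01_000001", "AM stdout");
        appLogs.append("container_01_000002", "task stdout");
        System.out.println(appLogs.logFor("container_01_000002")); // prints "task stdout"
    }
}
```

The one-file-per-container alternative trades this keyed lookup for many more (but individually simpler) files on the filesystem.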

> Yarn aggregated logs are difficult for external tools to understand
> ---
>
> Key: YARN-1440
> URL: https://issues.apache.org/jira/browse/YARN-1440
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: ledion bitincka
>  Labels: log-aggregation, logs, tfile, yarn
>
> The log aggregation feature in Yarn is awesome! However, the file type and 
> format into which the log files are aggregated (TFile) should either be 
> much simpler or be made pluggable. The current TFile format forces anyone who 
> wants to see the files to either 
> a) use the web UI
> b) use the CLI tools (yarn logs)  or 
> c) write custom code to read the files 
> My suggestion would be to simplify the log collection by collecting and 
> writing the raw log files into a directory structure as follows: 
> {noformat}
> /{log-collection-dir}/{app-id}/{container-id}/{log-file-name} 
> {noformat}
> This way the application developers can (re)use a much wider array of tools 
> to process the logs. 
> For readers who are not familiar with the logs and their format, you can find 
> more info in the following two blog posts:
> http://hortonworks.com/blog/simplifying-user-logs-management-and-access-in-yarn/
> http://blogs.splunk.com/2013/11/18/hadoop-2-0-rant/
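The proposed layout above can be sketched with plain path construction; `LogLayout` and the example ids below are hypothetical, not part of any YARN API.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class LogLayout {
    // Builds the proposed raw-log layout:
    // /{log-collection-dir}/{app-id}/{container-id}/{log-file-name}
    static Path logPath(String collectionDir, String appId,
                        String containerId, String logFileName) {
        return Paths.get(collectionDir, appId, containerId, logFileName);
    }

    public static void main(String[] args) {
        // On a POSIX filesystem this joins with "/" separators.
        Path p = logPath("/var/log/yarn-apps",
                         "application_1385000000000_0001",
                         "container_1385000000000_0001_01_000002",
                         "stdout");
        System.out.println(p);
    }
}
```

Because each log is a plain file at a predictable path, generic tools (grep, tail, log shippers) can read it directly, which is the portability benefit the suggestion is after.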


