[jira] [Commented] (YARN-10101) Support listing of aggregated logs for containers belonging to an application attempt

2020-02-09 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17033409#comment-17033409
 ] 

Szilard Nemeth commented on YARN-10101:
---

Retriggered build to check if UT failure is going away.

> Support listing of aggregated logs for containers belonging to an application 
> attempt
> -
>
> Key: YARN-10101
> URL: https://issues.apache.org/jira/browse/YARN-10101
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: log-aggregation, yarn
> Affects Versions: 3.3.0
> Reporter: Adam Antal
> Assignee: Adam Antal
> Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-10101.001.patch, YARN-10101.002.patch, 
> YARN-10101.003.patch, YARN-10101.004.patch, YARN-10101.005.patch, 
> YARN-10101.006.patch, YARN-10101.007.patch, YARN-10101.008.patch, 
> YARN-10101.009.patch, YARN-10101.branch-3.2.001.patch, 
> YARN-10101.branch-3.2.001.patch, YARN-10101.branch-3.2.002.patch
>
>
> To display logs without access to the timeline server, we need an interface 
> where we can query the list of containers with aggregated logs belonging to 
> an application attempt.
> We should add support for this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10123) Error message around yarn app -stop/start can be improved to highlight that an implementation at framework level is needed for the stop/start functionality to work

2020-02-09 Thread Siddharth Ahuja (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Ahuja updated YARN-10123:
---
Description: 
A "stop" on a YARN application fails with the below error:

{code}
# yarn app -stop application_1581294743321_0002 -appTypes SPARK
20/02/10 06:24:27 INFO client.RMProxy: Connecting to ResourceManager at c3224-node2.squadron.support.hortonworks.com/172.25.34.128:8050
20/02/10 06:24:27 INFO client.AHSProxy: Connecting to Application History server at c3224-node2.squadron.support.hortonworks.com/172.25.34.128:10200
Exception in thread "main" java.lang.IllegalArgumentException: App admin client class name not specified for type SPARK
    at org.apache.hadoop.yarn.client.api.AppAdminClient.createAppAdminClient(AppAdminClient.java:76)
    at org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:579)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
    at org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:123)
{code}

From
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/AppAdminClient.java#L76,
it seems that this is because the user does not have the setting:

{code}
yarn.application.admin.client.class.SPARK
{code}

set up in their client configuration.

However, even if this setting is present, an implementation still has to be
available for the application type. From my internal discussions: jobs don't
have a notion of stop / resume functionality at the YARN level. If some apps
like Spark need it, it has to be implemented at the framework level.

Therefore, the above error message is a bit misleading: even if
"yarn.application.admin.client.class.SPARK" is supplied (or, for that matter,
yarn.application.admin.client.class.MAPREDUCE), if no implementation is
actually available underneath to handle the stop/start functionality, we will
fail again, albeit with a different error here:
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/AppAdminClient.java#L85.

As such, maybe this error message can be improved to say something like:

{code}
Exception in thread "main" java.lang.IllegalArgumentException: App admin client 
class name not specified for type SPARK. Please ensure the App admin client 
class actually exists within SPARK to handle this functionality.
{code}

or something similar.

Further, the documentation for the "-stop" and "-start" options will need to be
improved here ->
https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YarnCommands.html#application_or_app
as it does not mention that an implementation is needed at the framework level
for the YARN stop/start command to succeed.
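
The two failure modes described above can be illustrated with a small, dependency-free Java sketch. It is not the Hadoop source: the class AppAdminLookupSketch, the method createClient, the use of a plain Map in place of the Hadoop Configuration, and the IllegalStateException in the second stage are all assumptions made for illustration. Only the property prefix and the first error message are taken from this report.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a two-stage lookup; NOT the actual AppAdminClient code.
class AppAdminLookupSketch {
    // Prefix taken from the setting named in the report.
    static final String PREFIX = "yarn.application.admin.client.class.";

    // Resolves the class configured for appType and instantiates it.
    // conf is a plain Map standing in for the client configuration (assumption).
    static Object createClient(String appType, Map<String, String> conf) {
        String className = conf.get(PREFIX + appType);
        if (className == null) {
            // Stage 1: the setting is absent. This is the error shown in the report.
            throw new IllegalArgumentException(
                "App admin client class name not specified for type " + appType);
        }
        try {
            // Stage 2: the setting is present, but the named class must also be
            // loadable on the client classpath and constructible.
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            // Hypothetical error type and wording, chosen for this sketch only.
            throw new IllegalStateException(
                "Invalid app admin client class " + className
                    + " for type " + appType, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        try {
            createClient("SPARK", conf);
        } catch (IllegalArgumentException e) {
            System.out.println("stage 1: " + e.getMessage());
        }
        // Supplying the setting alone does not help if no such class exists.
        conf.put(PREFIX + "SPARK", "org.example.MissingSparkAdminClient");
        try {
            createClient("SPARK", conf);
        } catch (IllegalStateException e) {
            System.out.println("stage 2: " + e.getMessage());
        }
    }
}
```

The point of the sketch is that adding the setting only moves the failure from the first stage to the second unless the framework actually ships an admin client class, which is why the first error message could mention that requirement.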



[jira] [Updated] (YARN-10123) Error message around yarn app -stop/start can be improved to highlight that an implementation at framework level is needed for the stop/start functionality to work

2020-02-09 Thread Siddharth Ahuja (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Ahuja updated YARN-10123:
---
Description: 
A "stop" on a YARN application fails with the below error:

{code}
# yarn app -stop application_1581294743321_0002 -appTypes SPARK
20/02/10 06:24:27 INFO client.RMProxy: Connecting to ResourceManager at c3224-node2.squadron.support.hortonworks.com/172.25.34.128:8050
20/02/10 06:24:27 INFO client.AHSProxy: Connecting to Application History server at c3224-node2.squadron.support.hortonworks.com/172.25.34.128:10200
Exception in thread "main" java.lang.IllegalArgumentException: App admin client class name not specified for type SPARK
    at org.apache.hadoop.yarn.client.api.AppAdminClient.createAppAdminClient(AppAdminClient.java:76)
    at org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:579)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
    at org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:123)
{code}

From
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/AppAdminClient.java#L76,
it seems that this is because the user does not have the setting:

{code}
yarn.application.admin.client.class.SPARK
{code}

set up in their client configuration.

However, even if this setting is present, an implementation still has to be
available for the application type. From my internal discussions: jobs don't
have a notion of stop / resume functionality at the YARN level. If some apps
like Spark need it, it has to be implemented at the framework level.

Therefore, the above error message is a bit misleading: even if
"yarn.application.admin.client.class.SPARK" is supplied (or, for that matter,
yarn.application.admin.client.class.MAPREDUCE), if no implementation is
actually available underneath to handle the stop/start functionality, we will
fail again, albeit with a different error here:
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/AppAdminClient.java#L85.

As such, maybe this error message can be improved to say something like:

{code}
Exception in thread "main" java.lang.IllegalArgumentException: App admin client 
class name not specified for type SPARK. Please ensure the App admin client 
class actually exists within SPARK to handle this functionality.
{code}

or something similar.

Further, the documentation for the "-stop" and "-start" options will need to be
improved here ->
https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YarnCommands.html#application_or_app
as it does not mention that an implementation is needed at the framework level
for the YARN stop/start command to succeed.



[jira] [Updated] (YARN-10123) Error message around yarn app -stop/start can be improved to highlight that an implementation at framework level is needed for the stop/start functionality to work

2020-02-09 Thread Siddharth Ahuja (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Ahuja updated YARN-10123:
---
Description: 
A "stop" on a YARN application fails with the below error:

{code}
# yarn app -stop application_1581294743321_0002 -appTypes SPARK
20/02/10 06:24:27 INFO client.RMProxy: Connecting to ResourceManager at c3224-node2.squadron.support.hortonworks.com/172.25.34.128:8050
20/02/10 06:24:27 INFO client.AHSProxy: Connecting to Application History server at c3224-node2.squadron.support.hortonworks.com/172.25.34.128:10200
Exception in thread "main" java.lang.IllegalArgumentException: App admin client class name not specified for type SPARK
    at org.apache.hadoop.yarn.client.api.AppAdminClient.createAppAdminClient(AppAdminClient.java:76)
    at org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:579)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
    at org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:123)
{code}

From
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/AppAdminClient.java#L76,
it seems that this is because the user does not have the setting:

{code}
yarn.application.admin.client.class.SPARK
{code}

set up in the client configuration.

However, even if this setting is present, an implementation still has to be
available for the application type. From my internal discussions: jobs don't
have a notion of stop / resume functionality at the YARN level. If some apps
like Spark need it, it has to be implemented at the framework level.

Therefore, the above error message is a bit misleading: even if
"yarn.application.admin.client.class.SPARK" is supplied (or, for that matter,
yarn.application.admin.client.class.MAPREDUCE), if no implementation is
actually available underneath to handle the stop/start functionality, we will
fail again, albeit with a different error here:
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/AppAdminClient.java#L85.

As such, maybe this error message can be improved to say something like:

{code}
Exception in thread "main" java.lang.IllegalArgumentException: App admin client 
class name not specified for type SPARK. Please ensure the App admin client 
class actually exists within SPARK to handle this functionality.
{code}

or something similar.

Further, the documentation for the "-stop" and "-start" options will need to be
improved here ->
https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YarnCommands.html#application_or_app
as it does not mention that an implementation is needed at the framework level
for the YARN stop/start command to succeed.



[jira] [Updated] (YARN-10123) Error message around yarn app -stop/start can be improved to highlight that an implementation at framework level is needed for the stop/start functionality to work

2020-02-09 Thread Siddharth Ahuja (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Ahuja updated YARN-10123:
---
Description: 
A "stop" on a YARN application fails with the below error:

{code}
# yarn app -stop application_1581294743321_0002 -appTypes SPARK
20/02/10 06:24:27 INFO client.RMProxy: Connecting to ResourceManager at c3224-node2.squadron.support.hortonworks.com/172.25.34.128:8050
20/02/10 06:24:27 INFO client.AHSProxy: Connecting to Application History server at c3224-node2.squadron.support.hortonworks.com/172.25.34.128:10200
Exception in thread "main" java.lang.IllegalArgumentException: App admin client class name not specified for type SPARK
    at org.apache.hadoop.yarn.client.api.AppAdminClient.createAppAdminClient(AppAdminClient.java:76)
    at org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:579)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
    at org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:123)
{code}

From
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/AppAdminClient.java#L76,
it seems that this is because the user does not have the setting:

{code}
yarn.application.admin.client.class.SPARK set up in the client configuration.
{code}

However, even if this setting is present, an implementation still has to be
available for the application type. From my internal discussions: jobs don't
have a notion of stop / resume functionality at the YARN level. If some apps
like Spark need it, it has to be implemented at the framework level.

Therefore, the above error message is a bit misleading: even if
"yarn.application.admin.client.class.SPARK" is supplied (or, for that matter,
yarn.application.admin.client.class.MAPREDUCE), if no implementation is
actually available underneath to handle the stop/start functionality, we will
fail again, albeit with a different error here:
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/AppAdminClient.java#L85.

As such, maybe this error message can be improved to say something like:

{code}
Exception in thread "main" java.lang.IllegalArgumentException: App admin client 
class name not specified for type SPARK. Please ensure the App admin client 
class actually exists within SPARK to handle this functionality.
{code}

or something similar.

Further, the documentation for the "-stop" and "-start" options will need to be
improved here ->
https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YarnCommands.html#application_or_app
as it does not mention that an implementation is needed at the framework level
for the YARN stop/start command to succeed.



[jira] [Created] (YARN-10123) Error message around yarn app -stop/start can be improved to highlight that an implementation at framework level is needed for the stop/start functionality to work

2020-02-09 Thread Siddharth Ahuja (Jira)
Siddharth Ahuja created YARN-10123:
--

 Summary: Error message around yarn app -stop/start can be improved 
to highlight that an implementation at framework level is needed for the 
stop/start functionality to work
 Key: YARN-10123
 URL: https://issues.apache.org/jira/browse/YARN-10123
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: client, documentation
Affects Versions: 3.2.1
Reporter: Siddharth Ahuja


A "stop" on a YARN application fails with the below error:

{code}
# yarn app -stop application_1581294743321_0002 -appTypes SPARK
20/02/10 06:24:27 INFO client.RMProxy: Connecting to ResourceManager at c3224-node2.squadron.support.hortonworks.com/172.25.34.128:8050
20/02/10 06:24:27 INFO client.AHSProxy: Connecting to Application History server at c3224-node2.squadron.support.hortonworks.com/172.25.34.128:10200
Exception in thread "main" java.lang.IllegalArgumentException: App admin client class name not specified for type SPARK
    at org.apache.hadoop.yarn.client.api.AppAdminClient.createAppAdminClient(AppAdminClient.java:76)
    at org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:579)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
    at org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:123)
{code}

From
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/AppAdminClient.java#L76,
it seems that this is because the user does not have the setting:

{code}
yarn.application.admin.client.class.SPARK set up in the client configuration.
{code}

However, even if this setting is present, an implementation still has to be
available for the application type. From my internal discussions: jobs don't
have a notion of stop / resume functionality at the YARN level. If some apps
like Spark need it, it has to be implemented at the framework level.

Therefore, the above error message is a bit misleading: even if
"yarn.application.admin.client.class.SPARK" is supplied (or, for that matter,
yarn.application.admin.client.class.MAPREDUCE), if no implementation is
actually available underneath to handle the stop/start functionality, we will
fail again, albeit with a different error here:
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/AppAdminClient.java#L85.

As such, maybe this error message can be improved to say something like:

{code}
Exception in thread "main" java.lang.IllegalArgumentException: App admin client 
class name not specified for type SPARK, is there an implementation even 
available for this framework?
{code}

or something similar.

Further, the documentation for the "-stop" and "-start" options will need to be
improved here ->
https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YarnCommands.html#application_or_app
as it does not mention that an implementation is needed at the framework level
for the YARN stop/start command to succeed.







[jira] [Assigned] (YARN-10123) Error message around yarn app -stop/start can be improved to highlight that an implementation at framework level is needed for the stop/start functionality to work

2020-02-09 Thread Siddharth Ahuja (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Ahuja reassigned YARN-10123:
--

Assignee: Siddharth Ahuja

> Error message around yarn app -stop/start can be improved to highlight that 
> an implementation at framework level is needed for the stop/start 
> functionality to work
> ---
>
> Key: YARN-10123
> URL: https://issues.apache.org/jira/browse/YARN-10123
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: client, documentation
> Affects Versions: 3.2.1
> Reporter: Siddharth Ahuja
> Assignee: Siddharth Ahuja
> Priority: Minor
>
> A "stop" on a YARN application fails with the below error:
> {code}
> # yarn app -stop application_1581294743321_0002 -appTypes SPARK
> 20/02/10 06:24:27 INFO client.RMProxy: Connecting to ResourceManager at c3224-node2.squadron.support.hortonworks.com/172.25.34.128:8050
> 20/02/10 06:24:27 INFO client.AHSProxy: Connecting to Application History server at c3224-node2.squadron.support.hortonworks.com/172.25.34.128:10200
> Exception in thread "main" java.lang.IllegalArgumentException: App admin client class name not specified for type SPARK
>     at org.apache.hadoop.yarn.client.api.AppAdminClient.createAppAdminClient(AppAdminClient.java:76)
>     at org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:579)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>     at org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:123)
> {code}
> From
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/AppAdminClient.java#L76,
> it seems that this is because the user does not have the setting:
> {code}
> yarn.application.admin.client.class.SPARK set up in the client configuration.
> {code}
> However, even if this setting is present, an implementation still has to be
> available for the application type. From my internal discussions: jobs don't
> have a notion of stop / resume functionality at the YARN level. If some apps
> like Spark need it, it has to be implemented at the framework level.
> Therefore, the above error message is a bit misleading: even if
> "yarn.application.admin.client.class.SPARK" is supplied (or, for that matter,
> yarn.application.admin.client.class.MAPREDUCE), if no implementation is
> actually available underneath to handle the stop/start functionality, we will
> fail again, albeit with a different error here:
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/AppAdminClient.java#L85.
> As such, maybe this error message can be improved to say something like:
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: App admin 
> client class name not specified for type SPARK, is there an implementation 
> even available for this framework?
> {code}
> or something similar.
> Further, the documentation for the "-stop" and "-start" options will need to be
> improved here ->
> https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YarnCommands.html#application_or_app
> as it does not mention that an implementation is needed at the framework level
> for the YARN stop/start command to succeed.






[jira] [Created] (YARN-10122) In Federation,executing yarn container signal command throws an exception

2020-02-09 Thread Sushanta Sen (Jira)
Sushanta Sen created YARN-10122:
---

 Summary: In Federation,executing yarn container signal command 
throws an exception
 Key: YARN-10122
 URL: https://issues.apache.org/jira/browse/YARN-10122
 Project: Hadoop YARN
  Issue Type: Bug
  Components: federation, yarn
Reporter: Sushanta Sen


Executing the yarn container signal command fails with the error
“org.apache.commons.lang.NotImplementedException: Code is not implemented”.


{noformat}
./yarn container -signal container_e79_1581316978887_0001_01_10
Signalling container container_e79_1581316978887_0001_01_10
2020-02-10 14:51:18,045 INFO impl.YarnClientImpl: Signalling container container_e79_1581316978887_0001_01_10 with command OUTPUT_THREAD_DUMP
Exception in thread "main" org.apache.commons.lang.NotImplementedException: Code is not implemented
    at org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.signalToContainer(FederationClientInterceptor.java:993)
    at org.apache.hadoop.yarn.server.router.clientrm.RouterClientRMService.signalToContainer(RouterClientRMService.java:403)
    at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.signalToContainer(ApplicationClientProtocolPBServiceImpl.java:629)
    at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:629)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:530)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1036)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2793)

    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
    at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateRuntimeException(RPCUtil.java:85)
    at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:122)
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.signalToContainer(ApplicationClientProtocolPBClientImpl.java:620)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
    at com.sun.proxy.$Proxy8.signalToContainer(Unknown Source)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.signalToContainer(YarnClientImpl.java:949)
    at org.apache.hadoop.yarn.client.cli.ApplicationCLI.signalToContainer(ApplicationCLI.java:717)
    at org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:478)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
    at org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:119)

{noformat}







[jira] [Created] (YARN-10121) In Federation executing yarn queue status command throws an exception

2020-02-09 Thread Sushanta Sen (Jira)
Sushanta Sen created YARN-10121:
---

 Summary: In Federation executing yarn queue status command throws 
an exception
 Key: YARN-10121
 URL: https://issues.apache.org/jira/browse/YARN-10121
 Project: Hadoop YARN
  Issue Type: Bug
  Components: federation, yarn
Reporter: Sushanta Sen


The yarn queue -status command fails with the error 
“org.apache.commons.lang.NotImplementedException: Code is not implemented”.


{noformat}
 ./yarn queue -status default
Exception in thread "main" org.apache.commons.lang.NotImplementedException: Code is not implemented
at org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.getQueueInfo(FederationClientInterceptor.java:715)
at org.apache.hadoop.yarn.server.router.clientrm.RouterClientRMService.getQueueInfo(RouterClientRMService.java:246)
at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:328)
at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:591)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:530)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1036)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2793)

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateRuntimeException(RPCUtil.java:85)
at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:122)
at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getQueueInfo(ApplicationClientProtocolPBClientImpl.java:341)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy8.getQueueInfo(Unknown Source)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getQueueInfo(YarnClientImpl.java:650)
at org.apache.hadoop.yarn.client.cli.QueueCLI.listQueue(QueueCLI.java:111)
at org.apache.hadoop.yarn.client.cli.QueueCLI.run(QueueCLI.java:78)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at org.apache.hadoop.yarn.client.cli.QueueCLI.main(QueueCLI.java:50)

{noformat}







[jira] [Commented] (YARN-10109) Allow stop and convert from leaf to parent queue in a single Mutation API call

2020-02-09 Thread Hudson (Jira)


[ https://issues.apache.org/jira/browse/YARN-10109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17033248#comment-17033248 ]

Hudson commented on YARN-10109:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17933 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17933/])
YARN-10109. Allow stop and convert from leaf to parent queue in a single 
(sunilg: rev 28f730b317b23038bf9bd0775dd2cdb96518b13b)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesConfigurationMutation.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfigValidator.java


> Allow stop and convert from leaf to parent queue in a single Mutation API call
> --
>
> Key: YARN-10109
> URL: https://issues.apache.org/jira/browse/YARN-10109
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0, 3.2.2
>
> Attachments: YARN-10109-001.patch, YARN-10109-002.patch, 
> YARN-10109-003.patch, YARN-10109-004.patch
>
>
> The SchedulerConf Mutation API does not allow stopping an existing leaf 
> queue and adding a child queue under it in a single call. 
> *Repro:*
>  
> {code:java}
> Capacity-Scheduler.xml: 
> yarn.scheduler.capacity.root.queues = default
> yarn.scheduler.capacity.root.default.capacity = 100 
> cat abc.xml 
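> <sched-conf>
>   <add-queue>
>     <queue-name>root.default.v1</queue-name>
>     <params>
>       <entry>
>         <key>capacity</key>
>         <value>100</value>
>       </entry>
>     </params>
>   </add-queue>
>   <update-queue>
>     <queue-name>root.default</queue-name>
>     <params>
>       <entry>
>         <key>state</key>
>         <value>STOPPED</value>
>       </entry>
>     </params>
>   </update-queue>
> </sched-conf>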
> [yarn@pjoseph-1 tmp]$ curl --negotiate -u : -X PUT -d @add.xml -H 
> "Content-type: application/xml" 
> 'http://:8088/ws/v1/cluster/scheduler-conf?user.name=yarn'
> Failed to re-init queues : Can not convert the leaf queue: root.default to 
> parent queue since it is not yet in stopped state. Current State : RUNNING
>  {code}
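
The repro above stops root.default and adds root.default.v1 in the same request, yet the validator rejects it because it sees only the queue's current RUNNING state. A minimal sketch of the behavior the issue asks for follows. It is not the code in CapacitySchedulerConfigValidator; the class, enum, and method names are hypothetical, and queue state is modeled as plain maps for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only: none of these names come from the Hadoop code base.
final class LeafToParentValidationSketch {

    enum QueueState { RUNNING, STOPPED }

    // Returns the parent path of a queue, e.g. "root.default.v1" -> "root.default".
    static String parentOf(String queuePath) {
        int idx = queuePath.lastIndexOf('.');
        return idx < 0 ? null : queuePath.substring(0, idx);
    }

    // A leaf may be converted to a parent if it is STOPPED once the state
    // updates carried by the same mutation call are taken into account.
    static boolean canConvertLeafToParent(String leafQueue,
                                          Map<String, QueueState> currentStates,
                                          Map<String, QueueState> pendingStateUpdates) {
        QueueState effective =
            pendingStateUpdates.getOrDefault(leafQueue, currentStates.get(leafQueue));
        return effective == QueueState.STOPPED;
    }

    public static void main(String[] args) {
        Map<String, QueueState> current = new HashMap<>();
        current.put("root.default", QueueState.RUNNING);

        // The same call both stops root.default and adds root.default.v1.
        Map<String, QueueState> pending = new HashMap<>();
        pending.put("root.default", QueueState.STOPPED);

        String leaf = parentOf("root.default.v1");
        System.out.println(leaf + " convertible: "
            + canConvertLeafToParent(leaf, current, pending));
    }
}
```

The design choice in this sketch is to compute an effective state (pending update first, current state as fallback) before validating, so the order of operations inside one request does not matter.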






[jira] [Commented] (YARN-9624) Use switch case for ProtoUtils#convertFromProtoFormat containerState

2020-02-09 Thread Hudson (Jira)


[ https://issues.apache.org/jira/browse/YARN-9624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17033212#comment-17033212 ]

Hudson commented on YARN-9624:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17932 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17932/])
YARN-9624. Use switch case for ProtoUtils#convertFromProtoFormat (ayushsaxena: 
rev 3f0a7cd17a1a8b904ef16426dbe2e2e267416464)
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/records/impl/pb/TestProtoUtils.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ProtoUtils.java


> Use switch case for ProtoUtils#convertFromProtoFormat containerState
> 
>
> Key: YARN-9624
> URL: https://issues.apache.org/jira/browse/YARN-9624
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Bibin Chundatt
>Assignee: Bilwa S T
>Priority: Major
>  Labels: performance
> Fix For: 3.3.0
>
> Attachments: YARN-9624.001.patch, YARN-9624.002.patch, 
> YARN-9624.003.patch
>
>
> On a large cluster with 100K+ containers, calling 
> {{ContainerState.valueOf(e.name().replace(CONTAINER_STATE_PREFIX, ""))}} on 
> every heartbeat is too costly. Replace it with a switch statement.
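
The change described above can be sketched as follows. This is an illustration under stated assumptions, not the committed patch: the two enums are simplified stand-ins for the protobuf and record types, and the class and method names are hypothetical. The string-based variant allocates a new String and performs a name lookup on every call, while the switch maps constants directly.

```java
// Hypothetical illustration only: simplified stand-ins for the real YARN types.
final class ContainerStateConversionSketch {

    // Stand-in for the protobuf-generated enum (assumed constant names).
    enum ContainerStateProto { C_NEW, C_RUNNING, C_COMPLETE }

    // Stand-in for the YARN record enum (assumed constant names).
    enum ContainerState { NEW, RUNNING, COMPLETE }

    static final String CONTAINER_STATE_PREFIX = "C_";

    // String-based conversion, as quoted in the issue description.
    static ContainerState fromProtoViaName(ContainerStateProto e) {
        return ContainerState.valueOf(e.name().replace(CONTAINER_STATE_PREFIX, ""));
    }

    // Switch-based conversion: no string allocation and no name lookup.
    static ContainerState fromProtoViaSwitch(ContainerStateProto e) {
        switch (e) {
            case C_NEW:
                return ContainerState.NEW;
            case C_RUNNING:
                return ContainerState.RUNNING;
            case C_COMPLETE:
                return ContainerState.COMPLETE;
            default:
                throw new IllegalArgumentException("Unknown container state: " + e);
        }
    }

    public static void main(String[] args) {
        // Both conversions should agree for every constant.
        for (ContainerStateProto p : ContainerStateProto.values()) {
            System.out.println(p + " -> " + fromProtoViaSwitch(p));
        }
    }
}
```

One trade-off of the switch form is that a newly added enum constant needs an explicit case, which is why the sketch throws in the default branch rather than returning null.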


