[jira] [Commented] (YARN-10002) Code cleanup and improvements in ConfigurationStoreBaseTest

2020-04-09 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17080250#comment-17080250
 ] 

Hadoop QA commented on YARN-10002:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 50s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} |
|| || || || {color:brown} branch-3.2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 27s{color} | {color:green} branch-3.2 passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 18s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-3.2 failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 40s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 53s{color} | {color:green} branch-3.2 passed {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red}  2m 31s{color} | {color:red} branch has errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 23s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 35s{color} | {color:green} branch-3.2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 37s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 37s{color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 17 new + 0 unchanged - 0 fixed = 17 total (was 0) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 28s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 4 new + 2 unchanged - 4 fixed = 6 total (was 6) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 11s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 34s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}391m 25s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 31s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}455m 59s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector |
|   | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestParentQueue |
|   | hadoop.yarn.server.resourcemanager.scheduler.policy.TestFairOrderingPolicy |
|   | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerSurgicalPreemption |
|   | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestSchedulingRequestContainerAllocation |
|   | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerWithMultiResourceTypes |
|   | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerResizing |
|   | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer |
|   | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation |
|   | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationPriority |

[jira] [Commented] (YARN-10223) Duplicate jersey-test-framework-core dependency in yarn-server-common

2020-04-09 Thread Akira Ajisaka (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17080158#comment-17080158
 ] 

Akira Ajisaka commented on YARN-10223:
--

It's not critical, so I targeted this to 3.3.1.

> Duplicate jersey-test-framework-core dependency in yarn-server-common
> -
>
> Key: YARN-10223
> URL: https://issues.apache.org/jira/browse/YARN-10223
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: build
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Minor
>
> The following warning appears in the Maven log:
> {noformat}
> [WARNING] 'dependencies.dependency.(groupId:artifactId:type:classifier)' must 
> be unique: 
> com.sun.jersey.jersey-test-framework:jersey-test-framework-core:jar -> 
> version (?) vs 1.19 @ line 148, column 17
> {noformat}
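>
> The warning means the artifact is declared twice in the module's pom.xml, 
> once with its version managed elsewhere and once pinned to 1.19. Roughly (an 
> illustrative sketch, not the actual pom contents; removing one of the two 
> declarations silences the warning):
> {code:xml}
> <dependency>
>   <groupId>com.sun.jersey.jersey-test-framework</groupId>
>   <artifactId>jersey-test-framework-core</artifactId>
>   <scope>test</scope>
> </dependency>
> <!-- duplicate of the declaration above, pinned to 1.19 -->
> <dependency>
>   <groupId>com.sun.jersey.jersey-test-framework</groupId>
>   <artifactId>jersey-test-framework-core</artifactId>
>   <version>1.19</version>
>   <scope>test</scope>
> </dependency>
> {code}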






[jira] [Updated] (YARN-7558) "yarn logs" command fails to get logs for running containers if UI authentication is enabled.

2020-04-09 Thread Xiaoyu Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-7558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated YARN-7558:
-
Reporter: Namit Maheshwari  (was: Namit Maheshwari)

> "yarn logs" command fails to get logs for running containers if UI 
> authentication is enabled.
> -
>
> Key: YARN-7558
> URL: https://issues.apache.org/jira/browse/YARN-7558
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Namit Maheshwari
>Assignee: Xuan Gong
>Priority: Critical
> Fix For: 3.1.0, 2.9.1, 3.0.1
>
> Attachments: YARN-7558.1.patch, YARN-7558.2.patch
>
>







[jira] [Updated] (YARN-10002) Code cleanup and improvements in ConfigurationStoreBaseTest

2020-04-09 Thread Benjamin Teke (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Teke updated YARN-10002:
-
Attachment: YARN-10002.branch-3.2.001.patch

> Code cleanup and improvements in ConfigurationStoreBaseTest
> ---
>
> Key: YARN-10002
> URL: https://issues.apache.org/jira/browse/YARN-10002
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Benjamin Teke
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: YARN-10002.001.patch, YARN-10002.002.patch, 
> YARN-10002.003.patch, YARN-10002.004.patch, YARN-10002.005.patch, 
> YARN-10002.006.patch, YARN-10002.branch-3.2.001.patch
>
>
> * Some protected fields could be package-private
> * Could add a helper method that prepares a simple LogMutation with 1, 2 or 3 
> updates (key + value), as this pattern is used extensively in subclasses (see 
> the sketch below)
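>
> A minimal sketch of such a helper (assuming the 
> {{YarnConfigurationStore.LogMutation(Map, String)}} constructor these tests 
> already use; {{TEST_USER}} is a placeholder constant, and java.util.Map / 
> java.util.HashMap imports are assumed):
> {code:java}
> private YarnConfigurationStore.LogMutation prepareLogMutation(
>     String... keysAndValues) {
>   // Interpret the varargs as alternating key/value pairs, so callers can
>   // write prepareLogMutation("key1", "val1", "key2", "val2").
>   Map<String, String> updates = new HashMap<>();
>   for (int i = 0; i + 1 < keysAndValues.length; i += 2) {
>     updates.put(keysAndValues[i], keysAndValues[i + 1]);
>   }
>   return new YarnConfigurationStore.LogMutation(updates, TEST_USER);
> }
> {code}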






[jira] [Updated] (YARN-5625) FairScheduler should use FSContext more aggressively to avoid constructors with many parameters

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-5625:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> FairScheduler should use FSContext more aggressively to avoid constructors 
> with many parameters
> ---
>
> Key: YARN-5625
> URL: https://issues.apache.org/jira/browse/YARN-5625
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.9.0
>Reporter: Karthik Kambatla
>Priority: Major
>
> YARN-5609 introduces FSContext, a structure to capture basic FairScheduler 
> information. In addition to preemption details, it could host references to 
> the scheduler, QueueManager, AllocationConfiguration etc. 






[jira] [Updated] (YARN-4843) [Umbrella] Revisit YARN ProtocolBuffer int32 usages that need to upgrade to int64

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-4843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-4843:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> [Umbrella] Revisit YARN ProtocolBuffer int32 usages that need to upgrade to 
> int64
> -
>
> Key: YARN-4843
> URL: https://issues.apache.org/jira/browse/YARN-4843
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: api
>Affects Versions: 3.0.0-alpha1
>Reporter: Wangda Tan
>Priority: Major
>
> This JIRA is to track all int32 usages in YARN's ProtocolBuffer APIs that we 
> possibly need to update to int64.
> One example is the resource API. We use int32 for memory now; if a cluster has 
> 10k nodes and each node has 210G of memory, we will get a negative total 
> cluster memory.
> We may have other fields that need to upgrade from int32 to int64.
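>
> A quick check of the arithmetic (YARN tracks memory in MB; assuming 210G means 
> 210 * 1024 MB):
> {code:java}
> int nodes = 10_000;
> int memPerNodeMb = 210 * 1024;               // 215,040 MB per node
> long correct = (long) nodes * memPerNodeMb;  // 2,150,400,000 MB
> int wrapped = nodes * memPerNodeMb;          // -2,144,567,296: int32 overflow,
>                                              // since Integer.MAX_VALUE is
>                                              // only 2,147,483,647
> {code}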






[jira] [Updated] (YARN-5465) Server-Side NM Graceful Decommissioning subsequent call behavior

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-5465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-5465:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Server-Side NM Graceful Decommissioning subsequent call behavior
> 
>
> Key: YARN-5465
> URL: https://issues.apache.org/jira/browse/YARN-5465
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: graceful
>Reporter: Robert Kanter
>Priority: Major
>
> The Server-Side NM Graceful Decommissioning feature added by YARN-4676 has 
> the following behavior when subsequent calls are made:
> # Start a long-running job that has containers running on nodeA
> # Add nodeA to the exclude file
> # Run {{-refreshNodes -g 120 -server}} (2min) to begin gracefully 
> decommissioning nodeA
> # Wait 30 seconds
> # Add nodeB to the exclude file
> # Run {{-refreshNodes -g 30 -server}} (30sec)
> # After 30 seconds, both nodeA and nodeB shut down
> In a nutshell, issuing a subsequent call to gracefully decommission nodes 
> updates the timeout for any currently decommissioning nodes.  This makes it 
> impossible to gracefully decommission different sets of nodes with different 
> timeouts.  Though it does let you easily update the timeout of currently 
> decommissioning nodes.
> Another behavior we could implement is this:
> # {color:grey}Start a long-running job that has containers running on 
> nodeA{color}
> # {color:grey}Add nodeA to the exclude file{color}
> # {color:grey}Run {{-refreshNodes -g 120 -server}} (2min) to begin gracefully 
> decommissioning nodeA{color}
> # {color:grey}Wait 30 seconds{color}
> # {color:grey}Add nodeB to the exclude file{color}
> # {color:grey}Run {{-refreshNodes -g 30 -server}} (30sec){color}
> # After 30 seconds, nodeB shuts down
> # After 60 more seconds, nodeA shuts down
> This keeps the nodes affected by each call to gracefully decommission nodes 
> independent.  You can now have different sets of decommissioning nodes with 
> different timeouts.  However, to update the timeout of a currently 
> decommissioning node, you'd have to first recommission it, and then 
> decommission it again.






[jira] [Updated] (YARN-4637) AM launching blacklist purge mechanism (time based)

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-4637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-4637:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> AM launching blacklist purge mechanism (time based)
> ---
>
> Key: YARN-4637
> URL: https://issues.apache.org/jira/browse/YARN-4637
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Junping Du
>Assignee: Sunil G
>Priority: Major
>







[jira] [Updated] (YARN-5414) Integrate NodeQueueLoadMonitor with ClusterNodeTracker

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-5414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-5414:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Integrate NodeQueueLoadMonitor with ClusterNodeTracker
> --
>
> Key: YARN-5414
> URL: https://issues.apache.org/jira/browse/YARN-5414
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: container-queuing, distributed-scheduling, scheduler
>Reporter: Arun Suresh
>Assignee: Abhishek Modi
>Priority: Major
>
> The {{ClusterNodeTracker}} tracks the states of clusterNodes and provides 
> convenience methods like sort and filter.
> The {{NodeQueueLoadMonitor}} should use the {{ClusterNodeTracker}} instead of 
> maintaining its own data-structure of node information.






[jira] [Updated] (YARN-4944) Handle lack of ResourceCalculatorPlugin gracefully

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-4944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-4944:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Handle lack of ResourceCalculatorPlugin gracefully
> --
>
> Key: YARN-4944
> URL: https://issues.apache.org/jira/browse/YARN-4944
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Priority: Major
>  Labels: newbie++
>
> On some systems (e.g. mac), the NM might not be able to instantiate a 
> ResourceCalculatorPlugin, which leads to logging a bunch of error messages. We 
> could improve the way we handle this; one possible shape is sketched below.
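>
> A sketch only, assuming {{getResourceCalculatorPlugin}} returns null when no 
> plugin is available for the platform ({{LOG}} and {{conf}} stand for the NM's 
> logger and configuration):
> {code:java}
> ResourceCalculatorPlugin plugin =
>     ResourceCalculatorPlugin.getResourceCalculatorPlugin(null, conf);
> if (plugin == null) {
>   // Warn once and continue with resource monitoring disabled, instead of
>   // logging a stream of errors.
>   LOG.warn("No ResourceCalculatorPlugin available for this platform; "
>       + "NodeManager resource monitoring is disabled.");
> }
> {code}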






[jira] [Updated] (YARN-5536) Multiple format support (JSON, etc.) for exclude node file in NM graceful decommission with timeout

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-5536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-5536:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Multiple format support (JSON, etc.) for exclude node file in NM graceful 
> decommission with timeout
> ---
>
> Key: YARN-5536
> URL: https://issues.apache.org/jira/browse/YARN-5536
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: graceful
>Reporter: Junping Du
>Priority: Major
>
> Per discussion in YARN-4676, we agree that multiple formats (other than XML) 
> should be supported to decommission nodes with timeout values.






[jira] [Updated] (YARN-1426) YARN Components need to unregister their beans upon shutdown

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-1426:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> YARN Components need to unregister their beans upon shutdown
> 
>
> Key: YARN-1426
> URL: https://issues.apache.org/jira/browse/YARN-1426
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.3.0, 3.0.0-alpha1
>Reporter: Jonathan Turner Eagles
>Assignee: Jonathan Turner Eagles
>Priority: Major
>  Labels: oct16-easy
> Attachments: YARN-1426.2.patch, YARN-1426.patch, YARN-1426.patch
>
>







[jira] [Updated] (YARN-4953) Delete completed container log folder when rolling log aggregation is enabled

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-4953:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Delete completed container log folder when rolling log aggregation is enabled
> -
>
> Key: YARN-4953
> URL: https://issues.apache.org/jira/browse/YARN-4953
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>Priority: Major
>
> There is a potential bottleneck when a cluster runs a very large number of 
> containers on the same NodeManager for a single application. Linux limits the 
> subfolder count to 32K; if an application has more than 32K containers, 
> container launches start to fail, and at that point no more containers can be 
> launched on this node.
> Currently log folders are deleted only after the app is finished, while 
> rolling log aggregation aggregates logs to HDFS periodically.
> I think that once aggregation has completed for finished containers, cleanup 
> can be done, i.e. deleting the log folders of those finished containers.






[jira] [Updated] (YARN-9883) Reshape SchedulerHealth class

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-9883:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Reshape SchedulerHealth class
> -
>
> Key: YARN-9883
> URL: https://issues.apache.org/jira/browse/YARN-9883
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, yarn
>Affects Versions: 3.3.0
>Reporter: Adam Antal
>Assignee: Kinga Marton
>Priority: Minor
>
> The {{SchedulerHealth}} class has some flaws, for example:
> - It has no javadoc at all
> - All its objects are package-private: they should be private
> - The internal maps should be (Concurrent) EnumMaps instead of HashMaps: they 
> are more efficient at storing enum keys (see the sketch after this list)
> - schedulerHealthDetails only stores the last operation, its name should 
> reflect that (just like lastSchedulerRunDetails)
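>
> A minimal illustration of the EnumMap point (the {{Operation}} enum below is 
> hypothetical; {{SchedulerHealth}} would key its maps on its own operation 
> types):
> {code:java}
> import java.util.EnumMap;
> import java.util.Map;
>
> class SchedulerHealthSketch {
>   enum Operation { ALLOCATION, RELEASE, PREEMPTION, RESERVATION }
>
>   // An EnumMap is backed by an array indexed by the enum ordinal, so it is
>   // both more compact and faster than a HashMap when all keys are enums.
>   private final Map<Operation, Long> lastRunDetails =
>       new EnumMap<>(Operation.class);
> }
> {code}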






[jira] [Updated] (YARN-2024) IOException in AppLogAggregatorImpl does not give stacktrace and leaves aggregated TFile in a bad state.

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-2024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-2024:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> IOException in AppLogAggregatorImpl does not give stacktrace and leaves 
> aggregated TFile in a bad state.
> 
>
> Key: YARN-2024
> URL: https://issues.apache.org/jira/browse/YARN-2024
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation
>Affects Versions: 0.23.10, 2.4.0
>Reporter: Eric Payne
>Assignee: Xuan Gong
>Priority: Major
>
> Multiple issues were encountered when AppLogAggregatorImpl encountered an 
> IOException in AppLogAggregatorImpl#uploadLogsForContainer while aggregating 
> yarn-logs for an application that had very large (>150G each) error logs.
> - An IOException was encountered during the LogWriter#append call, and a 
> message was printed, but no stacktrace was provided. Message: "ERROR: 
> Couldn't upload logs for container_n_nnn_nn_nn. Skipping 
> this container."
> - After the IOException, the TFile is in a bad state, so subsequent calls to 
> LogWriter#append fail with the following stacktrace:
> 2014-04-16 13:29:09,772 [LogAggregationService #17907] ERROR 
> org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread 
> Thread[LogAggregationService #17907,5,main] threw an Exception.
> java.lang.IllegalStateException: Incorrect state to start a new key: IN_VALUE
> at 
> org.apache.hadoop.io.file.tfile.TFile$Writer.prepareAppendKey(TFile.java:528)
> at 
> org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogWriter.append(AggregatedLogFormat.java:262)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.uploadLogsForContainer(AppLogAggregatorImpl.java:128)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregation(AppLogAggregatorImpl.java:164)
> ...
> - At this point, the yarn-logs cleaner still thinks the thread is 
> aggregating, so the huge yarn-logs never get cleaned up for that application.






[jira] [Updated] (YARN-4969) Fix more loggings in CapacityScheduler

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-4969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-4969:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Fix more loggings in CapacityScheduler
> --
>
> Key: YARN-4969
> URL: https://issues.apache.org/jira/browse/YARN-4969
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Major
>  Labels: oct16-easy
> Attachments: YARN-4969.1.patch
>
>
> YARN-3966 did a logging cleanup for Capacity Scheduler before; however, there 
> are still some log messages we need to improve:
> Container allocation / complete / reservation / un-reserve messages for every 
> hierarchy level (app/leaf/parent-queue) should be printed at INFO level:
> I'm debugging an issue where the root queue's resource usage can go negative. 
> It is very hard to reproduce, so we cannot keep debug logging enabled from RM 
> start; the log would not fit on a single disk.
> The existing CS prints an INFO message when a container cannot be allocated, 
> such as on re-reservation / node heartbeat, etc.; we should avoid printing 
> such messages at INFO level.






[jira] [Updated] (YARN-10032) Implement regex querying of logs

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-10032:

Target Version/s: 3.4.0

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Implement regex querying of logs
> 
>
> Key: YARN-10032
> URL: https://issues.apache.org/jira/browse/YARN-10032
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.2.1
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Fix For: 3.3.0
>
>
> After YARN-10031, we have query parameters on the log servlet's GET endpoint.
> To demonstrate the new capabilities of the log servlet and how easy it will 
> be to add a piece of functionality to all log servlets at the same time, 
> let's add the ability to search the aggregated logs with a given regex.
> A conceptual use case:
> A user runs several MR jobs daily, but some of them initially fail to localize 
> a particular resource. We want to search the logs of these Yarn applications 
> and extract some data from them.






[jira] [Updated] (YARN-2684) FairScheduler: When failing an application due to changes in queue config or placement policy, indicate the cause.

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-2684:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> FairScheduler: When failing an application due to changes in queue config or 
> placement policy, indicate the cause.
> --
>
> Key: YARN-2684
> URL: https://issues.apache.org/jira/browse/YARN-2684
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.5.1
>Reporter: Karthik Kambatla
>Priority: Major
> Attachments: 0001-YARN-2684.patch, 0002-YARN-2684.patch
>
>
> YARN-2308 fixes this issue for CS, this JIRA is to fix it for FS. 






[jira] [Updated] (YARN-1946) need Public interface for WebAppUtils.getProxyHostAndPort

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-1946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-1946:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> need Public interface for WebAppUtils.getProxyHostAndPort
> -
>
> Key: YARN-1946
> URL: https://issues.apache.org/jira/browse/YARN-1946
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, webapp
>Affects Versions: 2.4.0
>Reporter: Thomas Graves
>Priority: Major
>
> ApplicationMasters are supposed to go through the ResourceManager web app 
> proxy if they have web UIs, so they are properly secured. There is currently 
> no public interface for Application Masters to conveniently get the proxy 
> host and port.  There is a function in WebAppUtils, but that class is 
> private.  
> We should provide this as a utility since any properly written AM will need 
> to do this.






[jira] [Updated] (YARN-4638) Node whitelist support for AM launching

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-4638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-4638:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Node whitelist support for AM launching 
> 
>
> Key: YARN-4638
> URL: https://issues.apache.org/jira/browse/YARN-4638
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Major
>







[jira] [Updated] (YARN-4636) Make blacklist tracking policy pluggable for more extensions.

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-4636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-4636:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Make blacklist tracking policy pluggable for more extensions.
> -
>
> Key: YARN-4636
> URL: https://issues.apache.org/jira/browse/YARN-4636
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Junping Du
>Assignee: Sunil G
>Priority: Major
>







[jira] [Updated] (YARN-4971) RM fails to re-bind to wildcard IP after failover in multi homed clusters

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-4971:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> RM fails to re-bind to wildcard IP after failover in multi homed clusters
> -
>
> Key: YARN-4971
> URL: https://issues.apache.org/jira/browse/YARN-4971
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.2
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Attachments: YARN-4971.1.patch
>
>
> If the RM has {{yarn.resourcemanager.bind-host}} set to 0.0.0.0, then the 
> first time the service becomes active, binding to the wildcard works as 
> expected. If the service has transitioned from active to standby and then 
> becomes active again after failovers, the service only binds to one of the IP 
> addresses.
> There is a difference between the services inside the RM: it only seems to 
> happen for the services listening on ports 8030 and 8032 (the setting is 
> quoted below for reference).
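>
> {code:xml}
> <property>
>   <name>yarn.resourcemanager.bind-host</name>
>   <value>0.0.0.0</value>
> </property>
> {code}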






[jira] [Updated] (YARN-4808) SchedulerNode can use a few more cosmetic changes

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-4808:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> SchedulerNode can use a few more cosmetic changes
> -
>
> Key: YARN-4808
> URL: https://issues.apache.org/jira/browse/YARN-4808
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
>Priority: Major
> Attachments: yarn-4808-1.patch, yarn-4808-2.patch
>
>
> We have made some cosmetic changes to SchedulerNode recently. While working 
> on YARN-4511, we realized we could improve it a little more:
> # Remove volatile variables - we don't see the need for them to be volatile
> # Consolidate methods that end up doing very similar things
> # Rename totalResource to capacity. YARN-4511 plans to add inflatedCapacity 
> to include the un-utilized resources, and having two totals can be a little 
> confusing.






[jira] [Updated] (YARN-2014) Performance: AM scaleability is 10% slower in 2.4 compared to 0.23.9

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-2014:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Performance: AM scaleability is 10% slower in 2.4 compared to 0.23.9
> 
>
> Key: YARN-2014
> URL: https://issues.apache.org/jira/browse/YARN-2014
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: patrick white
>Assignee: Jason Darrell Lowe
>Priority: Major
>
> Performance comparison benchmarks of 2.x against 0.23 show that the AM 
> scalability benchmark's runtime is approximately 10% slower in 2.4.0. The 
> trend is consistent across later releases in both lines; the latest release 
> numbers are:
> 2.4.0.0 runtime 255.6 seconds (avg 5 passes)
> 0.23.9.12 runtime 230.4 seconds (avg 5 passes)
> Diff: -9.9% 
> AM Scalability test is essentially a sleep job that measures time to launch 
> and complete a large number of mappers.
> The diff is consistent and has been reproduced in both a larger (350 node, 
> 100,000 mappers) perf environment, as well as a small (10 node, 2,900 
> mappers) demo cluster.






[jira] [Updated] (YARN-7578) Extend TestDiskFailures.waitForDiskHealthCheck() sleeping time.

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-7578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-7578:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Extend TestDiskFailures.waitForDiskHealthCheck() sleeping time.
> ---
>
> Key: YARN-7578
> URL: https://issues.apache.org/jira/browse/YARN-7578
> Project: Hadoop YARN
>  Issue Type: Test
>Affects Versions: 3.1.0
> Environment: ARMv8 AArch64, Ubuntu16.04
>Reporter: Guangming Zhang
>Priority: Minor
>  Labels: dtest, patch, test
> Attachments: YARN-7578.0.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> The Thread.sleep() function is called to wait for the NodeManager to identify 
> disk failures. But in some cases, for example on lower-end hardware, the 
> sleep time is too short, so the NodeManager may not have finished identifying 
> disk failures. This causes test errors:
> {code:java}
>   Running org.apache.hadoop.yarn.server.TestDiskFailures
>   Tests run: 3, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 17.686 
> sec <<< FAILURE! - in org.apache.hadoop.yarn.server.TestDiskFailures
>   testLocalDirsFailures(org.apache.hadoop.yarn.server.TestDiskFailures)  
> Time elapsed: 10.412 sec  <<< FAILURE!
>   java.lang.AssertionError: NodeManager could not identify disk failure.
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.yarn.server.TestDiskFailures.verifyDisksHealth(TestDiskFailures.java:239)
>   at 
> org.apache.hadoop.yarn.server.TestDiskFailures.testDirsFailures(TestDiskFailures.java:186)
>   at 
> org.apache.hadoop.yarn.server.TestDiskFailures.testLocalDirsFailures(TestDiskFailures.java:99)
>   testLogDirsFailures(org.apache.hadoop.yarn.server.TestDiskFailures)  
> Time elapsed: 5.99 sec  <<< FAILURE!
>   java.lang.AssertionError: NodeManager could not identify disk failure.
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.yarn.server.TestDiskFailures.verifyDisksHealth(TestDiskFailures.java:239)
>   at 
> org.apache.hadoop.yarn.server.TestDiskFailures.testDirsFailures(TestDiskFailures.java:186)
>   at 
> org.apache.hadoop.yarn.server.TestDiskFailures.testLogDirsFailures(TestDiskFailures.java:111)
> {code}
>  So extend the sleep time from 1000ms to 1500ms to avoid some unit test 
> errors.






[jira] [Updated] (YARN-5674) FairScheduler handles "dots" in user names inconsistently in the config

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-5674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-5674:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> FairScheduler handles "dots" in user names inconsistently in the config
> ---
>
> Key: YARN-5674
> URL: https://issues.apache.org/jira/browse/YARN-5674
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.6.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
>
> A user name can contain a dot; because it could be used as the queue name, we 
> replace the dot with a defined separator. When defining queues in the 
> configuration for users whose names contain a dot, we expect the dot to be 
> replaced by the "\_dot\_" string.
> In the user limits we do not do that: user limits need a literal dot in the 
> user name. This is confusing when you create a scheduler configuration, since 
> in some places you need to replace the dot and in others you do not, and it 
> can cause issues where user limits are not enforced as expected. The snippet 
> below illustrates the mismatch.
> We should use one way to specify the user, and since the queue naming cannot 
> be changed, we should use the same "\_dot\_" in the user limits as well and 
> enforce them correctly.
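>
> A small illustration of the mismatch for a user named "first.last" 
> (fair-scheduler.xml sketch; the limits shown are placeholders):
> {code:xml}
> <allocations>
>   <!-- Queue named after the user: the dot must be written as _dot_ -->
>   <queue name="first_dot_last">
>     <maxRunningApps>10</maxRunningApps>
>   </queue>
>   <!-- User limit for the same user: a literal dot is expected here -->
>   <user name="first.last">
>     <maxRunningApps>5</maxRunningApps>
>   </user>
> </allocations>
> {code}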






[jira] [Updated] (YARN-10138) Document the new JHS API

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-10138:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Document the new JHS API
> 
>
> Key: YARN-10138
> URL: https://issues.apache.org/jira/browse/YARN-10138
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.3.0
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
>
> A new API has been introduced in YARN-10028, but we did not document it in 
> the JHS API documentation. Let's add it.






[jira] [Updated] (YARN-10106) Yarn logs CLI filtering by application attempt

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-10106:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Yarn logs CLI filtering by application attempt
> --
>
> Key: YARN-10106
> URL: https://issues.apache.org/jira/browse/YARN-10106
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.3.0
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Trivial
>
> {{ContainerLogsRequest}} got a new parameter in YARN-10101, which is the 
> {{applicationAttempt}} - we can use this new parameter in Yarn logs CLI as 
> well to filter by application attempt.






[jira] [Updated] (YARN-867) Isolation of failures in aux services

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-867:
--
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Isolation of failures in aux services 
> --
>
> Key: YARN-867
> URL: https://issues.apache.org/jira/browse/YARN-867
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Hitesh Shah
>Assignee: Xuan Gong
>Priority: Major
> Attachments: YARN-867.1.sampleCode.patch, YARN-867.3.patch, 
> YARN-867.4.patch, YARN-867.5.patch, YARN-867.6.patch, 
> YARN-867.sampleCode.2.patch
>
>
> Today, a malicious application can bring down the NM by sending bad data to a 
> service. For example, sending data to the ShuffleService such that it results 
> in any non-IOException will cause the NM's async dispatcher to exit, as the 
> service's INIT APP event is not handled properly.






[jira] [Updated] (YARN-4495) add a way to tell AM container increase/decrease request is invalid

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-4495:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> add a way to tell AM container increase/decrease request is invalid
> ---
>
> Key: YARN-4495
> URL: https://issues.apache.org/jira/browse/YARN-4495
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: api, client
>Reporter: sandflee
>Priority: Major
>  Labels: oct16-hard
> Attachments: YARN-4495.01.patch
>
>
> Right now the RM may pass an InvalidResourceRequestException to the AM or just 
> ignore the change request; the former will bring AMRMClientAsync down, and the 
> latter will leave the AM waiting for the reply.






[jira] [Updated] (YARN-4485) [Umbrella] Capture per-application and per-queue container allocation latency

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-4485:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> [Umbrella] Capture per-application and per-queue container allocation latency
> -
>
> Key: YARN-4485
> URL: https://issues.apache.org/jira/browse/YARN-4485
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Karthik Kambatla
>Priority: Major
>  Labels: supportability, tuning
>
> Per-application and per-queue container allocation latencies would go a long 
> way towards helping tune scheduler queue configs.
> This umbrella JIRA tracks adding these metrics.






[jira] [Updated] (YARN-6382) Address race condition on TimelineWriter.flush() caused by buffer-sized flush

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-6382:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Address race condition on TimelineWriter.flush() caused by buffer-sized flush
> -
>
> Key: YARN-6382
> URL: https://issues.apache.org/jira/browse/YARN-6382
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha2
>Reporter: Haibo Chen
>Assignee: Yousef Abu-Salah
>Priority: Major
>
> YARN-6376 fixes the race condition between putEntities() and the periodic 
> flush() by WriterFlushThread in TimelineCollectorManager, as well as between 
> putEntities() calls in different threads.
> However, BufferedMutator can have an internal size-based flush as well. We 
> need to address the resulting race condition.






[jira] [Updated] (YARN-6488) Remove continuous scheduling tests

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-6488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-6488:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Remove continuous scheduling tests
> --
>
> Key: YARN-6488
> URL: https://issues.apache.org/jira/browse/YARN-6488
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
>
> Remove all continuous scheduling tests from the code






[jira] [Updated] (YARN-8149) Revisit behavior of Re-Reservation in Capacity Scheduler

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-8149:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Revisit behavior of Re-Reservation in Capacity Scheduler
> 
>
> Key: YARN-8149
> URL: https://issues.apache.org/jira/browse/YARN-8149
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Priority: Major
>
> Frankly speaking, I'm not sure why we need the re-reservation. The formula is 
> not that easy to understand:
> Inside: 
> {{org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator#shouldAllocOrReserveNewContainer}}
> {code:java}
> starvation = re-reservation / (#reserved-container *
>   (1 - min(requested-resource / max-alloc,
>            max-alloc - min-alloc / max-alloc)))
> should_allocate = starvation + requiredContainers - reservedContainers > 0
> {code}
> I think we should be able to remove the starvation computation; just checking 
> requiredContainers > reservedContainers should be enough (sketched below).
> In a large cluster, we can easily overflow re-reservation to MAX_INT, see 
> YARN-7636. 
>  
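> The proposed simplification as a sketch (names follow the formula above, not 
> the actual allocator code):
> {code:java}
> // Drop the starvation term entirely: allocate or reserve a new container
> // only while more containers are required than are already reserved.
> boolean shouldAllocOrReserveNewContainer =
>     requiredContainers > reservedContainers;
> {code}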






[jira] [Updated] (YARN-6147) Blacklisting nodes not happening for AM containers

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-6147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-6147:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Blacklisting nodes not happening for AM containers
> --
>
> Key: YARN-6147
> URL: https://issues.apache.org/jira/browse/YARN-6147
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>Priority: Major
>
> Blacklisting of nodes is not happening in the following scenarios:
> 1. RMAppAttempt is in ALLOCATED and a LAUNCH_FAILED event comes while the NM 
> is down.
> 2. RMAppAttempt is in LAUNCHED and an EXPIRE event comes while the NM is down.
> In both these cases the AppAttempt goes to *FINAL_SAVING* and eventually to 
> the *FINAL* state before the *CONTAINER_FINISHED* event is triggered by 
> {{RMContainerImpl}}, and in the {{FINAL}} state the {{CONTAINER_FINISHED}} 
> event is ignored.






[jira] [Updated] (YARN-8940) [CSI] Add volume as a top-level attribute in service spec

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-8940:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> [CSI] Add volume as a top-level attribute in service spec 
> --
>
> Key: YARN-8940
> URL: https://issues.apache.org/jira/browse/YARN-8940
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
>  Labels: CSI
>
> Initial thought:
> {noformat}
> {
>   "name": "volume example",
>   "version": "1.0.0",
>   "description": "a volume simple example",
>   "components": [
>     {
>       "name": "",
>       "number_of_containers": 1,
>       "artifact": {
>         "id": "docker.io/centos:latest",
>         "type": "DOCKER"
>       },
>       "launch_command": "sleep,120",
>       "configuration": {
>         "env": {
>           "YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE": "true"
>         }
>       },
>       "resource": {
>         "cpus": 1,
>         "memory": "256"
>       },
>       "volumes": [
>         {
>           "volume": {
>             "type": "s3_csi",
>             "id": "5504d4a8-b246-11e8-94c2-026b17aa1190",
>             "capability": {
>               "min": "5Gi",
>               "max": "100Gi"
>             },
>             "source_path": "s3://my_bucket/my", # optional for object stores
>             "mount_path": "/mnt/data", # required, the mount point in docker container
>             "access_mode": "SINGLE_READ" # how the volume can be accessed
>           }
>         }
>       ]
>     }
>   ]
> }
> {noformat}
> Open for discussion.






[jira] [Updated] (YARN-8256) Pluggable provider for node membership management

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-8256:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Pluggable provider for node membership management
> -
>
> Key: YARN-8256
> URL: https://issues.apache.org/jira/browse/YARN-8256
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: resourcemanager
>Affects Versions: 2.8.3, 3.0.2
>Reporter: Dagang Wei
>Priority: Major
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> h1. Background
> HDFS-7541 introduced a pluggable provider framework for node membership 
> management, which gives HDFS the flexibility to have different ways to manage 
> node membership for different needs.
> [org.apache.hadoop.hdfs.server.blockmanagement.HostConfigManager|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HostConfigManager.java]
>  is the class which provides the abstraction. Currently, there are 2 
> implementations in the HDFS codebase:
> 1) 
> [org.apache.hadoop.hdfs.server.blockmanagement.HostFileManager|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HostFileManager.java]
>  which uses 2 config files which are defined by the properties dfs.hosts and 
> dfs.hosts.exclude.
> 2) 
> [org.apache.hadoop.hdfs.server.blockmanagement.CombinedHostFileManager|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/CombinedHostFileManager.java]
>  which uses a single JSON file defined by the property dfs.hosts.
> dfs.namenode.hosts.provider.classname is the property determining which 
> implementation is used.
> h1. Problem
> YARN should be consistent with HDFS in terms of a pluggable provider for node 
> membership management. Its absence makes it impossible for YARN to support 
> other config sources, e.g., ZooKeeper, a database, other config file formats, 
> etc.
> h1. Proposed solution
> [org.apache.hadoop.yarn.server.resourcemanager.NodesListManager|https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/NodesListManager.java]
>  is the class for managing YARN node membership today. It uses 
> [HostsFileReader|https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/HostsFileReader.java]
>  to read config files specified by the property 
> yarn.resourcemanager.nodes.include-path for nodes to include and 
> yarn.resourcemanager.nodes.nodes.exclude-path for nodes to exclude.
> The proposed solution is to
> 1) introduce a new interface {color:#008000}HostsConfigManager{color} which 
> provides the abstraction for node membership management. Update 
> {color:#008000}NodeListManager{color} to depend on 
> {color:#008000}HostsConfigManager{color} instead of 
> {color:#008000}HostsFileReader{color}. Then create a wrapper class for 
> {color:#008000}HostsFileReader{color} which implements the interface.
> 2) introduce a new config property 
> {color:#008000}yarn.resourcemanager.hosts-config.manager.class{color} for 
> specifying the implementation class. Set the default value to the wrapper 
> class of {color:#008000}HostsFileReader{color} for backward compatibility 
> between new code and old config.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8366) Expose debug log information when user intends to enable GPU without setting nvidia-smi path

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-8366:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Expose debug log information when user intends to enable GPU without setting 
> nvidia-smi path
> ---
>
> Key: YARN-8366
> URL: https://issues.apache.org/jira/browse/YARN-8366
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0
>Reporter: Zian Chen
>Assignee: Zian Chen
>Priority: Major
>
> Expose debug information to help users find the root cause of failure when 
> they don't make these two settings manually before enabling GPU on YARN (a 
> sketch of setting 1 follows below):
> 1. yarn.nodemanager.resource-plugins.gpu.path-to-discovery-executables in 
> yarn-site.xml
> 2. environment variable LD_LIBRARY_PATH
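> A minimal sketch of setting 1 in yarn-site.xml; the value below is a 
> placeholder for the directory containing nvidia-smi, not a recommended path:
> {code:xml}
> <!-- Sketch only: the value is a placeholder. -->
> <property>
>   <name>yarn.nodemanager.resource-plugins.gpu.path-to-discovery-executables</name>
>   <value>/usr/local/nvidia/bin</value>
> </property>
> {code}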



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5814) Add druid as storage backend in YARN Timeline Service

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-5814:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

>  Add druid as storage backend in YARN Timeline Service
> --
>
> Key: YARN-5814
> URL: https://issues.apache.org/jira/browse/YARN-5814
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: ATSv2
>Affects Versions: 3.0.0-alpha2
>Reporter: Bingxue Qiu
>Priority: Major
> Attachments: Add-Druid-in-YARN-Timeline-Service.pdf
>
>
> h3. Introduction
> I propose to add druid as storage backend in YARN Timeline Service.
> We run more than 6000 applications and generate 450 million metrics daily in 
> Alibaba clusters with thousands of nodes. We need to collect and store 
> meta/events/metrics data, analyze utilization reports across various 
> dimensions online, and display the trends of allocated/used resources for the 
> cluster by joining and aggregating data. This helps us manage and optimize 
> the cluster by tracking resource utilization.
> To achieve this we have switched to druid as the storage instead of HBase, 
> and have sustained sub-second OLAP performance in our production environment 
> for a few months.
> h3. Analysis
> Currently YARN Timeline Service only supports aggregating metrics at a) the 
> flow level via FlowRunCoprocessor and b) the application level via 
> AppLevelTimelineCollector; offline (time-based periodic) aggregation for 
> flows/users/queues for reporting and analysis is planned but not yet 
> implemented. YARN Timeline Service uses Apache HBase as the primary storage 
> backend, and HBase is not a good fit for OLAP.
>  For arbitrary exploration of data, such as online analysis of utilization 
> reports across various dimensions (Queue, Flow, Users, Application, CPU, 
> Memory) by joining and aggregating data, Druid's custom column format enables 
> ad-hoc queries without pre-computation. The format also enables fast scans on 
> columns, which is important for good aggregation performance.
> To support online analysis of utilization reports across various dimensions, 
> display the trends of allocated/used resources for the cluster, and allow 
> arbitrary exploration of data, we propose to add druid storage and implement 
> DruidWriter/DruidReader in YARN Timeline Service.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2836) RM behaviour on token renewal failures is broken

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-2836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-2836:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> RM behaviour on token renewal failures is broken
> 
>
> Key: YARN-2836
> URL: https://issues.apache.org/jira/browse/YARN-2836
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
>Priority: Major
>
> Found this while reviewing YARN-2834.
> We now completely ignore token renewal failures. For things like Timeline 
> tokens, which are automatically obtained whether the app needs them or not (we 
> should fix this to be user driven), we can ignore failures. But for HDFS 
> tokens etc., ignoring failures is bad because it (1) wastes resources, as AMs 
> will continue running and eventually fail, and (2) leaves the app not knowing 
> what happened when it eventually fails.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9415) Document FS placement rule changes from YARN-8967

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-9415:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Document FS placement rule changes from YARN-8967
> -
>
> Key: YARN-9415
> URL: https://issues.apache.org/jira/browse/YARN-9415
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: documentation, fairscheduler
>Affects Versions: 3.3.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
>
> With the changes introduced by YARN-8967, we now allow parent rules on all 
> existing rules. This should be documented; for reference, the pre-existing 
> nested-rule syntax is sketched below.
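> A fair-scheduler.xml fragment in the pre-YARN-8967 style, where only 
> nestedUserQueue could carry a nested parent rule; YARN-8967 generalizes such 
> nesting to the other rules, which is exactly what needs documenting:
> {code:xml}
> <queuePlacementPolicy>
>   <rule name="specified"/>
>   <!-- Before YARN-8967 only nestedUserQueue accepted a nested parent rule. -->
>   <rule name="nestedUserQueue">
>     <rule name="primaryGroup" create="true"/>
>   </rule>
>   <rule name="default"/>
> </queuePlacementPolicy>
> {code}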



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4758) Enable discovery of AMs by containers

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-4758:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Enable discovery of AMs by containers
> -
>
> Key: YARN-4758
> URL: https://issues.apache.org/jira/browse/YARN-4758
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Junping Du
>Priority: Major
> Attachments: YARN-4758. AM Discovery Service for YARN Container.pdf
>
>
> {color:red}
> This is already discussed on the umbrella JIRA YARN-1489.
> Copying some of my condensed summary from the design doc (section 3.2.10.3) 
> of YARN-4692.
> {color}
> Even after the existing work on work-preserving AM restart (Section 3.1.2 / 
> YARN-1489), we still haven't solved the problem of old running containers not 
> knowing where the new AM starts running after the previous AM crashes. This 
> is an especially important problem to solve for long-running services, where 
> we'd like to avoid killing service containers when AMs fail over. So far, we 
> left this as a task for the apps, but solving it in YARN is much more 
> desirable. (Task) This looks very much like service-registry (YARN-913), but 
> for app containers to discover their own AMs.
> Combining this requirement (of any container being able to find its AM 
> across failovers) with those of services (to be able to find through DNS 
> where a service container is running - YARN-4757) will push our registry 
> scalability needs much higher than those of just service end-points. This 
> calls for a more distributed solution for registry readers - something that 
> is discussed in the comments section of YARN-1489 and MAPREDUCE-6608.
> See comment 
> https://issues.apache.org/jira/browse/YARN-1489?focusedCommentId=13862359&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13862359



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9741) [JDK11] TestAHSWebServices.testAbout fails

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-9741:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> [JDK11] TestAHSWebServices.testAbout fails
> --
>
> Key: YARN-9741
> URL: https://issues.apache.org/jira/browse/YARN-9741
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineservice
>Affects Versions: 3.2.0
>Reporter: Adam Antal
>Priority: Major
>
> On openjdk-11.0.2 TestAHSWebServices.testAbout[0] fails consistently with the 
> following stack trace:
> {noformat}
> [ERROR] Tests run: 40, Failures: 6, Errors: 0, Skipped: 0, Time elapsed: 7.9 
> s <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices
> [ERROR] 
> testAbout[0](org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices)
>   Time elapsed: 0.241 s  <<< FAILURE!
> org.junit.ComparisonFailure: expected: but 
> was:
>   at org.junit.Assert.assertEquals(Assert.java:115)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices.testAbout(TestAHSWebServices.java:333)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runners.Suite.runChild(Suite.java:128)
>   at org.junit.runners.Suite.runChild(Suite.java:27)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5902) yarn.scheduler.increment-allocation-mb and yarn.scheduler.increment-allocation-vcores are undocumented in yarn-default.xml

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-5902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-5902:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> yarn.scheduler.increment-allocation-mb and 
> yarn.scheduler.increment-allocation-vcores are undocumented in 
> yarn-default.xml
> --
>
> Key: YARN-5902
> URL: https://issues.apache.org/jira/browse/YARN-5902
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>Priority: Major
> Attachments: YARN-5902.001.patch
>
>
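> For reference, a sketch of the entries that could be added to 
> yarn-default.xml; the default values shown (1024 MB, 1 vcore) are assumptions 
> based on FairSchedulerConfiguration, not copied from the patch:
> {code:xml}
> <property>
>   <name>yarn.scheduler.increment-allocation-mb</name>
>   <value>1024</value>
>   <description>FairScheduler only: memory requests are rounded up to the
>   nearest multiple of this value.</description>
> </property>
> <property>
>   <name>yarn.scheduler.increment-allocation-vcores</name>
>   <value>1</value>
>   <description>FairScheduler only: vcore requests are rounded up to the
>   nearest multiple of this value.</description>
> </property>
> {code}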




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4804) [Umbrella] Improve test run duration

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-4804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-4804:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> [Umbrella] Improve test run duration
> 
>
> Key: YARN-4804
> URL: https://issues.apache.org/jira/browse/YARN-4804
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Priority: Major
>
> Our tests take a long time to run, e.g. the RM tests take 67 minutes. Given 
> that our precommit builds run our tests against two Java versions, this issue 
> is exacerbated.
> Filing this umbrella JIRA to address this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6723) NM overallocation based on over-time rather than snapshot utilization

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-6723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-6723:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> NM overallocation based on over-time rather than snapshot utilization
> -
>
> Key: YARN-6723
> URL: https://issues.apache.org/jira/browse/YARN-6723
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Major
>
> To continue the discussion of Miklos's idea in YARN-6670:
> "Usually the CPU usage fluctuates quite a bit. Do not we need a time period 
> for NM_OVERALLOCATION_GENERAL_THRESHOLD, etc. to avoid allocating on small 
> glitches, even worse preempting in those cases?"
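> One way to smooth over small glitches, sketched below: base overallocation 
> decisions on an exponential moving average of utilization rather than the 
> latest snapshot. Class and method names are illustrative assumptions, not 
> from a patch:
> {code:java}
> // Illustrative sketch: overallocate only when a smoothed utilization
> // signal, not an instantaneous snapshot, is below the threshold.
> public class SmoothedUtilization {
>   private final double alpha;  // smoothing factor, e.g. 0.2
>   private double average;
>
>   public SmoothedUtilization(double alpha) {
>     this.alpha = alpha;
>   }
>
>   /** Fold the latest utilization snapshot into the moving average. */
>   public synchronized void record(double snapshot) {
>     average = alpha * snapshot + (1 - alpha) * average;
>   }
>
>   /** Allow overallocation only if the smoothed value is under threshold. */
>   public synchronized boolean underThreshold(double threshold) {
>     return average < threshold;
>   }
> }
> {code}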



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9852) Allow multiple MiniYarnCluster to be used

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-9852:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Allow multiple MiniYarnCluster to be used
> -
>
> Key: YARN-9852
> URL: https://issues.apache.org/jira/browse/YARN-9852
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: test
>Affects Versions: 3.2.1
>Reporter: Adam Antal
>Priority: Major
>
> While implementing new HBase replication tests we observed that there are 
> problems in the communication between multiple MiniYarnClusters in one test 
> suite. I haven't seen any testcase in the Hadoop repository that uses 
> multiple clusters in one test, but it seems like a logical request to allow 
> this.
> If this jira does not involve any code change (it's mainly a configuration 
> issue), then I suggest adding a testcase that demonstrates such a suitable 
> configuration; a rough sketch follows below.
> Thanks for the consultation to [~bszabolcs] about this issue.
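> A minimal sketch of what such a testcase might look like, assuming two 
> clusters only need distinct names and non-conflicting ports to coexist 
> (untested illustration, not a verified configuration):
> {code:java}
> import org.apache.hadoop.yarn.conf.YarnConfiguration;
> import org.apache.hadoop.yarn.server.MiniYARNCluster;
> import org.junit.Test;
>
> public class TestTwoMiniYarnClusters {
>   @Test
>   public void testTwoClusters() throws Exception {
>     // Assumption: each cluster gets its own config with non-conflicting
>     // ports; settling those details is part of this jira.
>     YarnConfiguration conf1 = new YarnConfiguration();
>     YarnConfiguration conf2 = new YarnConfiguration();
>     try (MiniYARNCluster c1 = new MiniYARNCluster("cluster-1", 1, 1, 1);
>          MiniYARNCluster c2 = new MiniYARNCluster("cluster-2", 1, 1, 1)) {
>       c1.init(conf1);
>       c1.start();
>       c2.init(conf2);
>       c2.start();
>       // ... submit apps to both clusters and assert they do not interfere ...
>     }
>   }
> }
> {code}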



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8074) Support placement policy composite constraints in YARN Service

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-8074:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Support placement policy composite constraints in YARN Service
> --
>
> Key: YARN-8074
> URL: https://issues.apache.org/jira/browse/YARN-8074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Gour Saha
>Assignee: Gour Saha
>Priority: Major
>
> This is a follow-up to YARN-7142, where we support more advanced placement 
> policy features, like creating composite constraints, by exposing expressions 
> in the YARN Service specification.
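> For illustration, a component-level placement_policy fragment in the 
> YARN-7142 style is below; composite expressions (e.g., AND/OR combinations of 
> such constraints) are what this jira would expose, and their exact syntax is 
> still an open design question:
> {code}
> "placement_policy": {
>   "constraints": [
>     {
>       "type": "ANTI_AFFINITY",
>       "scope": "NODE",
>       "target_tags": ["hbase-regionserver"]
>     }
>   ]
> }
> {code}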



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6653) Retrieve CPU and MEMORY metrics for applications in a flow run

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-6653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-6653:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Retrieve CPU and MEMORY metrics for applications in a flow run
> --
>
> Key: YARN-6653
> URL: https://issues.apache.org/jira/browse/YARN-6653
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha2
>Reporter: Haibo Chen
>Assignee: Akhil PB
>Priority: Major
>
> Similar to YARN-6651, 
> 'metricstoretrieve=YARN_APPLICATION_CPU,YARN_APPLICATION_MEMORY' can be added 
> to the web UI query fired when a user lists all applications in a flow run. 
> CPU and MEMORY can be retrieved this way.
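> For illustration, the resulting query could look something like this (path 
> per the ATSv2 TimelineReader REST API; host, user and ids are placeholders):
> {noformat}
> GET http://<timeline-reader-host>:8188/ws/v2/timeline/users/<user>/flows/<flow-name>/runs/<run-id>/apps?metricstoretrieve=YARN_APPLICATION_CPU,YARN_APPLICATION_MEMORY
> {noformat}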



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7882) Server side proxy for UI2 log viewer

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-7882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-7882:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Server side proxy for UI2 log viewer
> 
>
> Key: YARN-7882
> URL: https://issues.apache.org/jira/browse/YARN-7882
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: security, timelineserver, yarn-ui-v2
>Affects Versions: 3.0.0
>Reporter: Eric Yang
>Priority: Major
>
> When viewing container logs in UI2, the log files are fetched directly 
> through timeline server 2.  Hadoop in simple security mode does not have an 
> authenticator to make sure the user is authorized to view the log.  The 
> general practice is to use knox or another security proxy to authenticate the 
> user and reverse-proxy the request to the Hadoop UI, to ensure the 
> information does not leak through an anonymous user.  The current 
> implementation of the UI2 log viewer makes ajax calls to timeline server 2.  
> This could prevent knox or reverse proxy software from working properly with 
> the new design.  It would be good to proxy on the server side to prevent the 
> browser from side-stepping the authentication check.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6606) The implementation of LocalizationStatus in ContainerStatusProto

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-6606:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> The implementation of LocalizationStatus in ContainerStatusProto
> 
>
> Key: YARN-6606
> URL: https://issues.apache.org/jira/browse/YARN-6606
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: nodemanager
>Affects Versions: 2.9.0
>Reporter: Bingxue Qiu
>Priority: Major
> Attachments: YARN-6606.1.patch, YARN-6606.2.patch
>
>
> We have a use case where the full implementation of localization status in 
> ContainerStatusProto 
> [Continuous-resource-localization|https://issues.apache.org/jira/secure/attachment/12825041/Continuous-resource-localization.pdf]
>  needs to be done, so we have implemented it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7134) AppSchedulingInfo has a dependency on capacity scheduler

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-7134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-7134:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> AppSchedulingInfo has a dependency on capacity scheduler
> 
>
> Key: YARN-7134
> URL: https://issues.apache.org/jira/browse/YARN-7134
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 3.0.0-alpha4
>Reporter: Daniel Templeton
>Assignee: Sunil G
>Priority: Major
>
> The common scheduling code should be independent of all scheduler 
> implementations.  YARN-6040 introduced capacity scheduler's 
> {{SchedulingMode}} into {{AppSchedulingInfo}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5205) yarn logs for live applications does not provide log files which may have already been aggregated

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-5205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-5205:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> yarn logs for live applications does not provide log files which may have 
> already been aggregated
> -
>
> Key: YARN-5205
> URL: https://issues.apache.org/jira/browse/YARN-5205
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0
>Reporter: Siddharth Seth
>Priority: Major
>
> With periodic aggregation enabled, logs which have been partially 
> aggregated are not always displayed by the yarn logs command (see below).
> If the file exists in the log dir for a container, all previously aggregated 
> files with the same name, along with the current file, will be part of the 
> yarn logs output.
> Files which have been previously aggregated, but for which a file with the 
> same name does not exist in the container log dir, do not show up in the 
> output.
> After the app completes, all logs are available.
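> For reference, the command in question (the application id is a placeholder):
> {noformat}
> yarn logs -applicationId application_1465000000000_0001
> {noformat}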
> cc [~xgong]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9936) Support vector of capacity percentages in Capacity Scheduler configuration

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-9936:
---
Target Version/s: 3.4.0  (was: 3.3.0, 3.2.2)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Support vector of capacity percentages in Capacity Scheduler configuration
> --
>
> Key: YARN-9936
> URL: https://issues.apache.org/jira/browse/YARN-9936
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Zoltan Siegl
>Assignee: Zoltan Siegl
>Priority: Major
> Attachments: Capacity Scheduler support of “vector of resources 
> percentage”.pdf
>
>
> Currently, the Capacity Scheduler queue configuration supports two ways to 
> set queue capacity.
>  * As a percentage of all available resources, as a float (e.g. 25.0), which 
> means 25% of the resources of its parent queue for all resource types equally 
> (e.g. 25% of all memory, 25% of all CPU cores, and 25% of all available GPUs 
> in the cluster). The percentages of all queues have to add up to 100%.
>  * As an absolute amount of resources (e.g. 
> memory=4GB,vcores=20,yarn.io/gpu=4). The sum of all resources in the queues 
> has to be less than or equal to all resources in the cluster.
> Apart from these two existing ways, there is a demand to set the capacity 
> percentage of each available resource type separately (e.g. 
> {{memory=20%,vcores=40%,yarn.io/gpu=100%}}); a hypothetical configuration 
> sketch follows below.
>  At the same time, a similar concept should be included for queues' 
> maximum-capacity as well.
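> A sketch of what the proposed configuration might look like in 
> capacity-scheduler.xml; the bracketed syntax below is a hypothetical 
> illustration borrowed from the absolute-resource format, not a settled design:
> {code:xml}
> <!-- Hypothetical syntax; the exact format is what this jira should decide. -->
> <property>
>   <name>yarn.scheduler.capacity.root.queueA.capacity</name>
>   <value>[memory=20%,vcores=40%,yarn.io/gpu=100%]</value>
> </property>
> {code}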



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7884) Race condition in registering YARN service in ZooKeeper

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-7884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-7884:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Race condition in registering YARN service in ZooKeeper
> ---
>
> Key: YARN-7884
> URL: https://issues.apache.org/jira/browse/YARN-7884
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Eric Yang
>Priority: Major
>
> In a Kerberos-enabled cluster, there seems to be a race condition when 
> registering a YARN service.
> The yarn-service znode creation seems to happen after the AM has started and 
> is reporting back to update component information.  The YARN service should 
> have access to create the znode, but for some reason NoAuth was reported.
> {code}
> 2018-02-02 22:53:30,442 [main] INFO  service.ServiceScheduler - Set registry 
> user accounts: sasl:hbase
> 2018-02-02 22:53:30,471 [main] INFO  zk.RegistrySecurity - Registry default 
> system acls: 
> [1,s{'world,'anyone}
> , 31,s{'sasl,'yarn}
> , 31,s{'sasl,'jhs}
> , 31,s{'sasl,'hdfs-demo}
> , 31,s{'sasl,'rm}
> , 31,s{'sasl,'hive}
> ]
> 2018-02-02 22:53:30,472 [main] INFO  zk.RegistrySecurity - Registry User ACLs 
> [31,s{'sasl,'hbase}
> , 31,s{'sasl,'hbase}
> ]
> 2018-02-02 22:53:30,503 [main] INFO  event.AsyncDispatcher - Registering 
> class org.apache.hadoop.yarn.service.component.ComponentEventType for class 
> org.apache.hadoop.yarn.service.ServiceScheduler$ComponentEventHandler
> 2018-02-02 22:53:30,504 [main] INFO  event.AsyncDispatcher - Registering 
> class 
> org.apache.hadoop.yarn.service.component.instance.ComponentInstanceEventType 
> for class 
> org.apache.hadoop.yarn.service.ServiceScheduler$ComponentInstanceEventHandler
> 2018-02-02 22:53:30,528 [main] INFO  impl.NMClientAsyncImpl - Upper bound of 
> the thread pool size is 500
> 2018-02-02 22:53:30,531 [main] INFO  service.ServiceMaster - Starting service 
> as user hbase/eyang-5.openstacklo...@example.com (auth:KERBEROS)
> 2018-02-02 22:53:30,545 [main] INFO  ipc.CallQueueManager - Using callQueue: 
> class java.util.concurrent.LinkedBlockingQueue queueCapacity: 100 scheduler: 
> class org.apache.hadoop.ipc.DefaultRpcScheduler
> 2018-02-02 22:53:30,554 [Socket Reader #1 for port 56859] INFO  ipc.Server - 
> Starting Socket Reader #1 for port 56859
> 2018-02-02 22:53:30,589 [main] INFO  pb.RpcServerFactoryPBImpl - Adding 
> protocol org.apache.hadoop.yarn.service.impl.pb.service.ClientAMProtocolPB to 
> the server
> 2018-02-02 22:53:30,606 [IPC Server Responder] INFO  ipc.Server - IPC Server 
> Responder: starting
> 2018-02-02 22:53:30,607 [IPC Server listener on 56859] INFO  ipc.Server - IPC 
> Server listener on 56859: starting
> 2018-02-02 22:53:30,607 [main] INFO  service.ClientAMService - Instantiated 
> ClientAMService at eyang-5.openstacklocal/172.26.111.20:56859
> 2018-02-02 22:53:30,609 [main] INFO  zk.CuratorService - Creating 
> CuratorService with connection fixed ZK quorum "eyang-1.openstacklocal:2181" 
> 2018-02-02 22:53:30,615 [main] INFO  zk.RegistrySecurity - Enabling ZK sasl 
> client: jaasClientEntry = Client, principal = 
> hbase/eyang-5.openstacklo...@example.com, keytab = 
> /etc/security/keytabs/hbase.service.keytab
> 2018-02-02 22:53:30,752 [main] INFO  client.RMProxy - Connecting to 
> ResourceManager at eyang-1.openstacklocal/172.26.111.17:8032
> 2018-02-02 22:53:30,909 [main] INFO  service.ServiceScheduler - Registering 
> appattempt_1517611904996_0001_01, abc into registry
> 2018-02-02 22:53:30,911 [main] INFO  service.ServiceScheduler - Received 0 
> containers from previous attempt.
> 2018-02-02 22:53:31,072 [main] INFO  service.ServiceScheduler - Could not 
> read component paths: `/users/hbase/services/yarn-service/abc/components': No 
> such file or directory: KeeperErrorCode = NoNode for 
> /registry/users/hbase/services/yarn-service/abc/components
> 2018-02-02 22:53:31,074 [main] INFO  service.ServiceScheduler - Triggering 
> initial evaluation of component sleeper
> 2018-02-02 22:53:31,075 [main] INFO  component.Component - [INIT COMPONENT 
> sleeper]: 2 instances.
> 2018-02-02 22:53:31,094 [main] INFO  component.Component - [COMPONENT 
> sleeper] Transitioned from INIT to FLEXING on FLEX event.
> 2018-02-02 22:53:31,215 [pool-5-thread-1] ERROR service.ServiceScheduler - 
> Failed to register app abc in registry
> org.apache.hadoop.registry.client.exceptions.NoPathPermissionsException: 
> `/registry/users/hbase/services/yarn-service/abc': Not authorized to access 
> path; ACLs: [
> 0x01: 'world,'anyone
>  0x1f: 'sasl,'yarn
>  0x1f: 'sasl,'jhs
>  0x1f: 'sasl,'hdfs-demo
>  0x1f: 'sasl,'rm
>  0x1f: 'sasl,'hive
>  0x1f: 

[jira] [Updated] (YARN-7342) Application page doesn't show correct metrics for reservation runs

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-7342:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Application page doesn't show correct metrics for reservation runs 
> ---
>
> Key: YARN-7342
> URL: https://issues.apache.org/jira/browse/YARN-7342
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, reservation system
>Affects Versions: 3.1.0
>Reporter: Yufei Gu
>Priority: Major
> Attachments: Screen Shot 2017-10-16 at 17.27.48.png
>
>
> As the screen shot shows, there are some bugs on the web UI while running a 
> job with a reservation. For example, the queue name should just be 
> "root.queueA" instead of the internal queue name. All metrics (Allocated CPU, 
> % of queue, etc.) are missing for reservation runs. These shouldn't be a 
> blocker, though.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6963) Prevent other containers from starting when a container is re-initializing

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-6963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-6963:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Prevent other containers from starting when a container is re-initializing
> -
>
> Key: YARN-6963
> URL: https://issues.apache.org/jira/browse/YARN-6963
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>Priority: Major
>
> Further to discussions in YARN-6920.
> Container re-initialization leads to momentarily relinquishing NM resources 
> when the container is brought down, followed by re-claiming the same 
> resources when it is re-launched. If there are opportunistic containers in 
> the queue, this can lead to unnecessary churn if one of those opportunistic 
> containers is started and immediately killed.
> This JIRA tracks the changes required to prevent the above by ensuring the 
> resources for a container are 'locked' for the duration of the container 
> lifetime - including the time it takes for a re-initialization.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7912) While launching Native Service app from UI, consider service owner name from user.name query parameter

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-7912:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> While launching Native Service app from UI, consider service owner name from 
> user.name query parameter
> --
>
> Key: YARN-7912
> URL: https://issues.apache.org/jira/browse/YARN-7912
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Sunil G
>Priority: Major
>
> As per comments from [~eyang] in YARN-7827, 
> "For supporting knox, it would be good for javascript to detect the url 
> entering /ui2 and process the user.name property.  If there 
> isn't one found, then proceed with an ajax call to the resource manager to 
> find out who the current user is, to pass the parameter along the rest api 
> calls."
> This Jira will track handling this. It is now pending a feasibility check.
> Thanks [~eyang] and [~jianhe]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7086) Release all containers asynchronously

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-7086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-7086:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Release all containers asynchronously
> 
>
> Key: YARN-7086
> URL: https://issues.apache.org/jira/browse/YARN-7086
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Arun Suresh
>Assignee: Manikandan R
>Priority: Major
> Attachments: YARN-7086.001.patch, YARN-7086.002.patch, 
> YARN-7086.Perf-test-case.patch
>
>
> We have noticed in production two situations that can cause deadlocks and 
> bring scheduling of new containers to a halt, especially with regard 
> to applications that have a lot of live containers:
> # When these applications release their containers in bulk.
> # When these applications terminate abruptly due to some failure, and the 
> scheduler releases all their live containers in a loop.
> To handle the issues mentioned above, we have a patch in production to make 
> sure ALL container releases happen asynchronously - and it has served us well.
> Opening this JIRA to gather feedback on whether this is a good idea generally 
> (cc [~leftnoteasy], [~jlowe], [~curino], [~kasha], [~subru], [~roniburd]).
> BTW, in YARN-6251 we already have an asyncReleaseContainer() in 
> AbstractYarnScheduler and a corresponding scheduler event, which is currently 
> used specifically for the container-update code paths (where the scheduler 
> releases temp containers which it creates for the update); a sketch of 
> reusing it for bulk releases follows below.
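> A minimal sketch of the idea, reusing the asyncReleaseContainer() hook 
> mentioned above; the surrounding names and the exact parameter type are 
> assumptions, not taken from the attached patches:
> {code:java}
> import java.util.List;
> import org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainer;
> import org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler;
>
> // Illustrative sketch: route bulk releases through the async path so a
> // failing app's live containers are released off the scheduler's critical
> // path instead of inline in a loop.
> class BulkRelease {
>   static void releaseAll(AbstractYarnScheduler<?, ?> scheduler,
>       List<RMContainer> liveContainers) {
>     for (RMContainer c : liveContainers) {
>       scheduler.asyncReleaseContainer(c);  // enqueue; don't release inline
>     }
>   }
> }
> {code}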



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7263) Check host name resolution performance when resource manager starts up

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-7263:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Check host name resolution performance when resource manager starts up
> --
>
> Key: YARN-7263
> URL: https://issues.apache.org/jira/browse/YARN-7263
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.1.0
>Reporter: Yufei Gu
>Priority: Major
>
> According to YARN-7207, host name resolution can be slow in some 
> environments, which affects RM performance in different ways. It would be 
> nice to check this when the RM starts up and place a warning message in the 
> logs if the performance is not ideal.
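> A minimal sketch of such a startup check; the threshold and log wording are 
> assumptions:
> {code:java}
> import java.net.InetAddress;
> import java.util.concurrent.TimeUnit;
> import org.slf4j.Logger;
>
> // Illustrative sketch: time one local host name resolution at RM startup
> // and warn if it is suspiciously slow. The 1000 ms threshold is arbitrary.
> class HostResolutionCheck {
>   static void warnIfSlow(Logger log) throws Exception {
>     long start = System.nanoTime();
>     InetAddress.getLocalHost().getCanonicalHostName();
>     long millis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
>     if (millis > 1000) {
>       log.warn("Host name resolution took {} ms; check DNS/nsswitch setup",
>           millis);
>     }
>   }
> }
> {code}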



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9652) Convert SchedulerQueueManager from a protocol-only type to a basic hierarchical queue implementation

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-9652:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Convert SchedulerQueueManager from a protocol-only type to a basic 
> hierarchical queue implementation
> 
>
> Key: YARN-9652
> URL: https://issues.apache.org/jira/browse/YARN-9652
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: reservation system, scheduler
>Affects Versions: 3.3.0
>Reporter: Erkin Alp Güney
>Priority: Major
>
> SchedulerQueueManager is currently an interface, i.e. a protocol-only type. 
> As seen in the codebase, each scheduler re-implements the queue configuration 
> and management logic over and over. If we convert it into a concrete base 
> class with a simple implementation of a hierarchical queue system (as in the 
> Fair and Capacity schedulers), pluggable schedulers could be developed more 
> easily.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6466) Provide shaded framework jar for containers

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-6466:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Provide shaded framework jar for containers
> ---
>
> Key: YARN-6466
> URL: https://issues.apache.org/jira/browse/YARN-6466
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: build, yarn
>Affects Versions: 3.0.0-alpha1
>Reporter: Sean Busbey
>Assignee: Haibo Chen
>Priority: Major
>
> We should build on the existing shading work to provide a jar with all of the 
> bits needed within a YARN application's container to talk to the resource 
> manager and node manager.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8733) Readiness check for remote component

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-8733:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Readiness check for remote component
> 
>
> Key: YARN-8733
> URL: https://issues.apache.org/jira/browse/YARN-8733
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: yarn-native-services
>Reporter: Eric Yang
>Assignee: Billie Rinaldi
>Priority: Major
>
> When a service is deploying, there can be remote component dependencies 
> between services.  For example, Hive server 2 can depend on Hive metastore, 
> which depends on a remote MySQL database.  It would be great to have the 
> ability to check the remote server and port to make sure MySQL is available 
> before deploying the Hive LLAP service.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8779) Fix few discrepancies between YARN Service swagger spec and code

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-8779:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Fix few discrepancies between YARN Service swagger spec and code
> 
>
> Key: YARN-8779
> URL: https://issues.apache.org/jira/browse/YARN-8779
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-native-services
>Affects Versions: 3.1.0, 3.1.1
>Reporter: Gour Saha
>Priority: Major
>
> The following issues were identified in the YARN Service swagger definition 
> during an effort to integrate with a running service by generating Java and 
> Go client-side stubs from the spec -
>  
> 1.
> *restartPolicy* is wrong and should be *restart_policy*
>  
> 2.
> A DELETE request to a non-existing service (or a previously existing but 
> deleted service) throws an ApiException instead of something like 
> NotFoundException (the equivalent of 404). Note, DELETE of an existing 
> service behaves fine.
>  
> 3.
> The response code of DELETE request is 200. The spec says 204. Since the 
> response has a payload, the spec should be updated to 200 instead of 204.
>  
> 4.
>  _DefaultApi.java_ client's _appV1ServicesServiceNameGetWithHttpInfo_ method 
> does not return a Service object. Swagger definition has the below bug in GET 
> response of */app/v1/services/\{service_name}* -
> {code:java}
> type: object
> items:
>   $ref: '#/definitions/Service'
> {code}
> It should be -
> {code:java}
> $ref: '#/definitions/Service'
> {code}
>  
> 5.
> Serialization issues were seen in all enum classes - ServiceState.java, 
> ContainerState.java, ComponentState.java, PlacementType.java and 
> PlacementScope.java.
> Java client threw the below exception for ServiceState -
> {code:java}
> Caused by: com.fasterxml.jackson.databind.exc.MismatchedInputException: 
> Cannot construct instance of 
> `org.apache.cb.yarn.service.api.records.ServiceState` (although at least one 
> Creator exists): no String-argument constructor/factory method to deserialize 
> from String value ('ACCEPTED')
>  at [Source: 
> (org.glassfish.jersey.message.internal.ReaderInterceptorExecutor$UnCloseableInputStream);
>  line: 1, column: 121] (through reference chain: 
> org.apache.cb.yarn.service.api.records.Service["state"])
> {code}
> For Golang we saw this for ContainerState -
> {code:java}
> ERRO[2018-08-12T23:32:31.851-07:00] During GET request: json: cannot 
> unmarshal string into Go struct field Container.state of type 
> yarnmodel.ContainerState 
> {code}
>  
> 6.
> *launch_time* actually returns an integer, but the swagger definition says 
> date. Hence, the following exception is seen on the client side -
> {code:java}
> Caused by: com.fasterxml.jackson.databind.exc.MismatchedInputException: 
> Unexpected token (VALUE_NUMBER_INT), expected START_ARRAY: Expected array or 
> string.
>  at [Source: 
> (org.glassfish.jersey.message.internal.ReaderInterceptorExecutor$UnCloseableInputStream);
>  line: 1, column: 477] (through reference chain: 
> org.apache.cb.yarn.service.api.records.Service["components"]->java.util.ArrayList[0]->org.apache.cb.yarn.service.api.records.Component["containers"]->java.util.ArrayList[0]->org.apache.cb.yarn.service.api.records.Container["launch_time"])
> {code}
>  
> 8.
> The *user.name* query param with a valid value is required for all API calls 
> to an insecure cluster. This is not defined in the spec.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9675) Expose log aggregation diagnostic messages through RM API

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-9675:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Expose log aggregation diagnostic messages through RM API
> -
>
> Key: YARN-9675
> URL: https://issues.apache.org/jira/browse/YARN-9675
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: api, log-aggregation, resourcemanager
>Affects Versions: 3.2.0
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
>
> The ResourceManager collects the log aggregation status reports from the 
> NodeManagers. Currently these reports are collected, but when the app info 
> API or a similar high-level REST endpoint is called, only an overall status 
> is displayed (RUNNING, RUNNING_WITH_FAILURES, FAILED, etc.).
> The diagnostic messages are only available through the old RM web UI, so our 
> internal tool currently crawls that page and extracts the log aggregation 
> diagnostic and error messages from the raw HTML. This is not a good practice, 
> and a more elegant API call would be preferable. It may be useful for others 
> as well, since log-aggregation-related failures are usually hard to debug 
> given the lack of trace/debug messages.
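> For context, the high-level call in question is something like the following; 
> today the returned app report carries only the overall log aggregation 
> status, not the per-node diagnostics (host and app id are placeholders):
> {noformat}
> GET http://<rm-host>:8088/ws/v1/cluster/apps/<app-id>
> {noformat}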



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5852) Consolidate CSAssignment, ContainerAllocation, ContainerAllocationContext class in CapacityScheduler

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-5852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-5852:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Consolidate CSAssignment, ContainerAllocation, ContainerAllocationContext 
> class in CapacityScheduler
> 
>
> Key: YARN-5852
> URL: https://issues.apache.org/jira/browse/YARN-5852
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Priority: Major
>
> There are quite a few data structures with similar names that wrap 
> container-related info - CSAssignment, ContainerAllocation, 
> ContainerAllocationContext - and a bunch of code to convert one to another. 
> We should consolidate those into a single one.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5194) Avoid adding yarn-site to all Configuration instances created by the JVM

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-5194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-5194:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Avoid adding yarn-site to all Configuration instances created by the JVM
> 
>
> Key: YARN-5194
> URL: https://issues.apache.org/jira/browse/YARN-5194
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Priority: Major
>
> {code}
> static {
> addDeprecatedKeys();
> Configuration.addDefaultResource(YARN_DEFAULT_CONFIGURATION_FILE);
> Configuration.addDefaultResource(YARN_SITE_CONFIGURATION_FILE);
>   }
> {code}
> This puts the contents of yarn-default and yarn-site into every configuration 
> instance created in the VM after YarnConfiguration has been initialized.
> This should be changed to a local addResource for the specific 
> YarnConfiguration instance (sketched below), instead of polluting every 
> Configuration instance.
> Incompatible change; the target version has been set to 3.x.
> The same applies to HdfsConfiguration (hdfs-site.xml) and Configuration 
> (core-site.xml etc.).
> core-site may be worth including everywhere; however, it would be better to 
> expect users to explicitly add the relevant resources.
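> A sketch of the suggested direction, using the per-instance 
> Configuration.addResource() API instead of the static default (how existing 
> callers that rely on the static behavior would migrate is the open question):
> {code:java}
> // Sketch: load yarn-default/yarn-site into this instance only, instead of
> // Configuration.addDefaultResource(), which affects every Configuration
> // created in the JVM afterwards.
> public YarnConfiguration() {
>   super();
>   addResource(YARN_DEFAULT_CONFIGURATION_FILE);
>   addResource(YARN_SITE_CONFIGURATION_FILE);
> }
> {code}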



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8161) ServiceState FLEX should be removed

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-8161:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> ServiceState FLEX should be removed
> ---
>
> Key: YARN-8161
> URL: https://issues.apache.org/jira/browse/YARN-8161
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Gour Saha
>Priority: Major
>
> ServiceState FLEX is not required to trigger flex up/down of containers and 
> should be removed



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6651) Flow Activity should specify 'metricstoretrieve' in its query to ATSv2 to retrieve CPU and memory

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-6651:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Flow Activity should specify 'metricstoretrieve' in its query to ATSv2 to 
> retrieve CPU and memory 
> --
>
> Key: YARN-6651
> URL: https://issues.apache.org/jira/browse/YARN-6651
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha2
>Reporter: Haibo Chen
>Assignee: Akhil PB
>Priority: Major
>
> When you click on Flow Activity => \{a flow\} => flow runs, the web server 
> sends a REST query to the ATSv2 TimelineReaderServer, but it does not include 
> the query param 'metricstoretrieve' to get any metrics back.
> Instead, we should add 
> '?metricstoretrieve=YARN_APPLICATION_CPU,YARN_APPLICATION_MEMORY' to the 
> query to get CPU and MEMORY back.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6690) Consolidate NM overallocation thresholds with ResourceTypes

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-6690:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Consolidate  NM overallocation thresholds with ResourceTypes
> 
>
> Key: YARN-6690
> URL: https://issues.apache.org/jira/browse/YARN-6690
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Major
>
> YARN-3926 (ResourceTypes) introduces a new class, ResourceInformation, to 
> encapsulate all information about a given resource type (e.g. type, value, 
> unit). We could add the overallocation thresholds to it as well.
> Another thing to look at, as suggested by Wangda in YARN-4511, is whether we 
> could just use ResourceThresholds to replace OverallocationInfo.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7182) YARN's StateMachine should be stable

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-7182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-7182:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> YARN's StateMachine should be stable
> 
>
> Key: YARN-7182
> URL: https://issues.apache.org/jira/browse/YARN-7182
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.0.0-alpha4
>Reporter: Daniel Templeton
>Priority: Major
>
> It's currently {{Evolving}}, which is clearly no longer true.
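A hedged sketch of the one-line nature of the proposed change, using Hadoop's 
standard classification annotations (the interface body is abbreviated; only 
the annotation swap is the point):

{code:java}
import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.classification.InterfaceStability;

@InterfaceAudience.Public
@InterfaceStability.Stable   // was: @InterfaceStability.Evolving
public interface StateMachine<STATE extends Enum<STATE>,
    EVENTTYPE extends Enum<EVENTTYPE>, EVENT> {
  // ... existing state machine API unchanged ...
}
{code}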



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10059) Final states of failed-to-localize containers are not recorded in NM state store

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-10059:

Target Version/s: 3.4.0  (was: 3.3.0, 3.2.2, 3.1.4, 2.10.1)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Final states of failed-to-localize containers are not recorded in NM state 
> store
> 
>
> Key: YARN-10059
> URL: https://issues.apache.org/jira/browse/YARN-10059
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-10059.001.patch
>
>
> We recently found an issue where many localizers of completed containers were 
> launched after an NM restart and exhausted the memory/CPU of that machine. 
> These containers had all failed and completed while localizing on a 
> non-existent local directory (caused by another problem), but their final 
> states weren't recorded in the NM state store.
>  The process flow of a failed-to-localize container is as follows:
> {noformat}
> ResourceLocalizationService$LocalizerRunner#run
> -> ContainerImpl$ResourceFailedTransition#transition handle LOCALIZING -> 
> LOCALIZATION_FAILED upon RESOURCE_FAILED
>   dispatch LocalizationEventType.CLEANUP_CONTAINER_RESOURCES
>   -> ResourceLocalizationService#handleCleanupContainerResources  handle 
> CLEANUP_CONTAINER_RESOURCES
>   dispatch ContainerEventType.CONTAINER_RESOURCES_CLEANEDUP
>   -> ContainerImpl$LocalizationFailedToDoneTransition#transition  
> handle LOCALIZATION_FAILED -> DONE upon CONTAINER_RESOURCES_CLEANEDUP
> {noformat}
> There's currently no state-store update in this flow, which is required to 
> avoid unnecessary localizations after NM restarts.
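A minimal sketch of the missing step, assuming the DONE transition can reach 
the NM state store through the container's context (the method and field names 
below are assumptions for illustration, not the committed fix):

{code:java}
// Inside ContainerImpl$LocalizationFailedToDoneTransition#transition (sketch):
try {
  // Persist the container's final state so that, after an NM restart,
  // recovery sees a completed container and does not launch a new localizer.
  container.context.getNMStateStore().storeContainerCompleted(
      container.getContainerId(), container.getExitCode());
} catch (java.io.IOException e) {
  LOG.error("Unable to record final state of " + container.getContainerId(), e);
}
{code}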



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6429) Revisit implementation of LocalitySchedulingPlacementSet to avoid invoke methods of AppSchedulingInfo

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-6429:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Revisit implementation of LocalitySchedulingPlacementSet to avoid invoke 
> methods of AppSchedulingInfo
> -
>
> Key: YARN-6429
> URL: https://issues.apache.org/jira/browse/YARN-6429
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Major
>
> An example is LocalitySchedulingPlacementSet#decrementOutstanding: it calls 
> appSchedulingInfo directly, which could potentially cause trouble since it 
> tries to modify the parent from the child. Is it possible to move this logic 
> to AppSchedulingInfo#allocate?
> We need to check other methods as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8583) Inconsistency in YARN status command

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-8583:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Inconsistency in YARN status command
> 
>
> Key: YARN-8583
> URL: https://issues.apache.org/jira/browse/YARN-8583
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Eric Yang
>Priority: Major
>
> The YARN app -status command can report based on application ID or 
> application name, with some usability limitations.  An application ID is 
> globally unique, and it allows any user to query the status of any 
> application.  An application name is not globally unique, and it only works 
> for querying the user's own applications.  This is somewhat restrictive for 
> an application administrator, but allowing one user to query any other 
> user's application could be considered a security hole as well.  There are 
> two possible options to reduce the inconsistency:
> Option 1.  Block other users from querying application status.  This may 
> improve security in some sense, but it is an incompatible change.  It is the 
> simpler change: match the owner of the application and decide whether or not 
> to report.
> Option 2.  Add a --user parameter to allow an administrator to query an 
> application name run by another user.  This is a bigger change because 
> application metadata is stored in the user's own hdfs directory.  There are 
> security restrictions that need to be defined.
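To make Option 2 concrete, a hedged sketch of the proposed CLI usage (the 
--user flag is hypothetical and does not exist today; only `yarn app -status` 
itself is real):

{noformat}
# Works today: ID lookups are global, name lookups are restricted to the caller
yarn app -status application_1586455000000_0001
yarn app -status my-service

# Proposed (hypothetical flag): admin scopes a name lookup to another user
yarn app -status my-service --user alice
{noformat}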



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4713) Warning by unchecked conversion in TestTimelineWebServices

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-4713:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Warning by unchecked conversion in TestTimelineWebServices 
> ---
>
> Key: YARN-4713
> URL: https://issues.apache.org/jira/browse/YARN-4713
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: test
>Reporter: Tsuyoshi Ozawa
>Priority: Major
>  Labels: newbie
> Attachments: YARN-4713.1.patch, YARN-4713.2.patch
>
>
> [WARNING] 
> /testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/webapp/TestTimelineWebServices.java:[123,38]
>  [unchecked] unchecked conversion
> {code}
>   Enumeration names = mock(Enumeration.class);
> {code}
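A common way to address this kind of warning (a sketch; the committed patch 
may differ, and it assumes Mockito's mock() is statically imported in the 
test) is to suppress it at the declaration or avoid the raw-typed mock 
entirely:

{code:java}
// Option 1: acknowledge the inherently unchecked mock creation
@SuppressWarnings("unchecked")
Enumeration<String> names = mock(Enumeration.class);

// Option 2: use a real, typed enumeration instead of a mock
// (the attribute names here are placeholders)
Enumeration<String> names2 = java.util.Collections.enumeration(
    java.util.Arrays.asList("attribute1", "attribute2"));
{code}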



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6652) Merge flow info and flow runs

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-6652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-6652:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Merge flow info and flow runs
> -
>
> Key: YARN-6652
> URL: https://issues.apache.org/jira/browse/YARN-6652
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-ui-v2
>Affects Versions: 3.0.0-alpha2
>Reporter: Haibo Chen
>Assignee: Akhil PB
>Priority: Major
>
> If a user clicks on a flow from the flow activity page, Flow Run and Flow 
> Info are shown separately. Usually, users want to go to individual flow runs. 
> With the current workflow, the user needs to click on Flow Run because 
> Flow Info is selected by default.
> Given that Flow Info does not have much information, it'd be a nice 
> improvement if we could show flow info and flow runs together, that is, one 
> section at the top containing the flow info and another section at the 
> bottom containing the flow runs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6527) Provide a better out-of-the-box experience for SLS

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-6527:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Provide a better out-of-the-box experience for SLS
> --
>
> Key: YARN-6527
> URL: https://issues.apache.org/jira/browse/YARN-6527
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler-load-simulator
>Affects Versions: 3.0.0-alpha4
>Reporter: Robert Kanter
>Priority: Major
>
> The example provided with SLS appears to be broken - I didn't see any jobs 
> running.  On top of that, it seems like getting SLS to run properly requires 
> a lot of hadoop site configs, scheduler configs, etc.  I was only able to get 
> something running after [~yufeigu] provided a lot of config files.
> We should provide a better out-of-the-box experience for SLS.  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6088) RM UI has to redirect to AHS for completed applications logs

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-6088:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> RM UI has to redirect to AHS for completed applications logs
> 
>
> Key: YARN-6088
> URL: https://issues.apache.org/jira/browse/YARN-6088
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: webapp
>Affects Versions: 2.7.3
>Reporter: Sunil G
>Priority: Major
>
> Currently, the AM container logs link in RMAppBlock is hardcoded to the 
> container's host node. If that node is unavailable, we will not have enough 
> information.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10065) Support Placement Constraints for AM container allocations

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-10065:

Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Support Placement Constraints for AM container allocations
> --
>
> Key: YARN-10065
> URL: https://issues.apache.org/jira/browse/YARN-10065
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.2.0
>Reporter: Daniel Velasquez
>Priority: Major
>
> Currently ApplicationSubmissionContext API supports specifying a node label 
> expression for the AM resource request. It would be beneficial to have the 
> ability to specify Placement Constraints as well for the AM resource request. 
> We have a requirement to constrain AM containers on certain nodes e.g. AM 
> containers not on preemptible/spot cloud instances. It looks like node 
> attributes would fit our use case well. However, we currently don't have the 
> ability to specify this in the API for AM resource requests.
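To illustrate the gap (a sketch; the commented-out setter is a hypothetical 
API that this issue proposes, not something that exists):

{code:java}
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.util.Records;

ApplicationSubmissionContext ctx =
    Records.newRecord(ApplicationSubmissionContext.class);
// Supported today: steer the AM via a node label expression.
ctx.setNodeLabelExpression("on_demand");
// Proposed here (hypothetical setter): constrain the AM with a placement
// constraint, e.g. "not on preemptible/spot instances".
// ctx.setAMPlacementConstraint(notOnSpotInstancesConstraint);
{code}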



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7867) Enable YARN service by default

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-7867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-7867:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Enable YARN service by default
> --
>
> Key: YARN-7867
> URL: https://issues.apache.org/jira/browse/YARN-7867
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Eric Yang
>Priority: Major
>
> The YARN service REST API is disabled by default.  We will make the decision 
> to turn on this feature by default when the code is mature enough to be 
> consumed by the public.
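For reference, a sketch of how the feature is switched on today via 
yarn-site.xml (property name as documented for the YARN services REST API; 
treat it as an assumption if your version differs):

{noformat}
<property>
  <name>yarn.webapp.api-service.enable</name>
  <value>true</value>
</property>
{noformat}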



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7418) Improve performance of locking in fair scheduler

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-7418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-7418:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Improve performance of locking in fair scheduler
> 
>
> Key: YARN-7418
> URL: https://issues.apache.org/jira/browse/YARN-7418
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 3.0.0-beta1
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>Priority: Major
>
> Based on initial testing, we can improve scheduler performance by 5%-10% with 
> some simple optimizations.
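As one illustration of the kind of simple optimization meant here (a generic 
sketch, not the actual FairScheduler patch), read-mostly paths can be moved 
from a coarse monitor lock to a ReentrantReadWriteLock so they no longer 
serialize against each other:

{code:java}
import java.util.concurrent.locks.ReentrantReadWriteLock;

class QueueMetricsExample {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private long demand;

  long getDemand() {            // hot read path: shared lock
    lock.readLock().lock();
    try { return demand; } finally { lock.readLock().unlock(); }
  }

  void updateDemand(long d) {   // rarer write path: exclusive lock
    lock.writeLock().lock();
    try { demand = d; } finally { lock.writeLock().unlock(); }
  }
}
{code}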



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9995) Code cleanup in TestSchedConfCLI

2020-04-09 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17079668#comment-17079668
 ] 

Bilwa S T commented on YARN-9995:
-

Hi [~snemeth] 

TestSchedConfCLI#testFormatSchedulerConf fails in branch-3.2, so I have raised 
YARN-10230 to fix it.

> Code cleanup in TestSchedConfCLI
> 
>
> Key: YARN-9995
> URL: https://issues.apache.org/jira/browse/YARN-9995
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Bilwa S T
>Priority: Minor
> Fix For: 3.4.0, 3.3.1
>
> Attachments: YARN-9995.001.patch, YARN-9995.002.patch, 
> YARN-9995.003.patch, YARN-9995.004.patch, YARN-9995.branch-3.2.patch
>
>
> Some tests are too verbose: 
> - add / delete / remove queues testcases: Creating SchedConfUpdateInfo 
> instances could be simplified with a helper method or something like that.
> - Some fields can be converted to local variables: sysOutStream, sysOut, 
> sysErr, csConf
> - Any additional cleanup



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6831) Miscellaneous refactoring changes of ContainerScheduler

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-6831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-6831:
---
Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Miscellaneous refactoring changes of ContainerScheduler 
> --
>
> Key: YARN-6831
> URL: https://issues.apache.org/jira/browse/YARN-6831
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Major
>
> While reviewing YARN-6706, Karthik pointed out a few issues for improvement 
> in ContainerScheduler:
> *Make ResourceUtilizationTracker pluggable. That way, we could use a 
> different tracker when oversubscription is enabled.
> *ContainerScheduler
>   ##Why do we need maxOppQueueLength given queuingLimit?
>   ##Is there value in splitting runningContainers into runningGuaranteed and 
> runningOpportunistic?
>   ##The getOpportunisticContainersStatus method implementation feels awkward. 
> How about capturing the state in the field here, and having metrics etc. pull 
> from here?
>   ##startContainersFromQueue: the local variable resourcesAvailable is 
> unnecessary
> *OpportunisticContainersStatus
>   ##Let us clearly differentiate between allocated, used and utilized. Maybe 
> we should rename the current Used methods to Allocated?
>   ##I prefer either the full name Opportunistic (in methods) or Opp (the 
> shortest name that makes sense). Opport is neither short nor fully 
> descriptive.
>   ##Have we considered folding the ContainerQueuingLimit class into this?
> We decided to move these issues into this follow-up jira to keep YARN-6706 
> moving forward and unblock the oversubscription work.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-10230) TestSchedConfCLI#testFormatSchedulerConf fails

2020-04-09 Thread Bilwa S T (Jira)
Bilwa S T created YARN-10230:


 Summary: TestSchedConfCLI#testFormatSchedulerConf fails 
 Key: YARN-10230
 URL: https://issues.apache.org/jira/browse/YARN-10230
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.2.0
Reporter: Bilwa S T
Assignee: Bilwa S T


{code:java}
[ERROR] Tests run: 6, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 10.979 
s <<< FAILURE! - in org.apache.hadoop.yarn.client.cli.TestSchedConfCLI
[ERROR] 
testFormatSchedulerConf(org.apache.hadoop.yarn.client.cli.TestSchedConfCLI)  
Time elapsed: 10.017 s  <<< ERROR!
java.lang.Exception: test timed out after 1 milliseconds
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:735)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:678)
at 
sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1587)
at 
sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
at 
java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
at 
com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:253)
at 
com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:153)
at com.sun.jersey.api.client.Client.handle(Client.java:652)
at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682)
at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
at 
com.sun.jersey.api.client.WebResource$Builder.get(WebResource.java:509)
at 
org.apache.hadoop.yarn.client.cli.SchedConfCLI.formatSchedulerConf(SchedConfCLI.java:191)
at 
org.apache.hadoop.yarn.client.cli.TestSchedConfCLI.testFormatSchedulerConf(TestSchedConfCLI.java:226)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)

{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10212) Create separate configuration for max global AM attempts

2020-04-09 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17079664#comment-17079664
 ] 

Hudson commented on YARN-10212:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18135 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18135/])
YARN-10212. Create separate configuration for max global AM attempts. (jhung: 
rev 23481ad378de7f8e95eabefbd102825f757714b8)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceManager.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java


> Create separate configuration for max global AM attempts
> 
>
> Key: YARN-10212
> URL: https://issues.apache.org/jira/browse/YARN-10212
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Jonathan Hung
>Assignee: Bilwa S T
>Priority: Major
> Fix For: 3.3.0, 3.2.2, 3.1.4, 2.10.1, 3.4.0
>
> Attachments: YARN-10212.001.patch, YARN-10212.002.patch, 
> YARN-10212.003.patch, YARN-10212.004.patch
>
>
> Right now, a user's default max AM attempts is set to the same value as the 
> global max AM attempts:
> {noformat}
> int globalMaxAppAttempts = conf.getInt(YarnConfiguration.RM_AM_MAX_ATTEMPTS,
> YarnConfiguration.DEFAULT_RM_AM_MAX_ATTEMPTS); {noformat}
> If we want to increase the global max AM attempts, it will also increase the 
> default. So we should create a separate global AM max attempts config to 
> separate the two.
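A hedged sketch of the separation (the new property name below is an 
assumption for illustration; the committed patch defines the actual constant):

{code:java}
// New, RM-wide ceiling, falling back to the existing key for compatibility:
int globalMaxAppAttempts = conf.getInt(
    "yarn.resourcemanager.am.global.max-attempts",   // assumed new property
    conf.getInt(YarnConfiguration.RM_AM_MAX_ATTEMPTS,
        YarnConfiguration.DEFAULT_RM_AM_MAX_ATTEMPTS));
// The per-app default stays tied to the old key, so raising the global
// ceiling no longer raises every application's default:
int defaultMaxAppAttempts = conf.getInt(YarnConfiguration.RM_AM_MAX_ATTEMPTS,
    YarnConfiguration.DEFAULT_RM_AM_MAX_ATTEMPTS);
{code}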



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10227) Pull YARN-8242 back to branch-2.10

2020-04-09 Thread Jonathan Hung (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17079656#comment-17079656
 ] 

Jonathan Hung commented on YARN-10227:
--

Thanks Jim for fixing this. Belated +1 from me.

> Pull YARN-8242 back to branch-2.10
> --
>
> Key: YARN-10227
> URL: https://issues.apache.org/jira/browse/YARN-10227
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.10.0, 2.10.1
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Fix For: 2.10.1
>
> Attachments: YARN-10227-branch-2.10.001.patch
>
>
> We have recently seen the nodemanager OOM issue reported in YARN-8242 during 
> a rolling upgrade.  Our code is currently based on branch-2.8, but we are in 
> the process of moving to 2.10.  I checked and YARN-8242 pulls back to 
> branch-2.10 pretty cleanly.  The only conflict was a minor one in 
> TestNMLeveldbStateStoreService.java.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9995) Code cleanup in TestSchedConfCLI

2020-04-09 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17079639#comment-17079639
 ] 

Hadoop QA commented on YARN-9995:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
50s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
49s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
28s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 34s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
39s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} branch-3.2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m  2s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}120m 54s{color} 
| {color:red} hadoop-yarn-client in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}176m 53s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.client.cli.TestSchedConfCLI |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.8 Server=19.03.8 Image:yetus/hadoop:11aff6c269f |
| JIRA Issue | YARN-9995 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12999458/YARN-9995.branch-3.2.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 1065f27aeb23 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.2 / 4c63a81 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_242 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/25840/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/25840/testReport/ |
| Max. process+thread count | 547 (vs. ulimit of 5500) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client |
| Console output | 

[jira] [Updated] (YARN-6838) Add support to LinuxContainerExecutor to support container PAUSE

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-6838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-6838:
---
Target Version/s: 3.4.0  (was: 3.3.0)

> Add support to LinuxContainerExecutor to support container PAUSE
> 
>
> Key: YARN-6838
> URL: https://issues.apache.org/jira/browse/YARN-6838
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>Priority: Major
>
> This JIRA tracks the changes needed to the {{LinuxContainerExecutor}}, 
> {{LinuxContainerRuntime}}, {{DockerLinuxContainerRuntime}} and the 
> {{container-executor}} linux binary to support container PAUSE using the 
> cgroups freezer module.
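For context, the cgroup v1 freezer interface such a PAUSE implementation 
would ultimately drive looks roughly like this (illustrative shell against an 
assumed YARN cgroup hierarchy, not the container-executor code itself):

{noformat}
# pause every task in the container's cgroup
echo FROZEN > /sys/fs/cgroup/freezer/hadoop-yarn/<container-id>/freezer.state
# resume
echo THAWED > /sys/fs/cgroup/freezer/hadoop-yarn/<container-id>/freezer.state
{noformat}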



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6838) Add support to LinuxContainerExecutor to support container PAUSE

2020-04-09 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-6838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17079638#comment-17079638
 ] 

Brahma Reddy Battula commented on YARN-6838:


Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker. 

> Add support to LinuxContainerExecutor to support container PAUSE
> 
>
> Key: YARN-6838
> URL: https://issues.apache.org/jira/browse/YARN-6838
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>Priority: Major
>
> This JIRA tracks the changes needed to the {{LinuxContainerExecutor}}, 
> {{LinuxContainerRuntime}}, {{DockerLinuxContainerRuntime}} and the 
> {{container-executor}} linux binary to support container PAUSE using the 
> cgroups freezer module.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10223) Duplicate jersey-test-framework-core dependency in yarn-server-common

2020-04-09 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17079615#comment-17079615
 ] 

Brahma Reddy Battula commented on YARN-10223:
-

[~aajisaka], are you sure this Jira can be targeted to 3.3.1, since the 
YARN-10101 change that introduced the breakage is in 3.3.0?

> Duplicate jersey-test-framework-core dependency in yarn-server-common
> -
>
> Key: YARN-10223
> URL: https://issues.apache.org/jira/browse/YARN-10223
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: build
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Minor
>
> The following warning appears in maven log.
> {noformat}
> [WARNING] 'dependencies.dependency.(groupId:artifactId:type:classifier)' must 
> be unique: 
> com.sun.jersey.jersey-test-framework:jersey-test-framework-core:jar -> 
> version (?) vs 1.19 @ line 148, column 17
> {noformat}
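The warning indicates that hadoop-yarn-server-common's pom.xml declares the 
same test artifact twice; the fix is to keep a single declaration (a sketch 
based only on the coordinates in the warning above):

{noformat}
<dependency>
  <groupId>com.sun.jersey.jersey-test-framework</groupId>
  <artifactId>jersey-test-framework-core</artifactId>
  <scope>test</scope>
</dependency>
{noformat}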



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10063) Usage output of container-executor binary needs to include --http/--https argument

2020-04-09 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17079614#comment-17079614
 ] 

Brahma Reddy Battula commented on YARN-10063:
-

Updated the fix version to 3.3.0 for branch-3.3.

> Usage output of container-executor binary needs to include --http/--https 
> argument
> --
>
> Key: YARN-10063
> URL: https://issues.apache.org/jira/browse/YARN-10063
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Siddharth Ahuja
>Assignee: Siddharth Ahuja
>Priority: Minor
> Fix For: 3.3.0, 3.4.0
>
> Attachments: YARN-10063.001.patch, YARN-10063.002.patch, 
> YARN-10063.003.patch, YARN-10063.004.patch
>
>
> YARN-8448/YARN-6586 seem to have introduced new options, "\--http" (default) 
> and "\--https", that can be passed in to the container-executor binary; see:
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L564
> and
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L521
> However, the usage output seems to have missed this:
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L74
> Raising this jira to improve that.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10063) Usage output of container-executor binary needs to include --http/--https argument

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-10063:

Fix Version/s: (was: 3.3.1)
   3.3.0

> Usage output of container-executor binary needs to include --http/--https 
> argument
> --
>
> Key: YARN-10063
> URL: https://issues.apache.org/jira/browse/YARN-10063
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Siddharth Ahuja
>Assignee: Siddharth Ahuja
>Priority: Minor
> Fix For: 3.3.0, 3.4.0
>
> Attachments: YARN-10063.001.patch, YARN-10063.002.patch, 
> YARN-10063.003.patch, YARN-10063.004.patch
>
>
> YARN-8448/YARN-6586 seem to have introduced new options, "\--http" (default) 
> and "\--https", that can be passed in to the container-executor binary; see:
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L564
> and
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L521
> However, the usage output seems to have missed this:
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c#L74
> Raising this jira to improve that.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6911) Graph application-level resource utilization in Web UI v2

2020-04-09 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-6911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17079603#comment-17079603
 ] 

Hadoop QA commented on YARN-6911:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  7s{color} 
| {color:red} YARN-6911 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-6911 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/25841/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Graph application-level resource utilization in Web UI v2
> -
>
> Key: YARN-6911
> URL: https://issues.apache.org/jira/browse/YARN-6911
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: yarn-ui-v2
>Affects Versions: 3.0.0-alpha4
>Reporter: Abdullah Yousufi
>Assignee: Abdullah Yousufi
>Priority: Major
> Attachments: Resource Graph Screenshot 2.png, Resource Graph 
> Screenshot.png, Resource Utilization Graph Mock Up.png, YARN-6911.001.patch, 
> YARN-6911.002.patch, YARN-6911.003.patch, resource graph in web ui v2.png
>
>
> It would be useful to have a visualization of the resource utilization 
> (memory, cpu, etc.) per application using the ATSv2 time series data. Rough 
> mock up attached.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10227) Pull YARN-8242 back to branch-2.10

2020-04-09 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17079602#comment-17079602
 ] 

Jim Brennan commented on YARN-10227:


Thanks [~epayne]!

> Pull YARN-8242 back to branch-2.10
> --
>
> Key: YARN-10227
> URL: https://issues.apache.org/jira/browse/YARN-10227
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.10.0, 2.10.1
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Fix For: 2.10.1
>
> Attachments: YARN-10227-branch-2.10.001.patch
>
>
> We have recently seen the nodemanager OOM issue reported in YARN-8242 during 
> a rolling upgrade.  Our code is currently based on branch-2.8, but we are in 
> the process of moving to 2.10.  I checked and YARN-8242 pulls back to 
> branch-2.10 pretty cleanly.  The only conflict was a minor one in 
> TestNMLeveldbStateStoreService.java.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6911) Graph application-level resource utilization in Web UI v2

2020-04-09 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-6911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17079601#comment-17079601
 ] 

Brahma Reddy Battula commented on YARN-6911:


Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.

> Graph application-level resource utilization in Web UI v2
> -
>
> Key: YARN-6911
> URL: https://issues.apache.org/jira/browse/YARN-6911
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: yarn-ui-v2
>Affects Versions: 3.0.0-alpha4
>Reporter: Abdullah Yousufi
>Assignee: Abdullah Yousufi
>Priority: Major
> Attachments: Resource Graph Screenshot 2.png, Resource Graph 
> Screenshot.png, Resource Utilization Graph Mock Up.png, YARN-6911.001.patch, 
> YARN-6911.002.patch, YARN-6911.003.patch, resource graph in web ui v2.png
>
>
> It would be useful to have a visualization of the resource utilization 
> (memory, cpu, etc.) per application using the ATSv2 time series data. Rough 
> mock up attached.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6911) Graph application-level resource utilization in Web UI v2

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-6911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-6911:
---
Target Version/s: 3.4.0  (was: 3.3.0)

> Graph application-level resource utilization in Web UI v2
> -
>
> Key: YARN-6911
> URL: https://issues.apache.org/jira/browse/YARN-6911
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: yarn-ui-v2
>Affects Versions: 3.0.0-alpha4
>Reporter: Abdullah Yousufi
>Assignee: Abdullah Yousufi
>Priority: Major
> Attachments: Resource Graph Screenshot 2.png, Resource Graph 
> Screenshot.png, Resource Utilization Graph Mock Up.png, YARN-6911.001.patch, 
> YARN-6911.002.patch, YARN-6911.003.patch, resource graph in web ui v2.png
>
>
> It would be useful to have a visualization of the resource utilization 
> (memory, cpu, etc.) per application using the ATSv2 time series data. Rough 
> mock up attached.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-6812) Consolidate ContainerScheduler maxOpprQueueLength with ContainerQueuingLimit

2020-04-09 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-6812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17079597#comment-17079597
 ] 

Brahma Reddy Battula edited comment on YARN-6812 at 4/9/20, 5:35 PM:
-

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.


was (Author: brahmareddy):
Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.
 * [|https://issues.apache.org/jira/secure/AddComment!default.jspa?id=13086602]

> Consolidate ContainerScheduler maxOpprQueueLength with ContainerQueuingLimit 
> -
>
> Key: YARN-6812
> URL: https://issues.apache.org/jira/browse/YARN-6812
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6812) Consolidate ContainerScheduler maxOpprQueueLength with ContainerQueuingLimit

2020-04-09 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-6812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17079597#comment-17079597
 ] 

Brahma Reddy Battula commented on YARN-6812:


Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a 
blocker.
 * [|https://issues.apache.org/jira/secure/AddComment!default.jspa?id=13086602]

> Consolidate ContainerScheduler maxOpprQueueLength with ContainerQueuingLimit 
> -
>
> Key: YARN-6812
> URL: https://issues.apache.org/jira/browse/YARN-6812
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6812) Consolidate ContainerScheduler maxOpprQueueLength with ContainerQueuingLimit

2020-04-09 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-6812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-6812:
---
Target Version/s: 3.4.0  (was: 3.3.0)

> Consolidate ContainerScheduler maxOpprQueueLength with ContainerQueuingLimit 
> -
>
> Key: YARN-6812
> URL: https://issues.apache.org/jira/browse/YARN-6812
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10120) In Federation Router Nodes/Applications/About pages throws 500 exception when https is enabled

2020-04-09 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17079589#comment-17079589
 ] 

Brahma Reddy Battula commented on YARN-10120:
-

I am going to close this issue, as this is merged to 3.3.0 and 3.4.0. If you 
are planning for other branches, please raise a separate Jira.

> In Federation Router Nodes/Applications/About pages throws 500 exception when 
> https is enabled
> --
>
> Key: YARN-10120
> URL: https://issues.apache.org/jira/browse/YARN-10120
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: federation
>Reporter: Sushanta Sen
>Assignee: Bilwa S T
>Priority: Critical
> Fix For: 3.3.0, 3.4.0
>
> Attachments: YARN-10120-YARN-7402.patch, 
> YARN-10120-YARN-7402.v2.patch, YARN-10120-addendum-01.patch, 
> YARN-10120-branch-3.3.patch, YARN-10120-branch-3.3.v2.patch, 
> YARN-10120.001.patch, YARN-10120.002.patch
>
>
> In Federation Router, the Nodes/Applications/About pages throw a 500 
> exception when https is enabled.
> yarn.router.webapp.https.address = router ip:8091
> {noformat}
> 2020-02-07 16:38:49,990 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error 
> handling URI: /cluster/apps
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:166)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>   at 
> com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287)
>   at 
> com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277)
>   at 
> com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182)
>   at 
> com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
>   at 
> com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:941)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:875)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:829)
>   at 
> com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
>   at 
> com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:119)
>   at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:133)
>   at com.google.inject.servlet.GuiceFilter$1.call(GuiceFilter.java:130)
>   at 
> com.google.inject.servlet.GuiceFilter$Context.call(GuiceFilter.java:203)
>   at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:130)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at 
> org.apache.hadoop.security.http.XFrameOptionsFilter.doFilter(XFrameOptionsFilter.java:57)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)
>   at 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at 
> org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1622)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:583)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:513)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>   at 
> 
