[jira] [Assigned] (YARN-2246) Job History Link in RM UI is redirecting to the URL which contains Job Id twice
[ https://issues.apache.org/jira/browse/YARN-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe reassigned YARN-2246:
--------------------------------

Assignee: Devaraj K (was: Jason Lowe)

bq. Do you suggest that we should use the original tracking url directly instead of proxy url on the web UI?

Not sure if it's always OK to use the raw history tracking URL, as there might be some setups where the client can reach the RM but can't reach the history tracking URL directly. However, I think it's OK to always have the RM advertise the original proxy tracking URL (i.e.: http://rmaddr/proxy/appid) to clients through its UI. The AM can redefine what that proxy redirects to, but the RM should never tack on paths to that proxy URI when advertising it. In other words, clients visiting http://rmaddr/proxy/appid will always reach the UI (either AM or history), and if the AM and history have properly mirrored UIs then it should be seamless to transition between the two when the AM unregisters and redefines the tracking URL to point to the history server.

bq. seamless view between the AM UI and history UI is not possible nowadays.

Correct, but that's MapReduce's fault and not YARN's. If the RM handles the proxy properly then it should be possible for an app framework to implement a properly mirrored UI between the AM and the history server.

bq. In general, seamless view will still be difficult with the aforementioned solution between two tracking URLs. For example, tracking URL is http://t1:p1/a/b first, and I'm visiting the path at http://t1:p1/a/b/x/y/z. When the tracking URL becomes http://t2:p2/c/d/e, I refresh the page and am redirected to http://t2:p2/c/d/e/a/b/x/y/z. Without mapping between original tracking url and proxy url, we don't know /a/b is part of tracking url base, and it shouldn't be carried on.

Not sure I'm following the example because there are no proxy URLs in it. The client should always be using the proxy URL for this discussion. If I follow the example correctly, the original tracking URL is http://t1:p1/a/b, and the proxy URL is rooted there (i.e.: proxy/appid -> t1:p1/a/b). So I'm visiting proxy/appid/x/y/z and then the AM unregisters with a new tracking URL of t2:p2/c/d/e. Then the proxy servlet should redirect that same proxy/appid/x/y/z request to t2:p2/c/d/e/x/y/z, which seems correct to me. It's just taking the path underneath the proxy address (i.e.: everything after proxy/appid) and tacking it onto the specified tracking URL. The same subpath is seen by both the AM and history URIs, assuming a/b is the root of the AM UI and c/d/e is the root of the history UI (for that app). So it seems this works as I would expect. Am I missing something?

bq. I have updated the generateProxyUriWithScheme() in the latest patch.

Thanks for updating the patch, Devaraj. I think it looks good, although it would be nice to have some regression tests to verify that if the app changes the tracking URL, the proxy URL doesn't update like it used to.

Job History Link in RM UI is redirecting to the URL which contains Job Id twice
-------------------------------------------------------------------------------

Key: YARN-2246
URL: https://issues.apache.org/jira/browse/YARN-2246
Project: Hadoop YARN
Issue Type: Bug
Components: webapp
Reporter: Devaraj K
Assignee: Devaraj K
Fix For: 2.7.0
Attachments: MAPREDUCE-4064-1.patch, MAPREDUCE-4064.patch, YARN-2246-3.patch, YARN-2246.2.patch, YARN-2246.patch

{code:xml}
http://xx.x.x.x:19888/jobhistory/job/job_1332435449546_0001/jobhistory/job/job_1332435449546_0001
{code}
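The path-joining behavior described above can be made concrete with a small sketch. This is an illustrative helper, not the actual WebAppProxyServlet code: it keeps only the subpath after proxy/<appid> and appends it to whatever tracking URL is currently registered.

{code:java}
import java.net.URI;

/**
 * Illustrative sketch (not the actual WebAppProxyServlet logic) of the
 * behavior discussed above: the proxy keeps only the subpath after
 * /proxy/<appid> and tacks it onto the current tracking URL.
 */
public class ProxyRedirectSketch {

  /** e.g. requestPath = "/proxy/application_1_0001/x/y/z". */
  static URI redirectTarget(String requestPath, String appId, String trackingUrl) {
    String prefix = "/proxy/" + appId;
    // Subpath under the proxy address, e.g. "/x/y/z" (empty at the root).
    String subPath = requestPath.substring(requestPath.indexOf(prefix) + prefix.length());
    // Append to the tracking URL base, e.g. http://t2:p2/c/d/e -> .../c/d/e/x/y/z
    String base = trackingUrl.endsWith("/")
        ? trackingUrl.substring(0, trackingUrl.length() - 1) : trackingUrl;
    return URI.create(base + subPath);
  }

  public static void main(String[] args) {
    // The AM registered t1:p1/a/b; later the history server takes over at
    // t2:p2/c/d/e. The same subpath /x/y/z is preserved across the switch.
    System.out.println(redirectTarget("/proxy/application_1_0001/x/y/z",
        "application_1_0001", "http://t2:p2/c/d/e")); // http://t2:p2/c/d/e/x/y/z
  }
}
{code}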
[jira] [Commented] (YARN-2616) Add CLI client to the registry to list/view entries
[ https://issues.apache.org/jira/browse/YARN-2616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314236#comment-14314236 ]

Hadoop QA commented on YARN-2616:
---------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12697704/YARN-2616-008.patch
against trunk revision e0ec071.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6574//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6574//console

This message is automatically generated.

Add CLI client to the registry to list/view entries
---------------------------------------------------

Key: YARN-2616
URL: https://issues.apache.org/jira/browse/YARN-2616
Project: Hadoop YARN
Issue Type: Sub-task
Components: client
Affects Versions: 2.6.0
Reporter: Steve Loughran
Assignee: Akshay Radia
Attachments: YARN-2616-003.patch, YARN-2616-008.patch, YARN-2616-008.patch, yarn-2616-v1.patch, yarn-2616-v2.patch, yarn-2616-v4.patch, yarn-2616-v5.patch, yarn-2616-v6.patch, yarn-2616-v7.patch

The registry needs a CLI interface.
[jira] [Updated] (YARN-3090) DeletionService can silently ignore deletion task failures
[ https://issues.apache.org/jira/browse/YARN-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Saxena updated YARN-3090:
-------------------------------

Attachment: YARN-3090.04.patch

DeletionService can silently ignore deletion task failures
----------------------------------------------------------

Key: YARN-3090
URL: https://issues.apache.org/jira/browse/YARN-3090
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Jason Lowe
Assignee: Varun Saxena
Attachments: YARN-3090.001.patch, YARN-3090.002.patch, YARN-3090.003.patch, YARN-3090.04.patch

If a non-I/O exception occurs while the DeletionService is executing a deletion task, it will be silently ignored. The exception bubbles up to the worker threads of the ScheduledThreadPoolExecutor, which simply attaches the throwable to the Future that was returned when the task was scheduled. However, the thread pool is used as a fire-and-forget pool, so nothing ever looks at the Future and therefore the exception is never logged.
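The fire-and-forget failure mode in the description is easy to reproduce in isolation. The sketch below is illustrative rather than the patch itself: the first task's exception vanishes into an unobserved Future, while the second task guards its body with a try/catch so the failure is at least logged.

{code:java}
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class FireAndForgetDemo {
  public static void main(String[] args) throws InterruptedException {
    ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(1);

    // Fire-and-forget: the returned Future is ignored, so the exception is
    // captured inside it and never surfaces anywhere.
    Runnable swallowed = () -> { throw new IllegalStateException("silently swallowed"); };
    pool.submit(swallowed);

    // Guarded variant: catch and log inside the task itself so a failure
    // cannot disappear into an unobserved Future.
    Runnable guarded = () -> {
      try {
        throw new IllegalStateException("now visible");
      } catch (Throwable t) {
        System.err.println("Deletion task failed: " + t);
      }
    };
    pool.submit(guarded);

    pool.shutdown();
    pool.awaitTermination(5, TimeUnit.SECONDS);
  }
}
{code}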
[jira] [Updated] (YARN-3090) DeletionService can silently ignore deletion task failures
[ https://issues.apache.org/jira/browse/YARN-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Saxena updated YARN-3090:
-------------------------------

Attachment: (was: YARN-3090.004.patch)

DeletionService can silently ignore deletion task failures
----------------------------------------------------------

Key: YARN-3090
URL: https://issues.apache.org/jira/browse/YARN-3090
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Jason Lowe
Assignee: Varun Saxena
Attachments: YARN-3090.001.patch, YARN-3090.002.patch, YARN-3090.003.patch, YARN-3090.04.patch

If a non-I/O exception occurs while the DeletionService is executing a deletion task, it will be silently ignored. The exception bubbles up to the worker threads of the ScheduledThreadPoolExecutor, which simply attaches the throwable to the Future that was returned when the task was scheduled. However, the thread pool is used as a fire-and-forget pool, so nothing ever looks at the Future and therefore the exception is never logged.
[jira] [Commented] (YARN-3090) DeletionService can silently ignore deletion task failures
[ https://issues.apache.org/jira/browse/YARN-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314344#comment-14314344 ]

Varun Saxena commented on YARN-3090:
------------------------------------

I was able to kick it.

DeletionService can silently ignore deletion task failures
----------------------------------------------------------

Key: YARN-3090
URL: https://issues.apache.org/jira/browse/YARN-3090
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Jason Lowe
Assignee: Varun Saxena
Attachments: YARN-3090.001.patch, YARN-3090.002.patch, YARN-3090.003.patch, YARN-3090.04.patch

If a non-I/O exception occurs while the DeletionService is executing a deletion task, it will be silently ignored. The exception bubbles up to the worker threads of the ScheduledThreadPoolExecutor, which simply attaches the throwable to the Future that was returned when the task was scheduled. However, the thread pool is used as a fire-and-forget pool, so nothing ever looks at the Future and therefore the exception is never logged.
[jira] [Commented] (YARN-3090) DeletionService can silently ignore deletion task failures
[ https://issues.apache.org/jira/browse/YARN-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314328#comment-14314328 ]

Varun Saxena commented on YARN-3090:
------------------------------------

Weird, Jenkins is not getting kicked. Can somebody do that manually?

DeletionService can silently ignore deletion task failures
----------------------------------------------------------

Key: YARN-3090
URL: https://issues.apache.org/jira/browse/YARN-3090
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Jason Lowe
Assignee: Varun Saxena
Attachments: YARN-3090.001.patch, YARN-3090.002.patch, YARN-3090.003.patch, YARN-3090.04.patch

If a non-I/O exception occurs while the DeletionService is executing a deletion task, it will be silently ignored. The exception bubbles up to the worker threads of the ScheduledThreadPoolExecutor, which simply attaches the throwable to the Future that was returned when the task was scheduled. However, the thread pool is used as a fire-and-forget pool, so nothing ever looks at the Future and therefore the exception is never logged.
[jira] [Commented] (YARN-3090) DeletionService can silently ignore deletion task failures
[ https://issues.apache.org/jira/browse/YARN-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314385#comment-14314385 ]

Hadoop QA commented on YARN-3090:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12697794/YARN-3090.04.patch
against trunk revision e0ec071.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6576//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6576//console

This message is automatically generated.

DeletionService can silently ignore deletion task failures
----------------------------------------------------------

Key: YARN-3090
URL: https://issues.apache.org/jira/browse/YARN-3090
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Jason Lowe
Assignee: Varun Saxena
Attachments: YARN-3090.001.patch, YARN-3090.002.patch, YARN-3090.003.patch, YARN-3090.04.patch

If a non-I/O exception occurs while the DeletionService is executing a deletion task, it will be silently ignored. The exception bubbles up to the worker threads of the ScheduledThreadPoolExecutor, which simply attaches the throwable to the Future that was returned when the task was scheduled. However, the thread pool is used as a fire-and-forget pool, so nothing ever looks at the Future and therefore the exception is never logged.
[jira] [Assigned] (YARN-2246) Job History Link in RM UI is redirecting to the URL which contains Job Id twice
[ https://issues.apache.org/jira/browse/YARN-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe reassigned YARN-2246:
--------------------------------

Assignee: Jason Lowe (was: Devaraj K)

Job History Link in RM UI is redirecting to the URL which contains Job Id twice
-------------------------------------------------------------------------------

Key: YARN-2246
URL: https://issues.apache.org/jira/browse/YARN-2246
Project: Hadoop YARN
Issue Type: Bug
Components: webapp
Reporter: Devaraj K
Assignee: Jason Lowe
Fix For: 2.7.0
Attachments: MAPREDUCE-4064-1.patch, MAPREDUCE-4064.patch, YARN-2246-3.patch, YARN-2246.2.patch, YARN-2246.patch

{code:xml}
http://xx.x.x.x:19888/jobhistory/job/job_1332435449546_0001/jobhistory/job/job_1332435449546_0001
{code}
[jira] [Created] (YARN-3170) YARN architecture document needs updating
Allen Wittenauer created YARN-3170:
-----------------------------------

Summary: YARN architecture document needs updating
Key: YARN-3170
URL: https://issues.apache.org/jira/browse/YARN-3170
Project: Hadoop YARN
Issue Type: Improvement
Reporter: Allen Wittenauer

The marketing paragraph at the top, NextGen MapReduce, etc. are all marketing rather than actual descriptions. It also needs some general updates, especially given it reads as though 0.23 was just released yesterday.
[jira] [Updated] (YARN-3170) YARN architecture document needs updating
[ https://issues.apache.org/jira/browse/YARN-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allen Wittenauer updated YARN-3170:
-----------------------------------

Component/s: documentation

YARN architecture document needs updating
-----------------------------------------

Key: YARN-3170
URL: https://issues.apache.org/jira/browse/YARN-3170
Project: Hadoop YARN
Issue Type: Improvement
Components: documentation
Reporter: Allen Wittenauer

The marketing paragraph at the top, NextGen MapReduce, etc. are all marketing rather than actual descriptions. It also needs some general updates, especially given it reads as though 0.23 was just released yesterday.
[jira] [Updated] (YARN-2942) Aggregated Log Files should be compacted
[ https://issues.apache.org/jira/browse/YARN-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Kanter updated YARN-2942:
--------------------------------

Attachment: YARN-2942.003.patch

The YARN-2942.003.patch fixes some minor problems I found when dealing with logs for long-running applications:
- The JHS would correctly display the logs, but also show a message that they couldn't be found.
- The NM wasn't trying to compact the long-running logs (which is expected), but it was dumping an ugly error message to its log about it. It now checks that the normal aggregated log file exists before trying to read it, to prevent that. I also made it so that it won't even try to get the lock if its aggregated file is not there, which is better.

Aggregated Log Files should be compacted
----------------------------------------

Key: YARN-2942
URL: https://issues.apache.org/jira/browse/YARN-2942
Project: Hadoop YARN
Issue Type: New Feature
Affects Versions: 2.6.0
Reporter: Robert Kanter
Assignee: Robert Kanter
Attachments: CompactedAggregatedLogsProposal_v1.pdf, CompactedAggregatedLogsProposal_v2.pdf, YARN-2942-preliminary.001.patch, YARN-2942-preliminary.002.patch, YARN-2942.001.patch, YARN-2942.002.patch, YARN-2942.003.patch

Turning on log aggregation allows users to easily store container logs in HDFS and subsequently view them in the YARN web UIs from a central place. Currently, there is a separate log file for each Node Manager. This can be a problem for HDFS if you have a cluster with many nodes, as you’ll slowly start accumulating many (possibly small) files per YARN application. The current “solution” for this problem is to configure YARN (actually the JHS) to automatically delete these files after some amount of time. We should improve this by compacting the per-node aggregated log files into one log file per application.
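The existence check described in the comment might look roughly like the sketch below; the path and helper method are hypothetical, not the patch's actual code. The idea is simply to test for the per-node aggregated log file before opening it, and to skip lock acquisition entirely when it is absent.

{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CompactionGuardSketch {

  /** Illustrative guard: only attempt compaction when the aggregated log exists. */
  static boolean shouldCompact(FileSystem fs, Path aggregatedLog) throws IOException {
    // Expected to be absent for long-running apps whose logs are not yet
    // finalized; returning false avoids both the spurious error message
    // and a pointless lock acquisition.
    return fs.exists(aggregatedLog);
  }

  public static void main(String[] args) throws IOException {
    FileSystem fs = FileSystem.get(new Configuration());
    // Hypothetical remote-app log dir layout; not the patch's actual path.
    Path log = new Path("/tmp/logs/user/logs/application_0000000000000_0001/node1");
    System.out.println("compact? " + shouldCompact(fs, log));
  }
}
{code}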
[jira] [Commented] (YARN-3164) rmadmin command usage prints incorrect command name
[ https://issues.apache.org/jira/browse/YARN-3164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315443#comment-14315443 ]

Bibin A Chundatt commented on YARN-3164:
----------------------------------------

The Findbugs and test failures seem unrelated to this commit; only the console message gets updated with the uploaded patch.

rmadmin command usage prints incorrect command name
---------------------------------------------------

Key: YARN-3164
URL: https://issues.apache.org/jira/browse/YARN-3164
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Bibin A Chundatt
Assignee: Bibin A Chundatt
Priority: Minor
Attachments: YARN-3164.1.patch

/hadoop/bin {color:red}./yarn rmadmin -transitionToActive{color}
transitionToActive: incorrect number of arguments
Usage: {color:red}HAAdmin{color} [-transitionToActive <serviceId> [--forceactive]]

{color:red}./yarn HAAdmin{color}
Error: Could not find or load main class HAAdmin

Expected: it should be rmadmin.
[jira] [Commented] (YARN-3124) Capacity Scheduler LeafQueue/ParentQueue should use QueueCapacities to track capacities-by-label
[ https://issues.apache.org/jira/browse/YARN-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315554#comment-14315554 ]

Hadoop QA commented on YARN-3124:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12697933/YARN-3124.3.patch
against trunk revision 7c6b654.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 6 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6587//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6587//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6587//console

This message is automatically generated.

Capacity Scheduler LeafQueue/ParentQueue should use QueueCapacities to track capacities-by-label
------------------------------------------------------------------------------------------------

Key: YARN-3124
URL: https://issues.apache.org/jira/browse/YARN-3124
Project: Hadoop YARN
Issue Type: Sub-task
Components: api, client, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
Attachments: YARN-3124.1.patch, YARN-3124.2.patch, YARN-3124.3.patch

After YARN-3098, capacities-by-label (including used-capacity/maximum-capacity/absolute-maximum-capacity, etc.) should be tracked in QueueCapacities. This patch targets making all capacities-by-label in CS queues tracked by QueueCapacities.
[jira] [Updated] (YARN-3157) Wrong format for application id / attempt id not handled completely
[ https://issues.apache.org/jira/browse/YARN-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bibin A Chundatt updated YARN-3157:
-----------------------------------

Attachment: YARN-3157.1.patch

Uploading after applying the formatter.

Wrong format for application id / attempt id not handled completely
--------------------------------------------------------------------

Key: YARN-3157
URL: https://issues.apache.org/jira/browse/YARN-3157
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Bibin A Chundatt
Assignee: Bibin A Chundatt
Priority: Minor
Attachments: YARN-3157.1.patch, YARN-3157.patch, YARN-3157.patch

yarn.cmd application -kill application_123

When the wrong format is given for an application id or attempt, the exception is thrown to the console without any info:
{quote}
15/02/07 22:18:01 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" java.util.NoSuchElementException
at com.google.common.base.AbstractIterator.next(AbstractIterator.java:75)
at org.apache.hadoop.yarn.util.ConverterUtils.toApplicationId(ConverterUtils.java:146)
at org.apache.hadoop.yarn.util.ConverterUtils.toApplicationId(ConverterUtils.java:205)
at org.apache.hadoop.yarn.client.cli.ApplicationCLI.killApplication(ApplicationCLI.java:383)
at org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:219)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
{quote}

A catch block for java.util.NoSuchElementException also needs to be added.
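A hedged sketch of the extra handling the description asks for. ConverterUtils.toApplicationId is the real parser named in the stack trace; the wrapper method here is hypothetical and only shows where the additional catch block would sit.

{code:java}
import java.util.NoSuchElementException;

import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.util.ConverterUtils;

public class AppIdParseSketch {

  /** Hypothetical helper: turn parser failures into a friendly message. */
  static ApplicationId parseOrExplain(String appIdStr) {
    try {
      return ConverterUtils.toApplicationId(appIdStr);
    } catch (NumberFormatException | NoSuchElementException e) {
      // e.g. "application_123" is missing its second component and triggers
      // NoSuchElementException from the underlying iterator.
      throw new IllegalArgumentException("Invalid ApplicationId: " + appIdStr, e);
    }
  }
}
{code}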
[jira] [Commented] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
[ https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315420#comment-14315420 ]

Akira AJISAKA commented on YARN-2336:
-------------------------------------

Hi [~kj-ki], would you rebase the patch for trunk?

Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
------------------------------------------------------------------------------

Key: YARN-2336
URL: https://issues.apache.org/jira/browse/YARN-2336
Project: Hadoop YARN
Issue Type: Bug
Components: fairscheduler
Affects Versions: 2.4.1
Reporter: Kenji Kikushima
Assignee: Kenji Kikushima
Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336.patch

When we have sub-queues in Fair Scheduler, the REST api returns JSON with a missing '[' bracket for childQueues. This issue was found by [~ajisakaa] at YARN-1050.
[jira] [Commented] (YARN-2942) Aggregated Log Files should be compacted
[ https://issues.apache.org/jira/browse/YARN-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315405#comment-14315405 ]

Hadoop QA commented on YARN-2942:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12697901/YARN-2942.002.patch
against trunk revision d5855c0.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:red}-1 findbugs{color}. The patch appears to cause Findbugs (version 2.0.3) to fail.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The test build failed in
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6586//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6586//console

This message is automatically generated.

Aggregated Log Files should be compacted
----------------------------------------

Key: YARN-2942
URL: https://issues.apache.org/jira/browse/YARN-2942
Project: Hadoop YARN
Issue Type: New Feature
Affects Versions: 2.6.0
Reporter: Robert Kanter
Assignee: Robert Kanter
Attachments: CompactedAggregatedLogsProposal_v1.pdf, CompactedAggregatedLogsProposal_v2.pdf, YARN-2942-preliminary.001.patch, YARN-2942-preliminary.002.patch, YARN-2942.001.patch, YARN-2942.002.patch, YARN-2942.003.patch

Turning on log aggregation allows users to easily store container logs in HDFS and subsequently view them in the YARN web UIs from a central place. Currently, there is a separate log file for each Node Manager. This can be a problem for HDFS if you have a cluster with many nodes, as you’ll slowly start accumulating many (possibly small) files per YARN application. The current “solution” for this problem is to configure YARN (actually the JHS) to automatically delete these files after some amount of time. We should improve this by compacting the per-node aggregated log files into one log file per application.
[jira] [Commented] (YARN-3164) rmadmin command usage prints incorrect command name
[ https://issues.apache.org/jira/browse/YARN-3164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315473#comment-14315473 ]

Rohith commented on YARN-3164:
------------------------------

[~bibinchundatt] thanks for providing the patch. Could you add a test for regression?

rmadmin command usage prints incorrect command name
---------------------------------------------------

Key: YARN-3164
URL: https://issues.apache.org/jira/browse/YARN-3164
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Bibin A Chundatt
Assignee: Bibin A Chundatt
Priority: Minor
Attachments: YARN-3164.1.patch

/hadoop/bin {color:red}./yarn rmadmin -transitionToActive{color}
transitionToActive: incorrect number of arguments
Usage: {color:red}HAAdmin{color} [-transitionToActive <serviceId> [--forceactive]]

{color:red}./yarn HAAdmin{color}
Error: Could not find or load main class HAAdmin

Expected: it should be rmadmin.
[jira] [Commented] (YARN-3160) Non-atomic operation on nodeUpdateQueue in RMNodeImpl
[ https://issues.apache.org/jira/browse/YARN-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315476#comment-14315476 ]

Chengbing Liu commented on YARN-3160:
-------------------------------------

Maybe just {{updatedContainers}}? Renaming is fine with me.

Non-atomic operation on nodeUpdateQueue in RMNodeImpl
-----------------------------------------------------

Key: YARN-3160
URL: https://issues.apache.org/jira/browse/YARN-3160
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.7.0
Reporter: Chengbing Liu
Assignee: Chengbing Liu
Attachments: YARN-3160.2.patch, YARN-3160.patch

{code:title=RMNodeImpl.java|borderStyle=solid}
while(nodeUpdateQueue.peek() != null){
  latestContainerInfoList.add(nodeUpdateQueue.poll());
}
{code}

The above code carries the potential risk of adding a null value to {{latestContainerInfoList}}. Since {{ConcurrentLinkedQueue}} implements a wait-free algorithm, we can poll the queue directly and then check whether the returned value is null.
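A self-contained sketch of the suggested fix, using plain strings in place of the real container-update type: polling once and testing the result for null means no element can disappear between a peek() and the following poll().

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

public class DrainQueueSketch {
  public static void main(String[] args) {
    ConcurrentLinkedQueue<String> nodeUpdateQueue = new ConcurrentLinkedQueue<>();
    nodeUpdateQueue.add("containerStatus-1");
    nodeUpdateQueue.add("containerStatus-2");

    List<String> latestContainerInfoList = new ArrayList<>();
    // Atomic per-element drain: peek()/poll() as two separate steps could
    // interleave with a concurrent consumer and hand poll() a null.
    String update;
    while ((update = nodeUpdateQueue.poll()) != null) {
      latestContainerInfoList.add(update);
    }
    System.out.println(latestContainerInfoList);
  }
}
{code}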
[jira] [Created] (YARN-3169) drop the useless yarn overview document
Allen Wittenauer created YARN-3169:
-----------------------------------

Summary: drop the useless yarn overview document
Key: YARN-3169
URL: https://issues.apache.org/jira/browse/YARN-3169
Project: Hadoop YARN
Issue Type: Improvement
Components: documentation
Reporter: Allen Wittenauer

It's pretty superfluous given there is a site index on the left.
[jira] [Commented] (YARN-3151) On Failover tracking url wrong in application cli for KILLED application
[ https://issues.apache.org/jira/browse/YARN-3151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315548#comment-14315548 ]

Xuan Gong commented on YARN-3151:
---------------------------------

Patch looks good to me. [~rohithsharma] Could you check whether the test cases are related or not?

On Failover tracking url wrong in application cli for KILLED application
-------------------------------------------------------------------------

Key: YARN-3151
URL: https://issues.apache.org/jira/browse/YARN-3151
Project: Hadoop YARN
Issue Type: Bug
Components: client, resourcemanager
Affects Versions: 2.6.0
Environment: 2 RM HA
Reporter: Bibin A Chundatt
Assignee: Rohith
Priority: Minor
Attachments: 0001-YARN-3151.patch

Run an application and kill it after it starts.

Check {color:red}./yarn application -list -appStates KILLED{color}:
{quote}
Application-Id Tracking-URL
application_1423219262738_0001 http://IP:PORT/cluster/app/application_1423219262738_0001
{quote}

Shut down the active RM1 and check the same command {color:red}./yarn application -list -appStates KILLED{color} after RM2 is active:
{quote}
Application-Id Tracking-URL
application_1423219262738_0001 null
{quote}

The tracking URL for the application is shown as null.
Expected: the same URL as before failover should be shown.

ApplicationReport.getOriginalTrackingUrl() is null after failover in org.apache.hadoop.yarn.client.cli.ApplicationCLI#listApplications(Set<String> appTypes, EnumSet<YarnApplicationState> appStates).
[jira] [Commented] (YARN-3151) On Failover tracking url wrong in application cli for KILLED application
[ https://issues.apache.org/jira/browse/YARN-3151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315559#comment-14315559 ]

Rohith commented on YARN-3151:
------------------------------

Thanks [~xgong] for the review. I will check and upload the patch soon.

On Failover tracking url wrong in application cli for KILLED application
-------------------------------------------------------------------------

Key: YARN-3151
URL: https://issues.apache.org/jira/browse/YARN-3151
Project: Hadoop YARN
Issue Type: Bug
Components: client, resourcemanager
Affects Versions: 2.6.0
Environment: 2 RM HA
Reporter: Bibin A Chundatt
Assignee: Rohith
Priority: Minor
Attachments: 0001-YARN-3151.patch

Run an application and kill it after it starts.

Check {color:red}./yarn application -list -appStates KILLED{color}:
{quote}
Application-Id Tracking-URL
application_1423219262738_0001 http://IP:PORT/cluster/app/application_1423219262738_0001
{quote}

Shut down the active RM1 and check the same command {color:red}./yarn application -list -appStates KILLED{color} after RM2 is active:
{quote}
Application-Id Tracking-URL
application_1423219262738_0001 null
{quote}

The tracking URL for the application is shown as null.
Expected: the same URL as before failover should be shown.

ApplicationReport.getOriginalTrackingUrl() is null after failover in org.apache.hadoop.yarn.client.cli.ApplicationCLI#listApplications(Set<String> appTypes, EnumSet<YarnApplicationState> appStates).
[jira] [Commented] (YARN-1237) Description for yarn.nodemanager.aux-services in yarn-default.xml is misleading
[ https://issues.apache.org/jira/browse/YARN-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314484#comment-14314484 ]

Brahma Reddy Battula commented on YARN-1237:
--------------------------------------------

Can we update it to something like "comma-separated list of services where the service name should only contain a-zA-Z0-9_ and cannot start with numbers"?

Description for yarn.nodemanager.aux-services in yarn-default.xml is misleading
-------------------------------------------------------------------------------

Key: YARN-1237
URL: https://issues.apache.org/jira/browse/YARN-1237
Project: Hadoop YARN
Issue Type: Bug
Components: documentation
Reporter: Hitesh Shah
Priority: Minor

The description states: "the valid service name should only contain a-zA-Z0-9_ and can not start with numbers". It seems to indicate only one service is supported. If multiple services are allowed, it does not indicate how they should be specified, i.e. comma-separated or space-separated? If the service name cannot contain spaces, does this imply that space-separated lists are also permitted?
[jira] [Commented] (YARN-3110) Faulty link and state in ApplicationHistory when aplication is in unassigned state
[ https://issues.apache.org/jira/browse/YARN-3110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314533#comment-14314533 ]

Hadoop QA commented on YARN-3110:
---------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12697521/YARN-3110.20150209-1.patch
against trunk revision 4eb5f7f.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6579//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6579//console

This message is automatically generated.

Faulty link and state in ApplicationHistory when aplication is in unassigned state
----------------------------------------------------------------------------------

Key: YARN-3110
URL: https://issues.apache.org/jira/browse/YARN-3110
Project: Hadoop YARN
Issue Type: Bug
Components: applications, timelineserver
Affects Versions: 2.6.0
Reporter: Bibin A Chundatt
Assignee: Naganarasimha G R
Priority: Minor
Attachments: YARN-3110.20150209-1.patch

Application state and history link are wrong when the application is in unassigned state.

1. Configure the capacity scheduler with queue size as 1 and Absolute Max Capacity: 10.0% (the current application state is Accepted and Unassigned from the resource manager side).
2. Submit an application to the queue and check the state and link in Application history.

State = null, and the history link is shown as N/A in the applicationhistory page.

Kill the same application. In the timeline server logs, the below is shown when selecting the application link:
{quote}
2015-01-29 15:39:50,956 ERROR org.apache.hadoop.yarn.webapp.View: Failed to read the AM container of the application attempt appattempt_1422467063659_0007_01.
java.lang.NullPointerException
at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getContainer(ApplicationHistoryManagerOnTimelineStore.java:162)
at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getAMContainer(ApplicationHistoryManagerOnTimelineStore.java:184)
at org.apache.hadoop.yarn.server.webapp.AppBlock$3.run(AppBlock.java:160)
at org.apache.hadoop.yarn.server.webapp.AppBlock$3.run(AppBlock.java:157)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.yarn.server.webapp.AppBlock.render(AppBlock.java:156)
at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:67)
at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:77)
at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
at org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49)
at org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117)
at org.apache.hadoop.yarn.webapp.hamlet.Hamlet$TD._(Hamlet.java:845)
at org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:56)
at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82)
at org.apache.hadoop.yarn.webapp.Controller.render(Controller.java:212)
at org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.AHSController.app(AHSController.java:38)
at sun.reflect.GeneratedMethodAccessor63.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:153)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
{quote}
[jira] [Commented] (YARN-3129) [YARN] Daemon log 'set level' and 'get level' is not reflecting in Process logs
[ https://issues.apache.org/jira/browse/YARN-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314443#comment-14314443 ]

Naganarasimha G R commented on YARN-3129:
-----------------------------------------

Hi [~jagadesh.kiran], [~brahmareddy], I feel this is not an issue, as the usage of the command is to pass the log name, for example:

./yarn daemonlog -setlevel xx.xx.xx.xxx:45020 org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl DEBUG

After running the above command you can run a YARN app and check the RM logs to find the debug logs for org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl. If it's working, then I will close this issue.

[YARN] Daemon log 'set level' and 'get level' is not reflecting in Process logs
-------------------------------------------------------------------------------

Key: YARN-3129
URL: https://issues.apache.org/jira/browse/YARN-3129
Project: Hadoop YARN
Issue Type: Bug
Reporter: Jagadesh Kiran N
Assignee: Naganarasimha G R

a. Execute the command ./yarn daemonlog -setlevel xx.xx.xx.xxx:45020 ResourceManager DEBUG
b. It is not reflected in the process logs even after performing client-level operations.
c. The log level is not changed.
[jira] [Updated] (YARN-3164) rmadmin command usage prints incorrect command name
[ https://issues.apache.org/jira/browse/YARN-3164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bibin A Chundatt updated YARN-3164:
-----------------------------------

Attachment: YARN-3164.1.patch

Patch added; please review.

rmadmin command usage prints incorrect command name
---------------------------------------------------

Key: YARN-3164
URL: https://issues.apache.org/jira/browse/YARN-3164
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Bibin A Chundatt
Assignee: Bibin A Chundatt
Priority: Minor
Attachments: YARN-3164.1.patch

/hadoop/bin {color:red}./yarn rmadmin -transitionToActive{color}
transitionToActive: incorrect number of arguments
Usage: {color:red}HAAdmin{color} [-transitionToActive <serviceId> [--forceactive]]

{color:red}./yarn HAAdmin{color}
Error: Could not find or load main class HAAdmin

Expected: it should be rmadmin.
[jira] [Commented] (YARN-3090) DeletionService can silently ignore deletion task failures
[ https://issues.apache.org/jira/browse/YARN-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314470#comment-14314470 ]

Hudson commented on YARN-3090:
------------------------------

FAILURE: Integrated in Hadoop-trunk-Commit #7062 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7062/])
YARN-3090. DeletionService can silently ignore deletion task failures. Contributed by Varun Saxena (jlowe: rev 4eb5f7fa32bab1b9ce3fb58eca51e2cd2e194cd5)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DeletionService.java
* hadoop-yarn-project/CHANGES.txt

DeletionService can silently ignore deletion task failures
----------------------------------------------------------

Key: YARN-3090
URL: https://issues.apache.org/jira/browse/YARN-3090
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Jason Lowe
Assignee: Varun Saxena
Fix For: 2.7.0
Attachments: YARN-3090.001.patch, YARN-3090.002.patch, YARN-3090.003.patch, YARN-3090.04.patch

If a non-I/O exception occurs while the DeletionService is executing a deletion task, it will be silently ignored. The exception bubbles up to the worker threads of the ScheduledThreadPoolExecutor, which simply attaches the throwable to the Future that was returned when the task was scheduled. However, the thread pool is used as a fire-and-forget pool, so nothing ever looks at the Future and therefore the exception is never logged.
[jira] [Commented] (YARN-3160) Non-atomic operation on nodeUpdateQueue in RMNodeImpl
[ https://issues.apache.org/jira/browse/YARN-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314482#comment-14314482 ]

Junping Du commented on YARN-3160:
----------------------------------

Didn't see these failures in the test report. Kicking off the Jenkins test again.

Non-atomic operation on nodeUpdateQueue in RMNodeImpl
-----------------------------------------------------

Key: YARN-3160
URL: https://issues.apache.org/jira/browse/YARN-3160
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.7.0
Reporter: Chengbing Liu
Assignee: Chengbing Liu
Attachments: YARN-3160.2.patch, YARN-3160.patch

{code:title=RMNodeImpl.java|borderStyle=solid}
while(nodeUpdateQueue.peek() != null){
  latestContainerInfoList.add(nodeUpdateQueue.poll());
}
{code}

The above code carries the potential risk of adding a null value to {{latestContainerInfoList}}. Since {{ConcurrentLinkedQueue}} implements a wait-free algorithm, we can poll the queue directly and then check whether the returned value is null.
[jira] [Commented] (YARN-1237) Description for yarn.nodemanager.aux-services in yarn-default.xml is misleading
[ https://issues.apache.org/jira/browse/YARN-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314507#comment-14314507 ]

Tsuyoshi OZAWA commented on YARN-1237:
--------------------------------------

Hi [~brahmareddy], thank you for taking this JIRA.

{quote}
comma separated list of services where service name should only contain a-zA-Z0-9_ and can not start with numbers
{quote}

Sounds reasonable. From my actual configuration:

{code}
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>spark_shuffle,mapreduce_shuffle</value>
</property>
{code}

Description for yarn.nodemanager.aux-services in yarn-default.xml is misleading
-------------------------------------------------------------------------------

Key: YARN-1237
URL: https://issues.apache.org/jira/browse/YARN-1237
Project: Hadoop YARN
Issue Type: Bug
Components: documentation
Reporter: Hitesh Shah
Priority: Minor

The description states: "the valid service name should only contain a-zA-Z0-9_ and can not start with numbers". It seems to indicate only one service is supported. If multiple services are allowed, it does not indicate how they should be specified, i.e. comma-separated or space-separated? If the service name cannot contain spaces, does this imply that space-separated lists are also permitted?
[jira] [Commented] (YARN-3090) DeletionService can silently ignore deletion task failures
[ https://issues.apache.org/jira/browse/YARN-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314445#comment-14314445 ]

Jason Lowe commented on YARN-3090:
----------------------------------

+1 lgtm. Committing this.

DeletionService can silently ignore deletion task failures
----------------------------------------------------------

Key: YARN-3090
URL: https://issues.apache.org/jira/browse/YARN-3090
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Jason Lowe
Assignee: Varun Saxena
Attachments: YARN-3090.001.patch, YARN-3090.002.patch, YARN-3090.003.patch, YARN-3090.04.patch

If a non-I/O exception occurs while the DeletionService is executing a deletion task, it will be silently ignored. The exception bubbles up to the worker threads of the ScheduledThreadPoolExecutor, which simply attaches the throwable to the Future that was returned when the task was scheduled. However, the thread pool is used as a fire-and-forget pool, so nothing ever looks at the Future and therefore the exception is never logged.
[jira] [Commented] (YARN-3129) [YARN] Daemon log 'set level' and 'get level' is not reflecting in Process logs
[ https://issues.apache.org/jira/browse/YARN-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314477#comment-14314477 ]

Naganarasimha G R commented on YARN-3129:
-----------------------------------------

Or, as part of this jira, we can do the following:
# Update the usage as:
{quote}
Usage: General options are:
[-getlevel <host:httpPort> <log name>]
[-setlevel <host:httpPort> <log name> <level>]
{quote}
# Update the documentation with an example of using this command.
# The {{level}} param is currently case-sensitive, but I think we should support case-insensitive values too.

[YARN] Daemon log 'set level' and 'get level' is not reflecting in Process logs
-------------------------------------------------------------------------------

Key: YARN-3129
URL: https://issues.apache.org/jira/browse/YARN-3129
Project: Hadoop YARN
Issue Type: Bug
Reporter: Jagadesh Kiran N
Assignee: Naganarasimha G R

a. Execute the command ./yarn daemonlog -setlevel xx.xx.xx.xxx:45020 ResourceManager DEBUG
b. It is not reflected in the process logs even after performing client-level operations.
c. The log level is not changed.
[jira] [Assigned] (YARN-1580) Documentation error regarding container-allocation.expiry-interval-ms
[ https://issues.apache.org/jira/browse/YARN-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brahma Reddy Battula reassigned YARN-1580:
------------------------------------------

Assignee: Brahma Reddy Battula

Documentation error regarding container-allocation.expiry-interval-ms
----------------------------------------------------------------------

Key: YARN-1580
URL: https://issues.apache.org/jira/browse/YARN-1580
Project: Hadoop YARN
Issue Type: Bug
Components: documentation
Affects Versions: 2.2.0
Environment: CentOS 6.4
Reporter: German Florez-Larrahondo
Assignee: Brahma Reddy Battula
Priority: Trivial

While trying to control settings related to expiration of tokens for long-running jobs, based on the documentation (http://hadoop.apache.org/docs/r2.2.0/hadoop-yarn/hadoop-yarn-common/yarn-default.xml) I attempted to increase values for yarn.rm.container-allocation.expiry-interval-ms without luck. Looking at code like YarnConfiguration.java, I noticed that in recent versions all these kinds of settings now have the prefix yarn.resourcemanager.rm as opposed to yarn.rm. So for this specific case the setting of interest is yarn.resourcemanager.rm.container-allocation.expiry-interval-ms. I suppose there are other documentation errors similar to this.
[jira] [Commented] (YARN-2809) Implement workaround for linux kernel panic when removing cgroup
[ https://issues.apache.org/jira/browse/YARN-2809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314500#comment-14314500 ]

Hudson commented on YARN-2809:
------------------------------

FAILURE: Integrated in Hadoop-trunk-Commit #7063 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7063/])
YARN-2809. Implement workaround for linux kernel panic when removing cgroup. Contributed by Nathan Roberts (jlowe: rev 3f5431a22fcef7e3eb9aceeefe324e5b7ac84049)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/CgroupsLCEResourcesHandler.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/util/TestCgroupsLCEResourcesHandler.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* hadoop-yarn-project/CHANGES.txt

Implement workaround for linux kernel panic when removing cgroup
----------------------------------------------------------------

Key: YARN-2809
URL: https://issues.apache.org/jira/browse/YARN-2809
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.6.0
Environment: RHEL 6.4
Reporter: Nathan Roberts
Assignee: Nathan Roberts
Fix For: 2.7.0
Attachments: YARN-2809-v2.patch, YARN-2809-v3.patch, YARN-2809.patch

Some older versions of linux have a bug that can cause a kernel panic when the LCE attempts to remove a cgroup. It is a race condition, so it's a bit rare, but on a few-thousand-node cluster it can result in a couple of panics per day.

This is the commit that likely (haven't verified) fixes the problem in linux:
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?h=linux-2.6.39.y&id=068c5cc5ac7414a8e9eb7856b4bf3cc4d4744267

Details will be added in comments.
[jira] [Updated] (YARN-2246) Job History Link in RM UI is redirecting to the URL which contains Job Id twice
[ https://issues.apache.org/jira/browse/YARN-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj K updated YARN-2246:
----------------------------

Attachment: YARN-2246-4.patch

Job History Link in RM UI is redirecting to the URL which contains Job Id twice
-------------------------------------------------------------------------------

Key: YARN-2246
URL: https://issues.apache.org/jira/browse/YARN-2246
Project: Hadoop YARN
Issue Type: Bug
Components: webapp
Reporter: Devaraj K
Assignee: Devaraj K
Fix For: 2.7.0
Attachments: MAPREDUCE-4064-1.patch, MAPREDUCE-4064.patch, YARN-2246-3.patch, YARN-2246-4.patch, YARN-2246.2.patch, YARN-2246.patch

{code:xml}
http://xx.x.x.x:19888/jobhistory/job/job_1332435449546_0001/jobhistory/job/job_1332435449546_0001
{code}
[jira] [Commented] (YARN-3164) rmadmin command usage prints incorrect command name
[ https://issues.apache.org/jira/browse/YARN-3164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314544#comment-14314544 ]

Hadoop QA commented on YARN-3164:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12697812/YARN-3164.1.patch
against trunk revision 4eb5f7f.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client:
org.apache.hadoop.ipc.TestRPCWaitForProxy
org.apache.hadoop.yarn.client.api.impl.TestAMRMClient

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6577//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6577//artifact/patchprocess/newPatchFindbugsWarningshadoop-common.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6577//console

This message is automatically generated.

rmadmin command usage prints incorrect command name
---------------------------------------------------

Key: YARN-3164
URL: https://issues.apache.org/jira/browse/YARN-3164
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Bibin A Chundatt
Assignee: Bibin A Chundatt
Priority: Minor
Attachments: YARN-3164.1.patch

/hadoop/bin {color:red}./yarn rmadmin -transitionToActive{color}
transitionToActive: incorrect number of arguments
Usage: {color:red}HAAdmin{color} [-transitionToActive <serviceId> [--forceactive]]

{color:red}./yarn HAAdmin{color}
Error: Could not find or load main class HAAdmin

Expected: it should be rmadmin.
[jira] [Commented] (YARN-933) Potential InvalidStateTransitonException: Invalid event: LAUNCHED at FINAL_SAVING
[ https://issues.apache.org/jira/browse/YARN-933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313834#comment-14313834 ]

Hadoop QA commented on YARN-933:
--------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12697671/0004-YARN-933.patch
against trunk revision b73956f.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6573//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6573//console

This message is automatically generated.

Potential InvalidStateTransitonException: Invalid event: LAUNCHED at FINAL_SAVING
---------------------------------------------------------------------------------

Key: YARN-933
URL: https://issues.apache.org/jira/browse/YARN-933
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.0.5-alpha
Reporter: J.Andreina
Assignee: Rohith
Attachments: 0001-YARN-933.patch, 0001-YARN-933.patch, 0004-YARN-933.patch, YARN-933.3.patch, YARN-933.patch

AM max retries is configured as 3 on both the client and RM side.

Step 1: Install a cluster with NMs on 2 machines.
Step 2: Make ping using the IP from the RM machine to the NM1 machine succeed, but using the hostname it should fail.
Step 3: Execute a job.
Step 4: After AM [AppAttempt_1] allocation to the NM1 machine is done, a connection loss happened.

Observation:
============
After AppAttempt_1 has moved to the failed state, release of the container for AppAttempt_1 and application removal are successful. A new AppAttempt_2 is spawned.
1. Then a retry for AppAttempt_1 happens again.
2. Again on the RM side it is trying to launch AppAttempt_1, hence it fails with InvalidStateTransitonException.
3. The client exited after AppAttempt_1 finished [but actually the job is still running], while the app attempts configured is 3 and the rest of the attempts are all spawned and running.

RM Logs:
========
2013-07-17 16:22:51,013 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1373952096466_0056_01 State change from SCHEDULED to ALLOCATED
2013-07-17 16:35:48,171 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: host-10-18-40-15/10.18.40.59:8048. Already tried 36 time(s); maxRetries=45
2013-07-17 16:36:07,091 INFO org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: Expired:container_1373952096466_0056_01_01 Timed out after 600 secs
2013-07-17 16:36:07,093 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1373952096466_0056_01_01 Container Transitioned from ACQUIRED to EXPIRED
2013-07-17 16:36:07,093 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Registering appattempt_1373952096466_0056_02
2013-07-17 16:36:07,131 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Application appattempt_1373952096466_0056_01 is done. finalState=FAILED
2013-07-17 16:36:07,131 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Application removed - appId: application_1373952096466_0056 user: Rex leaf-queue of parent: root #applications: 35
2013-07-17 16:36:07,132 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Application Submission: appattempt_1373952096466_0056_02,
2013-07-17 16:36:07,138 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1373952096466_0056_02 State change from SUBMITTED to SCHEDULED
2013-07-17 16:36:30,179 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: host-10-18-40-15/10.18.40.59:8048. Already tried 38 time(s); maxRetries=45
2013-07-17 16:38:36,203 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: host-10-18-40-15/10.18.40.59:8048. Already tried 44 time(s); maxRetries=45
2013-07-17
[jira] [Commented] (YARN-3100) Make YARN authorization pluggable
[ https://issues.apache.org/jira/browse/YARN-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313843#comment-14313843 ] Chris Douglas commented on YARN-3100: - Sorry, I didn't get to the patch over the weekend. Thanks for addressing the review feedback. Are there follow-up JIRAs for some of the types to be added to PrivilegedEntity? Just curious. Make YARN authorization pluggable - Key: YARN-3100 URL: https://issues.apache.org/jira/browse/YARN-3100 Project: Hadoop YARN Issue Type: Improvement Reporter: Jian He Assignee: Jian He Fix For: 2.7.0 Attachments: YARN-3100.1.patch, YARN-3100.2.patch, YARN-3100.2.patch The goal is to make the YARN ACL model pluggable so as to integrate other authorization tools such as Apache Ranger and Sentry. Currently, we have - admin ACL - queue ACL - application ACL - timeline domain ACL - service ACL The proposal is to create a YarnAuthorizationProvider interface. The current implementation will be the default implementation. A Ranger or Sentry plug-in can implement this interface. Benefits: - Unify the code base. With the default implementation, we can get rid of each specific ACL manager such as AdminAclManager, ApplicationACLsManager, QueueAclsManager, etc. - Enable Ranger and Sentry to do authorization for YARN. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
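For readers following the proposal, a rough sketch of what an interface of this shape could look like; the method names and signatures here are illustrative assumptions, not the committed API (see the attached patches for that):
{code:java}
// Illustrative sketch of a pluggable authorizer. A Ranger or Sentry plug-in
// would subclass this; the default implementation would keep today's
// ACL-manager behavior.
public abstract class YarnAuthorizationProvider {
  public abstract void init(Configuration conf);

  // target: a queue, an application, a timeline domain, or the admin entity.
  public abstract boolean checkPermission(AccessType accessType,
      PrivilegedEntity target, UserGroupInformation user);

  public abstract void setPermission(PrivilegedEntity target,
      Map<AccessType, AccessControlList> acls, UserGroupInformation ugi);
}
{code}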
[jira] [Commented] (YARN-1983) Support heterogeneous container types at runtime on YARN
[ https://issues.apache.org/jira/browse/YARN-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313831#comment-14313831 ] Chris Douglas commented on YARN-1983: - bq. We still need a way to demux the executor to support the case of a YARN cluster with a mix of executors. That'd mean some impact on the CLC, no? Policies that select the appropriate executor could demux on the contents of the CLC and not a dedicated field. A simple, static dispatch from an admin-configured list is a great place to start, but adding a string to the CLC that selects the executor class by name is difficult to evolve. Since the same semantics are available without changes to the platform, why bake these in? bq. I think my current patch is intrusive indeed but more general, right? I'm not sure I follow. How is it more general? Support heterogeneous container types at runtime on YARN Key: YARN-1983 URL: https://issues.apache.org/jira/browse/YARN-1983 Project: Hadoop YARN Issue Type: Improvement Reporter: Junping Du Attachments: YARN-1983.2.patch, YARN-1983.patch Different container types (default, LXC, Docker, VM box, etc.) have different semantics on isolation of security, namespace/env, performance, etc. Per discussions in YARN-1964, we have some good thoughts on supporting different types of containers running on YARN, specified by the application at runtime, which would largely enhance YARN's flexibility to meet heterogeneous apps' requirements on isolation at runtime. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
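To make the demux alternative concrete, a static, admin-configured dispatch that inspects the CLC rather than adding a dedicated field could look roughly like this; the map-loading helper and the environment key are invented for illustration:
{code:java}
// Hypothetical sketch: select the ContainerExecutor from an admin-configured
// map, keyed by something already present in the ContainerLaunchContext.
Map<String, ContainerExecutor> executors = loadExecutorsFromConf(conf); // invented helper
String hint = launchContext.getEnvironment().get("CONTAINER_RUNTIME");  // invented key
ContainerExecutor chosen = (hint != null && executors.containsKey(hint))
    ? executors.get(hint) : defaultExecutor;
{code}
The point of the sketch is that the selection policy lives entirely on the NM side and can evolve without a CLC or protocol change.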
[jira] [Updated] (YARN-2246) Job History Link in RM UI is redirecting to the URL which contains Job Id twice
[ https://issues.apache.org/jira/browse/YARN-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated YARN-2246: Attachment: YARN-2246-3.patch I have updated the generateProxyUriWithScheme() in the latest patch. Job History Link in RM UI is redirecting to the URL which contains Job Id twice --- Key: YARN-2246 URL: https://issues.apache.org/jira/browse/YARN-2246 Project: Hadoop YARN Issue Type: Bug Components: webapp Reporter: Devaraj K Assignee: Devaraj K Fix For: 2.7.0 Attachments: MAPREDUCE-4064-1.patch, MAPREDUCE-4064.patch, YARN-2246-3.patch, YARN-2246.2.patch, YARN-2246.patch {code:xml} http://xx.x.x.x:19888/jobhistory/job/job_1332435449546_0001/jobhistory/job/job_1332435449546_0001 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2246) Job History Link in RM UI is redirecting to the URL which contains Job Id twice
[ https://issues.apache.org/jira/browse/YARN-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313926#comment-14313926 ] Devaraj K commented on YARN-2246: - I agree that this needs to be handled in RMAppAttemptImpl before exposing the proxy URL to the users. Thanks for your patch [~zjshen]. I have tried this patch, and it works fine. I think generateProxyUriWithScheme() can be updated according to the patch changes. Job History Link in RM UI is redirecting to the URL which contains Job Id twice --- Key: YARN-2246 URL: https://issues.apache.org/jira/browse/YARN-2246 Project: Hadoop YARN Issue Type: Bug Components: webapp Reporter: Devaraj K Assignee: Devaraj K Fix For: 2.7.0 Attachments: MAPREDUCE-4064-1.patch, MAPREDUCE-4064.patch, YARN-2246.2.patch, YARN-2246.patch {code:xml} http://xx.x.x.x:19888/jobhistory/job/job_1332435449546_0001/jobhistory/job/job_1332435449546_0001 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
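For context, the invariant being enforced here is that the RM-advertised URL is derived only from the proxy address and the application id, and never has the tracking URL's path tacked onto it. A minimal sketch of that construction, with the scheme/host/port variables assumed rather than taken from the patch:
{code:java}
// Sketch: the advertised URL is proxy-root + appId and nothing else;
// any path under it is appended by the browser, not by the RM.
String proxyUrl = scheme + "://" + proxyHost + ":" + proxyPort
    + "/proxy/" + appAttemptId.getApplicationId();
// e.g. http://rmaddr:8088/proxy/application_1332435449546_0001
{code}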
[jira] [Commented] (YARN-2971) RM uses conf instead of token service address to renew timeline delegation tokens
[ https://issues.apache.org/jira/browse/YARN-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14313980#comment-14313980 ] Hudson commented on YARN-2971: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #100 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/100/]) YARN-2971. RM uses conf instead of token service address to renew timeline delegation tokens (jeagles) (jeagles: rev af0842589359ad800427337ad2c84fac09907f72) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/impl/TimelineClientImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestTimelineClient.java RM uses conf instead of token service address to renew timeline delegation tokens - Key: YARN-2971 URL: https://issues.apache.org/jira/browse/YARN-2971 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: 2.6.0 Reporter: Jonathan Eagles Assignee: Jonathan Eagles Fix For: 2.7.0 Attachments: YARN-2971-v1.patch, YARN-2971-v2.patch The TimelineClientImpl renewDelegationToken uses the incorrect webaddress to renew Timeline DelegationTokens. It should read the service address out of the token to renew the delegation token. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
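The one-line idea behind the fix above: derive the renewal address from the token's own service field rather than from configuration. In Hadoop this is typically done with SecurityUtil (variable names assumed):
{code:java}
// Read the service address embedded in the delegation token instead of
// consulting the configured timeline web address.
InetSocketAddress renewAddr = SecurityUtil.getTokenServiceAddr(timelineDT);
{code}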
[jira] [Commented] (YARN-3100) Make YARN authorization pluggable
[ https://issues.apache.org/jira/browse/YARN-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14313977#comment-14313977 ] Hudson commented on YARN-3100: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #100 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/100/]) YARN-3100. Made YARN authorization pluggable. Contributed by Jian He. (zjshen: rev 23bf6c72071782e3fd5a628e21495d6b974c7a9e) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/AccessType.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestParentQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/WebApps.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/AdminACLsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/SCMAdminProtocolService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/RMNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/PrivilegedEntity.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/ConfiguredYarnAuthorizer.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/YarnAuthorizationProvider.java Make YARN authorization pluggable - Key: YARN-3100 URL: https://issues.apache.org/jira/browse/YARN-3100 Project: Hadoop YARN Issue Type: Improvement Reporter: Jian He Assignee: Jian He Fix For: 2.7.0 Attachments: 
YARN-3100.1.patch, YARN-3100.2.patch, YARN-3100.2.patch The goal is to make the YARN ACL model pluggable so as to integrate other authorization tools such as Apache Ranger and Sentry. Currently, we have - admin ACL - queue ACL - application ACL - timeline domain ACL - service ACL The proposal is to create a YarnAuthorizationProvider interface. The current implementation will be the default implementation. A Ranger or Sentry plug-in can implement this interface. Benefits: - Unify the code base. With the default implementation, we can get rid of each specific ACL manager such as AdminAclManager, ApplicationACLsManager, QueueAclsManager, etc. - Enable Ranger and Sentry to do authorization for YARN. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3094) reset timer for liveness monitors after RM recovery
[ https://issues.apache.org/jira/browse/YARN-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313979#comment-14313979 ] Hudson commented on YARN-3094: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #100 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/100/]) YARN-3094. Reset timer for liveness monitors after RM recovery. Contributed by Jun Gong (jianhe: rev 0af6a99a3fcfa4b47d3bcba5e5cc5fe7b312a152) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestAMLivelinessMonitor.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/AMLivelinessMonitor.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/AbstractLivelinessMonitor.java * hadoop-yarn-project/CHANGES.txt reset timer for liveness monitors after RM recovery --- Key: YARN-3094 URL: https://issues.apache.org/jira/browse/YARN-3094 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.6.0 Reporter: Jun Gong Assignee: Jun Gong Fix For: 2.7.0 Attachments: YARN-3094.2.patch, YARN-3094.3.patch, YARN-3094.4.patch, YARN-3094.5.patch, YARN-3094.patch When the RM restarts, it will recover RMAppAttempts and register them with the AMLivenessMonitor if they are not in a final state. AMs will time out in the RM if the recovery process takes a long time for some reason (e.g. too many apps). In our system, we found the recovery process took about 3 mins, and all the AMs timed out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
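The shape of the fix is easy to sketch: once recovery completes, restart the clock for every monitored attempt so that time spent recovering does not count against liveness. A sketch against AbstractLivelinessMonitor, assuming it tracks objects in a Map<O, Long> of last-heard-from timestamps named {{running}}:
{code:java}
// Sketch: called after RM recovery finishes; treats "now" as the last
// heartbeat time for everything currently monitored.
public synchronized void resetTimer() {
  long currentTime = clock.getTime();
  for (Map.Entry<O, Long> entry : running.entrySet()) {
    entry.setValue(currentTime);
  }
}
{code}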
[jira] [Commented] (YARN-3155) Refactor the exception handling code for TimelineClientImpl's retryOn method
[ https://issues.apache.org/jira/browse/YARN-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14313992#comment-14313992 ] Hudson commented on YARN-3155: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #100 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/100/]) YARN-3155. Refactor the exception handling code for TimelineClientImpl's retryOn method (Li Lu via wangda) (wangda: rev 00a748d24a565bce0cc8cfa2bdcf165778cea395) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/impl/TimelineClientImpl.java * hadoop-yarn-project/CHANGES.txt Refactor the exception handling code for TimelineClientImpl's retryOn method Key: YARN-3155 URL: https://issues.apache.org/jira/browse/YARN-3155 Project: Hadoop YARN Issue Type: Bug Reporter: Li Lu Assignee: Li Lu Priority: Minor Labels: refactoring Fix For: 2.7.0 Attachments: YARN-3155-020615.patch, YARN-3155-020915.patch Since we switched to Java 1.7, the exception handling code for the retryOn method can be merged into one statement block, instead of the current two, to avoid repeated code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
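As a reminder of the Java 7 feature being applied, two catch blocks with identical bodies collapse into one multi-catch. The exception types and the {{op.run()}}/{{handleRetry()}} names below are illustrative, not TimelineClientImpl's actual signatures:
{code:java}
// Before (Java 6 style): the handling logic is duplicated.
try {
  return op.run();
} catch (IOException e) {
  handleRetry(e);
} catch (RuntimeException e) {
  handleRetry(e);
}

// After (Java 7 multi-catch): one block, no repetition.
try {
  return op.run();
} catch (IOException | RuntimeException e) {
  handleRetry(e);
}
{code}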
[jira] [Updated] (YARN-2616) Add CLI client to the registry to list/view entries
[ https://issues.apache.org/jira/browse/YARN-2616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated YARN-2616: - Attachment: YARN-2616-008.patch Patch -008. uploading to see if this triggers jenkins Add CLI client to the registry to list/view entries --- Key: YARN-2616 URL: https://issues.apache.org/jira/browse/YARN-2616 Project: Hadoop YARN Issue Type: Sub-task Components: client Affects Versions: 2.6.0 Reporter: Steve Loughran Assignee: Akshay Radia Attachments: YARN-2616-003.patch, YARN-2616-008.patch, YARN-2616-008.patch, yarn-2616-v1.patch, yarn-2616-v2.patch, yarn-2616-v4.patch, yarn-2616-v5.patch, yarn-2616-v6.patch, yarn-2616-v7.patch registry needs a CLI interface -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3100) Make YARN authorization pluggable
[ https://issues.apache.org/jira/browse/YARN-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14314036#comment-14314036 ] Hudson commented on YARN-3100: -- FAILURE: Integrated in Hadoop-Yarn-trunk #834 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/834/]) YARN-3100. Made YARN authorization pluggable. Contributed by Jian He. (zjshen: rev 23bf6c72071782e3fd5a628e21495d6b974c7a9e) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestParentQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/WebApps.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/ConfiguredYarnAuthorizer.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/PrivilegedEntity.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/RMNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/YarnAuthorizationProvider.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-sharedcachemanager/src/main/java/org/apache/hadoop/yarn/server/sharedcachemanager/SCMAdminProtocolService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/AccessType.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/AdminACLsManager.java Make YARN authorization pluggable - Key: YARN-3100 URL: https://issues.apache.org/jira/browse/YARN-3100 Project: Hadoop YARN Issue Type: Improvement Reporter: Jian He Assignee: Jian He Fix For: 2.7.0 Attachments: YARN-3100.1.patch, 
YARN-3100.2.patch, YARN-3100.2.patch The goal is to make the YARN ACL model pluggable so as to integrate other authorization tools such as Apache Ranger and Sentry. Currently, we have - admin ACL - queue ACL - application ACL - timeline domain ACL - service ACL The proposal is to create a YarnAuthorizationProvider interface. The current implementation will be the default implementation. A Ranger or Sentry plug-in can implement this interface. Benefits: - Unify the code base. With the default implementation, we can get rid of each specific ACL manager such as AdminAclManager, ApplicationACLsManager, QueueAclsManager, etc. - Enable Ranger and Sentry to do authorization for YARN. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2971) RM uses conf instead of token service address to renew timeline delegation tokens
[ https://issues.apache.org/jira/browse/YARN-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14314039#comment-14314039 ] Hudson commented on YARN-2971: -- FAILURE: Integrated in Hadoop-Yarn-trunk #834 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/834/]) YARN-2971. RM uses conf instead of token service address to renew timeline delegation tokens (jeagles) (jeagles: rev af0842589359ad800427337ad2c84fac09907f72) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/impl/TimelineClientImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestTimelineClient.java * hadoop-yarn-project/CHANGES.txt RM uses conf instead of token service address to renew timeline delegation tokens - Key: YARN-2971 URL: https://issues.apache.org/jira/browse/YARN-2971 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: 2.6.0 Reporter: Jonathan Eagles Assignee: Jonathan Eagles Fix For: 2.7.0 Attachments: YARN-2971-v1.patch, YARN-2971-v2.patch The TimelineClientImpl renewDelegationToken uses the incorrect webaddress to renew Timeline DelegationTokens. It should read the service address out of the token to renew the delegation token. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3094) reset timer for liveness monitors after RM recovery
[ https://issues.apache.org/jira/browse/YARN-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314038#comment-14314038 ] Hudson commented on YARN-3094: -- FAILURE: Integrated in Hadoop-Yarn-trunk #834 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/834/]) YARN-3094. Reset timer for liveness monitors after RM recovery. Contributed by Jun Gong (jianhe: rev 0af6a99a3fcfa4b47d3bcba5e5cc5fe7b312a152) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/AbstractLivelinessMonitor.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestAMLivelinessMonitor.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/AMLivelinessMonitor.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java reset timer for liveness monitors after RM recovery --- Key: YARN-3094 URL: https://issues.apache.org/jira/browse/YARN-3094 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.6.0 Reporter: Jun Gong Assignee: Jun Gong Fix For: 2.7.0 Attachments: YARN-3094.2.patch, YARN-3094.3.patch, YARN-3094.4.patch, YARN-3094.5.patch, YARN-3094.patch When the RM restarts, it will recover RMAppAttempts and register them with the AMLivenessMonitor if they are not in a final state. AMs will time out in the RM if the recovery process takes a long time for some reason (e.g. too many apps). In our system, we found the recovery process took about 3 mins, and all the AMs timed out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3155) Refactor the exception handling code for TimelineClientImpl's retryOn method
[ https://issues.apache.org/jira/browse/YARN-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14314052#comment-14314052 ] Hudson commented on YARN-3155: -- FAILURE: Integrated in Hadoop-Yarn-trunk #834 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/834/]) YARN-3155. Refactor the exception handling code for TimelineClientImpl's retryOn method (Li Lu via wangda) (wangda: rev 00a748d24a565bce0cc8cfa2bdcf165778cea395) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/impl/TimelineClientImpl.java * hadoop-yarn-project/CHANGES.txt Refactor the exception handling code for TimelineClientImpl's retryOn method Key: YARN-3155 URL: https://issues.apache.org/jira/browse/YARN-3155 Project: Hadoop YARN Issue Type: Bug Reporter: Li Lu Assignee: Li Lu Priority: Minor Labels: refactoring Fix For: 2.7.0 Attachments: YARN-3155-020615.patch, YARN-3155-020915.patch Since we switched to Java 1.7, the exception handling code for the retryOn method can be merged into one statement block, instead of the current two, to avoid repeated code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3164) rmadmin command usage prints incorrect command name
[ https://issues.apache.org/jira/browse/YARN-3164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314167#comment-14314167 ] Bibin A Chundatt commented on YARN-3164: Is there any problem in changing the message as below? {color:red}Usage: rmadmin{color} rmadmin command usage prints incorrect command name --- Key: YARN-3164 URL: https://issues.apache.org/jira/browse/YARN-3164 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.6.0 Reporter: Bibin A Chundatt Assignee: Bibin A Chundatt Priority: Minor /hadoop/bin {color:red}./yarn rmadmin -transitionToActive{color} transitionToActive: incorrect number of arguments Usage: {color:red}HAAdmin{color} [-transitionToActive serviceId [--forceactive]] {color:red}./yarn HAAdmin{color} Error: Could not find or load main class HAAdmin Expected: it should print rmadmin. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
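One low-risk way to get there, assuming HAAdmin exposes its usage prefix through an overridable method (the method name below is an assumption about that base class, not a quote from the patch):
{code:java}
// Hypothetical sketch inside RMAdminCLI: report the actual yarn subcommand
// name in usage output instead of the base class name "HAAdmin".
@Override
protected String getUsageString() {
  return "Usage: yarn rmadmin";
}
{code}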
[jira] [Commented] (YARN-2423) TimelineClient should wrap all GET APIs to facilitate Java users
[ https://issues.apache.org/jira/browse/YARN-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14314636#comment-14314636 ] Robert Kanter commented on YARN-2423: - This is based on the current implementation. We can try to add a compatibility layer or something in another JIRA. Though I'm not sure how feasible that will be; the data models are somewhat different... TimelineClient should wrap all GET APIs to facilitate Java users Key: YARN-2423 URL: https://issues.apache.org/jira/browse/YARN-2423 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Robert Kanter Attachments: YARN-2423.004.patch, YARN-2423.005.patch, YARN-2423.006.patch, YARN-2423.007.patch, YARN-2423.patch, YARN-2423.patch, YARN-2423.patch TimelineClient provides the Java method to put timeline entities. It's also good to wrap over all GET APIs (both entity and domain), and deserialize the json response into Java POJO objects. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2693) Priority Label Manager in RM to manage priority labels
[ https://issues.apache.org/jira/browse/YARN-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-2693: -- Attachment: 0005-YARN-2693.patch Attaching the Priority Manager patch with updated changes as discussed in the parent JIRA. Priority Label Manager in RM to manage priority labels -- Key: YARN-2693 URL: https://issues.apache.org/jira/browse/YARN-2693 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Sunil G Assignee: Sunil G Attachments: 0001-YARN-2693.patch, 0002-YARN-2693.patch, 0003-YARN-2693.patch, 0004-YARN-2693.patch, 0005-YARN-2693.patch The focus of this JIRA is to have a centralized service to handle priority labels. Supported operations: * Add/Delete a priority label to a specified queue * Manage the integer mapping associated with each priority label * Support managing the default priority label of a given queue * ACL support at queue level for priority labels * Expose an interface to the RM to validate priority labels Storage for these labels will be done in FileSystem and in Memory, similar to NodeLabels: * FileSystem based: persistent across RM restart * Memory based: non-persistent across RM restart -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-914) Support graceful decommission of nodemanager
[ https://issues.apache.org/jira/browse/YARN-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314653#comment-14314653 ] Junping Du commented on YARN-914: - Thanks [~vinodkv] for the comments! bq. IAC, I think we should also have a CLI command to decommission the node which optionally waits till the decommission succeeds. That sounds pretty good. This new CLI can gracefully decommission the related nodes and, on timeout, forcefully decommission any nodes that haven't finished. Compared with the external-script approach proposed by Ming above, this has less dependency on effort outside of Hadoop. bq. Regarding storage of the decommission state, YARN-2567 also plans to make sure that the state of all nodes is maintained up to date on the state-store. That helps with many other cases too. We should combine these efforts. That makes sense. However, YARN-2567 is about a threshold thing; maybe it is a wrong JIRA number? bq. Regarding long running services, I think it makes sense to let the admin initiating the decommission know - not in terms of policy but as a diagnostic. Other than waiting for a timeout, the admin may not have noticed that a service is running on this node before the decommission is triggered. bq. This is the umbrella concern I have. There are two ways to do this: Let YARN manage the decommission process or manage it on top of YARN. If the latter is the approach, I don't see a lot to be done here besides YARN-291. No? Agree that there is less effort for the 2nd approach. Even so, we still need the RM to be aware of when containers/apps get finished and then trigger shutdown of the NM, so that decommission completes earlier (and randomly), which I guess is important for upgrades of a large cluster, isn't it? For YARN-291, my understanding is that we now don't rely on any open issues left there, because we only need to set the NM's resource to 0 at runtime, which is already provided there. BTW, I think the approach you just proposed above is the 2nd approach + a new CLI, isn't it? I prefer to go this way but would like to hear other folks' ideas here too. Support graceful decommission of nodemanager Key: YARN-914 URL: https://issues.apache.org/jira/browse/YARN-914 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.0.4-alpha Reporter: Luke Lu Assignee: Junping Du Attachments: Gracefully Decommission of NodeManager (v1).pdf When NMs are decommissioned for non-fault reasons (capacity change etc.), it's desirable to minimize the impact on running applications. Currently if an NM is decommissioned, all running containers on the NM need to be rescheduled on other NMs. Furthermore, for finished map tasks, if their map output has not been fetched by the reducers of the job, these map tasks will need to be rerun as well. We propose to introduce a mechanism to optionally gracefully decommission a node manager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2246) Job History Link in RM UI is redirecting to the URL which contains Job Id twice
[ https://issues.apache.org/jira/browse/YARN-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14314707#comment-14314707 ] Hadoop QA commented on YARN-2246: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12697819/YARN-2246-4.patch against trunk revision 3f5431a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestRM Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6580//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6580//console This message is automatically generated. Job History Link in RM UI is redirecting to the URL which contains Job Id twice --- Key: YARN-2246 URL: https://issues.apache.org/jira/browse/YARN-2246 Project: Hadoop YARN Issue Type: Bug Components: webapp Reporter: Devaraj K Assignee: Devaraj K Attachments: MAPREDUCE-4064-1.patch, MAPREDUCE-4064.patch, YARN-2246-3.patch, YARN-2246-4.patch, YARN-2246.2.patch, YARN-2246.patch {code:xml} http://xx.x.x.x:19888/jobhistory/job/job_1332435449546_0001/jobhistory/job/job_1332435449546_0001 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3074) Nodemanager dies when localizer runner tries to write to a full disk
[ https://issues.apache.org/jira/browse/YARN-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-3074: --- Attachment: YARN-3074.03.patch Nodemanager dies when localizer runner tries to write to a full disk Key: YARN-3074 URL: https://issues.apache.org/jira/browse/YARN-3074 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.5.0 Reporter: Jason Lowe Assignee: Varun Saxena Attachments: YARN-3074.001.patch, YARN-3074.002.patch, YARN-3074.03.patch When a LocalizerRunner tries to write to a full disk it can bring down the nodemanager process. Instead of failing the whole process we should fail only the container and make a best attempt to keep going. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3074) Nodemanager dies when localizer runner tries to write to a full disk
[ https://issues.apache.org/jira/browse/YARN-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-3074: --- Attachment: (was: YARN-3074.003.patch) Nodemanager dies when localizer runner tries to write to a full disk Key: YARN-3074 URL: https://issues.apache.org/jira/browse/YARN-3074 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.5.0 Reporter: Jason Lowe Assignee: Varun Saxena Attachments: YARN-3074.001.patch, YARN-3074.002.patch, YARN-3074.03.patch When a LocalizerRunner tries to write to a full disk it can bring down the nodemanager process. Instead of failing the whole process we should fail only the container and make a best attempt to keep going. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3163) admin support for YarnAuthorizationProvider
[ https://issues.apache.org/jira/browse/YARN-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314648#comment-14314648 ] Jian He commented on YARN-3163: --- [~sunilg], I have one question: if the ACL is changed in both the config file and the other storage, how can the RM figure out which one should take precedence after an RM restart? admin support for YarnAuthorizationProvider --- Key: YARN-3163 URL: https://issues.apache.org/jira/browse/YARN-3163 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Sunil G Assignee: Sunil G Runtime configuration support for YarnAuthorizationProvider. Using admin commands, one should be able to set and get permissions from the YarnAuthorizationProvider. This mechanism will let users change permissions without updating config files and firing reload commands. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2693) Priority Label Manager in RM to manage priority labels
[ https://issues.apache.org/jira/browse/YARN-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314712#comment-14314712 ] Hadoop QA commented on YARN-2693: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12697839/0005-YARN-2693.patch against trunk revision 3f5431a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:red}-1 javac{color}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6583//console This message is automatically generated. Priority Label Manager in RM to manage priority labels -- Key: YARN-2693 URL: https://issues.apache.org/jira/browse/YARN-2693 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Sunil G Assignee: Sunil G Attachments: 0001-YARN-2693.patch, 0002-YARN-2693.patch, 0003-YARN-2693.patch, 0004-YARN-2693.patch, 0005-YARN-2693.patch The focus of this JIRA is to have a centralized service to handle priority labels. Supported operations: * Add/Delete a priority label to a specified queue * Manage the integer mapping associated with each priority label * Support managing the default priority label of a given queue * ACL support at queue level for priority labels * Expose an interface to the RM to validate priority labels Storage for these labels will be done in FileSystem and in Memory, similar to NodeLabels: * FileSystem based: persistent across RM restart * Memory based: non-persistent across RM restart -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2423) TimelineClient should wrap all GET APIs to facilitate Java users
[ https://issues.apache.org/jira/browse/YARN-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kanter updated YARN-2423: Attachment: (was: YARN-2423.007.patch) TimelineClient should wrap all GET APIs to facilitate Java users Key: YARN-2423 URL: https://issues.apache.org/jira/browse/YARN-2423 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Robert Kanter Attachments: YARN-2423.004.patch, YARN-2423.005.patch, YARN-2423.006.patch, YARN-2423.007.patch, YARN-2423.patch, YARN-2423.patch, YARN-2423.patch TimelineClient provides the Java method to put timeline entities. It's also good to wrap over all GET APIs (both entity and domain), and deserialize the json response into Java POJO objects. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2423) TimelineClient should wrap all GET APIs to facilitate Java users
[ https://issues.apache.org/jira/browse/YARN-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kanter updated YARN-2423: Attachment: YARN-2423.007.patch I'm not sure what Jenkins's problem is. I've re-rebased the 007 patch and am trying again. TimelineClient should wrap all GET APIs to facilitate Java users Key: YARN-2423 URL: https://issues.apache.org/jira/browse/YARN-2423 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Robert Kanter Attachments: YARN-2423.004.patch, YARN-2423.005.patch, YARN-2423.006.patch, YARN-2423.007.patch, YARN-2423.patch, YARN-2423.patch, YARN-2423.patch TimelineClient provides the Java method to put timeline entities. It's also good to wrap over all GET APIs (both entity and domain), and deserialize the JSON response into Java POJO objects. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3157) Wrong format for application id / attempt id not handled completely
[ https://issues.apache.org/jira/browse/YARN-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314718#comment-14314718 ] Hadoop QA commented on YARN-3157: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12697492/YARN-3157.patch against trunk revision 3f5431a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6581//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6581//console This message is automatically generated. Wrong format for application id / attempt id not handled completely --- Key: YARN-3157 URL: https://issues.apache.org/jira/browse/YARN-3157 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.6.0 Reporter: Bibin A Chundatt Assignee: Bibin A Chundatt Priority: Minor Attachments: YARN-3157.patch, YARN-3157.patch yarn.cmd application -kill application_123 When a wrong format is given for the application id or attempt id, the exception is thrown to the console without any helpful info: {quote} 15/02/07 22:18:01 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable Exception in thread "main" java.util.NoSuchElementException at com.google.common.base.AbstractIterator.next(AbstractIterator.java:75) at org.apache.hadoop.yarn.util.ConverterUtils.toApplicationId(ConverterUtils.java:146) at org.apache.hadoop.yarn.util.ConverterUtils.toApplicationId(ConverterUtils.java:205) at org.apache.hadoop.yarn.client.cli.ApplicationCLI.killApplication(ApplicationCLI.java:383) at org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:219) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) {quote} A catch block for java.util.NoSuchElementException needs to be added as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
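The suggested hardening is mechanical; a sketch of what the catch around the id parsing could look like (the wrapping exception and message are illustrative):
{code:java}
// Sketch: convert the parser's internals (NoSuchElementException from the
// id splitter, NumberFormatException from the numeric fields) into a
// readable CLI error instead of a raw stack trace.
try {
  appId = ConverterUtils.toApplicationId(applicationIdStr);
} catch (NoSuchElementException | NumberFormatException e) {
  throw new IllegalArgumentException(
      "Invalid ApplicationId: " + applicationIdStr, e);
}
{code}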
[jira] [Commented] (YARN-3129) [YARN] Daemon log 'set level' and 'get level' is not reflecting in Process logs
[ https://issues.apache.org/jira/browse/YARN-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14314658#comment-14314658 ] Allen Wittenauer commented on YARN-3129: bq. we should support case insensitive value too These levels are defined by log4j and defined as uppercase everywhere in both code and config. Making it mixed case here means supporting mixed case everywhere... But otherwise, yes, I agree this sounds like a documentation issue more than a bug. I'll move this to HADOOP. [YARN] Daemon log 'set level' and 'get level' is not reflecting in Process logs Key: YARN-3129 URL: https://issues.apache.org/jira/browse/YARN-3129 Project: Hadoop YARN Issue Type: Bug Reporter: Jagadesh Kiran N Assignee: Naganarasimha G R a. Execute the command ./yarn daemonlog -setlevel xx.xx.xx.xxx:45020 ResourceManager DEBUG b. It is not reflecting in process logs even after performing client level operations c. Log level is not changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3160) Non-atomic operation on nodeUpdateQueue in RMNodeImpl
[ https://issues.apache.org/jira/browse/YARN-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314631#comment-14314631 ] Hadoop QA commented on YARN-3160: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12697665/YARN-3160.2.patch against trunk revision 4eb5f7f. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6578//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6578//console This message is automatically generated. Non-atomic operation on nodeUpdateQueue in RMNodeImpl - Key: YARN-3160 URL: https://issues.apache.org/jira/browse/YARN-3160 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Reporter: Chengbing Liu Assignee: Chengbing Liu Attachments: YARN-3160.2.patch, YARN-3160.patch {code:title=RMNodeImpl.java|borderStyle=solid}
while (nodeUpdateQueue.peek() != null) {
  latestContainerInfoList.add(nodeUpdateQueue.poll());
}
{code} The above code brings a potential risk of adding a null value to {{latestContainerInfoList}}: another consumer may drain the queue between the {{peek()}} and the {{poll()}}. Since {{ConcurrentLinkedQueue}} implements a wait-free algorithm, we can directly poll the queue and then check whether the returned value is null. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
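The race-free version of that loop, as the description suggests, polls first and then tests the result:
{code:java}
// poll() is the only queue read: there is no peek()-to-poll() window in
// which another consumer can drain the queue and make poll() return null.
UpdatedContainerInfo containerInfo;
while ((containerInfo = nodeUpdateQueue.poll()) != null) {
  latestContainerInfoList.add(containerInfo);
}
{code}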
[jira] [Commented] (YARN-914) Support graceful decommission of nodemanager
[ https://issues.apache.org/jira/browse/YARN-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314677#comment-14314677 ] Jason Lowe commented on YARN-914: - bq. However, YARN-2567 is about a threshold thing; maybe it is a wrong JIRA number? That's the right JIRA. It's about waiting for a threshold number of nodes to report back in after the RM recovers, and the RM would need to persist the state about the nodes in the cluster to know what percentage of the old nodes have reported back in. As for whether we should just provide hooks vs. making it much more of a turnkey solution, I'd be an advocate for initially seeing what we can do with hooks. Based on what we learn from trying to do decommissions with that, we can provide feedback into the process of making it a built-in, turnkey solution later. I do agree with Vinod that there should minimally be an easy way, CLI or otherwise, for outside scripts driving the decommission to either force it or wait for it to complete. If waiting, there also needs to be a way either to give the wait a timeout that forces the decommission after that point, or another method with which to easily kill the containers still on that node. Support graceful decommission of nodemanager Key: YARN-914 URL: https://issues.apache.org/jira/browse/YARN-914 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.0.4-alpha Reporter: Luke Lu Assignee: Junping Du Attachments: Gracefully Decommission of NodeManager (v1).pdf When NMs are decommissioned for non-fault reasons (capacity change etc.), it's desirable to minimize the impact on running applications. Currently if an NM is decommissioned, all running containers on the NM need to be rescheduled on other NMs. Furthermore, for finished map tasks, if their map output has not been fetched by the reducers of the job, these map tasks will need to be rerun as well. We propose to introduce a mechanism to optionally gracefully decommission a node manager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2246) Job History Link in RM UI is redirecting to the URL which contains Job Id twice
[ https://issues.apache.org/jira/browse/YARN-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314728#comment-14314728 ] Devaraj K commented on YARN-2246: - {code}
org.apache.hadoop.yarn.server.resourcemanager.TestRM.testNMTokenSentForNormalContainer[1]
Failing for the past 1 build (Since Failed#6580 )
Took 20 sec.
Error Message
test timed out after 2 milliseconds
{code} This test failure is unrelated to the patch. It passes locally for me. Job History Link in RM UI is redirecting to the URL which contains Job Id twice --- Key: YARN-2246 URL: https://issues.apache.org/jira/browse/YARN-2246 Project: Hadoop YARN Issue Type: Bug Components: webapp Reporter: Devaraj K Assignee: Devaraj K Attachments: MAPREDUCE-4064-1.patch, MAPREDUCE-4064.patch, YARN-2246-3.patch, YARN-2246-4.patch, YARN-2246.2.patch, YARN-2246.patch {code:xml} http://xx.x.x.x:19888/jobhistory/job/job_1332435449546_0001/jobhistory/job/job_1332435449546_0001 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2423) TimelineClient should wrap all GET APIs to facilitate Java users
[ https://issues.apache.org/jira/browse/YARN-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14314743#comment-14314743 ] Hadoop QA commented on YARN-2423: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12697840/YARN-2423.007.patch against trunk revision 3f5431a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6582//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6582//console This message is automatically generated. TimelineClient should wrap all GET APIs to facilitate Java users Key: YARN-2423 URL: https://issues.apache.org/jira/browse/YARN-2423 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Robert Kanter Attachments: YARN-2423.004.patch, YARN-2423.005.patch, YARN-2423.006.patch, YARN-2423.007.patch, YARN-2423.patch, YARN-2423.patch, YARN-2423.patch TimelineClient provides the Java method to put timeline entities. It's also good to wrap over all GET APIs (both entity and domain), and deserialize the json response into Java POJO objects. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2246) Job History Link in RM UI is redirecting to the URL which contains Job Id twice
[ https://issues.apache.org/jira/browse/YARN-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14314305#comment-14314305 ] Devaraj K commented on YARN-2246: - Thanks [~jlowe] for looking into the patch and confirming the approach. I will update the patch with the tests. Job History Link in RM UI is redirecting to the URL which contains Job Id twice --- Key: YARN-2246 URL: https://issues.apache.org/jira/browse/YARN-2246 Project: Hadoop YARN Issue Type: Bug Components: webapp Reporter: Devaraj K Assignee: Devaraj K Fix For: 2.7.0 Attachments: MAPREDUCE-4064-1.patch, MAPREDUCE-4064.patch, YARN-2246-3.patch, YARN-2246.2.patch, YARN-2246.patch {code:xml} http://xx.x.x.x:19888/jobhistory/job/job_1332435449546_0001/jobhistory/job/job_1332435449546_0001 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2902) Killing a container that is localizing can orphan resources in the DOWNLOADING state
[ https://issues.apache.org/jira/browse/YARN-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14314898#comment-14314898 ] Rushabh S Shah commented on YARN-2902: -- [~varun_saxena]: are you still working on this jira ? Killing a container that is localizing can orphan resources in the DOWNLOADING state Key: YARN-2902 URL: https://issues.apache.org/jira/browse/YARN-2902 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.5.0 Reporter: Jason Lowe Assignee: Varun Saxena Fix For: 2.7.0 Attachments: YARN-2902.002.patch, YARN-2902.patch If a container is in the process of localizing when it is stopped/killed then resources are left in the DOWNLOADING state. If no other container comes along and requests these resources they linger around with no reference counts but aren't cleaned up during normal cache cleanup scans since it will never delete resources in the DOWNLOADING state even if their reference count is zero. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1
[ https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14314931#comment-14314931 ] Hitesh Shah commented on YARN-2928: --- bq. We should have such a configuration that disables the timeline service globally. Please explain what globally means. bq. Can it be handled as a flow of flows as described in the design? For instance, tez application -- hive queries -- YARN apps? Or does it not capture the relationship? Not sure I understand clearly as to how the relationship is captured. Consider this case: There are 5 hive queries: q1 to q5. There are 3 Tez apps: a1 to a3. Now, q1 and q5 ran on a1, q2 ran on a2 and q3,q4 ran on a3. Given q1, I need to know which app it ran on. Given a1, I need to know which queries ran on it. Could you clarify how this should be represented as flows? Application Timeline Server (ATS) next gen: phase 1 --- Key: YARN-2928 URL: https://issues.apache.org/jira/browse/YARN-2928 Project: Hadoop YARN Issue Type: New Feature Components: timelineserver Reporter: Sangjin Lee Assignee: Sangjin Lee Priority: Critical Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal v1.pdf We have the application timeline server implemented in yarn per YARN-1530 and YARN-321. Although it is a great feature, we have recognized several critical issues and features that need to be addressed. This JIRA proposes the design and implementation changes to address those. This is phase 1 of this effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3124) Capacity Scheduler LeafQueue/ParentQueue should use QueueCapacities to track capacities-by-label
[ https://issues.apache.org/jira/browse/YARN-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314971#comment-14314971 ] Jian He commented on YARN-3124: ---
- Merge CapacitySchedulerConfiguration#setCapacitiesByLabels and CSQueueUtils#setAbsoluteCapacitiesByNodeLabels into a single method
- CapacitySchedulerConfiguration#normalizeAccessibleNodeLabels - should AbstractCSQueue#accessibleLabels be updated as well?
- Why the union? newCapacities.getExistingNodeLabels() should be enough:
{code}
for (String label : Sets.union(this.getExistingNodeLabels(),
    newCapacities.getExistingNodeLabels())) {
{code}
- Can the existing get*CapacityByLabel methods be removed? Use queueCapacities#get*capacity instead.
- Passing null for the queueCapacity? Then we can remove the parameter:
{code}
setupQueueConfigs(cs.getClusterResource(), userLimit, userLimitFactor,
    maxApplications, maxAMResourcePerQueuePercent, maxApplicationsPerUser,
    state, acls, cs.getConfiguration().getNodeLocalityDelay(),
    accessibleLabels, defaultLabelExpression,
    cs.getConfiguration().getReservationContinueLook(), null,
    cs.getConfiguration().getMaximumAllocationPerQueue(getQueuePath()));
{code}
- Remove this?
{code}
@Override
protected void initializeCapacitiesFromConf() {
  // Do nothing
}
{code}
- {{CSQueueUtils.setAbsoluteCapacitiesByNodeLabel}} may belong inside AbstractCSQueue
- QueueCapacities#getExistingNodeLabels - rename to getNodeLabels?
- Why does {{CSQueueUtils.setAbsoluteCapacitiesByNodeLabels(queueCapacities, parent);}} have to be called in ReservationQueue#reinitialize?
Capacity Scheduler LeafQueue/ParentQueue should use QueueCapacities to track capacities-by-label Key: YARN-3124 URL: https://issues.apache.org/jira/browse/YARN-3124 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-3124.1.patch, YARN-3124.2.patch After YARN-3098, capacities-by-label (including used-capacity/maximum-capacity/absolute-maximum-capacity, etc.) should be tracked in QueueCapacities. This patch targets making all capacities-by-label in CS queues tracked by QueueCapacities. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2683) registry config options: document and move to core-default
[ https://issues.apache.org/jira/browse/YARN-2683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14314914#comment-14314914 ] Sanjay Radia commented on YARN-2683: yarn-registry.md
* This document describes a YARN service registry built to address a problem: change this to address two problems.
* Add:
** Allow Hadoop core services to be registered and discovered, thereby reducing configuration parameters and allowing core services to be moved more easily.
registry config options: document and move to core-default -- Key: YARN-2683 URL: https://issues.apache.org/jira/browse/YARN-2683 Project: Hadoop YARN Issue Type: Sub-task Components: api, resourcemanager Affects Versions: 2.6.0 Reporter: Steve Loughran Assignee: Steve Loughran Attachments: HADOOP-10530-005.patch, YARN-2683-001.patch, YARN-2683-002.patch, YARN-2683-003.patch, YARN-2683-006.patch Original Estimate: 1h Time Spent: 1h Remaining Estimate: 0.5h Add to {{yarn-site}} a page on registry configuration parameters -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3165) Possible inconsistent queue state when queue reinitialization failed
Jian He created YARN-3165: - Summary: Possible inconsistent queue state when queue reinitialization failed Key: YARN-3165 URL: https://issues.apache.org/jira/browse/YARN-3165 Project: Hadoop YARN Issue Type: Bug Reporter: Jian He Assignee: Jian He This came up in a discussion with [~chris.douglas]. If queue reinitialization fails in the middle, it is possible that queues are left in an inconsistent state - some queues are already updated, but some are not. One example is the code below in LeafQueue:
{code}
if (newMax.getMemory() < oldMax.getMemory()
    || newMax.getVirtualCores() < oldMax.getVirtualCores()) {
  throw new IOException("Trying to reinitialize " + getQueuePath()
      + " the maximum allocation size can not be decreased!"
      + " Current setting: " + oldMax
      + ", trying to set it to: " + newMax);
}
{code}
If the exception is thrown here, the queues processed earlier are already updated, but the later ones are not. So we should make queue reinitialization transactional. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
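One standard shape for such a transactional reinitialization is validate-then-commit: build and validate a complete shadow hierarchy from the new configuration first, and swap it in only after everything passes. A sketch of that shape only - parseQueueHierarchy() and validate() are illustrative helpers, not the actual CapacityScheduler API:
{code}
// Illustrative validate-then-commit sketch; the helper methods are
// hypothetical, not actual CapacityScheduler code.
void reinitializeQueues(CapacitySchedulerConfiguration newConf)
    throws IOException {
  // Phase 1: build and validate the whole new hierarchy without touching
  // live queues; an IOException here leaves the existing state intact.
  CSQueue newRoot = parseQueueHierarchy(newConf);
  validate(newRoot);

  // Phase 2: commit atomically, so a failure can no longer leave some
  // queues updated and others not.
  synchronized (this) {
    this.root = newRoot;
  }
}
{code}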
[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1
[ https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14314933#comment-14314933 ] Hitesh Shah commented on YARN-2928: --- Also, [~sjlee0] [~zjshen], I am assuming you are already aware of YARN-2423 and plan to maintain compatibility with that implementation if it is introduced in a version earlier than the one in which this next-gen impl is supported? Application Timeline Server (ATS) next gen: phase 1 --- Key: YARN-2928 URL: https://issues.apache.org/jira/browse/YARN-2928 Project: Hadoop YARN Issue Type: New Feature Components: timelineserver Reporter: Sangjin Lee Assignee: Sangjin Lee Priority: Critical Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal v1.pdf We have the application timeline server implemented in yarn per YARN-1530 and YARN-321. Although it is a great feature, we have recognized several critical issues and features that need to be addressed. This JIRA proposes the design and implementation changes to address those. This is phase 1 of this effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1621) Add CLI to list rows of task attempt ID, container ID, host of container, state of container
[ https://issues.apache.org/jira/browse/YARN-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14314997#comment-14314997 ] Vinod Kumar Vavilapalli commented on YARN-1621: --- Thanks for working on this, Bartosz. Quick comments on the patch:
- listcontainers - list-containers
- Add a negative test for pre-running applications
Overall, the CLI is pretty badly organized, and this patch is making it worse. We have:
- applicationattempt -list applicationId: lists app attempts of an app
- container -list attemptId: lists containers of an attempt
- application -list: lists all apps
I don't like this, but it is what we have. For this patch, we can continue this scheme and add a container -list appattemptid. And maybe, in a separate effort, create a different set of commands that make the listing work backwards.
Add CLI to list rows of task attempt ID, container ID, host of container, state of container -- Key: YARN-1621 URL: https://issues.apache.org/jira/browse/YARN-1621 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.2.0 Reporter: Tassapol Athiapinya Assignee: Bartosz Ługowski Fix For: 2.7.0 Attachments: YARN-1621.1.patch, YARN-1621.2.patch, YARN-1621.3.patch As more applications are moved to YARN, we need generic CLI to list rows of task attempt ID, container ID, host of container, state of container. Today if YARN application running in a container does hang, there is no way to find out more info because a user does not know where each attempt is running in. For each running application, it is useful to differentiate between running/succeeded/failed/killed containers. {code:title=proposed yarn cli} $ yarn application -list-containers -applicationId appId [-containerState state of container] where containerState is optional filter to list container in given state only. container state can be running/succeeded/killed/failed/all. A user can specify more than one container state at once e.g. KILLED,FAILED. task attempt ID container ID host of container state of container {code} CLI should work with running application/completed application. If a container runs many task attempts, all attempts should be shown. That will likely be the case of Tez container-reuse application. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3020) n similar addContainerRequest()s produce n*(n+1)/2 containers
[ https://issues.apache.org/jira/browse/YARN-3020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14315013#comment-14315013 ] Peter D Kirchner commented on YARN-3020: Hi Wei Yan, My point, adjusted to take the expected usage into account, is that when matching requests and/or allocations are spread over multiple heartbeats, too many containers are requested and received. So, suppose my application calls addContainerRequest() 10 times. Let's take your example where the AMRMClient sends 1 container request on heartbeat 1, and 10 requests at heartbeat 2, overwriting the 1. Say also that the second RPC returns with 1 container. The second request is high by one, i.e. 10, because the application does not yet know about the incoming allocation. Subsequent updates are also high by approximately the number of incoming containers. My application heartbeat is 1 second and the RM is typically allocating 1 container/node/second, so I'd expect 10 containers coming in on the third heartbeat. Per expected usage, my AMRMClient would have sent out an updated request for 9 containers at that time. My application would zero out the matching request on the fourth heartbeat and release the nine extra containers (90% more) that it received but never intended to request. In the present implementation, with the AMRMClient keeping track of the totals, removeContainerRequest() properly decrements AMRMClient's idea of the outstanding count. But because this information is a heartbeat out of date relative to the scheduler's, a partial fix (pending a definitive one) would be for the AMRMClient not to routinely update the RM with this matching total whenever the scheduler's tally is likely to be more accurate. Occasions when the RM should be updated are when there is a new matching addContainerRequest(), i.e. the scheduler's target could otherwise be too low, or when the AMRMClient's outstanding count is decremented to zero. Please see my response to Wangda Tan on 30 Jan 2015. Thank you. n similar addContainerRequest()s produce n*(n+1)/2 containers - Key: YARN-3020 URL: https://issues.apache.org/jira/browse/YARN-3020 Project: Hadoop YARN Issue Type: Bug Components: client Affects Versions: 2.5.0, 2.6.0, 2.5.1, 2.5.2 Reporter: Peter D Kirchner Original Estimate: 24h Remaining Estimate: 24h BUG: If the application master calls addContainerRequest() n times, but with the same priority, I get up to 1+2+3+...+n = n*(n+1)/2 containers. The most containers are requested when the interval between calls to addContainerRequest() exceeds the heartbeat interval of calls to allocate() (in AMRMClientImpl's run() method). If the application master calls addContainerRequest() n times, but with a unique priority each time, I get n containers (as I intended). Analysis: There is a logic problem in AMRMClientImpl.java. Although allocate() in AMRMClientImpl.java does an ask.clear(), on subsequent calls to addContainerRequest(), addResourceRequest() finds the previous matching remoteRequest and increments the container count rather than starting anew, and does an addResourceRequestToAsk(), which defeats the ask.clear(). From documentation and code comments, it was hard for me to discern the intended behavior of the API, but the inconsistency reported in this issue suggests one case or the other is implemented incorrectly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
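For readers following along, the "expected usage" under discussion is that the AM removes a matching request as each container arrives, so the client's outstanding count stays in sync with what it still wants. A minimal sketch against the AMRMClient API (the resource/priority setup is illustrative):
{code}
// Minimal sketch of the expected AMRMClient usage, not a fix for the bug.
AMRMClient<ContainerRequest> amrmClient = AMRMClient.createAMRMClient();
ContainerRequest request =
    new ContainerRequest(Resource.newInstance(1024, 1), null, null,
        Priority.newInstance(0));
for (int i = 0; i < 10; i++) {
  amrmClient.addContainerRequest(request);
}

// On each heartbeat:
AllocateResponse response = amrmClient.allocate(progress);
for (Container container : response.getAllocatedContainers()) {
  // Remove one matching request per accepted container; without this,
  // the next heartbeat re-sends a stale total and the scheduler keeps
  // allocating containers the AM never meant to request.
  amrmClient.removeContainerRequest(request);
}
{code}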
[jira] [Commented] (YARN-2246) Job History Link in RM UI is redirecting to the URL which contains Job Id twice
[ https://issues.apache.org/jira/browse/YARN-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14315018#comment-14315018 ] Jonathan Eagles commented on YARN-2246: --- I think this is going to fix my issue. Job History Link in RM UI is redirecting to the URL which contains Job Id twice --- Key: YARN-2246 URL: https://issues.apache.org/jira/browse/YARN-2246 Project: Hadoop YARN Issue Type: Bug Components: webapp Reporter: Devaraj K Assignee: Devaraj K Attachments: MAPREDUCE-4064-1.patch, MAPREDUCE-4064.patch, YARN-2246-3.patch, YARN-2246-4.patch, YARN-2246.2.patch, YARN-2246.patch {code:xml} http://xx.x.x.x:19888/jobhistory/job/job_1332435449546_0001/jobhistory/job/job_1332435449546_0001 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-3166) Decide detailed package structures for timeline service v2 components
[ https://issues.apache.org/jira/browse/YARN-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu reassigned YARN-3166: --- Assignee: Li Lu Decide detailed package structures for timeline service v2 components - Key: YARN-3166 URL: https://issues.apache.org/jira/browse/YARN-3166 Project: Hadoop YARN Issue Type: Sub-task Reporter: Li Lu Assignee: Li Lu Open this JIRA to track all discussions on detailed package structures for timeline services v2. This JIRA is for discussion only. For our current timeline service v2 design, aggregator (previously called writer) implementation is in hadoop-yarn-server's: {{org.apache.hadoop.yarn.server.timelineservice.aggregator}} In YARN-2928's design, the next gen ATS reader is also a server. Maybe we want to put reader related implementations into hadoop-yarn-server's: {{org.apache.hadoop.yarn.server.timelineservice.reader}} Both readers and aggregators will expose features that may be used by YARN and other 3rd party components, such as aggregator/reader APIs. For those features, maybe we would like to expose their interfaces to hadoop-yarn-common's {{org.apache.hadoop.yarn.timelineservice}}? Let's use this JIRA as a centralized place to track all related discussions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2423) TimelineClient should wrap all GET APIs to facilitate Java users
[ https://issues.apache.org/jira/browse/YARN-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14314952#comment-14314952 ] Hitesh Shah commented on YARN-2423: --- bq. This is based on the current implementation. We can try to add a compatibility layer or something in another JIRA. Though I'm not sure how feasible that will be; the data models are somewhat different. If the current implementation is not planned to be supported in the long term, why introduce a java API that will soon be deprecated or rendered obsolete if the data models are different? Or is the only intention to backport this feature/API into 2.4, 2.5 and 2.6 for existing users of the current implementation of ATS? TimelineClient should wrap all GET APIs to facilitate Java users Key: YARN-2423 URL: https://issues.apache.org/jira/browse/YARN-2423 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Robert Kanter Attachments: YARN-2423.004.patch, YARN-2423.005.patch, YARN-2423.006.patch, YARN-2423.007.patch, YARN-2423.patch, YARN-2423.patch, YARN-2423.patch TimelineClient provides the Java method to put timeline entities. It's also good to wrap over all GET APIs (both entity and domain), and deserialize the json response into Java POJO objects. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
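For context, the wrappers being debated would add GET methods alongside the existing put, roughly as below. The GET signatures here are hypothetical shapes for illustration; the actual methods are whatever the YARN-2423 patches define:
{code}
// Hypothetical shape of the GET wrappers under discussion.
public abstract class TimelineClient extends AbstractService {
  protected TimelineClient(String name) {
    super(name);
  }

  // Existing PUT API.
  public abstract TimelinePutResponse putEntities(TimelineEntity... entities)
      throws IOException, YarnException;

  // Proposed GET wrappers that would deserialize the JSON responses
  // into the Java POJOs (hypothetical signatures).
  public abstract TimelineEntity getEntity(String entityType, String entityId)
      throws IOException, YarnException;
  public abstract TimelineEntities getEntities(String entityType)
      throws IOException, YarnException;
  public abstract TimelineDomain getDomain(String domainId)
      throws IOException, YarnException;
}
{code}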
[jira] [Updated] (YARN-3166) Decide detailed package structures for timeline service v2 components
[ https://issues.apache.org/jira/browse/YARN-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu updated YARN-3166: Issue Type: Sub-task (was: Task) Parent: YARN-2928 Decide detailed package structures for timeline service v2 components - Key: YARN-3166 URL: https://issues.apache.org/jira/browse/YARN-3166 Project: Hadoop YARN Issue Type: Sub-task Reporter: Li Lu Open this JIRA to track all discussions on detailed package structures for timeline services v2. This JIRA is for discussion only so I don't think it should have any assignees. For our current timeline service v2 design, aggregator (previously called writer) implementation is in hadoop-yarn-server's: {{org.apache.hadoop.yarn.server.timelineservice.aggregator}} In YARN-2928's design, the next gen ATS reader is also a server. Maybe we want to put reader related implementations into hadoop-yarn-server's: {{org.apache.hadoop.yarn.server.timelineservice.reader}} Both readers and aggregators will expose features that may be used by YARN and other 3rd party components, such as aggregator/reader APIs. For those features, maybe we would like to expose their interfaces to hadoop-yarn-common's {{org.apache.hadoop.yarn.timelineservice}}? Let's use this JIRA as a centralized place to track all related discussions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3166) Decide detailed package structures for timeline service v2 components
Li Lu created YARN-3166: --- Summary: Decide detailed package structures for timeline service v2 components Key: YARN-3166 URL: https://issues.apache.org/jira/browse/YARN-3166 Project: Hadoop YARN Issue Type: Task Reporter: Li Lu Open this JIRA to track all discussions on detailed package structures for timeline services v2. This JIRA is for discussion only so I don't think it should have any assignees. For our current timeline service v2 design, aggregator (previously called writer) implementation is in hadoop-yarn-server's: {{org.apache.hadoop.yarn.server.timelineservice.aggregator}} In YARN-2928's design, the next gen ATS reader is also a server. Maybe we want to put reader related implementations into hadoop-yarn-server's: {{org.apache.hadoop.yarn.server.timelineservice.reader}} Both readers and aggregators will expose features that may be used by YARN and other 3rd party components, such as aggregator/reader APIs. For those features, maybe we would like to expose their interfaces to hadoop-yarn-common's {{org.apache.hadoop.yarn.timelineservice}}? Let's use this JIRA as a centralized place to track all related discussions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3166) Decide detailed package structures for timeline service v2 components
[ https://issues.apache.org/jira/browse/YARN-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu updated YARN-3166: Description: Open this JIRA to track all discussions on detailed package structures for timeline services v2. This JIRA is for discussion only. For our current timeline service v2 design, aggregator (previously called writer) implementation is in hadoop-yarn-server's: {{org.apache.hadoop.yarn.server.timelineservice.aggregator}} In YARN-2928's design, the next gen ATS reader is also a server. Maybe we want to put reader related implementations into hadoop-yarn-server's: {{org.apache.hadoop.yarn.server.timelineservice.reader}} Both readers and aggregators will expose features that may be used by YARN and other 3rd party components, such as aggregator/reader APIs. For those features, maybe we would like to expose their interfaces to hadoop-yarn-common's {{org.apache.hadoop.yarn.timelineservice}}? Let's use this JIRA as a centralized place to track all related discussions. was: Open this JIRA to track all discussions on detailed package structures for timeline services v2. This JIRA is for discussion only so I don't think it should have any assignees. For our current timeline service v2 design, aggregator (previously called writer) implementation is in hadoop-yarn-server's: {{org.apache.hadoop.yarn.server.timelineservice.aggregator}} In YARN-2928's design, the next gen ATS reader is also a server. Maybe we want to put reader related implementations into hadoop-yarn-server's: {{org.apache.hadoop.yarn.server.timelineservice.reader}} Both readers and aggregators will expose features that may be used by YARN and other 3rd party components, such as aggregator/reader APIs. For those features, maybe we would like to expose their interfaces to hadoop-yarn-common's {{org.apache.hadoop.yarn.timelineservice}}? Let's use this JIRA as a centralized place to track all related discussions. Decide detailed package structures for timeline service v2 components - Key: YARN-3166 URL: https://issues.apache.org/jira/browse/YARN-3166 Project: Hadoop YARN Issue Type: Sub-task Reporter: Li Lu Open this JIRA to track all discussions on detailed package structures for timeline services v2. This JIRA is for discussion only. For our current timeline service v2 design, aggregator (previously called writer) implementation is in hadoop-yarn-server's: {{org.apache.hadoop.yarn.server.timelineservice.aggregator}} In YARN-2928's design, the next gen ATS reader is also a server. Maybe we want to put reader related implementations into hadoop-yarn-server's: {{org.apache.hadoop.yarn.server.timelineservice.reader}} Both readers and aggregators will expose features that may be used by YARN and other 3rd party components, such as aggregator/reader APIs. For those features, maybe we would like to expose their interfaces to hadoop-yarn-common's {{org.apache.hadoop.yarn.timelineservice}}? Let's use this JIRA as a centralized place to track all related discussions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1621) Add CLI to list rows of task attempt ID, container ID, host of container, state of container
[ https://issues.apache.org/jira/browse/YARN-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14314744#comment-14314744 ] Bartosz Ługowski commented on YARN-1621: Patch update. Add CLI to list rows of task attempt ID, container ID, host of container, state of container -- Key: YARN-1621 URL: https://issues.apache.org/jira/browse/YARN-1621 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.2.0 Reporter: Tassapol Athiapinya Fix For: 2.7.0 Attachments: YARN-1621.1.patch, YARN-1621.2.patch, YARN-1621.3.patch As more applications are moved to YARN, we need generic CLI to list rows of task attempt ID, container ID, host of container, state of container. Today if YARN application running in a container does hang, there is no way to find out more info because a user does not know where each attempt is running in. For each running application, it is useful to differentiate between running/succeeded/failed/killed containers. {code:title=proposed yarn cli} $ yarn application -list-containers -applicationId appId [-containerState state of container] where containerState is optional filter to list container in given state only. container state can be running/succeeded/killed/failed/all. A user can specify more than one container state at once e.g. KILLED,FAILED. task attempt ID container ID host of container state of container {code} CLI should work with running application/completed application. If a container runs many task attempts, all attempts should be shown. That will likely be the case of Tez container-reuse application. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3041) create the ATS entity/event API
[ https://issues.apache.org/jira/browse/YARN-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14314752#comment-14314752 ] Sangjin Lee commented on YARN-3041: --- Hitesh on YARN-2928 brought up an interesting point regarding the events (also see my reply). For my own education, what is an event in current ATS? Is it explicitly about affecting state changes in entities? Or can it be something else? How should events be defined in the next gen timeline service? And/or should the notion of the state be explicitly defined? Thoughts? create the ATS entity/event API --- Key: YARN-3041 URL: https://issues.apache.org/jira/browse/YARN-3041 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Robert Kanter Attachments: YARN-3041.preliminary.001.patch Per design in YARN-2928, create the ATS entity and events API. Also, as part of this JIRA, create YARN system entities (e.g. cluster, user, flow, flow run, YARN app, ...). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
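For reference while discussing the question, in the current ATS an event is an arbitrary timestamped record attached to an entity; nothing in the v1 data model restricts it to state changes. A minimal example with the v1 API (the entity, event type, and info key are made up):
{code}
import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity;
import org.apache.hadoop.yarn.api.records.timeline.TimelineEvent;

// In v1, an event is just a typed, timestamped record on an entity -
// the model itself says nothing about state machines.
TimelineEntity entity = new TimelineEntity();
entity.setEntityType("TEZ_TASK_ATTEMPT");
entity.setEntityId("attempt_1");

TimelineEvent event = new TimelineEvent();
event.setEventType("TASK_ATTEMPT_STARTED");  // arbitrary, app-defined type
event.setTimestamp(System.currentTimeMillis());
event.addEventInfo("containerId", "container_1332435449546_0001_01_000002");
entity.addEvent(event);
{code}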
[jira] [Updated] (YARN-1621) Add CLI to list rows of task attempt ID, container ID, host of container, state of container
[ https://issues.apache.org/jira/browse/YARN-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bartosz Ługowski updated YARN-1621: --- Attachment: YARN-1621.3.patch Add CLI to list rows of task attempt ID, container ID, host of container, state of container -- Key: YARN-1621 URL: https://issues.apache.org/jira/browse/YARN-1621 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.2.0 Reporter: Tassapol Athiapinya Fix For: 2.7.0 Attachments: YARN-1621.1.patch, YARN-1621.2.patch, YARN-1621.3.patch As more applications are moved to YARN, we need generic CLI to list rows of task attempt ID, container ID, host of container, state of container. Today if YARN application running in a container does hang, there is no way to find out more info because a user does not know where each attempt is running in. For each running application, it is useful to differentiate between running/succeeded/failed/killed containers. {code:title=proposed yarn cli} $ yarn application -list-containers -applicationId appId [-containerState state of container] where containerState is optional filter to list container in given state only. container state can be running/succeeded/killed/failed/all. A user can specify more than one container state at once e.g. KILLED,FAILED. task attempt ID container ID host of container state of container {code} CLI should work with running application/completed application. If a container runs many task attempts, all attempts should be shown. That will likely be the case of Tez container-reuse application. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-1621) Add CLI to list rows of task attempt ID, container ID, host of container, state of container
[ https://issues.apache.org/jira/browse/YARN-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bartosz Ługowski updated YARN-1621: --- Attachment: (was: YARN-1621.3.patch) Add CLI to list rows of task attempt ID, container ID, host of container, state of container -- Key: YARN-1621 URL: https://issues.apache.org/jira/browse/YARN-1621 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.2.0 Reporter: Tassapol Athiapinya Fix For: 2.7.0 Attachments: YARN-1621.1.patch, YARN-1621.2.patch, YARN-1621.3.patch As more applications are moved to YARN, we need generic CLI to list rows of task attempt ID, container ID, host of container, state of container. Today if YARN application running in a container does hang, there is no way to find out more info because a user does not know where each attempt is running in. For each running application, it is useful to differentiate between running/succeeded/failed/killed containers. {code:title=proposed yarn cli} $ yarn application -list-containers -applicationId appId [-containerState state of container] where containerState is optional filter to list container in given state only. container state can be running/succeeded/killed/failed/all. A user can specify more than one container state at once e.g. KILLED,FAILED. task attempt ID container ID host of container state of container {code} CLI should work with running application/completed application. If a container runs many task attempts, all attempts should be shown. That will likely be the case of Tez container-reuse application. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1
[ https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14314748#comment-14314748 ] Sangjin Lee commented on YARN-2928: --- bq. How is a workflow defined when an entity has 2 parents? Considering the tez-hive example, do you agree that both a Hive Query and a Tez application are workflows and share some entities? Can it be handled as a flow of flows as described in the design? For instance, tez application -- hive queries -- YARN apps? Or does it not capture the relationship? Application Timeline Server (ATS) next gen: phase 1 --- Key: YARN-2928 URL: https://issues.apache.org/jira/browse/YARN-2928 Project: Hadoop YARN Issue Type: New Feature Components: timelineserver Reporter: Sangjin Lee Assignee: Sangjin Lee Priority: Critical Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal v1.pdf We have the application timeline server implemented in yarn per YARN-1530 and YARN-321. Although it is a great feature, we have recognized several critical issues and features that need to be addressed. This JIRA proposes the design and implementation changes to address those. This is phase 1 of this effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3034) implement RM starting its ATS writer
[ https://issues.apache.org/jira/browse/YARN-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14314763#comment-14314763 ] Sangjin Lee commented on YARN-3034: --- Thanks [~Naganarasimha]! I'll go over the patch today... bq. Whether we require Multithreaded Dispatcher as we are not publishing container life cycle events and if normal dispatcher is ok whether to use rmcontext.getDispatcher ? For publishing app lifecycle events only, I suspect a normal dispatcher might be OK. However, there could be more use cases in the future. If it is not too complicated, using a multi-threaded dispatcher might be a bit preferable IMO. Thoughts? bq. AppAttempt needs to be Entity or event of ApplicationEntity ? i feel later option is better How is it today with the current ATS? If the same container can be part of different app attempts (e.g. successive AMs managing the same set of containers), then app attempts can't be separate entities? [~zjshen]? implement RM starting its ATS writer Key: YARN-3034 URL: https://issues.apache.org/jira/browse/YARN-3034 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Naganarasimha G R Attachments: YARN-3034.20150205-1.patch Per design in YARN-2928, implement resource managers starting their own ATS writers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
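Either way, the handler registration looks the same from the RM side. A rough sketch with the standard AsyncDispatcher; the timelineAggregator.publish() hook is hypothetical, and the event-type/handler wiring is illustrative only:
{code}
// Rough sketch: routing app lifecycle events through a dispatcher.
AsyncDispatcher dispatcher = new AsyncDispatcher();
dispatcher.register(SystemMetricsEventType.class,
    new EventHandler<SystemMetricsEvent>() {
      @Override
      public void handle(SystemMetricsEvent event) {
        timelineAggregator.publish(event);  // hypothetical publish hook
      }
    });
dispatcher.init(conf);
dispatcher.start();
{code}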
[jira] [Commented] (YARN-3074) Nodemanager dies when localizer runner tries to write to a full disk
[ https://issues.apache.org/jira/browse/YARN-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14314833#comment-14314833 ] Varun Saxena commented on YARN-3074: [~jlowe], kindly review Nodemanager dies when localizer runner tries to write to a full disk Key: YARN-3074 URL: https://issues.apache.org/jira/browse/YARN-3074 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.5.0 Reporter: Jason Lowe Assignee: Varun Saxena Attachments: YARN-3074.001.patch, YARN-3074.002.patch, YARN-3074.03.patch When a LocalizerRunner tries to write to a full disk it can bring down the nodemanager process. Instead of failing the whole process we should fail only the container and make a best attempt to keep going. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3034) implement RM starting its ATS writer
[ https://issues.apache.org/jira/browse/YARN-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14314797#comment-14314797 ] Sangjin Lee commented on YARN-3034: --- Some feedback on the patch...
(1) this creates a dependency from RM to the timeline service; perhaps it is unavoidable...
(2) RMTimelineAggregator.java
- we need the license
- annotate with @Private and @Unstable
- line 31: nit; spacing
(3) SystemMetricsPublisher.java
- instead of replacing the use of the existing ATS, I think we need to have both (the existing ATS calls as well as the new calls); we will need a global config that enables/disables the next gen timeline service
implement RM starting its ATS writer Key: YARN-3034 URL: https://issues.apache.org/jira/browse/YARN-3034 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Naganarasimha G R Attachments: YARN-3034.20150205-1.patch Per design in YARN-2928, implement resource managers starting their own ATS writers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
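The dual-publishing suggestion in (3) would look roughly like this. The v2 config key and the two publish methods are placeholders; only the v1 TIMELINE_SERVICE_ENABLED key exists in YarnConfiguration today:
{code}
// Sketch of dual publishing guarded by config; "yarn.timeline-service
// .v2.enabled" is a placeholder for whatever YARN-2928 ends up defining.
boolean v1Enabled = conf.getBoolean(
    YarnConfiguration.TIMELINE_SERVICE_ENABLED,
    YarnConfiguration.DEFAULT_TIMELINE_SERVICE_ENABLED);
boolean v2Enabled = conf.getBoolean("yarn.timeline-service.v2.enabled", false);

if (v1Enabled) {
  publishToCurrentATS(event);   // existing ATS calls stay in place
}
if (v2Enabled) {
  publishToNextGenATS(event);   // new aggregator path
}
{code}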
[jira] [Commented] (YARN-3074) Nodemanager dies when localizer runner tries to write to a full disk
[ https://issues.apache.org/jira/browse/YARN-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14314830#comment-14314830 ] Hadoop QA commented on YARN-3074: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12697846/YARN-3074.03.patch against trunk revision 3f5431a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6584//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6584//console This message is automatically generated. Nodemanager dies when localizer runner tries to write to a full disk Key: YARN-3074 URL: https://issues.apache.org/jira/browse/YARN-3074 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.5.0 Reporter: Jason Lowe Assignee: Varun Saxena Attachments: YARN-3074.001.patch, YARN-3074.002.patch, YARN-3074.03.patch When a LocalizerRunner tries to write to a full disk it can bring down the nodemanager process. Instead of failing the whole process we should fail only the container and make a best attempt to keep going. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1621) Add CLI to list rows of task attempt ID, container ID, host of container, state of container
[ https://issues.apache.org/jira/browse/YARN-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14314857#comment-14314857 ] Hadoop QA commented on YARN-1621: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12697849/YARN-1621.3.patch against trunk revision 3f5431a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6585//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6585//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-client.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6585//console This message is automatically generated. Add CLI to list rows of task attempt ID, container ID, host of container, state of container -- Key: YARN-1621 URL: https://issues.apache.org/jira/browse/YARN-1621 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.2.0 Reporter: Tassapol Athiapinya Fix For: 2.7.0 Attachments: YARN-1621.1.patch, YARN-1621.2.patch, YARN-1621.3.patch As more applications are moved to YARN, we need generic CLI to list rows of task attempt ID, container ID, host of container, state of container. Today if YARN application running in a container does hang, there is no way to find out more info because a user does not know where each attempt is running in. For each running application, it is useful to differentiate between running/succeeded/failed/killed containers. {code:title=proposed yarn cli} $ yarn application -list-containers -applicationId appId [-containerState state of container] where containerState is optional filter to list container in given state only. container state can be running/succeeded/killed/failed/all. A user can specify more than one container state at once e.g. KILLED,FAILED. task attempt ID container ID host of container state of container {code} CLI should work with running application/completed application. If a container runs many task attempts, all attempts should be shown. That will likely be the case of Tez container-reuse application. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1621) Add CLI to list rows of task attempt ID, container ID, host of container, state of container
[ https://issues.apache.org/jira/browse/YARN-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14314892#comment-14314892 ] Wangda Tan commented on YARN-1621: -- Assigned to [~noddi]. Add CLI to list rows of task attempt ID, container ID, host of container, state of container -- Key: YARN-1621 URL: https://issues.apache.org/jira/browse/YARN-1621 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.2.0 Reporter: Tassapol Athiapinya Assignee: Bartosz Ługowski Fix For: 2.7.0 Attachments: YARN-1621.1.patch, YARN-1621.2.patch, YARN-1621.3.patch As more applications are moved to YARN, we need generic CLI to list rows of task attempt ID, container ID, host of container, state of container. Today if YARN application running in a container does hang, there is no way to find out more info because a user does not know where each attempt is running in. For each running application, it is useful to differentiate between running/succeeded/failed/killed containers. {code:title=proposed yarn cli} $ yarn application -list-containers -applicationId appId [-containerState state of container] where containerState is optional filter to list container in given state only. container state can be running/succeeded/killed/failed/all. A user can specify more than one container state at once e.g. KILLED,FAILED. task attempt ID container ID host of container state of container {code} CLI should work with running application/completed application. If a container runs many task attempts, all attempts should be shown. That will likely be the case of Tez container-reuse application. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-1621) Add CLI to list rows of task attempt ID, container ID, host of container, state of container
[ https://issues.apache.org/jira/browse/YARN-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-1621: - Assignee: Bartosz Ługowski Add CLI to list rows of task attempt ID, container ID, host of container, state of container -- Key: YARN-1621 URL: https://issues.apache.org/jira/browse/YARN-1621 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.2.0 Reporter: Tassapol Athiapinya Assignee: Bartosz Ługowski Fix For: 2.7.0 Attachments: YARN-1621.1.patch, YARN-1621.2.patch, YARN-1621.3.patch As more applications are moved to YARN, we need generic CLI to list rows of task attempt ID, container ID, host of container, state of container. Today if YARN application running in a container does hang, there is no way to find out more info because a user does not know where each attempt is running in. For each running application, it is useful to differentiate between running/succeeded/failed/killed containers. {code:title=proposed yarn cli} $ yarn application -list-containers -applicationId appId [-containerState state of container] where containerState is optional filter to list container in given state only. container state can be running/succeeded/killed/failed/all. A user can specify more than one container state at once e.g. KILLED,FAILED. task attempt ID container ID host of container state of container {code} CLI should work with running application/completed application. If a container runs many task attempts, all attempts should be shown. That will likely be the case of Tez container-reuse application. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-1237) Description for yarn.nodemanager.aux-services in yarn-default.xml is misleading
[ https://issues.apache.org/jira/browse/YARN-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula reassigned YARN-1237: -- Assignee: Brahma Reddy Battula Description for yarn.nodemanager.aux-services in yarn-default.xml is misleading --- Key: YARN-1237 URL: https://issues.apache.org/jira/browse/YARN-1237 Project: Hadoop YARN Issue Type: Bug Components: documentation Reporter: Hitesh Shah Assignee: Brahma Reddy Battula Priority: Minor The description states: "the valid service name should only contain a-zA-Z0-9_ and can not start with numbers". It seems to indicate that only one service is supported. If multiple services are allowed, it does not indicate how they should be specified (i.e., comma-separated or space-separated). If the service name cannot contain spaces, does this imply that space-separated lists are also permitted? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
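For reference, in Hadoop 2.x the property takes a comma-separated list of service names, each paired with its own class property. A small programmatic example; the spark_shuffle entry is just a common second service:
{code}
// Example of configuring two aux services. Each name must match
// [a-zA-Z0-9_]+ and not start with a digit, and each name then
// gets its own ".class" key.
Configuration conf = new YarnConfiguration();
conf.set("yarn.nodemanager.aux-services", "mapreduce_shuffle,spark_shuffle");
conf.set("yarn.nodemanager.aux-services.mapreduce_shuffle.class",
    "org.apache.hadoop.mapred.ShuffleHandler");
conf.set("yarn.nodemanager.aux-services.spark_shuffle.class",
    "org.apache.spark.network.yarn.YarnShuffleService");
{code}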
[jira] [Assigned] (YARN-3169) drop the useless yarn overview document
[ https://issues.apache.org/jira/browse/YARN-3169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula reassigned YARN-3169: -- Assignee: Brahma Reddy Battula drop the useless yarn overview document --- Key: YARN-3169 URL: https://issues.apache.org/jira/browse/YARN-3169 Project: Hadoop YARN Issue Type: Improvement Components: documentation Reporter: Allen Wittenauer Assignee: Brahma Reddy Battula It's pretty superfluous given there is a site index on the left. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1
[ https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14315665#comment-14315665 ] Zhijie Shen commented on YARN-2928: --- bq. I am assuming you are already aware of YARN-2423 and plan to maintain compatibility The data models of the current and next-gen TS are likely to be different. To be compatible with the old data model, we probably need to change the existing timeline client to convert the old entity to the new one. bq. We should have such a configuration that disables the timeline service globally. I think it's also good to have a per-app flag. If the app is configured not to use the timeline service, we don't need to start the per-app aggregator. bq. My point related to events was not about a new interesting feature but to generally understand what use case is meant to be solved by events and how should an application developer use events? I thought you meant using a publisher/subscriber architecture, such as Kafka, to consume the incoming event streams. Other than that, IMHO, we still need to support the existing query of getting the stored events of a set of entities. Application Timeline Server (ATS) next gen: phase 1 --- Key: YARN-2928 URL: https://issues.apache.org/jira/browse/YARN-2928 Project: Hadoop YARN Issue Type: New Feature Components: timelineserver Reporter: Sangjin Lee Assignee: Sangjin Lee Priority: Critical Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal v1.pdf We have the application timeline server implemented in yarn per YARN-1530 and YARN-321. Although it is a great feature, we have recognized several critical issues and features that need to be addressed. This JIRA proposes the design and implementation changes to address those. This is phase 1 of this effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
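To make the compatibility question concrete, the conversion layer mentioned above would essentially be a one-way mapping from the v1 entity onto whatever the v2 model defines. An entirely hypothetical sketch; no TimelineEntityV2 class exists yet:
{code}
// Entirely hypothetical sketch of the compatibility conversion discussed
// above; TimelineEntityV2 and its setters are placeholders, not real API.
TimelineEntityV2 convert(TimelineEntity old) {
  TimelineEntityV2 converted = new TimelineEntityV2();
  converted.setType(old.getEntityType());
  converted.setId(old.getEntityId());
  // v1 events and primary filters would be mapped onto the v2 model here;
  // fields with no v2 counterpart would be dropped or kept as opaque info.
  return converted;
}
{code}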
[jira] [Created] (YARN-3171) Sort by application id don't work in ATS web ui
Jeff Zhang created YARN-3171: Summary: Sort by application id don't work in ATS web ui Key: YARN-3171 URL: https://issues.apache.org/jira/browse/YARN-3171 Project: Hadoop YARN Issue Type: Bug Components: timelineserver Affects Versions: 2.6.0 Reporter: Jeff Zhang The order doesn't change when I click the column header -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3171) Sort by application id doesn't work in ATS web ui
[ https://issues.apache.org/jira/browse/YARN-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated YARN-3171: - Summary: Sort by application id doesn't work in ATS web ui (was: Sort by application id don't work in ATS web ui) Sort by application id doesn't work in ATS web ui - Key: YARN-3171 URL: https://issues.apache.org/jira/browse/YARN-3171 Project: Hadoop YARN Issue Type: Bug Components: timelineserver Affects Versions: 2.6.0 Reporter: Jeff Zhang The order doesn't change when I click the column header -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3171) Sort by application id doesn't work in ATS web ui
[ https://issues.apache.org/jira/browse/YARN-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated YARN-3171: - Priority: Minor (was: Major) Sort by application id doesn't work in ATS web ui - Key: YARN-3171 URL: https://issues.apache.org/jira/browse/YARN-3171 Project: Hadoop YARN Issue Type: Bug Components: timelineserver Affects Versions: 2.6.0 Reporter: Jeff Zhang Assignee: Naganarasimha G R Priority: Minor Attachments: ats_webui.png The order doesn't change when I click the column header -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3171) Sort by application id doesn't work in ATS web ui
[ https://issues.apache.org/jira/browse/YARN-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14315676#comment-14315676 ] Naganarasimha G R commented on YARN-3171: - Hi [~jeffzhang], I wish to work on this jira and hence have assigned it to myself. If you want to work on it or already have a patch, feel free to reassign. Sort by application id doesn't work in ATS web ui - Key: YARN-3171 URL: https://issues.apache.org/jira/browse/YARN-3171 Project: Hadoop YARN Issue Type: Bug Components: timelineserver Affects Versions: 2.6.0 Reporter: Jeff Zhang Assignee: Naganarasimha G R Priority: Minor Attachments: ats_webui.png The order doesn't change when I click the column header -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3168) Convert site documentation from apt to markdown
[ https://issues.apache.org/jira/browse/YARN-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14315598#comment-14315598 ] Gururaj Shetty commented on YARN-3168: -- I would like to take up this task. Kindly assign it to me. Convert site documentation from apt to markdown --- Key: YARN-3168 URL: https://issues.apache.org/jira/browse/YARN-3168 Project: Hadoop YARN Issue Type: Improvement Components: documentation Affects Versions: 3.0.0 Reporter: Allen Wittenauer YARN analog to HADOOP-11495 -- This message was sent by Atlassian JIRA (v6.3.4#6332)