[jira] [Commented] (YARN-2684) FairScheduler should tolerate queue configuration changes across RM restarts

2015-05-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14532353#comment-14532353
 ] 

Hadoop QA commented on YARN-2684:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 33s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 30s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 36s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 46s | The applied patch generated  1 
new checkstyle issues (total was 74, now 75). |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 14s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |  52m 33s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  88m 45s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12731104/0002-YARN-2684.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 305e473 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7757/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/7757/artifact/patchprocess/whitespace.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7757/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7757/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7757/console |


This message was automatically generated.

 FairScheduler should tolerate queue configuration changes across RM restarts
 

 Key: YARN-2684
 URL: https://issues.apache.org/jira/browse/YARN-2684
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler, resourcemanager
Affects Versions: 2.5.1
Reporter: Karthik Kambatla
Assignee: Rohith
Priority: Critical
 Attachments: 0001-YARN-2684.patch, 0002-YARN-2684.patch


 YARN-2308 fixes this issue for CS, this JIRA is to fix it for FS. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1564) add some basic workflow YARN services

2015-05-07 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14532354#comment-14532354
 ] 

Steve Loughran commented on YARN-1564:
--

The sync-on-AtomicBoolean warnings are spurious; the code is simply using the (final) 
objects as something to wait/notify on, to signal status updates within the 
atomic values themselves. Nothing wrong with that - it avoids having a separate 
lock object and makes it obvious what things are waiting on. Presumably the check 
is there for people who don't know what they are doing.
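
As a minimal sketch of the pattern being defended here (an illustration under that assumption, not code taken from the patch), the final AtomicBoolean carries the state and doubles as the monitor that other threads wait/notify on:

{code}
import java.util.concurrent.atomic.AtomicBoolean;

public class TerminationFlag {
  // final: safe to publish and safe to synchronize on
  private final AtomicBoolean terminated = new AtomicBoolean(false);

  public void markTerminated() {
    terminated.set(true);
    synchronized (terminated) {
      terminated.notifyAll();   // wake anyone blocked in awaitTermination()
    }
  }

  public void awaitTermination() throws InterruptedException {
    synchronized (terminated) {
      while (!terminated.get()) {
        terminated.wait(1000);  // periodic re-check guards against a missed notify
      }
    }
  }
}
{code}

This is exactly the construct FindBugs flags (synchronizing on a java.util.concurrent object), even though the usage is deliberate here.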

 add some basic workflow YARN services
 -

 Key: YARN-1564
 URL: https://issues.apache.org/jira/browse/YARN-1564
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: api
Affects Versions: 2.4.0
Reporter: Steve Loughran
Assignee: Steve Loughran
Priority: Minor
 Attachments: YARN-1564-001.patch, YARN-1564-002.patch

   Original Estimate: 24h
  Time Spent: 48h
  Remaining Estimate: 0h

 I've been using some alternative composite services to help build workflows 
 of process execution in a YARN AM.
 They and their tests could be moved in YARN for the use by others -this would 
 make it easier to build aggregate services in an AM



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS

2015-05-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14532150#comment-14532150
 ] 

Hadoop QA commented on YARN-3044:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m  6s | Pre-patch YARN-2928 compilation 
is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 3 new or modified test files. |
| {color:green}+1{color} | javac |   7m 42s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 49s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   1m 24s | The applied patch generated 
15 release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 19s | The applied patch generated  5 
new checkstyle issues (total was 266, now 270). |
| {color:green}+1{color} | whitespace |   0m  2s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 35s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m 17s | The patch appears to introduce 
15 new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   0m 25s | Tests passed in 
hadoop-yarn-api. |
| {color:red}-1{color} | yarn tests |  53m 37s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| {color:green}+1{color} | yarn tests |   0m 21s | Tests passed in 
hadoop-yarn-server-timelineservice. |
| | |  95m 35s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-api |
|  |  
org.apache.hadoop.yarn.api.records.timelineservice.TimelineMetric$1.compare(Long,
 Long) negates the return value of Long.compareTo(Long)  At 
TimelineMetric.java:value of Long.compareTo(Long)  At TimelineMetric.java:[line 
47] |
| FindBugs | module:hadoop-yarn-server-resourcemanager |
|  |  Unchecked/unconfirmed cast from 
org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsEvent to 
org.apache.hadoop.yarn.server.resourcemanager.metrics.AppAttemptFinishedEvent 
in 
org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher.handle(SystemMetricsEvent)
  At 
TimelineServiceV1Publisher.java:org.apache.hadoop.yarn.server.resourcemanager.metrics.AppAttemptFinishedEvent
 in 
org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher.handle(SystemMetricsEvent)
  At TimelineServiceV1Publisher.java:[line 103] |
|  |  Unchecked/unconfirmed cast from 
org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsEvent to 
org.apache.hadoop.yarn.server.resourcemanager.metrics.AppAttemptRegisteredEvent 
in 
org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher.handle(SystemMetricsEvent)
  At 
TimelineServiceV1Publisher.java:org.apache.hadoop.yarn.server.resourcemanager.metrics.AppAttemptRegisteredEvent
 in 
org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher.handle(SystemMetricsEvent)
  At TimelineServiceV1Publisher.java:[line 100] |
|  |  Unchecked/unconfirmed cast from 
org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsEvent to 
org.apache.hadoop.yarn.server.resourcemanager.metrics.ApplicationACLsUpdatedEvent
 in 
org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher.handle(SystemMetricsEvent)
  At 
TimelineServiceV1Publisher.java:org.apache.hadoop.yarn.server.resourcemanager.metrics.ApplicationACLsUpdatedEvent
 in 
org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher.handle(SystemMetricsEvent)
  At TimelineServiceV1Publisher.java:[line 97] |
|  |  Unchecked/unconfirmed cast from 
org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsEvent to 
org.apache.hadoop.yarn.server.resourcemanager.metrics.ApplicationCreatedEvent 
in 
org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher.handle(SystemMetricsEvent)
  At 
TimelineServiceV1Publisher.java:org.apache.hadoop.yarn.server.resourcemanager.metrics.ApplicationCreatedEvent
 in 
org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher.handle(SystemMetricsEvent)
  At TimelineServiceV1Publisher.java:[line 91] |
|  |  Unchecked/unconfirmed cast from 
org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsEvent to 
org.apache.hadoop.yarn.server.resourcemanager.metrics.ApplicationFinishedEvent 
in 
org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher.handle(SystemMetricsEvent)
  At 
TimelineServiceV1Publisher.java:org.apache.hadoop.yarn.server.resourcemanager.metrics.ApplicationFinishedEvent
 in 

[jira] [Updated] (YARN-2684) FairScheduler should tolerate queue configuration changes across RM restarts

2015-05-07 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-2684:
-
Attachment: 0002-YARN-2684.patch

[~xgong], thanks for looking at the patch.
Rebased the patch and updated it. Kindly review the updated patch.

 FairScheduler should tolerate queue configuration changes across RM restarts
 

 Key: YARN-2684
 URL: https://issues.apache.org/jira/browse/YARN-2684
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler, resourcemanager
Affects Versions: 2.5.1
Reporter: Karthik Kambatla
Assignee: Rohith
Priority: Critical
 Attachments: 0001-YARN-2684.patch, 0002-YARN-2684.patch


 YARN-2308 fixes this issue for CS, this JIRA is to fix it for FS. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3589) RM and AH web UI display DOCTYPE

2015-05-07 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-3589:
-
Attachment: 0001-YARN-3589.patch

Attached the patch fixing the issue. Kindly review the patch

 RM and AH web UI display DOCTYPE
 

 Key: YARN-3589
 URL: https://issues.apache.org/jira/browse/YARN-3589
 Project: Hadoop YARN
  Issue Type: Bug
  Components: webapp
Affects Versions: 2.8.0
Reporter: Rohith
Assignee: Rohith
 Attachments: 0001-YARN-3589.patch, YARN-3589.PNG


 RM web app UI displays {{!DOCTYPE html PUBLIC -\/\/W3C\/\/DTD HTML 
 4.01\/\/EN http:\/\/www.w3.org\/TR\/html4\/strict.dtd}}, which is not 
 necessary.
 This is because the content of the HTML page is escaped, which the browser 
 then cannot parse. Any content that is escaped should be within the html 
 block, but the doctype sits above the html element, so the browser cannot 
 parse it and shows it as text.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2684) FairScheduler should tolerate queue configuration changes across RM restarts

2015-05-07 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-2684:
-
Labels: BB2015-05-TBR  (was: )

 FairScheduler should tolerate queue configuration changes across RM restarts
 

 Key: YARN-2684
 URL: https://issues.apache.org/jira/browse/YARN-2684
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler, resourcemanager
Affects Versions: 2.5.1
Reporter: Karthik Kambatla
Assignee: Rohith
Priority: Critical
  Labels: BB2015-05-TBR
 Attachments: 0001-YARN-2684.patch, 0002-YARN-2684.patch


 YARN-2308 fixes this issue for CS, this JIRA is to fix it for FS. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3577) Misspelling of threshold in log4j.properties for tests

2015-05-07 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated YARN-3577:

Affects Version/s: 2.7.0
   Labels:   (was: BB2015-05-TBR)

 Misspelling of threshold in log4j.properties for tests
 --

 Key: YARN-3577
 URL: https://issues.apache.org/jira/browse/YARN-3577
 Project: Hadoop YARN
  Issue Type: Bug
  Components: test
Affects Versions: 2.7.0
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
Priority: Minor
 Fix For: 2.8.0

 Attachments: YARN-3577.patch


 log4j.properties file for test contains misspelling log4j.threshhold.
 We should use log4j.threshold correctly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-868) YarnClient should set the service address in tokens returned by getRMDelegationToken()

2015-05-07 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14532250#comment-14532250
 ] 

Varun Saxena commented on YARN-868:
---

Thanks [~djp] and [~gtCarrera9] for reviewing this.

To give you some background, refer to the linked [file in 
Tez|https://github.com/apache/tez/blob/master/tez-mapreduce/src/main/java/org/apache/tez/mapreduce/client/ResourceMgrDelegate.java].
As you can see there, Tez needs the service address while getting the delegation 
token. This address is currently taken from the config 
{{yarn.resourcemanager.address}} instead of the actual RM address that 
returned the token in the HA case.
I am not aware of what Tez does with this service address, though; 
Hitesh can probably throw some light on that.

Coming to other comments,
bq. Do we want to mark the newly added getServiceAddress to be public and 
stable (especially when we have a private and unstable setter)?
True, I will mark it as public and unstable.

bq.Since we're adding a new field in tokens, do you think it's worthwhile to 
add special test cases on the system working with old token formats? We may not 
want to simply exempt them in the tests but define a standard behavior for 
them. Personally, I think this is important to keep rolling upgrade feature 
safe.
The change to the token applies only to the client side. I haven't changed TokenProto, 
so the token transferred over the wire remains the same. I did not create a new 
class to incorporate this change, in order to keep the API backwards compatible. Let me 
know if you feel differently.

Will fix checkstyle and whitespace issues.
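
For illustration, here is a hedged sketch of the client-side idea described above (the helper class is hypothetical and this is not necessarily the exact patch; only the Hadoop Token/SecurityUtil calls are standard API):

{code}
import java.net.InetSocketAddress;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.SecurityUtil;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.security.token.TokenIdentifier;

public final class RMTokenServiceHelper {
  private RMTokenServiceHelper() {}

  /**
   * Stamp the address of the RM that actually issued the token into the
   * token's service field, rather than relying on yarn.resourcemanager.address.
   */
  public static <T extends TokenIdentifier> void setService(
      Token<T> rmDelegationToken, InetSocketAddress issuingRMAddress) {
    Text service = SecurityUtil.buildTokenService(issuingRMAddress);
    rmDelegationToken.setService(service);  // client-side only; TokenProto is unchanged
  }
}
{code}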

 YarnClient should set the service address in tokens returned by 
 getRMDelegationToken()
 --

 Key: YARN-868
 URL: https://issues.apache.org/jira/browse/YARN-868
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Varun Saxena
  Labels: BB2015-05-TBR
 Attachments: YARN-868.patch


 Either the client should set this information into the token or the client 
 layer should expose an api that returns the service address.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3589) RM and AH web UI display DOCTYPE

2015-05-07 Thread Rohith (JIRA)
Rohith created YARN-3589:


 Summary: RM and AH web UI display DOCTYPE
 Key: YARN-3589
 URL: https://issues.apache.org/jira/browse/YARN-3589
 Project: Hadoop YARN
  Issue Type: Bug
  Components: webapp
Affects Versions: 2.8.0
Reporter: Rohith
Assignee: Rohith


RM web app UI displays {{!DOCTYPE html PUBLIC -\/\/W3C\/\/DTD HTML 4.01\/\/EN 
http:\/\/www.w3.org\/TR\/html4\/strict.dtd}}, which is not necessary.

This is because the content of the HTML page is escaped, which the browser then 
cannot parse. Any content that is escaped should be within the html block, but the 
doctype sits above the html element, so the browser cannot parse it and shows it as text.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-679) add an entry point that can start any Yarn service

2015-05-07 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14532307#comment-14532307
 ] 

Steve Loughran commented on YARN-679:
-

I haven't gone near this recently; I'd like the workflow stuff to go in first 
as then they can hook up in order

 add an entry point that can start any Yarn service
 --

 Key: YARN-679
 URL: https://issues.apache.org/jira/browse/YARN-679
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: api
Affects Versions: 2.4.0
Reporter: Steve Loughran
Assignee: Steve Loughran
  Labels: BB2015-05-TBR
 Attachments: YARN-679-001.patch, YARN-679-002.patch, 
 YARN-679-002.patch, YARN-679-003.patch, org.apache.hadoop.servic...mon 
 3.0.0-SNAPSHOT API).pdf

  Time Spent: 72h
  Remaining Estimate: 0h

 There's no need to write separate .main classes for every Yarn service, given 
 that the startup mechanism should be identical: create, init, start, wait for 
 stopped - with an interrupt handler to trigger a clean shutdown on a control-c 
 interrupt.
 Provide one that takes any classname, and a list of config files/options.
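
A minimal sketch of that idea, assuming the standard org.apache.hadoop.service.Service lifecycle (the class name is hypothetical and this is not the attached patch):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.service.Service;

public class ServiceMain {
  public static void main(String[] args) throws Exception {
    // args[0]: service class name; remaining args: config resources to load
    final Service service = (Service) Class.forName(args[0]).newInstance();
    Configuration conf = new Configuration();
    for (int i = 1; i < args.length; i++) {
      conf.addResource(args[i]);
    }
    // trigger a clean shutdown on control-c
    Runtime.getRuntime().addShutdownHook(new Thread() {
      @Override
      public void run() {
        service.stop();
      }
    });
    service.init(conf);
    service.start();
    service.waitForServiceToStop(0);  // block until the service stops
  }
}
{code}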



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3589) RM and AH web UI display DOCTYPE

2015-05-07 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-3589:
-
Attachment: YARN-3589.PNG

 RM and AH web UI display DOCTYPE
 

 Key: YARN-3589
 URL: https://issues.apache.org/jira/browse/YARN-3589
 Project: Hadoop YARN
  Issue Type: Bug
  Components: webapp
Affects Versions: 2.8.0
Reporter: Rohith
Assignee: Rohith
 Attachments: YARN-3589.PNG


 RM web app UI displays {{!DOCTYPE html PUBLIC -\/\/W3C\/\/DTD HTML 
 4.01\/\/EN http:\/\/www.w3.org\/TR\/html4\/strict.dtd}}, which is not 
 necessary.
 This is because the content of the HTML page is escaped, which the browser 
 then cannot parse. Any content that is escaped should be within the html 
 block, but the doctype sits above the html element, so the browser cannot 
 parse it and shows it as text.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3590) Changing the attribute maxRunningApps of a FairScheduler queue should take effect for applications in the queue, but does not

2015-05-07 Thread zhoulinlin (JIRA)
zhoulinlin created YARN-3590:


 Summary: Changing the attribute maxRunningApps of a FairScheduler 
queue should take effect for applications in the queue, but does not
 Key: YARN-3590
 URL: https://issues.apache.org/jira/browse/YARN-3590
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.5.2
Reporter: zhoulinlin


After changing the queue attribute maxRunningApps for FairScheduler and then 
refreshing the queues, the change should affect applications already in the queue 
immediately, but it does not. It only takes effect once another application leaves 
this queue.

For example: with maxRunningApps set to 0, submit an application A to this queue; 
it cannot run. Then change maxRunningApps from 0 to 2 and refresh the queues; 
application A should run, but it does not. If you submit another application B, 
application A only runs once application B completes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2442) ResourceManager JMX UI does not give HA State

2015-05-07 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-2442:
-
Attachment: 0001-YARN-2442.patch

Attached the patch without a test. Verified the patch on a cluster by retrieving 
the HA state using JMX.

Kindly review the patch

 ResourceManager JMX UI does not give HA State
 -

 Key: YARN-2442
 URL: https://issues.apache.org/jira/browse/YARN-2442
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.5.0, 2.6.0, 2.7.0
Reporter: Nishan Shetty
Assignee: Rohith
  Labels: BB2015-05-TBR
 Attachments: 0001-YARN-2442.patch


 ResourceManager JMX UI can show the haState (INITIALIZING, ACTIVE, STANDBY, 
 STOPPED)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2442) ResourceManager JMX UI does not give HA State

2015-05-07 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-2442:
-
Labels: BB2015-05-TBR  (was: )

 ResourceManager JMX UI does not give HA State
 -

 Key: YARN-2442
 URL: https://issues.apache.org/jira/browse/YARN-2442
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.5.0, 2.6.0, 2.7.0
Reporter: Nishan Shetty
Assignee: Rohith
  Labels: BB2015-05-TBR
 Attachments: 0001-YARN-2442.patch


 ResourceManager JMX UI can show the haState (INITIALIZING, ACTIVE, STANDBY, 
 STOPPED)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3547) FairScheduler: Apps that have no resource demand should not participate scheduling

2015-05-07 Thread Xianyin Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14532131#comment-14532131
 ] 

Xianyin Xin commented on YARN-3547:
---

Thanks for your feedback [~vinodkv]. We use 
SchedulerApplicationAttempt.getAppAttemptResourceUsage().getPending(), not 
hasPendingResourceRequest(). In fact, I don't quite understand what you mean: 
attemptResourceUsage records the resource information of an attempt regardless of 
whether it is scheduled by CS or Fair. Would you please explain your concern?
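
A hedged sketch of the check being discussed (the helper class is hypothetical; only getAppAttemptResourceUsage().getPending() is taken from the comment above, and this is not the actual YARN-3547 patch):

{code}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt;

public final class DemandCheck {
  private DemandCheck() {}

  /**
   * True if the attempt has outstanding resource demand; apps for which this
   * returns false could be skipped when sorting/assigning, regardless of
   * whether CS or Fair owns the attempt.
   */
  public static boolean hasPendingDemand(SchedulerApplicationAttempt attempt) {
    Resource pending = attempt.getAppAttemptResourceUsage().getPending();
    return pending.getMemory() > 0 || pending.getVirtualCores() > 0;
  }
}
{code}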

 FairScheduler: Apps that have no resource demand should not participate 
 scheduling
 --

 Key: YARN-3547
 URL: https://issues.apache.org/jira/browse/YARN-3547
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Reporter: Xianyin Xin
Assignee: Xianyin Xin
  Labels: BB2015-05-TBR
 Attachments: YARN-3547.001.patch, YARN-3547.002.patch, 
 YARN-3547.003.patch, YARN-3547.004.patch


 At present, all of the 'running' apps participate in the scheduling process; 
 however, most of them may have no resource demand on a production cluster, 
 since an app spends most of its lifetime running rather than waiting for 
 resources. It is not wise to sort all the 'running' apps and try to fulfill 
 them, especially on a large-scale cluster which has a heavy scheduling load. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)



[jira] [Commented] (YARN-3505) Node's Log Aggregation Report with SUCCEED should not be cached in RMApps

2015-05-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14532167#comment-14532167
 ] 

Hadoop QA commented on YARN-3505:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 37s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 30s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 36s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 13s | The applied patch generated  4 
new checkstyle issues (total was 71, now 66). |
| {color:red}-1{color} | whitespace |   0m 21s | The patch has 2  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 37s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m  9s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   0m 25s | Tests passed in 
hadoop-yarn-server-common. |
| {color:green}+1{color} | yarn tests |   5m 50s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| {color:red}-1{color} | yarn tests |  62m 48s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | | 108m 28s | |
\\
\\
|| Reason || Tests ||
| Timed out tests | 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation
 |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12731082/YARN-3505.2.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 918af8e |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7755/artifact/patchprocess/diffcheckstylehadoop-yarn-server-common.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/7755/artifact/patchprocess/whitespace.txt
 |
| hadoop-yarn-server-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7755/artifact/patchprocess/testrun_hadoop-yarn-server-common.txt
 |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7755/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7755/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7755/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7755/console |


This message was automatically generated.

 Node's Log Aggregation Report with SUCCEED should not be cached in RMApps
 --

 Key: YARN-3505
 URL: https://issues.apache.org/jira/browse/YARN-3505
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: log-aggregation
Affects Versions: 2.8.0
Reporter: Junping Du
Assignee: Xuan Gong
Priority: Critical
 Attachments: YARN-3505.1.patch, YARN-3505.2.patch


 Per discussions in YARN-1402, we shouldn't cache all nodes' log aggregation 
 reports in RMApps forever, especially for those finished with SUCCEED.
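
As a rough illustration of the intent (an assumption, not the attached patch; the class and map are hypothetical, using the LogAggregationReport/LogAggregationStatus records from the YARN-1402 work), succeeded per-node reports would be dropped rather than kept for the app's lifetime:

{code}
import java.util.Iterator;
import java.util.Map;

import org.apache.hadoop.yarn.api.records.LogAggregationStatus;
import org.apache.hadoop.yarn.api.records.NodeId;
import org.apache.hadoop.yarn.server.api.protocolrecords.LogAggregationReport;

public final class LogAggregationReportPruner {
  private LogAggregationReportPruner() {}

  /** Remove per-node reports that already finished with SUCCEEDED. */
  public static void pruneSucceeded(Map<NodeId, LogAggregationReport> reports) {
    Iterator<Map.Entry<NodeId, LogAggregationReport>> it =
        reports.entrySet().iterator();
    while (it.hasNext()) {
      if (it.next().getValue().getLogAggregationStatus()
          == LogAggregationStatus.SUCCEEDED) {
        it.remove();
      }
    }
  }
}
{code}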



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3589) RM and AH web UI display DOCTYPE

2015-05-07 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14532323#comment-14532323
 ] 

Rohith commented on YARN-3589:
--

YARN-1993 escapes content written within the html block, which is required to 
prevent cross-site scripting. But the same logic is being applied to content outside 
the html block, which the browser does not parse and instead displays as-is.
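
For illustration only (assuming commons-lang on the classpath; this is not the proposed fix), escaping the doctype declaration is what turns it into visible text:

{code}
import org.apache.commons.lang.StringEscapeUtils;

public class DoctypeEscapeDemo {
  public static void main(String[] args) {
    String doctype = "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01//EN\" "
        + "\"http://www.w3.org/TR/html4/strict.dtd\">";
    // Escaping is right for content inside the <html> block (anti-XSS),
    // but applied to the declaration above it, the browser renders it verbatim:
    System.out.println(StringEscapeUtils.escapeHtml(doctype));
    // prints &lt;!DOCTYPE html PUBLIC ... &gt; which shows up as page text
  }
}
{code}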

 RM and AH web UI display DOCTYPE
 

 Key: YARN-3589
 URL: https://issues.apache.org/jira/browse/YARN-3589
 Project: Hadoop YARN
  Issue Type: Bug
  Components: webapp
Affects Versions: 2.8.0
Reporter: Rohith
Assignee: Rohith
 Attachments: YARN-3589.PNG


 RM web app UI displays {{!DOCTYPE html PUBLIC -\/\/W3C\/\/DTD HTML 
 4.01\/\/EN http:\/\/www.w3.org\/TR\/html4\/strict.dtd}}, which is not 
 necessary.
 This is because the content of the HTML page is escaped, which the browser 
 then cannot parse. Any content that is escaped should be within the html 
 block, but the doctype sits above the html element, so the browser cannot 
 parse it and shows it as text.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2821) Distributed shell app master becomes unresponsive sometimes

2015-05-07 Thread Varun Vasudev (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Vasudev updated YARN-2821:

Attachment: YARN-2821.002.patch

Uploaded a new patch that correctly handles AM restarts and doesn't launch 
unnecessary containers.

 Distributed shell app master becomes unresponsive sometimes
 ---

 Key: YARN-2821
 URL: https://issues.apache.org/jira/browse/YARN-2821
 Project: Hadoop YARN
  Issue Type: Bug
  Components: applications/distributed-shell
Affects Versions: 2.5.1
Reporter: Varun Vasudev
Assignee: Varun Vasudev
 Attachments: YARN-2821.002.patch, apache-yarn-2821.0.patch, 
 apache-yarn-2821.1.patch


 We've noticed that once in a while the distributed shell app master becomes 
 unresponsive and is eventually killed by the RM. snippet of the logs -
 {noformat}
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: 
 appattempt_1415123350094_0017_01 received 0 previous attempts' running 
 containers on AM registration.
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:38 INFO impl.AMRMClientImpl: Received new token for : 
 onprem-tez2:45454
 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Got response from 
 RM for container ask, allocatedCnt=1
 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Launching shell 
 command on a new container., 
 containerId=container_1415123350094_0017_01_02, 
 containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
 containerResourceMemory1024, containerResourceVirtualCores1
 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Setting up 
 container launch container for 
 containerid=container_1415123350094_0017_01_02
 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
 START_CONTAINER for Container container_1415123350094_0017_01_02
 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
 onprem-tez2:45454
 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
 QUERY_CONTAINER for Container container_1415123350094_0017_01_02
 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
 onprem-tez2:45454
 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : 
 onprem-tez3:45454
 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : 
 onprem-tez4:45454
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Got response from 
 RM for container ask, allocatedCnt=3
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
 command on a new container., 
 containerId=container_1415123350094_0017_01_03, 
 containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
 containerResourceMemory1024, containerResourceVirtualCores1
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
 command on a new container., 
 containerId=container_1415123350094_0017_01_04, 
 containerNode=onprem-tez3:45454, containerNodeURI=onprem-tez3:50060, 
 containerResourceMemory1024, containerResourceVirtualCores1
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
 command on a new container., 
 containerId=container_1415123350094_0017_01_05, 
 containerNode=onprem-tez4:45454, containerNodeURI=onprem-tez4:50060, 
 containerResourceMemory1024, containerResourceVirtualCores1
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Setting up 
 container launch container for 
 containerid=container_1415123350094_0017_01_03
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Setting up 
 container launch container for 
 containerid=container_1415123350094_0017_01_05
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Setting up 
 container launch container for 
 containerid=container_1415123350094_0017_01_04
 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
 START_CONTAINER for Container container_1415123350094_0017_01_05
 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
 START_CONTAINER for Container container_1415123350094_0017_01_03
 14/11/04 18:21:39 INFO 

[jira] [Updated] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS

2015-05-07 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-3044:

Attachment: YARN-3044-YARN-2928.006.patch

Hi [~sjlee0]  [~djp], 
Uploading a patch that addresses the applicable checkstyle issues, findbugs issues, 
and test case failures (those caused by this patch or the branch). Hoping for 
quicker reviews on this jira and 3045, so that I can pick up some other work 
on 2928.

 [Event producers] Implement RM writing app lifecycle events to ATS
 --

 Key: YARN-3044
 URL: https://issues.apache.org/jira/browse/YARN-3044
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Naganarasimha G R
  Labels: BB2015-05-TBR
 Attachments: YARN-3044-YARN-2928.004.patch, 
 YARN-3044-YARN-2928.005.patch, YARN-3044-YARN-2928.006.patch, 
 YARN-3044.20150325-1.patch, YARN-3044.20150406-1.patch, 
 YARN-3044.20150416-1.patch


 Per design in YARN-2928, implement RM writing app lifecycle events to ATS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3592) Fix typos in RMNodeLabelsManager

2015-05-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533114#comment-14533114
 ] 

Hadoop QA commented on YARN-3592:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 39s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 31s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 33s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 26s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 15s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:red}-1{color} | yarn tests |  54m 23s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | |  90m 21s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12731185/0001-YARN-3592.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 8e991f4 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7763/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7763/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7763/console |


This message was automatically generated.

 Fix typos in RMNodeLabelsManager
 

 Key: YARN-3592
 URL: https://issues.apache.org/jira/browse/YARN-3592
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Junping Du
Assignee: Sunil G
  Labels: newbie
 Attachments: 0001-YARN-3592.patch


 acccessibleNodeLabels = accessibleNodeLabels in many places.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2918) Don't fail RM if queue's configured labels are not existed in cluster-node-labels

2015-05-07 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-2918:
-
Attachment: YARN-2918.3.patch

Attached ver.3, addressed comments from [~jianhe]

 Don't fail RM if queue's configured labels are not existed in 
 cluster-node-labels
 -

 Key: YARN-2918
 URL: https://issues.apache.org/jira/browse/YARN-2918
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Rohith
Assignee: Wangda Tan
  Labels: BB2015-05-TBR
 Attachments: YARN-2918.1.patch, YARN-2918.2.patch, YARN-2918.3.patch


 Currently, if an admin sets up labels on queues 
 ({{queue-path.accessible-node-labels = ...}}) and the label is not added to 
 the RM, the queue's initialization will fail and the RM will fail too:
 {noformat}
 2014-12-03 20:11:50,126 FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting 
 ResourceManager
 ...
 Caused by: java.io.IOException: NodeLabelManager doesn't include label = x, 
 please check.
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.checkIfLabelInClusterNodeLabels(SchedulerUtils.java:287)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.init(AbstractCSQueue.java:109)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.init(LeafQueue.java:120)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:567)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:587)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:462)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:294)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:324)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
 {noformat}
 This is not a good user experience; we should stop failing the RM so that the 
 admin can configure queues/labels in the following steps:
 - Configure queue (with label)
 - Start RM
 - Add labels to RM
 - Submit applications
 Now admin has to:
 - Configure queue (without label)
 - Start RM
 - Add labels to RM
 - Refresh queue's config (with label)
 - Submit applications



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2194) Add Cgroup support for RedHat 7

2015-05-07 Thread Wei Yan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533256#comment-14533256
 ] 

Wei Yan commented on YARN-2194:
---

[~vinodkv], I discussed this with Karthik offline. We thought it would be better 
to first make the existing cgroups code work well on RedHat 7, and after that we 
can provide a separate systemd-based solution. 
I can update the patch later.

 Add Cgroup support for RedHat 7
 ---

 Key: YARN-2194
 URL: https://issues.apache.org/jira/browse/YARN-2194
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Wei Yan
Assignee: Wei Yan
 Attachments: YARN-2194-1.patch


In previous versions of RedHat, we could build custom cgroup hierarchies 
 using the cgconfig command from the libcgroup package. From RedHat 7, the 
 libcgroup package is deprecated and its use is not recommended, since it can 
 easily create conflicts with the default cgroup hierarchy. systemd is 
 provided and recommended for cgroup management. We need to add support for 
 this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3460) Test TestSecureRMRegistryOperations failed with IBM_JAVA JVM

2015-05-07 Thread pascal oliva (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

pascal oliva updated YARN-3460:
---
Attachment: YARN-3460-3.patch

Updated the patch to fix whitespace issues.

 Test TestSecureRMRegistryOperations failed with IBM_JAVA JVM
 

 Key: YARN-3460
 URL: https://issues.apache.org/jira/browse/YARN-3460
 Project: Hadoop YARN
  Issue Type: Test
Affects Versions: 3.0.0, 2.6.0
 Environment: $ mvn -version
 Apache Maven 3.2.1 (ea8b2b07643dbb1b84b6d16e1f08391b666bc1e9; 
 2014-02-14T11:37:52-06:00)
 Maven home: /opt/apache-maven-3.2.1
 Java version: 1.7.0, vendor: IBM Corporation
 Java home: /usr/lib/jvm/ibm-java-ppc64le-71/jre
 Default locale: en_US, platform encoding: UTF-8
 OS name: linux, version: 3.10.0-229.ael7b.ppc64le, arch: ppc64le, 
 family: unix
Reporter: pascal oliva
 Attachments: HADOOP-11810-1.patch, YARN-3460-1.patch, 
 YARN-3460-2.patch, YARN-3460-3.patch


 TestSecureRMRegistryOperations fails with the IBM JVM.
 mvn test -X 
 -Dtest=org.apache.hadoop.registry.secure.TestSecureRMRegistryOperations
 || Module || Total || Failure || Error || Skipped ||
 | hadoop-yarn-registry | 12 | 0 | 12 | 0 |
 | Total | 12 | 0 | 12 | 0 |
 With 
 javax.security.auth.login.LoginException: Bad JAAS configuration: 
 unrecognized option: isInitiator
 and 
 Bad JAAS configuration: unrecognized option: storeKey
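
For context, a hedged sketch of why those options trip up the IBM JDK: the option names below reflect the usual differences between the Sun/Oracle and IBM Krb5LoginModule implementations and are not taken from the attached patches.

{code}
import java.util.HashMap;
import java.util.Map;

import static org.apache.hadoop.util.PlatformName.IBM_JAVA;

public final class KerberosJaasOptions {
  private KerberosJaasOptions() {}

  /** Keytab login options chosen per JVM vendor (illustrative only). */
  public static Map<String, String> keytabOptions(String keytab, String principal) {
    Map<String, String> options = new HashMap<String, String>();
    if (IBM_JAVA) {
      // com.ibm.security.auth.module.Krb5LoginModule does not understand
      // "isInitiator" or "storeKey"; it uses its own option names instead.
      options.put("useKeytab", keytab);
      options.put("principal", principal);
      options.put("credsType", "both");
    } else {
      // com.sun.security.auth.module.Krb5LoginModule
      options.put("useKeyTab", "true");
      options.put("keyTab", keytab);
      options.put("principal", principal);
      options.put("storeKey", "true");
      options.put("isInitiator", "true");
    }
    return options;
  }
}
{code}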



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3448) Add Rolling Time To Lives Level DB Plugin Capabilities

2015-05-07 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533313#comment-14533313
 ] 

Jonathan Eagles commented on YARN-3448:
---

Thanks so much, everyone!

 Add Rolling Time To Lives Level DB Plugin Capabilities
 --

 Key: YARN-3448
 URL: https://issues.apache.org/jira/browse/YARN-3448
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles
 Fix For: 2.8.0

 Attachments: YARN-3448.1.patch, YARN-3448.10.patch, 
 YARN-3448.12.patch, YARN-3448.13.patch, YARN-3448.14.patch, 
 YARN-3448.15.patch, YARN-3448.16.patch, YARN-3448.17.patch, 
 YARN-3448.2.patch, YARN-3448.3.patch, YARN-3448.4.patch, YARN-3448.5.patch, 
 YARN-3448.7.patch, YARN-3448.8.patch, YARN-3448.9.patch


 For large applications, the majority of the time in LeveldbTimelineStore is 
 spent deleting old entities one record at a time. An exclusive write lock is 
 held during the entire deletion phase, which in practice can be hours. If we 
 are willing to relax some of the consistency constraints, other 
 performance-enhancing techniques can be employed to maximize throughput and 
 minimize locking time.
 Split the 5 sections of the leveldb database (domain, owner, start time, 
 entity, index) into 5 separate databases. This allows each database to 
 maximize the read cache effectiveness based on the unique usage patterns of 
 each database. With 5 separate databases each lookup is much faster. This can 
 also help with I/O by having the entity and index databases on separate disks.
 Rolling DBs for entity and index DBs. 99.9% of the data are in these two 
 sections, at roughly a 4:1 ratio (index to entity), at least for Tez. We 
 replace DB record removal with file system removal if we create a rolling set 
 of databases that age out and can be efficiently removed. To do this we must 
 place a constraint to always place an entity's events into its correct rolling 
 db instance based on start time. This allows us to stitch the data back 
 together while reading, and to do artificial paging.
 Relax the synchronous writes constraints. If we are willing to accept losing 
 some records that were not flushed by the operating system during a crash, we 
 can use async writes, which can be much faster.
 Prefer sequential writes. Sequential writes can be several times faster than 
 random writes. Spend some small effort arranging the writes in such a way 
 that they trend towards sequential write performance over random write 
 performance.
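
A hedged sketch of two of those ideas (rolling DB instances keyed by a start-time bucket, and relaxed synchronous writes), using the leveldbjni API the timeline store already depends on; the class, the one-hour bucket size, and the naming scheme are illustrative only and not the committed design:

{code}
import java.io.File;

import org.iq80.leveldb.DB;
import org.iq80.leveldb.Options;
import org.iq80.leveldb.WriteOptions;

import static org.fusesource.leveldbjni.JniDBFactory.factory;

public class RollingLeveldbSketch {
  private static final long ROLL_INTERVAL_MS = 60 * 60 * 1000L; // illustrative bucket

  /** Pick the rolling DB instance for an entity from its start time. */
  static File dbPathFor(File baseDir, long entityStartTime) {
    long bucket = entityStartTime - (entityStartTime % ROLL_INTERVAL_MS);
    return new File(baseDir, "entity-db-" + bucket);
  }

  static DB open(File path) throws Exception {
    return factory.open(path, new Options().createIfMissing(true));
  }

  /** sync=false trades durability on a crash for much higher write throughput. */
  static void writeAsync(DB db, byte[] key, byte[] value) {
    db.put(key, value, new WriteOptions().sync(false));
  }

  /** Aged-out buckets are dropped with a file-system delete instead of record deletes. */
  static void destroyBucket(File path) throws Exception {
    factory.destroy(path, new Options());
  }
}
{code}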



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-2686) CgroupsLCEResourcesHandler does not support the default Redhat 7/CentOS 7

2015-05-07 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved YARN-2686.
---
Resolution: Duplicate

Tx for the discussion. Closing as dup of YARN-2194.

 CgroupsLCEResourcesHandler does not support the default Redhat 7/CentOS 7
 -

 Key: YARN-2686
 URL: https://issues.apache.org/jira/browse/YARN-2686
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Beckham007

 CgroupsLCEResourcesHandler uses , to separate the resources option.
 RedHat 7 uses /sys/fs/cgroup/cpu,cpuacct as the cpu mount dir, so 
 container-executor would use the wrong path /sys/fs/cgroup/cpu as the 
 container task file. It should be 
 /sys/fs/cgroup/cpu,cpuacct/hadoop-yarn/contain_id/tasks.
 We should use some other character instead of ,.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2194) Add Cgroup support for RedHat 7

2015-05-07 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533089#comment-14533089
 ] 

Vinod Kumar Vavilapalli commented on YARN-2194:
---

Also, it will be really useful if we could find a way to support existing code 
on RHEL7, even if libcgroups is deprecated and the approach ends up becoming 
heavy-handed, perhaps with some manual steps. This 
[page|https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Resource_Management_Guide/chap-Using_libcgroup_Tools.html]
 talks about ways of doing this.

 Add Cgroup support for RedHat 7
 ---

 Key: YARN-2194
 URL: https://issues.apache.org/jira/browse/YARN-2194
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Wei Yan
Assignee: Wei Yan
 Attachments: YARN-2194-1.patch


In previous versions of RedHat, we could build custom cgroup hierarchies 
 using the cgconfig command from the libcgroup package. From RedHat 7, the 
 libcgroup package is deprecated and its use is not recommended, since it can 
 easily create conflicts with the default cgroup hierarchy. systemd is 
 provided and recommended for cgroup management. We need to add support for 
 this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3362) Add node label usage in RM CapacityScheduler web UI

2015-05-07 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533120#comment-14533120
 ] 

Wangda Tan commented on YARN-3362:
--

Hi [~Naganarasimha],
Thanks a lot for updating, looks much better now! I still have a few minor 
comments:

*For UI:*

1) Configured Capacity, Configured Max Capacity should be a part of Queue 
Status for Partition...?
2) Not caused by your patch, Absolute Capacity should be Absolute Configured 
Capacity, Absolute Max Capacity should be Absolute Configured Max Capacity, 
could you update them in your patch?
3) Also not caused by your patch: there's no space between Active Users Info 
and the following queue. I'm not sure if there's any easy fix we can do; please 
feel free to file a separate ticket if it is hard to solve together with this one.

*For implementation:*
1) One minor style comment: you could merge all the capacities-related 
rendering in CapacitySchedulerPage into a method similar to 
renderCommonLeafQueueInfo, into which you can merge some of the implementation of 
{{render}} and {{renderLeafQueueInfoWithoutParition}}. Then add a method 
renderLeafQueueInfoWithParition to make {{render}} look cleaner.

*For your question*
bq. May be i did not get this completely. label is exclusive-label i have done 
in CapacitySchedulerPage.QueuesBlock.render(l num - 357)
I think for both CapacitySchedulerPage and CapacitySchedulerInfo, it should be:
{code}
if (label.getIsExclusive()
    && !((AbstractCSQueue) root).accessibleToPartition(label.getLabelName())) {
{code}
When the label is exclusive (nobody can use the label unless the queue is accessible 
to it) and the queue isn't accessible to the label, we don't need to continue.

Let me know your thoughts.

CC: [~vinodkv]/[~jianhe].


 Add node label usage in RM CapacityScheduler web UI
 ---

 Key: YARN-3362
 URL: https://issues.apache.org/jira/browse/YARN-3362
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, resourcemanager, webapp
Reporter: Wangda Tan
Assignee: Naganarasimha G R
 Attachments: 2015.05.06 Folded Queues.png, 2015.05.06 Queue 
 Expanded.png, 2015.05.07_3362_Queue_Hierarchy.png, CSWithLabelsView.png, 
 Screen Shot 2015-04-29 at 11.42.17 AM.png, 
 YARN-3362.20150428-3-modified.patch, YARN-3362.20150428-3.patch, 
 YARN-3362.20150506-1.patch, YARN-3362.20150507-1.patch, capacity-scheduler.xml


 We don't show node label usage in the RM CapacityScheduler web UI now; without 
 this, it is hard for users to understand what happened to nodes that have 
 labels assigned to them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-644) Basic null check is not performed on passed in arguments before using them in ContainerManagerImpl.startContainer

2015-05-07 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-644:
--
Attachment: (was: YARN-644.03.patch)

 Basic null check is not performed on passed in arguments before using them in 
 ContainerManagerImpl.startContainer
 -

 Key: YARN-644
 URL: https://issues.apache.org/jira/browse/YARN-644
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Omkar Vinit Joshi
Assignee: Varun Saxena
Priority: Minor
  Labels: BB2015-05-TBR, newbie
 Attachments: YARN-644.001.patch, YARN-644.002.patch, YARN-644.03.patch


 I see that validation/ null check is not performed on passed in parameters. 
 Ex. tokenId.getContainerID().getApplicationAttemptId() inside 
 ContainerManagerImpl.authorizeRequest()
 I guess we should add these checks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-644) Basic null check is not performed on passed in arguments before using them in ContainerManagerImpl.startContainer

2015-05-07 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-644:
--
Attachment: YARN-644.03.patch

 Basic null check is not performed on passed in arguments before using them in 
 ContainerManagerImpl.startContainer
 -

 Key: YARN-644
 URL: https://issues.apache.org/jira/browse/YARN-644
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Omkar Vinit Joshi
Assignee: Varun Saxena
Priority: Minor
  Labels: BB2015-05-TBR, newbie
 Attachments: YARN-644.001.patch, YARN-644.002.patch, YARN-644.03.patch


 I see that validation/ null check is not performed on passed in parameters. 
 Ex. tokenId.getContainerID().getApplicationAttemptId() inside 
 ContainerManagerImpl.authorizeRequest()
 I guess we should add these checks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3362) Add node label usage in RM CapacityScheduler web UI

2015-05-07 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533174#comment-14533174
 ] 

Naganarasimha G R commented on YARN-3362:
-

Thanks for the review [~wangda],
bq. Also not caused by your patch, there's no space between Active Users Info 
and following queue, I'm not sure if there's any easy fix can do, please feel 
free to file a separate ticket if it will be hard to be solved together.
I put in a considerable amount of effort, but it didn't work out. Hamlet should 
come with some kind of doc or a book; it seems like an enigma code to me. I will 
file a separate jira, as this is needed in many places of the CS page: between 
Active Users Info and the following queue, between the Dump scheduler log button 
and Application Queues, and in the CS Health block.
For the other comments I will rework ASAP.

 Add node label usage in RM CapacityScheduler web UI
 ---

 Key: YARN-3362
 URL: https://issues.apache.org/jira/browse/YARN-3362
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, resourcemanager, webapp
Reporter: Wangda Tan
Assignee: Naganarasimha G R
 Attachments: 2015.05.06 Folded Queues.png, 2015.05.06 Queue 
 Expanded.png, 2015.05.07_3362_Queue_Hierarchy.png, CSWithLabelsView.png, 
 No-space-between-Active_user_info-and-next-queues.png, Screen Shot 2015-04-29 
 at 11.42.17 AM.png, YARN-3362.20150428-3-modified.patch, 
 YARN-3362.20150428-3.patch, YARN-3362.20150506-1.patch, 
 YARN-3362.20150507-1.patch, capacity-scheduler.xml


 We don't show node label usage in the RM CapacityScheduler web UI now; without 
 this, it is hard for users to understand what happened to nodes that have 
 labels assigned to them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-644) Basic null check is not performed on passed in arguments before using them in ContainerManagerImpl.startContainer

2015-05-07 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-644:
--
Attachment: YARN-644.03.patch

 Basic null check is not performed on passed in arguments before using them in 
 ContainerManagerImpl.startContainer
 -

 Key: YARN-644
 URL: https://issues.apache.org/jira/browse/YARN-644
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Omkar Vinit Joshi
Assignee: Varun Saxena
Priority: Minor
  Labels: BB2015-05-TBR, newbie
 Attachments: YARN-644.001.patch, YARN-644.002.patch, YARN-644.03.patch


 I see that validation/ null check is not performed on passed in parameters. 
 Ex. tokenId.getContainerID().getApplicationAttemptId() inside 
 ContainerManagerImpl.authorizeRequest()
 I guess we should add these checks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-644) Basic null check is not performed on passed in arguments before using them in ContainerManagerImpl.startContainer

2015-05-07 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-644:
--
Attachment: (was: YARN-644.03.patch)

 Basic null check is not performed on passed in arguments before using them in 
 ContainerManagerImpl.startContainer
 -

 Key: YARN-644
 URL: https://issues.apache.org/jira/browse/YARN-644
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Omkar Vinit Joshi
Assignee: Varun Saxena
Priority: Minor
  Labels: BB2015-05-TBR, newbie
 Attachments: YARN-644.001.patch, YARN-644.002.patch, YARN-644.03.patch


 I see that validation/ null check is not performed on passed in parameters. 
 Ex. tokenId.getContainerID().getApplicationAttemptId() inside 
 ContainerManagerImpl.authorizeRequest()
 I guess we should add these checks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2918) Don't fail RM if queue's configured labels are not existed in cluster-node-labels

2015-05-07 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533179#comment-14533179
 ] 

Jian He commented on YARN-2918:
---

looks good overall, minor comments:
- simplify below a bit? (one possible simplification is sketched after these comments)
{code}
boolean queueCheck = true;
if (queueLabels == null) {
  queueCheck = false; 
} else {
  if (!queueLabels.contains(str)
      && !queueLabels.contains(RMNodeLabelsManager.ANY)) {
queueCheck = false;
  }
}
if (!queueCheck) {
  return false;
}
{code}
- regarding the newly added test conditions in TestSchedulerUtils, it seems 
like some of them are not actually being exercised.
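For reference, a minimal sketch of one possible simplification, assuming 
{{queueLabels}} and {{RMNodeLabelsManager.ANY}} behave as in the snippet above 
(a sketch only, not necessarily the form the patch should take):
{code}
// Equivalent to the queueCheck block above: reject when the queue has no
// labels, or when it has neither the requested label nor the wildcard.
if (queueLabels == null
    || (!queueLabels.contains(str)
        && !queueLabels.contains(RMNodeLabelsManager.ANY))) {
  return false;
}
{code}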

 Don't fail RM if queue's configured labels are not existed in 
 cluster-node-labels
 -

 Key: YARN-2918
 URL: https://issues.apache.org/jira/browse/YARN-2918
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Rohith
Assignee: Wangda Tan
  Labels: BB2015-05-TBR
 Attachments: YARN-2918.1.patch, YARN-2918.2.patch


 Currently, if the admin sets up labels on queues via 
 {{queue-path.accessible-node-labels = ...}} and the label is not added to 
 the RM, the queue's initialization will fail and the RM will fail too:
 {noformat}
 2014-12-03 20:11:50,126 FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting 
 ResourceManager
 ...
 Caused by: java.io.IOException: NodeLabelManager doesn't include label = x, 
 please check.
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.checkIfLabelInClusterNodeLabels(SchedulerUtils.java:287)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.init(AbstractCSQueue.java:109)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.init(LeafQueue.java:120)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:567)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:587)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:462)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:294)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:324)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
 {noformat}
 This is not a good user experience; we should stop failing the RM so that the 
 admin can configure queues/labels in the following steps:
 - Configure queue (with label)
 - Start RM
 - Add labels to RM
 - Submit applications
 Now admin has to:
 - Configure queue (without label)
 - Start RM
 - Add labels to RM
 - Refresh queue's config (with label)
 - Submit applications



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3539) Compatibility doc to state that ATS v1 is a stable REST API

2015-05-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533208#comment-14533208
 ] 

Hadoop QA commented on YARN-3539:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 35s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 32s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 30s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 21s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | site |   2m 57s | Site still builds. |
| {color:green}+1{color} | checkstyle |   3m 27s | There were no new checkstyle 
issues. |
| {color:red}-1{color} | whitespace |   1m 38s | The patch has 18  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 31s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   5m 15s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | common tests |  23m 22s | Tests passed in 
hadoop-common. |
| {color:green}+1{color} | yarn tests |   0m 24s | Tests passed in 
hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |   1m 56s | Tests passed in 
hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests |   0m 24s | Tests passed in 
hadoop-yarn-server-common. |
| | |  76m 42s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12731223/YARN-3539-008.patch |
| Optional Tests | site javadoc javac unit findbugs checkstyle |
| git revision | trunk / daf3e4e |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/7768/artifact/patchprocess/whitespace.txt
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7768/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7768/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7768/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7768/artifact/patchprocess/testrun_hadoop-yarn-server-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7768/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7768/console |


This message was automatically generated.

 Compatibility doc to state that ATS v1 is a stable REST API
 ---

 Key: YARN-3539
 URL: https://issues.apache.org/jira/browse/YARN-3539
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: documentation
Affects Versions: 2.7.0
Reporter: Steve Loughran
Assignee: Steve Loughran
  Labels: BB2015-05-TBR
 Attachments: HADOOP-11826-001.patch, HADOOP-11826-002.patch, 
 YARN-3539-003.patch, YARN-3539-004.patch, YARN-3539-005.patch, 
 YARN-3539-006.patch, YARN-3539-007.patch, YARN-3539-008.patch, 
 timeline_get_api_examples.txt


 The ATS v2 discussion and YARN-2423 have raised the question: how stable are 
 the ATSv1 APIs?
 The existing compatibility document actually states that the History Server 
 is [a stable REST 
 API|http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html#REST_APIs],
  which effectively means that ATSv1 has already been declared as a stable API.
 Clarify this by patching the compatibility document appropriately



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3589) RM and AH web UI display DOCTYPE

2015-05-07 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533206#comment-14533206
 ] 

Jian He commented on YARN-3589:
---

lgtm, thanks Rohith !

 RM and AH web UI display DOCTYPE
 

 Key: YARN-3589
 URL: https://issues.apache.org/jira/browse/YARN-3589
 Project: Hadoop YARN
  Issue Type: Bug
  Components: webapp
Affects Versions: 2.8.0
Reporter: Rohith
Assignee: Rohith
  Labels: BB2015-05-TBR
 Attachments: 0001-YARN-3589.patch, YARN-3589.PNG


 The RM web app UI displays {{!DOCTYPE html PUBLIC -\/\/W3C\/\/DTD HTML 
 4.01\/\/EN http:\/\/www.w3.org\/TR\/html4\/strict.dtd}}, which is not 
 necessary.
 This happens because the content of the HTML page is escaped, so the browser 
 cannot parse it. Escaped content should sit within the HTML block, but the 
 doctype appears above the html element, which the browser can't parse.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3593) Modifications in Node Labels Page for Partitions

2015-05-07 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-3593:

Attachment: YARN-3593.20150507-1.patch
NodeLabelsPageModifications.png

Attaching the initial patch and a screenshot of the modifications.

 Modifications in Node Labels Page for Partitions
 

 Key: YARN-3593
 URL: https://issues.apache.org/jira/browse/YARN-3593
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: webapp
Affects Versions: 2.6.0, 2.7.0
Reporter: Naganarasimha G R
Assignee: Naganarasimha G R
 Attachments: NodeLabelsPageModifications.png, 
 YARN-3593.20150507-1.patch


 Need to support displaying the label type in the Node Labels page, and 
 instead of NO_LABEL we need to show DEFAULT_PARTITION.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3362) Add node label usage in RM CapacityScheduler web UI

2015-05-07 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-3362:
-
Attachment: No-space-between-Active_user_info-and-next-queues.png

 Add node label usage in RM CapacityScheduler web UI
 ---

 Key: YARN-3362
 URL: https://issues.apache.org/jira/browse/YARN-3362
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, resourcemanager, webapp
Reporter: Wangda Tan
Assignee: Naganarasimha G R
 Attachments: 2015.05.06 Folded Queues.png, 2015.05.06 Queue 
 Expanded.png, 2015.05.07_3362_Queue_Hierarchy.png, CSWithLabelsView.png, 
 No-space-between-Active_user_info-and-next-queues.png, Screen Shot 2015-04-29 
 at 11.42.17 AM.png, YARN-3362.20150428-3-modified.patch, 
 YARN-3362.20150428-3.patch, YARN-3362.20150506-1.patch, 
 YARN-3362.20150507-1.patch, capacity-scheduler.xml


 We don't show node label usage in the RM CapacityScheduler web UI now; 
 without it, users will find it hard to understand what happened to nodes that 
 have labels assigned to them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3592) Fix typos in RMNodeLabelsManager

2015-05-07 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533159#comment-14533159
 ] 

Sunil G commented on YARN-3592:
---

testRMRestartGetApplicationList is passing locally. Test failure is unrelated.
New test cases are not needed for this as it's a typo issue.

 Fix typos in RMNodeLabelsManager
 

 Key: YARN-3592
 URL: https://issues.apache.org/jira/browse/YARN-3592
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Junping Du
Assignee: Sunil G
  Labels: newbie
 Attachments: 0001-YARN-3592.patch


 acccessibleNodeLabels = accessibleNodeLabels in many places.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-644) Basic null check is not performed on passed in arguments before using them in ContainerManagerImpl.startContainer

2015-05-07 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-644:
--
Attachment: YARN-644.03.patch

 Basic null check is not performed on passed in arguments before using them in 
 ContainerManagerImpl.startContainer
 -

 Key: YARN-644
 URL: https://issues.apache.org/jira/browse/YARN-644
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Omkar Vinit Joshi
Assignee: Varun Saxena
Priority: Minor
  Labels: BB2015-05-TBR, newbie
 Attachments: YARN-644.001.patch, YARN-644.002.patch, YARN-644.03.patch


 I see that validation/ null check is not performed on passed in parameters. 
 Ex. tokenId.getContainerID().getApplicationAttemptId() inside 
 ContainerManagerImpl.authorizeRequest()
 I guess we should add these checks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2268) Disallow formatting the RMStateStore when there is an RM running

2015-05-07 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533311#comment-14533311
 ] 

Xuan Gong commented on YARN-2268:
-

Cancel the patch, looks like we need more discussion

 Disallow formatting the RMStateStore when there is an RM running
 

 Key: YARN-2268
 URL: https://issues.apache.org/jira/browse/YARN-2268
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Karthik Kambatla
Assignee: Rohith
  Labels: BB2015-05-TBR
 Attachments: 0001-YARN-2268.patch


 YARN-2131 adds a way to format the RMStateStore. However, it can be a problem 
 if we format the store while an RM is actively using it. It would be nice to 
 fail the format if there is an RM running and using this store. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3460) Test TestSecureRMRegistryOperations failed with IBM_JAVA JVM

2015-05-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533310#comment-14533310
 ] 

Hadoop QA commented on YARN-3460:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 41s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 36s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 41s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 22s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 38s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   1m  1s | Tests passed in 
hadoop-yarn-registry. |
| | |  36m 34s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12731250/YARN-3460-3.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / ab5058d |
| hadoop-yarn-registry test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7774/artifact/patchprocess/testrun_hadoop-yarn-registry.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7774/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7774/console |


This message was automatically generated.

 Test TestSecureRMRegistryOperations failed with IBM_JAVA JVM
 

 Key: YARN-3460
 URL: https://issues.apache.org/jira/browse/YARN-3460
 Project: Hadoop YARN
  Issue Type: Test
Affects Versions: 3.0.0, 2.6.0
 Environment: $ mvn -version
 Apache Maven 3.2.1 (ea8b2b07643dbb1b84b6d16e1f08391b666bc1e9; 
 2014-02-14T11:37:52-06:00)
 Maven home: /opt/apache-maven-3.2.1
 Java version: 1.7.0, vendor: IBM Corporation
 Java home: /usr/lib/jvm/ibm-java-ppc64le-71/jre
 Default locale: en_US, platform encoding: UTF-8
 OS name: linux, version: 3.10.0-229.ael7b.ppc64le, arch: ppc64le, 
 family: unix
Reporter: pascal oliva
 Attachments: HADOOP-11810-1.patch, YARN-3460-1.patch, 
 YARN-3460-2.patch, YARN-3460-3.patch


 TestSecureRMRegistryOperations failed with the IBM JVM (IBM_JAVA):
 mvn test -X 
 -Dtest=org.apache.hadoop.registry.secure.TestSecureRMRegistryOperations
 Module                 Total  Failure  Error  Skipped
 -
 hadoop-yarn-registry   12     0        12     0
 -
 Total                  12     0        12     0
 With 
 javax.security.auth.login.LoginException: Bad JAAS configuration: 
 unrecognized option: isInitiator
 and 
 Bad JAAS configuration: unrecognized option: storeKey



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-20) More information for yarn.resourcemanager.webapp.address in yarn-default.xml

2015-05-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-20?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533074#comment-14533074
 ] 

Hadoop QA commented on YARN-20:
---

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 39s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 33s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 34s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | yarn tests |   1m 56s | Tests passed in 
hadoop-yarn-common. |
| | |  36m 13s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12731219/YARN-20.2.patch |
| Optional Tests | javadoc javac unit |
| git revision | trunk / daf3e4e |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7767/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7767/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7767/console |


This message was automatically generated.

 More information for yarn.resourcemanager.webapp.address in yarn-default.xml
 --

 Key: YARN-20
 URL: https://issues.apache.org/jira/browse/YARN-20
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: documentation, resourcemanager
Affects Versions: 2.0.0-alpha
Reporter: Nemon Lou
Assignee: Bartosz Ługowski
Priority: Trivial
  Labels: BB2015-05-TBR, newbie
 Attachments: YARN-20.1.patch, YARN-20.2.patch, YARN-20.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

   The parameter yarn.resourcemanager.webapp.address in yarn-default.xml is 
 in host:port format, which is noted in the cluster setup guide 
 (http://hadoop.apache.org/common/docs/r2.0.0-alpha/hadoop-yarn/hadoop-yarn-site/ClusterSetup.html).
   When I read through the code, I found that a host-only format is also 
 supported; in that case the port will be random.
   So we may add more documentation to yarn-default.xml to make this easier to 
 understand.
   I will submit a patch if it's helpful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3362) Add node label usage in RM CapacityScheduler web UI

2015-05-07 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533122#comment-14533122
 ] 

Wangda Tan commented on YARN-3362:
--

Attached 
https://issues.apache.org/jira/secure/attachment/12731236/No-space-between-Active_user_info-and-next-queues.png
 to show that there's no space between Active User Info and the next queue.

 Add node label usage in RM CapacityScheduler web UI
 ---

 Key: YARN-3362
 URL: https://issues.apache.org/jira/browse/YARN-3362
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, resourcemanager, webapp
Reporter: Wangda Tan
Assignee: Naganarasimha G R
 Attachments: 2015.05.06 Folded Queues.png, 2015.05.06 Queue 
 Expanded.png, 2015.05.07_3362_Queue_Hierarchy.png, CSWithLabelsView.png, 
 No-space-between-Active_user_info-and-next-queues.png, Screen Shot 2015-04-29 
 at 11.42.17 AM.png, YARN-3362.20150428-3-modified.patch, 
 YARN-3362.20150428-3.patch, YARN-3362.20150506-1.patch, 
 YARN-3362.20150507-1.patch, capacity-scheduler.xml


 We don't show node label usage in the RM CapacityScheduler web UI now; 
 without it, users will find it hard to understand what happened to nodes that 
 have labels assigned to them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3434) Interaction between reservations and userlimit can result in significant ULF violation

2015-05-07 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated YARN-3434:

Attachment: YARN-3434-branch2.7.patch

Attaching patch for branch2.7.

[~leftnoteasy] could you take a look when you have a chance?

 Interaction between reservations and userlimit can result in significant ULF 
 violation
 --

 Key: YARN-3434
 URL: https://issues.apache.org/jira/browse/YARN-3434
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.6.0
Reporter: Thomas Graves
Assignee: Thomas Graves
 Fix For: 2.8.0

 Attachments: YARN-3434-branch2.7.patch, YARN-3434.patch, 
 YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, 
 YARN-3434.patch, YARN-3434.patch


 ULF was set to 1.0.
 A user was able to consume 1.4X the queue capacity.
 It looks like when this application launched, it reserved about 1000 
 containers, 8G each, within about 5 seconds. I think this allowed the logic 
 in assignToUser() to let the user limit be surpassed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3505) Node's Log Aggregation Report with SUCCEED should not cached in RMApps

2015-05-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533163#comment-14533163
 ] 

Hadoop QA commented on YARN-3505:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m  7s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 41s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 50s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 16s | The applied patch generated  3 
new checkstyle issues (total was 70, now 64). |
| {color:green}+1{color} | whitespace |   0m 17s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 38s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m 12s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   0m 24s | Tests passed in 
hadoop-yarn-server-common. |
| {color:red}-1{color} | yarn tests |   5m 58s | Tests failed in 
hadoop-yarn-server-nodemanager. |
| {color:red}-1{color} | yarn tests |  63m 52s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | | 110m 37s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | 
hadoop.yarn.server.nodemanager.webapp.TestNMWebServicesApps |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| Timed out tests | 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation
 |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12731215/YARN-3505.2.rebase.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 8e991f4 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7764/artifact/patchprocess/diffcheckstylehadoop-yarn-server-common.txt
 |
| hadoop-yarn-server-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7764/artifact/patchprocess/testrun_hadoop-yarn-server-common.txt
 |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7764/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7764/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7764/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7764/console |


This message was automatically generated.

 Node's Log Aggregation Report with SUCCEED should not cached in RMApps
 --

 Key: YARN-3505
 URL: https://issues.apache.org/jira/browse/YARN-3505
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: log-aggregation
Affects Versions: 2.8.0
Reporter: Junping Du
Assignee: Xuan Gong
Priority: Critical
 Attachments: YARN-3505.1.patch, YARN-3505.2.patch, 
 YARN-3505.2.rebase.patch


 Per the discussions in YARN-1402, we shouldn't cache every node's log 
 aggregation report in RMApps forever, especially for those finished with 
 SUCCEED.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3362) Add node label usage in RM CapacityScheduler web UI

2015-05-07 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533187#comment-14533187
 ] 

Wangda Tan commented on YARN-3362:
--

bq. I put a considerable amount of effort into this, but it didn't work out. 
Hamlet really should come with some documentation or a book; it reads like 
enigma code to me. I will file a separate JIRA, since the fix is needed in 
many places on the CS page: between Active Users Info and the following queue, 
between the Dump scheduler log button and the Application Queue, and in the CS 
Health block.

Thanks for looking at this, that sounds good, we can address all of them 
together in a separate JIRA.

 Add node label usage in RM CapacityScheduler web UI
 ---

 Key: YARN-3362
 URL: https://issues.apache.org/jira/browse/YARN-3362
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, resourcemanager, webapp
Reporter: Wangda Tan
Assignee: Naganarasimha G R
 Attachments: 2015.05.06 Folded Queues.png, 2015.05.06 Queue 
 Expanded.png, 2015.05.07_3362_Queue_Hierarchy.png, CSWithLabelsView.png, 
 No-space-between-Active_user_info-and-next-queues.png, Screen Shot 2015-04-29 
 at 11.42.17 AM.png, YARN-3362.20150428-3-modified.patch, 
 YARN-3362.20150428-3.patch, YARN-3362.20150506-1.patch, 
 YARN-3362.20150507-1.patch, capacity-scheduler.xml


 We don't show node label usage in the RM CapacityScheduler web UI now; 
 without it, users will find it hard to understand what happened to nodes that 
 have labels assigned to them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3581) Deprecate -directlyAccessNodeLabelStore in RMAdminCLI

2015-05-07 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533156#comment-14533156
 ] 

Wangda Tan commented on YARN-3581:
--

Hi [~Naganarasimha],
Thanks for trying and thinking about this, assigning to you, please go ahead. 

#1/#3/#4 make sense to me

For #2, good point. I think we need to add quotes around 
{{label(...),label(...)}}, and this should go into the usage/help message as 
well. I'm open to changing ( ) to [ ], but I'm not sure whether other shells 
require escaping [..] too. For now I prefer to keep (...) and not add new 
syntax, but document it in the help message.

In addition, similar to #2, -replaceLabelsOnNode has a problem when the admin 
specifies multiple host= entries on the same line without quoting them. I 
think we should mention the quoting in the help message as well, and if the 
admin forgets the quotes we should handle it rather than ignore every host= 
after the first one.

Let me know your thoughts.

Thanks,
Wangda

 Deprecate -directlyAccessNodeLabelStore in RMAdminCLI
 -

 Key: YARN-3581
 URL: https://issues.apache.org/jira/browse/YARN-3581
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, client, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan

 In 2.6.0, we added an option called -directlyAccessNodeLabelStore so that the 
 RM can start with label-configured queue settings. After YARN-2918, we don't 
 need this option any more: the admin can configure queue settings, start the 
 RM and configure node labels via RMAdminCLI without any error.
 In addition, this option is very restrictive. First, it needs to run on the 
 same node where the RM is running if the admin configured labels to be stored 
 on local disk.
 Second, if the admin runs the option while the RM is running, multiple 
 processes may write to the same file, which could make the node label store 
 invalid.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3581) Deprecate -directlyAccessNodeLabelStore in RMAdminCLI

2015-05-07 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-3581:
-
Assignee: Naganarasimha G R  (was: Wangda Tan)

 Deprecate -directlyAccessNodeLabelStore in RMAdminCLI
 -

 Key: YARN-3581
 URL: https://issues.apache.org/jira/browse/YARN-3581
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, client, resourcemanager
Reporter: Wangda Tan
Assignee: Naganarasimha G R

 In 2.6.0, we added an option called -directlyAccessNodeLabelStore so that the 
 RM can start with label-configured queue settings. After YARN-2918, we don't 
 need this option any more: the admin can configure queue settings, start the 
 RM and configure node labels via RMAdminCLI without any error.
 In addition, this option is very restrictive. First, it needs to run on the 
 same node where the RM is running if the admin configured labels to be stored 
 on local disk.
 Second, if the admin runs the option while the RM is running, multiple 
 processes may write to the same file, which could make the node label store 
 invalid.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3593) Modifications in Node Labels Page for Partitions

2015-05-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533234#comment-14533234
 ] 

Hadoop QA commented on YARN-3593:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 39s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 32s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 33s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 45s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 16s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |  53m 40s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  89m 57s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12731228/YARN-3593.20150507-1.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / daf3e4e |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7769/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7769/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7769/console |


This message was automatically generated.

 Modifications in Node Labels Page for Partitions
 

 Key: YARN-3593
 URL: https://issues.apache.org/jira/browse/YARN-3593
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: webapp
Affects Versions: 2.6.0, 2.7.0
Reporter: Naganarasimha G R
Assignee: Naganarasimha G R
 Attachments: NodeLabelsPageModifications.png, 
 YARN-3593.20150507-1.patch


 Need to support displaying the label type in the Node Labels page, and 
 instead of NO_LABEL we need to show DEFAULT_PARTITION.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2821) Distributed shell app master becomes unresponsive sometimes

2015-05-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533233#comment-14533233
 ] 

Hadoop QA commented on YARN-2821:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 49s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 35s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 37s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 22s | The applied patch generated  6 
new checkstyle issues (total was 151, now 155). |
| {color:green}+1{color} | whitespace |   0m  2s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 35s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:red}-1{color} | yarn tests |  15m 25s | Tests failed in 
hadoop-yarn-applications-distributedshell. |
| | |  50m 56s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | 
hadoop.yarn.applications.distributedshell.TestDistributedShellWithNodeLabels |
| Timed out tests | 
org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12731235/YARN-2821.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / daf3e4e |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7770/artifact/patchprocess/diffcheckstylehadoop-yarn-applications-distributedshell.txt
 |
| hadoop-yarn-applications-distributedshell test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7770/artifact/patchprocess/testrun_hadoop-yarn-applications-distributedshell.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7770/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7770/console |


This message was automatically generated.

 Distributed shell app master becomes unresponsive sometimes
 ---

 Key: YARN-2821
 URL: https://issues.apache.org/jira/browse/YARN-2821
 Project: Hadoop YARN
  Issue Type: Bug
  Components: applications/distributed-shell
Affects Versions: 2.5.1
Reporter: Varun Vasudev
Assignee: Varun Vasudev
 Attachments: YARN-2821.002.patch, apache-yarn-2821.0.patch, 
 apache-yarn-2821.1.patch


 We've noticed that once in a while the distributed shell app master becomes 
 unresponsive and is eventually killed by the RM. snippet of the logs -
 {noformat}
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: 
 appattempt_1415123350094_0017_01 received 0 previous attempts' running 
 containers on AM registration.
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:38 INFO impl.AMRMClientImpl: Received new token for : 
 onprem-tez2:45454
 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Got response from 
 RM for container ask, allocatedCnt=1
 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Launching shell 
 command on a new container., 
 containerId=container_1415123350094_0017_01_02, 
 containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
 containerResourceMemory1024, containerResourceVirtualCores1
 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Setting up 
 container launch container for 
 containerid=container_1415123350094_0017_01_02
 14/11/04 

[jira] [Commented] (YARN-644) Basic null check is not performed on passed in arguments before using them in ContainerManagerImpl.startContainer

2015-05-07 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533304#comment-14533304
 ] 

Varun Saxena commented on YARN-644:
---

Fixed the tab issue and added tests.
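For illustration, a rough sketch of the kind of guard this issue describes; the 
helper name and exception type below are hypothetical and not taken from the 
actual patch:
{code}
// Hypothetical sketch: validate the token's container/attempt ids before use.
private void validateStartContainerArgs(ContainerTokenIdentifier tokenId) {
  if (tokenId == null
      || tokenId.getContainerID() == null
      || tokenId.getContainerID().getApplicationAttemptId() == null) {
    throw new IllegalArgumentException(
        "Invalid startContainer request: container or application attempt "
            + "id is missing from the container token");
  }
}
{code}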

 Basic null check is not performed on passed in arguments before using them in 
 ContainerManagerImpl.startContainer
 -

 Key: YARN-644
 URL: https://issues.apache.org/jira/browse/YARN-644
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Omkar Vinit Joshi
Assignee: Varun Saxena
Priority: Minor
  Labels: BB2015-05-TBR, newbie
 Attachments: YARN-644.001.patch, YARN-644.002.patch, YARN-644.03.patch


 I see that validation/ null check is not performed on passed in parameters. 
 Ex. tokenId.getContainerID().getApplicationAttemptId() inside 
 ContainerManagerImpl.authorizeRequest()
 I guess we should add these checks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2194) Add Cgroup support for RedHat 7

2015-05-07 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533075#comment-14533075
 ] 

Vinod Kumar Vavilapalli commented on YARN-2194:
---

Is there no API that we can use instead of spawning shells to set this up?

We should have some auto-detection to choose the right plugin for the right OS.

IAC, YARN-3443 changed the way resource isolation code is organized in the 
ResourceManager. YARN-3542 is migrating existing cgroups+cpu support to the new 
layout. We need to relook at this patch in light of those changes.

 Add Cgroup support for RedHat 7
 ---

 Key: YARN-2194
 URL: https://issues.apache.org/jira/browse/YARN-2194
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Wei Yan
Assignee: Wei Yan
 Attachments: YARN-2194-1.patch


In previous versions of RedHat, we could build custom cgroup hierarchies 
 with the cgconfig command from the libcgroup package. From RedHat 7, the 
 libcgroup package is deprecated and its use is not recommended, since it can 
 easily create conflicts with the default cgroup hierarchy. systemd is 
 provided and recommended for cgroup management instead. We need to add 
 support for this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2918) Don't fail RM if queue's configured labels are not existed in cluster-node-labels

2015-05-07 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533199#comment-14533199
 ] 

Wangda Tan commented on YARN-2918:
--

In addition, Test failure of TestFifoScheduler is not related to the patch (ran 
it locally).

 Don't fail RM if queue's configured labels are not existed in 
 cluster-node-labels
 -

 Key: YARN-2918
 URL: https://issues.apache.org/jira/browse/YARN-2918
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Rohith
Assignee: Wangda Tan
  Labels: BB2015-05-TBR
 Attachments: YARN-2918.1.patch, YARN-2918.2.patch, YARN-2918.3.patch


 Currently, if the admin sets up labels on queues via 
 {{queue-path.accessible-node-labels = ...}} and the label is not added to 
 the RM, the queue's initialization will fail and the RM will fail too:
 {noformat}
 2014-12-03 20:11:50,126 FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting 
 ResourceManager
 ...
 Caused by: java.io.IOException: NodeLabelManager doesn't include label = x, 
 please check.
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.checkIfLabelInClusterNodeLabels(SchedulerUtils.java:287)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.init(AbstractCSQueue.java:109)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.init(LeafQueue.java:120)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:567)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:587)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:462)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:294)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:324)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
 {noformat}
 This is not a good user experience; we should stop failing the RM so that the 
 admin can configure queues/labels in the following steps:
 - Configure queue (with label)
 - Start RM
 - Add labels to RM
 - Submit applications
 Now admin has to:
 - Configure queue (without label)
 - Start RM
 - Add labels to RM
 - Refresh queue's config (with label)
 - Submit applications



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3480) Make AM max attempts stored in RMAppImpl and RMStateStore to be configurable

2015-05-07 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533389#comment-14533389
 ] 

Jian He commented on YARN-3480:
---

[~hex108], thanks for your explanations.
I can see this will be a problem for long running apps, as the number of 
attempts will just keep increasing.
To be consistent with the attemptFailureValidityWindow for long running apps, 
instead of introducing a global hard limit, how about removing the attempt 
records that are beyond the validity window?
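For illustration, a rough sketch of pruning attempt records that fall outside 
the validity window; the method signature and map used here are purely 
illustrative, not the actual RMAppImpl/RMStateStore bookkeeping:
{code}
// Illustrative only: instead of a hard cap, drop recorded attempts whose
// finish time falls outside the failure-validity window.
void pruneAttemptsOutsideValidityWindow(
    Map<ApplicationAttemptId, Long> attemptFinishTimes,
    long validityWindowMs, long now) {
  Iterator<Map.Entry<ApplicationAttemptId, Long>> it =
      attemptFinishTimes.entrySet().iterator();
  while (it.hasNext()) {
    if (now - it.next().getValue() > validityWindowMs) {
      it.remove();
    }
  }
}
{code}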

 Make AM max attempts stored in RMAppImpl and RMStateStore to be configurable
 

 Key: YARN-3480
 URL: https://issues.apache.org/jira/browse/YARN-3480
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Jun Gong
Assignee: Jun Gong
 Attachments: YARN-3480.01.patch, YARN-3480.02.patch, 
 YARN-3480.03.patch


 When RM HA is enabled and running containers are kept across attempts, apps 
 are more likely to finish successfully with more retries (attempts), so it 
 will be better to set 'yarn.resourcemanager.am.max-attempts' larger. However, 
 this makes the RMStateStore (FileSystem/HDFS/ZK) store more attempts, and 
 makes the RM recovery process much slower. It might be better to make the 
 number of attempts stored in the RMStateStore configurable.
 BTW: When 'attemptFailuresValidityInterval' (introduced in YARN-611) is set 
 to a small value, the number of retried attempts might be very large, so we 
 need to delete some of the attempts stored in the RMStateStore and RMAppImpl.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3597) Fix the javadoc of DelegationTokenSecretManager in hadoop-hdfs

2015-05-07 Thread Gabor Liptak (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Liptak updated YARN-3597:
---
Attachment: YARN-3597.patch

 Fix the javadoc of DelegationTokenSecretManager in hadoop-hdfs
 --

 Key: YARN-3597
 URL: https://issues.apache.org/jira/browse/YARN-3597
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: documentation
Reporter: Gabor Liptak
Priority: Trivial
 Attachments: YARN-3597.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1329) yarn-config.sh overwrites YARN_CONF_DIR indiscriminately

2015-05-07 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533468#comment-14533468
 ] 

Li Lu commented on YARN-1329:
-

Quick note: this patch no longer applies to trunk. The current yarn-config.sh 
does not perform {{export YARN_CONF_DIR}}. We may want to verify if the problem 
still exists. 

 yarn-config.sh overwrites YARN_CONF_DIR indiscriminately 
 -

 Key: YARN-1329
 URL: https://issues.apache.org/jira/browse/YARN-1329
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager, resourcemanager
Reporter: Aaron Gottlieb
Assignee: haosdent
  Labels: BB2015-05-TBR, easyfix
 Attachments: YARN-1329.patch


 The script yarn-daemons.sh calls 
 {code}${HADOOP_LIBEXEC_DIR}/yarn-config.sh{code}
 yarn-config.sh overwrites any previously set value of environment variable 
 YARN_CONF_DIR starting at line 40:
 {code:title=yarn-config.sh|borderStyle=solid}
 #check to see if the conf dir is given as an optional argument
 if [ $# -gt 1 ]
 then
 if [ "--config" = "$1" ]
 then
 shift
 confdir=$1
 shift
 YARN_CONF_DIR=$confdir
 fi
 fi
  
 # Allow alternate conf dir location.
 export YARN_CONF_DIR=${HADOOP_CONF_DIR:-$HADOOP_YARN_HOME/conf}
 {code}
 The last line should check for the existence of YARN_CONF_DIR first.
 {code}
 DEFAULT_CONF_DIR=${HADOOP_CONF_DIR:-$YARN_HOME/conf}
 export YARN_CONF_DIR=${YARN_CONF_DIR:-$DEFAULT_CONF_DIR}
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2421) CapacityScheduler still allocates containers to an app in the FINISHING state

2015-05-07 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533524#comment-14533524
 ] 

Chang Li commented on YARN-2421:


Updated my patch; the current approach ignores allocate requests when the app 
attempt is in FINAL_SAVING or FINISHING, or when its final state has already 
been stored (isAppFinalStateStored).
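For illustration, a rough sketch of such a guard; the method is hypothetical, 
and mapping isAppFinalStateStored onto the FINISHED/FAILED/KILLED states is an 
assumption here, not taken from the patch:
{code}
// Illustrative only: skip allocation once the attempt is finishing or its
// final state has already been stored.
boolean shouldIgnoreAllocate(RMAppAttemptState state) {
  return state == RMAppAttemptState.FINAL_SAVING
      || state == RMAppAttemptState.FINISHING
      || state == RMAppAttemptState.FINISHED
      || state == RMAppAttemptState.FAILED
      || state == RMAppAttemptState.KILLED;
}
{code}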

 CapacityScheduler still allocates containers to an app in the FINISHING state
 -

 Key: YARN-2421
 URL: https://issues.apache.org/jira/browse/YARN-2421
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.4.1
Reporter: Thomas Graves
Assignee: Chang Li
 Attachments: YARN-2421.4.patch, yarn2421.patch, yarn2421.patch, 
 yarn2421.patch


 I saw an instance of a bad application master where it unregistered with the 
 RM but then continued to call into allocate.  The RMAppAttempt went to the 
 FINISHING state, but the capacity scheduler kept allocating it containers.   
 We should probably have the capacity scheduler check that the application 
 isn't in one of the terminal states before giving it containers. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3596) Fix the javadoc of DelegationTokenSecretManager in hadoop-common

2015-05-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533544#comment-14533544
 ] 

Hadoop QA commented on YARN-3596:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 12s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 40s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  1s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 21s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m  9s | The applied patch generated  1 
new checkstyle issues (total was 49, now 50). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 45s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | common tests |  23m 49s | Tests passed in 
hadoop-common. |
| | |  62m  7s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12731284/YARN-3596.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / b88700d |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7781/artifact/patchprocess/diffcheckstylehadoop-common.txt
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7781/artifact/patchprocess/testrun_hadoop-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7781/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7781/console |


This message was automatically generated.

 Fix the javadoc of DelegationTokenSecretManager in hadoop-common
 

 Key: YARN-3596
 URL: https://issues.apache.org/jira/browse/YARN-3596
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: documentation
Reporter: Gabor Liptak
Priority: Trivial
 Attachments: YARN-3596.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3600) AM container link is broken (on a killed application, at least)

2015-05-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533571#comment-14533571
 ] 

Sergey Shelukhin commented on YARN-3600:


[~vinodkv] ran into this recently... can you guys take a look

 AM container link is broken (on a killed application, at least)
 ---

 Key: YARN-3600
 URL: https://issues.apache.org/jira/browse/YARN-3600
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.8.0
Reporter: Sergey Shelukhin

 Running some fairly recent (couple weeks ago) version of 2.8.0-SNAPSHOT. 
 I have an application that ran fine for a while and then I yarn kill-ed it. 
 Now when I go to the only app attempt URL (like so: http://(snip RM host 
 name):8088/cluster/appattempt/appattempt_1429683757595_0795_01)
 I see:
 AM Container: container_1429683757595_0795_01_01
 Node: N/A 
 and the container link goes to {noformat}http://(snip RM host 
 name):8088/cluster/N/A
 {noformat}
 which obviously doesn't work



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3600) AM container link is broken (on a killed application, at least)

2015-05-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated YARN-3600:
---
Affects Version/s: 2.8.0

 AM container link is broken (on a killed application, at least)
 ---

 Key: YARN-3600
 URL: https://issues.apache.org/jira/browse/YARN-3600
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.8.0
Reporter: Sergey Shelukhin

 Running some fairly recent (couple weeks ago) version of 2.8.0-SNAPSHOT. 
 I have an application that ran fine for a while and then I yarn kill-ed it. 
 Now when I go to the only app attempt URL (like so: http://(snip RM host 
 name):8088/cluster/appattempt/appattempt_1429683757595_0795_01)
 I see:
 AM Container: container_1429683757595_0795_01_01
 Node: N/A 
 and the container link goes to {noformat}http://(snip RM host 
 name):8088/cluster/N/A
 {noformat}
 which obviously doesn't work



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3600) AM container link is broken (on a killed application, at least)

2015-05-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated YARN-3600:
---
Description: 
Running some fairly recent (couple weeks ago) version of 2.8.0-SNAPSHOT. 
I have an application that ran fine for a while and then I yarn kill-ed it. Now 
when I go to the only app attempt URL (like so: http://(snip RM host 
name):8088/cluster/appattempt/appattempt_1429683757595_0795_01)

I see:
AM Container:   container_1429683757595_0795_01_01
Node:   N/A 

and the container URL is {noformat}http://(snip RM host name):8088/cluster/N/A
{noformat}
which obviously doesn't work


  was:
Running some fairly recent (couple weeks ago) version of 2.8.0-SNAPSHOT. 
I have an application that ran fine for a while and then I yarn kill-ed it. Now 
when I go to the only app attempt URL (like so: 
http://(snip):8088/cluster/appattempt/appattempt_1429683757595_0795_01)

I see:
AM Container:   container_1429683757595_0795_01_01
Node:   N/A 

and the container URL is 
{noformat}http://cn042-10.l42scl.hortonworks.com:8088/cluster/N/A
{noformat}
which obviously doesn't work



 AM container link is broken (on a killed application, at least)
 ---

 Key: YARN-3600
 URL: https://issues.apache.org/jira/browse/YARN-3600
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Sergey Shelukhin

 Running some fairly recent (couple weeks ago) version of 2.8.0-SNAPSHOT. 
 I have an application that ran fine for a while and then I yarn kill-ed it. 
 Now when I go to the only app attempt URL (like so: http://(snip RM host 
 name):8088/cluster/appattempt/appattempt_1429683757595_0795_01)
 I see:
 AM Container: container_1429683757595_0795_01_01
 Node: N/A 
 and the container URL is {noformat}http://(snip RM host name):8088/cluster/N/A
 {noformat}
 which obviously doesn't work



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3600) AM container link is broken (on a killed application, at least)

2015-05-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated YARN-3600:
---
Description: 
Running some fairly recent (couple weeks ago) version of 2.8.0-SNAPSHOT. 
I have an application that ran fine for a while and then I yarn kill-ed it. Now 
when I go to the only app attempt URL (like so: http://(snip RM host 
name):8088/cluster/appattempt/appattempt_1429683757595_0795_01)

I see:
AM Container:   container_1429683757595_0795_01_01
Node:   N/A 

and the container link goes to {noformat}http://(snip RM host 
name):8088/cluster/N/A
{noformat}
which obviously doesn't work


  was:
Running some fairly recent (couple weeks ago) version of 2.8.0-SNAPSHOT. 
I have an application that ran fine for a while and then I yarn kill-ed it. Now 
when I go to the only app attempt URL (like so: http://(snip RM host 
name):8088/cluster/appattempt/appattempt_1429683757595_0795_01)

I see:
AM Container:   container_1429683757595_0795_01_01
Node:   N/A 

and the container URL is {noformat}http://(snip RM host name):8088/cluster/N/A
{noformat}
which obviously doesn't work



 AM container link is broken (on a killed application, at least)
 ---

 Key: YARN-3600
 URL: https://issues.apache.org/jira/browse/YARN-3600
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Sergey Shelukhin

 Running some fairly recent (couple weeks ago) version of 2.8.0-SNAPSHOT. 
 I have an application that ran fine for a while and then I yarn kill-ed it. 
 Now when I go to the only app attempt URL (like so: http://(snip RM host 
 name):8088/cluster/appattempt/appattempt_1429683757595_0795_01)
 I see:
 AM Container: container_1429683757595_0795_01_01
 Node: N/A 
 and the container link goes to {noformat}http://(snip RM host 
 name):8088/cluster/N/A
 {noformat}
 which obviously doesn't work



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3134) [Storage implementation] Exploiting the option of using Phoenix to access HBase backend

2015-05-07 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-3134:
--
Attachment: 
hadoop-zshen-nodemanager-d-128-95-184-84.dhcp4.washington.edu.out

I played with the latest patch locally, but I found the ConcurrentModificationException issue is still not resolved. I attached my NM log. Please take a look.

 [Storage implementation] Exploiting the option of using Phoenix to access 
 HBase backend
 ---

 Key: YARN-3134
 URL: https://issues.apache.org/jira/browse/YARN-3134
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Li Lu
 Attachments: SettingupPhoenixstorageforatimelinev2end-to-endtest.pdf, 
 YARN-3134-040915_poc.patch, YARN-3134-041015_poc.patch, 
 YARN-3134-041415_poc.patch, YARN-3134-042115.patch, YARN-3134-042715.patch, 
 YARN-3134-YARN-2928.001.patch, YARN-3134-YARN-2928.002.patch, 
 YARN-3134-YARN-2928.003.patch, YARN-3134-YARN-2928.004.patch, 
 YARN-3134-YARN-2928.005.patch, YARN-3134-YARN-2928.006.patch, 
 YARN-3134DataSchema.pdf, 
 hadoop-zshen-nodemanager-d-128-95-184-84.dhcp4.washington.edu.out


 Quote the introduction on Phoenix web page:
 {code}
 Apache Phoenix is a relational database layer over HBase delivered as a 
 client-embedded JDBC driver targeting low latency queries over HBase data. 
 Apache Phoenix takes your SQL query, compiles it into a series of HBase 
 scans, and orchestrates the running of those scans to produce regular JDBC 
 result sets. The table metadata is stored in an HBase table and versioned, 
 such that snapshot queries over prior versions will automatically use the 
 correct schema. Direct use of the HBase API, along with coprocessors and 
 custom filters, results in performance on the order of milliseconds for small 
 queries, or seconds for tens of millions of rows.
 {code}
 It may simplify our implementation's reading/writing of data from/to HBase, and make it easy to build indexes and compose complex queries.
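 For readers unfamiliar with Phoenix, here is a minimal sketch of what the client-embedded JDBC driver looks like in use; the table and column names below are made up for illustration and are not the proposed timeline schema.
{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

/** Minimal Phoenix-over-JDBC sketch; host, table, and column names are illustrative. */
public class PhoenixQuerySketch {
  public static void main(String[] args) throws Exception {
    // Phoenix connections use the jdbc:phoenix:<zookeeper-quorum> URL form.
    try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost")) {
      try (PreparedStatement ps = conn.prepareStatement(
          "SELECT entity_id, created_time FROM timeline_entity WHERE entity_type = ?")) {
        ps.setString(1, "YARN_APPLICATION");
        try (ResultSet rs = ps.executeQuery()) {
          while (rs.next()) {
            System.out.println(rs.getString(1) + " " + rs.getLong(2));
          }
        }
      }
    }
  }
}
{code}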



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

2015-05-07 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533604#comment-14533604
 ] 

Junping Du commented on YARN-3411:
--

Thank you, [~vrushalic]! :)

 [Storage implementation] explore the native HBase write schema for storage
 --

 Key: YARN-3411
 URL: https://issues.apache.org/jira/browse/YARN-3411
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Vrushali C
Priority: Critical
 Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
 YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, YARN-3411.poc.4.txt, 
 YARN-3411.poc.txt


 There is work that's in progress to implement the storage based on a Phoenix 
 schema (YARN-3134).
 In parallel, we would like to explore an implementation based on a native 
 HBase schema for the write path. Such a schema does not exclude using 
 Phoenix, especially for reads and offline queries.
 Once we have basic implementations of both options, we could evaluate them in 
 terms of performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3480) Make AM max attempts stored in RMAppImpl and RMStateStore to be configurable

2015-05-07 Thread Jun Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533731#comment-14533731
 ] 

Jun Gong commented on YARN-3480:


[~jianhe] just caught your comment. Do you mean that the configurable value for max attempts stored in RMAppImpl and RMStateStore is not needed at all, and that we could just remove the attempt records that are beyond the validity window? If so, I will update the patch.
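A rough sketch of the pruning idea being discussed here (keep only the attempt records that fall inside the validity window); the types and names below are simplified stand-ins, not the real RMStateStore/RMAppImpl API.
{code}
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Simplified stand-in types; not the real RMStateStore/RMAppImpl API. */
public class AttemptPruningSketch {
  static void pruneOutsideValidityWindow(Map<String, Long> attemptFinishTimes,
                                         long validityIntervalMs, long now) {
    // Drop stored attempts whose finish time falls outside the failure-validity window.
    Iterator<Map.Entry<String, Long>> it = attemptFinishTimes.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<String, Long> e = it.next();
      if (now - e.getValue() > validityIntervalMs) {
        it.remove();
      }
    }
  }

  public static void main(String[] args) {
    Map<String, Long> attempts = new LinkedHashMap<>();
    attempts.put("appattempt_1_0001_000001", 1_000L);
    attempts.put("appattempt_1_0001_000002", 9_000L);
    pruneOutsideValidityWindow(attempts, 5_000L, 10_000L);
    System.out.println(attempts.keySet()); // only the recent attempt remains
  }
}
{code}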

 Make AM max attempts stored in RMAppImpl and RMStateStore to be configurable
 

 Key: YARN-3480
 URL: https://issues.apache.org/jira/browse/YARN-3480
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Jun Gong
Assignee: Jun Gong
 Attachments: YARN-3480.01.patch, YARN-3480.02.patch, 
 YARN-3480.03.patch


 When RM HA is enabled and running containers are kept across attempts, apps are more likely to finish successfully with more retries (attempts), so it is better to set 'yarn.resourcemanager.am.max-attempts' larger. However, that makes the RMStateStore (FileSystem/HDFS/ZK) store more attempts and makes the RM recovery process much slower. It might be better to make the number of attempts stored in the RMStateStore configurable.
 BTW: When 'attemptFailuresValidityInterval' (introduced in YARN-611) is set to a small value, the number of retried attempts might be very large, so we need to delete some of the attempts stored in the RMStateStore and RMAppImpl.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (YARN-3134) [Storage implementation] Exploiting the option of using Phoenix to access HBase backend

2015-05-07 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533594#comment-14533594
 ] 

Zhijie Shen edited comment on YARN-3134 at 5/7/15 11:47 PM:


I played with the latest patch locally, but I found the ConcurrentModificationException issue is still not resolved. I attached my NM log. Please take a look.

My local environment: 1. HBase 0.98.11; 2. Phoenix 4.3.1


was (Author: zjshen):
I played with the latest patch locally, but I found the ConcurrentModificationException issue is still not resolved. I attached my NM log. Please take a look.

 [Storage implementation] Exploiting the option of using Phoenix to access 
 HBase backend
 ---

 Key: YARN-3134
 URL: https://issues.apache.org/jira/browse/YARN-3134
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Li Lu
 Attachments: SettingupPhoenixstorageforatimelinev2end-to-endtest.pdf, 
 YARN-3134-040915_poc.patch, YARN-3134-041015_poc.patch, 
 YARN-3134-041415_poc.patch, YARN-3134-042115.patch, YARN-3134-042715.patch, 
 YARN-3134-YARN-2928.001.patch, YARN-3134-YARN-2928.002.patch, 
 YARN-3134-YARN-2928.003.patch, YARN-3134-YARN-2928.004.patch, 
 YARN-3134-YARN-2928.005.patch, YARN-3134-YARN-2928.006.patch, 
 YARN-3134DataSchema.pdf, 
 hadoop-zshen-nodemanager-d-128-95-184-84.dhcp4.washington.edu.out


 Quote the introduction on Phoenix web page:
 {code}
 Apache Phoenix is a relational database layer over HBase delivered as a 
 client-embedded JDBC driver targeting low latency queries over HBase data. 
 Apache Phoenix takes your SQL query, compiles it into a series of HBase 
 scans, and orchestrates the running of those scans to produce regular JDBC 
 result sets. The table metadata is stored in an HBase table and versioned, 
 such that snapshot queries over prior versions will automatically use the 
 correct schema. Direct use of the HBase API, along with coprocessors and 
 custom filters, results in performance on the order of milliseconds for small 
 queries, or seconds for tens of millions of rows.
 {code}
 It may simplify our implementation's reading/writing of data from/to HBase, and make it easy to build indexes and compose complex queries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3584) [Log mesage correction] : MIssing space in Diagnostics message

2015-05-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533635#comment-14533635
 ] 

Hudson commented on YARN-3584:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #178 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/178/])
YARN-3584. Fixed attempt diagnostics format shown on the UI. Contributed by 
nijel (jianhe: rev b88700dcd0b9aa47662009241dfb83bc4446548d)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java


 [Log mesage correction] : MIssing space in Diagnostics message
 --

 Key: YARN-3584
 URL: https://issues.apache.org/jira/browse/YARN-3584
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: nijel
Assignee: nijel
Priority: Trivial
  Labels: newbie
 Fix For: 2.8.0

 Attachments: YARN-3584-1.patch, YARN-3584-2.patch


 For more detailed output, check application tracking page: 
 https://szxciitslx17640:26001/cluster/app/application_1430810985970_0020{color:red}Then{color},
  click on links to logs of each attempt.
 Here, {color:red}Then{color} is not part of the URL. It is better to use a space in between so that the URL can be copied directly for analysis.
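 A toy before/after illustration of the spacing fix; this is not the actual RMAppAttemptImpl code, just the string-building pattern involved.
{code}
/** Toy illustration of the spacing fix; not the actual RMAppAttemptImpl code. */
public class DiagnosticsSpacingSketch {
  public static void main(String[] args) {
    String trackingUrl = "https://host:26001/cluster/app/application_1430810985970_0020";
    // Before: the URL and the next sentence run together, so "Then" looks like part of the URL.
    String broken = "For more detailed output, check application tracking page: "
        + trackingUrl + "Then, click on links to logs of each attempt.";
    // After: a separator keeps the URL copy-pasteable.
    String fixed = "For more detailed output, check application tracking page: "
        + trackingUrl + " Then click on links to logs of each attempt.";
    System.out.println(broken);
    System.out.println(fixed);
  }
}
{code}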



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3448) Add Rolling Time To Lives Level DB Plugin Capabilities

2015-05-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533637#comment-14533637
 ] 

Hudson commented on YARN-3448:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #178 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/178/])
YARN-3448. Added a rolling time-to-live LevelDB timeline store implementation. 
Contributed by Jonathan Eagles. (zjshen: rev 
daf3e4ef8bf73cbe4a799d51b4765809cd81089f)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/TestRollingLevelDBTimelineStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/RollingLevelDB.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/TestRollingLevelDB.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/TimelineDataManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/RollingLevelDBTimelineStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/util/LeveldbUtils.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/TestLeveldbTimelineStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/pom.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timeline/TimelinePutResponse.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/TimelineStoreTestUtils.java


 Add Rolling Time To Lives Level DB Plugin Capabilities
 --

 Key: YARN-3448
 URL: https://issues.apache.org/jira/browse/YARN-3448
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles
 Fix For: 2.8.0

 Attachments: YARN-3448.1.patch, YARN-3448.10.patch, 
 YARN-3448.12.patch, YARN-3448.13.patch, YARN-3448.14.patch, 
 YARN-3448.15.patch, YARN-3448.16.patch, YARN-3448.17.patch, 
 YARN-3448.2.patch, YARN-3448.3.patch, YARN-3448.4.patch, YARN-3448.5.patch, 
 YARN-3448.7.patch, YARN-3448.8.patch, YARN-3448.9.patch


 For large applications, the majority of the time in LeveldbTimelineStore is spent deleting old entities one record at a time. An exclusive write lock is held during the entire deletion phase, which in practice can last hours. If we are willing to relax some of the consistency constraints, other performance-enhancing techniques can be employed to maximize throughput and minimize locking time.
 Split the 5 sections of the leveldb database (domain, owner, start time, entity, index) into 5 separate databases. This allows each database to maximize read-cache effectiveness based on its unique usage patterns. With 5 separate databases each lookup is much faster, and it can also help with I/O to keep the entity and index databases on separate disks.
 Rolling DBs for the entity and index DBs. 99.9% of the data is in these two sections, at roughly a 4:1 ratio (index to entity), at least for Tez. We replace DB record removal with file-system removal by creating a rolling set of databases that age out and can be removed efficiently. To do this we must add a constraint that an entity's events always land in the correct rolling DB instance based on start time. This allows us to stitch the data back together while reading and enables artificial paging.
 Relax the synchronous-write constraints. If we are willing to accept losing some records that were not flushed by the operating system during a crash, we can use async writes, which can be much faster.
 Prefer sequential writes. Sequential writes can be several times faster than random writes. Spend some small effort arranging the writes in a way that trends towards sequential write performance over random write performance.
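 A small, self-contained sketch of two of the ideas above, bucketing by start time and replacing record-at-a-time deletion with whole-bucket removal; it uses plain in-memory maps as stand-ins for the on-disk LevelDB instances, and the rolling period is an assumed example.
{code}
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

/** Pure-Java sketch of the rolling-DB idea: entities are bucketed by start time,
 *  and expiry drops whole buckets instead of deleting record by record.
 *  In the real store each bucket would be a separate LevelDB instance on disk. */
public class RollingBucketSketch {
  static final long ROLL_PERIOD_MS = 60L * 60 * 1000;   // illustrative: one bucket per hour

  // bucket start time -> (entity key -> serialized entity)
  final TreeMap<Long, Map<String, byte[]>> buckets = new TreeMap<Long, Map<String, byte[]>>();

  void write(String entityKey, long entityStartTime, byte[] value) {
    long bucketStart = entityStartTime - (entityStartTime % ROLL_PERIOD_MS);
    // Always place an entity in the bucket chosen by its start time, so reads can
    // stitch results back together deterministically.
    Map<String, byte[]> bucket = buckets.get(bucketStart);
    if (bucket == null) {
      bucket = new ConcurrentHashMap<String, byte[]>();
      buckets.put(bucketStart, bucket);
    }
    bucket.put(entityKey, value);
  }

  void expire(long ttlMs, long now) {
    // Whole-bucket removal replaces record-at-a-time deletion (and the long write lock).
    buckets.headMap(now - ttlMs).clear();
  }

  public static void main(String[] args) {
    RollingBucketSketch s = new RollingBucketSketch();
    s.write("entity!1", 1_000L, new byte[0]);
    s.expire(ROLL_PERIOD_MS, 2 * ROLL_PERIOD_MS + 1);
    System.out.println(s.buckets.size());   // 0 -- the aged-out bucket was dropped
  }
}
{code}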



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3134) [Storage implementation] Exploiting the option of using Phoenix to access HBase backend

2015-05-07 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533656#comment-14533656
 ] 

Zhijie Shen commented on YARN-3134:
---

Noticed pom.xml is using Phoenix 4.3.0. I retried with this version, and the problem still happened.

 [Storage implementation] Exploiting the option of using Phoenix to access 
 HBase backend
 ---

 Key: YARN-3134
 URL: https://issues.apache.org/jira/browse/YARN-3134
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Li Lu
 Attachments: SettingupPhoenixstorageforatimelinev2end-to-endtest.pdf, 
 YARN-3134-040915_poc.patch, YARN-3134-041015_poc.patch, 
 YARN-3134-041415_poc.patch, YARN-3134-042115.patch, YARN-3134-042715.patch, 
 YARN-3134-YARN-2928.001.patch, YARN-3134-YARN-2928.002.patch, 
 YARN-3134-YARN-2928.003.patch, YARN-3134-YARN-2928.004.patch, 
 YARN-3134-YARN-2928.005.patch, YARN-3134-YARN-2928.006.patch, 
 YARN-3134DataSchema.pdf, 
 hadoop-zshen-nodemanager-d-128-95-184-84.dhcp4.washington.edu.out


 Quote the introduction on Phoenix web page:
 {code}
 Apache Phoenix is a relational database layer over HBase delivered as a 
 client-embedded JDBC driver targeting low latency queries over HBase data. 
 Apache Phoenix takes your SQL query, compiles it into a series of HBase 
 scans, and orchestrates the running of those scans to produce regular JDBC 
 result sets. The table metadata is stored in an HBase table and versioned, 
 such that snapshot queries over prior versions will automatically use the 
 correct schema. Direct use of the HBase API, along with coprocessors and 
 custom filters, results in performance on the order of milliseconds for small 
 queries, or seconds for tens of millions of rows.
 {code}
 It may simplify our implementation's reading/writing of data from/to HBase, and make it easy to build indexes and compose complex queries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3599) Fix the javadoc of DelegationTokenSecretManager in hadoop-yarn

2015-05-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533664#comment-14533664
 ] 

Hadoop QA commented on YARN-3599:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 39s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 32s | There were no new javac warning 
messages. |
| {color:red}-1{color} | javadoc |   9m 40s | The applied patch generated  1  
additional warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 49s | The applied patch generated  1 
new checkstyle issues (total was 8, now 6). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m  5s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   3m  8s | Tests passed in 
hadoop-yarn-server-applicationhistoryservice. |
| {color:green}+1{color} | yarn tests |  52m 14s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  93m  3s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12731288/YARN-3599.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 4d9f9e5 |
| javadoc | 
https://builds.apache.org/job/PreCommit-YARN-Build/7784/artifact/patchprocess/diffJavadocWarnings.txt
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7784/artifact/patchprocess/diffcheckstylehadoop-yarn-server-applicationhistoryservice.txt
 |
| hadoop-yarn-server-applicationhistoryservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7784/artifact/patchprocess/testrun_hadoop-yarn-server-applicationhistoryservice.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7784/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7784/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7784/console |


This message was automatically generated.

 Fix the javadoc of DelegationTokenSecretManager in hadoop-yarn
 --

 Key: YARN-3599
 URL: https://issues.apache.org/jira/browse/YARN-3599
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: documentation
Reporter: Gabor Liptak
Priority: Trivial
 Attachments: YARN-3599.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3134) [Storage implementation] Exploiting the option of using Phoenix to access HBase backend

2015-05-07 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533670#comment-14533670
 ] 

Li Lu commented on YARN-3134:
-

Sure, I'll make those clean up changes. 

 [Storage implementation] Exploiting the option of using Phoenix to access 
 HBase backend
 ---

 Key: YARN-3134
 URL: https://issues.apache.org/jira/browse/YARN-3134
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Li Lu
 Attachments: SettingupPhoenixstorageforatimelinev2end-to-endtest.pdf, 
 YARN-3134-040915_poc.patch, YARN-3134-041015_poc.patch, 
 YARN-3134-041415_poc.patch, YARN-3134-042115.patch, YARN-3134-042715.patch, 
 YARN-3134-YARN-2928.001.patch, YARN-3134-YARN-2928.002.patch, 
 YARN-3134-YARN-2928.003.patch, YARN-3134-YARN-2928.004.patch, 
 YARN-3134-YARN-2928.005.patch, YARN-3134-YARN-2928.006.patch, 
 YARN-3134DataSchema.pdf, 
 hadoop-zshen-nodemanager-d-128-95-184-84.dhcp4.washington.edu.out


 Quote the introduction on Phoenix web page:
 {code}
 Apache Phoenix is a relational database layer over HBase delivered as a 
 client-embedded JDBC driver targeting low latency queries over HBase data. 
 Apache Phoenix takes your SQL query, compiles it into a series of HBase 
 scans, and orchestrates the running of those scans to produce regular JDBC 
 result sets. The table metadata is stored in an HBase table and versioned, 
 such that snapshot queries over prior versions will automatically use the 
 correct schema. Direct use of the HBase API, along with coprocessors and 
 custom filters, results in performance on the order of milliseconds for small 
 queries, or seconds for tens of millions of rows.
 {code}
 It may simplify our implementation's reading/writing of data from/to HBase, and make it easy to build indexes and compose complex queries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3134) [Storage implementation] Exploiting the option of using Phoenix to access HBase backend

2015-05-07 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533671#comment-14533671
 ] 

Li Lu commented on YARN-3134:
-

Looking into this. Seems like we need to further fine-tune the synchronization.
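As a generic illustration of the failure mode being chased here and one common way to avoid it (this is not the Phoenix timeline writer code itself):
{code}
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Generic ConcurrentModificationException illustration; not the actual timeline writer. */
public class CmeSketch {
  public static void main(String[] args) throws Exception {
    // A plain ArrayList iterated by one thread while another mutates it can throw
    // ConcurrentModificationException; a concurrent collection (or guarding every
    // access with the same lock) avoids it.
    List<String> pending = new CopyOnWriteArrayList<>();   // used here instead of ArrayList
    pending.add("entity-1");

    Thread writer = new Thread(() -> pending.add("entity-2"));
    writer.start();
    for (String e : pending) {          // safe: the iterator works on a snapshot
      System.out.println(e);
    }
    writer.join();
  }
}
{code}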

 [Storage implementation] Exploiting the option of using Phoenix to access 
 HBase backend
 ---

 Key: YARN-3134
 URL: https://issues.apache.org/jira/browse/YARN-3134
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Li Lu
 Attachments: SettingupPhoenixstorageforatimelinev2end-to-endtest.pdf, 
 YARN-3134-040915_poc.patch, YARN-3134-041015_poc.patch, 
 YARN-3134-041415_poc.patch, YARN-3134-042115.patch, YARN-3134-042715.patch, 
 YARN-3134-YARN-2928.001.patch, YARN-3134-YARN-2928.002.patch, 
 YARN-3134-YARN-2928.003.patch, YARN-3134-YARN-2928.004.patch, 
 YARN-3134-YARN-2928.005.patch, YARN-3134-YARN-2928.006.patch, 
 YARN-3134DataSchema.pdf, 
 hadoop-zshen-nodemanager-d-128-95-184-84.dhcp4.washington.edu.out


 Quote the introduction on Phoenix web page:
 {code}
 Apache Phoenix is a relational database layer over HBase delivered as a 
 client-embedded JDBC driver targeting low latency queries over HBase data. 
 Apache Phoenix takes your SQL query, compiles it into a series of HBase 
 scans, and orchestrates the running of those scans to produce regular JDBC 
 result sets. The table metadata is stored in an HBase table and versioned, 
 such that snapshot queries over prior versions will automatically use the 
 correct schema. Direct use of the HBase API, along with coprocessors and 
 custom filters, results in performance on the order of milliseconds for small 
 queries, or seconds for tens of millions of rows.
 {code}
 It may simplify our implementation's reading/writing of data from/to HBase, and make it easy to build indexes and compose complex queries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2421) CapacityScheduler still allocates containers to an app in the FINISHING state

2015-05-07 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated YARN-2421:
---
Affects Version/s: (was: 2.4.1)

 CapacityScheduler still allocates containers to an app in the FINISHING state
 -

 Key: YARN-2421
 URL: https://issues.apache.org/jira/browse/YARN-2421
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: Thomas Graves
Assignee: Chang Li
 Attachments: YARN-2421.4.patch, yarn2421.patch, yarn2421.patch, 
 yarn2421.patch


 I saw an instance of a bad application master where it unregistered with the 
 RM but then continued to call into allocate.  The RMAppAttempt went to the 
 FINISHING state, but the capacity scheduler kept allocating it containers.   
 We should probably have the capacity scheduler check that the application 
 isn't in one of the terminal states before giving it containers. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2918) Don't fail RM if queue's configured labels are not existed in cluster-node-labels

2015-05-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533716#comment-14533716
 ] 

Hudson commented on YARN-2918:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #7764 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7764/])
YARN-2918. RM should not fail on startup if queue's configured labels do not 
exist in cluster-node-labels. Contributed by Wangda Tan (jianhe: rev 
f489a4ec969f3727d03c8e85d51af1018fc0b2a1)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestQueueParsing.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestSchedulerUtils.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueueUtils.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java


 Don't fail RM if queue's configured labels are not existed in 
 cluster-node-labels
 -

 Key: YARN-2918
 URL: https://issues.apache.org/jira/browse/YARN-2918
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Rohith
Assignee: Wangda Tan
 Fix For: 2.8.0

 Attachments: YARN-2918.1.patch, YARN-2918.2.patch, YARN-2918.3.patch


 Currently, if an admin sets up labels on queues ({{queue-path.accessible-node-labels = ...}}) and the label is not added to the RM, the queue's initialization will fail and the RM will fail too:
 {noformat}
 2014-12-03 20:11:50,126 FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting 
 ResourceManager
 ...
 Caused by: java.io.IOException: NodeLabelManager doesn't include label = x, 
 please check.
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.checkIfLabelInClusterNodeLabels(SchedulerUtils.java:287)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.init(AbstractCSQueue.java:109)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.init(LeafQueue.java:120)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:567)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:587)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:462)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:294)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:324)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
 {noformat}
 This is not a good user experience; we should stop failing the RM so that the admin can configure queues/labels in the following steps:
 - Configure queue (with label)
 - Start RM
 - Add labels to RM
 - Submit applications
 Now admin has to:
 - Configure queue (without label)
 - Start RM
 - Add labels to RM
 - Refresh queue's config (with label)
 - Submit applications
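 For reference, a hedged sketch of the two pieces involved, the per-queue label setting and adding the label to the RM; the queue name "root.a" and label "x" are examples only, not part of this JIRA.
{code}
import org.apache.hadoop.conf.Configuration;

/** Sketch of the configuration pieces involved; queue "root.a" and label "x" are examples. */
public class QueueLabelConfigSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Per-queue setting (normally in capacity-scheduler.xml) that used to make RM startup
    // fail when the label was not yet known to the cluster:
    conf.set("yarn.scheduler.capacity.root.a.accessible-node-labels", "x");
    System.out.println(conf.get("yarn.scheduler.capacity.root.a.accessible-node-labels"));
    // The label itself is added to the RM separately, e.g.:
    //   yarn rmadmin -addToClusterNodeLabels x
    // With this fix, the ordering of the two steps no longer matters.
  }
}
{code}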



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2421) CapacityScheduler still allocates containers to an app in the FINISHING state

2015-05-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533727#comment-14533727
 ] 

Hadoop QA commented on YARN-2421:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 39s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 34s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 33s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 51s | The applied patch generated  2 
new checkstyle issues (total was 30, now 31). |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 14s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |  52m 19s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  88m 41s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12731296/YARN-2421.4.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 4d9f9e5 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7785/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/7785/artifact/patchprocess/whitespace.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7785/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7785/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7785/console |


This message was automatically generated.

 CapacityScheduler still allocates containers to an app in the FINISHING state
 -

 Key: YARN-2421
 URL: https://issues.apache.org/jira/browse/YARN-2421
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: Thomas Graves
Assignee: Chang Li
 Attachments: YARN-2421.4.patch, yarn2421.patch, yarn2421.patch, 
 yarn2421.patch


 I saw an instance of a bad application master where it unregistered with the 
 RM but then continued to call into allocate.  The RMAppAttempt went to the 
 FINISHING state, but the capacity scheduler kept allocating it containers.   
 We should probably have the capacity scheduler check that the application 
 isn't in one of the terminal states before giving it containers. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2918) Don't fail RM if queue's configured labels are not existed in cluster-node-labels

2015-05-07 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-2918:
--
Labels:   (was: BB2015-05-TBR)

 Don't fail RM if queue's configured labels are not existed in 
 cluster-node-labels
 -

 Key: YARN-2918
 URL: https://issues.apache.org/jira/browse/YARN-2918
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Rohith
Assignee: Wangda Tan
 Fix For: 2.8.0

 Attachments: YARN-2918.1.patch, YARN-2918.2.patch, YARN-2918.3.patch


 Currently, if an admin sets up labels on queues ({{queue-path.accessible-node-labels = ...}}) and the label is not added to the RM, the queue's initialization will fail and the RM will fail too:
 {noformat}
 2014-12-03 20:11:50,126 FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting 
 ResourceManager
 ...
 Caused by: java.io.IOException: NodeLabelManager doesn't include label = x, 
 please check.
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.checkIfLabelInClusterNodeLabels(SchedulerUtils.java:287)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.init(AbstractCSQueue.java:109)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.init(LeafQueue.java:120)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:567)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:587)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:462)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:294)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:324)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
 {noformat}
 This is not a good user experience; we should stop failing the RM so that the admin can configure queues/labels in the following steps:
 - Configure queue (with label)
 - Start RM
 - Add labels to RM
 - Submit applications
 Now admin has to:
 - Configure queue (without label)
 - Start RM
 - Add labels to RM
 - Refresh queue's config (with label)
 - Submit applications



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3597) Fix the javadoc of DelegationTokenSecretManager in hadoop-hdfs

2015-05-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533710#comment-14533710
 ] 

Hadoop QA commented on YARN-3597:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 20s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 47s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  0s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 40s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 37s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m  7s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native |   3m 28s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 165m 32s | Tests failed in hadoop-hdfs. |
| | | 208m 29s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.tracing.TestTraceAdmin |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12731285/YARN-3597.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / b88700d |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7780/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7780/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7780/console |


This message was automatically generated.

 Fix the javadoc of DelegationTokenSecretManager in hadoop-hdfs
 --

 Key: YARN-3597
 URL: https://issues.apache.org/jira/browse/YARN-3597
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: documentation
Reporter: Gabor Liptak
Priority: Trivial
 Attachments: YARN-3597.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1427) yarn-env.cmd should have the analog comments that are in yarn-env.sh

2015-05-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533711#comment-14533711
 ] 

Hadoop QA commented on YARN-1427:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12731303/YARN-1427-trunk.2.patch
 |
| Optional Tests |  |
| git revision | trunk / f489a4e |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7787/console |


This message was automatically generated.

 yarn-env.cmd should have the analog comments that are in yarn-env.sh
 

 Key: YARN-1427
 URL: https://issues.apache.org/jira/browse/YARN-1427
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Zhijie Shen
  Labels: BB2015-05-TBR, newbie, windows
 Attachments: YARN-1427-trunk.2.patch, YARN-1427.1.patch


 There are paragraphs about RM/NM env vars (probably AHS as well soon) in yarn-env.sh. Should the Windows version of the script provide similar comments?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3276) Refactor and fix null casting in some map cast for TimelineEntity (old and new)

2015-05-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533742#comment-14533742
 ] 

Hadoop QA commented on YARN-3276:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 48s | Pre-patch YARN-2928 compilation 
is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 37s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 54s | There were no new javadoc 
warning messages. |
| {color:red}-1{color} | release audit |   1m 23s | The applied patch generated 
13 release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 16s | The applied patch generated  3 
new checkstyle issues (total was 76, now 79). |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 2  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 36s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   2m 56s | The patch appears to introduce 4 
new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   0m 22s | Tests passed in 
hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |   1m 57s | Tests passed in 
hadoop-yarn-common. |
| | |  42m 37s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-api |
|  |  
org.apache.hadoop.yarn.api.records.timelineservice.TimelineMetric$1.compare(Long,
 Long) negates the return value of Long.compareTo(Long)  At 
TimelineMetric.java:value of Long.compareTo(Long)  At TimelineMetric.java:[line 
47] |
| FindBugs | module:hadoop-yarn-common |
|  |  Inconsistent synchronization of 
org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.builder;
 locked 92% of time  Unsynchronized access at AllocateResponsePBImpl.java:92% 
of time  Unsynchronized access at AllocateResponsePBImpl.java:[line 391] |
|  |  Inconsistent synchronization of 
org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.proto;
 locked 94% of time  Unsynchronized access at AllocateResponsePBImpl.java:94% 
of time  Unsynchronized access at AllocateResponsePBImpl.java:[line 391] |
|  |  Inconsistent synchronization of 
org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.viaProto;
 locked 94% of time  Unsynchronized access at AllocateResponsePBImpl.java:94% 
of time  Unsynchronized access at AllocateResponsePBImpl.java:[line 391] |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12731302/YARN-3276-YARN-2928.v3.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / d4a2362 |
| Release Audit | 
https://builds.apache.org/job/PreCommit-YARN-Build/7786/artifact/patchprocess/patchReleaseAuditProblems.txt
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7786/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/7786/artifact/patchprocess/whitespace.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/7786/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-api.html
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/7786/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-common.html
 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7786/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7786/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7786/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7786/console |


This message was automatically generated.

 Refactor and fix null casting in some map cast for TimelineEntity (old and 
 new)
 ---

 Key: YARN-3276
 URL: https://issues.apache.org/jira/browse/YARN-3276
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Junping Du
Assignee: Junping Du
 Attachments: YARN-3276-YARN-2928.v3.patch, YARN-3276-v2.patch, 
 YARN-3276-v3.patch, YARN-3276.patch



[jira] [Commented] (YARN-3134) [Storage implementation] Exploiting the option of using Phoenix to access HBase backend

2015-05-07 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533600#comment-14533600
 ] 

Junping Du commented on YARN-3134:
--

bq. We've also got some code cleanup work to do, but I put them in a priority 
lower than getting the performance evaluation done for now. 
I disagree. Actually, it usually takes more time for a reviewer to identify these code style issues than it would to simply fix them. Please respect that effort unless you have different opinions about these comments.

 [Storage implementation] Exploiting the option of using Phoenix to access 
 HBase backend
 ---

 Key: YARN-3134
 URL: https://issues.apache.org/jira/browse/YARN-3134
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Li Lu
 Attachments: SettingupPhoenixstorageforatimelinev2end-to-endtest.pdf, 
 YARN-3134-040915_poc.patch, YARN-3134-041015_poc.patch, 
 YARN-3134-041415_poc.patch, YARN-3134-042115.patch, YARN-3134-042715.patch, 
 YARN-3134-YARN-2928.001.patch, YARN-3134-YARN-2928.002.patch, 
 YARN-3134-YARN-2928.003.patch, YARN-3134-YARN-2928.004.patch, 
 YARN-3134-YARN-2928.005.patch, YARN-3134-YARN-2928.006.patch, 
 YARN-3134DataSchema.pdf, 
 hadoop-zshen-nodemanager-d-128-95-184-84.dhcp4.washington.edu.out


 Quote the introduction on Phoenix web page:
 {code}
 Apache Phoenix is a relational database layer over HBase delivered as a 
 client-embedded JDBC driver targeting low latency queries over HBase data. 
 Apache Phoenix takes your SQL query, compiles it into a series of HBase 
 scans, and orchestrates the running of those scans to produce regular JDBC 
 result sets. The table metadata is stored in an HBase table and versioned, 
 such that snapshot queries over prior versions will automatically use the 
 correct schema. Direct use of the HBase API, along with coprocessors and 
 custom filters, results in performance on the order of milliseconds for small 
 queries, or seconds for tens of millions of rows.
 {code}
 It may simplify our implementation's reading/writing of data from/to HBase, and make it easy to build indexes and compose complex queries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3523) Cleanup ResourceManagerAdministrationProtocol interface audience

2015-05-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533638#comment-14533638
 ] 

Hudson commented on YARN-3523:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #178 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/178/])
YARN-3523. Cleanup ResourceManagerAdministrationProtocol interface audience. 
Contributed by Naganarasimha G R (junping_du: rev 
8e991f4b1d7226fdcd75c5dc9fe6e5ce721679b9)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/ResourceManagerAdministrationProtocol.java
* hadoop-yarn-project/CHANGES.txt


 Cleanup ResourceManagerAdministrationProtocol interface audience
 

 Key: YARN-3523
 URL: https://issues.apache.org/jira/browse/YARN-3523
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client, resourcemanager
Reporter: Wangda Tan
Assignee: Naganarasimha G R
  Labels: newbie
 Fix For: 2.8.0

 Attachments: YARN-3523.20150422-1.patch, YARN-3523.20150504-1.patch, 
 YARN-3523.20150505-1.patch


 I noticed ResourceManagerAdministrationProtocol has @Private audience for the 
 class and @Public audience for methods. It doesn't make sense to me. We 
 should make class audience and methods audience consistent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1680) availableResources sent to applicationMaster in heartbeat should exclude blacklistedNodes free memory.

2015-05-07 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533632#comment-14533632
 ] 

Wangda Tan commented on YARN-1680:
--

Had some offline discussion with [~jianhe] and [~cwelch]. Some takeaways from my side:
- If we want accurate headroom for an application that has blacklisted nodes, it looks unavoidable to compute sum(app.blacklisted_nodes.avail) while calculating headroom for the app. That requires notifying every app that blacklisted a node whenever that node heartbeats with changed available resources; when many applications blacklist large numbers of nodes, a performance regression could happen.
- If we use sum(app.blacklisted_nodes.total) instead of sum(app.blacklisted_nodes.avail), the app's headroom could be underestimated; an app with blacklisted nodes could then always receive 0 headroom on a large cluster with very high resource utilization (like 99%).

Some fallback strategies:
# Only do the accurate headroom calculation when there are not too many blacklisted nodes or apps with blacklisted nodes.
# Tolerate underestimation of headroom.

Some alternatives:
- MAPREDUCE-6302 targets preempting reducers even if we reported inaccurate headroom to apps. That approach looks good to me.
- Move the headroom calculation to the application side. I don't think we can do that, at least for now: applications only receive an updated NodeReport when a node changes health status, not on every heartbeat, and we cannot send that much data to the AM during heartbeats.
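To make the first bullet concrete, here is a toy sketch of the accurate-headroom option (deduct the blacklisted nodes' currently available resource, floored at zero); the types are simplified stand-ins for the scheduler's Resource/SchedulerNode classes, not the real API.
{code}
import java.util.Arrays;
import java.util.List;

/** Toy model; real code would use the scheduler's Resource/SchedulerNode types. */
public class BlacklistHeadroomSketch {
  static long headroomMb(long clusterAvailableMb, List<Long> blacklistedNodeAvailableMb) {
    long blacklistedFree = 0;
    for (long free : blacklistedNodeAvailableMb) {
      blacklistedFree += free;   // capacity the app cannot actually use
    }
    // Never report negative headroom.
    return Math.max(0, clusterAvailableMb - blacklistedFree);
  }

  public static void main(String[] args) {
    // Example from the issue description: 32GB cluster, 29GB in use -> 3GB free,
    // all of it sitting on the blacklisted NM-4.
    System.out.println(headroomMb(3 * 1024, Arrays.asList(3 * 1024L)));  // prints 0
  }
}
{code}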

 availableResources sent to applicationMaster in heartbeat should exclude 
 blacklistedNodes free memory.
 --

 Key: YARN-1680
 URL: https://issues.apache.org/jira/browse/YARN-1680
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Affects Versions: 2.2.0, 2.3.0
 Environment: SuSE 11 SP2 + Hadoop-2.3 
Reporter: Rohith
Assignee: Craig Welch
 Attachments: YARN-1680-WIP.patch, YARN-1680-v2.patch, 
 YARN-1680-v2.patch, YARN-1680.patch


 There are 4 NodeManagers with 8GB each. Total cluster capacity is 32GB. Cluster slow start is set to 1.
 A job is running; reducer tasks occupy 29GB of the cluster. One NodeManager (NM-4) became unstable (3 maps got killed), so MRAppMaster blacklisted the unstable NodeManager (NM-4). All reducer tasks are running in the cluster now.
 MRAppMaster does not preempt the reducers because, for the reducer preemption calculation, the headroom still counts the blacklisted node's memory. This makes jobs hang forever (the ResourceManager does not assign any new containers on blacklisted nodes but returns an availableResources value that counts the whole cluster's free memory).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1680) availableResources sent to applicationMaster in heartbeat should exclude blacklistedNodes free memory.

2015-05-07 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533684#comment-14533684
 ] 

Craig Welch commented on YARN-1680:
---

bq. This requires when a node doing heartbeat with changed available resource, 
all apps blacklisted the node need to be notified

Well, that's not quite so. From what we were talking about, it means that the blacklist deduction can't be a fixed amount; it needs to be calculated by looking at the unused resource of the blacklisted nodes during headroom calculation. The rest of the above proposal for detecting changes, etc., works, but instead of a static deduction value we would need a reference to the app's blacklisted nodes and would look at their unused resources during the app's headroom calculation. There is that cost, but it's not related to the heartbeat or a notification as such.

bq. headroom for the app could be underestimated

I think, generally, we should not take an approach that will 
underestimate/underutilize if we have MAPREDUCE-6302 to fall back on. If we 
don't, then we might want to do it only when we decide not to do the accurate 
calculation in some cases based on limits (see immediately below), but not as 
a matter of course.

bq. Only do the accurate headroom calculation when there are not too many 
blacklisted nodes or apps with blacklisted nodes.

I think if we put a limit on it, it should be a purely local decision: only do 
the calculation when an app has fewer than x blacklisted nodes, which we would 
expect to rarely be an issue. There is a potential for performance issues 
here, but we don't really know how great a concern it is. (A small sketch of 
such a guard follows below.)
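A minimal sketch of such a guard (invented names and threshold, not actual 
scheduler code):
{code}
// Hypothetical guard: attempt the accurate per-node deduction only when the
// app's blacklist is small; above the threshold, skip the adjustment.
import java.util.List;

public class GuardedHeadroom {
    static final int MAX_BLACKLISTED_NODES_FOR_ACCURATE = 20;  // example threshold

    static long headroomMB(long clusterHeadroomMB, List<Long> blacklistedNodeAvailMB) {
        if (blacklistedNodeAvailMB.size() > MAX_BLACKLISTED_NODES_FOR_ACCURATE) {
            // Too costly: skip the adjustment and rely on a fallback (e.g. MAPREDUCE-6302).
            return clusterHeadroomMB;
        }
        long deduction = blacklistedNodeAvailMB.stream().mapToLong(Long::longValue).sum();
        return Math.max(0, clusterHeadroomMB - deduction);
    }

    public static void main(String[] args) {
        System.out.println(headroomMB(1024, List.of(512L, 256L)));  // 256
    }
}
{code}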

bq. MAPREDUCE-6302 targets preempting reducers even if we report an inaccurate 
headroom to apps. The approach looks good to me

I think that may work as a fallback option for MR, assuming it works out 
without issue, if we decide not to do the proper headroom calculation in some 
cases. But that's MR-specific, so it won't help non-MR apps, and it has the 
issues I brought up before with performance degradation versus the proper 
headroom calculation. For these reasons I don't think it's a substitute for 
fixing this issue overall; it may be a fallback option if we limit the cases 
where we do the proper adjustment.

bq. Move headroom calculation to the application side; I think we cannot do 
that, at least for now... An application only receives an updated NodeReport 
when a node changes health status, not on every heartbeat

Well, in some sense that works OK here, because we really only need to know 
about changes in node status w.r.t. the blacklist to detect when to 
recalculate, with the approach proposed above. The problem is that we would 
also need a way to query the current usage per node while doing the 
calculation, and I don't know whether an efficient call for that exists (it 
would ideally be a batch call for N nodes, where we ask for all the 
blacklisted nodes at once). There is also the broader issue that we don't seem 
to have a single client-side entry point for doing this right now, so we would 
need to touch a few places to add a library or something of that nature, and 
AMs we may not be aware of / that are not part of the core would potentially 
have to do some integration to get this.

 availableResources sent to applicationMaster in heartbeat should exclude 
 blacklistedNodes free memory.
 --

 Key: YARN-1680
 URL: https://issues.apache.org/jira/browse/YARN-1680
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
Affects Versions: 2.2.0, 2.3.0
 Environment: SuSE 11 SP2 + Hadoop-2.3 
Reporter: Rohith
Assignee: Craig Welch
 Attachments: YARN-1680-WIP.patch, YARN-1680-v2.patch, 
 YARN-1680-v2.patch, YARN-1680.patch


 There are 4 NodeManagers with 8GB each. Total cluster capacity is 32GB. 
 Cluster slow start is set to 1.
 A job is running whose reducer tasks occupy 29GB of the cluster. One 
 NodeManager (NM-4) became unstable (3 map tasks got killed), so the 
 MRAppMaster blacklisted the unstable NodeManager (NM-4). All reducer tasks 
 are running in the cluster now.
 MRAppMaster does not preempt the reducers because, for the reducer preemption 
 calculation, headRoom includes the blacklisted nodes' memory. This makes jobs 
 hang forever (the ResourceManager does not assign any new containers on 
 blacklisted nodes but returns an availableResource that counts the cluster's 
 free memory). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS

2015-05-07 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533697#comment-14533697
 ] 

Sangjin Lee commented on YARN-3044:
---

Sorry Naga it's taking a while. I'm going to review it tonight and comment.

 [Event producers] Implement RM writing app lifecycle events to ATS
 --

 Key: YARN-3044
 URL: https://issues.apache.org/jira/browse/YARN-3044
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Naganarasimha G R
  Labels: BB2015-05-TBR
 Attachments: YARN-3044-YARN-2928.004.patch, 
 YARN-3044-YARN-2928.005.patch, YARN-3044-YARN-2928.006.patch, 
 YARN-3044.20150325-1.patch, YARN-3044.20150406-1.patch, 
 YARN-3044.20150416-1.patch


 Per design in YARN-2928, implement RM writing app lifecycle events to ATS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3480) Make AM max attempts stored in RMAppImpl and RMStateStore to be configurable

2015-05-07 Thread Jun Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533705#comment-14533705
 ] 

Jun Gong commented on YARN-3480:


[~jianhe], thanks for your comments and suggestion.

The latest patch (YARN-3480.03.patch) has been working as expected: it just 
removes the attempt records that are beyond the validity window. For the 
special case where the validity window is -1, it might be better to remove 
attempts until the number of attempts is less than the hard limit, because 
yarn.resourcemanager.am.max-attempts might be very large (as in our scenario). 
What's your opinion? (A small sketch of this pruning rule follows below.)
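For illustration, a minimal sketch of the pruning rule described above 
(invented types, not the actual RMStateStore code):
{code}
// Hypothetical sketch: drop attempt records outside the failure-validity
// window; when the window is -1 (disabled), trim the oldest records down to a
// hard limit instead.
import java.util.ArrayDeque;
import java.util.Deque;

public class AttemptPruner {
    static void prune(Deque<Long> attemptFinishTimes,   // oldest first
                      long validityWindowMs, int hardLimit, long nowMs) {
        if (validityWindowMs > 0) {
            while (!attemptFinishTimes.isEmpty()
                   && nowMs - attemptFinishTimes.peekFirst() > validityWindowMs) {
                attemptFinishTimes.pollFirst();          // beyond the window
            }
        } else {
            while (attemptFinishTimes.size() > hardLimit) {
                attemptFinishTimes.pollFirst();          // keep only the newest hardLimit
            }
        }
    }

    public static void main(String[] args) {
        Deque<Long> finishTimes = new ArrayDeque<>();
        long now = System.currentTimeMillis();
        for (int i = 5; i >= 1; i--) finishTimes.addLast(now - i * 60_000L);
        prune(finishTimes, -1, 3, now);
        System.out.println("kept " + finishTimes.size() + " attempts"); // 3
    }
}
{code}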

 Make AM max attempts stored in RMAppImpl and RMStateStore to be configurable
 

 Key: YARN-3480
 URL: https://issues.apache.org/jira/browse/YARN-3480
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Jun Gong
Assignee: Jun Gong
 Attachments: YARN-3480.01.patch, YARN-3480.02.patch, 
 YARN-3480.03.patch


 When RM HA is enabled and running containers are kept across attempts, apps 
 are more likely to finish successfully with more retries (attempts), so it is 
 better to set 'yarn.resourcemanager.am.max-attempts' larger. However, that 
 makes the RMStateStore (FileSystem/HDFS/ZK) store more attempts and makes the 
 RM recovery process much slower. It might be better to make the maximum 
 number of attempts stored in the RMStateStore configurable.
 BTW: When 'attemptFailuresValidityInterval' (introduced in YARN-611) is set 
 to a small value, the number of retried attempts might be very large. So we 
 need to delete some attempts stored in RMStateStore and RMAppImpl.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2918) Don't fail RM if queue's configured labels are not existed in cluster-node-labels

2015-05-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533329#comment-14533329
 ] 

Hadoop QA commented on YARN-2918:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 35s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 29s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 32s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 47s | The applied patch generated  5 
new checkstyle issues (total was 363, now 367). |
| {color:red}-1{color} | whitespace |   0m 12s | The patch has 28  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 36s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 15s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |  52m 56s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  89m 22s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12731243/YARN-2918.3.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / ab5058d |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/7772/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/7772/artifact/patchprocess/whitespace.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7772/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7772/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7772/console |


This message was automatically generated.

 Don't fail RM if queue's configured labels are not existed in 
 cluster-node-labels
 -

 Key: YARN-2918
 URL: https://issues.apache.org/jira/browse/YARN-2918
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Rohith
Assignee: Wangda Tan
  Labels: BB2015-05-TBR
 Attachments: YARN-2918.1.patch, YARN-2918.2.patch, YARN-2918.3.patch


 Currently, if an admin sets up labels on queues 
 ({{queue-path.accessible-node-labels = ...}}) and the label is not added to 
 the RM, the queue's initialization will fail and the RM will fail too:
 {noformat}
 2014-12-03 20:11:50,126 FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting 
 ResourceManager
 ...
 Caused by: java.io.IOException: NodeLabelManager doesn't include label = x, 
 please check.
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.checkIfLabelInClusterNodeLabels(SchedulerUtils.java:287)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.init(AbstractCSQueue.java:109)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.init(LeafQueue.java:120)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:567)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:587)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:462)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:294)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:324)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
 {noformat}
 This is not a good user experience; we should stop failing the RM so that the 
 admin can configure queues/labels in the following steps:
 - Configure queue (with label)
 - Start RM
 - 

[jira] [Updated] (YARN-3358) Audit log not present while refreshing Service ACLs'

2015-05-07 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-3358:
---
Attachment: YARN-3358.01.patch

 Audit log not present while refreshing Service ACLs'
 

 Key: YARN-3358
 URL: https://issues.apache.org/jira/browse/YARN-3358
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.7.0
Reporter: Varun Saxena
Assignee: Varun Saxena
Priority: Minor
 Attachments: YARN-3358.01.patch


 There should be a success audit log in AdminService#refreshServiceAcls



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3134) [Storage implementation] Exploiting the option of using Phoenix to access HBase backend

2015-05-07 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533393#comment-14533393
 ] 

Li Lu commented on YARN-3134:
-

I just opened YARN-3595 to track all connection-cache-related discussions. I 
wrote a summary of the background of this problem. Please feel free to add 
more. 

 [Storage implementation] Exploiting the option of using Phoenix to access 
 HBase backend
 ---

 Key: YARN-3134
 URL: https://issues.apache.org/jira/browse/YARN-3134
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Li Lu
  Labels: BB2015-05-TBR
 Attachments: SettingupPhoenixstorageforatimelinev2end-to-endtest.pdf, 
 YARN-3134-040915_poc.patch, YARN-3134-041015_poc.patch, 
 YARN-3134-041415_poc.patch, YARN-3134-042115.patch, YARN-3134-042715.patch, 
 YARN-3134-YARN-2928.001.patch, YARN-3134-YARN-2928.002.patch, 
 YARN-3134-YARN-2928.003.patch, YARN-3134-YARN-2928.004.patch, 
 YARN-3134-YARN-2928.005.patch, YARN-3134-YARN-2928.006.patch, 
 YARN-3134DataSchema.pdf


 Quote the introduction on Phoenix web page:
 {code}
 Apache Phoenix is a relational database layer over HBase delivered as a 
 client-embedded JDBC driver targeting low latency queries over HBase data. 
 Apache Phoenix takes your SQL query, compiles it into a series of HBase 
 scans, and orchestrates the running of those scans to produce regular JDBC 
 result sets. The table metadata is stored in an HBase table and versioned, 
 such that snapshot queries over prior versions will automatically use the 
 correct schema. Direct use of the HBase API, along with coprocessors and 
 custom filters, results in performance on the order of milliseconds for small 
 queries, or seconds for tens of millions of rows.
 {code}
 It may simplify how our implementation reads/writes data from/to HBase, and 
 make it easy to build indexes and compose complex queries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3448) Add Rolling Time To Lives Level DB Plugin Capabilities

2015-05-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533397#comment-14533397
 ] 

Hudson commented on YARN-3448:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #188 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/188/])
YARN-3448. Added a rolling time-to-live LevelDB timeline store implementation. 
Contributed by Jonathan Eagles. (zjshen: rev 
daf3e4ef8bf73cbe4a799d51b4765809cd81089f)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/pom.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/TimelineStoreTestUtils.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/TestRollingLevelDB.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/TestLeveldbTimelineStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timeline/TimelinePutResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/util/LeveldbUtils.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/RollingLevelDBTimelineStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/TimelineDataManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/RollingLevelDB.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/TestRollingLevelDBTimelineStore.java


 Add Rolling Time To Lives Level DB Plugin Capabilities
 --

 Key: YARN-3448
 URL: https://issues.apache.org/jira/browse/YARN-3448
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles
 Fix For: 2.8.0

 Attachments: YARN-3448.1.patch, YARN-3448.10.patch, 
 YARN-3448.12.patch, YARN-3448.13.patch, YARN-3448.14.patch, 
 YARN-3448.15.patch, YARN-3448.16.patch, YARN-3448.17.patch, 
 YARN-3448.2.patch, YARN-3448.3.patch, YARN-3448.4.patch, YARN-3448.5.patch, 
 YARN-3448.7.patch, YARN-3448.8.patch, YARN-3448.9.patch


 For large applications, the majority of the time in LeveldbTimelineStore is 
 spent deleting old entities one record at a time. An exclusive write lock is 
 held during the entire deletion phase, which in practice can be hours. If we 
 are willing to relax some of the consistency constraints, other 
 performance-enhancing techniques can be employed to maximize throughput and 
 minimize locking time.
 Split the 5 sections of the leveldb database (domain, owner, start time, 
 entity, index) into 5 separate databases. This allows each database to 
 maximize read-cache effectiveness based on its unique usage patterns. With 5 
 separate databases each lookup is much faster. This also helps with I/O by 
 allowing the entity and index databases to live on separate disks.
 Rolling DBs for the entity and index DBs. 99.9% of the data is in these two 
 sections, at roughly a 4:1 ratio (index to entity), at least for Tez. We can 
 replace DB record removal with file-system removal if we create a rolling set 
 of databases that age out and can be removed efficiently. To do this we must 
 add a constraint that an entity's events always go into the correct rolling 
 DB instance based on start time. This allows us to stitch the data back 
 together while reading, with artificial paging. (A small sketch of the 
 bucketing idea follows below.)
 Relax the synchronous write constraints. If we are willing to accept losing 
 some records that were not flushed by the operating system during a crash, we 
 can use async writes, which can be much faster.
 Prefer sequential writes. Sequential writes can be several times faster than 
 random writes. Spend some small effort arranging the writes in a way that 
 trends towards sequential write performance over random write performance.
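 For illustration, a minimal sketch of the start-time bucketing idea (invented 
 names, not the actual RollingLevelDBTimelineStore code):
 {code}
 // Hypothetical sketch: route entities to a time-bucketed DB instance based on
 // start time, so expired buckets can be dropped as whole files instead of
 // deleting records one at a time.
 import java.util.concurrent.TimeUnit;

 public class RollingDbRouter {
     private final long rollingPeriodMs;

     RollingDbRouter(long periodMs) { this.rollingPeriodMs = periodMs; }

     /** All entities whose start time falls in the same period share one DB. */
     long bucketFor(long entityStartTimeMs) {
         return entityStartTimeMs / rollingPeriodMs;
     }

     /** Buckets older than the TTL can be removed by deleting their DB directory. */
     boolean isExpired(long bucket, long nowMs, long ttlMs) {
         long bucketEndMs = (bucket + 1) * rollingPeriodMs;
         return nowMs - bucketEndMs > ttlMs;
     }

     public static void main(String[] args) {
         RollingDbRouter router = new RollingDbRouter(TimeUnit.HOURS.toMillis(1));
         long now = System.currentTimeMillis();
         long bucket = router.bucketFor(now);
         System.out.println("bucket=" + bucket
             + " expired=" + router.isExpired(bucket, now, TimeUnit.DAYS.toMillis(7)));
     }
 }
 {code}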



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3584) [Log mesage correction] : MIssing space in Diagnostics message

2015-05-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533394#comment-14533394
 ] 

Hudson commented on YARN-3584:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #188 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/188/])
YARN-3584. Fixed attempt diagnostics format shown on the UI. Contributed by 
nijel (jianhe: rev b88700dcd0b9aa47662009241dfb83bc4446548d)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* hadoop-yarn-project/CHANGES.txt


 [Log mesage correction] : MIssing space in Diagnostics message
 --

 Key: YARN-3584
 URL: https://issues.apache.org/jira/browse/YARN-3584
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: nijel
Assignee: nijel
Priority: Trivial
  Labels: newbie
 Fix For: 2.8.0

 Attachments: YARN-3584-1.patch, YARN-3584-2.patch


 For more detailed output, check application tracking page: 
 https://szxciitslx17640:26001/cluster/app/application_1430810985970_0020{color:red}Then{color},
  click on links to logs of each attempt.
 Here, Then is not part of the URL. It is better to use a space in between so 
 that the URL can be copied directly for analysis.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3523) Cleanup ResourceManagerAdministrationProtocol interface audience

2015-05-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533398#comment-14533398
 ] 

Hudson commented on YARN-3523:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #188 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/188/])
YARN-3523. Cleanup ResourceManagerAdministrationProtocol interface audience. 
Contributed by Naganarasimha G R (junping_du: rev 
8e991f4b1d7226fdcd75c5dc9fe6e5ce721679b9)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/ResourceManagerAdministrationProtocol.java
* hadoop-yarn-project/CHANGES.txt


 Cleanup ResourceManagerAdministrationProtocol interface audience
 

 Key: YARN-3523
 URL: https://issues.apache.org/jira/browse/YARN-3523
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client, resourcemanager
Reporter: Wangda Tan
Assignee: Naganarasimha G R
  Labels: newbie
 Fix For: 2.8.0

 Attachments: YARN-3523.20150422-1.patch, YARN-3523.20150504-1.patch, 
 YARN-3523.20150505-1.patch


 I noticed ResourceManagerAdministrationProtocol has @Private audience for the 
 class and @Public audience for methods. That doesn't make sense to me. We 
 should make the class audience and method audience consistent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3134) [Storage implementation] Exploiting the option of using Phoenix to access HBase backend

2015-05-07 Thread Li Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Lu updated YARN-3134:

Labels:   (was: BB2015-05-TBR)

 [Storage implementation] Exploiting the option of using Phoenix to access 
 HBase backend
 ---

 Key: YARN-3134
 URL: https://issues.apache.org/jira/browse/YARN-3134
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Li Lu
 Attachments: SettingupPhoenixstorageforatimelinev2end-to-endtest.pdf, 
 YARN-3134-040915_poc.patch, YARN-3134-041015_poc.patch, 
 YARN-3134-041415_poc.patch, YARN-3134-042115.patch, YARN-3134-042715.patch, 
 YARN-3134-YARN-2928.001.patch, YARN-3134-YARN-2928.002.patch, 
 YARN-3134-YARN-2928.003.patch, YARN-3134-YARN-2928.004.patch, 
 YARN-3134-YARN-2928.005.patch, YARN-3134-YARN-2928.006.patch, 
 YARN-3134DataSchema.pdf


 Quote the introduction on Phoenix web page:
 {code}
 Apache Phoenix is a relational database layer over HBase delivered as a 
 client-embedded JDBC driver targeting low latency queries over HBase data. 
 Apache Phoenix takes your SQL query, compiles it into a series of HBase 
 scans, and orchestrates the running of those scans to produce regular JDBC 
 result sets. The table metadata is stored in an HBase table and versioned, 
 such that snapshot queries over prior versions will automatically use the 
 correct schema. Direct use of the HBase API, along with coprocessors and 
 custom filters, results in performance on the order of milliseconds for small 
 queries, or seconds for tens of millions of rows.
 {code}
 It may simplify how our implementation reads/writes data from/to HBase, and 
 make it easy to build indexes and compose complex queries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3134) [Storage implementation] Exploiting the option of using Phoenix to access HBase backend

2015-05-07 Thread Li Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Lu updated YARN-3134:

Attachment: (was: YARN-3134-YARN-2928.006.patch)

 [Storage implementation] Exploiting the option of using Phoenix to access 
 HBase backend
 ---

 Key: YARN-3134
 URL: https://issues.apache.org/jira/browse/YARN-3134
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Li Lu
 Attachments: SettingupPhoenixstorageforatimelinev2end-to-endtest.pdf, 
 YARN-3134-040915_poc.patch, YARN-3134-041015_poc.patch, 
 YARN-3134-041415_poc.patch, YARN-3134-042115.patch, YARN-3134-042715.patch, 
 YARN-3134-YARN-2928.001.patch, YARN-3134-YARN-2928.002.patch, 
 YARN-3134-YARN-2928.003.patch, YARN-3134-YARN-2928.004.patch, 
 YARN-3134-YARN-2928.005.patch, YARN-3134DataSchema.pdf


 Quote the introduction on Phoenix web page:
 {code}
 Apache Phoenix is a relational database layer over HBase delivered as a 
 client-embedded JDBC driver targeting low latency queries over HBase data. 
 Apache Phoenix takes your SQL query, compiles it into a series of HBase 
 scans, and orchestrates the running of those scans to produce regular JDBC 
 result sets. The table metadata is stored in an HBase table and versioned, 
 such that snapshot queries over prior versions will automatically use the 
 correct schema. Direct use of the HBase API, along with coprocessors and 
 custom filters, results in performance on the order of milliseconds for small 
 queries, or seconds for tens of millions of rows.
 {code}
 It may simplify how our implementation reads/writes data from/to HBase, and 
 make it easy to build indexes and compose complex queries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-644) Basic null check is not performed on passed in arguments before using them in ContainerManagerImpl.startContainer

2015-05-07 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533448#comment-14533448
 ] 

Li Lu commented on YARN-644:


Hi [~varun_saxena], thanks for the patch! Asserting on the content of the 
exception message may unnecessarily couple the exception handling message with 
the test, which makes future changes harder. Maybe we'd like to provide some 
central place for those exception message constants? Thanks! 

 Basic null check is not performed on passed in arguments before using them in 
 ContainerManagerImpl.startContainer
 -

 Key: YARN-644
 URL: https://issues.apache.org/jira/browse/YARN-644
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Omkar Vinit Joshi
Assignee: Varun Saxena
Priority: Minor
  Labels: BB2015-05-TBR, newbie
 Attachments: YARN-644.001.patch, YARN-644.002.patch, YARN-644.03.patch


 I see that validation/null checks are not performed on passed-in parameters, 
 e.g. tokenId.getContainerID().getApplicationAttemptId() inside 
 ContainerManagerImpl.authorizeRequest().
 I guess we should add these checks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3134) [Storage implementation] Exploiting the option of using Phoenix to access HBase backend

2015-05-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533481#comment-14533481
 ] 

Hadoop QA commented on YARN-3134:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 51s | Pre-patch YARN-2928 compilation 
is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 37s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 34s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   0m 23s | Tests passed in 
hadoop-yarn-server-timelineservice. |
| | |  25m 54s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12731281/YARN-3134-YARN-2928.006.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / d4a2362 |
| hadoop-yarn-server-timelineservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/7782/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/7782/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7782/console |


This message was automatically generated.

 [Storage implementation] Exploiting the option of using Phoenix to access 
 HBase backend
 ---

 Key: YARN-3134
 URL: https://issues.apache.org/jira/browse/YARN-3134
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Zhijie Shen
Assignee: Li Lu
 Attachments: SettingupPhoenixstorageforatimelinev2end-to-endtest.pdf, 
 YARN-3134-040915_poc.patch, YARN-3134-041015_poc.patch, 
 YARN-3134-041415_poc.patch, YARN-3134-042115.patch, YARN-3134-042715.patch, 
 YARN-3134-YARN-2928.001.patch, YARN-3134-YARN-2928.002.patch, 
 YARN-3134-YARN-2928.003.patch, YARN-3134-YARN-2928.004.patch, 
 YARN-3134-YARN-2928.005.patch, YARN-3134-YARN-2928.006.patch, 
 YARN-3134DataSchema.pdf


 Quote the introduction on Phoenix web page:
 {code}
 Apache Phoenix is a relational database layer over HBase delivered as a 
 client-embedded JDBC driver targeting low latency queries over HBase data. 
 Apache Phoenix takes your SQL query, compiles it into a series of HBase 
 scans, and orchestrates the running of those scans to produce regular JDBC 
 result sets. The table metadata is stored in an HBase table and versioned, 
 such that snapshot queries over prior versions will automatically use the 
 correct schema. Direct use of the HBase API, along with coprocessors and 
 custom filters, results in performance on the order of milliseconds for small 
 queries, or seconds for tens of millions of rows.
 {code}
 It may simplify how our implementation reads/writes data from/to HBase, and 
 make it easy to build indexes and compose complex queries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Moved] (YARN-3600) AM container link is broken (on a killed application, at least)

2015-05-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin moved HIVE-10654 to YARN-3600:
---

 Component/s: (was: Web UI)
Target Version/s: 2.8.0
 Key: YARN-3600  (was: HIVE-10654)
 Project: Hadoop YARN  (was: Hive)

 AM container link is broken (on a killed application, at least)
 ---

 Key: YARN-3600
 URL: https://issues.apache.org/jira/browse/YARN-3600
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Sergey Shelukhin

 Running a fairly recent (a couple of weeks old) version of 2.8.0-SNAPSHOT. 
 I have an application that ran fine for a while and then I yarn kill-ed it. 
 Now when I go to the only app attempt URL (like so: 
 http://(snip):8088/cluster/appattempt/appattempt_1429683757595_0795_01)
 I see:
 AM Container: container_1429683757595_0795_01_01
 Node: N/A 
 and the container URL is 
 {noformat}http://cn042-10.l42scl.hortonworks.com:8088/cluster/N/A
 {noformat}
 which obviously doesn't work



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3385) Race condition: KeeperException$NoNodeException will cause RM shutdown during ZK node deletion.

2015-05-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14532429#comment-14532429
 ] 

Hudson commented on YARN-3385:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #187 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/187/])
YARN-3385. Fixed a race-condition in ResourceManager's ZooKeeper based 
state-store to avoid crashing on duplicate deletes. Contributed by Zhihai Xu. 
(vinodkv: rev 4c7b9b6abe2452c9752a11214762be2e7665fb32)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java
* hadoop-yarn-project/CHANGES.txt


 Race condition: KeeperException$NoNodeException will cause RM shutdown during 
 ZK node deletion.
 ---

 Key: YARN-3385
 URL: https://issues.apache.org/jira/browse/YARN-3385
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Critical
  Labels: BB2015-05-TBR
 Fix For: 2.7.1

 Attachments: YARN-3385.000.patch, YARN-3385.001.patch, 
 YARN-3385.002.patch, YARN-3385.003.patch, YARN-3385.004.patch


 Race condition: KeeperException$NoNodeException will cause RM shutdown during 
 ZK node deletion (Op.delete).
 The race condition is similar to YARN-3023: since the race condition exists 
 for ZK node creation, it should also exist for ZK node deletion.
 We see this issue with the following stack trace:
 {code}
 2015-03-17 19:18:58,958 FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received a 
 org.apache.hadoop.yarn.server.resourcemanager.RMFatalEvent of type 
 STATE_STORE_OP_FAILED. Cause:
 org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
   at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:945)
   at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:911)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:857)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$4.run(ZKRMStateStore.java:854)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:973)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:992)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doMultiWithRetries(ZKRMStateStore.java:854)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.removeApplicationStateInternal(ZKRMStateStore.java:647)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:691)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:766)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:761)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
   at java.lang.Thread.run(Thread.java:745)
 2015-03-17 19:18:58,959 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
 status 1
 {code}
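 For illustration, a minimal sketch of tolerating a duplicate delete (invented 
 helper, not the actual ZKRMStateStore code):
 {code}
 // Hypothetical sketch: treat NoNodeException on delete as success, since the
 // node being already gone is exactly the outcome a duplicate delete wants.
 import org.apache.zookeeper.KeeperException;
 import org.apache.zookeeper.ZooKeeper;

 public class SafeDelete {
     static void deleteIfExists(ZooKeeper zk, String path) throws Exception {
         try {
             zk.delete(path, -1);                 // -1 = any version
         } catch (KeeperException.NoNodeException e) {
             // Already deleted (e.g. by a retried operation): ignore.
         }
     }
 }
 {code}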



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3577) Misspelling of threshold in log4j.properties for tests

2015-05-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14532424#comment-14532424
 ] 

Hudson commented on YARN-3577:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #187 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/187/])
YARN-3577. Misspelling of threshold in log4j.properties for tests. Contributed 
by Brahma Reddy Battula. (aajisaka: rev 
918af8efff805d8f204ecfa6fe12c290a7a8e509)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher/src/test/resources/log4j.properties
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/resources/log4j.properties
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/resources/log4j.properties
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/log4j.properties
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry/src/test/resources/log4j.properties
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/log4j.properties
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/resources/log4j.properties
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/resources/log4j.properties


 Misspelling of threshold in log4j.properties for tests
 --

 Key: YARN-3577
 URL: https://issues.apache.org/jira/browse/YARN-3577
 Project: Hadoop YARN
  Issue Type: Bug
  Components: test
Affects Versions: 2.7.0
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
Priority: Minor
 Fix For: 2.8.0

 Attachments: YARN-3577.patch


 The log4j.properties files used for tests contain the misspelling 
 log4j.threshhold.
 We should use log4j.threshold instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3491) PublicLocalizer#addResource is too slow.

2015-05-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14532420#comment-14532420
 ] 

Hudson commented on YARN-3491:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #187 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/187/])
YARN-3491. PublicLocalizer#addResource is too slow. (zxu via rkanter) (rkanter: 
rev b72507810aece08e17ab4b5aae1f7eae1fe98609)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDirectoryCollection.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DirectoryCollection.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java


 PublicLocalizer#addResource is too slow.
 

 Key: YARN-3491
 URL: https://issues.apache.org/jira/browse/YARN-3491
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 2.7.0
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Critical
 Fix For: 2.8.0

 Attachments: YARN-3491.000.patch, YARN-3491.001.patch, 
 YARN-3491.002.patch, YARN-3491.003.patch, YARN-3491.004.patch


 Based on profiling, the bottleneck in PublicLocalizer#addResource is 
 getInitializedLocalDirs. getInitializedLocalDirs calls checkLocalDir, and 
 checkLocalDir is very slow, taking about 10+ ms.
 The total delay will be approximately (number of local dirs) * 10+ ms, and 
 this delay is added for each public resource localization.
 Because PublicLocalizer#addResource is slow, the thread pool can't be fully 
 utilized. Instead of doing public resource localization in parallel 
 (multithreading), public resource localization is serialized most of the 
 time.
 Also, PublicLocalizer#addResource runs in the Dispatcher thread, so the 
 Dispatcher thread will be blocked by PublicLocalizer#addResource for a long 
 time. (A small sketch of this problem shape follows below.)
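 For illustration, a minimal sketch of the problem shape (invented names and 
 timings, not the actual ResourceLocalizationService code or the patch): if the 
 per-dir check runs on the single dispatcher thread, every public resource pays 
 numDirs * ~10ms before its download is even submitted, so the pool sits idle; 
 doing the check inside the pooled task is one way to keep the dispatcher free.
 {code}
 // Hypothetical sketch only; not the actual fix.
 import java.util.concurrent.ExecutorService;
 import java.util.concurrent.Executors;

 public class LocalizerSketch {
     private final ExecutorService downloadPool = Executors.newFixedThreadPool(4);

     // Pretend each local-dir check costs ~10ms.
     private void checkLocalDirs(int numDirs) throws InterruptedException {
         Thread.sleep(10L * numDirs);
     }

     // Problem shape described in this JIRA: the check runs on the caller
     // (dispatcher) thread before the download is handed to the pool.
     void addResourceBlocking(String resource, int numDirs) throws InterruptedException {
         checkLocalDirs(numDirs);                       // blocks the dispatcher
         downloadPool.submit(() -> download(resource));
     }

     // One possible shape: do the expensive check inside the pooled task.
     void addResourceNonBlocking(String resource, int numDirs) {
         downloadPool.submit(() -> {
             try {
                 checkLocalDirs(numDirs);
                 download(resource);
             } catch (InterruptedException e) {
                 Thread.currentThread().interrupt();
             }
         });
     }

     private void download(String resource) {
         System.out.println("downloading " + resource + " on "
             + Thread.currentThread().getName());
     }

     public static void main(String[] args) throws Exception {
         LocalizerSketch s = new LocalizerSketch();
         s.addResourceNonBlocking("public-jar-1", 12);
         Thread.sleep(300);
         s.downloadPool.shutdown();
     }
 }
 {code}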



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3243) CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obey its capacity limits.

2015-05-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14532418#comment-14532418
 ] 

Hudson commented on YARN-3243:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #187 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/187/])
YARN-3243. Moving CHANGES.txt entry to the right release. (vinodkv: rev 
185e63a72638f01bb27e790c8dedf458923a2cae)
* hadoop-yarn-project/CHANGES.txt


 CapacityScheduler should pass headroom from parent to children to make sure 
 ParentQueue obey its capacity limits.
 -

 Key: YARN-3243
 URL: https://issues.apache.org/jira/browse/YARN-3243
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Fix For: 2.8.0, 2.7.1

 Attachments: YARN-3243.1.patch, YARN-3243.2.patch, YARN-3243.3.patch, 
 YARN-3243.4.patch, YARN-3243.5.patch


 Now CapacityScheduler has some issues with making sure a ParentQueue always 
 obeys its capacity limits. For example:
 1) When allocating a container under a parent queue, it only checks 
 parentQueue.usage < parentQueue.max. If a leaf queue allocates a 
 container.size > (parentQueue.max - parentQueue.usage), the parent queue can 
 exceed its max resource limit, as in the following example:
 {code}
          A (usage=54, max=55)
         / \
        A1  A2
    A1: usage=53, max=53
    A2: usage=1,  max=55
 {code}
 Queue A2 is able to allocate a container since its usage < max, but if we do 
 that, A's usage can exceed A.max.
 2) When doing the continuous reservation check, a parent queue will only tell 
 its children "you need to unreserve *some* resource so that I will be less 
 than my maximum resource", but it will not tell them how much resource needs 
 to be unreserved. This may lead to the parent queue exceeding its configured 
 maximum capacity as well.
 With YARN-3099/YARN-3124, we now have a {{ResourceUsage}} object in each 
 queue class. *Here is my proposal* (a small sketch follows below):
 - ParentQueue will set its children's ResourceUsage.headroom, which means the 
 *maximum resource its children can allocate*.
 - ParentQueue will set its children's headroom to (say the parent's name is 
 qA): min(qA.headroom, qA.max - qA.used). This makes sure qA's ancestors' 
 capacities are enforced as well (qA.headroom is set by qA's parent).
 - {{needToUnReserve}} is not necessary; instead, children can compute how 
 much resource needs to be unreserved to stay within their parent's resource 
 limit.
 - Moreover, with this, YARN-3026 will draw a clear boundary between LeafQueue 
 and FiCaSchedulerApp; headroom will consider user-limit, etc.
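 A minimal sketch of the headroom propagation rule above (invented Queue 
 fields, not the actual CapacityScheduler classes):
 {code}
 // Hypothetical sketch: each parent gives its children
 // headroom = min(parent's headroom, parent.max - parent.used),
 // so every ancestor's limit is enforced down the tree.
 import java.util.ArrayList;
 import java.util.List;

 class Queue {
     final String name;
     final long used, max;
     long headroom;
     final List<Queue> children = new ArrayList<>();
     Queue(String name, long used, long max) {
         this.name = name; this.used = used; this.max = max;
     }
 }

 public class HeadroomPropagation {
     static void propagate(Queue q, long headroomFromParent) {
         q.headroom = Math.min(headroomFromParent, Math.max(0, q.max - q.used));
         for (Queue child : q.children) {
             propagate(child, q.headroom);
         }
     }

     public static void main(String[] args) {
         Queue a  = new Queue("A", 54, 55);
         Queue a1 = new Queue("A1", 53, 53);
         Queue a2 = new Queue("A2", 1, 55);
         a.children.add(a1);
         a.children.add(a2);
         propagate(a, Long.MAX_VALUE);   // no tighter limit above A in this example
         System.out.println(a2.name + " headroom = " + a2.headroom); // 1, not 54
     }
 }
 {code}
 Using the example tree above, A2 ends up with headroom 1 (A.max - A.used) even 
 though its own max - used is 54, which prevents the over-allocation in case 1).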



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

