[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage

2015-08-20 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704368#comment-14704368
 ] 

Varun Saxena commented on YARN-4053:


bq. it might be good to restrict the numeric types the metric will support. 
long and double sounds good to me. Can add verification as you said.

bq. HBase already provides a facility to encode and decode between numbers and 
bytes
Yes, I know. As I had to prepend one byte to the byte array, I moved the 
logic in Bytes.toBytes to a separate method. This was done to avoid creating 
2 byte arrays (one inside Bytes.toBytes and one in ATS code) and then 
copying the result from Bytes.toBytes into the byte array created inside the 
ATS code. Although this is just 8 bytes, so maybe we can do the above.
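For illustration, a minimal sketch of the single-allocation idea (class name 
and tag values here are hypothetical, not the actual patch):
{code}
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical sketch: write the type tag and the 8 value bytes into one
// array up front, instead of letting Bytes.toBytes allocate its own array
// and then copying it into a second, tagged array.
public final class MetricValueCodec {
  private static final byte TAG_LONG = 0x1;    // hypothetical tag values
  private static final byte TAG_DOUBLE = 0x2;

  public static byte[] encode(Number value) {
    byte[] out = new byte[1 + Bytes.SIZEOF_LONG];  // single 9-byte allocation
    if (value instanceof Long || value instanceof Integer) {
      out[0] = TAG_LONG;
      Bytes.putLong(out, 1, value.longValue());
    } else if (value instanceof Double || value instanceof Float) {
      out[0] = TAG_DOUBLE;
      Bytes.putDouble(out, 1, value.doubleValue());
    } else {
      throw new IllegalArgumentException("Only long and double are supported");
    }
    return out;
  }

  public static Number decode(byte[] in) {
    switch (in[0]) {
      case TAG_LONG:   return Bytes.toLong(in, 1);
      case TAG_DOUBLE: return Bytes.toDouble(in, 1);
      default: throw new IllegalArgumentException("Unknown type tag " + in[0]);
    }
  }
}
{code}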

bq. Also, instead of encoding the info whether this is an integral type vs. 
floating type into the value, it would be better to have this information in 
the column qualifier. 
I see an issue with having this info in the column qualifier: certain HBase 
filters like SingleColumnValueFilter require the exact column qualifier name, 
so we would again have to guess the type (similar to the current patch) when 
we use it.
Probably we can discuss this offline and conclude there. Will send a mail.




 Change the way metric values are stored in HBase Storage
 

 Key: YARN-4053
 URL: https://issues.apache.org/jira/browse/YARN-4053
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Varun Saxena
Assignee: Varun Saxena
 Attachments: YARN-4053-YARN-2928.01.patch


 Currently the HBase implementation uses GenericObjectMapper to convert and 
 store values in the backend HBase storage. This converts everything into a 
 string representation (an ASCII/UTF-8 encoded byte array).
 While this is fine in most cases, it does not quite serve our use case for 
 metrics. 
 So we need to decide how we are going to encode and decode metric values and 
 store them in HBase.
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4044) Running applications information changes such as movequeue is not published to TimeLine server

2015-08-20 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706224#comment-14706224
 ] 

Rohith Sharma K S commented on YARN-4044:
-

Thanks [~sunilg] for the patch.. The patch mostly looks good to me.. Have you 
verified it on a real cluster?

 Running applications information changes such as movequeue is not published 
 to TimeLine server
 --

 Key: YARN-4044
 URL: https://issues.apache.org/jira/browse/YARN-4044
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager, timelineserver
Affects Versions: 2.7.0
Reporter: Sunil G
Assignee: Sunil G
Priority: Critical
 Attachments: 0001-YARN-4044.patch


 SystemMetricsPublisher needs to expose an appUpdated API to publish any 
 change for a running application.
 Events can be:
 - change of queue for a running application
 - change of application priority for a running application
 This ticket intends to handle both the RM and timeline side changes. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4044) Running applications information changes such as movequeue is not published to TimeLine server

2015-08-20 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706234#comment-14706234
 ] 

Sunil G commented on YARN-4044:
---

Thank you [~rohithsharma]
Yes, I have verified this on a real cluster. I will upload a few screenshots 
later.

 Running applications information changes such as movequeue is not published 
 to TimeLine server
 --

 Key: YARN-4044
 URL: https://issues.apache.org/jira/browse/YARN-4044
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager, timelineserver
Affects Versions: 2.7.0
Reporter: Sunil G
Assignee: Sunil G
Priority: Critical
 Attachments: 0001-YARN-4044.patch


 SystemMetricsPublisher needs to expose an appUpdated API to publish any 
 change for a running application.
 Events can be:
 - change of queue for a running application
 - change of application priority for a running application
 This ticket intends to handle both the RM and timeline side changes. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4068) Support appUpdated event in TimelineV2 to publish details for movetoqueue, change in priority

2015-08-20 Thread Sunil G (JIRA)
Sunil G created YARN-4068:
-

 Summary: Support appUpdated event in TimelineV2 to publish details 
for movetoqueue, change in priority
 Key: YARN-4068
 URL: https://issues.apache.org/jira/browse/YARN-4068
 Project: Hadoop YARN
  Issue Type: Bug
  Components: timelineserver
Reporter: Sunil G
Assignee: Sunil G


YARN-4044 adds the appUpdated event changes to TimelineV1. This jira is to 
track and port the appUpdated changes to V2 for:
- movetoqueue
- updateAppPriority



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3986) getTransferredContainers in AbstractYarnScheduler should be present in YarnScheduler interface instead

2015-08-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706264#comment-14706264
 ] 

Hudson commented on YARN-3986:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8334 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8334/])
YARN-3986. getTransferredContainers in AbstractYarnScheduler should be present 
in YarnScheduler interface (rohithsharmaks: rev 
22de7c1dca1be63d523de833163ae51bfe638a79)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestWorkPreservingRMRestart.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/YarnScheduler.java
* hadoop-yarn-project/CHANGES.txt


 getTransferredContainers in AbstractYarnScheduler should be present in 
 YarnScheduler interface instead
 --

 Key: YARN-3986
 URL: https://issues.apache.org/jira/browse/YARN-3986
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.7.0
Reporter: Varun Saxena
Assignee: Varun Saxena
 Fix For: 2.8.0

 Attachments: YARN-3986.01.patch, YARN-3986.02.patch, 
 YARN-3986.03.patch


 Currently getTransferredContainers is present in {{AbstractYarnScheduler}}.
 *But in ApplicationMasterService, while registering the AM, we call this 
 method by typecasting the scheduler to AbstractYarnScheduler, which is 
 incorrect.*
 This method should be moved to YarnScheduler, because if a custom scheduler 
 is added, it will implement YarnScheduler, not AbstractYarnScheduler.
 As ApplicationMasterService calls getTransferredContainers by typecasting 
 to AbstractYarnScheduler, it imposes an indirect dependency on 
 AbstractYarnScheduler for any pluggable custom scheduler.
 We can move the method to YarnScheduler and leave the definition in 
 AbstractYarnScheduler as it is.
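 A minimal sketch of the proposed shape (simplified from the actual YARN 
 classes):
 {code}
 import java.util.List;
 import org.apache.hadoop.yarn.api.records.ApplicationAttemptId;
 import org.apache.hadoop.yarn.api.records.Container;

 // Simplified sketch: with the method declared on the interface,
 // ApplicationMasterService can call scheduler.getTransferredContainers(...)
 // directly, without the downcast to AbstractYarnScheduler described above.
 public interface YarnScheduler {
   List<Container> getTransferredContainers(ApplicationAttemptId appAttemptId);
 }
 {code}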



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-08-20 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706052#comment-14706052
 ] 

Li Lu commented on YARN-3862:
-

bq.  it would be useful to maintain separation between limiting the contents 
that are returned (akin to contents of SELECT in SQL) and limiting the rows 
that are selected (akin to the WHERE clause in SQL).
I agree we should distinguish those two use cases. Restricting our filters to 
be predicates on rows would work perfectly for relational databases (where we 
would launch SQL queries), but if we store data in our current fashion, we may 
also need to dynamically filter some columns, I assume? For example, we may 
have a column filter that selects all configs that start with 
yarn.timelineservice.. I think most of these column filters will work on 
column qualifiers rather than on the values. 
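For illustration, a hypothetical sketch of such a qualifier filter using 
HBase's ColumnPrefixFilter (the class and method names are made up):
{code}
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.ColumnPrefixFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class QualifierPrefixExample {
  // Keep only cells whose column qualifier starts with the given prefix;
  // this trims the *contents* returned per row, it does not select rows.
  public static Scan configPrefixScan(String prefix) {
    Scan scan = new Scan();
    scan.setFilter(new ColumnPrefixFilter(Bytes.toBytes(prefix)));
    return scan;
  }
}
{code}
For example, configPrefixScan("yarn.timelineservice.") would return only the 
matching config columns of each row it scans.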

 Decide which contents to retrieve and send back in response in TimelineReader
 -

 Key: YARN-3862
 URL: https://issues.apache.org/jira/browse/YARN-3862
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Varun Saxena
Assignee: Varun Saxena
 Attachments: YARN-3862-YARN-2928.wip.01.patch


 Currently, we will retrieve all the contents of the field if that field is 
 specified in the query API. In case of configs and metrics, this can become a 
 lot of data even though the user doesn't need it. So we need to provide a way 
 to query only a set of configs or metrics.
 As a comma spearated list of configs/metrics to be returned will be quite 
 cumbersome to specify, we have to support either of the following options :
 # Prefix match
 # Regex
 # Group the configs/metrics and query that group.
 We also need a facility to specify a metric time window to return metrics in 
 a that window. This may be useful in plotting graphs 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-08-20 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705986#comment-14705986
 ] 

Sangjin Lee commented on YARN-3862:
---

Sorry it took me a while to get around to looking at this.

As for the timeline filters, strictly speaking these are filters that filter 
based on the column qualifiers, and *not* on the values, right? Or are we 
combining both types of filtering here? IMO, it would be good to limit this to 
filtering on column qualifiers only and not on the values. I think those two 
things are conceptually separate and would cause confusion if they're mixed.

The reason I ask is that the patch has comparison filters 
({{TimelineCompareFilter}}) and operators related to comparisons. I'm not sure 
how they relate to filtering based on the column qualifiers. So far we've been 
talking about prefix matching for the most part...

On a similar note, how about the filter based on the limit as suggested by 
[~gtCarrera9]? Are we also mixing concepts there? The filters that are 
mentioned here do not select rows but rather pick out *contents* to return 
(i.e. columns or cells), whereas the limit filter would be selecting rows. I 
chatted with Joep on this, and I personally feel that it would be useful to 
maintain separation between limiting the contents that are returned (akin to 
contents of SELECT in SQL) and limiting the rows that are selected (akin to the 
WHERE clause in SQL).
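
To make the distinction concrete, a hypothetical HBase sketch (family and 
qualifier names are made up): a SingleColumnValueFilter selects rows (the 
WHERE side), whereas a qualifier filter such as ColumnPrefixFilter trims the 
contents of each returned row (the SELECT side).
{code}
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class SelectVsWhereExample {
  // "WHERE"-style filtering: decides which ROWS come back, based on a value.
  public static Scan rowsWithHighMetric(long threshold) {
    Scan scan = new Scan();
    scan.setFilter(new SingleColumnValueFilter(
        Bytes.toBytes("m"),              // column family (hypothetical)
        Bytes.toBytes("MEM_USAGE"),      // qualifier (hypothetical)
        CompareOp.GREATER,
        Bytes.toBytes(threshold)));
    return scan;
  }
}
{code}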

Thoughts?

 Decide which contents to retrieve and send back in response in TimelineReader
 -

 Key: YARN-3862
 URL: https://issues.apache.org/jira/browse/YARN-3862
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Varun Saxena
Assignee: Varun Saxena
 Attachments: YARN-3862-YARN-2928.wip.01.patch


 Currently, we will retrieve all the contents of the field if that field is 
 specified in the query API. In case of configs and metrics, this can become a 
 lot of data even though the user doesn't need it. So we need to provide a way 
 to query only a set of configs or metrics.
 As a comma spearated list of configs/metrics to be returned will be quite 
 cumbersome to specify, we have to support either of the following options :
 # Prefix match
 # Regex
 # Group the configs/metrics and query that group.
 We also need a facility to specify a metric time window to return metrics in 
 a that window. This may be useful in plotting graphs 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4066) Large number of queues choke fair scheduler

2015-08-20 Thread Johan Gustavsson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706035#comment-14706035
 ] 

Johan Gustavsson commented on YARN-4066:


As I don't seem to be able to edit the above comment and the tree ended up 
weird, I'll repaste it below:

root
  1
  q1
    veryhigh
    high
    default
    low
    verylow


 Large number of queues choke fair scheduler
 ---

 Key: YARN-4066
 URL: https://issues.apache.org/jira/browse/YARN-4066
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.7.1
Reporter: Johan Gustavsson
 Attachments: yarn-4066-1.patch


 Due to synchronization and all the loops performed during queue creation, 
 setting up a large number of queues (12000+) will completely choke the 
 scheduler. To deal with this, some optimization of 
 QueueManager.updateAllocationConfiguration(AllocationConfiguration 
 queueConf) should be done to reduce the number of unnecessary loops. The 
 attached patch has been tested to work with at least 96000 queues. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4024) YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat

2015-08-20 Thread Hong Zhiguo (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Zhiguo updated YARN-4024:
--
Attachment: YARN-4024-v6.patch

The findbugs warning is about unchecked raw types in AMLivelinessMonitor.java. 
I fixed it in the v6 patch. 

 YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat
 --

 Key: YARN-4024
 URL: https://issues.apache.org/jira/browse/YARN-4024
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Wangda Tan
Assignee: Hong Zhiguo
 Attachments: YARN-4024-draft-v2.patch, YARN-4024-draft-v3.patch, 
 YARN-4024-draft.patch, YARN-4024-v4.patch, YARN-4024-v5.patch, 
 YARN-4024-v6.patch


 Currently, the YARN RM NodesListManager resolves the IP address every time a 
 node heartbeats. When the DNS server becomes slow, NM heartbeats will be 
 blocked and cannot make progress.
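 A minimal sketch of the caching idea (hypothetical, not the actual patch):
 {code}
 import java.net.InetAddress;
 import java.net.UnknownHostException;
 import java.util.concurrent.ConcurrentHashMap;
 import java.util.concurrent.ConcurrentMap;

 // Hypothetical sketch: resolve each host once and reuse the result on later
 // heartbeats, so a slow DNS server does not stall the heartbeat path.
 // (A real implementation would also need invalidation, e.g. on node
 // re-registration or address change.)
 public final class CachedHostResolver {
   private final ConcurrentMap<String, InetAddress> cache =
       new ConcurrentHashMap<String, InetAddress>();

   public InetAddress resolve(String host) throws UnknownHostException {
     InetAddress addr = cache.get(host);
     if (addr == null) {
       addr = InetAddress.getByName(host);   // slow path: hits DNS once
       cache.putIfAbsent(host, addr);
     }
     return addr;
   }
 }
 {code}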



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4055) Report node resource utilization in heartbeat

2015-08-20 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705999#comment-14705999
 ] 

Wangda Tan commented on YARN-4055:
--

Merged this to branch-2

 Report node resource utilization in heartbeat
 -

 Key: YARN-4055
 URL: https://issues.apache.org/jira/browse/YARN-4055
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 2.7.1
Reporter: Inigo Goiri
Assignee: Inigo Goiri
 Fix For: 2.8.0

 Attachments: YARN-4055-v0.patch, YARN-4055-v1.patch


 Send the resource utilization from the node (obtained in the 
 NodeResourceMonitor) to the RM in the heartbeat.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2923) Support configuration based NodeLabelsProvider Service in Distributed Node Label Configuration Setup

2015-08-20 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706013#comment-14706013
 ] 

Naganarasimha G R commented on YARN-2923:
-

Thanks [~leftnoteasy]  [~vinodkv], for review and committing this jira !

 Support configuration based NodeLabelsProvider Service in Distributed Node 
 Label Configuration Setup 
 -

 Key: YARN-2923
 URL: https://issues.apache.org/jira/browse/YARN-2923
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Reporter: Naganarasimha G R
Assignee: Naganarasimha G R
 Fix For: 2.8.0

 Attachments: YARN-2923.20141204-1.patch, YARN-2923.20141210-1.patch, 
 YARN-2923.20150328-1.patch, YARN-2923.20150404-1.patch, 
 YARN-2923.20150517-1.patch, YARN-2923.20150817-1.patch, 
 YARN-2923.20150818-1.patch


 As part of distributed node label configuration, we need to support node 
 labels being configured in yarn-site.xml. On modification of the node label 
 configuration in yarn-site.xml, the NM should be able to get the modified 
 node labels from this NodeLabelsProvider service without an NM restart



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4066) Large number of queues choke fair scheduler

2015-08-20 Thread Johan Gustavsson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706031#comment-14706031
 ] 

Johan Gustavsson commented on YARN-4066:


Basically it's a tree as follows, ranging from 1 to 16000. For each user group 
there is one general queue and one with weight-divided sub-queues.

root
  - 1
  - q1
   - veryhigh
   - high
   - default
   - low
   - verylow


 Large number of queues choke fair scheduler
 ---

 Key: YARN-4066
 URL: https://issues.apache.org/jira/browse/YARN-4066
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.7.1
Reporter: Johan Gustavsson
 Attachments: yarn-4066-1.patch


 Due to synchronization and all the loops performed during queue creation, 
 setting up a large number of queues (12000+) will completely choke the 
 scheduler. To deal with this, some optimization of 
 QueueManager.updateAllocationConfiguration(AllocationConfiguration 
 queueConf) should be done to reduce the number of unnecessary loops. The 
 attached patch has been tested to work with at least 96000 queues. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3603) Application Attempts page confusing

2015-08-20 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-3603:
--
Attachment: 0003-YARN-3603.patch

Rebasing patch against latest trunk.

 Application Attempts page confusing
 ---

 Key: YARN-3603
 URL: https://issues.apache.org/jira/browse/YARN-3603
 Project: Hadoop YARN
  Issue Type: Bug
  Components: webapp
Affects Versions: 2.8.0
Reporter: Thomas Graves
Assignee: Sunil G
 Attachments: 0001-YARN-3603.patch, 0002-YARN-3603.patch, 
 0003-YARN-3603.patch, ahs1.png


 The application attempts page 
 (http://RM:8088/cluster/appattempt/appattempt_1431101480046_0003_01)
 is a bit confusing about what is going on.  I think the table of containers 
 there only shows running containers, and when the app is completed or killed 
 it is empty.  The table should have a label on it stating so.  
 Also, the AM Container field is a link when the app is running but not when 
 it is killed.  That might be confusing.
 There is no link to the logs on this page, but there is one in the app 
 attempt table when looking at 
 http://RM:8088/cluster/app/application_1431101480046_0003



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2923) Support configuration based NodeLabelsProvider Service in Distributed Node Label Configuration Setup

2015-08-20 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705533#comment-14705533
 ] 

Wangda Tan commented on YARN-2923:
--

Thanks for the update, [~Naganarasimha]. The latest patch LGTM; 
yarn-default.xml has wrong indentation, I will fix it while committing.

 Support configuration based NodeLabelsProvider Service in Distributed Node 
 Label Configuration Setup 
 -

 Key: YARN-2923
 URL: https://issues.apache.org/jira/browse/YARN-2923
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Reporter: Naganarasimha G R
Assignee: Naganarasimha G R
 Fix For: 2.8.0

 Attachments: YARN-2923.20141204-1.patch, YARN-2923.20141210-1.patch, 
 YARN-2923.20150328-1.patch, YARN-2923.20150404-1.patch, 
 YARN-2923.20150517-1.patch, YARN-2923.20150817-1.patch, 
 YARN-2923.20150818-1.patch


 As part of distributed node label configuration, we need to support node 
 labels being configured in yarn-site.xml. On modification of the node label 
 configuration in yarn-site.xml, the NM should be able to get the modified 
 node labels from this NodeLabelsProvider service without an NM restart



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4024) YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat

2015-08-20 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705403#comment-14705403
 ] 

Wangda Tan commented on YARN-4024:
--

[~zhiguohong], thanks for the update. The patch generally looks good; could 
you take a look at the findbugs warning? 
bq. FindBugs module: hadoop-yarn-server-resourcemanager
Sometimes it shows 0 findbugs issues when you click the findbugs report link, 
but there are some bugs. You can go to the yarn-resourcemanager project and 
run {{mvn clean findbugs:findbugs}}.

 YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat
 --

 Key: YARN-4024
 URL: https://issues.apache.org/jira/browse/YARN-4024
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Wangda Tan
Assignee: Hong Zhiguo
 Attachments: YARN-4024-draft-v2.patch, YARN-4024-draft-v3.patch, 
 YARN-4024-draft.patch, YARN-4024-v4.patch, YARN-4024-v5.patch


 Currently, the YARN RM NodesListManager resolves the IP address every time a 
 node heartbeats. When the DNS server becomes slow, NM heartbeats will be 
 blocked and cannot make progress.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4055) Report node resource utilization in heartbeat

2015-08-20 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705560#comment-14705560
 ] 

Wangda Tan commented on YARN-4055:
--

[~kasha], did you forget to merge it to branch-2? I didn't find it in branch-2.

 Report node resource utilization in heartbeat
 -

 Key: YARN-4055
 URL: https://issues.apache.org/jira/browse/YARN-4055
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 2.7.1
Reporter: Inigo Goiri
Assignee: Inigo Goiri
 Fix For: 2.8.0

 Attachments: YARN-4055-v0.patch, YARN-4055-v1.patch


 Send the resource utilization from the node (obtained in the 
 NodeResourceMonitor) to the RM in the heartbeat.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4066) Large number of queues choke fair scheduler

2015-08-20 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705015#comment-14705015
 ] 

Karthik Kambatla commented on YARN-4066:


The patch looks reasonable. Could you comment on your queue setup (depth, 
average breadth etc.) for the 96,000 queues that you tested this on? Just 
curious. 

 Large number of queues choke fair scheduler
 ---

 Key: YARN-4066
 URL: https://issues.apache.org/jira/browse/YARN-4066
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.7.1
Reporter: Johan Gustavsson
 Attachments: yarn-4066-1.patch


 Due to synchronization and all the loops performed during queue creation, 
 setting up a large number of queues (12000+) will completely choke the 
 scheduler. To deal with this, some optimization of 
 QueueManager.updateAllocationConfiguration(AllocationConfiguration 
 queueConf) should be done to reduce the number of unnecessary loops. The 
 attached patch has been tested to work with at least 96000 queues. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1644) RM-NM protocol changes and NodeStatusUpdater implementation to support container resizing

2015-08-20 Thread MENG DING (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704926#comment-14704926
 ] 

MENG DING commented on YARN-1644:
-

The findbugs warning is unrelated. The link given shows 0 warnings:
https://builds.apache.org/job/PreCommit-YARN-Build/8884/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-common.html
My own local tests did not show any findbugs warnings either.

Thanks for the review!
Thanks for the review!

 RM-NM protocol changes and NodeStatusUpdater implementation to support 
 container resizing
 -

 Key: YARN-1644
 URL: https://issues.apache.org/jira/browse/YARN-1644
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Reporter: Wangda Tan
Assignee: MENG DING
 Attachments: YARN-1644-YARN-1197.4.patch, 
 YARN-1644-YARN-1197.5.patch, YARN-1644-YARN-1197.6.patch, YARN-1644.1.patch, 
 YARN-1644.2.patch, YARN-1644.3.patch, yarn-1644.1.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4068) Support appUpdated event in TimelineV2 to publish details for movetoqueue, change in priority

2015-08-20 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-4068:
--
Issue Type: Sub-task  (was: Bug)
Parent: YARN-2928

 Support appUpdated event in TimelineV2 to publish details for movetoqueue, 
 change in priority
 -

 Key: YARN-4068
 URL: https://issues.apache.org/jira/browse/YARN-4068
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sunil G
Assignee: Sunil G

 YARN-4044 adds the appUpdated event changes to TimelineV1. This jira is to 
 track and port the appUpdated changes to V2 for:
 - movetoqueue
 - updateAppPriority



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4014) Support user cli interface in for Application Priority

2015-08-20 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706269#comment-14706269
 ] 

Rohith Sharma K S commented on YARN-4014:
-

bq. When a 2nd or subsequent AM attempt is spawned, we are never setting the 
old attempt to null in SchedulerApplication, correct? Hence there is a chance 
that we set the priority on the old attempt while the new attempt is getting 
created.. 
Right.. Since the latest priority is reset on the attempt after the attempt is 
updated in SchedulerApplication#setCurrentAttempt, I think there would NOT 
occur any possibility where currentAttempt has an old priority. So I believe 
currentAttempt NEED NOT be volatile.
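For illustration, a self-contained sketch of the ordering being argued (names 
simplified and hypothetical, not the actual SchedulerApplication code):
{code}
// Because the latest priority is re-applied to the new attempt inside
// setCurrentAttempt, the claim is that a reader never observes a current
// attempt carrying a stale priority, so volatile would not be required.
class SchedulerAppSketch {
  static class Attempt {
    private int priority;
    void setPriority(int p) { priority = p; }
  }

  private Attempt currentAttempt;
  private int latestPriority;

  synchronized void setCurrentAttempt(Attempt attempt) {
    currentAttempt = attempt;
    attempt.setPriority(latestPriority);  // reset onto the new attempt
  }

  synchronized void updatePriority(int p) {
    latestPriority = p;
    currentAttempt.setPriority(p);        // current attempt updated in place
  }
}
{code}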
[~jianhe] Could you give your opinion on this?

 Support user cli interface in for Application Priority
 --

 Key: YARN-4014
 URL: https://issues.apache.org/jira/browse/YARN-4014
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: client, resourcemanager
Reporter: Rohith Sharma K S
Assignee: Rohith Sharma K S
 Attachments: 0001-YARN-4014-V1.patch, 0001-YARN-4014.patch, 
 0002-YARN-4014.patch, 0003-YARN-4014.patch, 0004-YARN-4014.patch, 
 0004-YARN-4014.patch


 Track the changes for user-RM client protocol i.e ApplicationClientProtocol 
 changes and discussions in this jira.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4067) DRC from 2.7.1 could set negative available resource

2015-08-20 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated YARN-4067:
---
Description: as mentioned in YARN-4045 by [~wangda], DRC could set a negative 
resource if the available resource's memory goes negative.

 DRC from 2.7.1 could set negative available resource
 

 Key: YARN-4067
 URL: https://issues.apache.org/jira/browse/YARN-4067
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Chang Li
Assignee: Chang Li
 Attachments: YARN-4067.patch


 as mentioned in YARN-4045 by [~wangda], DRC could set a negative resource if 
 the available resource's memory goes negative.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4045) Negative avaialbleMB is being reported for root queue.

2015-08-20 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705789#comment-14705789
 ] 

Chang Li commented on YARN-4045:


[~wangda] I opened a jira YARN-4067 regarding drc issue in 2.7.1, and post a 
patch.

 Negative avaialbleMB is being reported for root queue.
 --

 Key: YARN-4045
 URL: https://issues.apache.org/jira/browse/YARN-4045
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Rushabh S Shah

 We recently deployed 2.7 in one of our clusters.
 We are seeing negative availableMB being reported for queue=root.
 This is from the jmx output:
 {noformat}
 <clusterMetrics>
 ...
 <availableMB>-163328</availableMB>
 ...
 </clusterMetrics>
 {noformat}
 The following is the RM log:
 {noformat}
 2015-08-10 14:42:28,280 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854 
 absoluteUsedCapacity=1.0029854 used=memory:5332480, vCores:6202 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:28,404 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743 
 absoluteUsedCapacity=1.0032743 used=memory:5334016, vCores:6212 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:30,913 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854 
 absoluteUsedCapacity=1.0029854 used=memory:5332480, vCores:6202 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:30,913 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743 
 absoluteUsedCapacity=1.0032743 used=memory:5334016, vCores:6212 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:33,093 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854 
 absoluteUsedCapacity=1.0029854 used=memory:5332480, vCores:6202 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:33,093 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743 
 absoluteUsedCapacity=1.0032743 used=memory:5334016, vCores:6212 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:35,548 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854 
 absoluteUsedCapacity=1.0029854 used=memory:5332480, vCores:6202 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:35,549 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743 
 absoluteUsedCapacity=1.0032743 used=memory:5334016, vCores:6212 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:39,088 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854 
 absoluteUsedCapacity=1.0029854 used=memory:5332480, vCores:6202 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:39,089 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743 
 absoluteUsedCapacity=1.0032743 used=memory:5334016, vCores:6212 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:39,338 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854 
 absoluteUsedCapacity=1.0029854 used=memory:5332480, vCores:6202 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:39,339 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743 
 absoluteUsedCapacity=1.0032743 used=memory:5334016, vCores:6212 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:39,757 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854 
 absoluteUsedCapacity=1.0029854 used=memory:5332480, vCores:6202 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:39,758 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743 
 absoluteUsedCapacity=1.0032743 used=memory:5334016, vCores:6212 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:43,056 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854 
 absoluteUsedCapacity=1.0029854 used=memory:5332480, vCores:6202 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:43,070 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743 
 absoluteUsedCapacity=1.0032743 used=memory:5334016, vCores:6212 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:44,486 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: 

[jira] [Updated] (YARN-4067) DRC from 2.7.1 could set negative available resource

2015-08-20 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated YARN-4067:
---
Fix Version/s: 2.7.1

 DRC from 2.7.1 could set negative available resource
 

 Key: YARN-4067
 URL: https://issues.apache.org/jira/browse/YARN-4067
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Chang Li
Assignee: Chang Li
 Fix For: 2.7.1

 Attachments: YARN-4067.patch


 as mentioned in YARN-4045 by [~wangda], DRC could set a negative resource if 
 the available resource's memory goes negative.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4067) DRC from 2.7.1 could set negative available resource

2015-08-20 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated YARN-4067:
---
Affects Version/s: 2.7.1

 DRC from 2.7.1 could set negative available resource
 

 Key: YARN-4067
 URL: https://issues.apache.org/jira/browse/YARN-4067
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Chang Li
Assignee: Chang Li
 Fix For: 2.7.1

 Attachments: YARN-4067.patch


 as mentioned in YARN-4045 by [~wangda], DRC could set a negative resource if 
 the available resource's memory goes negative.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4067) DRC from 2.7.1 could set negative available resource

2015-08-20 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated YARN-4067:
---
Attachment: YARN-4067.patch

 DRC from 2.7.1 could set negative available resource
 

 Key: YARN-4067
 URL: https://issues.apache.org/jira/browse/YARN-4067
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Chang Li
Assignee: Chang Li
 Attachments: YARN-4067.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4067) DRC from 2.7.1 could set negative available resource

2015-08-20 Thread Chang Li (JIRA)
Chang Li created YARN-4067:
--

 Summary: DRC from 2.7.1 could set negative available resource
 Key: YARN-4067
 URL: https://issues.apache.org/jira/browse/YARN-4067
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Chang Li
Assignee: Chang Li






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4066) Large number of queues choke fair scheduler

2015-08-20 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705094#comment-14705094
 ] 

Arun Suresh commented on YARN-4066:
---

+1 pending jenkins

 Large number of queues choke fair scheduler
 ---

 Key: YARN-4066
 URL: https://issues.apache.org/jira/browse/YARN-4066
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.7.1
Reporter: Johan Gustavsson
 Attachments: yarn-4066-1.patch


 Due to synchronization and all the loops performed during queue creation, 
 setting up a large number of queues (12000+) will completely choke the 
 scheduler. To deal with this, some optimization of 
 QueueManager.updateAllocationConfiguration(AllocationConfiguration 
 queueConf) should be done to reduce the number of unnecessary loops. The 
 attached patch has been tested to work with at least 96000 queues. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2923) Support configuration based NodeLabelsProvider Service in Distributed Node Label Configuration Setup

2015-08-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705570#comment-14705570
 ] 

Hudson commented on YARN-2923:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8329 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8329/])
YARN-2923. Support configuration based NodeLabelsProvider Service in 
Distributed Node Label Configuration Setup. (Naganarasimha G R) (wangda: rev 
fc07464d1a48b0413da5e921614430e41263fdb7)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/nodelabels/AbstractNodeLabelsProvider.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/nodelabels/ConfigurationNodeLabelsProvider.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/nodelabels/TestConfigurationNodeLabelsProvider.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdaterForLabels.java
* hadoop-yarn-project/CHANGES.txt


 Support configuration based NodeLabelsProvider Service in Distributed Node 
 Label Configuration Setup 
 -

 Key: YARN-2923
 URL: https://issues.apache.org/jira/browse/YARN-2923
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Reporter: Naganarasimha G R
Assignee: Naganarasimha G R
 Fix For: 2.8.0

 Attachments: YARN-2923.20141204-1.patch, YARN-2923.20141210-1.patch, 
 YARN-2923.20150328-1.patch, YARN-2923.20150404-1.patch, 
 YARN-2923.20150517-1.patch, YARN-2923.20150817-1.patch, 
 YARN-2923.20150818-1.patch


 As part of distributed node label configuration, we need to support node 
 labels being configured in yarn-site.xml. On modification of the node label 
 configuration in yarn-site.xml, the NM should be able to get the modified 
 node labels from this NodeLabelsProvider service without an NM restart



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4045) Negative avaialbleMB is being reported for root queue.

2015-08-20 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705667#comment-14705667
 ] 

Chang Li commented on YARN-4045:


Hi [~wangda], for the first case, should we check the availableResource of the 
root queue when a node gets removed? Then if available memory is negative, we 
proceed to unreserve resources until the available memory of the root queue 
becomes positive. 

 Negative avaialbleMB is being reported for root queue.
 --

 Key: YARN-4045
 URL: https://issues.apache.org/jira/browse/YARN-4045
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Rushabh S Shah

 We recently deployed 2.7 in one of our clusters.
 We are seeing negative availableMB being reported for queue=root.
 This is from the jmx output:
 {noformat}
 <clusterMetrics>
 ...
 <availableMB>-163328</availableMB>
 ...
 </clusterMetrics>
 {noformat}
 The following is the RM log:
 {noformat}
 2015-08-10 14:42:28,280 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854 
 absoluteUsedCapacity=1.0029854 used=memory:5332480, vCores:6202 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:28,404 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743 
 absoluteUsedCapacity=1.0032743 used=memory:5334016, vCores:6212 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:30,913 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854 
 absoluteUsedCapacity=1.0029854 used=memory:5332480, vCores:6202 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:30,913 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743 
 absoluteUsedCapacity=1.0032743 used=memory:5334016, vCores:6212 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:33,093 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854 
 absoluteUsedCapacity=1.0029854 used=memory:5332480, vCores:6202 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:33,093 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743 
 absoluteUsedCapacity=1.0032743 used=memory:5334016, vCores:6212 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:35,548 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854 
 absoluteUsedCapacity=1.0029854 used=memory:5332480, vCores:6202 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:35,549 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743 
 absoluteUsedCapacity=1.0032743 used=memory:5334016, vCores:6212 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:39,088 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854 
 absoluteUsedCapacity=1.0029854 used=memory:5332480, vCores:6202 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:39,089 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743 
 absoluteUsedCapacity=1.0032743 used=memory:5334016, vCores:6212 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:39,338 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854 
 absoluteUsedCapacity=1.0029854 used=memory:5332480, vCores:6202 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:39,339 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743 
 absoluteUsedCapacity=1.0032743 used=memory:5334016, vCores:6212 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:39,757 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854 
 absoluteUsedCapacity=1.0029854 used=memory:5332480, vCores:6202 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:39,758 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743 
 absoluteUsedCapacity=1.0032743 used=memory:5334016, vCores:6212 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:43,056 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854 
 absoluteUsedCapacity=1.0029854 used=memory:5332480, vCores:6202 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:43,070 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743 
 absoluteUsedCapacity=1.0032743 

[jira] [Updated] (YARN-4067) DRC from 2.7.1 could set negative available resource

2015-08-20 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated YARN-4067:
---
Attachment: (was: YARN-4067.patch)

 DRC from 2.7.1 could set negative available resource
 

 Key: YARN-4067
 URL: https://issues.apache.org/jira/browse/YARN-4067
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Chang Li
Assignee: Chang Li
 Fix For: 2.7.1


 as mentioned in YARN-4045 by [~wangda], DRC could set a negative resource if 
 the available resource's memory goes negative.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4067) available resource could be set negative

2015-08-20 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated YARN-4067:
---
Summary: available resource could be set negative  (was: DRC from 2.7.1 
could set negative available resource)

 available resource could be set negative
 

 Key: YARN-4067
 URL: https://issues.apache.org/jira/browse/YARN-4067
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Chang Li
Assignee: Chang Li
 Fix For: 2.7.1


 as mentioned in YARN-4045 by [~wangda], DRC could set a negative resource if 
 the available resource's memory goes negative.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4067) available resource could be set negative

2015-08-20 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated YARN-4067:
---
Attachment: YARN-4067.patch

 available resource could be set negative
 

 Key: YARN-4067
 URL: https://issues.apache.org/jira/browse/YARN-4067
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Chang Li
Assignee: Chang Li
 Fix For: 2.7.1

 Attachments: YARN-4067.patch


 as mentioned in YARN-4045 by [~wangda], available memory could be negative 
 due to reservation; propose to use componentwiseMax to set negative-valued 
 resources to zero
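 For illustration, a minimal sketch of the proposed capping (class and method 
 names hypothetical, not the actual updateQueueStatistics code):
 {code}
 import org.apache.hadoop.yarn.api.records.Resource;
 import org.apache.hadoop.yarn.util.resource.Resources;

 public class AvailableResourceSketch {
   // Taking the componentwise max against a zero resource clamps a negative
   // available memory (or vCores) component to zero.
   public static Resource cappedAvailable(Resource cluster, Resource used) {
     Resource available = Resources.subtract(cluster, used);
     return Resources.componentwiseMax(available, Resources.none());
   }
 }
 {code}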



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4067) available resource could be set negative

2015-08-20 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated YARN-4067:
---
Description: as mentioned in YARN-4045 by [~wangda], available memory could be 
negative due to reservation; propose to use componentwiseMax to set 
negative-valued resources to zero  (was: as mentioned in YARN-4045 by 
[~wangda], DRC could set a negative resource if the available resource's 
memory goes negative.)

 available resource could be set negative
 

 Key: YARN-4067
 URL: https://issues.apache.org/jira/browse/YARN-4067
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Chang Li
Assignee: Chang Li
 Fix For: 2.7.1

 Attachments: YARN-4067.patch


 as mentioned in YARN-4045 by [~wangda], available memory could be negative 
 due to reservation; propose to use componentwiseMax to set negative-valued 
 resources to zero



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4045) Negative avaialbleMB is being reported for root queue.

2015-08-20 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705878#comment-14705878
 ] 

Chang Li commented on YARN-4045:


[~leftnoteasy], sorry, I was tagging the wrong user name

 Negative avaialbleMB is being reported for root queue.
 --

 Key: YARN-4045
 URL: https://issues.apache.org/jira/browse/YARN-4045
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Rushabh S Shah

 We recently deployed 2.7 in one of our clusters.
 We are seeing negative availableMB being reported for queue=root.
 This is from the jmx output:
 {noformat}
 <clusterMetrics>
 ...
 <availableMB>-163328</availableMB>
 ...
 </clusterMetrics>
 {noformat}
 The following is the RM log:
 {noformat}
 2015-08-10 14:42:28,280 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854 
 absoluteUsedCapacity=1.0029854 used=memory:5332480, vCores:6202 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:28,404 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743 
 absoluteUsedCapacity=1.0032743 used=memory:5334016, vCores:6212 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:30,913 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854 
 absoluteUsedCapacity=1.0029854 used=memory:5332480, vCores:6202 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:30,913 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743 
 absoluteUsedCapacity=1.0032743 used=memory:5334016, vCores:6212 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:33,093 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854 
 absoluteUsedCapacity=1.0029854 used=memory:5332480, vCores:6202 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:33,093 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743 
 absoluteUsedCapacity=1.0032743 used=memory:5334016, vCores:6212 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:35,548 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854 
 absoluteUsedCapacity=1.0029854 used=memory:5332480, vCores:6202 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:35,549 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743 
 absoluteUsedCapacity=1.0032743 used=memory:5334016, vCores:6212 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:39,088 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854 
 absoluteUsedCapacity=1.0029854 used=memory:5332480, vCores:6202 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:39,089 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743 
 absoluteUsedCapacity=1.0032743 used=memory:5334016, vCores:6212 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:39,338 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854 
 absoluteUsedCapacity=1.0029854 used=memory:5332480, vCores:6202 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:39,339 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743 
 absoluteUsedCapacity=1.0032743 used=memory:5334016, vCores:6212 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:39,757 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854 
 absoluteUsedCapacity=1.0029854 used=memory:5332480, vCores:6202 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:39,758 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743 
 absoluteUsedCapacity=1.0032743 used=memory:5334016, vCores:6212 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:43,056 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: completedContainer queue=root usedCapacity=1.0029854 
 absoluteUsedCapacity=1.0029854 used=memory:5332480, vCores:6202 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:43,070 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: assignedContainer queue=root usedCapacity=1.0032743 
 absoluteUsedCapacity=1.0032743 used=memory:5334016, vCores:6212 
 cluster=memory:5316608, vCores:28320
 2015-08-10 14:42:44,486 [ResourceManager Event Processor] INFO 
 capacity.ParentQueue: completedContainer queue=root 

[jira] [Commented] (YARN-4067) available resource could be set negative

2015-08-20 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705880#comment-14705880
 ] 

Chang Li commented on YARN-4067:


[~leftnoteasy]

 available resource could be set negative
 

 Key: YARN-4067
 URL: https://issues.apache.org/jira/browse/YARN-4067
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Chang Li
Assignee: Chang Li
 Fix For: 2.7.1

 Attachments: YARN-4067.patch


 as mentioned in YARN-4045 by [~wangda], available memory could be negative 
 due to reservation; propose to use componentwiseMax in updateQueueStatistics 
 in order to cap negative values to zero



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4067) available resource could be set negative

2015-08-20 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated YARN-4067:
---
Description: as mentioned in YARN-4045 by [~wangda], available memory could be 
negative due to reservation; propose to use componentwiseMax in 
updateQueueStatistics in order to cap negative values to zero  (was: as 
mentioned in YARN-4045 by [~wangda], available memory could be negative due to 
reservation; propose to use componentwiseMax to set negative-valued resources 
to zero)

 available resource could be set negative
 

 Key: YARN-4067
 URL: https://issues.apache.org/jira/browse/YARN-4067
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Chang Li
Assignee: Chang Li
 Fix For: 2.7.1

 Attachments: YARN-4067.patch


 as mentioned in YARN-4045 by [~wangda], available memory could be negative 
 due to reservation; propose to use componentwiseMax in updateQueueStatistics 
 in order to cap negative values to zero



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4067) available resource could be set negative

2015-08-20 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705858#comment-14705858
 ] 

Chang Li commented on YARN-4067:


[~wangda] could you please help take a look at this proposed change? Thanks

 available resource could be set negative
 

 Key: YARN-4067
 URL: https://issues.apache.org/jira/browse/YARN-4067
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Chang Li
Assignee: Chang Li
 Fix For: 2.7.1

 Attachments: YARN-4067.patch


 as mentioned in YARN-4045 by [~wangda], available memory could be negative 
 due to reservation; propose to use componentwiseMax in updateQueueStatistics 
 in order to cap negative values to zero



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4067) available resource could be set negative

2015-08-20 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated YARN-4067:
---
Description: as mentioned in YARN-4045 by [~leftnoteasy], available memory 
could be negative due to reservation; propose to use componentwiseMax in 
updateQueueStatistics in order to cap negative values to zero  (was: as 
mentioned in YARN-4045 by [~wangda], available memory could be negative due to 
reservation; propose to use componentwiseMax in updateQueueStatistics in order 
to cap negative values to zero)

 available resource could be set negative
 

 Key: YARN-4067
 URL: https://issues.apache.org/jira/browse/YARN-4067
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.1
Reporter: Chang Li
Assignee: Chang Li
 Fix For: 2.7.1

 Attachments: YARN-4067.patch


 as mentioned in YARN-4045 by [~leftnoteasy], available memory could be 
 negative due to reservation; propose to use componentwiseMax in 
 updateQueueStatistics in order to cap negative values to zero



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3868) ContainerManager recovery for container resizing

2015-08-20 Thread MENG DING (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

MENG DING updated YARN-3868:

Attachment: YARN-3868-YARN-1197.5.patch

Attaching the latest patch to add a case for testing the recovery of container 
resource changes in {{TestNMLeveldbStateStoreService}}.

 ContainerManager recovery for container resizing
 

 Key: YARN-3868
 URL: https://issues.apache.org/jira/browse/YARN-3868
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Reporter: MENG DING
Assignee: MENG DING
 Attachments: YARN-3868-YARN-1197.3.patch, 
 YARN-3868-YARN-1197.4.patch, YARN-3868-YARN-1197.5.patch, YARN-3868.1.patch, 
 YARN-3868.2.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3868) ContainerManager recovery for container resizing

2015-08-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705887#comment-14705887
 ] 

Hadoop QA commented on YARN-3868:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  15m 54s | Findbugs (version ) appears to 
be broken on YARN-1197. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 3 new or modified test files. |
| {color:green}+1{color} | javac |   8m  0s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 52s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 20s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  5s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 16s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   7m 17s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| | |  45m 19s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12751513/YARN-3868-YARN-1197.5.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-1197 / 4dd004b |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8890/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8890/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf908.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8890/console |


This message was automatically generated.

 ContainerManager recovery for container resizing
 

 Key: YARN-3868
 URL: https://issues.apache.org/jira/browse/YARN-3868
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Reporter: MENG DING
Assignee: MENG DING
 Attachments: YARN-3868-YARN-1197.3.patch, 
 YARN-3868-YARN-1197.4.patch, YARN-3868-YARN-1197.5.patch, YARN-3868.1.patch, 
 YARN-3868.2.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)