[jira] [Commented] (YARN-2194) Add Cgroup support for RedHat 7
[ https://issues.apache.org/jira/browse/YARN-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287446#comment-14287446 ] Karthik Kambatla commented on YARN-2194: container-executor.c - the new method significantly duplicates the existing one. Can we have separate methods to capture the differences and leave the original method as is? Add Cgroup support for RedHat 7 --- Key: YARN-2194 URL: https://issues.apache.org/jira/browse/YARN-2194 Project: Hadoop YARN Issue Type: Improvement Reporter: Wei Yan Assignee: Wei Yan Attachments: YARN-2194-1.patch In previous versions of RedHat, we could build custom cgroup hierarchies using the cgconfig command from the libcgroup package. In RedHat 7, the libcgroup package is deprecated and its use is not recommended, since it can easily create conflicts with the default cgroup hierarchy. systemd is provided and recommended for cgroup management instead, and we need to add support for it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
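Since the issue is about replacing libcgroup-style configuration with systemd, an illustrative sketch of the RedHat 7 equivalent of a cgconfig.conf group is a slice unit. The file name and property values below are assumptions for illustration, not part of YARN-2194 or its patch:

```ini
# Hypothetical example: /etc/systemd/system/hadoop-yarn.slice
# (name and values are illustrative assumptions, not from the patch)
[Unit]
Description=Slice for Hadoop YARN containers

[Slice]
# RHEL 7-era systemd resource-control property; a cgconfig.conf group
# would have expressed the same limit under the cpu controller.
CPUShares=1024
```

Containers could then be started within such a slice instead of a hierarchy mounted via cgconfig.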
[jira] [Commented] (YARN-1743) Decorate event transitions and the event-types with their behaviour
[ https://issues.apache.org/jira/browse/YARN-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287278#comment-14287278 ] Hadoop QA commented on YARN-1743: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12693846/YARN-1743-2.patch against trunk revision 786dbdf. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:red}-1 javac{color}. The applied patch generated 1205 javac compiler warnings (more than the trunk's current 1204 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 3 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6389//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6389//artifact/patchprocess/patchReleaseAuditProblems.txt Javac warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6389//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6389//console This message is automatically generated.
Decorate event transitions and the event-types with their behaviour --- Key: YARN-1743 URL: https://issues.apache.org/jira/browse/YARN-1743 Project: Hadoop YARN Issue Type: Bug Reporter: Vinod Kumar Vavilapalli Assignee: Jeff Zhang Labels: documentation Attachments: NodeManager.gv, NodeManager.pdf, YARN-1743-2.patch, YARN-1743.patch Helps to annotate the transitions with (start-state, end-state) pair and the events with (source, destination) pair. Not just readability, we may also use them to generate the event diagrams across components. Not a blocker for 0.23, but let's see. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2973) Capacity scheduler configuration ACLs not work.
[ https://issues.apache.org/jira/browse/YARN-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287301#comment-14287301 ] Rohith commented on YARN-2973: -- Going through the queue ACLs more deeply, I believe that, as per the design, it is expected that the ACLs for the root queue be disabled. ACLs can be disabled by configuring a (space) as the value for the queue's ACL configuration. Capacity scheduler configuration ACLs not work. --- Key: YARN-2973 URL: https://issues.apache.org/jira/browse/YARN-2973 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler Affects Versions: 2.5.0 Environment: ubuntu 12.04, cloudera manager, cdh5.2.1 Reporter: Jimmy Song Assignee: Rohith Labels: acl, capacity-scheduler, yarn I followed this page to configure yarn: http://archive.cloudera.com/cdh5/cdh/5/hadoop/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html. I configured YARN to use the capacity scheduler in yarn-site.xml by setting yarn.resourcemanager.scheduler.class to org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler. Then I modified capacity-scheduler.xml:
___
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>default,extract,report,tool</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.state</name>
    <value>RUNNING</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.acl_submit_applications</name>
    <value>jcsong2, y2 </value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.acl_administer_queue</name>
    <value>jcsong2, y2 </value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.capacity</name>
    <value>35</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.extract.acl_submit_applications</name>
    <value>jcsong2 </value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.extract.acl_administer_queue</name>
    <value>jcsong2 </value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.extract.capacity</name>
    <value>15</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.report.acl_submit_applications</name>
    <value>y2 </value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.report.acl_administer_queue</name>
    <value>y2 </value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.report.capacity</name>
    <value>35</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.tool.acl_submit_applications</name>
    <value> </value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.tool.acl_administer_queue</name>
    <value> </value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.tool.capacity</name>
    <value>15</value>
  </property>
</configuration>
___
I have enabled the ACLs in yarn-site.xml, but the user jcsong2 can submit applications to every queue. The queue ACLs don't work! And the queues use more capacity than they were configured with! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
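As a sketch of the semantics Rohith describes (a minimal re-implementation for illustration, not Hadoop's actual AccessControlList class): "*" admits everyone, a lone space admits no one, and in the capacity scheduler a parent queue's ACL is effectively OR-ed with the child's, which is why a root queue left at its default "*" lets any user into every queue regardless of the per-queue settings above.

```java
import java.util.Arrays;
import java.util.List;

// Minimal sketch of capacity-scheduler ACL semantics. The ACL string is
// "user1,user2 group1,group2"; "*" means everyone, a single space means no one.
public class QueueAclSketch {
    static boolean allows(String acl, String user, List<String> groups) {
        if (acl.equals("*")) return true;
        if (acl.trim().isEmpty()) return false;          // " " disables access
        String[] parts = acl.split(" ", 2);
        List<String> users = Arrays.asList(parts[0].split(","));
        if (users.contains(user)) return true;
        if (parts.length > 1) {
            for (String g : parts[1].split(",")) {
                if (groups.contains(g)) return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> noGroups = Arrays.asList();
        // The child-queue ACL alone would reject user y2 ...
        System.out.println(allows("jcsong2 ", "y2", noGroups));   // false
        // ... but OR-ing in an inherited root default of "*" admits everyone.
        System.out.println(allows("jcsong2 ", "y2", noGroups)
                || allows("*", "y2", noGroups));                  // true
        // A single space on the root queue closes that hole.
        System.out.println(allows(" ", "y2", noGroups));          // false
    }
}
```

This is why setting the root queue's ACLs to a space, as suggested in the comment, is the fix for jcsong2 being able to submit everywhere.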
[jira] [Commented] (YARN-3084) YARN REST API 2.6 - can't submit simple job in hortonworks-allways job failes to run
[ https://issues.apache.org/jira/browse/YARN-3084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287336#comment-14287336 ] Michael Br commented on YARN-3084: -- Hi, first, thanks for your quick reply... 1. Where can I see the queue? I looked for it, but I still don't get how to reach it. 2. I can't reach the logs... The only clue I have about why this job fails is the information below, which is all I can see from the URL http://192.168.38.133:8088/ws/v1/cluster/apps/application_1421661392788_0039. The links there don't work when I try to use them for the logs (http://sandbox.hortonworks.com:8088/proxy/application_1421661392788_0039/ [I replace and insert my host IP]), and the link for the container logs doesn't work either (http://sandbox.hortonworks.com:8042/node/containerlogs/container_1421661392788_0039_02_01/dr.who):
--
<app>
  <id>application_1421661392788_0039</id>
  <user>dr.who</user>
  <name>test_33</name>
  <queue>default</queue>
  <state>FAILED</state>
  <finalStatus>FAILED</finalStatus>
  <progress>0.0</progress>
  <trackingUI>History</trackingUI>
  <trackingUrl>http://sandbox.hortonworks.com:8088/cluster/app/application_1421661392788_0039</trackingUrl>
  <diagnostics>Application application_1421661392788_0039 failed 2 times due to AM Container for appattempt_1421661392788_0039_02 exited with exitCode: 0 For more detailed output, check application tracking page:http://sandbox.hortonworks.com:8088/proxy/application_1421661392788_0039/Then, click on links to logs of each attempt. Diagnostics: Failing this attempt. Failing the application.</diagnostics>
  <clusterId>1421661392788</clusterId>
  <applicationType>MAPREDUCE</applicationType>
  <applicationTags>michael,pi example</applicationTags>
  <startedTime>1421923561425</startedTime>
  <finishedTime>1421923723426</finishedTime>
  <elapsedTime>162001</elapsedTime>
  <amContainerLogs>http://sandbox.hortonworks.com:8042/node/containerlogs/container_1421661392788_0039_02_01/dr.who</amContainerLogs>
  <amHostHttpAddress>sandbox.hortonworks.com:8042</amHostHttpAddress>
  <allocatedMB>-1</allocatedMB>
  <allocatedVCores>-1</allocatedVCores>
  <runningContainers>-1</runningContainers>
  <memorySeconds>200857</memorySeconds>
  <vcoreSeconds>160</vcoreSeconds>
  <preemptedResourceMB>0</preemptedResourceMB>
  <preemptedResourceVCores>0</preemptedResourceVCores>
  <numNonAMContainerPreempted>0</numNonAMContainerPreempted>
  <numAMContainerPreempted>0</numAMContainerPreempted>
</app>
--
YARN REST API 2.6 - can't submit simple job in hortonworks-allways job failes to run Key: YARN-3084 URL: https://issues.apache.org/jira/browse/YARN-3084 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, webapp Affects Versions: 2.6.0 Environment: Using eclipse on windows 7 (client) to run the map reduce job on the host of Hortonworks HDP 2.2 (hortonworks is on vmware version 6.0.2 build-1744117) Reporter: Michael Br Priority: Minor Hello, 1. I want to run the simple Map Reduce job example (with the REST API 2.6 for yarn applications) and to calculate PI… for now it doesn't work. When I use the command in the hortonworks terminal it works: "hadoop jar /usr/hdp/2.2.0.0-2041/hadoop-mapreduce/hadoop-mapreduce-examples-2.6.0.2.2.0.0-2041.jar pi 10 10". But I want to submit the job with the REST API and not in the terminal as a command line.
[http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Applications_APISubmit_Application] 2. I do succeed with other REST API requests: get state, get new application id, and even kill (change state), but when I try to submit my example, the response is: -- -- The Response Header: Key : null ,Value : [HTTP/1.1 202 Accepted] Key : Date ,Value : [Thu, 22 Jan 2015 07:47:24 GMT, Thu, 22 Jan 2015 07:47:24 GMT] Key : Content-Length ,Value : [0] Key : Expires ,Value : [Thu, 22 Jan 2015 07:47:24 GMT, Thu, 22 Jan 2015 07:47:24 GMT] Key : Location ,Value : [http://[my port]:8088/ws/v1/cluster/apps/application_1421661392788_0038] Key : Content-Type ,Value : [application/json] Key : Server ,Value : [Jetty(6.1.26.hwx)] Key : Pragma ,Value : [no-cache, no-cache] Key : Cache-Control ,Value : [no-cache] The Response Body: Null (No Response) -- -- 3. I need help with filling the HTTP request body. I am doing a POST HTTP request and I know that I am doing it right (in Java). 4. I think the problem is in the request body. 5. I used this guy's answer to help me build my map reduce example xml but it does not work:
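For illustration, a minimal skeleton of the Submit Application request body (the field names follow the Cluster Applications API page linked above; every value is a placeholder, and the command is schematic: a real MapReduce submission also needs the local resources, classpath, and environment that the `hadoop jar` client normally sets up, which is one plausible reason a hand-built submission exits without the AM ever registering):

```json
{
  "application-id": "application_1421661392788_0039",
  "application-name": "test_33",
  "application-type": "MAPREDUCE",
  "queue": "default",
  "max-app-attempts": 2,
  "unmanaged-AM": false,
  "resource": {
    "memory": 1024,
    "vCores": 1
  },
  "am-container-spec": {
    "commands": {
      "command": "placeholder command that launches the application master"
    }
  }
}
```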
[jira] [Commented] (YARN-2684) FairScheduler should tolerate queue configuration changes across RM restarts
[ https://issues.apache.org/jira/browse/YARN-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287320#comment-14287320 ] Rohith commented on YARN-2684: -- [~kasha] Kindly review the patch FairScheduler should tolerate queue configuration changes across RM restarts Key: YARN-2684 URL: https://issues.apache.org/jira/browse/YARN-2684 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler, resourcemanager Affects Versions: 2.5.1 Reporter: Karthik Kambatla Assignee: Rohith Priority: Critical Attachments: 0001-YARN-2684.patch YARN-2308 fixes this issue for CS, this JIRA is to fix it for FS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3078) LogCLIHelpers lacks of a blank space before string 'does not exist'
[ https://issues.apache.org/jira/browse/YARN-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287491#comment-14287491 ] Hudson commented on YARN-3078: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #78 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/78/]) YARN-3078. LogCLIHelpers lacks of a blank space before string 'does not exist'. Contributed by Sam Liu. (ozawa: rev 5712c9f96a2cf4ff63d36906ab3876444c0cddec) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/LogCLIHelpers.java * hadoop-yarn-project/CHANGES.txt LogCLIHelpers lacks of a blank space before string 'does not exist' --- Key: YARN-3078 URL: https://issues.apache.org/jira/browse/YARN-3078 Project: Hadoop YARN Issue Type: Bug Components: log-aggregation Affects Versions: 2.6.0 Reporter: sam liu Priority: Minor Fix For: 2.7.0 Attachments: YARN-3078.001.patch, YARN-3078.002.patch LogCLIHelpers lacks a blank space before the string 'does not exist', which produces an incorrect return message. For example, I ran the command 'yarn logs -applicationId application_1421742816585_0003', and the return message includes 'logs/application_1421742816585_0003does not exist'. Obviously this is incorrect; the correct return message should be 'logs/application_1421742816585_0003 does not exist' -- This message was sent by Atlassian JIRA (v6.3.4#6332)
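The bug and its fix reduce to a missing space in a string concatenation. A minimal sketch (the method names are illustrative, not the actual LogCLIHelpers code):

```java
// Sketch of the message bug YARN-3078 fixes: the remote log directory is
// concatenated with the suffix "does not exist" without a separating space.
public class LogMessageSketch {
    static String before(String logDir) {
        return logDir + "does not exist";   // yields "...0003does not exist"
    }
    static String after(String logDir) {
        return logDir + " does not exist";  // yields "...0003 does not exist"
    }
    public static void main(String[] args) {
        String dir = "logs/application_1421742816585_0003";
        System.out.println(before(dir));
        System.out.println(after(dir));
    }
}
```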
[jira] [Reopened] (YARN-3083) Resource format isn't correct in Fair Scheduler web page
[ https://issues.apache.org/jira/browse/YARN-3083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reopened YARN-3083: -- This was fixed by YARN-1975. Reopening to resolve this as a duplicate of that. Resource format isn't correct in Fair Scheduler web page Key: YARN-3083 URL: https://issues.apache.org/jira/browse/YARN-3083 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.3.0 Reporter: Xia Hu Labels: UI, fairscheduler Attachments: fairscheduler-ui-format.patch In my fair scheduler web page, the resources shown for each queue look like this: &lt;memory:65536, vCores:0 &gt;, so obviously &lt; should be shown as <, but it isn't. After reading the code, I suppose it's because the method StringEscapeUtils.escapeHtml is called twice, but I only found one place where it is called. Anyway, I modified the code, and this problem seems to be solved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
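The symptom described (literal "&lt;" appearing in the page) is the classic signature of escaping HTML twice. A minimal sketch with a hand-rolled escape function (for illustration; the real code uses commons-lang's StringEscapeUtils.escapeHtml):

```java
// Escaping "<" once yields "&lt;", which the browser renders as "<".
// Escaping the already-escaped text again yields "&amp;lt;", which the
// browser renders literally as "&lt;" - exactly what the reporter saw.
public class DoubleEscapeSketch {
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
    public static void main(String[] args) {
        String resource = "<memory:65536, vCores:0>";
        String once = escapeHtml(resource);
        String twice = escapeHtml(once);
        System.out.println(once);   // &lt;memory:65536, vCores:0&gt;
        System.out.println(twice);  // &amp;lt;memory:65536, vCores:0&amp;gt;
    }
}
```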
[jira] [Resolved] (YARN-3083) Resource format isn't correct in Fair Scheduler web page
[ https://issues.apache.org/jira/browse/YARN-3083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe resolved YARN-3083. -- Resolution: Duplicate Resource format isn't correct in Fair Scheduler web page Key: YARN-3083 URL: https://issues.apache.org/jira/browse/YARN-3083 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.3.0 Reporter: Xia Hu Labels: UI, fairscheduler Attachments: fairscheduler-ui-format.patch In my fair scheduler web page, the resources shown for each queue look like this: &lt;memory:65536, vCores:0 &gt;, so obviously &lt; should be shown as <, but it isn't. After reading the code, I suppose it's because the method StringEscapeUtils.escapeHtml is called twice, but I only found one place where it is called. Anyway, I modified the code, and this problem seems to be solved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3078) LogCLIHelpers lacks of a blank space before string 'does not exist'
[ https://issues.apache.org/jira/browse/YARN-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287473#comment-14287473 ] Hudson commented on YARN-3078: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2013 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2013/]) YARN-3078. LogCLIHelpers lacks of a blank space before string 'does not exist'. Contributed by Sam Liu. (ozawa: rev 5712c9f96a2cf4ff63d36906ab3876444c0cddec) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/LogCLIHelpers.java * hadoop-yarn-project/CHANGES.txt LogCLIHelpers lacks of a blank space before string 'does not exist' --- Key: YARN-3078 URL: https://issues.apache.org/jira/browse/YARN-3078 Project: Hadoop YARN Issue Type: Bug Components: log-aggregation Affects Versions: 2.6.0 Reporter: sam liu Priority: Minor Fix For: 2.7.0 Attachments: YARN-3078.001.patch, YARN-3078.002.patch LogCLIHelpers lacks a blank space before the string 'does not exist', which produces an incorrect return message. For example, I ran the command 'yarn logs -applicationId application_1421742816585_0003', and the return message includes 'logs/application_1421742816585_0003does not exist'. Obviously this is incorrect; the correct return message should be 'logs/application_1421742816585_0003 does not exist' -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3082) Non thread safe access to systemCredentials in NodeHeartbeatResponse processing
[ https://issues.apache.org/jira/browse/YARN-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287095#comment-14287095 ] Tsuyoshi OZAWA commented on YARN-3082: -- Looks good to me overall. Minor nits:
{code}
+    }
+    finally {
+      threadPool.shutdownNow();
{code}
This finally clause should be moved onto the same line as the closing brace, like this:
{code}
    } finally {
      threadPool.shutdownNow();
{code}
Non thread safe access to systemCredentials in NodeHeartbeatResponse processing --- Key: YARN-3082 URL: https://issues.apache.org/jira/browse/YARN-3082 Project: Hadoop YARN Issue Type: Bug Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-3082.001.patch When you use system credentials via the feature added in YARN-2704, the proto conversion code throws an exception when converting the ByteBuffer -- This message was sent by Atlassian JIRA (v6.3.4#6332)
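The underlying hazard is that a ByteBuffer carries mutable position/limit state, so handing the same instance to several heartbeat responses lets one reader's get() exhaust the buffer for the others. A minimal sketch of the failure mode and the usual remedy, duplicate(), which gives each reader an independent cursor over the shared contents (illustrative only; this shows the ByteBuffer pitfall, not the actual patch, and duplicate() alone is not synchronization - the buffer must still be safely published):

```java
import java.nio.ByteBuffer;

public class SharedBufferSketch {
    public static void main(String[] args) {
        ByteBuffer shared = ByteBuffer.wrap("credentials".getBytes());

        // Unsafe pattern: the first full read exhausts the buffer for everyone.
        shared.get(new byte[shared.remaining()]);
        System.out.println("remaining after first read: " + shared.remaining()); // 0

        // Safer pattern: each consumer duplicates before reading.
        shared.rewind();
        ByteBuffer readerA = shared.duplicate();
        ByteBuffer readerB = shared.duplicate();
        readerA.get(new byte[readerA.remaining()]);
        System.out.println("readerB still sees " + readerB.remaining() + " bytes"); // 11
    }
}
```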
[jira] [Commented] (YARN-3082) Non thread safe access to systemCredentials in NodeHeartbeatResponse processing
[ https://issues.apache.org/jira/browse/YARN-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287117#comment-14287117 ] Hadoop QA commented on YARN-3082: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12693790/YARN-3082.001.patch against trunk revision 786dbdf. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6385//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6385//console This message is automatically generated. Non thread safe access to systemCredentials in NodeHeartbeatResponse processing --- Key: YARN-3082 URL: https://issues.apache.org/jira/browse/YARN-3082 Project: Hadoop YARN Issue Type: Bug Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-3082.001.patch When you use system credentials via the feature added in YARN-2704, the proto conversion code throws an exception when converting the ByteBuffer -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-1743) Decorate event transitions and the event-types with their behaviour
[ https://issues.apache.org/jira/browse/YARN-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated YARN-1743: - Attachment: YARN-1743-2.patch Decorate event transitions and the event-types with their behaviour --- Key: YARN-1743 URL: https://issues.apache.org/jira/browse/YARN-1743 Project: Hadoop YARN Issue Type: Bug Reporter: Vinod Kumar Vavilapalli Assignee: Jeff Zhang Labels: documentation Attachments: NodeManager.gv, NodeManager.pdf, YARN-1743-2.patch, YARN-1743.patch Helps to annotate the transitions with (start-state, end-state) pair and the events with (source, destination) pair. Not just readability, we may also use them to generate the event diagrams across components. Not a blocker for 0.23, but let's see. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1743) Decorate event transitions and the event-types with their behaviour
[ https://issues.apache.org/jira/browse/YARN-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287192#comment-14287192 ] Jeff Zhang commented on YARN-1743: -- [~leftnoteasy] Uploaded a new patch: * Changed the annotation type to Class * Added more javadoc to explain the usage of the 2 annotations * The patch only uses the annotation on ApplicationEventType; for the other events we can create follow-up JIRAs. Decorate event transitions and the event-types with their behaviour --- Key: YARN-1743 URL: https://issues.apache.org/jira/browse/YARN-1743 Project: Hadoop YARN Issue Type: Bug Reporter: Vinod Kumar Vavilapalli Assignee: Jeff Zhang Labels: documentation Attachments: NodeManager.gv, NodeManager.pdf, YARN-1743-2.patch, YARN-1743.patch Helps to annotate the transitions with (start-state, end-state) pair and the events with (source, destination) pair. Not just readability, we may also use them to generate the event diagrams across components. Not a blocker for 0.23, but let's see. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
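A hypothetical sketch of the kind of decoration discussed, with Class-typed annotation elements as the comment suggests (all names here are illustrative, not taken from the patch): an enum constant for an event type carries the Class of its source and destination components, which a tool can read reflectively to generate event diagrams.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class EventAnnotationSketch {
    // Annotation whose elements are Class-typed, as proposed in the comment.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface EventFlow {
        Class<?> source();
        Class<?> destination();
    }

    static class ResourceManager {}
    static class ApplicationMasterLauncher {}

    enum ApplicationEventType {
        @EventFlow(source = ResourceManager.class,
                   destination = ApplicationMasterLauncher.class)
        APP_ACCEPTED
    }

    // Reads the (source, destination) pair back reflectively, the way a
    // diagram generator would.
    static String describe() {
        try {
            EventFlow flow = ApplicationEventType.class
                .getField("APP_ACCEPTED").getAnnotation(EventFlow.class);
            return flow.source().getSimpleName() + " -> "
                + flow.destination().getSimpleName();
        } catch (NoSuchFieldException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe()); // ResourceManager -> ApplicationMasterLauncher
    }
}
```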
[jira] [Commented] (YARN-2896) Server side PB changes for Priority Label Manager and Admin CLI support
[ https://issues.apache.org/jira/browse/YARN-2896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287199#comment-14287199 ] Sunil G commented on YARN-2896: --- Thank you [~eepayne] and [~leftnoteasy] for sharing your thoughts. Implementation is easier with integers, which can be configured and operated on directly. On the other hand, labels are more readable and make it easier for a user to submit an app and visualize the same in the UI; the same goes for admins. But as you mentioned, this comes with slightly more complexity at the RM, which has to act as an interface to the scheduler. This idea of labels was part of the initial design. As [~vinodkv] also participated in the design, it would be good to hear his thoughts too. Also, I feel we still need a PriorityLabelManager: it can help us store the data from various options such as config files, admin commands, or even REST, and it is better to take such complexities out of RMAppManager. For HA cases as well, the PriorityLabelManager itself can help bring back the config loaded from memory/files. The scheduler then need not take any load, and the PriorityLabelManager can help in providing the priority for an app and its ACLs. Server side PB changes for Priority Label Manager and Admin CLI support --- Key: YARN-2896 URL: https://issues.apache.org/jira/browse/YARN-2896 Project: Hadoop YARN Issue Type: Sub-task Components: api, resourcemanager Reporter: Sunil G Assignee: Sunil G Attachments: 0001-YARN-2896.patch, 0002-YARN-2896.patch, 0003-YARN-2896.patch, 0004-YARN-2896.patch Common changes: * PB support changes required for Admin APIs * PB support for File System store (Priority Label Store) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3083) Resource format isn't correct in Fair Scheduler web page
[ https://issues.apache.org/jira/browse/YARN-3083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xia Hu updated YARN-3083: - Labels: UI fair (was: patch) Resource format isn't correct in Fair Scheduler web page Key: YARN-3083 URL: https://issues.apache.org/jira/browse/YARN-3083 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.3.0 Reporter: Xia Hu Labels: UI, fair In my fair scheduler web page, the resources shown for each queue look like this: &lt;memory:65536, vCores:0&gt;, so obviously &lt; should be shown as <, but it isn't. After reading the code, I suppose it's because the method StringEscapeUtils.escapeHtml is called twice, but I only found one place where it is called. Anyway, I modified the code, and this problem seems to be solved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3083) Resource format isn't correct in Fair Scheduler web page
[ https://issues.apache.org/jira/browse/YARN-3083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xia Hu updated YARN-3083: - Labels: UI fairscheduler (was: UI fair) Resource format isn't correct in Fair Scheduler web page Key: YARN-3083 URL: https://issues.apache.org/jira/browse/YARN-3083 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.3.0 Reporter: Xia Hu Labels: UI, fairscheduler Attachments: fairscheduler-ui-format.patch In my fair scheduler web page, the resources shown for each queue look like this: &lt;memory:65536, vCores:0&gt;, so obviously &lt; should be shown as <, but it isn't. After reading the code, I suppose it's because the method StringEscapeUtils.escapeHtml is called twice, but I only found one place where it is called. Anyway, I modified the code, and this problem seems to be solved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3083) Resource format isn't correct in Fair Scheduler web page
[ https://issues.apache.org/jira/browse/YARN-3083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xia Hu updated YARN-3083: - Description: In my fair scheduler web page, the resources shown for each queue look like this: &lt;memory:65536, vCores:0&gt;, so obviously &lt; should be shown as <, but it isn't. After reading the code, I suppose it's because the method StringEscapeUtils.escapeHtml is called twice, but I only found one place where it is called. Anyway, I modified the code, and this problem seems to be solved. was: In my fair scheduler web page, the resources shown for each queue look like this: &lt;memory:65536, vCores:0&gt;, so obviously &lt; should be shown as <, but it isn't. After reading the code, I suppose it's because the method StringEscapeUtils.escapeHtml is called twice, but I only found one place where it is called. Anyway, I modified the code, and this problem seems to be solved. Resource format isn't correct in Fair Scheduler web page Key: YARN-3083 URL: https://issues.apache.org/jira/browse/YARN-3083 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.3.0 Reporter: Xia Hu Labels: UI, fairscheduler Attachments: fairscheduler-ui-format.patch In my fair scheduler web page, the resources shown for each queue look like this: &lt;memory:65536, vCores:0&gt;, so obviously &lt; should be shown as <, but it isn't. After reading the code, I suppose it's because the method StringEscapeUtils.escapeHtml is called twice, but I only found one place where it is called. Anyway, I modified the code, and this problem seems to be solved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2800) Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature
[ https://issues.apache.org/jira/browse/YARN-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287174#comment-14287174 ] Tsuyoshi OZAWA commented on YARN-2800: -- I am referring to the following line in CommonNodeLabelsManager:
{code}
protected boolean nodeLabelsEnabled = false;
{code}
Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature Key: YARN-2800 URL: https://issues.apache.org/jira/browse/YARN-2800 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-2800-20141102-1.patch, YARN-2800-20141102-2.patch, YARN-2800-20141118-1.patch, YARN-2800-20141118-2.patch, YARN-2800-20141119-1.patch, YARN-2800-20141203-1.patch, YARN-2800-20141205-1.patch, YARN-2800-20141205-1.patch In the past, we had a MemoryNodeLabelsStore, mostly to let users try this feature without configuring where to store node labels on the file system. It seems convenient for users, but it actually causes a bad user experience: a user may add/remove labels and edit capacity-scheduler.xml, and after an RM restart the labels are gone (we store them in memory), so the RM cannot start if some queue uses labels that no longer exist in the cluster. As we discussed, we should have an explicit way to let users specify whether they want this feature or not. If node labels are disabled, any operation trying to modify/use node labels will throw an exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2800) Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature
[ https://issues.apache.org/jira/browse/YARN-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287173#comment-14287173 ] Tsuyoshi OZAWA commented on YARN-2800: -- [~leftnoteasy], the patch looks good to me overall, but it is outdated. Could you rebase the patch to include the changes against TestCapacitySchedulerNodeLabelUpdate? Additionally, I think nodeLabelsEnabled should be defined as a volatile variable since it's accessed from multiple threads. Could you update it? Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature Key: YARN-2800 URL: https://issues.apache.org/jira/browse/YARN-2800 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-2800-20141102-1.patch, YARN-2800-20141102-2.patch, YARN-2800-20141118-1.patch, YARN-2800-20141118-2.patch, YARN-2800-20141119-1.patch, YARN-2800-20141203-1.patch, YARN-2800-20141205-1.patch, YARN-2800-20141205-1.patch In the past, we had a MemoryNodeLabelsStore, mostly to let users try this feature without configuring where to store node labels on the file system. It seems convenient for users, but it actually causes a bad user experience: a user may add/remove labels and edit capacity-scheduler.xml, and after an RM restart the labels are gone (we store them in memory), so the RM cannot start if some queue uses labels that no longer exist in the cluster. As we discussed, we should have an explicit way to let users specify whether they want this feature or not. If node labels are disabled, any operation trying to modify/use node labels will throw an exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
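The volatile suggestion boils down to visibility: a plain boolean written by one thread (e.g. during service init) is not guaranteed to be seen by other threads without synchronization, while a volatile field is. A minimal sketch (field and method names are illustrative, not the patch's code):

```java
public class VolatileFlagSketch {
    // volatile guarantees that a write by one thread is visible to all
    // subsequent reads by other threads, without locking.
    private volatile boolean nodeLabelsEnabled = false;

    void enable() { nodeLabelsEnabled = true; }

    // Any operation touching node labels checks the flag first and throws
    // when the feature is disabled, mirroring the behavior described above.
    void checkOperation() {
        if (!nodeLabelsEnabled) {
            throw new IllegalStateException("Node labels are not enabled");
        }
    }

    public static void main(String[] args) throws InterruptedException {
        VolatileFlagSketch mgr = new VolatileFlagSketch();
        Thread writer = new Thread(mgr::enable);
        writer.start();
        writer.join();            // join() also establishes happens-before
        mgr.checkOperation();     // does not throw: the write is visible
        System.out.println("node labels enabled");
    }
}
```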
[jira] [Updated] (YARN-3083) Resource format isn't correct in Fair Scheduler web page
[ https://issues.apache.org/jira/browse/YARN-3083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xia Hu updated YARN-3083: - Description: In my fair scheduler web page, the resources shown for each queue look like this: &lt;memory:65536, vCores:0&gt; so obviously &lt; should be shown as <, but it isn't. After reading the code, I suppose it's because the method StringEscapeUtils.escapeHtml is called twice, although I only found one call site. Anyway, I modified the code and this problem seems to be solved. was: In my fair scheduler web page, the resources shown for each queue look like this: &lt;memory:65536, vCores:0&gt; so obviously &lt; should be shown as <, but it isn't. After reading the code, I suppose it's because the method StringEscapeUtils.escapeHtml is called twice, although I only found one call site. Anyway, I modified the code and this problem seems to be solved. Resource format isn't correct in Fair Scheduler web page Key: YARN-3083 URL: https://issues.apache.org/jira/browse/YARN-3083 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.3.0 Reporter: Xia Hu Labels: UI, fairscheduler Attachments: fairscheduler-ui-format.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
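The double-escaping Xia Hu describes would turn `<` into `&amp;lt;`, which a browser then renders literally as `&lt;`. A minimal demonstration of that mechanism, using a hand-rolled stand-in for StringEscapeUtils.escapeHtml (the helper below is illustrative, not the Commons Lang implementation):

```java
// Demonstrates how escaping HTML twice produces the literal
// "&lt;memory:...&gt;" text seen on the Fair Scheduler page.
// escapeHtml here is a minimal stand-in for StringEscapeUtils.escapeHtml.
public class EscapeDemo {
    static String escapeHtml(String s) {
        // '&' must be escaped first, or the entities themselves get mangled.
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        String resource = "<memory:65536, vCores:0>";
        String once = escapeHtml(resource);
        String twice = escapeHtml(once);
        System.out.println(once);  // &lt;memory:65536, vCores:0&gt;  (browser shows "<memory:65536, vCores:0>")
        System.out.println(twice); // &amp;lt;memory:65536, vCores:0&amp;gt;  (browser shows "&lt;memory:65536, vCores:0&gt;")
    }
}
```

This is why the reporter sees the escaped form on the page: the second escape converts the entity's own `&` into `&amp;`, so the browser stops interpreting it.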
[jira] [Updated] (YARN-2800) Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature
[ https://issues.apache.org/jira/browse/YARN-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi OZAWA updated YARN-2800: - Attachment: YARN-2800-20141205-1.patch Reattaching the patch by Wangda to kick Jenkins. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3083) Resource format isn't correct in Fair Scheduler web page
Xia Hu created YARN-3083: Summary: Resource format isn't correct in Fair Scheduler web page Key: YARN-3083 URL: https://issues.apache.org/jira/browse/YARN-3083 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.3.0 Reporter: Xia Hu In my fair scheduler web page, the resources shown for each queue look like this: &lt;memory:65536, vCores:0&gt; so obviously &lt; should be shown as <, but it isn't. After reading the code, I suppose it's because the method StringEscapeUtils.escapeHtml is called twice, although I only found one call site. Anyway, I modified the code and this problem seems to be solved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3083) Resource format isn't correct in Fair Scheduler web page
[ https://issues.apache.org/jira/browse/YARN-3083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xia Hu updated YARN-3083: - Attachment: fairscheduler-ui-format.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2800) Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature
[ https://issues.apache.org/jira/browse/YARN-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14287176#comment-14287176 ] Hadoop QA commented on YARN-2800: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12693838/YARN-2800-20141205-1.patch against trunk revision 786dbdf. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 11 new or modified test files. {color:red}-1 javac{color}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6386//console This message is automatically generated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3083) Resource format isn't correct in Fair Scheduler web page
[ https://issues.apache.org/jira/browse/YARN-3083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14287181#comment-14287181 ] Hadoop QA commented on YARN-3083: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12693842/fairscheduler-ui-format.patch against trunk revision 786dbdf. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6387//console This message is automatically generated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3024) LocalizerRunner should give DIE action when all resources are localized
[ https://issues.apache.org/jira/browse/YARN-3024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengbing Liu updated YARN-3024: Attachment: YARN-3024.04.patch Modified the test to verify the changed logic. LocalizerRunner should give DIE action when all resources are localized --- Key: YARN-3024 URL: https://issues.apache.org/jira/browse/YARN-3024 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.6.0 Reporter: Chengbing Liu Assignee: Chengbing Liu Attachments: YARN-3024.01.patch, YARN-3024.02.patch, YARN-3024.03.patch, YARN-3024.04.patch We have observed that {{LocalizerRunner}} always gives a LIVE action at the end of the localization process. The problem is that {{findNextResource()}} can return null even when {{pending}} was not empty prior to the call. The method removes localized resources from {{pending}}, so we should check the return value and give a DIE action when it returns null. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
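The fix described in YARN-3024 can be sketched as follows. This is a hypothetical simplification, not the real LocalizerRunner: the class, method names, and the queue-based pending set are illustrative, and poll() stands in for the real findNextResource(), which removes already-localized resources from {{pending}} and can therefore return null even when {{pending}} was non-empty before the call.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative sketch of the YARN-3024 fix: when no next resource can be
// found, answer the localizer heartbeat with DIE instead of LIVE.
public class HeartbeatDemo {
    enum LocalizerAction { LIVE, DIE }

    // Stand-in for findNextResource(): returns null when nothing is left.
    static String findNextResource(Queue<String> pending) {
        return pending.poll();
    }

    static LocalizerAction processHeartbeat(Queue<String> pending) {
        String next = findNextResource(pending);
        // The fix: check the return value rather than only whether
        // pending was non-empty before the call.
        return (next == null) ? LocalizerAction.DIE : LocalizerAction.LIVE;
    }

    public static void main(String[] args) {
        Queue<String> pending = new ArrayDeque<>();
        pending.add("hdfs:///app/job.jar");
        System.out.println(processHeartbeat(pending)); // LIVE: a resource remained
        System.out.println(processHeartbeat(pending)); // DIE: nothing left to localize
    }
}
```

The buggy behavior corresponds to returning LIVE whenever the pre-call pending set was non-empty, which keeps the localizer alive with nothing to do.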
[jira] [Moved] (YARN-3084) YARN REST API 2.6 - can't submit simple job in hortonworks-allways job failes to run
[ https://issues.apache.org/jira/browse/YARN-3084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran moved HADOOP-11504 to YARN-3084: --- Tags: (was: HADOOP MAPREDUCE YARN REST API SUBMIT) Component/s: (was: documentation) documentation Target Version/s: (was: 2.6.0) Affects Version/s: (was: 2.6.0) 2.6.0 Key: YARN-3084 (was: HADOOP-11504) Project: Hadoop YARN (was: Hadoop Common) YARN REST API 2.6 - can't submit simple job in hortonworks-allways job failes to run Key: YARN-3084 URL: https://issues.apache.org/jira/browse/YARN-3084 Project: Hadoop YARN Issue Type: Bug Components: documentation Affects Versions: 2.6.0 Environment: Using Eclipse on Windows 7 (client) to run the MapReduce job on the host of Hortonworks HDP 2.2 (Hortonworks is on VMware version 6.0.2 build-1744117) Reporter: Michael Br Priority: Minor Hello,
1. I want to run the simple MapReduce job example (with the REST API 2.6 for YARN applications) and calculate PI… for now it doesn't work. When I use the command in the Hortonworks terminal it works: "hadoop jar /usr/hdp/2.2.0.0-2041/hadoop-mapreduce/hadoop-mapreduce-examples-2.6.0.2.2.0.0-2041.jar pi 10 10". But I want to submit the job with the REST API and not in the terminal as a command line.
[http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Applications_APISubmit_Application]
2. I do succeed with other REST API requests: get state, get new application id and even kill (change state), but when I try to submit my example, the response is:
The Response Header:
Key : null ,Value : [HTTP/1.1 202 Accepted]
Key : Date ,Value : [Thu, 22 Jan 2015 07:47:24 GMT, Thu, 22 Jan 2015 07:47:24 GMT]
Key : Content-Length ,Value : [0]
Key : Expires ,Value : [Thu, 22 Jan 2015 07:47:24 GMT, Thu, 22 Jan 2015 07:47:24 GMT]
Key : Location ,Value : [http://[my port]:8088/ws/v1/cluster/apps/application_1421661392788_0038]
Key : Content-Type ,Value : [application/json]
Key : Server ,Value : [Jetty(6.1.26.hwx)]
Key : Pragma ,Value : [no-cache, no-cache]
Key : Cache-Control ,Value : [no-cache]
The Response Body: Null (No Response)
3. I need help with filling in the HTTP request body. I am doing a POST HTTP request and I know that I am doing it right (in Java).
4. I think the problem is in the request body.
5. I used this answer to help me build my MapReduce example XML but it does not work: [http://hadoop-forum.org/forum/general-hadoop-discussion/miscellaneous/2136-how-can-i-run-mapreduce-job-by-rest-api].
6. What am I missing? (the description is not clear to me in the submit section of the REST API 2.6)
7. Does someone have an XML example for a simple MR job?
8. Thanks!
Here is the XML file I am using for the request body:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<application-submission-context>
  <application-id>application_1421661392788_0038</application-id>
  <application-name>test_21_1</application-name>
  <queue>default</queue>
  <priority>3</priority>
  <am-container-spec>
    <environment>
      <entry>
        <key>CLASSPATH</key>
        <value>/usr/hdp/2.2.0.0-2041/hadoop/conf<CPS>/usr/hdp/2.2.0.0-2041/hadoop/lib/*<CPS>/usr/hdp/2.2.0.0-2041/hadoop/.//*<CPS>/usr/hdp/2.2.0.0-2041/hadoop-hdfs/./<CPS>/usr/hdp/2.2.0.0-2041/hadoop-hdfs/lib/*<CPS>/usr/hdp/2.2.0.0-2041/hadoop-hdfs/.//*<CPS>/usr/hdp/2.2.0.0-2041/hadoop-yarn/lib/*<CPS>/usr/hdp/2.2.0.0-2041/hadoop-yarn/.//*<CPS>/usr/hdp/2.2.0.0-2041/hadoop-mapreduce/lib/*<CPS>/usr/hdp/2.2.0.0-2041/hadoop-mapreduce/.//*<CPS><CPS>/usr/share/java/mysql-connector-java-5.1.17.jar<CPS>/usr/share/java/mysql-connector-java.jar<CPS>/usr/hdp/current/hadoop-mapreduce-client/*<CPS>/usr/hdp/current/tez-client/*<CPS>/usr/hdp/current/tez-client/lib/*<CPS>/etc/tez/conf/<CPS>/usr/hdp/2.2.0.0-2041/tez/*<CPS>/usr/hdp/2.2.0.0-2041/tez/lib/*<CPS>/etc/tez/conf</value>
      </entry>
    </environment>
    <commands>
      <command>hadoop jar /usr/hdp/2.2.0.0-2041/hadoop-mapreduce/hadoop-mapreduce-examples-2.6.0.2.2.0.0-2041.jar pi 10 10</command>
    </commands>
  </am-container-spec>
  <unmanaged-AM>false</unmanaged-AM>
  <max-app-attempts>2</max-app-attempts>
  <resource>
    <memory>1024</memory>
    <vCores>1</vCores>
  </resource>
[jira] [Updated] (YARN-3084) YARN REST API 2.6 - can't submit simple job in hortonworks-allways job failes to run
[ https://issues.apache.org/jira/browse/YARN-3084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated YARN-3084: - Component/s: (was: documentation) webapp resourcemanager -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3084) YARN REST API 2.6 - can't submit simple job in hortonworks-allways job failes to run
[ https://issues.apache.org/jira/browse/YARN-3084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14287214#comment-14287214 ] Steve Loughran commented on YARN-3084: -- 202 Accepted looks like the RM accepted it. # does it appear in the queue of job submissions? # what does the RM log say? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3024) LocalizerRunner should give DIE action when all resources are localized
[ https://issues.apache.org/jira/browse/YARN-3024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14287220#comment-14287220 ] Hadoop QA commented on YARN-3024: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12693845/YARN-3024.04.patch against trunk revision 786dbdf. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6388//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6388//console This message is automatically generated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3078) LogCLIHelpers lacks of a blank space before string 'does not exist'
[ https://issues.apache.org/jira/browse/YARN-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14287237#comment-14287237 ] Hudson commented on YARN-3078: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #81 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/81/]) YARN-3078. LogCLIHelpers lacks of a blank space before string 'does not exist'. Contributed by Sam Liu. (ozawa: rev 5712c9f96a2cf4ff63d36906ab3876444c0cddec) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/LogCLIHelpers.java LogCLIHelpers lacks of a blank space before string 'does not exist' --- Key: YARN-3078 URL: https://issues.apache.org/jira/browse/YARN-3078 Project: Hadoop YARN Issue Type: Bug Components: log-aggregation Affects Versions: 2.6.0 Reporter: sam liu Priority: Minor Fix For: 2.7.0 Attachments: YARN-3078.001.patch, YARN-3078.002.patch LogCLIHelpers is missing a blank space before the string 'does not exist', which produces an incorrect message. For example, running 'yarn logs -applicationId application_1421742816585_0003' returns a message containing 'logs/application_1421742816585_0003does not exist'. The correct message would be 'logs/application_1421742816585_0003 does not exist'. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
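The YARN-3078 bug class is simple string concatenation without a separator. A minimal sketch (the method name below is illustrative, not the actual LogCLIHelpers code):

```java
// Illustrates the YARN-3078 fix: joining a path and a message literal
// requires a leading blank space in the literal, otherwise the two run
// together. notExistMessage is a hypothetical stand-in for the real code.
public class MessageDemo {
    static String notExistMessage(String remoteAppLogDir) {
        // Buggy form was: remoteAppLogDir + "does not exist"
        // which printed "logs/application_...does not exist".
        return remoteAppLogDir + " does not exist";
    }

    public static void main(String[] args) {
        System.out.println(notExistMessage("logs/application_1421742816585_0003"));
        // logs/application_1421742816585_0003 does not exist
    }
}
```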
[jira] [Commented] (YARN-3078) LogCLIHelpers lacks of a blank space before string 'does not exist'
[ https://issues.apache.org/jira/browse/YARN-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14287254#comment-14287254 ] Hudson commented on YARN-3078: -- FAILURE: Integrated in Hadoop-Yarn-trunk #815 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/815/]) YARN-3078. LogCLIHelpers lacks of a blank space before string 'does not exist'. Contributed by Sam Liu. (ozawa: rev 5712c9f96a2cf4ff63d36906ab3876444c0cddec) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/LogCLIHelpers.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14287931#comment-14287931 ] Chen He commented on YARN-2466: --- This is a good point. As far as isolation is concerned, I think we should not need to inform the RM about this, since it is just one type of ContainerExecutor; the RM should treat it as a general container like the others (default, lxc, etc). But as [~eronwright] mentioned, we should find a way to avoid being killed because of timeout. Umbrella issue for Yarn launched Docker Containers -- Key: YARN-2466 URL: https://issues.apache.org/jira/browse/YARN-2466 Project: Hadoop YARN Issue Type: New Feature Affects Versions: 2.4.1 Reporter: Abin Shahab Assignee: Abin Shahab Docker (https://www.docker.io/) is an increasingly popular container technology. In the context of YARN, Docker support will provide an elegant way for applications to package their software into a Docker container (an entire Linux file system, including custom versions of perl, python, etc.) and use it as a blueprint to launch all their YARN containers with the requisite software environment. This provides both consistency (all YARN containers will have the same software environment) and isolation (no interference with whatever is installed on the physical machine). In addition to the software isolation mentioned above, Docker containers provide resource, network, and user-namespace isolation. Docker provides resource isolation through cgroups, similar to LinuxContainerExecutor; this prevents one job from taking other jobs' resources (memory and CPU) on the same Hadoop cluster. User-namespace isolation will ensure that root in the container is mapped to an unprivileged user on the host. This is currently being added to Docker. Network isolation will ensure that one user's network traffic is completely isolated from another user's network traffic.
Last but not least, the interaction of Docker and Kerberos will have to be worked out. These Docker containers must work in a secure Hadoop environment. Additional details are here: https://wiki.apache.org/hadoop/dineshs/IsolatingYarnAppsInDockerContainers -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14287941#comment-14287941 ] Chen He commented on YARN-2466: --- Hi [~ashahab], if you don't mind, I will create a sub-task JIRA to track this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3087) the REST server (web server) for per-node aggregator does not work if it runs inside node manager
Sangjin Lee created YARN-3087: - Summary: the REST server (web server) for per-node aggregator does not work if it runs inside node manager Key: YARN-3087 URL: https://issues.apache.org/jira/browse/YARN-3087 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee This is related to YARN-3030. YARN-3030 sets up a per-node timeline aggregator and the associated REST server. It runs fine as a standalone process, but does not work if it runs inside the node manager due to possible collisions of servlet mapping. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2194) Add Cgroup support for RedHat 7
[ https://issues.apache.org/jira/browse/YARN-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14287645#comment-14287645 ] Wei Yan commented on YARN-2194: --- Sure, will update a new patch by combining [~bcwalrus]'s comments.
[jira] [Commented] (YARN-3078) LogCLIHelpers lacks of a blank space before string 'does not exist'
[ https://issues.apache.org/jira/browse/YARN-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14287601#comment-14287601 ] Hudson commented on YARN-3078: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2032 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2032/]) YARN-3078. LogCLIHelpers lacks of a blank space before string 'does not exist'. Contributed by Sam Liu. (ozawa: rev 5712c9f96a2cf4ff63d36906ab3876444c0cddec) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/LogCLIHelpers.java * hadoop-yarn-project/CHANGES.txt LogCLIHelpers lacks of a blank space before string 'does not exist' --- Key: YARN-3078 URL: https://issues.apache.org/jira/browse/YARN-3078 Project: Hadoop YARN Issue Type: Bug Components: log-aggregation Affects Versions: 2.6.0 Reporter: sam liu Priority: Minor Fix For: 2.7.0 Attachments: YARN-3078.001.patch, YARN-3078.002.patch LogCLIHelpers lacks a blank space before the string 'does not exist', which produces an incorrect message. For example, running 'yarn logs -applicationId application_1421742816585_0003' returns a message including 'logs/application_1421742816585_0003does not exist'. The correct message should be 'logs/application_1421742816585_0003 does not exist'.
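The fix amounts to a single missing space in the concatenation. A minimal sketch of the before/after behavior (the class and method names here are hypothetical illustrations, not the actual LogCLIHelpers code):

```java
public class LogPathMessage {
    // Concatenating the path directly with "does not exist" drops the space,
    // yielding "…0003does not exist".
    static String badMessage(String remoteAppLogDir) {
        return remoteAppLogDir + "does not exist";
    }

    // Adding the leading space produces the intended "…0003 does not exist".
    static String fixedMessage(String remoteAppLogDir) {
        return remoteAppLogDir + " does not exist";
    }

    public static void main(String[] args) {
        String dir = "logs/application_1421742816585_0003";
        System.out.println(badMessage(dir));
        System.out.println(fixedMessage(dir));
    }
}
```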
[jira] [Updated] (YARN-3087) the REST server (web server) for per-node aggregator does not work if it runs inside node manager
[ https://issues.apache.org/jira/browse/YARN-3087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated YARN-3087: -- Description: This is related to YARN-3030. YARN-3030 sets up a per-node timeline aggregator and the associated REST server. It runs fine as a standalone process, but does not work if it runs inside the node manager due to possible collisions of servlet mapping. Exception:
{noformat}
org.apache.hadoop.yarn.webapp.WebAppException: /v2/timeline: controller for v2 not found
    at org.apache.hadoop.yarn.webapp.Router.resolveDefault(Router.java:232)
    at org.apache.hadoop.yarn.webapp.Router.resolve(Router.java:140)
    at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:134)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263)
    at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178)
    at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
    at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62)
    at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900)
    at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834)
    at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795)
    ...
{noformat}
was: This is related to YARN-3030. YARN-3030 sets up a per-node timeline aggregator and the associated REST server. It runs fine as a standalone process, but does not work if it runs inside the node manager due to possible collisions of servlet mapping.
[jira] [Commented] (YARN-3079) Scheduler should also update maximumAllocation when updateNodeResource.
[ https://issues.apache.org/jira/browse/YARN-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288324#comment-14288324 ] Wangda Tan commented on YARN-3079: -- Hi [~zxu], Thanks for taking up this task. I just reviewed your patch; some comments: 1) Suggest changing the signature of updateMaximumAllocation(SchedulerNode, bool) to updateMaximumAllocation(Resource nodeResource, bool), since we only use nodeResource here. 2) Changing the resource for an NM is equivalent to {{updateMaximumAllocation(oldNodeResource, false)}} plus {{updateMaximumAllocation(newNodeResource, true)}}. We can avoid some duplicated logic. 3) Suggest renaming updateMaximumAllocation(void) to refreshMaximumAllocation() or another name that reflects the behavior: scan all cluster nodes and get the maximum allocation. 4) Not related to this fix -- I found only max allocation is protected by a R/W lock, which seems incorrect to me; I think we should address it in a separate JIRA. Will file a ticket later. Wangda Scheduler should also update maximumAllocation when updateNodeResource. --- Key: YARN-3079 URL: https://issues.apache.org/jira/browse/YARN-3079 Project: Hadoop YARN Issue Type: Bug Reporter: zhihai xu Assignee: zhihai xu Attachments: YARN-3079.000.patch, YARN-3079.001.patch Scheduler should also update maximumAllocation when updateNodeResource. Otherwise, even if the node resource is changed by AdminService#updateNodeResource, maximumAllocation won't be changed. Also, an RMNodeReconnectEvent called from ResourceTrackerService#registerNodeManager will trigger AbstractYarnScheduler#updateNodeResource as well.
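The review suggestions above (track the node resource rather than the SchedulerNode, express a resource change as remove-old + add-new, and rescan the cluster in a rename-to-refreshMaximumAllocation method) can be sketched as follows. This is a hedged illustration with hypothetical class and field names, with Resource reduced to a memory-only int; it is not the actual AbstractYarnScheduler code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class MaxAllocationTracker {
    private final Map<String, Integer> nodeMemory = new ConcurrentHashMap<>();
    private volatile int maximumAllocationMB = 0;

    // Suggestion 1: take the node resource, not the SchedulerNode.
    void updateMaximumAllocation(int nodeResourceMB, boolean added) {
        if (added) {
            if (nodeResourceMB > maximumAllocationMB) {
                maximumAllocationMB = nodeResourceMB;
            }
        } else if (nodeResourceMB >= maximumAllocationMB) {
            // Removing the node that held the maximum forces a full rescan.
            refreshMaximumAllocation();
        }
    }

    // Suggestion 2: a resource change is remove-old followed by add-new.
    void changeNodeResource(String nodeId, int newResourceMB) {
        Integer old = nodeMemory.put(nodeId, newResourceMB);
        if (old != null) {
            updateMaximumAllocation(old, false);
        }
        updateMaximumAllocation(newResourceMB, true);
    }

    void addNode(String nodeId, int resourceMB) {
        nodeMemory.put(nodeId, resourceMB);
        updateMaximumAllocation(resourceMB, true);
    }

    // Suggestion 3: scan all cluster nodes and recompute the maximum.
    void refreshMaximumAllocation() {
        maximumAllocationMB =
            nodeMemory.values().stream().mapToInt(Integer::intValue).max().orElse(0);
    }

    int getMaximumAllocationMB() {
        return maximumAllocationMB;
    }

    public static void main(String[] args) {
        MaxAllocationTracker t = new MaxAllocationTracker();
        t.addNode("n1", 8192);
        t.addNode("n2", 4096);
        System.out.println(t.getMaximumAllocationMB()); // 8192
        t.changeNodeResource("n1", 2048);
        System.out.println(t.getMaximumAllocationMB()); // 4096
    }
}
```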
[jira] [Commented] (YARN-3028) Better syntax for replace label CLI
[ https://issues.apache.org/jira/browse/YARN-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288273#comment-14288273 ] Wangda Tan commented on YARN-3028: -- [~rohithsharma], I see, just re-reviewed, my bad. Yes, #1/#2 are both addressed. A nit for the test: I suggest merging {{testReplaceLabelsOnNodeWithPort}} into {{testReplaceLabelsOnNode}}; there's no need to split them. And a case for '=' without a port should be added. Nits for the help message: 1) port should be optional; you can make a small change here: \[node1:port=label1,label2 node2:port=label1,label2\] should be \[node1\[:port\]=label1...\] 2) {{printHelp}} should be updated as well. Also, I still suggest adding a small comment before {code} String[] splits = nodeToLabels.split("="); {code} to explicitly indicate that we still support ',' for compatibility. Thanks, Wangda Better syntax for replace label CLI --- Key: YARN-3028 URL: https://issues.apache.org/jira/browse/YARN-3028 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Jian He Assignee: Rohith Attachments: 0001-YARN-3028.patch The command to replace labels is currently: {code} yarn rmadmin -replaceLabelsOnNode [node1:port,label1,label2 node2:port,label1,label2] {code} Instead of {code} node1:port,label1,label2 {code} I think it's better to say {code} node1:port=label1,label2 {code}
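A hedged sketch of parsing the proposed {{node1[:port]=label1,label2}} syntax discussed above; the class and method names are hypothetical illustrations, not the actual RMAdminCLI code:

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

class NodeLabelArg {
    // Parse "node[:port]=label1,label2" into (node[:port], labels).
    static Map.Entry<String, String[]> parse(String arg) {
        // Split on the first '=' only; a bare node with no labels stays valid.
        String[] splits = arg.split("=", 2);
        String node = splits[0];                       // "node1" or "node1:8041"
        String[] labels = (splits.length < 2 || splits[1].isEmpty())
                ? new String[0]
                : splits[1].split(",");
        return new SimpleEntry<>(node, labels);
    }

    public static void main(String[] args) {
        Map.Entry<String, String[]> e = parse("node1:8041=label1,label2");
        System.out.println(e.getKey());                     // node1:8041
        System.out.println(String.join("|", e.getValue())); // label1|label2
    }
}
```

Note the port is optional, matching the review comment: `parse("node1=label1")` yields node "node1" with one label.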
[jira] [Commented] (YARN-3078) LogCLIHelpers lacks of a blank space before string 'does not exist'
[ https://issues.apache.org/jira/browse/YARN-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14287568#comment-14287568 ] Hudson commented on YARN-3078: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #82 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/82/]) YARN-3078. LogCLIHelpers lacks of a blank space before string 'does not exist'. Contributed by Sam Liu. (ozawa: rev 5712c9f96a2cf4ff63d36906ab3876444c0cddec) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/LogCLIHelpers.java * hadoop-yarn-project/CHANGES.txt
[jira] [Created] (YARN-3085) Application summary should include the application type
Jason Lowe created YARN-3085: Summary: Application summary should include the application type Key: YARN-3085 URL: https://issues.apache.org/jira/browse/YARN-3085 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Jason Lowe Adding the application type to the RM application summary log makes it easier to audit the number of applications from various app frameworks that are running on the cluster.
[jira] [Assigned] (YARN-3085) Application summary should include the application type
[ https://issues.apache.org/jira/browse/YARN-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith reassigned YARN-3085: Assignee: Rohith
[jira] [Updated] (YARN-3091) [Umbrella] Improve locks of RM scheduler
[ https://issues.apache.org/jira/browse/YARN-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-3091: - Target Version/s: 2.7.0 [Umbrella] Improve locks of RM scheduler Key: YARN-3091 URL: https://issues.apache.org/jira/browse/YARN-3091 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, fairscheduler, resourcemanager, scheduler Reporter: Wangda Tan In the existing YARN RM scheduler there are some issues with the use of locks. For example: - Many unnecessary synchronized locks; we have recently seen several cases where overly frequent access to the scheduler makes it hang. This could be addressed by using read/write locks. Components include the scheduler, CS queues, and apps. - Some fields are not properly locked (like clusterResource). We can address these together in this ticket. (More details in the comments below.)
[jira] [Commented] (YARN-3091) [Umbrella] Improve locks of RM scheduler
[ https://issues.apache.org/jira/browse/YARN-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288446#comment-14288446 ] Wangda Tan commented on YARN-3091: -- I think it could be part of a separate fine-grained-lock enhancement for FairScheduler (if there are other similar fine-grained changes needed). To keep each patch easier to review, the {{AbstractYarnScheduler - CapacityScheduler - FairScheduler}} task could address only the general synchronized-lock to r/w-lock conversion.
[jira] [Updated] (YARN-1393) Add how-to-use instruction in README for Yarn Scheduler Load Simulator
[ https://issues.apache.org/jira/browse/YARN-1393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1393: --- Attachment: YARN-1393-1.patch Thanks for the docs, Wei. Made minor changes - removed potentially redundant information. If you think the patch is good, I'll go ahead and commit it. Thanks. Add how-to-use instruction in README for Yarn Scheduler Load Simulator -- Key: YARN-1393 URL: https://issues.apache.org/jira/browse/YARN-1393 Project: Hadoop YARN Issue Type: Improvement Reporter: Wei Yan Assignee: Wei Yan Attachments: YARN-1393-1.patch, YARN-1393.patch The instructions are in the .pdf document and on the site page. The README needs to include simple instructions that users can quickly pick up.
[jira] [Updated] (YARN-1393) Add how-to-use instruction in README for Yarn Scheduler Load Simulator
[ https://issues.apache.org/jira/browse/YARN-1393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated YARN-1393: -- Attachment: YARN-1393-2.patch Thanks, [~kasha], I uploaded a new patch clearing up the file path.
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288608#comment-14288608 ] Leitao Guo commented on YARN-2466: -- Currently, if I want to use DCE in my cluster, all applications would have to run in DCE, which is not practical in our cluster. Could yarn.nodemanager.container-executor.class be made configurable per application? That way we could use DCE for some applications while others still use LCE.
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288615#comment-14288615 ] Beckham007 commented on YARN-2466: -- We use YARN-2718 for this.
[jira] [Commented] (YARN-3091) [Umbrella] Improve locks of RM scheduler
[ https://issues.apache.org/jira/browse/YARN-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288633#comment-14288633 ] Li Lu commented on YARN-3091: - Maybe we want to tweak the wording/organization of this JIRA a little bit? In the description of this JIRA, two major points are raised: bq. Many unnecessary synchronized locks, we have seen several cases recently that too frequent access of scheduler makes scheduler hang. Which could be addressed by using read/write lock. Components include scheduler, CS queues, apps I agree that a readers-writer lock is a viable approach for many synchronization performance issues, but other synchronization mechanisms (such as concurrent data structures) may also be options. bq. Some fields not properly locked (Like clusterResource) Improperly synchronized accesses may cause data races, and are generally considered bugs in Java programs (even though the Java memory model provides some guarantees for racy programs). To me, it would be better if the second point were categorized as bug fixes, rather than improvements, for the RM scheduler code. Therefore, maybe we want to solve the problem in two steps: a) fixing improperly synchronized data accesses in the RM scheduler (correctness) and b) improving synchronization performance for the RM scheduler code (performance)? I'm not sure whether there should be two separate JIRAs to track this, or whether we can combine both in one giant JIRA.
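For reference, the synchronized-to-read/write-lock conversion discussed throughout this thread follows the standard {{ReentrantReadWriteLock}} pattern. A minimal sketch with hypothetical field and method names (not actual scheduler code): many readers can proceed concurrently, while writers still get exclusive access.

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ClusterResourceHolder {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private long clusterMemoryMB = 0;

    // Frequent readers (e.g. web UI, allocate calls) no longer serialize
    // behind a single monitor: many threads can hold the read lock at once.
    long getClusterMemoryMB() {
        lock.readLock().lock();
        try {
            return clusterMemoryMB;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Rare writers (node added/removed) take the exclusive write lock.
    void addNodeMemory(long mb) {
        lock.writeLock().lock();
        try {
            clusterMemoryMB += mb;
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        ClusterResourceHolder h = new ClusterResourceHolder();
        h.addNodeMemory(4096);
        h.addNodeMemory(8192);
        System.out.println(h.getClusterMemoryMB()); // 12288
    }
}
```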
[jira] [Commented] (YARN-914) Support graceful decommission of nodemanager
[ https://issues.apache.org/jira/browse/YARN-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288644#comment-14288644 ] Junping Du commented on YARN-914: - Sorry for replying late. These are all good points; a couple of comments: bq. Sounds like we need a new state for NM, called decommission_in_progress when NM is draining the containers. Agree. We need a dedicated state for the NM in this situation, and both AM and RM should be aware of it to handle it properly. bq. To clarify my early comment all its map output are fetched or until all the applications the node touches have completed, the question is when YARN can declare a node's state has been gracefully drained and thus the node gracefully decommissioned ( admins can shutdown the whole machine without any impact on jobs ). For MR, the state could be running tasks/containers or mapper outputs. Say we have a timeout of 30 minutes for decommission; if it takes 3 minutes to finish the mappers on the node and another 5 minutes for the job to finish, then YARN can declare the node gracefully decommissioned in 8 minutes instead of waiting for 30 minutes. The RM knows all applications on any given NM, so if all applications on a node have completed, the RM can mark the node decommissioned. The first step I was thinking of is to keep the NM running in a low-resource mode after graceful decommission - no running containers, no new containers spawned, no obvious resource consumption, etc. - just like putting these nodes into a maintenance mode. The timeout value there is used to kill unfinished containers to release resources. I'm not quite sure we have to terminate the NM after the timeout, but I would like to understand your use case here. bq. Yes, I meant long running services. If YARN just kills the containers upon decommission request, the impact could vary. Some services might not have states to drain. Or maybe the services can handle the state migration on their own without YARN's help.
For such services, maybe we can just use ResourceOption's timeout for that; set the timeout to 0 and the NM will just kill the containers. I believe most of these services already take care of losing nodes, since no node in a YARN cluster can be relied on to stay up. However, I am not sure whether they can handle state migration to a new node ahead of a predictable node loss, or whether being more or less stateless makes more sense here. If we have an example application that can easily migrate a node's state to another node, then we can discuss how to provide some rudimentary support here. bq. Given we don't plan to have applications checkpoint and migrate states, it doesn't seem to be necessary to have YARN notify applications upon decommission requests. Just to call it out: these notifications may still be necessary, so the AM won't add these nodes to its blacklist if containers get killed afterwards. Thoughts? bq. It might be useful to have a new state called decommissioned_timeout, so that admins know the node has been gracefully decommissioned or not. As in my comments above, we can see whether we have to terminate the NM. If not, I prefer to use a maintenance state and let the admin decide whether to fully decommission it later. Again, we should talk about your scenarios here. Support graceful decommission of nodemanager Key: YARN-914 URL: https://issues.apache.org/jira/browse/YARN-914 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.0.4-alpha Reporter: Luke Lu Assignee: Junping Du When NMs are decommissioned for non-fault reasons (capacity change, etc.), it's desirable to minimize the impact on running applications. Currently, if an NM is decommissioned, all running containers on the NM need to be rescheduled on other NMs. Furthermore, for finished map tasks, if their map outputs have not been fetched by the reducers of the job, these map tasks will need to be rerun as well. We propose to introduce a mechanism to optionally gracefully decommission a node manager.
[jira] [Commented] (YARN-3091) [Umbrella] Improve locks of RM scheduler
[ https://issues.apache.org/jira/browse/YARN-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288746#comment-14288746 ] Rohith commented on YARN-3091: -- bq. fixing improperly synchronized data accesses in RM scheduler (correctness) Currently the findbugs exclude XML masks these warnings, like IS2_INCONSISTENT_SYNC. I believe these exclude lists were reviewed with assumptions such as a class being expected to be thread-safe. Recently there was a discussion on this in the community [Discussion thread|http://mail-archives.apache.org/mod_mbox/hadoop-yarn-dev/201412.mbox/%3CCALwhT97BqK_zjQ=MCO_c=Y=7r9ewLN2Ab_qm=vqekvxgzrq...@mail.gmail.com%3E] For identifying the first level of problems, I think enabling these findbugs types would help.
[jira] [Commented] (YARN-3091) [Umbrella] Improve locks of RM scheduler
[ https://issues.apache.org/jira/browse/YARN-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288755#comment-14288755 ] Li Lu commented on YARN-3091: - I agree we should review the exclude list for potential synchronization problems. However, note that findbugs uses static analysis of Java source code, which may introduce both false positives and false negatives when detecting concurrency-related bugs. In the long term we may want to consider other tools to help detect improper synchronization (although a perfect solution would be hard). For the short term, I think [~leftnoteasy] raised a very valid point (this JIRA), so let's get the problems solved.
[jira] [Commented] (YARN-2828) Enable auto refresh of web pages (using http parameter)
[ https://issues.apache.org/jira/browse/YARN-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288348#comment-14288348 ] Hadoop QA commented on YARN-2828: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12693988/YARN-2828.001.patch against trunk revision 825923f. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6392//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6392//console This message is automatically generated. Enable auto refresh of web pages (using http parameter) --- Key: YARN-2828 URL: https://issues.apache.org/jira/browse/YARN-2828 Project: Hadoop YARN Issue Type: Improvement Reporter: Tim Robertson Assignee: Vijay Bhat Priority: Minor Attachments: YARN-2828.001.patch The MR1 Job Tracker had a useful HTTP parameter, e.g. refresh=3, that could be appended to URLs to enable a page reload. This was very useful when developing mapreduce jobs, especially for watching counters change. This is lost in the Yarn interface. It could be implemented as a page element (e.g. a drop-down or similar), but I'd recommend not cluttering the page further and simply bringing back the optional refresh HTTP param. It worked really nicely.
[jira] [Commented] (YARN-3091) [Umbrella] Improve locks of RM scheduler
[ https://issues.apache.org/jira/browse/YARN-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288400#comment-14288400 ] Wangda Tan commented on YARN-3091: -- Since some class hierarchies span modules (like AbstractYarnScheduler, inherited by the capacity and fair schedulers), I suggest making the sub-tasks class-family-wise. What I propose for the sub-tasks is: # AbstractYarnScheduler - CapacityScheduler - FairScheduler # SchedulerApplicationAttempt - FiCaSchedulerApp - FSAppAttempt # AbstractCSQueue - ParentQueue - LeafQueue # AppSchedulingInfo Hope to get your thoughts on this; if you agree, I will go ahead and create the sub-tickets. Thanks, Wangda [Umbrella] Improve locks of RM scheduler Key: YARN-3091 URL: https://issues.apache.org/jira/browse/YARN-3091 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, fairscheduler, resourcemanager, scheduler Reporter: Wangda Tan In existing YARN RM scheduler, there're some issues of using locks. For example: - Many unnecessary synchronized locks, we have seen several cases recently that too frequent access of scheduler makes scheduler hang. Which could be addressed by using read/write lock. Components include scheduler, CS queues, apps - Some fields not properly locked (Like clusterResource) We can address them together in this ticket. (More details see comments below) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
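The read/write-lock direction discussed in this ticket can be sketched as follows. This is an illustrative toy, not the actual YARN scheduler code; the class and field names (DemoScheduler, clusterResourceMB) are invented:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch: replacing a coarse "synchronized" with a read/write lock so that
// frequent read-only queries no longer serialize behind each other.
public class DemoScheduler {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private long clusterResourceMB = 0;

  // Read-only accessors take the shared read lock; many can run concurrently.
  public long getClusterResourceMB() {
    lock.readLock().lock();
    try {
      return clusterResourceMB;
    } finally {
      lock.readLock().unlock();
    }
  }

  // Mutations take the exclusive write lock.
  public void updateClusterResourceMB(long mb) {
    lock.writeLock().lock();
    try {
      clusterResourceMB = mb;
    } finally {
      lock.writeLock().unlock();
    }
  }
}
```

This also addresses the "fields not properly locked (like clusterResource)" point: every access goes through one of the two lock modes.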
[jira] [Commented] (YARN-2896) Server side PB changes for Priority Label Manager and Admin CLI support
[ https://issues.apache.org/jira/browse/YARN-2896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288449#comment-14288449 ] Wangda Tan commented on YARN-2896: -- +1 for moving it to YARN-1963 Server side PB changes for Priority Label Manager and Admin CLI support --- Key: YARN-2896 URL: https://issues.apache.org/jira/browse/YARN-2896 Project: Hadoop YARN Issue Type: Sub-task Components: api, resourcemanager Reporter: Sunil G Assignee: Sunil G Attachments: 0001-YARN-2896.patch, 0002-YARN-2896.patch, 0003-YARN-2896.patch, 0004-YARN-2896.patch Common changes: * PB support changes required for Admin APIs * PB support for File System store (Priority Label Store) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3091) [Umbrella] Improve locks of RM scheduler
[ https://issues.apache.org/jira/browse/YARN-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288460#comment-14288460 ] Varun Saxena commented on YARN-3091: Ok... [Umbrella] Improve locks of RM scheduler Key: YARN-3091 URL: https://issues.apache.org/jira/browse/YARN-3091 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, fairscheduler, resourcemanager, scheduler Reporter: Wangda Tan In existing YARN RM scheduler, there're some issues of using locks. For example: - Many unnecessary synchronized locks, we have seen several cases recently that too frequent access of scheduler makes scheduler hang. Which could be addressed by using read/write lock. Components include scheduler, CS queues, apps - Some fields not properly locked (Like clusterResource) We can address them together in this ticket. (More details see comments below) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3091) [Umbrella] Improve locks of RM scheduler
[ https://issues.apache.org/jira/browse/YARN-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288411#comment-14288411 ] Varun Saxena commented on YARN-3091: Similar to YARN-3008 ? Maybe that can be linked to this. [Umbrella] Improve locks of RM scheduler Key: YARN-3091 URL: https://issues.apache.org/jira/browse/YARN-3091 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, fairscheduler, resourcemanager, scheduler Reporter: Wangda Tan In existing YARN RM scheduler, there're some issues of using locks. For example: - Many unnecessary synchronized locks, we have seen several cases recently that too frequent access of scheduler makes scheduler hang. Which could be addressed by using read/write lock. Components include scheduler, CS queues, apps - Some fields not properly locked (Like clusterResource) We can address them together in this ticket. (More details see comments below) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3091) [Umbrella] Improve locks of RM scheduler
[ https://issues.apache.org/jira/browse/YARN-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288421#comment-14288421 ] Varun Saxena commented on YARN-3091: Yeah. I meant YARN-3008 can probably be made one subtask of this. It can address this part: AbstractYarnScheduler - CapacityScheduler - FairScheduler [Umbrella] Improve locks of RM scheduler Key: YARN-3091 URL: https://issues.apache.org/jira/browse/YARN-3091 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, fairscheduler, resourcemanager, scheduler Reporter: Wangda Tan In existing YARN RM scheduler, there're some issues of using locks. For example: - Many unnecessary synchronized locks, we have seen several cases recently that too frequent access of scheduler makes scheduler hang. Which could be addressed by using read/write lock. Components include scheduler, CS queues, apps - Some fields not properly locked (Like clusterResource) We can address them together in this ticket. (More details see comments below) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2868) Add metric for initial container launch time
[ https://issues.apache.org/jira/browse/YARN-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288431#comment-14288431 ] Anubhav Dhoot commented on YARN-2868: - volatile cannot be used in place of AtomicLong for thread synchronization. Let's revert those changes back to the previous patch. Add metric for initial container launch time Key: YARN-2868 URL: https://issues.apache.org/jira/browse/YARN-2868 Project: Hadoop YARN Issue Type: Improvement Reporter: Ray Chiang Assignee: Ray Chiang Labels: metrics, supportability Attachments: YARN-2868-01.patch, YARN-2868.002.patch, YARN-2868.003.patch, YARN-2868.004.patch, YARN-2868.005.patch, YARN-2868.006.patch Add a metric to measure the latency between starting container allocation and the first container actually being allocated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
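For context on the volatile-vs-AtomicLong point above: a hedged illustration (class and method names invented, not the patch's code) of why a volatile field is not a substitute for AtomicLong. volatile guarantees visibility of writes, but `count++` is a non-atomic read-modify-write, so concurrent increments can be lost:

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy counter showing the difference between a volatile long and an AtomicLong.
public class CounterDemo {
  private volatile long volatileCount = 0;      // visible, but increments can race
  private final AtomicLong atomicCount = new AtomicLong();

  // "volatileCount++" is read + add + write: two threads can read the same
  // value and one increment is silently lost.
  public void unsafeIncrement() { volatileCount++; }

  // incrementAndGet() is a single atomic read-modify-write.
  public void safeIncrement() { atomicCount.incrementAndGet(); }

  public long atomicValue() { return atomicCount.get(); }
}
```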
[jira] [Commented] (YARN-2896) Server side PB changes for Priority Label Manager and Admin CLI support
[ https://issues.apache.org/jira/browse/YARN-2896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288445#comment-14288445 ] Eric Payne commented on YARN-2896: -- [~sunilg], [~leftnoteasy], and [~vinodkv], can we move this discussion to YARN-1963 in order to achieve a higher visibility? Server side PB changes for Priority Label Manager and Admin CLI support --- Key: YARN-2896 URL: https://issues.apache.org/jira/browse/YARN-2896 Project: Hadoop YARN Issue Type: Sub-task Components: api, resourcemanager Reporter: Sunil G Assignee: Sunil G Attachments: 0001-YARN-2896.patch, 0002-YARN-2896.patch, 0003-YARN-2896.patch, 0004-YARN-2896.patch Common changes: * PB support changes required for Admin APIs * PB support for File System store (Priority Label Store) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3092) Create common resource usage class to track labeled resource/capacity in Capacity Scheduler
Wangda Tan created YARN-3092: Summary: Create common resource usage class to track labeled resource/capacity in Capacity Scheduler Key: YARN-3092 URL: https://issues.apache.org/jira/browse/YARN-3092 Project: Hadoop YARN Issue Type: Sub-task Reporter: Wangda Tan Assignee: Wangda Tan Since we have labels on nodes, we need to track resource usage *by label*, including: - AM resource (to enforce max-am-resource-by-label after YARN-2637) - Used resource (includes AM resource usage) - Reserved resource - Pending resource - Headroom The benefits of having such a common class are: - Reuse of lots of code in different places (Queue/App/User), for better maintainability and readability. - Enables fine-grained locking (e.g. accessing the used resource in a queue doesn't need to lock the whole queue) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
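A minimal sketch of the common usage class this ticket proposes, with invented names (LabeledResourceUsage, incUsed) and only the "used" field shown; the real class would track AM/reserved/pending/headroom the same way. The per-object read/write lock is what gives the fine-grained-locking benefit described above:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch: per-label resource accounting guarded by its own lock, so callers
// reading usage do not need to hold the queue's lock.
public class LabeledResourceUsage {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Map<String, Long> usedMBByLabel = new HashMap<>();

  public void incUsed(String label, long mb) {
    lock.writeLock().lock();
    try {
      usedMBByLabel.merge(label, mb, Long::sum);  // accumulate per label
    } finally {
      lock.writeLock().unlock();
    }
  }

  public long getUsed(String label) {
    lock.readLock().lock();
    try {
      return usedMBByLabel.getOrDefault(label, 0L);
    } finally {
      lock.readLock().unlock();
    }
  }
}
```

Queue, App, and User objects could each hold one instance, which is the code-reuse point in the description.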
[jira] [Created] (YARN-3091) [Umbrella] Improve locks of RM scheduler
Wangda Tan created YARN-3091: Summary: [Umbrella] Improve locks of RM scheduler Key: YARN-3091 URL: https://issues.apache.org/jira/browse/YARN-3091 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, fairscheduler, resourcemanager, scheduler Reporter: Wangda Tan In existing YARN RM scheduler, there're some issues of using locks. For example: - Many unnecessary synchronized locks, we have seen several cases recently that too frequent access of scheduler makes scheduler hang. Which could be addressed by using read/write lock. Components include scheduler, CS queues, apps - Some fields not properly locked (Like clusterResource) We can address them together in this ticket. (More details see comments below) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3091) [Umbrella] Improve locks of RM scheduler
[ https://issues.apache.org/jira/browse/YARN-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288412#comment-14288412 ] Varun Saxena commented on YARN-3091: That one is for FairScheduler [Umbrella] Improve locks of RM scheduler Key: YARN-3091 URL: https://issues.apache.org/jira/browse/YARN-3091 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, fairscheduler, resourcemanager, scheduler Reporter: Wangda Tan In existing YARN RM scheduler, there're some issues of using locks. For example: - Many unnecessary synchronized locks, we have seen several cases recently that too frequent access of scheduler makes scheduler hang. Which could be addressed by using read/write lock. Components include scheduler, CS queues, apps - Some fields not properly locked (Like clusterResource) We can address them together in this ticket. (More details see comments below) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3091) [Umbrella] Improve locks of RM scheduler
[ https://issues.apache.org/jira/browse/YARN-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288417#comment-14288417 ] Wangda Tan commented on YARN-3091: -- Thanks [~varun_saxena] for pointing this out. I think such fine-grained locking enhancements should also be included in this umbrella ticket. This JIRA is intended to track scheduler lock improvements in general, not just one specific scheduler type. [Umbrella] Improve locks of RM scheduler Key: YARN-3091 URL: https://issues.apache.org/jira/browse/YARN-3091 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, fairscheduler, resourcemanager, scheduler Reporter: Wangda Tan In existing YARN RM scheduler, there're some issues of using locks. For example: - Many unnecessary synchronized locks, we have seen several cases recently that too frequent access of scheduler makes scheduler hang. Which could be addressed by using read/write lock. Components include scheduler, CS queues, apps - Some fields not properly locked (Like clusterResource) We can address them together in this ticket. (More details see comments below) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3009) TimelineWebServices always parses primary and secondary filters as numbers if first char is a number
[ https://issues.apache.org/jira/browse/YARN-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288514#comment-14288514 ] Naganarasimha G R commented on YARN-3009: - Hi [~cwensel], from the above discussion I conclude that the other options for overcoming this issue are not currently feasible (due to the impacts specified by [~zjshen]) and there is no proper workaround either, so I plan to close this issue as Won't Fix. Please let me know of any concerns. TimelineWebServices always parses primary and secondary filters as numbers if first char is a number Key: YARN-3009 URL: https://issues.apache.org/jira/browse/YARN-3009 Project: Hadoop YARN Issue Type: Bug Components: timelineserver Affects Versions: 2.6.0 Reporter: Chris K Wensel Assignee: Naganarasimha G R Attachments: YARN-3009.20150108-1.patch, YARN-3009.20150111-1.patch If you pass a filter value that starts with a number (7CCA...), the filter value will be parsed into the Number '7', causing the filter to fail the search. It should be noted that the actual value, as stored via a PUT operation, is properly parsed and stored as a String. This manifests as a very hard to identify issue with DAGClient in Apache Tez and naming dags/vertices with alphanumeric guid values. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
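As a hedged illustration of this class of bug (not the actual TimelineWebServices code): lenient number parsers such as java.text.NumberFormat consume only the leading numeric prefix of a string, which reproduces the "7CCA becomes the Number 7" behavior described in this issue:

```java
import java.text.NumberFormat;
import java.text.ParseException;

// Toy parser showing how an alphanumeric id with a leading digit is silently
// truncated to a Number, while a fully non-numeric value survives as a String.
public class FilterParseDemo {
  public static Object parseLeniently(String raw) {
    try {
      // NumberFormat.parse stops at the first character it cannot consume,
      // so "7CCA" yields the Number 7 with no error.
      return NumberFormat.getInstance().parse(raw);
    } catch (ParseException e) {
      return raw; // no leading digit at all => kept as a String
    }
  }
}
```

The fix direction discussed in the thread is to keep filter values as Strings rather than attempting numeric interpretation; the exact behavior depends on the default locale's number format.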
[jira] [Commented] (YARN-3086) Make NodeManager memory configurable in MiniYARNCluster
[ https://issues.apache.org/jira/browse/YARN-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288803#comment-14288803 ] Tsuyoshi OZAWA commented on YARN-3086: -- The following is a suggestion from [~rmetzger] in the yarn-dev mailing list: {quote} If you want to maintain the current value of 4GB as the default value, I probably need to introduce a new configuration value, similar to: public static final String YARN_MINICLUSTER_CONTROL_RESOURCE_MONITORING = YARN_PREFIX + minicluster.control-resource-monitoring; Or is there another approach you would prefer? I'll add a patch to the JIRA once I've found time to fix it. {quote} Make NodeManager memory configurable in MiniYARNCluster --- Key: YARN-3086 URL: https://issues.apache.org/jira/browse/YARN-3086 Project: Hadoop YARN Issue Type: Improvement Components: test Reporter: Robert Metzger Priority: Minor Apache Flink has a built-in YARN client to deploy it to YARN clusters. Recently, we added more tests for the client, using the MiniYARNCluster. One of the tests requests more containers than are available. This test works well on machines with enough memory, but on travis-ci (our test environment), the available main memory is limited to 3 GB. Therefore, I want to set a custom amount of memory for each NodeManager. Right now, the NodeManager memory is hardcoded to 4GB. As discussed on the yarn-dev list, I'm going to create a patch for this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2896) Server side PB changes for Priority Label Manager and Admin CLI support
[ https://issues.apache.org/jira/browse/YARN-2896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288808#comment-14288808 ] Sunil G commented on YARN-2896: --- Yes. +1. I will move this to parent JIRA for better visibility. Thank you Eric and Wangda for the comments. Server side PB changes for Priority Label Manager and Admin CLI support --- Key: YARN-2896 URL: https://issues.apache.org/jira/browse/YARN-2896 Project: Hadoop YARN Issue Type: Sub-task Components: api, resourcemanager Reporter: Sunil G Assignee: Sunil G Attachments: 0001-YARN-2896.patch, 0002-YARN-2896.patch, 0003-YARN-2896.patch, 0004-YARN-2896.patch Common changes: * PB support changes required for Admin APIs * PB support for File System store (Priority Label Store) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3086) Make NodeManager memory configurable in MiniYARNCluster
[ https://issues.apache.org/jira/browse/YARN-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288817#comment-14288817 ] Tsuyoshi OZAWA commented on YARN-3086: -- [~rmetzger] Yeah, the approach you suggested basically looks good to me. We already have YARN_MC_PREFIX, so please use it. {code} public static final String YARN_MC_PREFIX = YARN_PREFIX + "minicluster."; {code} I think the new configuration name should simply be YARN_MINICLUSTER_NM_PMEM_MB. Make NodeManager memory configurable in MiniYARNCluster --- Key: YARN-3086 URL: https://issues.apache.org/jira/browse/YARN-3086 Project: Hadoop YARN Issue Type: Improvement Components: test Reporter: Robert Metzger Priority: Minor Apache Flink has a built-in YARN client to deploy it to YARN clusters. Recently, we added more tests for the client, using the MiniYARNCluster. One of the tests requests more containers than are available. This test works well on machines with enough memory, but on travis-ci (our test environment), the available main memory is limited to 3 GB. Therefore, I want to set a custom amount of memory for each NodeManager. Right now, the NodeManager memory is hardcoded to 4GB. As discussed on the yarn-dev list, I'm going to create a patch for this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
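The key-naming convention discussed above can be sketched like this. The "nm-pmem-mb" suffix and the default constant are invented for illustration; the final constant name and value are whatever the patch settles on:

```java
// Sketch of the configuration-key shape following the existing
// YARN_MC_PREFIX convention (prefix values match the comment above;
// the NM memory key's suffix is hypothetical).
public final class MiniClusterKeys {
  public static final String YARN_PREFIX = "yarn.";
  public static final String YARN_MC_PREFIX = YARN_PREFIX + "minicluster.";

  // Hypothetical key for the NM physical-memory setting:
  public static final String YARN_MINICLUSTER_NM_PMEM_MB =
      YARN_MC_PREFIX + "nm-pmem-mb";

  // Preserves the currently hardcoded 4GB as the default.
  public static final int DEFAULT_YARN_MINICLUSTER_NM_PMEM_MB = 4 * 1024;

  private MiniClusterKeys() {}
}
```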
[jira] [Updated] (YARN-3028) Better syntax for replace label CLI
[ https://issues.apache.org/jira/browse/YARN-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated YARN-3028: - Attachment: 0002-YARN-3028.patch Better syntax for replace label CLI --- Key: YARN-3028 URL: https://issues.apache.org/jira/browse/YARN-3028 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Jian He Assignee: Rohith Attachments: 0001-YARN-3028.patch, 0002-YARN-3028.patch The command to replace label now is such: {code} yarn rmadmin -replaceLabelsOnNode [node1:port,label1,label2 node2:port,label1,label2] {code} Instead of {code} node1:port,label1,label2 {code} I think it's better to say {code} node1:port=label1,label2 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3028) Better syntax for replace label CLI
[ https://issues.apache.org/jira/browse/YARN-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288840#comment-14288840 ] Rohith commented on YARN-3028: -- Kindly review the updated patch Better syntax for replace label CLI --- Key: YARN-3028 URL: https://issues.apache.org/jira/browse/YARN-3028 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Jian He Assignee: Rohith Attachments: 0001-YARN-3028.patch, 0002-YARN-3028.patch The command to replace label now is such: {code} yarn rmadmin -replaceLabelsOnNode [node1:port,label1,label2 node2:port,label1,label2] {code} Instead of {code} node1:port,label1,label2 {code} I think it's better to say {code} node1:port=label1,label2 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
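The proposed host:port=label1,label2 syntax could be parsed along these lines. This is a hypothetical sketch, not the actual RMAdminCLI code; it shows why '=' makes the split unambiguous even though the node part itself contains a ':':

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Sketch: split one CLI argument into (node, labels) at the first '='.
public class LabelArgParser {
  public static Map.Entry<String, List<String>> parse(String arg) {
    int eq = arg.indexOf('=');
    String node = eq < 0 ? arg : arg.substring(0, eq);
    List<String> labels = eq < 0
        ? Collections.<String>emptyList()        // bare node => empty label set
        : Arrays.asList(arg.substring(eq + 1).split(","));
    return new SimpleEntry<>(node, labels);
  }
}
```

With the old comma-only syntax, "node1:port,label1,label2" needs a convention for which comma separates the node from the labels; the '=' removes that ambiguity.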
[jira] [Commented] (YARN-3091) [Umbrella] Improve locks of RM scheduler
[ https://issues.apache.org/jira/browse/YARN-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288856#comment-14288856 ] Sunil G commented on YARN-3091: --- Through code review we may find more issues in the areas mentioned by [~leftnoteasy]. I feel we could start a task to run JCarder to clearly pinpoint the lock problems. That would help us design these subtasks with more clarity, and may also be helpful in verifying the changes afterwards. [Umbrella] Improve locks of RM scheduler Key: YARN-3091 URL: https://issues.apache.org/jira/browse/YARN-3091 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, fairscheduler, resourcemanager, scheduler Reporter: Wangda Tan In existing YARN RM scheduler, there're some issues of using locks. For example: - Many unnecessary synchronized locks, we have seen several cases recently that too frequent access of scheduler makes scheduler hang. Which could be addressed by using read/write lock. Components include scheduler, CS queues, apps - Some fields not properly locked (Like clusterResource) We can address them together in this ticket. (More details see comments below) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2800) Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature
[ https://issues.apache.org/jira/browse/YARN-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-2800: - Attachment: YARN-2800-20150122-1.patch Updated patch. I just checked the code; only RMNodeLabelsManager possibly has multi-threaded access to NodeLabelsManager, and {{nodeLabelsEnabled}} will be protected by the write lock of CommonNodeLabelsManager. So I think we don't need to add volatile to it. In addition, it is only used in CommonNodeLabelsManager, so make it private. Please kindly review. Thanks, Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature Key: YARN-2800 URL: https://issues.apache.org/jira/browse/YARN-2800 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-2800-20141102-1.patch, YARN-2800-20141102-2.patch, YARN-2800-20141118-1.patch, YARN-2800-20141118-2.patch, YARN-2800-20141119-1.patch, YARN-2800-20141203-1.patch, YARN-2800-20141205-1.patch, YARN-2800-20141205-1.patch, YARN-2800-20150122-1.patch In the past, we had a MemoryNodeLabelStore, mostly for users to try this feature without configuring where to store node labels on the file system. It seems convenient for users to try this, but it actually causes a bad user experience. Users may add/remove labels and edit capacity-scheduler.xml. After an RM restart, the labels will be gone (we store them in memory), and the RM cannot start if some queue uses labels that don't exist in the cluster. As we discussed, we should have an explicit way to let users specify whether they want this feature or not. If node labels are disabled, any operation trying to modify/use node labels will throw an exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2694) Ensure only single node labels specified in resource request / host, and node label expression only specified when resourceName=ANY
[ https://issues.apache.org/jira/browse/YARN-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-2694: - Attachment: YARN-2694-20150122-1.patch Updated patch addressing test failures. Ensure only single node labels specified in resource request / host, and node label expression only specified when resourceName=ANY --- Key: YARN-2694 URL: https://issues.apache.org/jira/browse/YARN-2694 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-2694-20141020-1.patch, YARN-2694-20141021-1.patch, YARN-2694-20141023-1.patch, YARN-2694-20141023-2.patch, YARN-2694-20141101-1.patch, YARN-2694-20141101-2.patch, YARN-2694-20150121-1.patch, YARN-2694-20150122-1.patch Currently, node label expression support in the capacity scheduler is only partially completed. A node label expression specified in a Resource Request will only be respected when specified at the ANY level, and a ResourceRequest/host with multiple node labels makes user limit, etc. computation more tricky. We need to temporarily disable these; changes include: - AMRMClient - ApplicationMasterService - RMAdminCLI - CommonNodeLabelsManager -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2896) Server side PB changes for Priority Label Manager and Admin CLI support
[ https://issues.apache.org/jira/browse/YARN-2896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288082#comment-14288082 ] Wangda Tan commented on YARN-2896: -- Thanks [~sunilg], I agree with having a store to simplify configuration (no need to specify highest-priority for each user under each queue); all can be done via REST API or CLI. For the other part, we can wait for [~vinodkv]'s feedback. Server side PB changes for Priority Label Manager and Admin CLI support --- Key: YARN-2896 URL: https://issues.apache.org/jira/browse/YARN-2896 Project: Hadoop YARN Issue Type: Sub-task Components: api, resourcemanager Reporter: Sunil G Assignee: Sunil G Attachments: 0001-YARN-2896.patch, 0002-YARN-2896.patch, 0003-YARN-2896.patch, 0004-YARN-2896.patch Common changes: * PB support changes required for Admin APIs * PB support for File System store (Priority Label Store) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3082) Non thread safe access to systemCredentials in NodeHeartbeatResponse processing
[ https://issues.apache.org/jira/browse/YARN-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288099#comment-14288099 ] Hadoop QA commented on YARN-3082: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12693959/YARN-3082.002.patch against trunk revision 825923f. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6390//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6390//console This message is automatically generated. Non thread safe access to systemCredentials in NodeHeartbeatResponse processing --- Key: YARN-3082 URL: https://issues.apache.org/jira/browse/YARN-3082 Project: Hadoop YARN Issue Type: Bug Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-3082.001.patch, YARN-3082.002.patch When you use system credentials via feature added in YARN-2704, the proto conversion code throws exception in converting ByteBuffer -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3082) Non thread safe access to systemCredentials in NodeHeartbeatResponse processing
[ https://issues.apache.org/jira/browse/YARN-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-3082: Attachment: YARN-3082.002.patch Addressed feedback Non thread safe access to systemCredentials in NodeHeartbeatResponse processing --- Key: YARN-3082 URL: https://issues.apache.org/jira/browse/YARN-3082 Project: Hadoop YARN Issue Type: Bug Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-3082.001.patch, YARN-3082.002.patch When you use system credentials via feature added in YARN-2704, the proto conversion code throws exception in converting ByteBuffer -- This message was sent by Atlassian JIRA (v6.3.4#6332)
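A hedged illustration of the underlying ByteBuffer hazard behind this bug (not the actual YARN-3082 patch): a ByteBuffer carries mutable position/limit state, so handing one shared instance to concurrent readers is unsafe. ByteBuffer.duplicate() gives each reader an independent cursor over the same backing bytes:

```java
import java.nio.ByteBuffer;

// Sketch: read a shared ByteBuffer without mutating its position,
// so repeated or concurrent conversions each see the full content.
public class CredBufferDemo {
  public static byte[] readAll(ByteBuffer shared) {
    ByteBuffer view = shared.duplicate();  // private cursor, shared content
    byte[] out = new byte[view.remaining()];
    view.get(out);                         // advances only the view's position
    return out;
  }
}
```

Without duplicate(), a second reader (or a retry) would find the shared buffer already consumed (remaining() == 0), which is the kind of state corruption that surfaces as a proto-conversion exception.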
[jira] [Commented] (YARN-160) nodemanagers should obtain cpu/memory values from underlying OS
[ https://issues.apache.org/jira/browse/YARN-160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288029#comment-14288029 ] Varun Vasudev commented on YARN-160: bq. HADOOP_HEAPSIZE_MAX in trunk. HADOOP_HEAPSIZE was deprecated. Thanks for pointing this out Allen. I'll provide a patch for trunk and one for branch-2 when I address Vinod's comments. bq. yarn.nodemanager.count-logical-processors-as-cores: Not sure of the use for this. On Linux, shouldn't we simply use the returned numCores if they are valid? And fall back to numProcessors? Some people prefer to count hyperthreads as a CPU and some don't. This lets users choose. bq. yarn.nodemanager.enable-hardware-capability-detection: I think specifying the capabilities to be -1 is already a way to trigger this automatic detection, let's simply drop the flag and assume it to be true all the time? Junping felt we should add it to cover upgrade scenarios. What do you think? bq. We already have resource.percentage-physical-cpu-limit for CPUs - YARN-2440. How about simply adding a resource.percentage-pmem-limit instead of making it a magic number in the code? Of course, we can have a default reserved percentage. I think resource.percentage-pmem-limit should be analogous to resource.percentage-physical-cpu-limit in that it sets the limit as a percentage of total memory. What about something like yarn.nodemanager.default-percentage-pmem-limit? 
nodemanagers should obtain cpu/memory values from underlying OS --- Key: YARN-160 URL: https://issues.apache.org/jira/browse/YARN-160 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.0.3-alpha Reporter: Alejandro Abdelnur Assignee: Varun Vasudev Fix For: 2.7.0 Attachments: apache-yarn-160.0.patch, apache-yarn-160.1.patch, apache-yarn-160.2.patch, apache-yarn-160.3.patch As mentioned in YARN-2 *NM memory and CPU configs*: currently these values come from the NM config, but we should be able to obtain them from the OS (i.e., in the case of Linux, from /proc/meminfo and /proc/cpuinfo). As this is highly OS dependent we should have an interface that obtains this information. In addition, implementations of this interface should be able to specify a mem/cpu offset (amount of mem/cpu not to be available as a YARN resource); this would allow reserving mem/cpu for the OS and other services outside of YARN containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
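A hedged sketch of the kind of OS probe the issue asks for (names are illustrative, not the actual patch): parse MemTotal from /proc/meminfo contents and subtract a configurable offset reserved for the OS and non-YARN services.

```java
// Hypothetical Linux memory probe; callers would pass the text of
// /proc/meminfo (or equivalent) read from the node.
public class LinuxMemoryProbe {
    // Returns MemTotal in MB given /proc/meminfo contents, or -1 if absent.
    public static long totalMemoryMB(String meminfo) {
        for (String line : meminfo.split("\n")) {
            if (line.startsWith("MemTotal:")) {
                // e.g. "MemTotal:       16309204 kB"
                long kb = Long.parseLong(line.replaceAll("[^0-9]", ""));
                return kb / 1024;
            }
        }
        return -1;
    }

    // Memory to advertise to YARN after reserving 'reservedMB' for the OS.
    public static long availableForYarnMB(long totalMB, long reservedMB) {
        return Math.max(0, totalMB - reservedMB);
    }
}
```

A /proc/cpuinfo probe for cores/processors would follow the same shape behind the OS-abstraction interface the description proposes.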
[jira] [Commented] (YARN-3028) Better syntax for replace label CLI
[ https://issues.apache.org/jira/browse/YARN-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288069#comment-14288069 ] Wangda Tan commented on YARN-3028: -- [~rohithsharma], Thanks for updating, but did you forget attaching the patch? :-) Better syntax for replace label CLI --- Key: YARN-3028 URL: https://issues.apache.org/jira/browse/YARN-3028 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Jian He Assignee: Rohith Attachments: 0001-YARN-3028.patch The command to replace labels is currently: {code} yarn rmadmin -replaceLabelsOnNode [node1:port,label1,label2 node2:port,label1,label2] {code} Instead of {code} node1:port,label1,label2 {code} I think it's better to say {code} node1:port=label1,label2 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
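A hypothetical parser for the proposed `node1:port=label1,label2` form (sketch only; the real CLI change may differ) shows why the `=` helps: it cleanly separates the node address, which itself contains `:`, from the comma-delimited label list.

```java
import java.util.AbstractMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Illustrative parser, not the actual rmadmin code.
public class ReplaceLabelsArg {
    public static Map.Entry<String, Set<String>> parse(String arg) {
        int eq = arg.indexOf('=');
        if (eq < 0) {
            throw new IllegalArgumentException(
                "expected node:port=label1,label2 but got: " + arg);
        }
        Set<String> labels = new LinkedHashSet<>();
        for (String label : arg.substring(eq + 1).split(",")) {
            if (!label.trim().isEmpty()) {
                labels.add(label.trim());
            }
        }
        return new AbstractMap.SimpleEntry<>(arg.substring(0, eq), labels);
    }
}
```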
[jira] [Updated] (YARN-3049) implement existing ATS queries in the new ATS design
[ https://issues.apache.org/jira/browse/YARN-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated YARN-3049: -- Assignee: Zhijie Shen (was: Varun Saxena) implement existing ATS queries in the new ATS design Key: YARN-3049 URL: https://issues.apache.org/jira/browse/YARN-3049 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Zhijie Shen Implement existing ATS queries with the new ATS reader design. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3040) implement client-side API for handling flows
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated YARN-3040: -- Assignee: Robert Kanter (was: Naganarasimha G R) implement client-side API for handling flows Key: YARN-3040 URL: https://issues.apache.org/jira/browse/YARN-3040 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Robert Kanter Per design in YARN-2928, implement client-side API for handling *flows*. Frameworks should be able to define and pass in all attributes of flows and flow runs to YARN, and they should be passed into ATS writers. YARN tags were discussed as a way to handle this piece of information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1
[ https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287708#comment-14287708 ] Sangjin Lee commented on YARN-2928: --- Just a reminder that we have an IRC channel for quick discussions on this effort at ##hadoop-ats on irc.freenode.net. We also have regular Google hangout status calls. Email me if you'd like to participate in the status calls. Application Timeline Server (ATS) next gen: phase 1 --- Key: YARN-2928 URL: https://issues.apache.org/jira/browse/YARN-2928 Project: Hadoop YARN Issue Type: New Feature Components: timelineserver Reporter: Sangjin Lee Assignee: Vinod Kumar Vavilapalli Priority: Critical Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf We have the application timeline server implemented in YARN per YARN-1530 and YARN-321. Although it is a great feature, we have recognized several critical issues and features that need to be addressed. This JIRA proposes the design and implementation changes to address those. This is phase 1 of this effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3086) Make NodeManager memory configurable in MiniYARNCluster
Robert Metzger created YARN-3086: Summary: Make NodeManager memory configurable in MiniYARNCluster Key: YARN-3086 URL: https://issues.apache.org/jira/browse/YARN-3086 Project: Hadoop YARN Issue Type: Improvement Components: test Reporter: Robert Metzger Priority: Minor Apache Flink has a built-in YARN client to deploy it to YARN clusters. Recently, we added more tests for the client, using the MiniYARNCluster. One of the tests requests more containers than available. This test works well on machines with enough memory, but on travis-ci (our test environment), the available main memory is limited to 3 GB. Therefore, I want to set a custom amount of memory for each NodeManager. Right now, the NodeManager memory is hardcoded to 4 GB. As discussed on the yarn-dev list, I'm going to create a patch for this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
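If the hardcoded value becomes configurable, test usage could look roughly like the following fragment. This is hedged: `yarn.nodemanager.resource.memory-mb` (the `YarnConfiguration.NM_PMEM_MB` key) is the standard NM memory setting, but whether the patch reuses it or introduces a new key is up to the implementation.

```java
// Hypothetical usage once MiniYARNCluster honors a configured NM memory
// instead of its hardcoded 4 GB per NodeManager.
Configuration conf = new YarnConfiguration();
conf.setInt(YarnConfiguration.NM_PMEM_MB, 768); // MB per NodeManager
MiniYARNCluster cluster = new MiniYARNCluster("flink-test", 2, 1, 1);
cluster.init(conf);
cluster.start();
```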
[jira] [Commented] (YARN-2990) FairScheduler's delay-scheduling always waits for node-local and rack-local delays, even for off-rack-only requests
[ https://issues.apache.org/jira/browse/YARN-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287871#comment-14287871 ] Sandy Ryza commented on YARN-2990: -- Other than the addition of the anyLocalRequests check here: {code} + if (offSwitchRequest.getNumContainers() > 0 && + (!anyLocalRequests(priority) + || allowedLocality.equals(NodeType.OFF_SWITCH))) { {code} are the other changes core to the fix? If not, given that this is touchy code, can we leave things the way they are or make the changes in a separate cleanup JIRA? Also, a couple nits: * Need some extra indentation in the snippet above * anyLocalRequests is kind of a confusing name for that method, because any often means off-switch when thinking about locality. Maybe hasNodeOrRackRequests. FairScheduler's delay-scheduling always waits for node-local and rack-local delays, even for off-rack-only requests --- Key: YARN-2990 URL: https://issues.apache.org/jira/browse/YARN-2990 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.6.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Attachments: yarn-2990-0.patch, yarn-2990-1.patch, yarn-2990-test.patch Looking at the FairScheduler, it appears the node/rack locality delays are used for all requests, even those that are only off-rack. More details in the comments. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2800) Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature
[ https://issues.apache.org/jira/browse/YARN-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14287921#comment-14287921 ] Wangda Tan commented on YARN-2800: -- Thanks [~ozawa] for the review! Rebasing and will upload soon. Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature Key: YARN-2800 URL: https://issues.apache.org/jira/browse/YARN-2800 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-2800-20141102-1.patch, YARN-2800-20141102-2.patch, YARN-2800-20141118-1.patch, YARN-2800-20141118-2.patch, YARN-2800-20141119-1.patch, YARN-2800-20141203-1.patch, YARN-2800-20141205-1.patch, YARN-2800-20141205-1.patch In the past, we had a MemoryNodeLabelsStore, mostly so users could try this feature without configuring where to store node labels on the file system. It seems convenient, but it actually leads to a bad user experience: users may add/remove labels and edit capacity-scheduler.xml, but after an RM restart the labels are gone (we store them in memory), and the RM cannot start if some queue uses labels that don't exist in the cluster. As discussed, we should have an explicit way for users to specify whether they want this feature. If node labels are disabled, any operation trying to modify/use node labels will throw an exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3088) LinuxContainerExecutor.deleteAsUser can throw NPE if native executor returns an error
Jason Lowe created YARN-3088: Summary: LinuxContainerExecutor.deleteAsUser can throw NPE if native executor returns an error Key: YARN-3088 URL: https://issues.apache.org/jira/browse/YARN-3088 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.1-beta Reporter: Jason Lowe If the native executor returns an error trying to delete a path as a particular user when dir==null then the code can NPE trying to build a log message for the error. It blindly dereferences dir in the log message despite the code just above explicitly handling the cases when dir could be null. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-3088) LinuxContainerExecutor.deleteAsUser can throw NPE if native executor returns an error
[ https://issues.apache.org/jira/browse/YARN-3088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Payne reassigned YARN-3088: Assignee: Eric Payne LinuxContainerExecutor.deleteAsUser can throw NPE if native executor returns an error - Key: YARN-3088 URL: https://issues.apache.org/jira/browse/YARN-3088 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.1-beta Reporter: Jason Lowe Assignee: Eric Payne If the native executor returns an error trying to delete a path as a particular user when dir==null then the code can NPE trying to build a log message for the error. It blindly dereferences dir in the log message despite the code just above explicitly handling the cases when dir could be null. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
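An illustrative reconstruction of the bug pattern (the names are hypothetical, not the actual LinuxContainerExecutor code): `dir` may legitimately be null on this path, so the error-path log message must format it defensively rather than dereferencing it.

```java
import java.io.File;

// Hypothetical sketch of the guarded formatting the fix would need.
public class DeleteAsUserLogSketch {
    static String describeTarget(File dir) {
        // Mirror the null handling done just above the logging call
        // in the code the issue describes.
        return dir == null ? "all user directories" : dir.getPath();
    }
}
```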
[jira] [Commented] (YARN-3089) LinuxContainerExecutor does not handle file arguments to deleteAsUser
[ https://issues.apache.org/jira/browse/YARN-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288189#comment-14288189 ] Jason Lowe commented on YARN-3089: -- The failures are unfortunately not present in the NM log due to bug YARN-3088 preventing the log message from being generated properly. While debugging an instance of the nodemanager, I was able to see the error messages from the LCE executable, and they looked like the following: {noformat} Directory not found /somepath/application_1421927171686_0163/container_1421927171686_0163_01_09/syslog/ rmdir of /somepath/application_1421927171686_0163/container_1421927171686_0163_01_09/syslog/ failed - Permission denied Directory not found /somepath/application_1421927171686_0163/container_1421927171686_0163_01_09/stderr/ rmdir of /somepath/application_1421927171686_0163/container_1421927171686_0163_01_09/stderr/ failed - Permission denied Directory not found /somepath/application_1421927171686_0163/container_1421927171686_0163_01_09/stdout/ rmdir of /somepath/application_1421927171686_0163/container_1421927171686_0163_01_09/stdout/ failed - Permission denied {noformat} LinuxContainerExecutor does not handle file arguments to deleteAsUser - Key: YARN-3089 URL: https://issues.apache.org/jira/browse/YARN-3089 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Jason Lowe Priority: Blocker YARN-2468 added the deletion of individual logs that are aggregated, but this fails to delete log files when the LCE is being used. The LCE native executable assumes the paths being passed are directories and the delete fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-3089) LinuxContainerExecutor does not handle file arguments to deleteAsUser
[ https://issues.apache.org/jira/browse/YARN-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Payne reassigned YARN-3089: Assignee: Eric Payne LinuxContainerExecutor does not handle file arguments to deleteAsUser - Key: YARN-3089 URL: https://issues.apache.org/jira/browse/YARN-3089 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Jason Lowe Assignee: Eric Payne Priority: Blocker YARN-2468 added the deletion of individual logs that are aggregated, but this fails to delete log files when the LCE is being used. The LCE native executable assumes the paths being passed are directories and the delete fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2868) Add metric for initial container launch time
[ https://issues.apache.org/jira/browse/YARN-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288175#comment-14288175 ] Wangda Tan commented on YARN-2868: -- [~rchiang], Just reviewed the patch; I'm not sure if you misunderstood what Karthik and I meant. I agree with what you mentioned in https://issues.apache.org/jira/browse/YARN-2868?focusedCommentId=14274308&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14274308 (comment#1) and also Karthik's comment: https://issues.apache.org/jira/browse/YARN-2868?focusedCommentId=14274317&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14274317. It's better to keep AtomicLong as you originally did. Locking the application from the caller is not clear to me. Thanks, Add metric for initial container launch time Key: YARN-2868 URL: https://issues.apache.org/jira/browse/YARN-2868 Project: Hadoop YARN Issue Type: Improvement Reporter: Ray Chiang Assignee: Ray Chiang Labels: metrics, supportability Attachments: YARN-2868-01.patch, YARN-2868.002.patch, YARN-2868.003.patch, YARN-2868.004.patch, YARN-2868.005.patch, YARN-2868.006.patch Add a metric to measure the latency between starting container allocation and the first container actually being allocated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3089) LinuxContainerExecutor does not handle file arguments to deleteAsUser
Jason Lowe created YARN-3089: Summary: LinuxContainerExecutor does not handle file arguments to deleteAsUser Key: YARN-3089 URL: https://issues.apache.org/jira/browse/YARN-3089 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Jason Lowe Priority: Blocker YARN-2468 added the deletion of individual logs that are aggregated, but this fails to delete log files when the LCE is being used. The LCE native executable assumes the paths being passed are directories and the delete fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3028) Better syntax for replace label CLI
[ https://issues.apache.org/jira/browse/YARN-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288197#comment-14288197 ] Rohith commented on YARN-3028: -- bq. but did you forget attaching the patch? I meant the previously attached patch only. Better syntax for replace label CLI --- Key: YARN-3028 URL: https://issues.apache.org/jira/browse/YARN-3028 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Jian He Assignee: Rohith Attachments: 0001-YARN-3028.patch The command to replace labels is currently: {code} yarn rmadmin -replaceLabelsOnNode [node1:port,label1,label2 node2:port,label1,label2] {code} Instead of {code} node1:port,label1,label2 {code} I think it's better to say {code} node1:port=label1,label2 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3090) DeletionService can silently ignore deletion task failures
Jason Lowe created YARN-3090: Summary: DeletionService can silently ignore deletion task failures Key: YARN-3090 URL: https://issues.apache.org/jira/browse/YARN-3090 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.1-beta Reporter: Jason Lowe If a non-I/O exception occurs while the DeletionService is executing a deletion task then it will be silently ignored. The exception bubbles up to the thread workers of the ScheduledThreadPoolExecutor which simply attaches the throwable to the Future that was returned when the task was scheduled. However the thread pool is used as a fire-and-forget pool, so nothing ever looks at the Future and therefore the exception is never logged. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3090) DeletionService can silently ignore deletion task failures
[ https://issues.apache.org/jira/browse/YARN-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288218#comment-14288218 ] Jason Lowe commented on YARN-3090: -- An easy way to at least log a message when something terrible occurs to a deletion task is to use a derived ScheduledThreadPoolExecutor that overrides the afterExecute method to log any throwable that was associated with the task. DeletionService can silently ignore deletion task failures -- Key: YARN-3090 URL: https://issues.apache.org/jira/browse/YARN-3090 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.1-beta Reporter: Jason Lowe If a non-I/O exception occurs while the DeletionService is executing a deletion task then it will be silently ignored. The exception bubbles up to the thread workers of the ScheduledThreadPoolExecutor, which simply attaches the throwable to the Future that was returned when the task was scheduled. However, the thread pool is used as a fire-and-forget pool, so nothing ever looks at the Future and therefore the exception is never logged. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
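A sketch of the afterExecute approach described above (the class name is hypothetical; this mirrors the extension pattern in the ThreadPoolExecutor javadoc): ScheduledThreadPoolExecutor wraps every submitted task in a Future, so a thrown exception is only attached to that Future. Overriding afterExecute and unwrapping the Future surfaces failures that would otherwise vanish in a fire-and-forget pool.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledThreadPoolExecutor;

public class LoggingScheduledExecutor extends ScheduledThreadPoolExecutor {
    // Exposed for illustration; real code would just log the throwable.
    public volatile Throwable lastFailure;

    public LoggingScheduledExecutor(int corePoolSize) {
        super(corePoolSize);
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        // Scheduled tasks report failures through their Future, not 't'.
        if (t == null && r instanceof Future<?> && ((Future<?>) r).isDone()) {
            try {
                ((Future<?>) r).get();       // rethrows the task's failure
            } catch (ExecutionException e) {
                t = e.getCause();
            } catch (CancellationException e) {
                // cancelled tasks are not failures
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        if (t != null) {
            lastFailure = t;
            System.err.println("Deletion task failed: " + t);
        }
    }
}
```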
[jira] [Updated] (YARN-2828) Enable auto refresh of web pages (using http parameter)
[ https://issues.apache.org/jira/browse/YARN-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vijay Bhat updated YARN-2828: - Attachment: YARN-2828.001.patch Enable auto refresh of web pages (using http parameter) --- Key: YARN-2828 URL: https://issues.apache.org/jira/browse/YARN-2828 Project: Hadoop YARN Issue Type: Improvement Reporter: Tim Robertson Assignee: Vijay Bhat Priority: Minor Attachments: YARN-2828.001.patch The MR1 Job Tracker had a useful HTTP parameter, e.g. refresh=3, that could be appended to URLs to enable a periodic page reload. This was very useful when developing mapreduce jobs, especially to watch counters changing. This is lost in the YARN interface. It could be implemented as a page element (e.g. a drop-down), but I'd recommend not cluttering the page further and simply bringing back the optional refresh HTTP param. It worked really nicely. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
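A hedged sketch of the parameter handling (hypothetical helper, not the actual YARN web code): validate the optional `refresh` query parameter and return the interval in seconds, or -1 when the page should not auto-reload.

```java
public class RefreshParam {
    // A page handler would call this with request.getParameter("refresh")
    // and, on a positive result, emit a standard "Refresh: <n>" header or a
    // <meta http-equiv="refresh" content="n"> tag.
    public static int parseSeconds(String raw) {
        if (raw == null) {
            return -1;
        }
        try {
            int n = Integer.parseInt(raw.trim());
            return n > 0 ? n : -1;   // reject zero/negative intervals
        } catch (NumberFormatException e) {
            return -1;               // ignore malformed values
        }
    }
}
```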