[jira] [Updated] (YARN-3131) YarnClientImpl should check FAILED and KILLED state in submitApplication
[ https://issues.apache.org/jira/browse/YARN-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated YARN-3131: --- Attachment: yarn_3131_v7.patch YarnClientImpl should check FAILED and KILLED state in submitApplication Key: YARN-3131 URL: https://issues.apache.org/jira/browse/YARN-3131 Project: Hadoop YARN Issue Type: Bug Reporter: Chang Li Assignee: Chang Li Attachments: yarn_3131_v1.patch, yarn_3131_v2.patch, yarn_3131_v3.patch, yarn_3131_v4.patch, yarn_3131_v5.patch, yarn_3131_v6.patch, yarn_3131_v7.patch Just ran into an issue where submitting a job to a non-existent queue raised no exception from YarnClient. Though the job was indeed submitted successfully and simply failed immediately afterwards, it would be better if YarnClient could handle the immediate-failure situation the way YARNRunner does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
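As a rough illustration of the check under discussion, here is a minimal sketch of a submission-side polling loop that fails fast once the application reaches FAILED or KILLED (the method and poll-interval parameter are illustrative, not the patch's actual code):
{code}
import java.io.IOException;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.YarnApplicationState;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.exceptions.YarnException;

/** Polls the RM until the app is accepted, failing fast on FAILED/KILLED. */
static void waitForAcceptance(YarnClient client, ApplicationId appId,
    long pollIntervalMs) throws YarnException, IOException, InterruptedException {
  while (true) {
    YarnApplicationState state =
        client.getApplicationReport(appId).getYarnApplicationState();
    if (state == YarnApplicationState.ACCEPTED
        || state == YarnApplicationState.RUNNING) {
      return;                      // submission took effect
    }
    if (state == YarnApplicationState.FAILED
        || state == YarnApplicationState.KILLED) {
      // Surface the immediate failure instead of returning silently.
      throw new YarnException("Application " + appId + " reached state "
          + state + " during submission");
    }
    Thread.sleep(pollIntervalMs);  // NEW/NEW_SAVING/SUBMITTED: keep waiting
  }
}
{code}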
[jira] [Commented] (YARN-2986) (Umbrella) Support hierarchical and unified scheduler configuration
[ https://issues.apache.org/jira/browse/YARN-2986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337372#comment-14337372 ] Wangda Tan commented on YARN-2986: -- Hi [~jianhe], I think it should only take effect when the user puts it in the correct place; we can either ignore it or throw an exception when the user puts it under an incorrect tag. Adding policy-properties is a double-edged sword: as you said, the user has to learn which configurations should be policy-properties. On the other hand, if we support different policies for different queues in the future (e.g. some queues use fair-based scheduling, some fifo-based, some priority-based), an admin can clearly understand, when a policy needs to be changed/updated, which fields need to be added/removed/updated. I prefer adding the policy-properties. Thanks, (Umbrella) Support hierarchical and unified scheduler configuration --- Key: YARN-2986 URL: https://issues.apache.org/jira/browse/YARN-2986 Project: Hadoop YARN Issue Type: Improvement Reporter: Vinod Kumar Vavilapalli Assignee: Wangda Tan Attachments: YARN-2986.1.patch Today's scheduler configuration is fragmented and non-intuitive, and needs to be improved. Details in comments. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3264) [Storage implementation] Create a POC only file based storage implementation for ATS writes
Vrushali C created YARN-3264: Summary: [Storage implementation] Create a POC only file based storage implementation for ATS writes Key: YARN-3264 URL: https://issues.apache.org/jira/browse/YARN-3264 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vrushali C Assignee: Vrushali C For the PoC, we need to create a backend implementation for file-based storage of entities. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2893) AMLauncher: sporadic job failures due to EOFException in readTokenStorageStream
[ https://issues.apache.org/jira/browse/YARN-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated YARN-2893: Attachment: YARN-2893.000.patch AMLauncher: sporadic job failures due to EOFException in readTokenStorageStream -- Key: YARN-2893 URL: https://issues.apache.org/jira/browse/YARN-2893 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.4.0 Reporter: Gera Shegalov Assignee: zhihai xu Attachments: YARN-2893.000.patch MapReduce jobs on our clusters experience sporadic failures due to corrupt tokens in the AM launch context. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
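For context, the failing deserialization happens along these lines; a minimal sketch of how the AM launch-context tokens are read back (a truncated or corrupt token buffer surfaces as an EOFException inside readTokenStorageStream):
{code}
import java.io.IOException;
import java.nio.ByteBuffer;
import org.apache.hadoop.io.DataInputByteBuffer;
import org.apache.hadoop.security.Credentials;

/** Deserializes the tokens carried in the AM launch context. */
static Credentials parseTokens(ByteBuffer tokens) throws IOException {
  DataInputByteBuffer dib = new DataInputByteBuffer();
  dib.reset(tokens);                        // wrap the launch-context bytes
  Credentials credentials = new Credentials();
  credentials.readTokenStorageStream(dib);  // EOFException if bytes are corrupt
  return credentials;
}
{code}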
[jira] [Updated] (YARN-3251) CapacityScheduler deadlock when computing absolute max avail capacity (short term fix for 2.6.1)
[ https://issues.apache.org/jira/browse/YARN-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-3251: - Issue Type: Bug (was: Sub-task) Parent: (was: YARN-3243) CapacityScheduler deadlock when computing absolute max avail capacity (short term fix for 2.6.1) Key: YARN-3251 URL: https://issues.apache.org/jira/browse/YARN-3251 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Jason Lowe Assignee: Wangda Tan Priority: Blocker Attachments: YARN-3251.1.patch, YARN-3251.trunk.1.patch The ResourceManager can deadlock in the CapacityScheduler when computing the absolute max available capacity for user limits and headroom. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
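The cycle here is a classic lock-ordering inversion between the leaf- and parent-queue monitors; an illustrative sketch of the two opposing lock orders (simplified stand-in classes, not the actual CapacityScheduler code):
{code}
// Thread A (allocation path): takes the parent lock, then the child lock.
// Thread B (headroom path): takes the child lock, then walks up to the parent.
class ParentQueue {
  synchronized void assignContainers(LeafQueue child) {
    child.assignContainers();            // holds parent lock, wants child lock
  }
  synchronized float getAbsoluteUsedCapacity() { return 0f; }
}

class LeafQueue {
  ParentQueue parent;
  synchronized void assignContainers() { /* allocate on this queue */ }
  synchronized float computeAbsoluteMaxAvailCapacity() {
    return parent.getAbsoluteUsedCapacity(); // holds child lock, wants parent lock
  }
}
{code}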
[jira] [Updated] (YARN-3251) CapacityScheduler deadlock when computing absolute max avail capacity (short term fix for 2.6.1)
[ https://issues.apache.org/jira/browse/YARN-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-3251: - Assignee: Craig Welch (was: Wangda Tan) CapacityScheduler deadlock when computing absolute max avail capacity (short term fix for 2.6.1) Key: YARN-3251 URL: https://issues.apache.org/jira/browse/YARN-3251 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Jason Lowe Assignee: Craig Welch Priority: Blocker Attachments: YARN-3251.1.patch, YARN-3251.trunk.1.patch The ResourceManager can deadlock in the CapacityScheduler when computing the absolute max available capacity for user limits and headroom. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3251) CapacityScheduler deadlock when computing absolute max avail capacity (short term fix for 2.6.1)
[ https://issues.apache.org/jira/browse/YARN-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337567#comment-14337567 ] Wangda Tan commented on YARN-3251: -- +1 for the suggestion, moved it out of YARN-3243, changed target version and also updated title. CapacityScheduler deadlock when computing absolute max avail capacity (short term fix for 2.6.1) Key: YARN-3251 URL: https://issues.apache.org/jira/browse/YARN-3251 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Jason Lowe Assignee: Wangda Tan Priority: Blocker Attachments: YARN-3251.1.patch, YARN-3251.trunk.1.patch The ResourceManager can deadlock in the CapacityScheduler when computing the absolute max available capacity for user limits and headroom. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3131) YarnClientImpl should check FAILED and KILLED state in submitApplication
[ https://issues.apache.org/jira/browse/YARN-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337585#comment-14337585 ] Vinod Kumar Vavilapalli commented on YARN-3131: --- Yeah, I missed the larger context in which this code works, i.e. app submissions. Tx for the confirmation [~jlowe]. [~lichangleo], I am good. I'll let either [~jlowe] or other committers commit it as they spent more time on the patch. YarnClientImpl should check FAILED and KILLED state in submitApplication Key: YARN-3131 URL: https://issues.apache.org/jira/browse/YARN-3131 Project: Hadoop YARN Issue Type: Bug Reporter: Chang Li Assignee: Chang Li Attachments: yarn_3131_v1.patch, yarn_3131_v2.patch, yarn_3131_v3.patch, yarn_3131_v4.patch, yarn_3131_v5.patch, yarn_3131_v6.patch, yarn_3131_v7.patch Just ran into an issue where submitting a job to a non-existent queue raised no exception from YarnClient. Though the job was indeed submitted successfully and simply failed immediately afterwards, it would be better if YarnClient could handle the immediate-failure situation the way YARNRunner does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3255) RM and NM main() should support generic options
[ https://issues.apache.org/jira/browse/YARN-3255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated YARN-3255: -- Attachment: YARN-3255-02.patch Added {{GenericOptionsParser}} to {{WebAppProxyServer}} and {{JobHistoryServer}}. Checked other builds on the findbugs issue. Looks like all of them report 5 new findbugs warnings. Something with the build I guess. TestAllocationFileLoaderService is passing locally. RM and NM main() should support generic options --- Key: YARN-3255 URL: https://issues.apache.org/jira/browse/YARN-3255 Project: Hadoop YARN Issue Type: Bug Components: nodemanager, resourcemanager Affects Versions: 2.5.0 Reporter: Konstantin Shvachko Assignee: Konstantin Shvachko Attachments: YARN-3255-01.patch, YARN-3255-02.patch Currently {{ResourceManager.main()}} and {{NodeManager.main()}} ignore generic options, like {{-conf}} and {{-fs}}. It would be good to have the ability to pass generic options in order to specify configuration files or the NameNode location, when the services start through {{main()}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
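For reference, the general pattern being added looks like this; a minimal sketch of wiring GenericOptionsParser into a service's main() (simplified, not the patch itself):
{code}
import org.apache.hadoop.util.GenericOptionsParser;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class ServiceMain {
  public static void main(String[] args) throws Exception {
    YarnConfiguration conf = new YarnConfiguration();
    // Consumes generic options such as -conf <file>, -fs <uri>, -D key=value,
    // applying them to conf before the service starts.
    String[] remaining = new GenericOptionsParser(conf, args).getRemainingArgs();
    // ... pass conf to the ResourceManager/NodeManager startup path ...
  }
}
{code}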
[jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile
[ https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337668#comment-14337668 ] Abin Shahab commented on YARN-3080: --- [~raviprakash], [~vinodkv], [~vvasudev] [~ywskycn] Please review. The DockerContainerExecutor could not write the right pid to container pidFile -- Key: YARN-3080 URL: https://issues.apache.org/jira/browse/YARN-3080 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.6.0 Reporter: Beckham007 Assignee: Abin Shahab Attachments: YARN-3080.patch, YARN-3080.patch, YARN-3080.patch The docker_container_executor_session.sh is like this:
{quote}
#!/usr/bin/env bash
echo `/usr/bin/docker inspect --format {{.State.Pid}} container_1421723685222_0008_01_02` > /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_02/container_1421723685222_0008_01_02.pid.tmp
/bin/mv -f /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_02/container_1421723685222_0008_01_02.pid.tmp /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_02/container_1421723685222_0008_01_02.pid
/usr/bin/docker run --rm --name container_1421723685222_0008_01_02 -e GAIA_HOST_IP=c162 -e GAIA_API_SERVER=10.6.207.226:8080 -e GAIA_CLUSTER_ID=shpc-nm_restart -e GAIA_QUEUE=root.tdwadmin -e GAIA_APP_NAME=test_nm_docker -e GAIA_INSTANCE_ID=1 -e GAIA_CONTAINER_ID=container_1421723685222_0008_01_02 --memory=32M --cpu-shares=1024 -v /data/nm_restart/hadoop-2.4.1/data/yarn/container-logs/application_1421723685222_0008/container_1421723685222_0008_01_02:/data/nm_restart/hadoop-2.4.1/data/yarn/container-logs/application_1421723685222_0008/container_1421723685222_0008_01_02 -v /data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_02:/data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_02 -P -e A=B --privileged=true docker.oa.com:8080/library/centos7 bash /data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_02/launch_container.sh
{quote}
The DockerContainerExecutor runs docker inspect before docker run, so docker inspect cannot get the right pid for the container, and signalContainer() and NM restart would fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3262) Surface application resource requests table
[ https://issues.apache.org/jira/browse/YARN-3262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-3262: -- Attachment: YARN-3262.1.patch Uploaded a patch to surface the outstanding resource requests on the UI Surface application resource requests table -- Key: YARN-3262 URL: https://issues.apache.org/jira/browse/YARN-3262 Project: Hadoop YARN Issue Type: Improvement Components: yarn Reporter: Jian He Assignee: Jian He Attachments: YARN-3262.1.patch, resource requests.png It would be useful to surface the resource requests table on the application web page to facilitate scheduling analysis and debugging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3254) HealthReport should include disk full information
[ https://issues.apache.org/jira/browse/YARN-3254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337364#comment-14337364 ] Hadoop QA commented on YARN-3254: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12700871/YARN-3254-002.patch against trunk revision caa42ad. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6742//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6742//console This message is automatically generated. HealthReport should include disk full information - Key: YARN-3254 URL: https://issues.apache.org/jira/browse/YARN-3254 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.6.0 Reporter: Akira AJISAKA Assignee: Akira AJISAKA Attachments: Screen Shot 2015-02-24 at 17.57.39.png, Screen Shot 2015-02-25 at 14.38.10.png, YARN-3254-001.patch, YARN-3254-002.patch When a NodeManager's local disk gets almost full, the NodeManager sends a health report to ResourceManager that local/log dir is bad and the message is displayed on ResourceManager Web UI. It's difficult for users to detect why the dir is bad. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3031) [Storage abstraction] Create backing storage write interface for ATS writers
[ https://issues.apache.org/jira/browse/YARN-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337393#comment-14337393 ] Vrushali C commented on YARN-3031: -- Yes, will update it in the next patch. [Storage abstraction] Create backing storage write interface for ATS writers Key: YARN-3031 URL: https://issues.apache.org/jira/browse/YARN-3031 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Vrushali C Attachments: Sequence_diagram_write_interaction.2.png, Sequence_diagram_write_interaction.png, YARN-3031.01.patch, YARN-3031.02.patch, YARN-3031.03.patch Per design in YARN-2928, come up with the interface for the ATS writer to write to various backing storages. The interface should be created to capture the right level of abstractions so that it will enable all backing storage implementations to implement it efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
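As a hedged sketch of the shape such a backing-storage write interface could take (the interface and method names here are illustrative assumptions, not the interface from the attached patches):
{code}
import java.io.IOException;

// TimelineEntity stands in for the next-gen entity type from YARN-2928.
public interface TimelineWriter {
  // Persist one entity in the context of a cluster and application;
  // implementations (file-based POC, HBase, ...) choose the layout.
  void write(String clusterId, String appId, TimelineEntity entity)
      throws IOException;

  // Push any buffered writes down to the backing storage.
  void flush() throws IOException;
}
{code}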
[jira] [Commented] (YARN-2928) Application Timeline Server (ATS) next gen: phase 1
[ https://issues.apache.org/jira/browse/YARN-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337419#comment-14337419 ] Gopal V commented on YARN-2928: --- The original discussion about ATS v1 drew inspiration from existing systems like rsyslog and scribe, which are simple systems that buffer/route/forward into a central store. Those mechanisms were very useful for duplicating higher-priority (and rare) events for immediate alerting/dashboards (errors/sec etc). Are there any plans to include intermediate routing/forwarding systems for ATS v2? The tail -f | grep firehose across a cluster is useful for avoiding scalability issues when looking for rare events in a distributed store. Being able to route something like a node-blacklisting event from an AppMaster to such a system would prevent the fault-checker systems from having to periodically produce irrelevant ATS traffic just to scrape through it. Application Timeline Server (ATS) next gen: phase 1 --- Key: YARN-2928 URL: https://issues.apache.org/jira/browse/YARN-2928 Project: Hadoop YARN Issue Type: New Feature Components: timelineserver Reporter: Sangjin Lee Assignee: Sangjin Lee Priority: Critical Attachments: ATSv2.rev1.pdf, ATSv2.rev2.pdf, Data model proposal v1.pdf We have the application timeline server implemented in yarn per YARN-1530 and YARN-321. Although it is a great feature, we have recognized several critical issues and features that need to be addressed. This JIRA proposes the design and implementation changes to address those. This is phase 1 of this effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3251) CapacityScheduler deadlock when computing absolute max avail capacity
[ https://issues.apache.org/jira/browse/YARN-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337529#comment-14337529 ] Vinod Kumar Vavilapalli commented on YARN-3251: --- Actually let's use this JIRA for 2.6.1 and YARN-3243 for the trunk fix. CapacityScheduler deadlock when computing absolute max avail capacity - Key: YARN-3251 URL: https://issues.apache.org/jira/browse/YARN-3251 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.6.0 Reporter: Jason Lowe Assignee: Wangda Tan Priority: Blocker Attachments: YARN-3251.1.patch, YARN-3251.trunk.1.patch The ResourceManager can deadlock in the CapacityScheduler when computing the absolute max available capacity for user limits and headroom. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3251) CapacityScheduler deadlock when computing absolute max avail capacity (short term fix for 2.6.1)
[ https://issues.apache.org/jira/browse/YARN-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-3251: - Summary: CapacityScheduler deadlock when computing absolute max avail capacity (short term fix for 2.6.1) (was: CapacityScheduler deadlock when computing absolute max avail capacity) CapacityScheduler deadlock when computing absolute max avail capacity (short term fix for 2.6.1) Key: YARN-3251 URL: https://issues.apache.org/jira/browse/YARN-3251 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.6.0 Reporter: Jason Lowe Assignee: Wangda Tan Priority: Blocker Attachments: YARN-3251.1.patch, YARN-3251.trunk.1.patch The ResourceManager can deadlock in the CapacityScheduler when computing the absolute max available capacity for user limits and headroom. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3265) CapacityScheduler deadlock when computing absolute max avail capacity (fix for trunk/branch-2)
[ https://issues.apache.org/jira/browse/YARN-3265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337577#comment-14337577 ] Vinod Kumar Vavilapalli commented on YARN-3265: --- Pasting comments I was going to paste on YARN-3251. ResourceUsage - Not sure why we need to do this, can we fix this? {code} +return getHeadroom(NL, Resources.none()); {code} - Should we instead call this headRoom to be something like a resourceLimit and make it a different object? - Also, instead of setting this object in the leaf-queue, the better model is to simply pass it down in the allocateContainer() API. ParentQueue - Can we improve the names in setHeadroomOfChild API? It's a complicated method. - When updateClusterResource() is called, do we really need the setHeadRoom call() after we start passing down the API? Testcase: Can you draw a simple tree of the queues for readability? It's very confusing that application-headroom gives the extra room apps have, but the one inside LeafQueue.QueueHeadroomInfo is overall allocation possible (including used resources). We should fix this - not caused by your patch though. CapacityScheduler deadlock when computing absolute max avail capacity (fix for trunk/branch-2) -- Key: YARN-3265 URL: https://issues.apache.org/jira/browse/YARN-3265 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Priority: Blocker Attachments: YARN-3265.1.patch This patch is trying to solve the same problem described in YARN-3251, but this is a longer term fix for trunk and branch-2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3254) HealthReport should include disk full information
[ https://issues.apache.org/jira/browse/YARN-3254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337366#comment-14337366 ] Hadoop QA commented on YARN-3254: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12700883/Screen%20Shot%202015-02-25%20at%2014.38.10.png against trunk revision caa42ad. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6745//console This message is automatically generated. HealthReport should include disk full information - Key: YARN-3254 URL: https://issues.apache.org/jira/browse/YARN-3254 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.6.0 Reporter: Akira AJISAKA Assignee: Akira AJISAKA Attachments: Screen Shot 2015-02-24 at 17.57.39.png, Screen Shot 2015-02-25 at 14.38.10.png, YARN-3254-001.patch, YARN-3254-002.patch When a NodeManager's local disk gets almost full, the NodeManager sends a health report to ResourceManager that local/log dir is bad and the message is displayed on ResourceManager Web UI. It's difficult for users to detect why the dir is bad. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3265) CapacityScheduler deadlock when computing absolute max avail capacity (fix for trunk/branch-2)
Wangda Tan created YARN-3265: Summary: CapacityScheduler deadlock when computing absolute max avail capacity (fix for trunk/branch-2) Key: YARN-3265 URL: https://issues.apache.org/jira/browse/YARN-3265 Project: Hadoop YARN Issue Type: Sub-task Reporter: Wangda Tan Assignee: Wangda Tan This patch is trying to solve the same problem described in YARN-3251, but this is a longer term fix for trunk and branch-2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-41) The RM should handle the graceful shutdown of the NM.
[ https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337592#comment-14337592 ] Vinod Kumar Vavilapalli commented on YARN-41: - [~djp], can you please respond to [~devaraj.k]'s comments? bq. It is not for decommissioning of NM and it is for handling 'yarn-daemon.sh stop nodemanager' and kill nmPid (not for 'kill -9 nmPid' i.e. abrupt kill) This is the use-case that I am interested in addressing here, more than a decommission through RM. The RM should handle the graceful shutdown of the NM. - Key: YARN-41 URL: https://issues.apache.org/jira/browse/YARN-41 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager, resourcemanager Reporter: Ravi Teja Ch N V Assignee: Devaraj K Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, YARN-41.patch Instead of waiting for the NM expiry, RM should remove and handle the NM, which is shutdown gracefully. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2423) TimelineClient should wrap all GET APIs to facilitate Java users
[ https://issues.apache.org/jira/browse/YARN-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337622#comment-14337622 ] Vinod Kumar Vavilapalli commented on YARN-2423: --- [~rkanter], I am just pointing out that existing users today directly use the raw TimelineClient APIs. Adding a new API for an older framework we are rewriting anyway seems backwards to me and an additional compatibility burden - unless we can write these new APIs in such a way that we know for sure we will support them in the YARN-2928 rewrite. [~vanzin], can Spark use the existing Timeline APIs that we already support and are trying to keep compatible for the MapReduce, Tez and Hive projects? The problem is that even if we implement the proposal in this JIRA, come YARN-2928, these new APIs may break, and so Spark will have to add shim layers in either case. TimelineClient should wrap all GET APIs to facilitate Java users Key: YARN-2423 URL: https://issues.apache.org/jira/browse/YARN-2423 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Robert Kanter Attachments: YARN-2423.004.patch, YARN-2423.005.patch, YARN-2423.006.patch, YARN-2423.007.patch, YARN-2423.patch, YARN-2423.patch, YARN-2423.patch TimelineClient provides the Java method to put timeline entities. It's also good to wrap over all GET APIs (both entity and domain), and deserialize the JSON response into Java POJO objects. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
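To make the discussion concrete, a purely hypothetical sketch of what such GET wrappers might look like (this reader API does not exist in TimelineClient; the class and method names are assumptions for illustration):
{code}
import java.io.IOException;
import org.apache.hadoop.yarn.api.records.timeline.TimelineDomain;
import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity;
import org.apache.hadoop.yarn.exceptions.YarnException;

// Hypothetical wrapper that would deserialize the JSON responses of the
// timeline GET endpoints into Java POJOs for client code.
public abstract class TimelineReaderClient {
  public abstract TimelineEntity getEntity(String entityType, String entityId)
      throws IOException, YarnException;

  public abstract TimelineDomain getDomain(String domainId)
      throws IOException, YarnException;
}
{code}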
[jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile
[ https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337641#comment-14337641 ] Hadoop QA commented on YARN-3080: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12700921/YARN-3080.patch against trunk revision d140d76. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6749//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6749//console This message is automatically generated. The DockerContainerExecutor could not write the right pid to container pidFile -- Key: YARN-3080 URL: https://issues.apache.org/jira/browse/YARN-3080 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.6.0 Reporter: Beckham007 Assignee: Abin Shahab Attachments: YARN-3080.patch, YARN-3080.patch, YARN-3080.patch The docker_container_executor_session.sh is like this:
{quote}
#!/usr/bin/env bash
echo `/usr/bin/docker inspect --format {{.State.Pid}} container_1421723685222_0008_01_02` > /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_02/container_1421723685222_0008_01_02.pid.tmp
/bin/mv -f /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_02/container_1421723685222_0008_01_02.pid.tmp /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_02/container_1421723685222_0008_01_02.pid
/usr/bin/docker run --rm --name container_1421723685222_0008_01_02 -e GAIA_HOST_IP=c162 -e GAIA_API_SERVER=10.6.207.226:8080 -e GAIA_CLUSTER_ID=shpc-nm_restart -e GAIA_QUEUE=root.tdwadmin -e GAIA_APP_NAME=test_nm_docker -e GAIA_INSTANCE_ID=1 -e GAIA_CONTAINER_ID=container_1421723685222_0008_01_02 --memory=32M --cpu-shares=1024 -v /data/nm_restart/hadoop-2.4.1/data/yarn/container-logs/application_1421723685222_0008/container_1421723685222_0008_01_02:/data/nm_restart/hadoop-2.4.1/data/yarn/container-logs/application_1421723685222_0008/container_1421723685222_0008_01_02 -v /data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_02:/data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_02 -P -e A=B --privileged=true docker.oa.com:8080/library/centos7 bash /data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_02/launch_container.sh
{quote}
The DockerContainerExecutor runs docker inspect before docker run, so docker inspect cannot get the right pid for the container, and signalContainer() and NM restart would fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3125) [Event producers] Change distributed shell to use new timeline service
[ https://issues.apache.org/jira/browse/YARN-3125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3125: - Attachment: YARN-3125v3.patch Thanks [~zjshen] for the review and comments! Addressed both of your comments in the v3 patch. [Event producers] Change distributed shell to use new timeline service -- Key: YARN-3125 URL: https://issues.apache.org/jira/browse/YARN-3125 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Zhijie Shen Assignee: Junping Du Attachments: YARN-3125.patch, YARN-3125v2.patch, YARN-3125v3.patch We can start with changing distributed shell to use the new timeline service once the framework is completed; that way we can quickly verify the next gen is working fine end-to-end. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3087) [Aggregator implementation] the REST server (web server) for per-node aggregator does not work if it runs inside node manager
[ https://issues.apache.org/jira/browse/YARN-3087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337921#comment-14337921 ] Hadoop QA commented on YARN-3087: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12700972/YARN-3087-022515.patch against trunk revision 71385f9. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6751//console This message is automatically generated. [Aggregator implementation] the REST server (web server) for per-node aggregator does not work if it runs inside node manager - Key: YARN-3087 URL: https://issues.apache.org/jira/browse/YARN-3087 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Li Lu Attachments: YARN-3087-022315.patch, YARN-3087-022515.patch This is related to YARN-3030. YARN-3030 sets up a per-node timeline aggregator and the associated REST server. It runs fine as a standalone process, but does not work if it runs inside the node manager due to possible collisions of servlet mapping. Exception: {noformat} org.apache.hadoop.yarn.webapp.WebAppException: /v2/timeline: controller for v2 not found at org.apache.hadoop.yarn.webapp.Router.resolveDefault(Router.java:232) at org.apache.hadoop.yarn.webapp.Router.resolve(Router.java:140) at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:134) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263) at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178) at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795) ... {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3251) CapacityScheduler deadlock when computing absolute max avail capacity (short term fix for 2.6.1)
[ https://issues.apache.org/jira/browse/YARN-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Craig Welch updated YARN-3251: -- Attachment: YARN-3251.2-6-0.2.patch Patch against branch-2.6.0 CapacityScheduler deadlock when computing absolute max avail capacity (short term fix for 2.6.1) Key: YARN-3251 URL: https://issues.apache.org/jira/browse/YARN-3251 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Jason Lowe Assignee: Craig Welch Priority: Blocker Attachments: YARN-3251.1.patch, YARN-3251.2-6-0.2.patch The ResourceManager can deadlock in the CapacityScheduler when computing the absolute max available capacity for user limits and headroom. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3251) CapacityScheduler deadlock when computing absolute max avail capacity (short term fix for 2.6.1)
[ https://issues.apache.org/jira/browse/YARN-3251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337962#comment-14337962 ] Craig Welch commented on YARN-3251: --- bq. 1) Since the target of your patch is to make a quick fix for old version, it's better to create a patch in branch-2.6 Done. bq. And patch I'm working on now will remove the CSQueueUtils.computeMaxAvailResource, so it's no need to add a intermediate fix in branch-2. I suppose that depends on whether anyone needs a trunk version of the patch before the other changes land; if someone asks for it I could quickly update the original patch to provide it. bq. 2) I think CSQueueUtils.getAbsoluteMaxAvailCapacity doesn't hold child/parent's lock together, maybe we don't need to change that, could you confirm? It doesn't; the change there was to ensure consistency for the multiple values read from the queue, since previously that happened inside a lock which guaranteed consistency, and now it doesn't. However, there's no need to lock on the parent, so I removed that. bq. 3) Maybe we don't need getter/setter of absoluteMaxAvailCapacity in queue, a volatile float is enough? Yes, that should be safe; done. CapacityScheduler deadlock when computing absolute max avail capacity (short term fix for 2.6.1) Key: YARN-3251 URL: https://issues.apache.org/jira/browse/YARN-3251 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Jason Lowe Assignee: Craig Welch Priority: Blocker Attachments: YARN-3251.1.patch, YARN-3251.2-6-0.2.patch The ResourceManager can deadlock in the CapacityScheduler when computing the absolute max available capacity for user limits and headroom. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
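A minimal sketch of the volatile-field suggestion in the last point (the class and field names are illustrative): a single writer publishes the recomputed value, and readers need no lock, since Java guarantees atomic, visible reads and writes of a volatile float.
{code}
class LeafQueueCapacityInfo {
  // Written by the update path, read lock-free by the headroom path.
  private volatile float absoluteMaxAvailCapacity;

  void update(float recomputed) {
    absoluteMaxAvailCapacity = recomputed;  // safe publication via volatile
  }

  float get() {
    return absoluteMaxAvailCapacity;        // always sees the latest write
  }
}
{code}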
[jira] [Assigned] (YARN-3087) [Aggregator implementation] the REST server (web server) for per-node aggregator does not work if it runs inside node manager
[ https://issues.apache.org/jira/browse/YARN-3087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu reassigned YARN-3087: --- Assignee: Li Lu (was: Devaraj K) [Aggregator implementation] the REST server (web server) for per-node aggregator does not work if it runs inside node manager - Key: YARN-3087 URL: https://issues.apache.org/jira/browse/YARN-3087 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Li Lu Attachments: YARN-3087-022315.patch, YARN-3087-022515.patch This is related to YARN-3030. YARN-3030 sets up a per-node timeline aggregator and the associated REST server. It runs fine as a standalone process, but does not work if it runs inside the node manager due to possible collisions of servlet mapping. Exception: {noformat} org.apache.hadoop.yarn.webapp.WebAppException: /v2/timeline: controller for v2 not found at org.apache.hadoop.yarn.webapp.Router.resolveDefault(Router.java:232) at org.apache.hadoop.yarn.webapp.Router.resolve(Router.java:140) at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:134) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263) at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178) at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795) ... {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3087) [Aggregator implementation] the REST server (web server) for per-node aggregator does not work if it runs inside node manager
[ https://issues.apache.org/jira/browse/YARN-3087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337914#comment-14337914 ] Li Lu commented on YARN-3087: - Assigning this to myself since it is blocking our prototype, and I've got a patch for this JIRA. [Aggregator implementation] the REST server (web server) for per-node aggregator does not work if it runs inside node manager - Key: YARN-3087 URL: https://issues.apache.org/jira/browse/YARN-3087 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Li Lu Attachments: YARN-3087-022315.patch, YARN-3087-022515.patch This is related to YARN-3030. YARN-3030 sets up a per-node timeline aggregator and the associated REST server. It runs fine as a standalone process, but does not work if it runs inside the node manager due to possible collisions of servlet mapping. Exception: {noformat} org.apache.hadoop.yarn.webapp.WebAppException: /v2/timeline: controller for v2 not found at org.apache.hadoop.yarn.webapp.Router.resolveDefault(Router.java:232) at org.apache.hadoop.yarn.webapp.Router.resolve(Router.java:140) at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:134) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263) at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178) at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795) ... {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3087) [Aggregator implementation] the REST server (web server) for per-node aggregator does not work if it runs inside node manager
[ https://issues.apache.org/jira/browse/YARN-3087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu updated YARN-3087: Attachment: YARN-3087-022515.patch I've just updated my patch. It is rebased onto the latest YARN-2928 branch. On top of all the changes from the previous version, I added support functions in our object-model classes for REST calls. I also added a static user web filter to the newly created web server. This patch passes TestTimelineServiceClientIntegration (newly added in YARN-3240). [Aggregator implementation] the REST server (web server) for per-node aggregator does not work if it runs inside node manager - Key: YARN-3087 URL: https://issues.apache.org/jira/browse/YARN-3087 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Devaraj K Attachments: YARN-3087-022315.patch, YARN-3087-022515.patch This is related to YARN-3030. YARN-3030 sets up a per-node timeline aggregator and the associated REST server. It runs fine as a standalone process, but does not work if it runs inside the node manager due to possible collisions of servlet mapping. Exception: {noformat} org.apache.hadoop.yarn.webapp.WebAppException: /v2/timeline: controller for v2 not found at org.apache.hadoop.yarn.webapp.Router.resolveDefault(Router.java:232) at org.apache.hadoop.yarn.webapp.Router.resolve(Router.java:140) at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:134) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263) at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178) at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795) ... {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2902) Killing a container that is localizing can orphan resources in the DOWNLOADING state
[ https://issues.apache.org/jira/browse/YARN-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337925#comment-14337925 ] Sangjin Lee commented on YARN-2902: --- We definitely see this (usually coupled with localization failures) with PUBLIC resources. I don't have easy access to the scenarios at the moment, but will be able to provide it next week. Killing a container that is localizing can orphan resources in the DOWNLOADING state Key: YARN-2902 URL: https://issues.apache.org/jira/browse/YARN-2902 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.5.0 Reporter: Jason Lowe Assignee: Varun Saxena Fix For: 2.7.0 Attachments: YARN-2902.002.patch, YARN-2902.patch If a container is in the process of localizing when it is stopped/killed then resources are left in the DOWNLOADING state. If no other container comes along and requests these resources they linger around with no reference counts but aren't cleaned up during normal cache cleanup scans since it will never delete resources in the DOWNLOADING state even if their reference count is zero. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
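A hedged sketch of the cleanup direction being discussed, assuming the NodeManager's LocalResourcesTracker/LocalizedResource/ResourceState types (the condition shown is an assumption for illustration, not the committed fix): treat zero-reference resources stuck in DOWNLOADING as eligible for removal during the cache cleanup scan.
{code}
// Inside a cache-cleanup scan over one resources tracker (illustrative).
for (LocalizedResource rsrc : tracker) {
  boolean unreferenced = rsrc.getRefCount() <= 0;
  boolean orphanedDownload =
      unreferenced && rsrc.getState() == ResourceState.DOWNLOADING;
  if (unreferenced
      && (rsrc.getState() == ResourceState.LOCALIZED || orphanedDownload)) {
    // Drop the bookkeeping and delete any partially downloaded bytes too.
    tracker.remove(rsrc, delService);
  }
}
{code}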
[jira] [Updated] (YARN-3087) [Aggregator implementation] the REST server (web server) for per-node aggregator does not work if it runs inside node manager
[ https://issues.apache.org/jira/browse/YARN-3087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu updated YARN-3087: Target Version/s: YARN-2928 Fix Version/s: YARN-2928 [Aggregator implementation] the REST server (web server) for per-node aggregator does not work if it runs inside node manager - Key: YARN-3087 URL: https://issues.apache.org/jira/browse/YARN-3087 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Li Lu Fix For: YARN-2928 Attachments: YARN-3087-022315.patch, YARN-3087-022515.patch This is related to YARN-3030. YARN-3030 sets up a per-node timeline aggregator and the associated REST server. It runs fine as a standalone process, but does not work if it runs inside the node manager due to possible collisions of servlet mapping. Exception: {noformat} org.apache.hadoop.yarn.webapp.WebAppException: /v2/timeline: controller for v2 not found at org.apache.hadoop.yarn.webapp.Router.resolveDefault(Router.java:232) at org.apache.hadoop.yarn.webapp.Router.resolve(Router.java:140) at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:134) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263) at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178) at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795) ... {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-3261) rewrite resourcemanager restart doc to remove roadmap bits
[ https://issues.apache.org/jira/browse/YARN-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gururaj Shetty reassigned YARN-3261: Assignee: Gururaj Shetty rewrite resourcemanager restart doc to remove roadmap bits --- Key: YARN-3261 URL: https://issues.apache.org/jira/browse/YARN-3261 Project: Hadoop YARN Issue Type: Bug Reporter: Allen Wittenauer Assignee: Gururaj Shetty Another mixture of roadmap and instruction manual that seems to be ever present in a lot of the recently written documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3217) Remove httpclient dependency from hadoop-yarn-server-web-proxy
[ https://issues.apache.org/jira/browse/YARN-3217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14337995#comment-14337995 ] Hadoop QA commented on YARN-3217: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12700976/YARN-3217-004.patch against trunk revision 71385f9. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6752//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6752//console This message is automatically generated. Remove httpclient dependency from hadoop-yarn-server-web-proxy -- Key: YARN-3217 URL: https://issues.apache.org/jira/browse/YARN-3217 Project: Hadoop YARN Issue Type: Task Reporter: Akira AJISAKA Assignee: Brahma Reddy Battula Attachments: YARN-3217-002.patch, YARN-3217-003.patch, YARN-3217-003.patch, YARN-3217-004.patch, YARN-3217.patch Sub-task of HADOOP-10105. Remove httpclient dependency from WebAppProxyServlet.java. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3217) Remove httpclient dependency from hadoop-yarn-server-web-proxy
[ https://issues.apache.org/jira/browse/YARN-3217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula updated YARN-3217: --- Attachment: YARN-3217-004.patch rebased the patch Remove httpclient dependency from hadoop-yarn-server-web-proxy -- Key: YARN-3217 URL: https://issues.apache.org/jira/browse/YARN-3217 Project: Hadoop YARN Issue Type: Task Reporter: Akira AJISAKA Assignee: Brahma Reddy Battula Attachments: YARN-3217-002.patch, YARN-3217-003.patch, YARN-3217-003.patch, YARN-3217-004.patch, YARN-3217.patch Sub-task of HADOOP-10105. Remove httpclient dependency from WebAppProxyServlet.java. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (YARN-3240) [Data Mode] Implement client API to put generic entities
[ https://issues.apache.org/jira/browse/YARN-3240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du resolved YARN-3240. -- Resolution: Fixed Fix Version/s: YARN-2928 Hadoop Flags: Reviewed I have committed the v3 patch to the YARN-2928 branch. Thanks [~zjshen] for the patch and [~gtCarrera9] for the review! [Data Mode] Implement client API to put generic entities Key: YARN-3240 URL: https://issues.apache.org/jira/browse/YARN-3240 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Zhijie Shen Assignee: Zhijie Shen Fix For: YARN-2928 Attachments: YARN-3240.1.patch, YARN-3240.2.patch, YARN-3240.3.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3168) Convert site documentation from apt to markdown
[ https://issues.apache.org/jira/browse/YARN-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gururaj Shetty updated YARN-3168: - Attachment: YARN-3168.20150225.2.patch Convert site documentation from apt to markdown --- Key: YARN-3168 URL: https://issues.apache.org/jira/browse/YARN-3168 Project: Hadoop YARN Issue Type: Improvement Components: documentation Affects Versions: 2.7.0 Reporter: Allen Wittenauer Assignee: Gururaj Shetty Attachments: YARN-3168-00.patch, YARN-3168.20150224.1.patch, YARN-3168.20150225.2.patch YARN analog to HADOOP-11495 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3256) TestClientToAMToken#testClientTokenRace is not running against all Schedulers even when using ParameterizedSchedulerTestBase
[ https://issues.apache.org/jira/browse/YARN-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336333#comment-14336333 ] Rohith commented on YARN-3256: -- Good catch!! +1 (non-binding), lgtm TestClientToAMToken#testClientTokenRace is not running against all Schedulers even when using ParameterizedSchedulerTestBase Key: YARN-3256 URL: https://issues.apache.org/jira/browse/YARN-3256 Project: Hadoop YARN Issue Type: Bug Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-3256.001.patch The test testClientTokenRace was not using the base class conf, causing it to run twice against the same default-configured scheduler. All tests deriving from ParameterizedSchedulerTestBase should use the conf created in the base class instead of newing one up inside the test and hiding the member. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3168) Convert site documentation from apt to markdown
[ https://issues.apache.org/jira/browse/YARN-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336332#comment-14336332 ] Gururaj Shetty commented on YARN-3168: -- [~Naganarasimha Garla] Addressed your comments and uploaded the latest patch. Please review. Convert site documentation from apt to markdown --- Key: YARN-3168 URL: https://issues.apache.org/jira/browse/YARN-3168 Project: Hadoop YARN Issue Type: Improvement Components: documentation Affects Versions: 2.7.0 Reporter: Allen Wittenauer Assignee: Gururaj Shetty Attachments: YARN-3168-00.patch, YARN-3168.20150224.1.patch, YARN-3168.20150225.2.patch YARN analog to HADOOP-11495 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3257) FairScheduler: MaxAm may be set too low preventing apps from starting
Anubhav Dhoot created YARN-3257: --- Summary: FairScheduler: MaxAm may be set too low preventing apps from starting Key: YARN-3257 URL: https://issues.apache.org/jira/browse/YARN-3257 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Reporter: Anubhav Dhoot In YARN-2637, CapacityScheduler#LeafQueue does not enforce the max AM share if the limit would prevent the first application from starting. This would be good to add to FSLeafQueue as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3256) TestClientToAMToken#testClientTokenRace is not running against all Schedulers even when using ParameterizedSchedulerTestBase
Anubhav Dhoot created YARN-3256: --- Summary: TestClientToAMToken#testClientTokenRace is not running against all Schedulers even when using ParameterizedSchedulerTestBase Key: YARN-3256 URL: https://issues.apache.org/jira/browse/YARN-3256 Project: Hadoop YARN Issue Type: Bug Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot The test testClientTokenRace was not using the base class conf, causing it to run twice against the same default-configured scheduler. All tests deriving from ParameterizedSchedulerTestBase should use the conf created in the base class instead of newing one up inside the test and hiding the member. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3256) TestClientToAMToken#testClientTokenRace is not running against all Schedulers even when using ParameterizedSchedulerTestBase
[ https://issues.apache.org/jira/browse/YARN-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-3256: Attachment: YARN-3256.001.patch Fix that removes the local conf object hiding the base class conf. This allows the test to use the right conf for the matching scheduler based on ParameterizedSchedulerTestBase. TestClientToAMToken#testClientTokenRace is not running against all Schedulers even when using ParameterizedSchedulerTestBase Key: YARN-3256 URL: https://issues.apache.org/jira/browse/YARN-3256 Project: Hadoop YARN Issue Type: Bug Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-3256.001.patch The test testClientTokenRace was not using the base class conf, causing it to run twice against the same default-configured scheduler. All tests deriving from ParameterizedSchedulerTestBase should use the conf created in the base class instead of newing one up inside the test and hiding the member. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
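A minimal sketch of the fix's shape, assuming the base class exposes its configuration through a getConf() accessor (the accessor name and the elided settings are assumptions for illustration):
{code}
import org.apache.hadoop.conf.Configuration;
import org.junit.Test;

public class TestClientToAMTokens extends ParameterizedSchedulerTestBase {
  @Test
  public void testClientTokenRace() throws Exception {
    // Before: a local `new YarnConfiguration()` hid the base-class conf,
    // so every parameterized run exercised only the default scheduler.
    Configuration conf = getConf();  // conf set up for this run's scheduler
    // ... apply the test-specific settings to conf and run the token race ...
  }
}
{code}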
[jira] [Updated] (YARN-3249) Add the kill application to the Resource Manager Web UI
[ https://issues.apache.org/jira/browse/YARN-3249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi Ozawa updated YARN-3249: - Attachment: killapp-failed.log [~ryu_kobayashi] thank you for updating. I deployed a pseudo-distributed Hadoop cluster with your patch, but it cannot kill the application on my local setup. After pushing the kill-app button, I got a 50x error. I attached a log from when I hit the problem. Could you check it? Add the kill application to the Resource Manager Web UI --- Key: YARN-3249 URL: https://issues.apache.org/jira/browse/YARN-3249 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.6.0, 2.7.0 Reporter: Ryu Kobayashi Assignee: Ryu Kobayashi Priority: Minor Attachments: YARN-3249.2.patch, YARN-3249.2.patch, YARN-3249.patch, killapp-failed.log, screenshot.png We want to be able to kill an application from the web UI, similar to the JobTracker web UI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3249) Add the kill application to the Resource Manager Web UI
[ https://issues.apache.org/jira/browse/YARN-3249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336314#comment-14336314 ] Hadoop QA commented on YARN-3249: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12700711/killapp-failed.log against trunk revision ad8ed3e. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6730//console This message is automatically generated. Add the kill application to the Resource Manager Web UI --- Key: YARN-3249 URL: https://issues.apache.org/jira/browse/YARN-3249 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.6.0, 2.7.0 Reporter: Ryu Kobayashi Assignee: Ryu Kobayashi Priority: Minor Attachments: YARN-3249.2.patch, YARN-3249.2.patch, YARN-3249.patch, killapp-failed.log, screenshot.png We want to be able to kill an application from the web UI, similar to the JobTracker web UI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3217) Remove httpclient dependency from hadoop-yarn-server-web-proxy
[ https://issues.apache.org/jira/browse/YARN-3217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336183#comment-14336183 ] Hadoop QA commented on YARN-3217: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12700687/YARN-3217-003.patch against trunk revision 6cbd9f1. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 javac{color}. The applied patch generated 1151 javac compiler warnings (more than the trunk's current 207 warnings). {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 47 warning messages. See https://builds.apache.org/job/PreCommit-YARN-Build/6725//artifact/patchprocess/diffJavadocWarnings.txt for details. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The following test timeouts occurred in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6725//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6725//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Javac warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6725//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6725//console This message is automatically generated. Remove httpclient dependency from hadoop-yarn-server-web-proxy -- Key: YARN-3217 URL: https://issues.apache.org/jira/browse/YARN-3217 Project: Hadoop YARN Issue Type: Task Reporter: Akira AJISAKA Assignee: Brahma Reddy Battula Attachments: YARN-3217-002.patch, YARN-3217-003.patch, YARN-3217.patch Sub-task of HADOOP-10105. Remove httpclient dependency from WebAppProxyServlet.java. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3217) Remove httpclient dependency from hadoop-yarn-server-web-proxy
[ https://issues.apache.org/jira/browse/YARN-3217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi Ozawa updated YARN-3217: - Attachment: YARN-3217-003.patch The javac warnings look strange - let me submit the same patch again. Remove httpclient dependency from hadoop-yarn-server-web-proxy -- Key: YARN-3217 URL: https://issues.apache.org/jira/browse/YARN-3217 Project: Hadoop YARN Issue Type: Task Reporter: Akira AJISAKA Assignee: Brahma Reddy Battula Attachments: YARN-3217-002.patch, YARN-3217-003.patch, YARN-3217-003.patch, YARN-3217.patch Sub-task of HADOOP-10105. Remove httpclient dependency from WebAppProxyServlet.java. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3217) Remove httpclient dependency from hadoop-yarn-server-web-proxy
[ https://issues.apache.org/jira/browse/YARN-3217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336197#comment-14336197 ] Brahma Reddy Battula commented on YARN-3217: The {{findbugs}} warnings, {{javadoc}} warnings and {{test failures}} are unrelated to this patch. Remove httpclient dependency from hadoop-yarn-server-web-proxy -- Key: YARN-3217 URL: https://issues.apache.org/jira/browse/YARN-3217 Project: Hadoop YARN Issue Type: Task Reporter: Akira AJISAKA Assignee: Brahma Reddy Battula Attachments: YARN-3217-002.patch, YARN-3217-003.patch, YARN-3217-003.patch, YARN-3217.patch Sub-task of HADOOP-10105. Remove httpclient dependency from WebAppProxyServlet.java. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3249) Add the kill application to the Resource Manager Web UI
[ https://issues.apache.org/jira/browse/YARN-3249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336212#comment-14336212 ] Hadoop QA commented on YARN-3249: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12700695/YARN-3249.2.patch against trunk revision 6cbd9f1. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6727//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6727//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6727//console This message is automatically generated. Add the kill application to the Resource Manager Web UI --- Key: YARN-3249 URL: https://issues.apache.org/jira/browse/YARN-3249 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.6.0, 2.7.0 Reporter: Ryu Kobayashi Assignee: Ryu Kobayashi Priority: Minor Attachments: YARN-3249.2.patch, YARN-3249.2.patch, YARN-3249.patch, screenshot.png We want to be able to kill an application from the Web UI, similar to the JobTracker. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3217) Remove httpclient dependency from hadoop-yarn-server-web-proxy
[ https://issues.apache.org/jira/browse/YARN-3217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336246#comment-14336246 ] Hadoop QA commented on YARN-3217: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12700701/YARN-3217-003.patch against trunk revision ad8ed3e. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6728//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6728//console This message is automatically generated. Remove httpclient dependency from hadoop-yarn-server-web-proxy -- Key: YARN-3217 URL: https://issues.apache.org/jira/browse/YARN-3217 Project: Hadoop YARN Issue Type: Task Reporter: Akira AJISAKA Assignee: Brahma Reddy Battula Attachments: YARN-3217-002.patch, YARN-3217-003.patch, YARN-3217-003.patch, YARN-3217.patch Sub-task of HADOOP-10105. Remove httpclient dependency from WebAppProxyServlet.java. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3255) RM and NM main() should support generic options
[ https://issues.apache.org/jira/browse/YARN-3255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336204#comment-14336204 ] Hadoop QA commented on YARN-3255: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12700692/YARN-3255-01.patch against trunk revision 6cbd9f1. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestAllocationFileLoaderService Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6726//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6726//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6726//console This message is automatically generated. RM and NM main() should support generic options --- Key: YARN-3255 URL: https://issues.apache.org/jira/browse/YARN-3255 Project: Hadoop YARN Issue Type: Bug Components: nodemanager, resourcemanager Affects Versions: 2.5.0 Reporter: Konstantin Shvachko Attachments: YARN-3255-01.patch Currently {{ResourceManager.main()}} and {{NodeManager.main()}} ignore generic options, like {{-conf}} and {{-fs}}. It would be good to have the ability to pass generic options in order to specify configuration files or the NameNode location, when the services start through {{main()}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
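For reference, a minimal sketch of what honoring generic options in a service main() could look like, using GenericOptionsParser from hadoop-common; the ServiceMain class and startService() hook are placeholders, not the actual RM/NM bootstrap:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.GenericOptionsParser;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class ServiceMain {
  public static void main(String[] args) throws Exception {
    Configuration conf = new YarnConfiguration();
    // GenericOptionsParser loads -conf files and applies -fs/-D overrides,
    // then hands back whatever arguments remain for the service itself.
    GenericOptionsParser parser = new GenericOptionsParser(conf, args);
    startService(conf, parser.getRemainingArgs());
  }

  private static void startService(Configuration conf, String[] args) {
    // Placeholder for the existing ResourceManager/NodeManager startup path.
  }
}
{code}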
[jira] [Commented] (YARN-3039) [Aggregator wireup] Implement ATS writer service discovery
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336219#comment-14336219 ] Robert Kanter commented on YARN-3039: - Sorry for not commenting earlier. Thanks for taking this up [~djp]. Not using YARN-913 is fine if it isn't going to make sense here. I haven't looked too closely at it either; it just sounded like it might be helpful here. One comment on the patch: - Given that a particular NM is only interested in the applications running on it, is there some way to have it receive the aggregator info for only those apps? This would decrease the amount of throw-away data that gets sent. Also, can you update the design doc? Looking at the patch, it seems like some things have changed (e.g., it's using protobufs instead of REST, which I think makes more sense here anyway). [Aggregator wireup] Implement ATS writer service discovery -- Key: YARN-3039 URL: https://issues.apache.org/jira/browse/YARN-3039 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Junping Du Attachments: Service Binding for applicationaggregator of ATS (draft).pdf, YARN-3039-no-test.patch Per the design in YARN-2928, implement ATS writer service discovery. This is essential for off-node clients to send writes to the right ATS writer. This should also handle the case of AM failures. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-3257) FairScheduler: MaxAm may be set too low preventing apps from starting
[ https://issues.apache.org/jira/browse/YARN-3257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot reassigned YARN-3257: --- Assignee: Anubhav Dhoot FairScheduler: MaxAm may be set too low preventing apps from starting - Key: YARN-3257 URL: https://issues.apache.org/jira/browse/YARN-3257 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot In YARN-2637, CapacityScheduler#LeafQueue does not enforce the max AM share if the limit would prevent the first application from starting. It would be good to add the same behavior to FSLeafQueue as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3257) FairScheduler: MaxAm may be set too low preventing apps from starting
[ https://issues.apache.org/jira/browse/YARN-3257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-3257: Attachment: YARN-3257.001.patch Patch demonstrating the behavior change. FairScheduler: MaxAm may be set too low preventing apps from starting - Key: YARN-3257 URL: https://issues.apache.org/jira/browse/YARN-3257 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-3257.001.patch In YARN-2637, CapacityScheduler#LeafQueue does not enforce the max AM share if the limit would prevent the first application from starting. It would be good to add the same behavior to FSLeafQueue as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
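A hedged sketch of the behavior the JIRA asks for; the helper class, field names, and canRunAppAM signature below are illustrative stand-ins for FSLeafQueue state, not the attached patch:

{code}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.Resources;

// Illustrative stand-in for FSLeafQueue's AM-share bookkeeping.
class AmShareCheck {
  Resource fairShare;        // the queue's fair share
  double maxAMShare;         // configured maxAMShare fraction
  Resource amResourceUsage;  // resources already used by running AMs
  int numActiveApps;         // applications currently running in the queue

  boolean canRunAppAM(Resource amResource) {
    // First application: admit its AM even if maxAMShare would reject it,
    // mirroring what YARN-2637 did for CapacityScheduler's LeafQueue.
    if (numActiveApps == 0) {
      return true;
    }
    Resource maxAMResource = Resources.multiply(fairShare, maxAMShare);
    Resource ifRunAMResource = Resources.add(amResourceUsage, amResource);
    return Resources.fitsIn(ifRunAMResource, maxAMResource);
  }
}
{code}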
[jira] [Commented] (YARN-3256) TestClientToAMToken#testClientTokenRace is not running against all Schedulers even when using ParameterizedSchedulerTestBase
[ https://issues.apache.org/jira/browse/YARN-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336370#comment-14336370 ] Hadoop QA commented on YARN-3256: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12700710/YARN-3256.001.patch against trunk revision ad8ed3e. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6729//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6729//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6729//console This message is automatically generated. TestClientToAMToken#testClientTokenRace is not running against all Schedulers even when using ParameterizedSchedulerTestBase Key: YARN-3256 URL: https://issues.apache.org/jira/browse/YARN-3256 Project: Hadoop YARN Issue Type: Bug Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-3256.001.patch The test testClientTokenRace was not using the base-class conf, causing it to run twice against the same scheduler configured by default. All tests deriving from ParameterizedSchedulerTestBase should use the conf created in the base class instead of creating a new one inside the test and hiding the base-class member. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3249) Add the kill application to the Resource Manager Web UI
[ https://issues.apache.org/jira/browse/YARN-3249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryu Kobayashi updated YARN-3249: Attachment: YARN-3249.3.patch Add the kill application to the Resource Manager Web UI --- Key: YARN-3249 URL: https://issues.apache.org/jira/browse/YARN-3249 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.6.0, 2.7.0 Reporter: Ryu Kobayashi Assignee: Ryu Kobayashi Priority: Minor Attachments: YARN-3249.2.patch, YARN-3249.2.patch, YARN-3249.3.patch, YARN-3249.patch, killapp-failed.log, screenshot.png We want to be able to kill an application from the Web UI, similar to the JobTracker. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3249) Add the kill application to the Resource Manager Web UI
[ https://issues.apache.org/jira/browse/YARN-3249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336375#comment-14336375 ] Ryu Kobayashi commented on YARN-3249: - [~ozawa] Got it. I've attached an updated patch. Add the kill application to the Resource Manager Web UI --- Key: YARN-3249 URL: https://issues.apache.org/jira/browse/YARN-3249 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.6.0, 2.7.0 Reporter: Ryu Kobayashi Assignee: Ryu Kobayashi Priority: Minor Attachments: YARN-3249.2.patch, YARN-3249.2.patch, YARN-3249.3.patch, YARN-3249.patch, killapp-failed.log, screenshot.png We want to be able to kill an application from the Web UI, similar to the JobTracker. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2980) Move health check script related functionality to hadoop-common
[ https://issues.apache.org/jira/browse/YARN-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336382#comment-14336382 ] Hudson commented on YARN-2980: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #115 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/115/]) YARN-2980. Move health check script related functionality to hadoop-common (Varun Saxena via aw) (aw: rev d4ac6822e1c5dfac504ced48f10ab57a55b49e93) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestNMWebServer.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestNMWebServicesContainers.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestNMWebServicesApps.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestEventFlow.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeHealthCheckerService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerManagerTest.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/NodeHealthScriptRunner.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeHealthScriptRunner.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeHealthService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestNMWebServices.java * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestNodeHealthScriptRunner.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestContainerLogsPage.java Move health check script related functionality to hadoop-common --- Key: YARN-2980 URL: https://issues.apache.org/jira/browse/YARN-2980 Project: Hadoop YARN Issue Type: Improvement Reporter: Ming Ma Assignee: Varun Saxena Fix For: 3.0.0 Attachments: YARN-2980.001.patch, YARN-2980.002.patch, YARN-2980.003.patch, YARN-2980.004.patch HDFS might want to leverage the health check functionality available in YARN in both the namenode (https://issues.apache.org/jira/browse/HDFS-7400) and the datanode (https://issues.apache.org/jira/browse/HDFS-7441). We can move the health check functionality, including the protocol between Hadoop daemons and the health check script, to hadoop-common. That will simplify development and maintenance of both the Hadoop source code and the health check script. Thoughts? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
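To illustrate the payoff of the move, a sketch of how another daemon might consume the relocated runner; the constructor arguments are an assumption about the moved API, while isHealthy() and getHealthReport() mirror what the NodeManager already polls:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.NodeHealthScriptRunner;

public class DaemonHealthCheck {
  public static void main(String[] args) throws Exception {
    // Constructor shape (script path, interval ms, timeout ms, script args)
    // is assumed here, not a confirmed signature of the moved class.
    NodeHealthScriptRunner runner = new NodeHealthScriptRunner(
        "/etc/hadoop/health-check.sh", 60000L, 5000L, new String[0]);
    runner.init(new Configuration());
    runner.start();
    // Any daemon (e.g. an HDFS NameNode) could now poll node health the way
    // the NodeManager does today:
    System.out.println(runner.isHealthy() + ": " + runner.getHealthReport());
  }
}
{code}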
[jira] [Commented] (YARN-3247) TestQueueMappings should use CapacityScheduler explicitly
[ https://issues.apache.org/jira/browse/YARN-3247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336390#comment-14336390 ] Hudson commented on YARN-3247: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #115 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/115/]) YARN-3247. TestQueueMappings should use CapacityScheduler explicitly. Contributed by Zhihai Xu. (ozawa: rev 6cbd9f1113fca9ff86fd6ffa783ecd54b147e0db) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestQueueMappings.java TestQueueMappings should use CapacityScheduler explicitly - Key: YARN-3247 URL: https://issues.apache.org/jira/browse/YARN-3247 Project: Hadoop YARN Issue Type: Bug Components: test Affects Versions: 2.6.0 Reporter: zhihai xu Assignee: zhihai xu Priority: Trivial Attachments: YARN-3247.000.patch TestQueueMappings is only supported by CapacityScheduler. We should configure CapacityScheduler for this test. Otherwise if the default scheduler is set to FairScheduler, the test will fail with the following message: {code} Running org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestQueueMappings Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 2.392 sec FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestQueueMappings testQueueMapping(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestQueueMappings) Time elapsed: 2.202 sec ERROR! java.lang.ClassCastException: org.apache.hadoop.yarn.server.resourcemanager.scheduler.QueueMetrics cannot be cast to org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueueMetrics at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueueMetrics.forQueue(FSQueueMetrics.java:118) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1266) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.serviceInit(FairScheduler.java:1319) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:558) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:989) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:255) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.resourcemanager.MockRM.init(MockRM.java:108) at org.apache.hadoop.yarn.server.resourcemanager.MockRM.init(MockRM.java:103) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestQueueMappings.testQueueMapping(TestQueueMappings.java:143) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
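Concretely, configuring CapacityScheduler explicitly in the test setup comes down to pinning the scheduler class (a minimal sketch; the test's actual plumbing may differ):

{code}
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.ResourceScheduler;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler;

public class TestSetup {
  static YarnConfiguration capacitySchedulerConf() {
    YarnConfiguration conf = new YarnConfiguration();
    // Pin the scheduler so the test no longer depends on whatever
    // yarn.resourcemanager.scheduler.class defaults to on the build host.
    conf.setClass(YarnConfiguration.RM_SCHEDULER,
        CapacityScheduler.class, ResourceScheduler.class);
    return conf;
  }
}
{code}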
[jira] [Updated] (YARN-3125) [Event producers] Change distributed shell to use new timeline service
[ https://issues.apache.org/jira/browse/YARN-3125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3125: - Attachment: YARN-3125v2.patch [Event producers] Change distributed shell to use new timeline service -- Key: YARN-3125 URL: https://issues.apache.org/jira/browse/YARN-3125 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Zhijie Shen Assignee: Junping Du Attachments: YARN-3125.patch, YARN-3125v2.patch We can start with changing distributed shell to use the new timeline service once the framework is completed, so that we can quickly verify the next gen is working fine end-to-end. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3125) [Event producers] Change distributed shell to use new timeline service
[ https://issues.apache.org/jira/browse/YARN-3125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336395#comment-14336395 ] Junping Du commented on YARN-3125: -- Thanks for the comments, [~vinodkv] and [~zjshen]! I addressed both comments in the v2 patch. [Event producers] Change distributed shell to use new timeline service -- Key: YARN-3125 URL: https://issues.apache.org/jira/browse/YARN-3125 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Zhijie Shen Assignee: Junping Du Attachments: YARN-3125.patch, YARN-3125v2.patch We can start with changing distributed shell to use the new timeline service once the framework is completed, so that we can quickly verify the next gen is working fine end-to-end. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1809) Synchronize RM and Generic History Service Web-UIs
[ https://issues.apache.org/jira/browse/YARN-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14336396#comment-14336396 ] Tsuyoshi Ozawa commented on YARN-1809: -- Sure, let me check. Synchronize RM and Generic History Service Web-UIs -- Key: YARN-1809 URL: https://issues.apache.org/jira/browse/YARN-1809 Project: Hadoop YARN Issue Type: Improvement Reporter: Zhijie Shen Assignee: Xuan Gong Attachments: YARN-1809.1.patch, YARN-1809.10.patch, YARN-1809.11.patch, YARN-1809.2.patch, YARN-1809.3.patch, YARN-1809.4.patch, YARN-1809.5.patch, YARN-1809.5.patch, YARN-1809.6.patch, YARN-1809.7.patch, YARN-1809.8.patch, YARN-1809.9.patch After YARN-953, the web UI of the generic history service provides more information than that of the RM, namely details about app attempts and containers. It would be good to provide similar web UIs but retrieve the data from separate sources, i.e., the RM cache and the history store respectively. -- This message was sent by Atlassian JIRA (v6.3.4#6332)