[jira] [Commented] (YARN-3126) FairScheduler: queue's usedResource is always more than the maxResource limit

2016-04-20 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15251323#comment-15251323 ] Karthik Kambatla commented on YARN-3126: Commenting without looking into the related code.

[jira] [Commented] (YARN-3816) [Aggregation] App-level aggregation and accumulation for YARN system metrics

2016-04-20 Thread Sangjin Lee (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15251279#comment-15251279 ] Sangjin Lee commented on YARN-3816: --- I'm also unable to reproduce the TestHBaseTimelineStorage failure.

[jira] [Commented] (YARN-4577) Enable aux services to have their own custom classpath/jar file

2016-04-20 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15251266#comment-15251266 ] Hadoop QA commented on YARN-4577: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Commented] (YARN-3816) [Aggregation] App-level aggregation and accumulation for YARN system metrics

2016-04-20 Thread Li Lu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15251259#comment-15251259 ] Li Lu commented on YARN-3816: - BTW, the HBase storage UT failure looks a little bit weird. Is this an

[jira] [Updated] (YARN-3816) [Aggregation] App-level aggregation and accumulation for YARN system metrics

2016-04-20 Thread Li Lu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu updated YARN-3816: Attachment: YARN-3816-YARN-2928-v9.patch Flip v9 patch for the two wrong imports. I cannot reproduce the two UT

[jira] [Updated] (YARN-3816) [Aggregation] App-level aggregation and accumulation for YARN system metrics

2016-04-20 Thread Li Lu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu updated YARN-3816: Attachment: (was: YARN-3816-YARN-2928-v9.patch) > [Aggregation] App-level aggregation and accumulation for YARN

[jira] [Commented] (YARN-3816) [Aggregation] App-level aggregation and accumulation for YARN system metrics

2016-04-20 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15251229#comment-15251229 ] Hadoop QA commented on YARN-3816: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Commented] (YARN-4846) Random failures for TestCapacitySchedulerPreemption#testPreemptionPolicyShouldRespectAlreadyMarkedKillableContainers

2016-04-20 Thread Sunil G (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15251230#comment-15251230 ] Sunil G commented on YARN-4846: --- Thanks [~bibinchundatt] and [~leftnoteasy]. As {{editpolicy}} was called

[jira] [Updated] (YARN-4577) Enable aux services to have their own custom classpath/jar file

2016-04-20 Thread Xuan Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-4577: Attachment: YARN-4577.poc.patch > Enable aux services to have their own custom classpath/jar file >

[jira] [Commented] (YARN-4577) Enable aux services to have their own custom classpath/jar file

2016-04-20 Thread Xuan Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15251193#comment-15251193 ] Xuan Gong commented on YARN-4577: - [~sjlee0] Thanks for the suggestion. I have uploaded a poc patch for

[jira] [Commented] (YARN-4976) Missing NullPointer check in ContainerLaunchContextPBImpl causes RM to die

2016-04-20 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15251165#comment-15251165 ] Hadoop QA commented on YARN-4976: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Commented] (YARN-4807) MockAM#waitForState sleep duration is too long

2016-04-20 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15251093#comment-15251093 ] Hadoop QA commented on YARN-4807: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Updated] (YARN-3816) [Aggregation] App-level aggregation and accumulation for YARN system metrics

2016-04-20 Thread Li Lu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu updated YARN-3816: Attachment: YARN-3816-YARN-2928-v9.patch V9 patch. Keep addressing javadoc warnings. Added a "skip" set that can

[jira] [Commented] (YARN-4890) Unit test intermittent failure: TestNodeLabelContainerAllocation#testQueueUsedCapacitiesUpdate

2016-04-20 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15251045#comment-15251045 ] Hudson commented on YARN-4890: -- FAILURE: Integrated in Hadoop-trunk-Commit #9639 (See

[jira] [Commented] (YARN-4976) Missing NullPointer check in ContainerLaunchContextPBImpl causes RM to die

2016-04-20 Thread Giovanni Matteo Fumarola (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15251043#comment-15251043 ] Giovanni Matteo Fumarola commented on YARN-4976: Cool, I just uploaded the new version with

[jira] [Updated] (YARN-4976) Missing NullPointer check in ContainerLaunchContextPBImpl causes RM to die

2016-04-20 Thread Giovanni Matteo Fumarola (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-4976: --- Attachment: (was: YARN-4976.v2.patch) > Missing NullPointer check in

[jira] [Updated] (YARN-4976) Missing NullPointer check in ContainerLaunchContextPBImpl causes RM to die

2016-04-20 Thread Giovanni Matteo Fumarola (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-4976: --- Attachment: YARN-4976.v2.patch > Missing NullPointer check in

[jira] [Updated] (YARN-4976) Missing NullPointer check in ContainerLaunchContextPBImpl causes RM to die

2016-04-20 Thread Giovanni Matteo Fumarola (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-4976: --- Attachment: YARN-4976.v2.patch > Missing NullPointer check in

[jira] [Commented] (YARN-4697) NM aggregation thread pool is not bound by limits

2016-04-20 Thread Haibo Chen (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15251022#comment-15251022 ] Haibo Chen commented on YARN-4697: -- [~vinodkv] Upon NM restart, NM will try to recover all applications

[jira] [Commented] (YARN-1458) FairScheduler: Zero weight can lead to livelock

2016-04-20 Thread zhihai xu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15251016#comment-15251016 ] zhihai xu commented on YARN-1458: - Hi [~dwatzke], thanks for reporting this issue. I double-checked the code,

[jira] [Updated] (YARN-1458) FairScheduler: Zero weight can lead to livelock

2016-04-20 Thread zhihai xu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated YARN-1458: Attachment: YARN-1458.addendum.patch > FairScheduler: Zero weight can lead to livelock >

[jira] [Commented] (YARN-4697) NM aggregation thread pool is not bound by limits

2016-04-20 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15251014#comment-15251014 ] Wangda Tan commented on YARN-4697: -- IIUC, currently all application recoveries go to INIT state, and

[jira] [Commented] (YARN-4976) Missing NullPointer check in ContainerLaunchContextPBImpl causes RM to die

2016-04-20 Thread Daniel Templeton (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250941#comment-15250941 ] Daniel Templeton commented on YARN-4976: Patch looks good. Can you add a quick test for this case?

[jira] [Commented] (YARN-3816) [Aggregation] App-level aggregation and accumulation for YARN system metrics

2016-04-20 Thread Li Lu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250918#comment-15250918 ] Li Lu commented on YARN-3816: - Thanks [~sjlee0]. I'm still fighting with the javadoc issues and opened two

[jira] [Commented] (YARN-4976) Missing NullPointer check in ContainerLaunchContextPBImpl causes RM to die

2016-04-20 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250909#comment-15250909 ] Hadoop QA commented on YARN-4976: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Commented] (YARN-3816) [Aggregation] App-level aggregation and accumulation for YARN system metrics

2016-04-20 Thread Sangjin Lee (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250905#comment-15250905 ] Sangjin Lee commented on YARN-3816: --- The latest patch looks good for the most part, minus the javadoc

[jira] [Commented] (YARN-4977) Fix javadocs warnings in yarn-api for jdk 1.8

2016-04-20 Thread Li Lu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250859#comment-15250859 ] Li Lu commented on YARN-4977: - Hi [~templedf], sure, let's do that. I agree that there might be tons of them,

[jira] [Commented] (YARN-4977) Fix javadocs warnings in yarn-api for jdk 1.8

2016-04-20 Thread Daniel Templeton (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250853#comment-15250853 ] Daniel Templeton commented on YARN-4977: I wouldn't be surprised if there were tens of thousands of

[jira] [Commented] (YARN-4976) Missing NullPointer check in ContainerLaunchContextPBImpl causes RM to die

2016-04-20 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250833#comment-15250833 ] Hadoop QA commented on YARN-4976: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Created] (YARN-4978) The number of javadocs warnings is limited to 100

2016-04-20 Thread Li Lu (JIRA)
Li Lu created YARN-4978: --- Summary: The number of javadocs warnings is limited to 100 Key: YARN-4978 URL: https://issues.apache.org/jira/browse/YARN-4978 Project: Hadoop YARN Issue Type: Bug

[jira] [Updated] (YARN-4807) MockAM#waitForState sleep duration is too long

2016-04-20 Thread Yufei Gu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated YARN-4807: --- Attachment: YARN-4807.014.patch Hi [~templedf] and [~kasha], I solved the flaky test cases. Please take a

[jira] [Created] (YARN-4977) Fix javadocs warnings in yarn-api for jdk 1.8

2016-04-20 Thread Li Lu (JIRA)
Li Lu created YARN-4977: --- Summary: Fix javadocs warnings in yarn-api for jdk 1.8 Key: YARN-4977 URL: https://issues.apache.org/jira/browse/YARN-4977 Project: Hadoop YARN Issue Type: Bug

[jira] [Updated] (YARN-4968) A couple of AM retry unit tests need to wait SchedulerApplicationAttempt stopped.

2016-04-20 Thread Vinod Kumar Vavilapalli (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-4968: -- Issue Type: Sub-task (was: Bug) Parent: YARN-4478 > A couple of AM

[jira] [Commented] (YARN-4976) Missing NullPointer check in ContainerLaunchContextPBImpl causes RM to die

2016-04-20 Thread Giovanni Matteo Fumarola (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250822#comment-15250822 ] Giovanni Matteo Fumarola commented on YARN-4976: Yes, (by mistake) we set

[jira] [Commented] (YARN-4697) NM aggregation thread pool is not bound by limits

2016-04-20 Thread Vinod Kumar Vavilapalli (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250811#comment-15250811 ] Vinod Kumar Vavilapalli commented on YARN-4697: --- bq. In the case that we have had a problem

[jira] [Updated] (YARN-4976) Missing NullPointer check in ContainerLaunchContextPBImpl causes RM to die

2016-04-20 Thread Giovanni Matteo Fumarola (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-4976: --- Attachment: YARN-4976.v1.patch > Missing NullPointer check in

[jira] [Updated] (YARN-4957) Add getNewReservation in ApplicationClientProtocol

2016-04-20 Thread Sean Po (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Po updated YARN-4957: -- Attachment: YARN-4957.v4.patch Added description as to why IOException and YarnException might be thrown when

[jira] [Commented] (YARN-4846) Random failures for TestCapacitySchedulerPreemption#testPreemptionPolicyShouldRespectAlreadyMarkedKillableContainers

2016-04-20 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250777#comment-15250777 ] Hadoop QA commented on YARN-4846: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Updated] (YARN-4697) NM aggregation thread pool is not bound by limits

2016-04-20 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-4697: - Target Version/s: 2.6.4, 2.8.0, 2.7.3 > NM aggregation thread pool is not bound by limits >

[jira] [Commented] (YARN-4976) Missing NullPointer check in ContainerLaunchContextPBImpl causes RM to die

2016-04-20 Thread Daniel Templeton (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250767#comment-15250767 ] Daniel Templeton commented on YARN-4976: Thanks, [~giovanni.fumarola]. I should have taken a

[jira] [Commented] (YARN-4697) NM aggregation thread pool is not bound by limits

2016-04-20 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250761#comment-15250761 ] Wangda Tan commented on YARN-4697: -- Thanks [~haibochen] for fixing this problem. I think this is a critical

[jira] [Updated] (YARN-4697) NM aggregation thread pool is not bound by limits

2016-04-20 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-4697: - Priority: Critical (was: Major) > NM aggregation thread pool is not bound by limits >

[jira] [Commented] (YARN-4807) MockAM#waitForState sleep duration is too long

2016-04-20 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250750#comment-15250750 ] Hadoop QA commented on YARN-4807: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Updated] (YARN-4976) Missing NullPointer check in ContainerLaunchContextPBImpl causes RM to die

2016-04-20 Thread Giovanni Matteo Fumarola (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-4976: --- Attachment: (was: YARN-4976.v0.patch) > Missing NullPointer check in

[jira] [Updated] (YARN-4976) Missing NullPointer check in ContainerLaunchContextPBImpl causes RM to die

2016-04-20 Thread Giovanni Matteo Fumarola (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-4976: --- Attachment: YARN-4976.v0.patch > Missing NullPointer check in

[jira] [Commented] (YARN-4976) Missing NullPointer check in ContainerLaunchContextPBImpl causes RM to die

2016-04-20 Thread Giovanni Matteo Fumarola (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250742#comment-15250742 ] Giovanni Matteo Fumarola commented on YARN-4976: Let me reattach it. Thanks for the quick

[jira] [Commented] (YARN-4976) Missing NullPointer check in ContainerLaunchContextPBImpl causes RM to die

2016-04-20 Thread Daniel Templeton (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250732#comment-15250732 ] Daniel Templeton commented on YARN-4976: Looks reasonable. The indentation is off, though. >

[jira] [Commented] (YARN-4968) A couple of AM retry unit tests need to wait SchedulerApplicationAttempt stopped.

2016-04-20 Thread Li Lu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250726#comment-15250726 ] Li Lu commented on YARN-4968: - Fix makes sense. +1. > A couple of AM retry unit tests need to wait

[jira] [Commented] (YARN-4976) Missing NullPointer check in ContainerLaunchContextPBImpl causes RM to die

2016-04-20 Thread Giovanni Matteo Fumarola (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250724#comment-15250724 ] Giovanni Matteo Fumarola commented on YARN-4976: Attached a "nice" way to handle it. >

[jira] [Commented] (YARN-4968) A couple of AM retry unit tests need to wait SchedulerApplicationAttempt stopped.

2016-04-20 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250722#comment-15250722 ] Wangda Tan commented on YARN-4968: -- Failed tests are not related to this fix. > A couple of AM retry unit

[jira] [Updated] (YARN-4976) Missing NullPointer check in ContainerLaunchContextPBImpl causes RM to die

2016-04-20 Thread Giovanni Matteo Fumarola (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-4976: --- Attachment: YARN-4976.v0.patch > Missing NullPointer check in

[jira] [Commented] (YARN-4676) Automatic and Asynchronous Decommissioning Nodes Status Tracking

2016-04-20 Thread Robert Kanter (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250721#comment-15250721 ] Robert Kanter commented on YARN-4676: - Here is my final review feedback: # The GracefulDecommision docs

[jira] [Updated] (YARN-4846) Random failures for TestCapacitySchedulerPreemption#testPreemptionPolicyShouldRespectAlreadyMarkedKillableContainers

2016-04-20 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-4846: - Attachment: YARN-4846-update-PCPP.patch > Random failures for >

[jira] [Commented] (YARN-4846) Random failures for TestCapacitySchedulerPreemption#testPreemptionPolicyShouldRespectAlreadyMarkedKillableContainers

2016-04-20 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250717#comment-15250717 ] Wangda Tan commented on YARN-4846: -- [~bibinchundatt], bq. For issue 2 One probable cause could be

[jira] [Updated] (YARN-4976) Missing NullPointer check in ContainerLaunchContextPBImpl causes RM to die

2016-04-20 Thread Giovanni Matteo Fumarola (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-4976: --- Description: The client can set a null value for any env variable. > Missing

[jira] [Updated] (YARN-4795) ContainerMetrics drops records

2016-04-20 Thread Daniel Templeton (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated YARN-4795: --- Attachment: YARN-4795.002.patch > ContainerMetrics drops records >

[jira] [Updated] (YARN-4795) ContainerMetrics drops records

2016-04-20 Thread Daniel Templeton (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated YARN-4795: --- Attachment: (was: YARN-4795.002.patch) > ContainerMetrics drops records >

[jira] [Commented] (YARN-4976) Missing NullPointer check in ContainerLaunchContextPBImpl causes RM to die

2016-04-20 Thread Giovanni Matteo Fumarola (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250709#comment-15250709 ] Giovanni Matteo Fumarola commented on YARN-4976: Missing null check {quote} @Override

[jira] [Updated] (YARN-4957) Add getNewReservation in ApplicationClientProtocol

2016-04-20 Thread Sean Po (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Po updated YARN-4957: -- Attachment: YARN-4957.v3.patch Fixed javadoc issues. Ignoring request to remove unused constructor for

[jira] [Commented] (YARN-3816) [Aggregation] App-level aggregation and accumulation for YARN system metrics

2016-04-20 Thread Li Lu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250707#comment-15250707 ] Li Lu commented on YARN-3816: - OK, I'm working with the javadoc warnings. Seems like each time I can only get

[jira] [Commented] (YARN-4795) ContainerMetrics drops records

2016-04-20 Thread Ray Chiang (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250697#comment-15250697 ] Ray Chiang commented on YARN-4795: -- Looks like the Jenkins job needs to be re-launched or a new (same)

[jira] [Created] (YARN-4976) Missing NullPointer check in ContainerLaunchContextPBImpl causes RM to die

2016-04-20 Thread Giovanni Matteo Fumarola (JIRA)
Giovanni Matteo Fumarola created YARN-4976: -- Summary: Missing NullPointer check in ContainerLaunchContextPBImpl causes RM to die Key: YARN-4976 URL: https://issues.apache.org/jira/browse/YARN-4976

[jira] [Updated] (YARN-4807) MockAM#waitForState sleep duration is too long

2016-04-20 Thread Yufei Gu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated YARN-4807: --- Attachment: (was: YARN-4807.014.patch) > MockAM#waitForState sleep duration is too long >

[jira] [Updated] (YARN-4087) Followup fixes after YARN-2019 regarding RM behavior when state-store error occurs

2016-04-20 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-4087: --- Hadoop Flags: Incompatible change Marking this incompatible to ensure this gets called out in

[jira] [Updated] (YARN-4975) Fair Scheduler: exception thrown when a parent queue marked 'parent' has configured child queues

2016-04-20 Thread Ashwin Shankar (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashwin Shankar updated YARN-4975: - Description: We upgraded our clusters to 2.7.2 from 2.4.1 and saw the following exception in RM

[jira] [Created] (YARN-4975) Fair Scheduler: exception thrown when a parent queue marked 'parent' has configured child queues

2016-04-20 Thread Ashwin Shankar (JIRA)
Ashwin Shankar created YARN-4975: Summary: Fair Scheduler: exception thrown when a parent queue marked 'parent' has configured child queues Key: YARN-4975 URL: https://issues.apache.org/jira/browse/YARN-4975

[jira] [Updated] (YARN-4975) Fair Scheduler: exception thrown when a parent queue marked 'parent' has configured child queues

2016-04-20 Thread Ashwin Shankar (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashwin Shankar updated YARN-4975: - Description: We upgraded our clusters to 2.7.2 from 2.4.1 and saw the following exception in RM

[jira] [Updated] (YARN-4807) MockAM#waitForState sleep duration is too long

2016-04-20 Thread Yufei Gu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated YARN-4807: --- Attachment: YARN-4807.014.patch > MockAM#waitForState sleep duration is too long >

[jira] [Created] (YARN-4974) Random test failure:TestRMApplicationHistoryWriter#testRMWritingMassiveHistoryForCapacitySche

2016-04-20 Thread Bibin A Chundatt (JIRA)
Bibin A Chundatt created YARN-4974: -- Summary: Random test failure:TestRMApplicationHistoryWriter#testRMWritingMassiveHistoryForCapacitySche Key: YARN-4974 URL: https://issues.apache.org/jira/browse/YARN-4974

[jira] [Updated] (YARN-4846) Random failures for TestCapacitySchedulerPreemption#testPreemptionPolicyShouldRespectAlreadyMarkedKillableContainers

2016-04-20 Thread Bibin A Chundatt (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-4846: --- Attachment: 0002-YARN-4846.patch Attaching patch for the same > Random failures for >

[jira] [Commented] (YARN-4846) Random failures for TestCapacitySchedulerPreemption#testPreemptionPolicyShouldRespectAlreadyMarkedKillableContainers

2016-04-20 Thread Bibin A Chundatt (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250453#comment-15250453 ] Bibin A Chundatt commented on YARN-4846: [~leftnoteasy] IIUC it will still cause a failure.

[jira] [Commented] (YARN-4807) MockAM#waitForState sleep duration is too long

2016-04-20 Thread Yufei Gu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250423#comment-15250423 ] Yufei Gu commented on YARN-4807: Seems like it did make some test cases flaky. Needs more investigation. Sorry

[jira] [Updated] (YARN-4973) YarnWebParams next.fresh.interval should be next.refresh.interval

2016-04-20 Thread Daniel Templeton (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated YARN-4973: --- Priority: Minor (was: Major) > YarnWebParams next.fresh.interval should be

[jira] [Created] (YARN-4973) YarnWebParams next.fresh.interval should be next.refresh.interval

2016-04-20 Thread Daniel Templeton (JIRA)
Daniel Templeton created YARN-4973: -- Summary: YarnWebParams next.fresh.interval should be next.refresh.interval Key: YARN-4973 URL: https://issues.apache.org/jira/browse/YARN-4973 Project: Hadoop

[jira] [Commented] (YARN-3959) Store application related configurations in Timeline Service v2

2016-04-20 Thread Li Lu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250401#comment-15250401 ] Li Lu commented on YARN-3959: - Hi [~varun_saxena], since our planned date of the 1st milestone is approaching,

[jira] [Commented] (YARN-4676) Automatic and Asynchronous Decommissioning Nodes Status Tracking

2016-04-20 Thread Junping Du (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250380#comment-15250380 ] Junping Du commented on YARN-4676: -- Thanks Robert for checking. I will give this big patch a review today or

[jira] [Commented] (YARN-4807) MockAM#waitForState sleep duration is too long

2016-04-20 Thread Yufei Gu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250372#comment-15250372 ] Yufei Gu commented on YARN-4807: Thanks [~kasha] for the review. Some of them did invoke {{waitForState}}.

[jira] [Commented] (YARN-4676) Automatic and Asynchronous Decommissioning Nodes Status Tracking

2016-04-20 Thread Robert Kanter (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250370#comment-15250370 ] Robert Kanter commented on YARN-4676: - I'm taking another pass at reviewing the latest patch (012), but

[jira] [Updated] (YARN-1458) FairScheduler: Zero weight can lead to livelock

2016-04-20 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1458: --- Fix Version/s: 2.6.0 > FairScheduler: Zero weight can lead to livelock >

[jira] [Updated] (YARN-1458) FairScheduler: Zero weight can lead to livelock

2016-04-20 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1458: --- Labels: (was: patch) > FairScheduler: Zero weight can lead to livelock >

[jira] [Commented] (YARN-4846) Random failures for TestCapacitySchedulerPreemption#testPreemptionPolicyShouldRespectAlreadyMarkedKillableContainers

2016-04-20 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250315#comment-15250315 ] Wangda Tan commented on YARN-4846: -- [~bibinchundatt], if we update "<" check to "<=" check, we can get

[jira] [Commented] (YARN-4846) Random failures for TestCapacitySchedulerPreemption#testPreemptionPolicyShouldRespectAlreadyMarkedKillableContainers

2016-04-20 Thread Bibin A Chundatt (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250306#comment-15250306 ] Bibin A Chundatt commented on YARN-4846: [~sunilg]/[~leftnoteasy] One probable cause could be

[jira] [Commented] (YARN-4807) MockAM#waitForState sleep duration is too long

2016-04-20 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250287#comment-15250287 ] Karthik Kambatla commented on YARN-4807: Thanks [~yufeigu] for working on this and [~templedf] for

[jira] [Commented] (YARN-2883) Queuing of container requests in the NM

2016-04-20 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250286#comment-15250286 ] Hudson commented on YARN-2883: -- FAILURE: Integrated in Hadoop-trunk-Commit #9635 (See

[jira] [Commented] (YARN-2883) Queuing of container requests in the NM

2016-04-20 Thread Konstantinos Karanasos (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250244#comment-15250244 ] Konstantinos Karanasos commented on YARN-2883: -- Thanks for all the reviews, [~kasha]! >

[jira] [Commented] (YARN-4846) Random failures for TestCapacitySchedulerPreemption#testPreemptionPolicyShouldRespectAlreadyMarkedKillableContainers

2016-04-20 Thread Wangda Tan (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250241#comment-15250241 ] Wangda Tan commented on YARN-4846: -- [~sunilg], [~bibinchundatt], Looked at this issue again, it seems we

[jira] [Commented] (YARN-2883) Queuing of container requests in the NM

2016-04-20 Thread Arun Suresh (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250189#comment-15250189 ] Arun Suresh commented on YARN-2883: --- Filed YARN-4972 to fix the test cases. Thanks for the review

[jira] [Created] (YARN-4972) Cleanup QueuingContainerManager tests to remove long sleep times

2016-04-20 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-4972: - Summary: Cleanup QueuingContainerManager tests to remove long sleep times Key: YARN-4972 URL: https://issues.apache.org/jira/browse/YARN-4972 Project: Hadoop YARN

[jira] [Commented] (YARN-2883) Queuing of container requests in the NM

2016-04-20 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250145#comment-15250145 ] Karthik Kambatla commented on YARN-2883: Latest patch looks good to me. +1 The tests introduced

[jira] [Commented] (YARN-4963) capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable

2016-04-20 Thread Nathan Roberts (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250137#comment-15250137 ] Nathan Roberts commented on YARN-4963: -- bq. IMO, I think application specific configurations should be

[jira] [Updated] (YARN-4971) RM fails to re-bind to wildcard IP after failover in multi homed clusters

2016-04-20 Thread Wilfred Spiegelenburg (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wilfred Spiegelenburg updated YARN-4971: Attachment: YARN-4971.1.patch patch to not override the bind address in the two

[jira] [Commented] (YARN-4971) RM fails to re-bind to wildcard IP after failover in multi homed clusters

2016-04-20 Thread Wilfred Spiegelenburg (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249961#comment-15249961 ] Wilfred Spiegelenburg commented on YARN-4971: - During the service init the service bind address

[jira] [Created] (YARN-4971) RM fails to re-bind to wildcard IP after failover in multi homed clusters

2016-04-20 Thread Wilfred Spiegelenburg (JIRA)
Wilfred Spiegelenburg created YARN-4971: --- Summary: RM fails to re-bind to wildcard IP after failover in multi homed clusters Key: YARN-4971 URL: https://issues.apache.org/jira/browse/YARN-4971

[jira] [Commented] (YARN-4846) Random failures for TestCapacitySchedulerPreemption#testPreemptionPolicyShouldRespectAlreadyMarkedKillableContainers

2016-04-20 Thread Bibin A Chundatt (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249740#comment-15249740 ] Bibin A Chundatt commented on YARN-4846: [~sunilg] Thank you for looking into the test result. queueName

[jira] [Commented] (YARN-3524) Mapreduce failed due to AM Container-Launch failure at NM on windows

2016-04-20 Thread tianyu (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249706#comment-15249706 ] tianyu commented on YARN-3524: -- I am a new learner. Today, I ran into a problem similar to the one above. First, I

[jira] [Created] (YARN-4970) Difficult to trace "Connection Refused" in AM Proxying

2016-04-20 Thread Matthew Byng-Maddick (JIRA)
Matthew Byng-Maddick created YARN-4970: -- Summary: Difficult to trace "Connection Refused" in AM Proxying Key: YARN-4970 URL: https://issues.apache.org/jira/browse/YARN-4970 Project: Hadoop YARN

[jira] [Commented] (YARN-4957) Add getNewReservation in ApplicationClientProtocol

2016-04-20 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249650#comment-15249650 ] Hadoop QA commented on YARN-4957: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Updated] (YARN-4721) RM to try to auth with HDFS on startup, retry with max diagnostics on failure

2016-04-20 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated YARN-4721: - Attachment: HADOOP-12289-003.patch patch 003 with the complete diff > RM to try to auth with HDFS

[jira] [Commented] (YARN-3816) [Aggregation] App-level aggregation and accumulation for YARN system metrics

2016-04-20 Thread Hadoop QA (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249352#comment-15249352 ] Hadoop QA commented on YARN-3816: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem ||