[jira] [Updated] (YARN-1774) submitting job to non-leaf queue causes resourcemanager(fairscheduler) to exit

2014-02-28 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1774: --- Priority: Blocker (was: Major) submitting job to non-leaf queue causes

[jira] [Updated] (YARN-1774) FS: Submitting to non-leaf queue throws NPE

2014-02-28 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1774: --- Summary: FS: Submitting to non-leaf queue throws NPE (was: submitting job to non-leaf queue

[jira] [Commented] (YARN-1734) RM should get the updated Configurations when it transits from Standby to Active

2014-03-03 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13918365#comment-13918365 ] Karthik Kambatla commented on YARN-1734: In our case, we plan to use the

[jira] [Commented] (YARN-1734) RM should get the updated Configurations when it transits from Standby to Active

2014-03-03 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13918437#comment-13918437 ] Karthik Kambatla commented on YARN-1734: I guess the ambiguity stems from the

[jira] [Commented] (YARN-1734) RM should get the updated Configurations when it transits from Standby to Active

2014-03-03 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13918441#comment-13918441 ] Karthik Kambatla commented on YARN-1734: bq. For calling refresh* in standby RM, it

[jira] [Created] (YARN-1779) Handle AMRMTokens across RM failover

2014-03-03 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created YARN-1779: -- Summary: Handle AMRMTokens across RM failover Key: YARN-1779 URL: https://issues.apache.org/jira/browse/YARN-1779 Project: Hadoop YARN Issue Type:

[jira] [Updated] (YARN-986) RM DT token service should have service addresses of both RMs

2014-03-03 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-986: -- Attachment: yarn-986-3.patch RM DT token service should have service addresses of both RMs

[jira] [Commented] (YARN-986) RM DT token service should have service addresses of both RMs

2014-03-03 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13918951#comment-13918951 ] Karthik Kambatla commented on YARN-986: --- bq. There are some related TODOs in

[jira] [Commented] (YARN-1734) RM should get the updated Configurations when it transits from Standby to Active

2014-03-03 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13919031#comment-13919031 ] Karthik Kambatla commented on YARN-1734: Sorry for all the confusion caused here -

[jira] [Updated] (YARN-1064) YarnConfiguration scheduler configuration constants are not consistent

2014-03-04 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1064: --- Target Version/s: 2.4.0 Fix Version/s: (was: 2.4.0) Set target version 2.4.0.

[jira] [Created] (YARN-1784) TestContainerAllocation assumes CapacityScheduler

2014-03-04 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created YARN-1784: -- Summary: TestContainerAllocation assumes CapacityScheduler Key: YARN-1784 URL: https://issues.apache.org/jira/browse/YARN-1784 Project: Hadoop YARN

[jira] [Commented] (YARN-1525) Web UI should redirect to active RM when HA is enabled.

2014-03-04 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920395#comment-13920395 ] Karthik Kambatla commented on YARN-1525: Verified - the redirection works as

[jira] [Commented] (YARN-1525) Web UI should redirect to active RM when HA is enabled.

2014-03-04 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13920402#comment-13920402 ] Karthik Kambatla commented on YARN-1525: Comments on the latest patch (it would be

[jira] [Resolved] (YARN-1784) TestContainerAllocation assumes CapacityScheduler

2014-03-05 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla resolved YARN-1784. Resolution: Won't Fix The test fails locally when run against FairScheduler, but passes on

[jira] [Commented] (YARN-1492) truly shared cache for jars (jobjar/libjar)

2014-03-05 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13922063#comment-13922063 ] Karthik Kambatla commented on YARN-1492: Thanks [~ctrezzo] for the clarifications.

[jira] [Updated] (YARN-1774) FS: Submitting to non-leaf queue throws NPE

2014-03-05 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1774: --- Attachment: yarn-1774-2.patch [~adhoot] is out on vacation. Posting a patch that addresses

[jira] [Created] (YARN-1793) yarn application -kill doesn't kill UnmanagedAMs

2014-03-06 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created YARN-1793: -- Summary: yarn application -kill doesn't kill UnmanagedAMs Key: YARN-1793 URL: https://issues.apache.org/jira/browse/YARN-1793 Project: Hadoop YARN Issue

[jira] [Commented] (YARN-1793) yarn application -kill doesn't kill UnmanagedAMs

2014-03-06 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13922989#comment-13922989 ] Karthik Kambatla commented on YARN-1793: {code} if

[jira] [Commented] (YARN-1525) Web UI should redirect to active RM when HA is enabled.

2014-03-06 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923039#comment-13923039 ] Karthik Kambatla commented on YARN-1525: bq. In the case the standby RM, where

[jira] [Commented] (YARN-1793) yarn application -kill doesn't kill UnmanagedAMs

2014-03-06 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923099#comment-13923099 ] Karthik Kambatla commented on YARN-1793: What do you think about getting rid of

[jira] [Updated] (YARN-1793) yarn application -kill doesn't kill UnmanagedAMs

2014-03-06 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1793: --- Attachment: yarn-1793-0.patch Simple patch that seems to fix the issue. Removed UnmanagedAM

[jira] [Updated] (YARN-1795) Oozie tests are flakey after YARN-713

2014-03-06 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1795: --- Priority: Critical (was: Major) Oozie tests are flakey after YARN-713

[jira] [Updated] (YARN-1795) Oozie tests are flakey after YARN-713

2014-03-06 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1795: --- Target Version/s: 2.4.0 Oozie tests are flakey after YARN-713

[jira] [Updated] (YARN-1525) Web UI should redirect to active RM when HA is enabled.

2014-03-06 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1525: --- Attachment: YARN1525.secure.v11.patch Thanks Cindy. Posting a patch with cosmetic changes

[jira] [Commented] (YARN-1525) Web UI should redirect to active RM when HA is enabled.

2014-03-06 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923531#comment-13923531 ] Karthik Kambatla commented on YARN-1525: Thanks Cindy. +1 on the latest patch.

[jira] [Updated] (YARN-1793) yarn application -kill doesn't kill UnmanagedAMs

2014-03-06 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1793: --- Attachment: yarn-1793-1.patch yarn application -kill doesn't kill UnmanagedAMs

[jira] [Commented] (YARN-1793) yarn application -kill doesn't kill UnmanagedAMs

2014-03-06 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923568#comment-13923568 ] Karthik Kambatla commented on YARN-1793: Thanks for digging up the reason,

[jira] [Commented] (YARN-1799) Enhance LocalDirAllocator in NM to consider DiskMaxUtilization cutoff

2014-03-07 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13924074#comment-13924074 ] Karthik Kambatla commented on YARN-1799: Just to understand what we are trying to

[jira] [Updated] (YARN-1774) FS: Submitting to non-leaf queue throws NPE

2014-03-07 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1774: --- Attachment: yarn-1774-3.patch Updated fix to report different messages for different cases.

[jira] [Updated] (YARN-1793) yarn application -kill doesn't kill UnmanagedAMs

2014-03-07 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1793: --- Attachment: yarn-1793-2.patch Added test to verify behavior of

[jira] [Updated] (YARN-1793) yarn application -kill doesn't kill UnmanagedAMs

2014-03-09 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1793: --- Attachment: yarn-1793-3.patch Thanks Jian and Bikas. Here is an updated patch that

[jira] [Updated] (YARN-1811) Error 500 when clicking the Application Master link in the RM UI while a job is running with RM HA

2014-03-10 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1811: --- Description: When using RM HA, if you click on the Application Master link in the RM web UI

[jira] [Commented] (YARN-1811) Error 500 when clicking the Application Master link in the RM UI while a job is running with RM HA

2014-03-10 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13926228#comment-13926228 ] Karthik Kambatla commented on YARN-1811: Corresponding stack trace from Robert:

[jira] [Commented] (YARN-1811) Error 500 when clicking the Application Master link in the RM UI while a job is running with RM HA

2014-03-10 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13926276#comment-13926276 ] Karthik Kambatla commented on YARN-1811: Thanks for identifying and working on

[jira] [Commented] (YARN-1793) yarn application -kill doesn't kill UnmanagedAMs

2014-03-10 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13926401#comment-13926401 ] Karthik Kambatla commented on YARN-1793: Thanks for your prompt reviews, Jian and

[jira] [Commented] (YARN-1811) Error 500 when clicking the Application Master link in the RM UI while a job is running with RM HA

2014-03-10 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13926408#comment-13926408 ] Karthik Kambatla commented on YARN-1811: [~rkanter] - could you file a follow-up

[jira] [Created] (YARN-1815) RM should recover only Managed AMs

2014-03-10 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created YARN-1815: -- Summary: RM should recover only Managed AMs Key: YARN-1815 URL: https://issues.apache.org/jira/browse/YARN-1815 Project: Hadoop YARN Issue Type:

[jira] [Commented] (YARN-1705) Cluster metrics are off after failover

2014-03-10 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13929909#comment-13929909 ] Karthik Kambatla commented on YARN-1705: Yes, that is the intent of this JIRA. Feel

[jira] [Commented] (YARN-1811) Error 500 when clicking the Application Master link in the RM UI while a job is running with RM HA

2014-03-11 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13930009#comment-13930009 ] Karthik Kambatla commented on YARN-1811: TestFifoScheduler passes for me locally

[jira] [Updated] (YARN-1811) RM HA: AM link broken if the AM is not on nodes other than RM

2014-03-11 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1811: --- Summary: RM HA: AM link broken if the AM is not on nodes other than RM (was: Error 500 when

[jira] [Updated] (YARN-1811) RM HA: AM link broken if the AM is on nodes other than RM

2014-03-11 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1811: --- Summary: RM HA: AM link broken if the AM is on nodes other than RM (was: RM HA: AM link

[jira] [Commented] (YARN-1811) RM HA: AM link broken if the AM is on nodes other than RM

2014-03-11 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13930023#comment-13930023 ] Karthik Kambatla commented on YARN-1811: TestRMHA - looks like it should be okay to

[jira] [Created] (YARN-1821) NM throws NPE on re-register with RM if it has containers for Unmanaged AMs

2014-03-11 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created YARN-1821: -- Summary: NM throws NPE on re-register with RM if it has containers for Unmanaged AMs Key: YARN-1821 URL: https://issues.apache.org/jira/browse/YARN-1821 Project:

[jira] [Assigned] (YARN-1821) NM throws NPE on re-register with RM if it has containers for Unmanaged AMs

2014-03-11 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla reassigned YARN-1821: -- Assignee: Karthik Kambatla NM throws NPE on re-register with RM if it has containers

[jira] [Updated] (YARN-1779) Handle AMRMTokens across RM failover

2014-03-11 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1779: --- Priority: Critical (was: Blocker) Handle AMRMTokens across RM failover

[jira] [Commented] (YARN-1779) Handle AMRMTokens across RM failover

2014-03-11 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13930534#comment-13930534 ] Karthik Kambatla commented on YARN-1779: Looks like the service address for

[jira] [Commented] (YARN-1784) TestContainerAllocation assumes CapacityScheduler

2014-03-11 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13930733#comment-13930733 ] Karthik Kambatla commented on YARN-1784: Ran into this again on our internal

[jira] [Reopened] (YARN-1784) TestContainerAllocation assumes CapacityScheduler

2014-03-11 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla reopened YARN-1784: Assignee: Robert Kanter (was: Karthik Kambatla) TestContainerAllocation assumes

[jira] [Updated] (YARN-1821) NPE on registerNodeManager if the request has containers for UnmanagedAMs

2014-03-11 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1821: --- Summary: NPE on registerNodeManager if the request has containers for UnmanagedAMs (was: NM

[jira] [Updated] (YARN-1821) NPE on registerNodeManager if the request has containers for UnmanagedAMs

2014-03-11 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1821: --- Attachment: yarn-1821-1.patch Straightforward patch and test. Verified test fails without

[jira] [Created] (YARN-1823) Recover Unmanaged AMs

2014-03-11 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created YARN-1823: -- Summary: Recover Unmanaged AMs Key: YARN-1823 URL: https://issues.apache.org/jira/browse/YARN-1823 Project: Hadoop YARN Issue Type: Sub-task

[jira] [Updated] (YARN-1815) RM should recover only Managed AMs

2014-03-11 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1815: --- Description: RM should not recover unmanaged AMs until YARN-1823 is fixed. (was: RM should

[jira] [Updated] (YARN-1815) RM should recover only Managed AMs

2014-03-11 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1815: --- Attachment: yarn-1815-1.patch Simple patch and a corresponding test update. RM should

[jira] [Commented] (YARN-1821) NPE on registerNodeManager if the request has containers for UnmanagedAMs

2014-03-11 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13931043#comment-13931043 ] Karthik Kambatla commented on YARN-1821: The test failure is unrelated. YARN-1591

[jira] [Updated] (YARN-1815) RM should recover only Managed AMs

2014-03-11 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1815: --- Priority: Critical (was: Blocker) Lowering priority. Not having this only leads to one RMApp

[jira] [Updated] (YARN-1370) Fair scheduler to re-populate container allocation state

2014-03-12 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1370: --- Assignee: Anubhav Dhoot (was: Karthik Kambatla) Fair scheduler to re-populate container

[jira] [Commented] (YARN-1815) RM should recover only Managed AMs

2014-03-12 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13932232#comment-13932232 ] Karthik Kambatla commented on YARN-1815: bq. If so, we need to make sure the App

[jira] [Updated] (YARN-1815) RM should recover only Managed AMs

2014-03-12 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1815: --- Attachment: yarn-1815-2.patch Updated patch sets the final state of Unmanaged AMs on recovery

[jira] [Updated] (YARN-1812) Job stays in PREP state for long time after RM Restarts

2014-03-12 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1812: --- Summary: Job stays in PREP state for long time after RM Restarts (was: Job stays in PREP

[jira] [Updated] (YARN-1815) RM should recover only Managed AMs

2014-03-12 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1815: --- Attachment: yarn-1815-2.patch Rebased patch on trunk (previous one had conflicts with

[jira] [Updated] (YARN-1815) RM should recover only Managed AMs

2014-03-12 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1815: --- Attachment: Unmanaged AM recovery.png Attaching screenshot with the fix. RM should

[jira] [Commented] (YARN-1828) Resource Manager is down when I request job on specific queue.

2014-03-12 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13932894#comment-13932894 ] Karthik Kambatla commented on YARN-1828: I believe this is YARN-1774 - submitting

[jira] [Resolved] (YARN-1828) Resource Manager is down when I request job on specific queue.

2014-03-12 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla resolved YARN-1828. Resolution: Duplicate Closing this as a duplicate. We can re-open if we find any pending

[jira] [Created] (YARN-1830) TestRMRestart.testQueueMetricsOnRMRestart failure

2014-03-13 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created YARN-1830: -- Summary: TestRMRestart.testQueueMetricsOnRMRestart failure Key: YARN-1830 URL: https://issues.apache.org/jira/browse/YARN-1830 Project: Hadoop YARN

[jira] [Commented] (YARN-1815) RM should recover only Managed AMs

2014-03-13 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933580#comment-13933580 ] Karthik Kambatla commented on YARN-1815: The tests pass locally. Filed YARN-1830

[jira] [Updated] (YARN-1795) After YARN-713, using FairScheduler can cause an InvalidToken Exception for NMTokens

2014-03-13 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1795: --- Priority: Blocker (was: Critical) After YARN-713, using FairScheduler can cause an

[jira] [Commented] (YARN-1811) RM HA: AM link broken if the AM is on nodes other than RM

2014-03-13 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13933866#comment-13933866 ] Karthik Kambatla commented on YARN-1811: Thanks Robert. Comments: #

[jira] [Commented] (YARN-1811) RM HA: AM link broken if the AM is on nodes other than RM

2014-03-13 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13934182#comment-13934182 ] Karthik Kambatla commented on YARN-1811: Changes look good to me. I'll defer the

[jira] [Commented] (YARN-1799) Enhance LocalDirAllocator in NM to consider DiskMaxUtilization cutoff

2014-03-14 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935431#comment-13935431 ] Karthik Kambatla commented on YARN-1799: bq. Given the disk write speed as a

[jira] [Assigned] (YARN-1795) After YARN-713, using FairScheduler can cause an InvalidToken Exception for NMTokens

2014-03-14 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla reassigned YARN-1795: -- Assignee: Karthik Kambatla After YARN-713, using FairScheduler can cause an

[jira] [Commented] (YARN-1795) After YARN-713, using FairScheduler can cause an InvalidToken Exception for NMTokens

2014-03-14 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13935439#comment-13935439 ] Karthik Kambatla commented on YARN-1795: Taking this up to investigate. After

[jira] [Commented] (YARN-1811) RM HA: AM link broken if the AM is on nodes other than RM

2014-03-17 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938354#comment-13938354 ] Karthik Kambatla commented on YARN-1811: Can we suppress the deprecation warnings

[jira] [Updated] (YARN-1846) TestRM#testNMTokenSentForNormalContainer assumes CapacityScheduler

2014-03-18 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1846: --- Summary: TestRM#testNMTokenSentForNormalContainer assumes CapacityScheduler (was:

[jira] [Commented] (YARN-1846) TestRM.testNMTokenSentForNormalContainer assumes CapacityScheduler

2014-03-18 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938891#comment-13938891 ] Karthik Kambatla commented on YARN-1846: +1. Committing this.

[jira] [Commented] (YARN-1846) TestRM#testNMTokenSentForNormalContainer assumes CapacityScheduler

2014-03-18 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938894#comment-13938894 ] Karthik Kambatla commented on YARN-1846: Thanks Robert. Just committed this to

[jira] [Commented] (YARN-1843) LinuxContainerExecutor should always log output

2014-03-18 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938918#comment-13938918 ] Karthik Kambatla commented on YARN-1843: Comments: # The patch introduces spurious

[jira] [Commented] (YARN-1536) Cleanup: Get rid of ResourceManager#get*SecretManager() methods and use the RMContext methods instead

2014-03-18 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938927#comment-13938927 ] Karthik Kambatla commented on YARN-1536: Looks good. Few nits: # May be not include

[jira] [Commented] (YARN-1775) Create SMAPBasedProcessTree to get PSS information

2014-03-18 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938938#comment-13938938 ] Karthik Kambatla commented on YARN-1775: Just barely went over the patch. Can we

[jira] [Commented] (YARN-1536) Cleanup: Get rid of ResourceManager#get*SecretManager() methods and use the RMContext methods instead

2014-03-18 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938940#comment-13938940 ] Karthik Kambatla commented on YARN-1536: BTW, per my understanding, the Reviewed

[jira] [Commented] (YARN-1705) Cluster metrics are off after failover

2014-03-18 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938954#comment-13938954 ] Karthik Kambatla commented on YARN-1705: Patch looks mostly good. Minor comments: #

[jira] [Commented] (YARN-1474) Make schedulers services

2014-03-18 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13938961#comment-13938961 ] Karthik Kambatla commented on YARN-1474: Skimmed through the patch. Looks

[jira] [Commented] (YARN-1705) Cluster metrics are off after failover

2014-03-18 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939020#comment-13939020 ] Karthik Kambatla commented on YARN-1705: Thanks Rohith. Looks good. +1, pending

[jira] [Created] (YARN-1848) Persist ClusterMetrics across RM HA transitions

2014-03-18 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created YARN-1848: -- Summary: Persist ClusterMetrics across RM HA transitions Key: YARN-1848 URL: https://issues.apache.org/jira/browse/YARN-1848 Project: Hadoop YARN Issue

[jira] [Commented] (YARN-1705) Cluster metrics are off after failover

2014-03-18 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939625#comment-13939625 ] Karthik Kambatla commented on YARN-1705: Committing this.. Cluster metrics are

[jira] [Commented] (YARN-1848) Persist ClusterMetrics across RM HA transitions

2014-03-18 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939624#comment-13939624 ] Karthik Kambatla commented on YARN-1848: If we think we don't need to support

[jira] [Updated] (YARN-1705) Reset cluster-metrics on transition to standby

2014-03-18 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1705: --- Summary: Reset cluster-metrics on transition to standby (was: Cluster metrics are off after

[jira] [Updated] (YARN-1843) LinuxContainerExecutor should always log output

2014-03-18 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1843: --- Target Version/s: 2.4.0 Assignee: Liyin Liang LinuxContainerExecutor should

[jira] [Commented] (YARN-1843) LinuxContainerExecutor should always log output

2014-03-18 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13939843#comment-13939843 ] Karthik Kambatla commented on YARN-1843: bq. Sorry, I can't catch you. Is there any

[jira] [Created] (YARN-1849) NPE in ResourceTrackerService#registerNodeManager for UAM on secure clusters

2014-03-18 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created YARN-1849: -- Summary: NPE in ResourceTrackerService#registerNodeManager for UAM on secure clusters Key: YARN-1849 URL: https://issues.apache.org/jira/browse/YARN-1849

[jira] [Commented] (YARN-1854) TestRMHA#testStartAndTransitions Fails

2014-03-19 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940609#comment-13940609 ] Karthik Kambatla commented on YARN-1854: YARN-1705 introduced this check - I ran it

[jira] [Commented] (YARN-1855) TestRMFailover#testRMWebAppRedirect fails in trunk

2014-03-19 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940685#comment-13940685 ] Karthik Kambatla commented on YARN-1855: [~cindyli] - will you be able to take a

[jira] [Commented] (YARN-1854) TestRMHA#testStartAndTransitions Fails

2014-03-19 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940730#comment-13940730 ] Karthik Kambatla commented on YARN-1854: Thanks Mit. It is likely a race in the

[jira] [Updated] (YARN-1849) NPE in ResourceTrackerService#registerNodeManager for UAM on secure clusters

2014-03-19 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1849: --- Attachment: yarn-1849-1.patch NPE in ResourceTrackerService#registerNodeManager for UAM on

[jira] [Commented] (YARN-1849) NPE in ResourceTrackerService#registerNodeManager for UAM on secure clusters

2014-03-19 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940640#comment-13940640 ] Karthik Kambatla commented on YARN-1849: This time around, it turns out the master

[jira] [Created] (YARN-1856) cgroups based memory monitoring for containers

2014-03-19 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created YARN-1856: -- Summary: cgroups based memory monitoring for containers Key: YARN-1856 URL: https://issues.apache.org/jira/browse/YARN-1856 Project: Hadoop YARN Issue

[jira] [Commented] (YARN-1747) Better physical memory monitoring for containers

2014-03-19 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13940832#comment-13940832 ] Karthik Kambatla commented on YARN-1747: Re-purposing this JIRA to use cgroups for

[jira] [Assigned] (YARN-1747) Better physical memory monitoring for containers

2014-03-19 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla reassigned YARN-1747: -- Assignee: Karthik Kambatla Better physical memory monitoring for containers

[jira] [Updated] (YARN-1849) NPE in ResourceTrackerService#registerNodeManager for UAM on secure clusters

2014-03-19 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1849: --- Attachment: yarn-1849-2.patch Thinking more, thought we could benefit from better logging for

[jira] [Updated] (YARN-1849) NPE in ResourceTrackerService#registerNodeManager for UAM on secure clusters

2014-03-19 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1849: --- Attachment: yarn-1849-2.patch Cosmetic import fix. NPE in

[jira] [Updated] (YARN-1849) NPE in ResourceTrackerService#registerNodeManager for UAM on secure clusters

2014-03-19 Thread Karthik Kambatla (JIRA)
[ https://issues.apache.org/jira/browse/YARN-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1849: --- Attachment: yarn-1849-3.patch Test failure is unrelated. New patch fixes javac warning. NPE
