[jira] [Assigned] (YARN-2175) Container localization has no timeouts and tasks can be stuck there for a long time

2017-06-02 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot reassigned YARN-2175: --- Assignee: (was: Anubhav Dhoot) > Container localization has no timeouts and tasks can be

[jira] [Assigned] (YARN-2661) Container Localization is not resource limited

2017-06-02 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot reassigned YARN-2661: --- Assignee: (was: Anubhav Dhoot) > Container Localization is not resource limited >

[jira] [Assigned] (YARN-3119) Memory limit check need not be enforced unless aggregate usage of all containers is near limit

2017-06-02 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot reassigned YARN-3119: --- Assignee: (was: Anubhav Dhoot) > Memory limit check need not be enforced unless aggregate

[jira] [Assigned] (YARN-3229) Incorrect processing of container as LOST on Interruption during NM shutdown

2017-06-02 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot reassigned YARN-3229: --- Assignee: (was: Anubhav Dhoot) > Incorrect processing of container as LOST on

[jira] [Assigned] (YARN-3257) FairScheduler: MaxAm may be set too low preventing apps from starting

2017-06-02 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot reassigned YARN-3257: --- Assignee: (was: Anubhav Dhoot) > FairScheduler: MaxAm may be set too low preventing apps

[jira] [Assigned] (YARN-3994) RM should respect AM resource/placement constraints

2017-06-02 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot reassigned YARN-3994: --- Assignee: (was: Anubhav Dhoot) > RM should respect AM resource/placement constraints >

[jira] [Assigned] (YARN-4021) RuntimeException/YarnRuntimeException sent over to the client can cause client to assume a local fatal failure

2017-06-02 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot reassigned YARN-4021: --- Assignee: (was: Anubhav Dhoot) > RuntimeException/YarnRuntimeException sent over to the

[jira] [Assigned] (YARN-4076) FairScheduler does not allow AM to choose which containers to preempt

2017-06-02 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot reassigned YARN-4076: --- Assignee: (was: Anubhav Dhoot) > FairScheduler does not allow AM to choose which

[jira] [Assigned] (YARN-4030) Make Nodemanager cgroup usage for container easier to use when its running inside a cgroup

2017-06-02 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot reassigned YARN-4030: --- Assignee: (was: Anubhav Dhoot) > Make Nodemanager cgroup usage for container easier to

[jira] [Assigned] (YARN-4144) Add NM that causes LaunchFailedTransition to blacklist

2017-06-02 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot reassigned YARN-4144: --- Assignee: (was: Anubhav Dhoot) > Add NM that causes LaunchFailedTransition to blacklist >

[jira] [Updated] (YARN-4032) Corrupted state from a previous version can still cause RM to fail with NPE due to same reasons as YARN-2834

2015-10-31 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-4032: Assignee: (was: Anubhav Dhoot) Unassigning this from myself since I am not going to have time to

[jira] [Updated] (YARN-3738) Add support for recovery of reserved apps (running under dynamic queues) to Capacity Scheduler

2015-10-23 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-3738: Attachment: YARN-3738-v3.patch Retriggering jenkins with same patch > Add support for recovery of

[jira] [Commented] (YARN-3738) Add support for recovery of reserved apps (running under dynamic queues) to Capacity Scheduler

2015-10-23 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14971817#comment-14971817 ] Anubhav Dhoot commented on YARN-3738: - +1 pending jenkins > Add support for recovery of reserved apps

[jira] [Updated] (YARN-3739) Add reservation system recovery to RM recovery process

2015-10-22 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-3739: Fix Version/s: 2.8.0 > Add reservation system recovery to RM recovery process >

[jira] [Commented] (YARN-3739) Add recovery of reservation system to RM failover process

2015-10-22 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969137#comment-14969137 ] Anubhav Dhoot commented on YARN-3739: - +1 > Add recovery of reservation system to RM failover process

[jira] [Updated] (YARN-3739) Add reservation system recovery to RM recovery process

2015-10-22 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-3739: Summary: Add reservation system recovery to RM recovery process (was: Add recovery of reservation

[jira] [Updated] (YARN-4184) Remove update reservation state api from state store as its not used by ReservationSystem

2015-10-21 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-4184: Assignee: Subru Krishnan (was: Anubhav Dhoot) > Remove update reservation state api from state

[jira] [Commented] (YARN-3739) Add recovery of reservation system to RM failover process

2015-10-21 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14968136#comment-14968136 ] Anubhav Dhoot commented on YARN-3739: - Minor comment on loadState you can enumerate on

[jira] [Commented] (YARN-3985) Make ReservationSystem persist state using RMStateStore reservation APIs

2015-10-20 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14965804#comment-14965804 ] Anubhav Dhoot commented on YARN-3985: - Reran the test multiple times without failure > Make

[jira] [Updated] (YARN-3985) Make ReservationSystem persist state using RMStateStore reservation APIs

2015-10-20 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-3985: Attachment: YARN-3985.005.patch Added a retry since there are multiple events that we need to wait

[jira] [Updated] (YARN-3985) Make ReservationSystem persist state using RMStateStore reservation APIs

2015-10-20 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-3985: Attachment: YARN-3985.005.patch > Make ReservationSystem persist state using RMStateStore

[jira] [Updated] (YARN-3985) Make ReservationSystem persist state using RMStateStore reservation APIs

2015-10-19 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-3985: Attachment: YARN-3985.004.patch Thanks for the review [~asuresh]. Attached patch removes sleep by

[jira] [Commented] (YARN-4227) FairScheduler: RM quits processing expired container from a removed node

2015-10-14 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14957372#comment-14957372 ] Anubhav Dhoot commented on YARN-4227: - The previous statement should also be updated to handle a null

[jira] [Commented] (YARN-4032) Corrupted state from a previous version can still cause RM to fail with NPE due to same reasons as YARN-2834

2015-10-12 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14953924#comment-14953924 ] Anubhav Dhoot commented on YARN-4032: - This is a sample log {noformat} 2015-10-10 04:35:32,486 INFO

[jira] [Updated] (YARN-4032) Corrupted state from a previous version can still cause RM to fail with NPE due to same reasons as YARN-2834

2015-10-12 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-4032: Attachment: YARN-4032.prelim.patch Prelim patch based on the discussion > Corrupted state from a

[jira] [Commented] (YARN-4247) Deadlock in FSAppAttempt and RMAppAttemptImpl causes RM to stop processing events

2015-10-11 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14952545#comment-14952545 ] Anubhav Dhoot commented on YARN-4247: - Yup. I had tested without that change. Resolving this as not

[jira] [Commented] (YARN-4247) Deadlock in FSAppAttempt and RMAppAttemptImpl causes RM to stop processing events

2015-10-10 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14952032#comment-14952032 ] Anubhav Dhoot commented on YARN-4247: - Tested this in a cluster. Before this fix the cluster would fall

[jira] [Updated] (YARN-4247) Deadlock in FSAppAttempt and RMAppAttemptImpl causes RM to stop processing events

2015-10-09 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-4247: Attachment: YARN-4247.001.patch Fix removes need for locking from FSAppAttempt to RMAppAttemptImpl.

[jira] [Updated] (YARN-4247) Deadlock in FSAppAttempt and RMAppAttemptImpl causes RM to stop processing events

2015-10-09 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-4247: Attachment: YARN-4247.001.patch retrigger jenkins > Deadlock in FSAppAttempt and RMAppAttemptImpl

[jira] [Commented] (YARN-4235) FairScheduler PrimaryGroup does not handle empty groups returned for a user

2015-10-09 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14951331#comment-14951331 ] Anubhav Dhoot commented on YARN-4235: - Thanks [~rohithsharma] for review and commit! > FairScheduler

[jira] [Commented] (YARN-4247) Deadlock in FSAppAttempt and RMAppAttemptImpl causes RM to stop processing events

2015-10-09 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14951161#comment-14951161 ] Anubhav Dhoot commented on YARN-4247: - Looking at the jstack here is the deadlock between FS and

[jira] [Moved] (YARN-4247) Deadlock in FSAppAttempt and RMAppAttemptImpl causes RM to stop processing events

2015-10-09 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot moved MAPREDUCE-6509 to YARN-4247: Component/s: (was: resourcemanager)

[jira] [Updated] (YARN-4235) FairScheduler PrimaryGroup does not handle empty groups returned for a user

2015-10-07 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-4235: Attachment: YARN-4235.001.patch Handle empty groups > FairScheduler PrimaryGroup does not handle

[jira] [Created] (YARN-4235) FairScheduler PrimaryGroup does not handle empty groups returned for a user

2015-10-07 Thread Anubhav Dhoot (JIRA)
Anubhav Dhoot created YARN-4235: --- Summary: FairScheduler PrimaryGroup does not handle empty groups returned for a user Key: YARN-4235 URL: https://issues.apache.org/jira/browse/YARN-4235 Project:

[jira] [Commented] (YARN-4185) Retry interval delay for NM client can be improved from the fixed static retry

2015-10-05 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14943809#comment-14943809 ] Anubhav Dhoot commented on YARN-4185: - I don't think option 2 where you restart from 1 makes sense. Its

[jira] [Commented] (YARN-3996) YARN-789 (Support for zero capabilities in fairscheduler) is broken after YARN-3305

2015-10-02 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941443#comment-14941443 ] Anubhav Dhoot commented on YARN-3996: - Approach looks ok > YARN-789 (Support for zero capabilities in

[jira] [Updated] (YARN-4185) Retry interval delay for NM client can be improved from the fixed static retry

2015-09-30 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-4185: Assignee: Neelesh Srinivas Salian (was: Anubhav Dhoot) > Retry interval delay for NM client can be

[jira] [Commented] (YARN-3996) YARN-789 (Support for zero capabilities in fairscheduler) is broken after YARN-3305

2015-09-30 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14938225#comment-14938225 ] Anubhav Dhoot commented on YARN-3996: - SchedulerUtils has multiple overloads of normalizeRequests. The

[jira] [Commented] (YARN-4185) Retry interval delay for NM client can be improved from the fixed static retry

2015-09-30 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14938190#comment-14938190 ] Anubhav Dhoot commented on YARN-4185: - Can we try to reuse the existing values for retries

[jira] [Updated] (YARN-3996) YARN-789 (Support for zero capabilities in fairscheduler) is broken after YARN-3305

2015-09-30 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-3996: Assignee: Neelesh Srinivas Salian (was: Anubhav Dhoot) > YARN-789 (Support for zero capabilities in

[jira] [Commented] (YARN-4204) ConcurrentModificationException in FairSchedulerQueueInfo

2015-09-28 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14933547#comment-14933547 ] Anubhav Dhoot commented on YARN-4204: - Committed to trunk and branch-2. Thanks [~kasha] for the review.

[jira] [Updated] (YARN-4180) AMLauncher does not retry on failures when talking to NM

2015-09-28 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-4180: Attachment: YARN-4180-branch-2.7.2.txt Minor conflicts in backporting changes to branch 2.7 >

[jira] [Updated] (YARN-4204) ConcurrentModificationException in FairSchedulerQueueInfo

2015-09-24 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-4204: Description: Saw this exception which caused RM to go down {noformat}

[jira] [Updated] (YARN-4180) AMLauncher does not retry on failures when talking to NM

2015-09-24 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-4180: Attachment: YARN-4180.002.patch Try triggering jenkins again > AMLauncher does not retry on

[jira] [Updated] (YARN-4204) ConcurrentModificationException in FairSchedulerQueueInfo

2015-09-24 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-4204: Attachment: YARN-4204.002.patch Add unit test to repro ConcurrentModificationException >

[jira] [Commented] (YARN-4204) ConcurrentModificationException in FairSchedulerQueueInfo

2015-09-23 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905621#comment-14905621 ] Anubhav Dhoot commented on YARN-4204: - Issue is getChildQueues is returning an unmodifiable list

[jira] [Updated] (YARN-4204) ConcurrentModificationException in FairSchedulerQueueInfo

2015-09-23 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-4204: Attachment: YARN-4204.001.patch > ConcurrentModificationException in FairSchedulerQueueInfo >

[jira] [Created] (YARN-4204) ConcurrentModificationException in FairSchedulerQueueInfo

2015-09-23 Thread Anubhav Dhoot (JIRA)
Anubhav Dhoot created YARN-4204: --- Summary: ConcurrentModificationException in FairSchedulerQueueInfo Key: YARN-4204 URL: https://issues.apache.org/jira/browse/YARN-4204 Project: Hadoop YARN

[jira] [Commented] (YARN-4180) AMLauncher does not retry on failures when talking to NM

2015-09-22 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903217#comment-14903217 ] Anubhav Dhoot commented on YARN-4180: - The test failure looks unrelated. > AMLauncher does not retry

[jira] [Updated] (YARN-4180) AMLauncher does not retry on failures when talking to NM

2015-09-22 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-4180: Attachment: YARN-4180.002.patch Addressed feedback > AMLauncher does not retry on failures when

[jira] [Updated] (YARN-4180) AMLauncher does not retry on failures when talking to NM

2015-09-18 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-4180: Attachment: YARN-4180.001.patch Reuse the same retry proxy used by AM client for RM client. Also

[jira] [Created] (YARN-4185) Retry interval delay for NM client can be improved from the fixed static retry

2015-09-18 Thread Anubhav Dhoot (JIRA)
Anubhav Dhoot created YARN-4185: --- Summary: Retry interval delay for NM client can be improved from the fixed static retry Key: YARN-4185 URL: https://issues.apache.org/jira/browse/YARN-4185 Project:

[jira] [Updated] (YARN-3985) Make ReservationSystem persist state using RMStateStore reservation APIs

2015-09-18 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-3985: Attachment: YARN-3985.003.patch > Make ReservationSystem persist state using RMStateStore

[jira] [Commented] (YARN-2005) Blacklisting support for scheduling AMs

2015-09-18 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876038#comment-14876038 ] Anubhav Dhoot commented on YARN-2005: - Thanks [~jianhe], [~sunilg], [~jlowe] for the reviews and

[jira] [Commented] (YARN-4131) Add API and CLI to kill container on given containerId

2015-09-18 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876669#comment-14876669 ] Anubhav Dhoot commented on YARN-4131: - Can you please add [~kasha] and me to this? We are interested in

[jira] [Commented] (YARN-3920) FairScheduler container reservation on a node should be configurable to limit it to large containers

2015-09-18 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876473#comment-14876473 ] Anubhav Dhoot commented on YARN-3920: - Thx [~asuresh] for review and commit! > FairScheduler container

[jira] [Commented] (YARN-4143) Optimize the check for AMContainer allocation needed by blacklisting and ContainerType

2015-09-18 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876494#comment-14876494 ] Anubhav Dhoot commented on YARN-4143: - Cannot think of an API to add to the scheduler that will be

[jira] [Created] (YARN-4184) Remove update reservation state api from state store as its not used by ReservationSystem

2015-09-18 Thread Anubhav Dhoot (JIRA)
Anubhav Dhoot created YARN-4184: --- Summary: Remove update reservation state api from state store as its not used by ReservationSystem Key: YARN-4184 URL: https://issues.apache.org/jira/browse/YARN-4184

[jira] [Commented] (YARN-3985) Make ReservationSystem persist state using RMStateStore reservation APIs

2015-09-18 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876049#comment-14876049 ] Anubhav Dhoot commented on YARN-3985: - Addressed feedback and opened YARN-4184 for removing

[jira] [Commented] (YARN-4143) Optimize the check for AMContainer allocation needed by blacklisting and ContainerType

2015-09-17 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804344#comment-14804344 ] Anubhav Dhoot commented on YARN-4143: - I think we can minimize the impact of checking on every allocate

[jira] [Updated] (YARN-4143) Optimize the check for AMContainer allocation needed by blacklisting and ContainerType

2015-09-17 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-4143: Attachment: YARN-4143.001.patch Attached patch ensures checks are done only when AM is not allocated

[jira] [Created] (YARN-4180) AMLauncher does not retry on failures when talking to NM

2015-09-17 Thread Anubhav Dhoot (JIRA)
Anubhav Dhoot created YARN-4180: --- Summary: AMLauncher does not retry on failures when talking to NM Key: YARN-4180 URL: https://issues.apache.org/jira/browse/YARN-4180 Project: Hadoop YARN

[jira] [Commented] (YARN-4180) AMLauncher does not retry on failures when talking to NM

2015-09-17 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804619#comment-14804619 ] Anubhav Dhoot commented on YARN-4180: - Propose using retries in the ContainerManagement proxy used by

[jira] [Updated] (YARN-3985) Make ReservationSystem persist state using RMStateStore reservation APIs

2015-09-16 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-3985: Attachment: YARN-3985.002.patch All the failed tests passed locally for me. Rerunning the tests >

[jira] [Commented] (YARN-4143) Optimize the check for AMContainer allocation needed by blacklisting and ContainerType

2015-09-16 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791259#comment-14791259 ] Anubhav Dhoot commented on YARN-4143: - Copying comments from YARN-2005 Sunil G added a comment -

[jira] [Updated] (YARN-3985) Make ReservationSystem persist state using RMStateStore reservation APIs

2015-09-15 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-3985: Attachment: YARN-3985.002.patch Retriggering jenkins as failures seem unrelated > Make

[jira] [Commented] (YARN-4130) Duplicate declaration of ApplicationId in RMAppManager

2015-09-15 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14745782#comment-14745782 ] Anubhav Dhoot commented on YARN-4130: - LGTM the failures look unrelated > Duplicate declaration of

[jira] [Commented] (YARN-3985) Make ReservationSystem persist state using RMStateStore reservation APIs

2015-09-15 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14745893#comment-14745893 ] Anubhav Dhoot commented on YARN-3985: - failures are due to rebase with YARN-3656. Will update those new

[jira] [Updated] (YARN-3985) Make ReservationSystem persist state using RMStateStore reservation APIs

2015-09-15 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-3985: Attachment: YARN-3985.002.patch Updated patch to modify new tests to add valid ReservationDefinition

[jira] [Commented] (YARN-4135) Improve the assertion message in MockRM while failing after waiting for the state.

2015-09-15 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14745775#comment-14745775 ] Anubhav Dhoot commented on YARN-4135: - The patch looks fine except for missing spaces before and

[jira] [Created] (YARN-4156) testAMBlacklistPreventsRestartOnSameNode assumes CapacityScheduler

2015-09-14 Thread Anubhav Dhoot (JIRA)
Anubhav Dhoot created YARN-4156: --- Summary: testAMBlacklistPreventsRestartOnSameNode assumes CapacityScheduler Key: YARN-4156 URL: https://issues.apache.org/jira/browse/YARN-4156 Project: Hadoop YARN

[jira] [Updated] (YARN-4156) testAMBlacklistPreventsRestartOnSameNode assumes CapacityScheduler

2015-09-14 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-4156: Attachment: YARN-4156.001.patch Uploading patch that configures the scheduler to be

[jira] [Commented] (YARN-3784) Indicate preemption timout along with the list of containers to AM (preemption message)

2015-09-14 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14744006#comment-14744006 ] Anubhav Dhoot commented on YARN-3784: - Hi [~sunilg] this does not include support for FairScheduler.

[jira] [Commented] (YARN-4115) Reduce loglevel of ContainerManagementProtocolProxy to Debug

2015-09-11 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14741314#comment-14741314 ] Anubhav Dhoot commented on YARN-4115: - The test failure looks unrelated. > Reduce loglevel of

[jira] [Assigned] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging

2015-09-11 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot reassigned YARN-3273: --- Assignee: Anubhav Dhoot (was: Rohith Sharma K S) > Improve web UI to facilitate scheduling

[jira] [Commented] (YARN-4145) Make RMHATestBase abstract so its not run when running all tests under that namespace

2015-09-11 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14741309#comment-14741309 ] Anubhav Dhoot commented on YARN-4145: - The timed out tests are not related to this base class. > Make

[jira] [Updated] (YARN-4150) Failure in TestNMClient because nodereports were not available

2015-09-11 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-4150: Description: Saw a failure in a test run

[jira] [Updated] (YARN-4150) Failure in TestNMClient because nodereports were not available

2015-09-11 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-4150: Attachment: YARN-4150.001.patch Simple fix to wait for nodemanagers to be up before trying to get

[jira] [Created] (YARN-4150) Failure in TestNMClient because nodereports were not available

2015-09-11 Thread Anubhav Dhoot (JIRA)
Anubhav Dhoot created YARN-4150: --- Summary: Failure in TestNMClient because nodereports were not available Key: YARN-4150 URL: https://issues.apache.org/jira/browse/YARN-4150 Project: Hadoop YARN

[jira] [Commented] (YARN-4115) Reduce loglevel of ContainerManagementProtocolProxy to Debug

2015-09-11 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14741334#comment-14741334 ] Anubhav Dhoot commented on YARN-4115: - The test passes for me locally. Opened YARN-4150 for fixing the

[jira] [Commented] (YARN-4150) Failure in TestNMClient because nodereports were not available

2015-09-11 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14741338#comment-14741338 ] Anubhav Dhoot commented on YARN-4150: - This is most likely due to the test reading the node reports

[jira] [Updated] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging

2015-09-11 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-3273: Assignee: Rohith Sharma K S (was: Anubhav Dhoot) > Improve web UI to facilitate scheduling analysis

[jira] [Updated] (YARN-3985) Make ReservationSystem persist state using RMStateStore reservation APIs

2015-09-10 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-3985: Attachment: YARN-3985.001.patch Added a patch that calls into state store and unit test that

[jira] [Commented] (YARN-3985) Make ReservationSystem persist state using RMStateStore reservation APIs

2015-09-10 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14739939#comment-14739939 ] Anubhav Dhoot commented on YARN-3985: - Since updateReservation does an add and remove we do not have

[jira] [Commented] (YARN-2005) Blacklisting support for scheduling AMs

2015-09-10 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14739700#comment-14739700 ] Anubhav Dhoot commented on YARN-2005: - [~sunilg] that's a good suggestion. Added a followup for this

[jira] [Updated] (YARN-4145) Make RMHATestBase abstract so its not run when running all tests under that namespace

2015-09-10 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-4145: Attachment: YARN-4145.001.patch > Make RMHATestBase abstract so its not run when running all tests

[jira] [Updated] (YARN-2005) Blacklisting support for scheduling AMs

2015-09-10 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-2005: Attachment: YARN-2005.009.patch Addressed feedback > Blacklisting support for scheduling AMs >

[jira] [Created] (YARN-4144) Add NM that causes LaunchFailedTransition to blacklist

2015-09-10 Thread Anubhav Dhoot (JIRA)
Anubhav Dhoot created YARN-4144: --- Summary: Add NM that causes LaunchFailedTransition to blacklist Key: YARN-4144 URL: https://issues.apache.org/jira/browse/YARN-4144 Project: Hadoop YARN Issue

[jira] [Commented] (YARN-2005) Blacklisting support for scheduling AMs

2015-09-10 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14739708#comment-14739708 ] Anubhav Dhoot commented on YARN-2005: - Added YARN-4144 to add the node that causes

[jira] [Created] (YARN-4145) Make RMHATestBase abstract so its not run when running all tests under that namespace

2015-09-10 Thread Anubhav Dhoot (JIRA)
Anubhav Dhoot created YARN-4145: --- Summary: Make RMHATestBase abstract so its not run when running all tests under that namespace Key: YARN-4145 URL: https://issues.apache.org/jira/browse/YARN-4145

[jira] [Updated] (YARN-4145) Make RMHATestBase abstract so its not run when running all tests under that namespace

2015-09-10 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-4145: Description: Make it abstract to avoid running it as a test (was: Trivial patch to make it

[jira] [Created] (YARN-4143) Optimize the check for AMContainer allocation needed by blacklisting and ContainerType

2015-09-10 Thread Anubhav Dhoot (JIRA)
Anubhav Dhoot created YARN-4143: --- Summary: Optimize the check for AMContainer allocation needed by blacklisting and ContainerType Key: YARN-4143 URL: https://issues.apache.org/jira/browse/YARN-4143

[jira] [Assigned] (YARN-4143) Optimize the check for AMContainer allocation needed by blacklisting and ContainerType

2015-09-10 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot reassigned YARN-4143: --- Assignee: Anubhav Dhoot > Optimize the check for AMContainer allocation needed by

[jira] [Commented] (YARN-2005) Blacklisting support for scheduling AMs

2015-09-10 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14739614#comment-14739614 ] Anubhav Dhoot commented on YARN-2005: - Hi [~kasha] thanks for your comments. 2.4 - we do not need to

[jira] [Commented] (YARN-2005) Blacklisting support for scheduling AMs

2015-09-10 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14739618#comment-14739618 ] Anubhav Dhoot commented on YARN-2005: - [~He Tianyi] yes we are using the ContainerExitStatus in this.

[jira] [Updated] (YARN-3985) Make ReservationSystem persist state using RMStateStore reservation APIs

2015-09-08 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-3985: Component/s: (was: fairscheduler) (was: capacityscheduler) > Make

[jira] [Updated] (YARN-4115) Reduce loglevel of ContainerManagementProtocolProxy to Debug

2015-09-04 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-4115: Attachment: YARN-4115.001.patch Change the default log level to Debug > Reduce loglevel of

[jira] [Created] (YARN-4115) Reduce loglevel of ContainerManagementProtocolProxy to Debug

2015-09-04 Thread Anubhav Dhoot (JIRA)
Anubhav Dhoot created YARN-4115: --- Summary: Reduce loglevel of ContainerManagementProtocolProxy to Debug Key: YARN-4115 URL: https://issues.apache.org/jira/browse/YARN-4115 Project: Hadoop YARN

[jira] [Commented] (YARN-3676) Disregard 'assignMultiple' directive while scheduling apps with NODE_LOCAL resource requests

2015-09-04 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14731244#comment-14731244 ] Anubhav Dhoot commented on YARN-3676: - Thanks [~asuresh] for working on this. I see the patch continues

[jira] [Commented] (YARN-4087) Set YARN_FAIL_FAST to be false by default

2015-09-03 Thread Anubhav Dhoot (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14729422#comment-14729422 ] Anubhav Dhoot commented on YARN-4087: - In general if we are not failing the daemon if fail fast flag is
