[jira] [Commented] (YARN-6524) Avoid storing unnecessary information in the Memory for the finished apps

2017-04-25 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982974#comment-15982974 ] Jason Lowe commented on YARN-6524: -- This is a duplicate of YARN-65. > Avoid storing unnecessary

[jira] [Resolved] (YARN-6524) Avoid storing unnecessary information in the Memory for the finished apps

2017-04-25 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe resolved YARN-6524. -- Resolution: Duplicate > Avoid storing unnecessary information in the Memory for the finished apps >

[jira] [Commented] (YARN-5892) Capacity Scheduler: Support user-specific minimum user limit percent

2017-04-24 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981164#comment-15981164 ] Jason Lowe commented on YARN-5892: -- I don't understand imposing a hard limit of weight < 100/MULP. For

[jira] [Commented] (YARN-3839) Quit throwing NMNotYetReadyException

2017-04-21 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15978705#comment-15978705 ] Jason Lowe commented on YARN-3839: -- Please see my [earlier

[jira] [Commented] (YARN-6501) FSSchedulerNode.java fails to compile with JDK7

2017-04-20 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15976822#comment-15976822 ] Jason Lowe commented on YARN-6501: -- +1 committing this. > FSSchedulerNode.java fails to compile with JDK7

[jira] [Commented] (YARN-3839) Quit throwing NMNotYetReadyException

2017-04-19 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15975315#comment-15975315 ] Jason Lowe commented on YARN-3839: -- The QA bot is probably having issues with the patch since there are a

[jira] [Commented] (YARN-6272) TestAMRMClient#testAMRMClientWithContainerResourceChange fails intermittently

2017-04-19 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974876#comment-15974876 ] Jason Lowe commented on YARN-6272: -- I've also seen this stacktrace on 2.8: {noformat}

[jira] [Commented] (YARN-2113) Add cross-user preemption within CapacityScheduler's leaf-queue

2017-04-19 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974679#comment-15974679 ] Jason Lowe commented on YARN-2113: -- The more I think about this, I believe it is completely correct to

[jira] [Commented] (YARN-2113) Add cross-user preemption within CapacityScheduler's leaf-queue

2017-04-18 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15973638#comment-15973638 ] Jason Lowe commented on YARN-2113: -- Is a deadzone the proper way to fix this? I'm thinking of a case

[jira] [Commented] (YARN-6467) CSQueueMetrics needs to update the current metrics for default partition only

2017-04-18 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15973466#comment-15973466 ] Jason Lowe commented on YARN-6467: -- bq. I thought of segregating partition based queue metrics in a

[jira] [Commented] (YARN-5892) Capacity Scheduler: Support user-specific minimum user limit percent

2017-04-18 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15973168#comment-15973168 ] Jason Lowe commented on YARN-5892: -- bq. Also, weight of users applies to hard limit of user (user limit

[jira] [Commented] (YARN-5892) Capacity Scheduler: Support user-specific minimum user limit percent

2017-04-18 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15972821#comment-15972821 ] Jason Lowe commented on YARN-5892: -- I'm +1 for weight == 0. As long as it doesn't break the code (e.g.:

[jira] [Commented] (YARN-6480) Timeout is too aggressive for TestAMRestart.testPreemptedAMRestartOnRMRestart

2017-04-14 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15969614#comment-15969614 ] Jason Lowe commented on YARN-6480: -- +1 lgtm. Committing this. > Timeout is too aggressive for

[jira] [Commented] (YARN-2985) YARN should support to delete the aggregated logs for Non-MapReduce applications

2017-04-14 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15969088#comment-15969088 ] Jason Lowe commented on YARN-2985: -- Doing a config for branch-2 seems reasonable. bq. my understanding is

[jira] [Assigned] (YARN-3839) Quit throwing NMNotYetReadyException

2017-04-14 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reassigned YARN-3839: Assignee: Manikandan R > Quit throwing NMNotYetReadyException >

[jira] [Updated] (YARN-5617) AMs only intended to run one attempt can be run more than once

2017-04-13 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-5617: - Attachment: YARN-5617.003.patch Updated the patch. > AMs only intended to run one attempt can be run more

[jira] [Commented] (YARN-3839) Quit throwing NMNotYetReadyException

2017-04-13 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15968022#comment-15968022 ] Jason Lowe commented on YARN-3839: -- My understanding is the same. It looks like the existing cases when

[jira] [Commented] (YARN-2985) YARN should support to delete the aggregated logs for Non-MapReduce applications

2017-04-13 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15967748#comment-15967748 ] Jason Lowe commented on YARN-2985: -- Based on the description of this JIRA, I think there's some confusion

[jira] [Commented] (YARN-6461) TestRMAdminCLI has very low test timeouts

2017-04-11 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15964424#comment-15964424 ] Jason Lowe commented on YARN-6461: -- +1 lgtm. Committing this. > TestRMAdminCLI has very low test

[jira] [Commented] (YARN-6195) Export UsedCapacity and AbsoluteUsedCapacity to JMX

2017-04-11 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15964375#comment-15964375 ] Jason Lowe commented on YARN-6195: -- I do not think a separate HADOOP JIRA is necessary here. Committing

[jira] [Created] (YARN-6461) TestRMAdminCLI has very low test timeouts

2017-04-10 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-6461:
Summary: TestRMAdminCLI has very low test timeouts
Key: YARN-6461
URL: https://issues.apache.org/jira/browse/YARN-6461
Project: Hadoop YARN
Issue Type: Bug

[jira] [Commented] (YARN-6456) Isolation of Docker containers In LinuxContainerExecutor

2017-04-10 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15962801#comment-15962801 ] Jason Lowe commented on YARN-6456: -- bq. DockerLinuxContainerRuntime mounts containerLocalDirs

[jira] [Commented] (YARN-6451) Create a monitor to check whether we maintain RM (scheduling) invariants

2017-04-07 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15961022#comment-15961022 ] Jason Lowe commented on YARN-6451: -- Interesting idea. For some of these invariants, would it make more

[jira] [Commented] (YARN-6195) Export UsedCapacity and AbsoluteUsedCapacity to JMX

2017-04-07 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15960991#comment-15960991 ] Jason Lowe commented on YARN-6195: -- Thanks for updating the patch! At first I thought we had a potential

[jira] [Commented] (YARN-6443) Allow for Priority order relaxing in favor of improved node/rack locality

2017-04-07 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15960755#comment-15960755 ] Jason Lowe commented on YARN-6443: -- Ah, so this apparently is describing a problem that only can occur if

[jira] [Commented] (YARN-6288) Exceptions during aggregated log writes are mishandled

2017-04-06 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15959782#comment-15959782 ] Jason Lowe commented on YARN-6288: -- +1 for the branch-2.8 patch as well. The unit test failures are

[jira] [Updated] (YARN-6288) Exceptions during aggregated log writes are mishandled

2017-04-06 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-6288: - Affects Version/s: 2.8.0 2.7.3 Priority: Critical (was: Minor)

[jira] [Commented] (YARN-6344) Rethinking OFF_SWITCH locality in CapacityScheduler

2017-04-05 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957849#comment-15957849 ] Jason Lowe commented on YARN-6344: -- I'd prefer a configured rack locality delay of zero means no

[jira] [Updated] (YARN-6450) TestContainerManagerWithLCE requires override for each new test added to ContainerManagerTest

2017-04-05 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-6450: - Attachment: YARN-6450.001.patch Using {{Assume.assumeTrue(shouldRunTest())}} in the existing setup

[jira] [Created] (YARN-6450) TestContainerManagerWithLCE requires override for each new test added to ContainerManagerTest

2017-04-05 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-6450:
Summary: TestContainerManagerWithLCE requires override for each new test added to ContainerManagerTest
Key: YARN-6450
URL: https://issues.apache.org/jira/browse/YARN-6450

[jira] [Commented] (YARN-6403) Invalid local resource request can raise NPE and make NM exit

2017-04-05 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957290#comment-15957290 ] Jason Lowe commented on YARN-6403: -- +1 for the latest trunk and 2.8 patches. Committing this. > Invalid

[jira] [Updated] (YARN-6436) TestSchedulingPolicy#testParseSchedulingPolicy timeout is too low

2017-04-05 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-6436: - Fix Version/s: 2.8.1 I committed this to branch-2.8 as well. >

[jira] [Commented] (YARN-6443) Allow for Priority order relaxing in favor of improved node/rack locality

2017-04-05 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15956947#comment-15956947 ] Jason Lowe commented on YARN-6443: -- Could you elaborate a bit on the use case where an application bothers

[jira] [Commented] (YARN-6195) Export UsedCapacity and AbsoluteUsedCapacity to JMX

2017-04-05 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15956922#comment-15956922 ] Jason Lowe commented on YARN-6195: -- I'm totally OK with just reporting the default partition's stats in

[jira] [Updated] (YARN-6403) Invalid local resource request can raise NPE and make NM exit

2017-04-04 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-6403: - Attachment: YARN-6403.branch-2.8.004.patch Thanks for updating the patch! Looks good to me. Posting the

[jira] [Commented] (YARN-6436) TestSchedulingPolicy#testParseSchedulingPolicy timeout is too low

2017-04-04 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1594#comment-1594 ] Jason Lowe commented on YARN-6436: -- Do we even need a timeout for this test? >

[jira] [Commented] (YARN-6195) Export UsedCapacity and AbsoluteUsedCapacity to JMX

2017-04-04 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1592#comment-1592 ] Jason Lowe commented on YARN-6195: -- This seems like a reasonable approach to take until the node label

[jira] [Updated] (YARN-6437) TestSignalContainer#testSignalRequestDeliveryToNM fails intermittently

2017-04-04 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-6437: - Attachment: YARN-6437.001.patch Patch that accumulates the received containers across all the allocate

[jira] [Created] (YARN-6437) TestSignalContainer#testSignalRequestDeliveryToNM fails intermittently

2017-04-04 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-6437:
Summary: TestSignalContainer#testSignalRequestDeliveryToNM fails intermittently
Key: YARN-6437
URL: https://issues.apache.org/jira/browse/YARN-6437
Project: Hadoop YARN

[jira] [Created] (YARN-6436) TestSchedulingPolicy#testParseSchedulingPolicy timeout is too low

2017-04-04 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-6436:
Summary: TestSchedulingPolicy#testParseSchedulingPolicy timeout is too low
Key: YARN-6436
URL: https://issues.apache.org/jira/browse/YARN-6436
Project: Hadoop YARN

[jira] [Commented] (YARN-6406) Garbage Collect unused SchedulerRequestKeys

2017-03-31 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951542#comment-15951542 ] Jason Lowe commented on YARN-6406: -- Yep, the refcount was only added because of the possibility of the two

[jira] [Commented] (YARN-2113) Add cross-user preemption within CapacityScheduler's leaf-queue

2017-03-31 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951538#comment-15951538 ] Jason Lowe commented on YARN-2113: -- The answer is no to both questions. Both users are below the 100%

[jira] [Commented] (YARN-2113) Add cross-user preemption within CapacityScheduler's leaf-queue

2017-03-31 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951445#comment-15951445 ] Jason Lowe commented on YARN-2113: -- bq. once user submit app3 with highest priority, it should get

[jira] [Commented] (YARN-6403) Invalid local resource request can raise NPE and make NM exit

2017-03-31 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951182#comment-15951182 ] Jason Lowe commented on YARN-6403: -- Thanks for updating the patch! bq.

[jira] [Commented] (YARN-6411) Clean up the overwrite of createDispatcher() in subclass of MockRM

2017-03-31 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951087#comment-15951087 ] Jason Lowe commented on YARN-6411: -- +1 lgtm. Committing this. > Clean up the overwrite of

[jira] [Commented] (YARN-6354) LeveldbRMStateStore can parse invalid keys when recovering reservations

2017-03-30 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15949637#comment-15949637 ] Jason Lowe commented on YARN-6354: -- The TestRMRestart failure is unrelated. > LeveldbRMStateStore can

[jira] [Commented] (YARN-6411) Clean up the overwrite of createDispatcher() in subclass of MockRM

2017-03-30 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15949230#comment-15949230 ] Jason Lowe commented on YARN-6411: -- Thanks for the patch! Looks good overall. It would be nice to

[jira] [Assigned] (YARN-6403) Invalid local resource request can raise NPE and make NM exit

2017-03-30 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reassigned YARN-6403: Assignee: Tao Yang Target Version/s: 2.8.1 Submitting patch so Jenkins can comment on

[jira] [Updated] (YARN-6354) LeveldbRMStateStore can parse invalid keys when recovering reservations

2017-03-29 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-6354: - Attachment: YARN-6354.001.patch Patch that adds a termination check for the reservation key traversal loop

[jira] [Assigned] (YARN-6354) LeveldbRMStateStore can parse invalid keys when recovering reservations

2017-03-29 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reassigned YARN-6354: Assignee: Jason Lowe > LeveldbRMStateStore can parse invalid keys when recovering reservations >

[jira] [Commented] (YARN-6195) Export UsedCapacity and AbsoluteUsedCapacity to JMX

2017-03-29 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15947730#comment-15947730 ] Jason Lowe commented on YARN-6195: -- Latest patch lgtm, with the caveat that I don't think we can really

[jira] [Commented] (YARN-6401) terminating signal should be able to specify per application to support graceful-stop

2017-03-29 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15947656#comment-15947656 ] Jason Lowe commented on YARN-6401: -- Ah, sorry. I was thinking it was ignoring SIGTERM and thus not

[jira] [Commented] (YARN-6168) Restarted RM may not inform AM about all existing containers

2017-03-29 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15947326#comment-15947326 ] Jason Lowe commented on YARN-6168: -- This sounds like the RM isn't waiting long enough for all the live NMs

[jira] [Commented] (YARN-6403) Invalid local resource request can raise NPE and make NM exit

2017-03-29 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15947283#comment-15947283 ] Jason Lowe commented on YARN-6403: -- Sorry, I completely missed the server-side change in ContainerImpl.

[jira] [Commented] (YARN-6403) Invalid local resource request can raise NPE and make NM exit

2017-03-29 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15947251#comment-15947251 ] Jason Lowe commented on YARN-6403: -- Thanks for the patch! This patch is changing the client code but not

[jira] [Commented] (YARN-6406) Garbage Collect unused SchedulerRequestKeys

2017-03-28 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15946028#comment-15946028 ] Jason Lowe commented on YARN-6406: -- I haven't dug into YARN-6040, but in general I'm a big +1 for having

[jira] [Commented] (YARN-6403) Invalid local resource request can raise NPE and make NM exit

2017-03-28 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15945116#comment-15945116 ] Jason Lowe commented on YARN-6403: -- The NM should definitely be hardened against malformed data being sent

[jira] [Commented] (YARN-6359) TestRM#testApplicationKillAtAcceptedState fails rarely due to race condition

2017-03-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15944122#comment-15944122 ] Jason Lowe commented on YARN-6359: -- +1 lgtm. Will commit this tomorrow if there are no objections. >

[jira] [Commented] (YARN-2113) Add cross-user preemption within CapacityScheduler's leaf-queue

2017-03-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15944117#comment-15944117 ] Jason Lowe commented on YARN-2113: -- I agree with Eric here. I see priority as a way for a user to change

[jira] [Commented] (YARN-6401) terminating signal should be able to specify per application to support graceful-stop

2017-03-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15943307#comment-15943307 ] Jason Lowe commented on YARN-6401: -- Is this something YARN needs to support directly? This seems

[jira] [Commented] (YARN-6217) TestLocalCacheDirectoryManager test timeout is too aggressive

2017-03-17 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15930513#comment-15930513 ] Jason Lowe commented on YARN-6217: -- +1 lgtm. Committing this. > TestLocalCacheDirectoryManager test

[jira] [Commented] (YARN-6359) TestRM#testApplicationKillAtAcceptedState fails rarely due to race condition

2017-03-17 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15930358#comment-15930358 ] Jason Lowe commented on YARN-6359: -- Thanks for the report and patch! The timeout in the loop is 80

[jira] [Commented] (YARN-6217) TestLocalCacheDirectoryManager test timeout is too aggressive

2017-03-17 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15930346#comment-15930346 ] Jason Lowe commented on YARN-6217: -- I tend to agree. Originally there was an edict to put a timeout on

[jira] [Commented] (YARN-6315) Improve LocalResourcesTrackerImpl#isResourcePresent to return false for corrupted files

2017-03-17 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15930205#comment-15930205 ] Jason Lowe commented on YARN-6315: -- I tried to run this in an end-to-end test and found it doesn't work in

[jira] [Updated] (YARN-6354) LeveldbRMStateStore can parse invalid keys when recovering reservations

2017-03-16 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-6354: - Priority: Major (was: Critical) Summary: LeveldbRMStateStore can parse invalid keys when recovering

[jira] [Commented] (YARN-6354) RM fails to upgrade to 2.8 with leveldb state store

2017-03-16 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15928843#comment-15928843 ] Jason Lowe commented on YARN-6354: -- Sample stacktrace: {noformat} 2017-03-16 15:17:26,616 INFO [main]

[jira] [Created] (YARN-6354) RM fails to upgrade to 2.8 with leveldb state store

2017-03-16 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-6354:
Summary: RM fails to upgrade to 2.8 with leveldb state store
Key: YARN-6354
URL: https://issues.apache.org/jira/browse/YARN-6354
Project: Hadoop YARN
Issue Type:

[jira] [Commented] (YARN-6315) Improve LocalResourcesTrackerImpl#isResourcePresent to return false for corrupted files

2017-03-16 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15928292#comment-15928292 ] Jason Lowe commented on YARN-6315: -- Thanks for updating the patch! Catching Exception is too wide of a

[jira] [Created] (YARN-6349) Container kill request from AM can be lost if container is still recovering

2017-03-16 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-6349:
Summary: Container kill request from AM can be lost if container is still recovering
Key: YARN-6349
URL: https://issues.apache.org/jira/browse/YARN-6349
Project: Hadoop YARN

[jira] [Commented] (YARN-6349) Container kill request from AM can be lost if container is still recovering

2017-03-16 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15928185#comment-15928185 ] Jason Lowe commented on YARN-6349: -- See YARN-4051 for related discussion. > Container kill request from

[jira] [Commented] (YARN-4051) ContainerKillEvent lost when container is still recovering and application finishes

2017-03-16 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15928139#comment-15928139 ] Jason Lowe commented on YARN-4051: -- +1 for the branch-2 patch as well. The unit test failure appears to

[jira] [Commented] (YARN-4051) ContainerKillEvent lost when container is still recovering and application finishes

2017-03-15 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15926276#comment-15926276 ] Jason Lowe commented on YARN-4051: -- +1 for the latest patch, however it doesn't apply to branch-2. Could

[jira] [Commented] (YARN-6315) Improve LocalResourcesTrackerImpl#isResourcePresent to return false for corrupted files

2017-03-14 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15925152#comment-15925152 ] Jason Lowe commented on YARN-6315: -- Thanks for the patch! Looks good overall, just some minor nits:

[jira] [Commented] (YARN-6325) ParentQueue and LeafQueue with same name can cause queue name based operations to fail

2017-03-14 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15924952#comment-15924952 ] Jason Lowe commented on YARN-6325: -- I don't have the full backstory on queue name requirements, but I

[jira] [Commented] (YARN-3884) App History status not updated when RMContainer transitions from RESERVED to KILLED

2017-03-14 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15924637#comment-15924637 ] Jason Lowe commented on YARN-3884: -- If only nodemanagers are reporting then allocations that are never

[jira] [Updated] (YARN-4051) ContainerKillEvent lost when container is still recovering and application finishes

2017-03-13 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-4051: - Summary: ContainerKillEvent lost when container is still recovering and application finishes (was:

[jira] [Commented] (YARN-6195) Export UsedCapacity and AbsoluteUsedCapacity to JMX

2017-03-13 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15922687#comment-15922687 ] Jason Lowe commented on YARN-6195: -- Pinging [~leftnoteasy] and [~sunilg] to see if there's an opinion on

[jira] [Commented] (YARN-6195) Export UsedCapacity and AbsoluteUsedCapacity to JMX

2017-03-13 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15907541#comment-15907541 ] Jason Lowe commented on YARN-6195: -- Thanks for updating the patch! I discovered the queue metrics are

[jira] [Commented] (YARN-6321) TestResources test timeouts are too aggressive

2017-03-10 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905561#comment-15905561 ] Jason Lowe commented on YARN-6321: -- +1 lgtm. Committing this. > TestResources test timeouts are too

[jira] [Commented] (YARN-6310) OutputStreams in AggregatedLogFormat.LogWriter can be left open upon exceptions

2017-03-10 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905407#comment-15905407 ] Jason Lowe commented on YARN-6310: -- +1 lgtm. Committing this. > OutputStreams in

[jira] [Commented] (YARN-6321) TestResources test timeouts are too aggressive

2017-03-10 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905372#comment-15905372 ] Jason Lowe commented on YARN-6321: -- Sample stacktrace from a timeout: {noformat} java.lang.Exception: test

[jira] [Created] (YARN-6321) TestResources test timeouts are too aggressive

2017-03-10 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-6321:
Summary: TestResources test timeouts are too aggressive
Key: YARN-6321
URL: https://issues.apache.org/jira/browse/YARN-6321
Project: Hadoop YARN
Issue Type: Bug

[jira] [Commented] (YARN-6195) Export UsedCapacity and AbsoluteUsedCapacity to JMX

2017-03-10 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905293#comment-15905293 ] Jason Lowe commented on YARN-6195: -- What really needs to happen is a metrics-per-partition similar to how

[jira] [Commented] (YARN-6165) Intra-queue preemption occurs even when preemption is turned off for a specific queue.

2017-03-08 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902110#comment-15902110 ] Jason Lowe commented on YARN-6165: -- +1 lgtm. The TestRMRestart failure is unrelated and will be fixed by

[jira] [Commented] (YARN-6165) Intra-queue preemption occurs even when preemption is turned off for a specific queue.

2017-03-08 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901915#comment-15901915 ] Jason Lowe commented on YARN-6165: -- I'd like to take a quick look, and this needs a Jenkins run anyway. >

[jira] [Commented] (YARN-4236) Metric for aggregated resources allocation per queue

2017-03-08 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901600#comment-15901600 ] Jason Lowe commented on YARN-4236: -- Oops spoke too soon. Just before committing I noticed there's a place

[jira] [Commented] (YARN-4236) Metric for aggregated resources allocation per queue

2017-03-08 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901575#comment-15901575 ] Jason Lowe commented on YARN-4236: -- +1 lgtm. Committing this. > Metric for aggregated resources

[jira] [Commented] (YARN-4051) ContainerKillEvent is lost when container is In New State and is recovering

2017-03-08 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901553#comment-15901553 ] Jason Lowe commented on YARN-4051: -- Thanks for updating the patch! In the future, please don't delete

[jira] [Commented] (YARN-6292) YARN log aggregation doesn't support HDFS/ViewFs namespace other than what is specified in fs.defaultFS

2017-03-06 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15898091#comment-15898091 ] Jason Lowe commented on YARN-6292: -- What version is involved here? Is this a duplicate of YARN-3269? >

[jira] [Commented] (YARN-6195) Export UsedCapacity and AbsoluteUsedCapacity to JMX

2017-03-06 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15898083#comment-15898083 ] Jason Lowe commented on YARN-6195: -- Thanks for the patch, [~benson.qiu]! I'm confused about how labels

[jira] [Commented] (YARN-6276) Now container kill mechanism may lead process leak

2017-03-06 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15897373#comment-15897373 ] Jason Lowe commented on YARN-6276: -- Processes escaping from the session is a known problem. If that's the

[jira] [Updated] (YARN-6274) Documentation refers to incorrect nodemanager health checker interval property

2017-03-03 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-6274: - Summary: Documentation refers to incorrect nodemanager health checker interval property (was: One error

[jira] [Updated] (YARN-6274) One error in the documentation of hadoop 2.7.3

2017-03-03 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-6274: - Fix Version/s: (was: 2.7.3) Thanks for the report, [~rebeyond1218] and for the patch, [~cheersyang]!

[jira] [Commented] (YARN-6276) Now container kill mechanism may lead process leak

2017-03-03 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15894382#comment-15894382 ] Jason Lowe commented on YARN-6276: -- When the nodemanager kills a container it first sends a SIGTERM to the

[jira] [Commented] (YARN-6263) NMTokenSecretManagerInRM.createAndGetNMToken is not thread safe

2017-03-02 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892647#comment-15892647 ] Jason Lowe commented on YARN-6263: -- +1 lgtm. The unit test failures do not appear to be related, and the

[jira] [Commented] (YARN-3884) App History status not updated when RMContainer transitions from RESERVED to KILLED

2017-02-22 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879197#comment-15879197 ] Jason Lowe commented on YARN-3884: -- +1 for only publishing metrics for "real" containers that an

[jira] [Created] (YARN-6217) TestLocalCacheDirectoryManager test timeout is too aggressive

2017-02-22 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-6217: Summary: TestLocalCacheDirectoryManager test timeout is too aggressive Key: YARN-6217 URL: https://issues.apache.org/jira/browse/YARN-6217 Project: Hadoop YARN

[jira] [Commented] (YARN-6214) NullPointer Exception while querying timeline server API

2017-02-22 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15878482#comment-15878482 ] Jason Lowe commented on YARN-6214: -- It's a little difficult to line up the source with that stacktrace.

[jira] [Commented] (YARN-6191) CapacityScheduler preemption by container priority can be problematic for MapReduce

2017-02-16 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870131#comment-15870131 ] Jason Lowe commented on YARN-6191: -- Thanks, Chris! Having the AM react to the preemption message in the

[jira] [Commented] (YARN-6177) Yarn client should exit with an informative error message if an incompatible Jersey library is used at client

2017-02-16 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870096#comment-15870096 ] Jason Lowe commented on YARN-6177: -- The best effort setting was primarily targeting the scenario where the
