[jira] [Updated] (YARN-7454) RMAppAttemptMetrics#getAggregateResourceUsage can NPE due to double lookup

2017-11-07 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-7454: - Description: RMAppAttemptMetrics#getAggregateResourceUsage does a double-lookup on a concurrent hash map,
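
The snippet above is cut off, so the actual RMAppAttemptMetrics code is not shown here. As a minimal sketch of the general pattern only: a double lookup on a concurrent map can throw a NullPointerException when another thread removes the key between the two calls. The class, method, and key names below are hypothetical and are not taken from YARN.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical illustration; none of these names come from the YARN code base.
public class DoubleLookupSketch {

    // Racy: two separate lookups. Another thread may remove the key after
    // containsKey() returns true, so get() can return null and the implicit
    // unboxing to long then throws NullPointerException.
    static long racyRead(ConcurrentMap<String, Long> map, String key) {
        if (map.containsKey(key)) {
            return map.get(key); // may unbox null under concurrent removal
        }
        return 0L;
    }

    // Safer: a single lookup stored in a local, followed by a null check.
    static long singleLookupRead(ConcurrentMap<String, Long> map, String key) {
        Long value = map.get(key);
        return value != null ? value : 0L;
    }

    public static void main(String[] args) {
        ConcurrentMap<String, Long> map = new ConcurrentHashMap<>();
        map.put("attempt_1", 42L);
        System.out.println(singleLookupRead(map, "attempt_1"));
        System.out.println(singleLookupRead(map, "missing"));
    }
}
```

The single-lookup form reads the map once, so the value checked for null is the same value that is used. Whether the fix in the JIRA takes this exact shape is not visible from this listing.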

[jira] [Created] (YARN-7454) RMAppAttemptMetrics#getAggregate can NPE due to double lookup

2017-11-07 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-7454: Summary: RMAppAttemptMetrics#getAggregate can NPE due to double lookup Key: YARN-7454 URL: https://issues.apache.org/jira/browse/YARN-7454 Project: Hadoop YARN

[jira] [Updated] (YARN-7454) RMAppAttemptMetrics#getAggregateResourceUsage can NPE due to double lookup

2017-11-07 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-7454: - Summary: RMAppAttemptMetrics#getAggregateResourceUsage can NPE due to double lookup (was:

[jira] [Commented] (YARN-7102) NM heartbeat stuck when responseId overflows MAX_INT

2017-11-07 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242223#comment-16242223 ] Jason Lowe commented on YARN-7102: -- Oh right, newNode is an RMNode but rmNode is an RMNodeImpl. My

[jira] [Commented] (YARN-7197) Add support for a volume blacklist for docker containers

2017-11-06 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16241014#comment-16241014 ] Jason Lowe commented on YARN-7197: -- bq. I see that I missed a key point about mounting above parent

[jira] [Commented] (YARN-7272) Enable timeline collector fault tolerance

2017-11-06 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16240801#comment-16240801 ] Jason Lowe commented on YARN-7272: -- bq. Another possible case to handle is the case where storage is down

[jira] [Commented] (YARN-7197) Add support for a volume blacklist for docker containers

2017-11-06 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16240765#comment-16240765 ] Jason Lowe commented on YARN-7197: -- bq. Even if it worked, we could be leaking private container

[jira] [Commented] (YARN-7102) NM heartbeat stuck when responseId overflows MAX_INT

2017-11-06 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16240578#comment-16240578 ] Jason Lowe commented on YARN-7102: -- Thanks for updating the patch! The new

[jira] [Resolved] (YARN-7433) java.lang.RuntimeException: native snappy library not available: this version of libhadoop was built without snappy support.

2017-11-06 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe resolved YARN-7433. -- Resolution: Invalid Closing this since this is a user issue with the build and/or deployment of Hadoop

[jira] [Commented] (YARN-7197) Add support for a volume blacklist for docker containers

2017-11-06 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16240440#comment-16240440 ] Jason Lowe commented on YARN-7197: -- bq. Explode might be exaggeration. Yes, by "explode" I mean the OS

[jira] [Commented] (YARN-7197) Add support for a volume blacklist for docker containers

2017-11-03 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238528#comment-16238528 ] Jason Lowe commented on YARN-7197: -- Thanks for updating the patch! bq. Container-Executor has no prior

[jira] [Commented] (YARN-7102) NM heartbeat stuck when responseId overflows MAX_INT

2017-11-03 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238148#comment-16238148 ] Jason Lowe commented on YARN-7102: -- Thanks for updating the patch! I'm not so sure these tests are timing

[jira] [Updated] (YARN-7433) java.lang.RuntimeException: native snappy library not available: this version of libhadoop was built without snappy support.

2017-11-03 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-7433: - Environment: (was: From centos6.5 upgrade centos7,hadoop version(2.7.1) is compiled on centos6.5

[jira] [Updated] (YARN-7286) Add support for docker to have no capabilities

2017-11-02 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-7286: - Fix Version/s: (was: 2.10.0) 2.9.0 I committed this to branch-2.9 as well. > Add

[jira] [Commented] (YARN-7286) Add support for docker to have no capabilities

2017-11-02 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235823#comment-16235823 ] Jason Lowe commented on YARN-7286: -- I agree the unit test failure is unrelated. I verified that

[jira] [Commented] (YARN-7197) Add support for a volume blacklist for docker containers

2017-11-01 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234333#comment-16234333 ] Jason Lowe commented on YARN-7197: -- Thanks for updating the patch! bq. If black list contains

[jira] [Commented] (YARN-7102) NM heartbeat stuck when responseId overflows MAX_INT

2017-11-01 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234235#comment-16234235 ] Jason Lowe commented on YARN-7102: -- bq. it is indeed a race condition between node heartbeat vs node

[jira] [Commented] (YARN-7422) Application History Server URL does not direct to the appropriate UI for failed/killed jobs

2017-11-01 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234083#comment-16234083 ] Jason Lowe commented on YARN-7422: -- I am a little confused on the goals of this JIRA. This cannot be

[jira] [Commented] (YARN-7197) Add support for a volume blacklist for docker containers

2017-10-31 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226800#comment-16226800 ] Jason Lowe commented on YARN-7197: -- I was under the impression the blacklist would only mount the empty

[jira] [Commented] (YARN-7197) Add support for a volume blacklist for docker containers

2017-10-30 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225812#comment-16225812 ] Jason Lowe commented on YARN-7197: -- Solution 3 is more secure since the paths are unavailable within the

[jira] [Commented] (YARN-7244) ShuffleHandler is not aware of disks that are added

2017-10-30 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225721#comment-16225721 ] Jason Lowe commented on YARN-7244: -- The ASF warnings are unrelated. +1 for the branch-2.8 patch.

[jira] [Commented] (YARN-7408) total capacity could be occupied by a large container request

2017-10-30 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16224965#comment-16224965 ] Jason Lowe commented on YARN-7408: -- Yes. reservation-continue-looking simply means the scheduler will

[jira] [Commented] (YARN-7299) TestDistributedScheduler is failing

2017-10-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16222876#comment-16222876 ] Jason Lowe commented on YARN-7299: -- Any update on this? This is still showing up a lot in precommit

[jira] [Commented] (YARN-7244) ShuffleHandler is not aware of disks that are added

2017-10-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16222874#comment-16222874 ] Jason Lowe commented on YARN-7244: -- +1 for the latest patch. The unit test failure is unrelated and

[jira] [Commented] (YARN-7197) Add support for a volume blacklist for docker containers

2017-10-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16222610#comment-16222610 ] Jason Lowe commented on YARN-7197: -- Like [~ebadger], I am a bit confused on how this adds a lot of value.

[jira] [Commented] (YARN-7408) total capacity could be occupied by a large container request

2017-10-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16222472#comment-16222472 ] Jason Lowe commented on YARN-7408: -- I assume you are using CapacityScheduler? The answers may change for

[jira] [Commented] (YARN-4511) Common scheduler changes supporting scheduler-specific implementations

2017-10-24 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217791#comment-16217791 ] Jason Lowe commented on YARN-4511: -- Yeah for a dev branch scenario it greatly reduces the number of

[jira] [Commented] (YARN-4511) Common scheduler changes supporting scheduler-specific implementations

2017-10-24 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217700#comment-16217700 ] Jason Lowe commented on YARN-4511: -- bq. I can the patch up if we are willing to check in the rest of the

[jira] [Commented] (YARN-7102) NM heartbeat stuck when responseId overflows MAX_INT

2017-10-24 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217597#comment-16217597 ] Jason Lowe commented on YARN-7102: -- I'm wondering about the branch-2 patch test failures. The SLS

[jira] [Commented] (YARN-4163) Audit getQueueInfo and getApplications calls

2017-10-20 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16213011#comment-16213011 ] Jason Lowe commented on YARN-4163: -- Thanks for updating the patch! +1 lgtm. > Audit getQueueInfo and

[jira] [Updated] (YARN-7102) NM heartbeat stuck when responseId overflows MAX_INT

2017-10-20 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-7102: - Attachment: YARN-7102-branch-2.v9.patch Thanks for porting the patches! I'm uploading the branch-2 patch

[jira] [Commented] (YARN-7365) ResourceLocalization cache cleanup thread stuck

2017-10-19 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211770#comment-16211770 ] Jason Lowe commented on YARN-7365: -- Thanks for the report! YARN-4655 apparently only went into 2.9. Is

[jira] [Commented] (YARN-7102) NM heartbeat stuck when responseId overflows MAX_INT

2017-10-17 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16208274#comment-16208274 ] Jason Lowe commented on YARN-7102: -- Thanks for updating the patch! +1 for the trunk patch. It does not

[jira] [Commented] (YARN-7286) Add support for docker to have no capabilities

2017-10-17 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16208159#comment-16208159 ] Jason Lowe commented on YARN-7286: -- Thanks for updating the patch! Copy-n-paste error on the assert

[jira] [Commented] (YARN-7341) TestRouterWebServiceUtil#testMergeMetrics is flakey

2017-10-17 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16208092#comment-16208092 ] Jason Lowe commented on YARN-7341: -- Ah, sorry, I was simply going off of the JIRA versions and should have

[jira] [Commented] (YARN-7341) TestRouterWebServiceUtil#testMergeMetrics is flakey

2017-10-17 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16208048#comment-16208048 ] Jason Lowe commented on YARN-7341: -- [~haibochen] does this need to go into branch-2 as well? YARN-7095

[jira] [Commented] (YARN-7244) ShuffleHandler is not aware of disks that are added

2017-10-16 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206560#comment-16206560 ] Jason Lowe commented on YARN-7244: -- Thanks for updating the patch! Nit: The whitespace separating the

[jira] [Updated] (YARN-7333) container-executor fails to remove entries from a directory that is not writable or executable

2017-10-16 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-7333: - Attachment: YARN-7333.002.patch Thanks for the review, Nathan! Updated the patch to fix the log messages.

[jira] [Commented] (YARN-7333) container-executor fails to remove entries from a directory that is not writable or executable

2017-10-16 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206102#comment-16206102 ] Jason Lowe commented on YARN-7333: -- The TestDistributedScheduler failure is an unrelated, known issue

[jira] [Updated] (YARN-7333) container-executor fails to remove entries from a directory that is not writable or executable

2017-10-16 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-7333: - Attachment: YARN-7333.001.patch Attaching a patch that avoids opening the file unless we believe it is a

[jira] [Created] (YARN-7333) container-executor fails to remove entries from a directory that is not writable or executable

2017-10-16 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-7333: Summary: container-executor fails to remove entries from a directory that is not writable or executable Key: YARN-7333 URL: https://issues.apache.org/jira/browse/YARN-7333

[jira] [Commented] (YARN-7246) Fix the default docker binary path

2017-10-13 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204143#comment-16204143 ] Jason Lowe commented on YARN-7246: -- Thanks for updating the patch! +1 lgtm. Committing this. > Fix the

[jira] [Commented] (YARN-7244) ShuffleHandler is not aware of disks that are added

2017-10-13 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204107#comment-16204107 ] Jason Lowe commented on YARN-7244: -- I forgot to mention that AuxiliaryLocalPathHandler should be marked

[jira] [Updated] (YARN-7325) Remove unused container variable in DockerLinuxContainerRuntime

2017-10-13 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-7325: - Summary: Remove unused container variable in DockerLinuxContainerRuntime (was: Remove unused container

[jira] [Commented] (YARN-7325) Remove unused container variable in DockerLinuxContainerRuntime for branch-2.8

2017-10-13 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204089#comment-16204089 ] Jason Lowe commented on YARN-7325: -- Thanks for the patch! +1 lgtm. Committing this. > Remove unused

[jira] [Commented] (YARN-7244) ShuffleHandler is not aware of disks that are added

2017-10-13 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203720#comment-16203720 ] Jason Lowe commented on YARN-7244: -- Thanks for updating the patch! The test failure appears to be

[jira] [Commented] (YARN-7190) Ensure only NM classpath in 2.x gets TSv2 related hbase jars, not the user classpath

2017-10-12 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202601#comment-16202601 ] Jason Lowe commented on YARN-7190: -- Patch looks good overall, works as advertised. It would be good to

[jira] [Commented] (YARN-4163) Audit getQueueInfo and getApplications calls

2017-10-12 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202180#comment-16202180 ] Jason Lowe commented on YARN-4163: -- Thanks for updating the patch! The builder pattern solves the

[jira] [Resolved] (YARN-7319) java.net.UnknownHostException when trying contact node by hostname

2017-10-12 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe resolved YARN-7319. -- Resolution: Invalid JIRA is for tracking features and defects in Apache Hadoop and not for general user

[jira] [Commented] (YARN-7286) Add support for docker to have no capabilities

2017-10-11 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200987#comment-16200987 ] Jason Lowe commented on YARN-7286: -- I suspect the behavior difference stems from the different handling of

[jira] [Commented] (YARN-7082) TestContainerManagerSecurity failing in trunk

2017-10-11 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200457#comment-16200457 ] Jason Lowe commented on YARN-7082: -- Thanks for the patch! +1 lgtm. Committing this. >

[jira] [Commented] (YARN-6930) Admins should be able to explicitly enable specific LinuxContainerRuntime in the NodeManager

2017-10-11 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200400#comment-16200400 ] Jason Lowe commented on YARN-6930: -- Thanks for updating the patch! +1 for the branch-2.8 patch.

[jira] [Updated] (YARN-7124) LogAggregationTFileController deletes/renames while file is open

2017-10-10 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-7124: - Attachment: YARN-7124.001.patch This isn't pretty, but it's small and I think it will do the trick. It

[jira] [Assigned] (YARN-7124) LogAggregationTFileController deletes/renames while file is open

2017-10-10 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reassigned YARN-7124: Assignee: Jason Lowe Affects Version/s: (was: 2.8.2) 2.9.0

[jira] [Comment Edited] (YARN-6930) Admins should be able to explicitly enable specific LinuxContainerRuntime in the NodeManager

2017-10-10 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198941#comment-16198941 ] Jason Lowe edited comment on YARN-6930 at 10/10/17 6:26 PM: Thanks for the 2.8

[jira] [Commented] (YARN-6930) Admins should be able to explicitly enable specific LinuxContainerRuntime in the NodeManager

2017-10-10 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198941#comment-16198941 ] Jason Lowe commented on YARN-6930: -- Thanks for the 2.8 patch! I don't understand why

[jira] [Commented] (YARN-7286) Add support for docker to have no capabilities

2017-10-10 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198880#comment-16198880 ] Jason Lowe commented on YARN-7286: -- Yeah, the core problem with trying to use empty as a value was

[jira] [Commented] (YARN-6523) RM requires large memory in sending out security tokens as part of Node Heartbeat in large cluster

2017-10-10 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198847#comment-16198847 ] Jason Lowe commented on YARN-6523: -- [~Naganarasimha] is this issue still Critical after the revelations

[jira] [Commented] (YARN-7286) Add support for docker to have no capabilities

2017-10-09 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16197615#comment-16197615 ] Jason Lowe commented on YARN-7286: -- Thanks for the patch! The description of the property in

[jira] [Commented] (YARN-7299) TestDistributedScheduler is failing

2017-10-09 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16197249#comment-16197249 ] Jason Lowe commented on YARN-7299: -- git bisect shows this started failing consistently after YARN-7258 was

[jira] [Created] (YARN-7299) TestDistributedScheduler is failing

2017-10-09 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-7299: Summary: TestDistributedScheduler is failing Key: YARN-7299 URL: https://issues.apache.org/jira/browse/YARN-7299 Project: Hadoop YARN Issue Type: Bug

[jira] [Commented] (YARN-7246) Fix the default docker binary path

2017-10-09 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16197142#comment-16197142 ] Jason Lowe commented on YARN-7246: -- Thanks for updating the patch! I thought at first that there was a

[jira] [Updated] (YARN-6930) Admins should be able to explicitly enable specific LinuxContainerRuntime in the NodeManager

2017-10-09 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-6930: - Target Version/s: 2.8.2 Fix Version/s: (was: 2.8.2) Thanks for the patch! In the future,

[jira] [Commented] (YARN-7272) Enable timeline collector fault tolerance

2017-10-09 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16196949#comment-16196949 ] Jason Lowe commented on YARN-7272: -- I'm not proposing we use leveldb for persisting the entities

[jira] [Commented] (YARN-7272) Enable timeline collector fault tolerance

2017-10-06 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16194763#comment-16194763 ] Jason Lowe commented on YARN-7272: -- Leveldb seems like a great fit for this, IMO. It has high performance

[jira] [Commented] (YARN-7246) Fix the default docker binary path

2017-10-05 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193124#comment-16193124 ] Jason Lowe commented on YARN-7246: -- Sorry to show up late here. Patch looks mostly OK, but it would be

[jira] [Updated] (YARN-7285) ContainerExecutor always launches with priorities due to yarn-default property

2017-10-04 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-7285: - Attachment: YARN-7285.002.patch Updated the patch to remove the commented-out property value. >

[jira] [Commented] (YARN-7285) ContainerExecutor always launches with priorities due to yarn-default property

2017-10-04 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191762#comment-16191762 ] Jason Lowe commented on YARN-7285: -- I thought it best to leave an example value there so it's clear what

[jira] [Commented] (YARN-7226) Whitelisted variables do not support delayed variable expansion

2017-10-04 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191311#comment-16191311 ] Jason Lowe commented on YARN-7226: -- The unit test failures appear to be unrelated, and they pass for me

[jira] [Resolved] (YARN-7288) ContainerLocalizer with multiple JVM Options

2017-10-04 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe resolved YARN-7288. -- Resolution: Invalid > ContainerLocalizer with multiple JVM Options >

[jira] [Updated] (YARN-7226) Whitelisted variables do not support delayed variable expansion

2017-10-04 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-7226: - Attachment: YARN-7226-branch-2.007.patch Rebased the branch-2 patch. > Whitelisted variables do not

[jira] [Updated] (YARN-7288) ContainerLocalizer with multiple JVM Options

2017-10-04 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-7288: - Summary: ContainerLocalizer with multiple JVM Options (was: ContaninerLocalizer with multiple JVM

[jira] [Commented] (YARN-7288) ContaninerLocalizer with multiple JVM Options

2017-10-04 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191192#comment-16191192 ] Jason Lowe commented on YARN-7288: -- As the name implies, yarn.nodemanager.container-localizer.java.opts

[jira] [Commented] (YARN-4163) Audit getQueueInfo and getApplications calls

2017-10-03 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16190317#comment-16190317 ] Jason Lowe commented on YARN-4163: -- Thanks for updating the patch! It would be good to cleanup the

[jira] [Commented] (YARN-7226) Whitelisted variables do not support delayed variable expansion

2017-10-03 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16190265#comment-16190265 ] Jason Lowe commented on YARN-7226: -- The findbugs warning is unrelated. > Whitelisted variables do not

[jira] [Updated] (YARN-7285) ContainerExecutor always launches with priorities due to yarn-default property

2017-10-03 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-7285: - Attachment: YARN-7285.001.patch Attaching a patch that removes a specified value for the scheduling

[jira] [Commented] (YARN-7285) ContainerExecutor always launches with priorities due to yarn-default property

2017-10-03 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16190238#comment-16190238 ] Jason Lowe commented on YARN-7285: -- YARN-5444 first reported the test failure, but it was misinterpreted

[jira] [Commented] (YARN-7226) Whitelisted variables do not support delayed variable expansion

2017-10-03 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16190214#comment-16190214 ] Jason Lowe commented on YARN-7226: -- The unit test failure is interesting. It's a latent bug caused by

[jira] [Updated] (YARN-5444) Fix failing unit tests in TestLinuxContainerExecutorWithMocks

2017-10-03 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-5444: - Fix Version/s: 2.8.3 Thanks, Yufei! I pulled this into branch-2.8 as well. > Fix failing unit tests in

[jira] [Created] (YARN-7285) ContainerExecutor always launches with priorities due to yarn-default property

2017-10-03 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-7285: Summary: ContainerExecutor always launches with priorities due to yarn-default property Key: YARN-7285 URL: https://issues.apache.org/jira/browse/YARN-7285 Project: Hadoop

[jira] [Updated] (YARN-7226) Whitelisted variables do not support delayed variable expansion

2017-10-03 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-7226: - Attachment: YARN-7226-branch-2.8.006.patch The unit test failures are unrelated and are all caused by a

[jira] [Updated] (YARN-7226) Whitelisted variables do not support delayed variable expansion

2017-10-03 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-7226: - Attachment: YARN-7226-branch-2.006.patch Thanks for the review and commit! Here's the patch for branch-2.

[jira] [Commented] (YARN-7102) NM heartbeat stuck when responseId overflows MAX_INT

2017-10-02 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188935#comment-16188935 ] Jason Lowe commented on YARN-7102: -- Thanks for updating the patch! This now grabs the RMNodeImpl write

[jira] [Updated] (YARN-7226) Whitelisted variables do not support delayed variable expansion

2017-10-02 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-7226: - Attachment: YARN-7226.006.patch Rebased the patch on trunk. I can provide patches for branch-2. and

[jira] [Commented] (YARN-7117) Capacity Scheduler: Support Auto Creation of Leaf Queues While Doing Queue Mapping

2017-10-02 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16188674#comment-16188674 ] Jason Lowe commented on YARN-7117: -- bq. The current CS code has a bug in that it allows "." in the queue

[jira] [Commented] (YARN-7117) Capacity Scheduler: Support Auto Creation of Leaf Queues While Doing Queue Mapping

2017-09-29 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16186396#comment-16186396 ] Jason Lowe commented on YARN-7117: -- Thanks for providing the doc, Wangda! I think the syntax would be

[jira] [Updated] (YARN-7226) Whitelisted variables do not support delayed variable expansion

2017-09-28 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-7226: - Attachment: YARN-7226.005.patch Thanks for the review, Sidharta! I added the interface to

[jira] [Commented] (YARN-7265) Hadoop Server Log Correlation

2017-09-28 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16184738#comment-16184738 ] Jason Lowe commented on YARN-7265: -- Is this really appropriate to be built directly into Hadoop? A

[jira] [Updated] (YARN-6059) Update paused container state in the NM state store

2017-09-28 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-6059: - Fix Version/s: (was: 3.1.0) 3.0.0 > Update paused container state in the NM state

[jira] [Commented] (YARN-7256) Giving Yarn Application the Option to Black Out Certain Nodes On the Fly

2017-09-28 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16184723#comment-16184723 ] Jason Lowe commented on YARN-7256: -- Closing this as a duplicate of YARN-750 which allows applications to

[jira] [Resolved] (YARN-7256) Giving Yarn Application the Option to Black Out Certain Nodes On the Fly

2017-09-28 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe resolved YARN-7256. -- Resolution: Duplicate > Giving Yarn Application the Option to Black Out Certain Nodes On the Fly >

[jira] [Updated] (YARN-5216) Expose configurable preemption policy for OPPORTUNISTIC containers running on the NM

2017-09-28 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-5216: - Fix Version/s: (was: 3.1.0) 3.0.0 > Expose configurable preemption policy for

[jira] [Updated] (YARN-7240) Add more states and transitions to stabilize the NM Container state machine

2017-09-28 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-7240: - Fix Version/s: (was: 3.1.0) 3.0.0 > Add more states and transitions to stabilize

[jira] [Updated] (YARN-5292) NM Container lifecycle and state transitions to support for PAUSED container state.

2017-09-28 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-5292: - Fix Version/s: (was: 3.1.0) 3.0.0 > NM Container lifecycle and state transitions to

[jira] [Commented] (YARN-7248) NM returns new SCHEDULED container status to older clients

2017-09-28 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16184686#comment-16184686 ] Jason Lowe commented on YARN-7248: -- Thanks for updating the patch! +1 lgtm. Committing this. > NM

[jira] [Commented] (YARN-7260) yarn.router.pipeline.cache-max-size is missing in yarn-default.xml

2017-09-28 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16184362#comment-16184362 ] Jason Lowe commented on YARN-7260: -- bq. As part of test improvement, I think it would be a good to print

[jira] [Commented] (YARN-7244) ShuffleHandler is not aware of disks that are added

2017-09-28 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16184319#comment-16184319 ] Jason Lowe commented on YARN-7244: -- bq. rather a new api as you mentioned in LocalDirAllocator named

[jira] [Commented] (YARN-7248) NM returns new SCHEDULED container status to older clients

2017-09-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16183279#comment-16183279 ] Jason Lowe commented on YARN-7248: -- Thanks for updating the patch! I believe the unit test failures are

[jira] [Commented] (YARN-7190) Ensure only NM classpath in 2.x gets TSv2 related hbase jars, not the user classpath

2017-09-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16183228#comment-16183228 ] Jason Lowe commented on YARN-7190: -- My personal preference would be to remove any new jar we know is only

[jira] [Updated] (YARN-6059) Update paused container state in the NM state store

2017-09-27 Thread Jason Lowe (JIRA)
[ https://issues.apache.org/jira/browse/YARN-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-6059: - Fix Version/s: (was: 3.0.0) 3.1.0 > Update paused container state in the NM state
