[jira] [Assigned] (HADOOP-13493) Compatibility Docs should clarify the policy for what takes precedence when a conflict is found
[ https://issues.apache.org/jira/browse/HADOOP-13493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla reassigned HADOOP-13493: - Assignee: Daniel Templeton (was: Karthik Kambatla) [~templedf] - I guess this will be handled by the update to the compatibility guidelines you are working on. Assigning to you. Please close as duplicate if the patch there already takes care of it. > Compatibility Docs should clarify the policy for what takes precedence when a > conflict is found > --- > > Key: HADOOP-13493 > URL: https://issues.apache.org/jira/browse/HADOOP-13493 > Project: Hadoop Common > Issue Type: Task > Components: documentation >Affects Versions: 2.7.2 >Reporter: Robert Kanter >Assignee: Daniel Templeton >Priority: Critical > > The Compatibility Docs > (https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/Compatibility.html#Java_API) > list the policies for Private, Public, not annotated, etc. classes and > members, but they don't say what happens when there's a conflict. We should > obviously try to avoid this situation, but it would be good to explicitly > state what takes precedence. > As an example, until YARN-3225 made it consistent, {{RefreshNodesRequest}} > looked like this: > {code:java} > @Private > @Stable > public abstract class RefreshNodesRequest { > @Public > @Stable > public static RefreshNodesRequest newInstance() { > RefreshNodesRequest request = > Records.newRecord(RefreshNodesRequest.class); > return request; > } > } > {code} > Note that the class is marked {{\@Private}}, but the method is marked > {{\@Public}}. > In this example, I'd say that the class-level annotation should have priority. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
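The "class level wins" policy proposed above can be sketched in plain Java. The audience annotations below are stand-ins for Hadoop's real ones in org.apache.hadoop.classification, and the resolver is a hypothetical illustration of the proposed precedence rule, not code from any patch:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;

// Stand-ins for Hadoop's InterfaceAudience annotations.
@Retention(RetentionPolicy.RUNTIME) @interface Private {}
@Retention(RetentionPolicy.RUNTIME) @interface Public {}

public class AudienceResolver {
    // Mirrors the conflicting annotations from the issue description.
    @Private
    static abstract class RefreshNodesRequest {
        @Public
        public static RefreshNodesRequest newInstance() { return null; }
    }

    /** Effective audience of a method under the proposed policy:
     *  the class-level annotation takes precedence over the member's. */
    static String effectiveAudience(Method m) {
        Class<?> owner = m.getDeclaringClass();
        if (owner.isAnnotationPresent(Private.class)) return "Private";
        if (owner.isAnnotationPresent(Public.class)) return "Public";
        if (m.isAnnotationPresent(Private.class)) return "Private";
        if (m.isAnnotationPresent(Public.class)) return "Public";
        return "Unannotated";
    }

    public static void main(String[] args) throws Exception {
        Method m = RefreshNodesRequest.class.getMethod("newInstance");
        // Class is @Private, method is @Public; the class wins.
        System.out.println(effectiveAudience(m)); // prints "Private"
    }
}
```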
[jira] [Commented] (HADOOP-14284) Shade Guava everywhere
[ https://issues.apache.org/jira/browse/HADOOP-14284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16044995#comment-16044995 ] Karthik Kambatla commented on HADOOP-14284: --- I agree with [~vinodkv] on shading only the YARN/MR client modules. For YARN, that is yarn-common, yarn-client, and yarn-api modules. For MR, that should be mapreduce-client-* modules. We probably don't need to shade hadoop-mapreduce-client-hs and hadoop-mapreduce-client-hs-plugins jars though, as they are for the HistoryServer and have no @Stable APIs. In cases where devs extend YARN classes like the SchedulingPolicy in FairScheduler or implement their own scheduler, the dev will be responsible for ensuring they don't use Guava, or use it with a version that is consistent with what Hadoop uses. I expect these devs to be sophisticated enough to figure this out. That said, we should probably still call out cases like these in the compatibility guide. /cc [~templedf] > Shade Guava everywhere > -- > > Key: HADOOP-14284 > URL: https://issues.apache.org/jira/browse/HADOOP-14284 > Project: Hadoop Common > Issue Type: Bug > Components: build >Affects Versions: 3.0.0-alpha4 >Reporter: Andrew Wang >Assignee: Tsuyoshi Ozawa >Priority: Blocker > Attachments: HADOOP-14238.pre001.patch, HADOOP-14284.002.patch, > HADOOP-14284.004.patch, HADOOP-14284.007.patch, HADOOP-14284.010.patch, > HADOOP-14284.012.patch > > > HADOOP-10101 upgraded the Guava version for 3.x to 21. > Guava is broadly used by Java projects that consume our artifacts. > Unfortunately, these projects also consume our private artifacts like > {{hadoop-hdfs}}. They are also unlikely to be on the new shaded client introduced > by HADOOP-11804, currently only available in 3.0.0-alpha2. > We should shade Guava everywhere to proactively avoid breaking downstreams. > This isn't a requirement for all dependency upgrades, but it's necessary for > known-bad dependencies like Guava.
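For reference, shading Guava into a client module along the lines discussed above would look roughly like the following maven-shade-plugin fragment. This is a sketch, not the actual Hadoop build change; in particular the org.apache.hadoop.shaded relocation prefix is illustrative, not necessarily the one the project uses:

```xml
<!-- Hypothetical pom.xml fragment: relocate Guava so downstream
     classpaths never see com.google.common directly. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>com.google.common</pattern>
            <shadedPattern>org.apache.hadoop.shaded.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Devs who extend YARN classes (the SchedulingPolicy case above) would still compile against unrelocated Guava, which is why the comment suggests calling this out in the compatibility guide.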
[jira] [Updated] (HADOOP-10584) ActiveStandbyElector goes down if ZK quorum become unavailable
[ https://issues.apache.org/jira/browse/HADOOP-10584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-10584: -- Assignee: (was: Karthik Kambatla) > ActiveStandbyElector goes down if ZK quorum become unavailable > -- > > Key: HADOOP-10584 > URL: https://issues.apache.org/jira/browse/HADOOP-10584 > Project: Hadoop Common > Issue Type: Bug > Components: ha >Affects Versions: 2.4.0 >Reporter: Karthik Kambatla >Priority: Critical > Attachments: hadoop-10584-prelim.patch, rm.log > > > ActiveStandbyElector retries operations a few times. If the ZK quorum > itself is down, it goes down and the daemons will have to be brought up > again. > Instead, it should log the fact that it is unable to talk to ZK, call > becomeStandby on its client, and continue to attempt connecting to ZK.
[jira] [Created] (HADOOP-13961) mvn install fails on trunk
Karthik Kambatla created HADOOP-13961: - Summary: mvn install fails on trunk Key: HADOOP-13961 URL: https://issues.apache.org/jira/browse/HADOOP-13961 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 3.0.0-alpha2 Reporter: Karthik Kambatla Priority: Blocker mvn install fails for me on trunk on a new environment with the following: {noformat} [ERROR] Failed to execute goal on project hadoop-hdfs: Could not resolve dependencies for project org.apache.hadoop:hadoop-hdfs:jar:3.0.0-alpha2-SNAPSHOT: Could not find artifact org.apache.hadoop:hadoop-kms:jar:classes:3.0.0-alpha2-20161228.102554-925 in apache.snapshots.https (https://repository.apache.org/content/repositories/snapshots) -> [Help 1] {noformat} This works on an existing dev setup, likely because I have the jar in my m2 cache.
[jira] [Commented] (HADOOP-11804) Shaded Hadoop client artifacts and minicluster
[ https://issues.apache.org/jira/browse/HADOOP-11804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15784173#comment-15784173 ] Karthik Kambatla commented on HADOOP-11804: --- A profile to opt out will be very useful. Also, is it possible to shade only on package and not on install? > Shaded Hadoop client artifacts and minicluster > -- > > Key: HADOOP-11804 > URL: https://issues.apache.org/jira/browse/HADOOP-11804 > Project: Hadoop Common > Issue Type: Sub-task > Components: build >Reporter: Sean Busbey >Assignee: Sean Busbey > Fix For: 3.0.0-alpha2 > > Attachments: HADOOP-11804.1.patch, HADOOP-11804.10.patch, > HADOOP-11804.11.patch, HADOOP-11804.12.patch, HADOOP-11804.13.patch, > HADOOP-11804.14.patch, HADOOP-11804.2.patch, HADOOP-11804.3.patch, > HADOOP-11804.4.patch, HADOOP-11804.5.patch, HADOOP-11804.6.patch, > HADOOP-11804.7.patch, HADOOP-11804.8.patch, HADOOP-11804.9.patch, > hadoop-11804-client-test.tar.gz > > > make a hadoop-client-api and hadoop-client-runtime that e.g. HBase can use to > talk with a Hadoop cluster without seeing any of the implementation > dependencies. > see proposal on parent for details.
[jira] [Commented] (HADOOP-13860) ZKFailoverController.ElectorCallbacks should have a non-trivial implementation for enterNeutralMode
[ https://issues.apache.org/jira/browse/HADOOP-13860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15730408#comment-15730408 ] Karthik Kambatla commented on HADOOP-13860: --- IMO, we should implement it similarly to YARN-5677. That said, I would defer the judgment to HDFS devs. [~atm] - do you have an opinion here? > ZKFailoverController.ElectorCallbacks should have a non-trivial > implementation for enterNeutralMode > --- > > Key: HADOOP-13860 > URL: https://issues.apache.org/jira/browse/HADOOP-13860 > Project: Hadoop Common > Issue Type: Bug >Reporter: Karthik Kambatla > > ZKFailoverController.ElectorCallbacks implements enterNeutralMode trivially. > This can lead to a master staying active for longer than necessary, unless > the fencing scheme ensures the first active is transitioned to standby before > transitioning another master to active (e.g. ssh fencing). > YARN-5677 does this for YARN in EmbeddedElectorService. If we choose not to > implement, we should at least document this so any user of > ZKFailoverController in the future is aware.
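To make the suggestion concrete: a non-trivial enterNeutralMode along the lines of YARN-5677 would proactively step the local master down when ZK connectivity is lost, instead of relying on fencing. The types below are hypothetical stand-ins for the real classes in org.apache.hadoop.ha; this sketches the behavior, not the eventual patch:

```java
// Hypothetical stand-in for the HA service the callbacks control.
interface HAService {
    void becomeStandby();
}

class ElectorCallbacksSketch {
    private final HAService local;
    private volatile boolean active;

    ElectorCallbacksSketch(HAService local) { this.local = local; }

    void becomeActive() { active = true; }

    /** Non-trivial enterNeutralMode: once ZK connectivity is lost we can
     *  no longer be sure we still hold the lock, so step down proactively
     *  rather than staying active until fencing catches up. */
    void enterNeutralMode() {
        if (active) {
            local.becomeStandby();
            active = false;
        }
        // The elector separately keeps retrying the ZK connection.
    }

    boolean isActive() { return active; }
}
```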
[jira] [Updated] (HADOOP-13859) TestConfigurationFieldsBase fails for fields that are DEFAULT values of skipped properties.
[ https://issues.apache.org/jira/browse/HADOOP-13859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-13859: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.0-alpha2 2.9.0 Status: Resolved (was: Patch Available) Haibo, thanks for fixing the test failures and future-proofing for other such cases. Just committed this to trunk and branch-2. > TestConfigurationFieldsBase fails for fields that are DEFAULT values of > skipped properties. > --- > > Key: HADOOP-13859 > URL: https://issues.apache.org/jira/browse/HADOOP-13859 > Project: Hadoop Common > Issue Type: Bug > Components: common >Affects Versions: 3.0.0-alpha1 >Reporter: Haibo Chen >Assignee: Haibo Chen > Fix For: 2.9.0, 3.0.0-alpha2 > > Attachments: HADOOP-13859.01.patch > > > In YARN-5922, two new default values are added in YarnConfiguration for two > timeline-service properties that are skipped in TestConfigurationFieldsBase. > TestConfigurationFieldsBase fails as it mistakenly treats the two newly added > default-values as regular properties.
[jira] [Commented] (HADOOP-13859) TestConfigurationFieldsBase fails for fields that are DEFAULT values of skipped properties.
[ https://issues.apache.org/jira/browse/HADOOP-13859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726874#comment-15726874 ] Karthik Kambatla commented on HADOOP-13859: --- +1. Checking this in.
[jira] [Created] (HADOOP-13860) ZKFailoverController.ElectorCallbacks should have a non-trivial implementation for enterNeutralMode
Karthik Kambatla created HADOOP-13860: - Summary: ZKFailoverController.ElectorCallbacks should have a non-trivial implementation for enterNeutralMode Key: HADOOP-13860 URL: https://issues.apache.org/jira/browse/HADOOP-13860 Project: Hadoop Common Issue Type: Bug Reporter: Karthik Kambatla ZKFailoverController.ElectorCallbacks implements enterNeutralMode trivially. This can lead to a master staying active for longer than necessary, unless the fencing scheme ensures the first active is transitioned to standby before transitioning another master to active (e.g. ssh fencing). YARN-5677 does this for YARN in EmbeddedElectorService. If we choose not to implement, we should at least document this so any user of ZKFailoverController in the future is aware.
[jira] [Created] (HADOOP-13821) Improve findbugs rules to not require trivially overriding equals and hashCode methods
Karthik Kambatla created HADOOP-13821: - Summary: Improve findbugs rules to not require trivially overriding equals and hashCode methods Key: HADOOP-13821 URL: https://issues.apache.org/jira/browse/HADOOP-13821 Project: Hadoop Common Issue Type: Improvement Components: tools Reporter: Karthik Kambatla Priority: Minor If we override the {{equals}} and {{hashCode}} methods of a class, findbugs requires its subclasses to override them as well, even if trivially. e.g. YARN-5783. Filing this JIRA based on feedback from Sangjin on the DISCUSS thread for YARN-4752.
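The pattern being complained about looks like the following. The class names here are made up for illustration; the point is that the subclass adds no state, so the inherited equals/hashCode are already correct, yet findbugs still asks for the overrides:

```java
import java.util.Objects;

class Resource {
    final int memory;
    Resource(int memory) { this.memory = memory; }

    @Override public boolean equals(Object o) {
        return o instanceof Resource && ((Resource) o).memory == memory;
    }
    @Override public int hashCode() { return Objects.hash(memory); }
}

// Adds no fields, so inheriting equals/hashCode would be fine, but
// findbugs flags the subclass unless both are restated trivially:
class TrackedResource extends Resource {
    TrackedResource(int memory) { super(memory); }

    @Override public boolean equals(Object o) { return super.equals(o); }
    @Override public int hashCode() { return super.hashCode(); }
}
```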
[jira] [Commented] (HADOOP-10075) Update jetty dependency to version 9
[ https://issues.apache.org/jira/browse/HADOOP-10075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576049#comment-15576049 ] Karthik Kambatla commented on HADOOP-10075: --- [~rkanter] - mind throwing this on RB or Github PR for easier review? > Update jetty dependency to version 9 > > > Key: HADOOP-10075 > URL: https://issues.apache.org/jira/browse/HADOOP-10075 > Project: Hadoop Common > Issue Type: Improvement >Affects Versions: 2.2.0, 2.6.0 >Reporter: Robert Rati >Assignee: Robert Kanter > Attachments: HADOOP-10075-002-wip.patch, HADOOP-10075.003.patch, > HADOOP-10075.004.patch, HADOOP-10075.005.patch, HADOOP-10075.006.patch, > HADOOP-10075.007.patch, HADOOP-10075.008.patch, HADOOP-10075.patch > > > Jetty6 is no longer maintained. Update the dependency to jetty9.
[jira] [Updated] (HADOOP-13714) Tighten up our compatibility guidelines for Hadoop 3
[ https://issues.apache.org/jira/browse/HADOOP-13714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-13714: -- Target Version/s: 3.0.0-beta1 (was: 3.0.0-alpha2) > Tighten up our compatibility guidelines for Hadoop 3 > > > Key: HADOOP-13714 > URL: https://issues.apache.org/jira/browse/HADOOP-13714 > Project: Hadoop Common > Issue Type: Improvement > Components: documentation >Affects Versions: 2.7.3 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Blocker > > Our current compatibility guidelines are incomplete and loose. For many > categories, we do not have a policy. It would be nice to actually define > those policies so our users know what to expect and the developers know what > releases to target their changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HADOOP-13714) Tighten up our compatibility guidelines for Hadoop 3
[ https://issues.apache.org/jira/browse/HADOOP-13714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569408#comment-15569408 ] Karthik Kambatla edited comment on HADOOP-13714 at 10/12/16 5:55 PM: - Created this JIRA so we don't miss it, and assigning to myself. Will not be able to get to this this month. Please feel free to pick this up if you are interested and have cycles to work on this. was (Author: kasha): Created this JIRA so we don't miss it, and assigning to myself. Will not be able to get to this in the next month. Please feel free to pick this up if you are interested and have cycles to work on this.
[jira] [Comment Edited] (HADOOP-13714) Tighten up our compatibility guidelines for Hadoop 3
[ https://issues.apache.org/jira/browse/HADOOP-13714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569408#comment-15569408 ] Karthik Kambatla edited comment on HADOOP-13714 at 10/12/16 5:55 PM: - Created this JIRA so we don't miss it, and assigning to myself. Will not be able to get to this, this month. Please feel free to pick this up if you are interested and have cycles to work on this. was (Author: kasha): Created this JIRA so we don't miss it, and assigning to myself. Will not be able to get to this this month. Please feel free to pick this up if you are interested and have cycles to work on this.
[jira] [Commented] (HADOOP-13714) Tighten up our compatibility guidelines for Hadoop 3
[ https://issues.apache.org/jira/browse/HADOOP-13714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569408#comment-15569408 ] Karthik Kambatla commented on HADOOP-13714: --- Created this JIRA so we don't miss it, and assigning to myself. Will not be able to get to this in the next month. Please feel free to pick this up if you are interested and have cycles to work on this.
[jira] [Created] (HADOOP-13714) Tighten up our compatibility guidelines for Hadoop 3
Karthik Kambatla created HADOOP-13714: - Summary: Tighten up our compatibility guidelines for Hadoop 3 Key: HADOOP-13714 URL: https://issues.apache.org/jira/browse/HADOOP-13714 Project: Hadoop Common Issue Type: Improvement Components: documentation Affects Versions: 2.7.3 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker Our current compatibility guidelines are incomplete and loose. For many categories, we do not have a policy. It would be nice to actually define those policies so our users know what to expect and the developers know what releases to target their changes.
[jira] [Assigned] (HADOOP-13493) Compatibility Docs should clarify the policy for what takes precedence when a conflict is found
[ https://issues.apache.org/jira/browse/HADOOP-13493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla reassigned HADOOP-13493: - Assignee: Karthik Kambatla
[jira] [Updated] (HADOOP-13493) Compatibility Docs should clarify the policy for what takes precedence when a conflict is found
[ https://issues.apache.org/jira/browse/HADOOP-13493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-13493: -- Target Version/s: 3.0.0-alpha1 (was: 2.8.0)
[jira] [Moved] (HADOOP-13493) Compatibility Docs should clarify the policy for what takes precedence when a conflict is found
[ https://issues.apache.org/jira/browse/HADOOP-13493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla moved YARN-5515 to HADOOP-13493: - Affects Version/s: (was: 2.7.2) 2.7.2 Component/s: (was: documentation) documentation Key: HADOOP-13493 (was: YARN-5515) Project: Hadoop Common (was: Hadoop YARN)
[jira] [Updated] (HADOOP-13493) Compatibility Docs should clarify the policy for what takes precedence when a conflict is found
[ https://issues.apache.org/jira/browse/HADOOP-13493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-13493: -- Target Version/s: 2.8.0
[jira] [Updated] (HADOOP-13493) Compatibility Docs should clarify the policy for what takes precedence when a conflict is found
[ https://issues.apache.org/jira/browse/HADOOP-13493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-13493: -- Priority: Critical (was: Major)
[jira] [Updated] (HADOOP-13299) JMXJsonServlet is vulnerable to TRACE
[ https://issues.apache.org/jira/browse/HADOOP-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-13299: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) Thanks [~haibochen] for the contribution, and [~templedf] for the review. Just committed this to trunk, branch-2, and branch-2.8. > JMXJsonServlet is vulnerable to TRACE > -- > > Key: HADOOP-13299 > URL: https://issues.apache.org/jira/browse/HADOOP-13299 > Project: Hadoop Common > Issue Type: Bug >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Minor > Fix For: 2.8.0 > > Attachments: hadoop13299.001.patch > > > Nessus scan shows that JMXJsonServlet is vulnerable to TRACE/TRACK requests. > We could disable this to avoid such vulnerability.
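The class of fix being committed can be illustrated with the JDK's built-in HttpServer (the real change lives in Hadoop's Jetty-based HTTP server and is not reproduced here): refuse TRACE/TRACK up front so no servlet, JMXJsonServlet included, can echo the request back. The server and endpoint below are hypothetical:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;

public class NoTraceServer {
    /** Starts a server on an ephemeral port whose handler rejects
     *  TRACE/TRACK before any normal request processing runs. */
    public static HttpServer start() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/jmx", exchange -> {
            String method = exchange.getRequestMethod();
            if ("TRACE".equalsIgnoreCase(method) || "TRACK".equalsIgnoreCase(method)) {
                exchange.sendResponseHeaders(405, -1); // Method Not Allowed, no body
            } else {
                byte[] body = "{}".getBytes("UTF-8");
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream os = exchange.getResponseBody()) { os.write(body); }
            }
            exchange.close();
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = start();
        int port = server.getAddress().getPort();
        HttpURLConnection conn = (HttpURLConnection)
            new URL("http://localhost:" + port + "/jmx").openConnection();
        conn.setRequestMethod("TRACE");
        System.out.println(conn.getResponseCode()); // prints 405
        server.stop(0);
    }
}
```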
[jira] [Commented] (HADOOP-13299) JMXJsonServlet is vulnerable to TRACE
[ https://issues.apache.org/jira/browse/HADOOP-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15414195#comment-15414195 ] Karthik Kambatla commented on HADOOP-13299: --- +1. Checking this in.
[jira] [Commented] (HADOOP-13299) JMXJsonServlet is vulnerable to TRACE
[ https://issues.apache.org/jira/browse/HADOOP-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15414155#comment-15414155 ] Karthik Kambatla commented on HADOOP-13299: --- The patch looks good to me. [~haibochen] - could you confirm this on a cluster as well? > JMXJsonServlet is vulnerable to TRACE > -- > > Key: HADOOP-13299 > URL: https://issues.apache.org/jira/browse/HADOOP-13299 > Project: Hadoop Common > Issue Type: Bug >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Minor > Attachments: hadoop13299.001.patch > > > Nessus scan shows that JMXJsonServlet is vulnerable to TRACE/TRACK requests. > We could disable this to avoid such vulnerability. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-13243) TestRollingFileSystemSink.testSetInitialFlushTime() fails intermittently
[ https://issues.apache.org/jira/browse/HADOOP-13243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-13243: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.9.0 Status: Resolved (was: Patch Available) Just committed this to trunk and branch-2. Thanks [~templedf] for the fix. > TestRollingFileSystemSink.testSetInitialFlushTime() fails intermittently > > > Key: HADOOP-13243 > URL: https://issues.apache.org/jira/browse/HADOOP-13243 > Project: Hadoop Common > Issue Type: Bug > Components: test >Affects Versions: 2.9.0 >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Minor > Fix For: 2.9.0 > > Attachments: HADOOP-13243.001.patch > > > Because of poor checking of boundary conditions, the test fails 1% of the > time: > {noformat} > The initial flush time was calculated incorrectly: 0 > Stacktrace > java.lang.AssertionError: The initial flush time was calculated incorrectly: 0 > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.assertTrue(Assert.java:41) > at > org.apache.hadoop.metrics2.sink.TestRollingFileSystemSink.testSetInitialFlushTime(TestRollingFileSystemSink.java:120) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
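The boundary condition at issue is easy to see in isolation. A minimal sketch with hypothetical names (the real logic lives in RollingFileSystemSink and also adds a random offset): a delay computed by clamping to the next interval boundary is always strictly positive, which is the property the failing assertion checks.

```java
// Hypothetical sketch of the flush-time arithmetic behind the flaky test.
// The delay until the next flush must stay strictly positive even when
// "now" lands exactly on an interval boundary.
public class FlushTime {
  static final long INTERVAL_MS = 3_600_000L; // one hour

  // Milliseconds until the next interval boundary, always in (0, INTERVAL_MS].
  static long millisUntilNextFlush(long nowMs) {
    long offset = nowMs % INTERVAL_MS;
    return INTERVAL_MS - offset; // offset == 0 -> a full interval, never 0
  }

  public static void main(String[] args) {
    System.out.println(millisUntilNextFlush(0L));         // 3600000
    System.out.println(millisUntilNextFlush(3_599_999L)); // 1
  }
}
```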
[jira] [Commented] (HADOOP-13243) TestRollingFileSystemSink.testSetInitialFlushTime() fails intermittently
[ https://issues.apache.org/jira/browse/HADOOP-13243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15326634#comment-15326634 ] Karthik Kambatla commented on HADOOP-13243: --- +1 > TestRollingFileSystemSink.testSetInitialFlushTime() fails intermittently > > > Key: HADOOP-13243 > URL: https://issues.apache.org/jira/browse/HADOOP-13243 > Project: Hadoop Common > Issue Type: Bug > Components: test >Affects Versions: 2.9.0 >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Minor > Attachments: HADOOP-13243.001.patch > > > Because of poor checking of boundary conditions, the test fails 1% of the > time: > {noformat} > The initial flush time was calculated incorrectly: 0 > Stacktrace > java.lang.AssertionError: The initial flush time was calculated incorrectly: 0 > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.assertTrue(Assert.java:41) > at > org.apache.hadoop.metrics2.sink.TestRollingFileSystemSink.testSetInitialFlushTime(TestRollingFileSystemSink.java:120) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-13184) Add "Apache" to Hadoop project logo
[ https://issues.apache.org/jira/browse/HADOOP-13184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15318601#comment-15318601 ] Karthik Kambatla commented on HADOOP-13184: --- +1 on option 4. > Add "Apache" to Hadoop project logo > --- > > Key: HADOOP-13184 > URL: https://issues.apache.org/jira/browse/HADOOP-13184 > Project: Hadoop Common > Issue Type: Task >Reporter: Chris Douglas >Assignee: Abhishek > > Many ASF projects include "Apache" in their logo. We should add it to Hadoop. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-13161) remove JDK7 from Dockerfile
[ https://issues.apache.org/jira/browse/HADOOP-13161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15285942#comment-15285942 ] Karthik Kambatla commented on HADOOP-13161: --- Patch looks good to me. +1. > remove JDK7 from Dockerfile > --- > > Key: HADOOP-13161 > URL: https://issues.apache.org/jira/browse/HADOOP-13161 > Project: Hadoop Common > Issue Type: Improvement > Components: build >Affects Versions: 3.0.0-alpha1 >Reporter: Allen Wittenauer >Assignee: Allen Wittenauer > Attachments: HADOOP-13161.00.patch, HADOOP-13161.01.patch, > HADOOP-13161.02.patch > > > We should slim down the Docker image by removing JDK7 now that trunk no > longer supports it. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-10584) ActiveStandbyElector goes down if ZK quorum become unavailable
[ https://issues.apache.org/jira/browse/HADOOP-10584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-10584: -- Target Version/s: 2.9.0 (was: 2.7.3) > ActiveStandbyElector goes down if ZK quorum become unavailable > -- > > Key: HADOOP-10584 > URL: https://issues.apache.org/jira/browse/HADOOP-10584 > Project: Hadoop Common > Issue Type: Bug > Components: ha >Affects Versions: 2.4.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Critical > Attachments: hadoop-10584-prelim.patch, rm.log > > > ActiveStandbyElector retries operations for a few times. If the ZK quorum > itself is down, it goes down and the daemons will have to be brought up > again. > Instead, it should log the fact that it is unable to talk to ZK, call > becomeStandby on its client, and continue to attempt connecting to ZK. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-12456) Verify setting HADOOP_HOME explicitly works
[ https://issues.apache.org/jira/browse/HADOOP-12456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-12456: -- Assignee: (was: Karthik Kambatla) > Verify setting HADOOP_HOME explicitly works > --- > > Key: HADOOP-12456 > URL: https://issues.apache.org/jira/browse/HADOOP-12456 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Karthik Kambatla >Priority: Blocker > > This is the equivalent of HADOOP-12451 for trunk (aka 3.0.0). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-10321) TestCompositeService should cover all enumerations of adding a service to a parent service
[ https://issues.apache.org/jira/browse/HADOOP-10321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-10321: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.9.0 Status: Resolved (was: Patch Available) Thanks a bunch for working on this, [~rchiang]. Feels good to commit this two-year old JIRA :) Just committed to trunk and branch-2. > TestCompositeService should cover all enumerations of adding a service to a > parent service > -- > > Key: HADOOP-10321 > URL: https://issues.apache.org/jira/browse/HADOOP-10321 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.3.0 >Reporter: Karthik Kambatla >Assignee: Ray Chiang > Labels: BB2015-05-RFC, supportability, test > Fix For: 2.9.0 > > Attachments: HADOOP-10321-02.patch, HADOOP-10321-03.patch, > HADOOP-10321-04.patch, HADOOP-10321-04.patch, HADOOP10321-01.patch > > > HADOOP-10085 fixes some synchronization issues in > CompositeService#addService(). The tests should cover all cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-10321) TestCompositeService should cover all enumerations of adding a service to a parent service
[ https://issues.apache.org/jira/browse/HADOOP-10321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-10321: -- Attachment: HADOOP-10321-04.patch The patch looks good to me. +1 Resubmitting the patch to kick Jenkins to be safe. Will commit once it passes. > TestCompositeService should cover all enumerations of adding a service to a > parent service > -- > > Key: HADOOP-10321 > URL: https://issues.apache.org/jira/browse/HADOOP-10321 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 2.3.0 >Reporter: Karthik Kambatla >Assignee: Ray Chiang > Labels: BB2015-05-RFC, supportability, test > Attachments: HADOOP-10321-02.patch, HADOOP-10321-03.patch, > HADOOP-10321-04.patch, HADOOP-10321-04.patch, HADOOP10321-01.patch > > > HADOOP-10085 fixes some synchronization issues in > CompositeService#addService(). The tests should cover all cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12835) RollingFileSystemSink can throw an NPE on non-secure clusters
[ https://issues.apache.org/jira/browse/HADOOP-12835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159752#comment-15159752 ] Karthik Kambatla commented on HADOOP-12835: --- Looks good. +1, pending Jenkins. > RollingFileSystemSink can throw an NPE on non-secure clusters > - > > Key: HADOOP-12835 > URL: https://issues.apache.org/jira/browse/HADOOP-12835 > Project: Hadoop Common > Issue Type: Bug > Components: metrics >Affects Versions: 2.9.0 >Reporter: Daniel Templeton >Assignee: Daniel Templeton > Attachments: HADOOP-12835.001.patch > > > If the sink init fails (such as because the HDFS cluster isn't running) on a > non-secure cluster, the init will throw an NPE because of missing properties. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12817) Enable TLS v1.1 and 1.2
[ https://issues.apache.org/jira/browse/HADOOP-12817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151660#comment-15151660 ] Karthik Kambatla commented on HADOOP-12817: --- LGTM. +1. > Enable TLS v1.1 and 1.2 > --- > > Key: HADOOP-12817 > URL: https://issues.apache.org/jira/browse/HADOOP-12817 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Reporter: Robert Kanter >Assignee: Robert Kanter > Attachments: HADOOP-12817.001.patch, HADOOP-12817.002.patch > > > Java 7 supports TLSv1.1 and TLSv1.2, which are more secure than TLSv1 (which > was all that was supported in Java 6), so we should add those to the default > list for {{hadoop.ssl.enabled.protocols}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
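For reference, {{hadoop.ssl.enabled.protocols}} is an ordinary Hadoop configuration property; the value below is illustrative of a protocol list including the newer TLS versions, not necessarily the exact committed default — check the patch before relying on it:

```xml
<property>
  <name>hadoop.ssl.enabled.protocols</name>
  <!-- Illustrative value: newer TLS versions alongside TLSv1 -->
  <value>TLSv1,TLSv1.1,TLSv1.2</value>
</property>
```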
[jira] [Commented] (HADOOP-12702) Add an HDFS metrics sink
[ https://issues.apache.org/jira/browse/HADOOP-12702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15122394#comment-15122394 ] Karthik Kambatla commented on HADOOP-12702: --- bq. Should close() call flush()? Missed the fact that close() calls close() on the underlying stream, that in turn takes care of flush. Latest patch looks good, but for one nit - we should null out the underlying streams on close. +1 after that. > Add an HDFS metrics sink > > > Key: HADOOP-12702 > URL: https://issues.apache.org/jira/browse/HADOOP-12702 > Project: Hadoop Common > Issue Type: Improvement > Components: metrics >Affects Versions: 2.7.1 >Reporter: Daniel Templeton >Assignee: Daniel Templeton > Attachments: HADOOP-12702.001.patch, HADOOP-12702.002.patch, > HADOOP-12702.003.patch, HADOOP-12702.004.patch, HADOOP-12702.005.patch > > > We need a metrics2 sink that can write metrics to HDFS. The sink should > accept as configuration a "directory prefix" and do the following in > {{putMetrics()}} > * Get MMddHH from current timestamp. > * If HDFS dir "dir prefix" + MMddHH doesn't exist, create it. Close any > currently open file and create a new file called .log in the new > directory. > * Write metrics to the current log file. > * If a write fails, it should be fatal to the process running the sink. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-12702) Add an HDFS metrics sink
[ https://issues.apache.org/jira/browse/HADOOP-12702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-12702: -- Issue Type: New Feature (was: Improvement) > Add an HDFS metrics sink > > > Key: HADOOP-12702 > URL: https://issues.apache.org/jira/browse/HADOOP-12702 > Project: Hadoop Common > Issue Type: New Feature > Components: metrics >Affects Versions: 2.7.1 >Reporter: Daniel Templeton >Assignee: Daniel Templeton > Attachments: HADOOP-12702.001.patch, HADOOP-12702.002.patch, > HADOOP-12702.003.patch, HADOOP-12702.004.patch, HADOOP-12702.005.patch, > HADOOP-12702.006.patch > > > We need a metrics2 sink that can write metrics to HDFS. The sink should > accept as configuration a "directory prefix" and do the following in > {{putMetrics()}} > * Get MMddHH from current timestamp. > * If HDFS dir "dir prefix" + MMddHH doesn't exist, create it. Close any > currently open file and create a new file called .log in the new > directory. > * Write metrics to the current log file. > * If a write fails, it should be fatal to the process running the sink. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12702) Add an HDFS metrics sink
[ https://issues.apache.org/jira/browse/HADOOP-12702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15122704#comment-15122704 ] Karthik Kambatla commented on HADOOP-12702: --- The unit test failures look unrelated. +1, checking this in. > Add an HDFS metrics sink > > > Key: HADOOP-12702 > URL: https://issues.apache.org/jira/browse/HADOOP-12702 > Project: Hadoop Common > Issue Type: Improvement > Components: metrics >Affects Versions: 2.7.1 >Reporter: Daniel Templeton >Assignee: Daniel Templeton > Attachments: HADOOP-12702.001.patch, HADOOP-12702.002.patch, > HADOOP-12702.003.patch, HADOOP-12702.004.patch, HADOOP-12702.005.patch, > HADOOP-12702.006.patch > > > We need a metrics2 sink that can write metrics to HDFS. The sink should > accept as configuration a "directory prefix" and do the following in > {{putMetrics()}} > * Get MMddHH from current timestamp. > * If HDFS dir "dir prefix" + MMddHH doesn't exist, create it. Close any > currently open file and create a new file called .log in the new > directory. > * Write metrics to the current log file. > * If a write fails, it should be fatal to the process running the sink. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-12702) Add an HDFS metrics sink
[ https://issues.apache.org/jira/browse/HADOOP-12702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-12702: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.9.0 Status: Resolved (was: Patch Available) Thanks Daniel for reporting and working on this. Just committed to trunk and branch-2. > Add an HDFS metrics sink > > > Key: HADOOP-12702 > URL: https://issues.apache.org/jira/browse/HADOOP-12702 > Project: Hadoop Common > Issue Type: New Feature > Components: metrics >Affects Versions: 2.7.1 >Reporter: Daniel Templeton >Assignee: Daniel Templeton > Fix For: 2.9.0 > > Attachments: HADOOP-12702.001.patch, HADOOP-12702.002.patch, > HADOOP-12702.003.patch, HADOOP-12702.004.patch, HADOOP-12702.005.patch, > HADOOP-12702.006.patch > > > We need a metrics2 sink that can write metrics to HDFS. The sink should > accept as configuration a "directory prefix" and do the following in > {{putMetrics()}} > * Get MMddHH from current timestamp. > * If HDFS dir "dir prefix" + MMddHH doesn't exist, create it. Close any > currently open file and create a new file called .log in the new > directory. > * Write metrics to the current log file. > * If a write fails, it should be fatal to the process running the sink. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12702) Add an HDFS metrics sink
[ https://issues.apache.org/jira/browse/HADOOP-12702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15120115#comment-15120115 ] Karthik Kambatla commented on HADOOP-12702: --- Looks mostly good. Few minor comments: # Should close() call flush()? # rollLogDirIfNeeded: The comments in the method are wrongly indented. And, the exception message at the end should be fixed up to say "Failed to *create* new log file". > Add an HDFS metrics sink > > > Key: HADOOP-12702 > URL: https://issues.apache.org/jira/browse/HADOOP-12702 > Project: Hadoop Common > Issue Type: Improvement > Components: metrics >Affects Versions: 2.7.1 >Reporter: Daniel Templeton >Assignee: Daniel Templeton > Attachments: HADOOP-12702.001.patch, HADOOP-12702.002.patch, > HADOOP-12702.003.patch, HADOOP-12702.004.patch > > > We need a metrics2 sink that can write metrics to HDFS. The sink should > accept as configuration a "directory prefix" and do the following in > {{putMetrics()}} > * Get MMddHH from current timestamp. > * If HDFS dir "dir prefix" + MMddHH doesn't exist, create it. Close any > currently open file and create a new file called .log in the new > directory. > * Write metrics to the current log file. > * If a write fails, it should be fatal to the process running the sink. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12702) Add an HDFS metrics sink
[ https://issues.apache.org/jira/browse/HADOOP-12702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118451#comment-15118451 ] Karthik Kambatla commented on HADOOP-12702: --- Thanks for filing and working on this, [~templedf]. Comments on the latest patch: # FileSystemSink: ## Given the sink we are adding here has some quirks in its behavior - new directory every hour etc., the class name FileSystemSink seems too simple. Can we capture more of the behavior in the name? ## Rename currentPath to currentDirPath, currentFile to currentFilePath, currentOut to currentOutStream for clarity? ## While reading {{BASEPATH_KEY}} from conf, there is no default value? ## {{checkAppend}}: If appending throws an IOE that is not because of not being supported, should we allow appending? I would think not. ## {{rollLogDirIfNeeded}}: For readability, should we split it into two ifs - the first is when the directories don't match. Also, the comment in the method is wrongly indented and slightly confusing. {code} if (!path.equals(currentPath)) { if (currentOut != null) { currentOut.close(); currentOut = null; } currentPath = path; } if (currentOut == null) { // rest of the code } {code} ## Typo in the javadoc for createLogFile - nonExistant ## {{putMetrics}}: When throwing MetricsException, no need for a new line between setting the message and actually throwing the exception. Also, should just have a method that takes a message (String) and throws an exception if ignore error is not turned on. The only downside would be the intern objects for the strings here. ## Should {{flush}} also be invoking {{currentFSOut.hflush}}? The tests look good. Should we play around with the allowed configs also? I am fine with not doing that or following up in another JIRA.
> Add an HDFS metrics sink > > > Key: HADOOP-12702 > URL: https://issues.apache.org/jira/browse/HADOOP-12702 > Project: Hadoop Common > Issue Type: Improvement > Components: metrics >Affects Versions: 2.7.1 >Reporter: Daniel Templeton >Assignee: Daniel Templeton > Attachments: HADOOP-12702.001.patch, HADOOP-12702.002.patch, > HADOOP-12702.003.patch > > > We need a metrics2 sink that can write metrics to HDFS. The sink should > accept as configuration a "directory prefix" and do the following in > {{putMetrics()}} > * Get MMddHH from current timestamp. > * If HDFS dir "dir prefix" + MMddHH doesn't exist, create it. Close any > currently open file and create a new file called .log in the new > directory. > * Write metrics to the current log file. > * If a write fails, it should be fatal to the process running the sink. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
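The hourly-roll behavior discussed in the review above can be sketched in a few lines. This is an illustrative stand-in, not the committed sink: it uses local java.nio paths instead of Hadoop's FileSystem API, and the names (rollLogDirIfNeeded, currentDirPath) simply mirror the review comments.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Illustrative stand-in for the sink's hourly roll; not the real patch.
public class RollingSinkSketch {
  private static final DateTimeFormatter HOUR_FMT =
      DateTimeFormatter.ofPattern("yyyyMMddHH");

  private final Path basePath;
  private Path currentDirPath; // directory for the current hour, or null

  public RollingSinkSketch(Path basePath) {
    this.basePath = basePath;
  }

  // The per-hour directory, e.g. <base>/2016012817 for 17:xx UTC.
  Path dirFor(Instant now) {
    return basePath.resolve(HOUR_FMT.format(now.atZone(ZoneOffset.UTC)));
  }

  // Split into the two ifs suggested in the review: first switch directories
  // (the real sink closes its open output stream here), then create the
  // new directory if it is missing.
  void rollLogDirIfNeeded(Instant now) throws IOException {
    Path dir = dirFor(now);
    if (!dir.equals(currentDirPath)) {
      // real sink: close currentOutStream before switching
      currentDirPath = dir;
    }
    if (!Files.exists(currentDirPath)) {
      Files.createDirectories(currentDirPath);
    }
  }

  Path currentDir() {
    return currentDirPath;
  }
}
```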
[jira] [Updated] (HADOOP-12683) Add number of samples in last interval in snapshot of MutableStat
[ https://issues.apache.org/jira/browse/HADOOP-12683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-12683: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.9.0 Status: Resolved (was: Patch Available) Thanks for the contribution, [~vickyuec]. Just committed this to trunk and branch-2. > Add number of samples in last interval in snapshot of MutableStat > - > > Key: HADOOP-12683 > URL: https://issues.apache.org/jira/browse/HADOOP-12683 > Project: Hadoop Common > Issue Type: Improvement > Components: metrics >Affects Versions: 2.7.1 >Reporter: Vikram Srivastava >Assignee: Vikram Srivastava >Priority: Minor > Fix For: 2.9.0 > > Attachments: HADOOP-12683.001.patch, HADOOP-12683.002.patch > > > Besides the total number of samples, it is also useful to know the number of > samples in the last snapshot of MutableStat. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12683) Add number of samples in last interval in snapshot of MutableStat
[ https://issues.apache.org/jira/browse/HADOOP-12683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097292#comment-15097292 ] Karthik Kambatla commented on HADOOP-12683: --- +1 > Add number of samples in last interval in snapshot of MutableStat > - > > Key: HADOOP-12683 > URL: https://issues.apache.org/jira/browse/HADOOP-12683 > Project: Hadoop Common > Issue Type: Improvement > Components: metrics >Affects Versions: 2.7.1 >Reporter: Vikram Srivastava >Assignee: Vikram Srivastava >Priority: Minor > Attachments: HADOOP-12683.001.patch, HADOOP-12683.002.patch > > > Besides the total number of samples, it is also useful to know the number of > samples in the last snapshot of MutableStat. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12683) Add number of samples in last interval in snapshot of MutableStat
[ https://issues.apache.org/jira/browse/HADOOP-12683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15095081#comment-15095081 ] Karthik Kambatla commented on HADOOP-12683: --- The patch itself looks good to me. > Add number of samples in last interval in snapshot of MutableStat > - > > Key: HADOOP-12683 > URL: https://issues.apache.org/jira/browse/HADOOP-12683 > Project: Hadoop Common > Issue Type: Improvement > Components: metrics >Affects Versions: 2.7.1 >Reporter: Vikram Srivastava >Assignee: Vikram Srivastava >Priority: Minor > Attachments: HADOOP-12683.001.patch > > > Besides the total number of samples, it is also useful to know the number of > samples in the last snapshot of MutableStat. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-12683) Add number of samples in last interval in snapshot of MutableStat
[ https://issues.apache.org/jira/browse/HADOOP-12683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-12683: -- Target Version/s: 2.9.0 > Add number of samples in last interval in snapshot of MutableStat > - > > Key: HADOOP-12683 > URL: https://issues.apache.org/jira/browse/HADOOP-12683 > Project: Hadoop Common > Issue Type: Improvement > Components: metrics >Affects Versions: 2.7.1 >Reporter: Vikram Srivastava >Assignee: Vikram Srivastava >Priority: Minor > Attachments: HADOOP-12683.001.patch > > > Besides the total number of samples, it is also useful to know the number of > samples in the last snapshot of MutableStat. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12683) Add number of samples in last interval in snapshot of MutableStat
[ https://issues.apache.org/jira/browse/HADOOP-12683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15095080#comment-15095080 ] Karthik Kambatla commented on HADOOP-12683: --- Thanks for filing and working on this, [~vickyuec]. Any chance we could add a test to verify? > Add number of samples in last interval in snapshot of MutableStat > - > > Key: HADOOP-12683 > URL: https://issues.apache.org/jira/browse/HADOOP-12683 > Project: Hadoop Common > Issue Type: Improvement > Components: metrics >Affects Versions: 2.7.1 >Reporter: Vikram Srivastava >Assignee: Vikram Srivastava >Priority: Minor > Attachments: HADOOP-12683.001.patch > > > Besides the total number of samples, it is also useful to know the number of > samples in the last snapshot of MutableStat. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-12683) Add number of samples in last interval in snapshot of MutableStat
[ https://issues.apache.org/jira/browse/HADOOP-12683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-12683: -- Issue Type: Improvement (was: Task) > Add number of samples in last interval in snapshot of MutableStat > - > > Key: HADOOP-12683 > URL: https://issues.apache.org/jira/browse/HADOOP-12683 > Project: Hadoop Common > Issue Type: Improvement > Components: metrics >Affects Versions: 2.7.1 >Reporter: Vikram Srivastava >Assignee: Vikram Srivastava >Priority: Minor > Attachments: HADOOP-12683.001.patch > > > Besides the total number of samples, it is also useful to know the number of > samples in the last snapshot of MutableStat. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
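The per-interval counter added in HADOOP-12683 is conceptually tiny. A hypothetical sketch — the class and method names below are invented for illustration, not Hadoop's MutableStat API: keep a second counter that resets on every snapshot.

```java
// Hypothetical sketch of the HADOOP-12683 idea: track samples since the
// previous snapshot alongside the running total.
public class IntervalStat {
  private long totalSamples;
  private long intervalSamples; // samples since the last snapshot

  public void add(double value) {
    totalSamples++;
    intervalSamples++;
  }

  // Report the per-interval count and start a new interval.
  public long snapshotIntervalSamples() {
    long n = intervalSamples;
    intervalSamples = 0;
    return n;
  }

  public long totalSamples() {
    return totalSamples;
  }
}
```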
[jira] [Updated] (HADOOP-12566) Add NullGroupMapping
[ https://issues.apache.org/jira/browse/HADOOP-12566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-12566: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.9.0 Status: Resolved (was: Patch Available) Just committed this to trunk and branch-2. > Add NullGroupMapping > > > Key: HADOOP-12566 > URL: https://issues.apache.org/jira/browse/HADOOP-12566 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Daniel Templeton >Assignee: Daniel Templeton > Fix For: 2.9.0 > > Attachments: HADOOP-12566.001.patch, HADOOP-12566.002.patch > > > Add a {{NullGroupMapping}} for cases where user groups are not used. > {{ShellBasedUnixGroupMapping}} can be used in places where latency is not > important. In places like starting a container, it's worth it to avoid the > extra fork and exec. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12566) Add NullGroupMapping
[ https://issues.apache.org/jira/browse/HADOOP-12566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15070271#comment-15070271 ] Karthik Kambatla commented on HADOOP-12566: --- Thanks for filing and working on this, [~templedf]. > Add NullGroupMapping > > > Key: HADOOP-12566 > URL: https://issues.apache.org/jira/browse/HADOOP-12566 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Daniel Templeton >Assignee: Daniel Templeton > Fix For: 2.9.0 > > Attachments: HADOOP-12566.001.patch, HADOOP-12566.002.patch > > > Add a {{NullGroupMapping}} for cases where user groups are not used. > {{ShellBasedUnixGroupMapping}} can be used in places where latency is not > important. In places like starting a container, it's worth it to avoid the > extra fork and exec. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12566) Add NullGroupMapping
[ https://issues.apache.org/jira/browse/HADOOP-12566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15070263#comment-15070263 ] Karthik Kambatla commented on HADOOP-12566: --- Fairly straightforward patch. +1. Committing this. > Add NullGroupMapping > > > Key: HADOOP-12566 > URL: https://issues.apache.org/jira/browse/HADOOP-12566 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Daniel Templeton >Assignee: Daniel Templeton > Attachments: HADOOP-12566.001.patch, HADOOP-12566.002.patch > > > Add a {{NullGroupMapping}} for cases where user groups are not used. > {{ShellBasedUnixGroupMapping}} can be used in places where latency is not > important. In places like starting a container, it's worth it to avoid the > extra fork and exec. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
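The essence of the change fits in a few lines. The interface below is a stand-in for Hadoop's group-mapping plugin interface, so this is a sketch rather than the committed class: every lookup returns an empty list, and no shell fork/exec ever happens.

```java
import java.util.Collections;
import java.util.List;

// Stand-in for Hadoop's group-mapping plugin interface (illustration only).
interface GroupMapping {
  List<String> getGroups(String user);
}

// A "null" mapping: groups are never resolved, so container launch never
// pays for the fork/exec a shell-based mapping would perform.
public class NullGroupMapping implements GroupMapping {
  @Override
  public List<String> getGroups(String user) {
    return Collections.emptyList();
  }
}
```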
[jira] [Commented] (HADOOP-12584) Disable directory browsing in HttpServer2
[ https://issues.apache.org/jira/browse/HADOOP-12584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15012247#comment-15012247 ] Karthik Kambatla commented on HADOOP-12584: --- +1 > Disable directory browsing in HttpServer2 > - > > Key: HADOOP-12584 > URL: https://issues.apache.org/jira/browse/HADOOP-12584 > Project: Hadoop Common > Issue Type: Bug > Components: security >Affects Versions: 2.8.0 >Reporter: Robert Kanter >Assignee: Robert Kanter > Attachments: HADOOP-12584.001.patch > > > We found a minor security issue with the Yarn Web UIs (or anything using > {{HttpServer2}}. Currently, you can list the contents of the {{/static}} > directory for the RM, NM, and JHS. This isn't a huge deal, but there are > some ways to abuse this to get access to files on the host, though it would > be pretty difficult. It's also good practice to disable directory listing on > web apps. > Here are the URLs: > - http://HOST:8088/static/ > - http://HOST:19888/static/ > - http://HOST:8042/static/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-10584) ActiveStandbyElector goes down if ZK quorum become unavailable
[ https://issues.apache.org/jira/browse/HADOOP-10584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15004828#comment-15004828 ] Karthik Kambatla commented on HADOOP-10584: --- Based on my recollection from a while ago and briefly looking at the attached prelim patch, there are a couple of issues here: # When RM loses connection while executing an operation, the operation just fails without enough retries. The patch adds a retry-loop to handle this. # When RM loses connection to ZK, it doesn't give up being Active. This leads to the RM continuing to serve apps and nodes connected to it. The patch, in addition to rejoining election, has the client (ZKFC/RM) enter neutral mode. Today, the RM doesn't do anything on {{enterNeutralMode}} but of course this can be improved going forward. I won't be able to work on this for the next month or so. If anyone has cycles, please feel free to take it up. > ActiveStandbyElector goes down if ZK quorum become unavailable > -- > > Key: HADOOP-10584 > URL: https://issues.apache.org/jira/browse/HADOOP-10584 > Project: Hadoop Common > Issue Type: Bug > Components: ha >Affects Versions: 2.4.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Critical > Attachments: hadoop-10584-prelim.patch, rm.log > > > ActiveStandbyElector retries operations for a few times. If the ZK quorum > itself is down, it goes down and the daemons will have to be brought up > again. > Instead, it should log the fact that it is unable to talk to ZK, call > becomeStandby on its client, and continue to attempt connecting to ZK. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
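The two fixes described in the comment can be sketched together — hypothetical names throughout (runWithRetries, Client), not the prelim patch itself: bound the retries around each ZK operation, and on persistent failure step down via enterNeutralMode instead of exiting the daemon.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch of the two fixes: bounded retries around a ZK
// operation, then neutral mode (becomeStandby) instead of process exit.
public class ElectorRetrySketch {
  interface Client {
    void enterNeutralMode();
  }

  static <T> T runWithRetries(Callable<T> op, int maxRetries, Client client)
      throws Exception {
    Exception last = null;
    for (int attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        return op.call();
      } catch (Exception e) {
        last = e; // in the elector, typically a ZK connection loss
      }
    }
    // Quorum still unreachable: step down rather than kill the daemon.
    client.enterNeutralMode();
    throw last;
  }

  // Demo helper: run an always-failing op; return {attempts, enteredNeutral}.
  static int[] demoAlwaysFailing(int maxRetries) {
    int[] result = new int[2];
    try {
      runWithRetries(() -> {
        result[0]++;
        throw new RuntimeException("zk down");
      }, maxRetries, () -> result[1] = 1);
    } catch (Exception expected) {
      // the final failure still surfaces to the caller
    }
    return result;
  }
}
```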
[jira] [Commented] (HADOOP-12451) [Branch-2] Setting HADOOP_HOME explicitly should be allowed
[ https://issues.apache.org/jira/browse/HADOOP-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14999390#comment-14999390 ] Karthik Kambatla commented on HADOOP-12451: --- Thanks Chris and Vinod. HADOOP-12456 tracks this for trunk. > [Branch-2] Setting HADOOP_HOME explicitly should be allowed > --- > > Key: HADOOP-12451 > URL: https://issues.apache.org/jira/browse/HADOOP-12451 > Project: Hadoop Common > Issue Type: Bug > Components: scripts >Affects Versions: 2.7.1 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Blocker > Fix For: 2.7.2 > > Attachments: HADOOP-12451-branch-2.1.patch, > HADOOP-12451-branch-2.addendum-1.patch, hadoop-12451-branch-2.addendum-2.patch > > > HADOOP-11464 reinstates cygwin support. In the process, it sets HADOOP_HOME > explicitly in hadoop-config.sh without checking if it has already been set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12451) Setting HADOOP_HOME explicitly should be allowed
[ https://issues.apache.org/jira/browse/HADOOP-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997132#comment-14997132 ] Karthik Kambatla commented on HADOOP-12451: --- Looking into this, now. > Setting HADOOP_HOME explicitly should be allowed > > > Key: HADOOP-12451 > URL: https://issues.apache.org/jira/browse/HADOOP-12451 > Project: Hadoop Common > Issue Type: Bug > Components: scripts >Affects Versions: 2.7.1 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Blocker > Fix For: 2.7.2 > > Attachments: HADOOP-12451-branch-2.1.patch, > HADOOP-12451-branch-2.addendum-1.patch > > > HADOOP-11464 reinstates cygwin support. In the process, it sets HADOOP_HOME > explicitly in hadoop-config.sh without checking if it has already been set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12451) Setting HADOOP_HOME explicitly should be allowed
[ https://issues.apache.org/jira/browse/HADOOP-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14997180#comment-14997180 ] Karthik Kambatla commented on HADOOP-12451: --- I brought up a pseudo-distributed cluster on a Mac and ran a few MR jobs. [~cnauroth] - could you take it for a spin? If the patch needs any further fixes, I should be able to iterate later today. Thanks. > Setting HADOOP_HOME explicitly should be allowed > > > Key: HADOOP-12451 > URL: https://issues.apache.org/jira/browse/HADOOP-12451 > Project: Hadoop Common > Issue Type: Bug > Components: scripts >Affects Versions: 2.7.1 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Blocker > Fix For: 2.7.2 > > Attachments: HADOOP-12451-branch-2.1.patch, > HADOOP-12451-branch-2.addendum-1.patch, hadoop-12451-branch-2.addendum-2.patch > > > HADOOP-11464 reinstates cygwin support. In the process, it sets HADOOP_HOME > explicitly in hadoop-config.sh without checking if it has already been set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-12451) Setting HADOOP_HOME explicitly should be allowed
[ https://issues.apache.org/jira/browse/HADOOP-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-12451: -- Attachment: hadoop-12451-branch-2.addendum-2.patch Thanks for pointing the issues out, Vinod. Updated patch addresses them. > Setting HADOOP_HOME explicitly should be allowed > > > Key: HADOOP-12451 > URL: https://issues.apache.org/jira/browse/HADOOP-12451 > Project: Hadoop Common > Issue Type: Bug > Components: scripts >Affects Versions: 2.7.1 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Blocker > Fix For: 2.7.2 > > Attachments: HADOOP-12451-branch-2.1.patch, > HADOOP-12451-branch-2.addendum-1.patch, hadoop-12451-branch-2.addendum-2.patch > > > HADOOP-11464 reinstates cygwin support. In the process, it sets HADOOP_HOME > explicitly in hadoop-config.sh without checking if it has already been set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HADOOP-12451) Setting HADOOP_HOME explicitly should be allowed
[ https://issues.apache.org/jira/browse/HADOOP-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14945465#comment-14945465 ] Karthik Kambatla edited comment on HADOOP-12451 at 11/4/15 6:41 PM: The committed patch prevents HADOOP_HOME from being overwritten. In addition to this, HADOOP-11464 changes the behavior in yet another way. It actually sets HADOOP_HOME whereas previously it was left empty. We realized we were relying on HADOOP_HOME not being set at all, and it broke a bunch of things. Should we change the patch to set HADOOP_HOME only if it is cygwin. Per [~aw]'s comment [here|https://issues.apache.org/jira/browse/HADOOP-12456?focusedCommentId=14940643=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14940643], HADOOP_HOME is supported only on Windows. was (Author: kasha): The committed patch avoids HADOOP_HOME is not overwritten. In addition to this, HADOOP-11464 changes the behavior in yet another way. It actually sets HADOOP_HOME whereas previously it was left empty. We realized we were relying on HADOOP_HOME not being set at all, and it broke a bunch of things. Should we change the patch to set HADOOP_HOME only if it is cygwin. Per [~aw]'s comment [here|https://issues.apache.org/jira/browse/HADOOP-12456?focusedCommentId=14940643=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14940643], HADOOP_HOME is supported only on Windows. > Setting HADOOP_HOME explicitly should be allowed > > > Key: HADOOP-12451 > URL: https://issues.apache.org/jira/browse/HADOOP-12451 > Project: Hadoop Common > Issue Type: Bug > Components: scripts >Affects Versions: 2.7.1 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Blocker > Fix For: 2.7.2 > > Attachments: HADOOP-12451-branch-2.1.patch > > > HADOOP-11464 reinstates cygwin support. In the process, it sets HADOOP_HOME > explicitly in hadoop-config.sh without checking if it has already been set. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-12451) Setting HADOOP_HOME explicitly should be allowed
[ https://issues.apache.org/jira/browse/HADOOP-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-12451: -- Attachment: HADOOP-12451-branch-2.addendum-1.patch Straightforward addendum patch that sets HADOOP_HOME only when using cygwin. I haven't tested this on cygwin, since I have no access to a Windows box. [~cnauroth], [~vinodkv] - I'd appreciate any help with testing and review. > Setting HADOOP_HOME explicitly should be allowed > > > Key: HADOOP-12451 > URL: https://issues.apache.org/jira/browse/HADOOP-12451 > Project: Hadoop Common > Issue Type: Bug > Components: scripts >Affects Versions: 2.7.1 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Blocker > Fix For: 2.7.2 > > Attachments: HADOOP-12451-branch-2.1.patch, > HADOOP-12451-branch-2.addendum-1.patch > > > HADOOP-11464 reinstates cygwin support. In the process, it sets HADOOP_HOME > explicitly in hadoop-config.sh without checking if it has already been set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
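A minimal sketch of what the addendum describes, assuming a uname-based cygwin check and a made-up default path (the real hadoop-config.sh may detect cygwin differently and computes its default from the script's location):

```shell
#!/bin/sh
# Only assign HADOOP_HOME when running under cygwin, and even then keep any
# value the caller already set. /usr/lib/hadoop is a made-up default.
case "$(uname -s)" in
  CYGWIN*)
    export HADOOP_HOME="${HADOOP_HOME:-/usr/lib/hadoop}"
    ;;
  *)
    # Non-cygwin platforms: leave HADOOP_HOME entirely alone.
    ;;
esac
```

On anything other than cygwin, the variable is neither created nor modified, which preserves the "unset means unset" behavior callers were relying on.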
[jira] [Updated] (HADOOP-12451) Setting HADOOP_HOME explicitly should be allowed
[ https://issues.apache.org/jira/browse/HADOOP-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-12451: -- Hadoop Flags: (was: Reviewed) Status: Patch Available (was: Reopened) > Setting HADOOP_HOME explicitly should be allowed > > > Key: HADOOP-12451 > URL: https://issues.apache.org/jira/browse/HADOOP-12451 > Project: Hadoop Common > Issue Type: Bug > Components: scripts >Affects Versions: 2.7.1 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Blocker > Fix For: 2.7.2 > > Attachments: HADOOP-12451-branch-2.1.patch, > HADOOP-12451-branch-2.addendum-1.patch > > > HADOOP-11464 reinstates cygwin support. In the process, it sets HADOOP_HOME > explicitly in hadoop-config.sh without checking if it has already been set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12451) Setting HADOOP_HOME explicitly should be allowed
[ https://issues.apache.org/jira/browse/HADOOP-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14975130#comment-14975130 ] Karthik Kambatla commented on HADOOP-12451: --- We should do an addendum. I'll try to get to this this week. > Setting HADOOP_HOME explicitly should be allowed > > > Key: HADOOP-12451 > URL: https://issues.apache.org/jira/browse/HADOOP-12451 > Project: Hadoop Common > Issue Type: Bug > Components: scripts >Affects Versions: 2.7.1 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Blocker > Fix For: 2.7.2 > > Attachments: HADOOP-12451-branch-2.1.patch > > > HADOOP-11464 reinstates cygwin support. In the process, it sets HADOOP_HOME > explicitly in hadoop-config.sh without checking if it has already been set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HADOOP-12451) Setting HADOOP_HOME explicitly should be allowed
[ https://issues.apache.org/jira/browse/HADOOP-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla reopened HADOOP-12451: --- The committed patch prevents HADOOP_HOME from being overwritten. In addition to this, HADOOP-11464 changes the behavior in yet another way. It actually sets HADOOP_HOME whereas previously it was left empty. We realized we were relying on HADOOP_HOME not being set at all, and it broke a bunch of things. Should we change the patch to set HADOOP_HOME only when running under cygwin? Per [~aw]'s comment [here|https://issues.apache.org/jira/browse/HADOOP-12456?focusedCommentId=14940643&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14940643], HADOOP_HOME is supported only on Windows. > Setting HADOOP_HOME explicitly should be allowed > > > Key: HADOOP-12451 > URL: https://issues.apache.org/jira/browse/HADOOP-12451 > Project: Hadoop Common > Issue Type: Bug > Components: scripts >Affects Versions: 2.7.1 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Blocker > Fix For: 2.7.2 > > Attachments: HADOOP-12451-branch-2.1.patch > > > HADOOP-11464 reinstates cygwin support. In the process, it sets HADOOP_HOME > explicitly in hadoop-config.sh without checking if it has already been set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
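The distinction that caused the breakage described here — HADOOP_HOME being set (possibly to a computed value) versus not set at all — is one shell scripts can test for explicitly. A small illustration of the relevant expansion:

```shell
#!/bin/sh
# ${var+x} expands to x only when var is set (even to the empty string),
# so a script can distinguish "unset" from "set but empty".
if [ -z "${HADOOP_HOME+x}" ]; then
  echo "HADOOP_HOME is unset"
else
  echo "HADOOP_HOME is set to '${HADOOP_HOME}'"
fi
```

Code that branches on `[ -n "$HADOOP_HOME" ]` alone cannot tell these two cases apart, which is exactly why a script unconditionally assigning the variable changed downstream behavior.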
[jira] [Created] (HADOOP-12456) Verify setting HADOOP_HOME explicitly works
Karthik Kambatla created HADOOP-12456: - Summary: Verify setting HADOOP_HOME explicitly works Key: HADOOP-12456 URL: https://issues.apache.org/jira/browse/HADOOP-12456 Project: Hadoop Common Issue Type: Bug Affects Versions: 3.0.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker This is the equivalent of HADOOP-12451 for trunk (aka 3.0.0). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12451) One should be able to set HADOOP_HOME outside as well
[ https://issues.apache.org/jira/browse/HADOOP-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940171#comment-14940171 ] Karthik Kambatla commented on HADOOP-12451: --- Thanks Chris. Checking this in. > One should be able to set HADOOP_HOME outside as well > - > > Key: HADOOP-12451 > URL: https://issues.apache.org/jira/browse/HADOOP-12451 > Project: Hadoop Common > Issue Type: Bug > Components: scripts >Affects Versions: 2.7.1 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Blocker > Attachments: HADOOP-12451-branch-2.1.patch > > > HADOOP-11464 reinstates cygwin support. In the process, it sets HADOOP_HOME > explicitly in hadoop-config.sh without checking if it has already been set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12451) Setting HADOOP_HOME explicitly should be allowed
[ https://issues.apache.org/jira/browse/HADOOP-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14940176#comment-14940176 ] Karthik Kambatla commented on HADOOP-12451: --- [~cnauroth] - trunk scripts are very different. Will create a JIRA to make sure setting HADOOP_HOME explicitly works in trunk as well. > Setting HADOOP_HOME explicitly should be allowed > > > Key: HADOOP-12451 > URL: https://issues.apache.org/jira/browse/HADOOP-12451 > Project: Hadoop Common > Issue Type: Bug > Components: scripts >Affects Versions: 2.7.1 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Blocker > Attachments: HADOOP-12451-branch-2.1.patch > > > HADOOP-11464 reinstates cygwin support. In the process, it sets HADOOP_HOME > explicitly in hadoop-config.sh without checking if it has already been set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-12451) Setting HADOOP_HOME explicitly should be allowed
[ https://issues.apache.org/jira/browse/HADOOP-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-12451: -- Summary: Setting HADOOP_HOME explicitly should be allowed (was: One should be able to set HADOOP_HOME outside as well) > Setting HADOOP_HOME explicitly should be allowed > > > Key: HADOOP-12451 > URL: https://issues.apache.org/jira/browse/HADOOP-12451 > Project: Hadoop Common > Issue Type: Bug > Components: scripts >Affects Versions: 2.7.1 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Blocker > Attachments: HADOOP-12451-branch-2.1.patch > > > HADOOP-11464 reinstates cygwin support. In the process, it sets HADOOP_HOME > explicitly in hadoop-config.sh without checking if it has already been set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-12451) Setting HADOOP_HOME explicitly should be allowed
[ https://issues.apache.org/jira/browse/HADOOP-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-12451: -- Resolution: Fixed Fix Version/s: 2.7.2 Status: Resolved (was: Patch Available) > Setting HADOOP_HOME explicitly should be allowed > > > Key: HADOOP-12451 > URL: https://issues.apache.org/jira/browse/HADOOP-12451 > Project: Hadoop Common > Issue Type: Bug > Components: scripts >Affects Versions: 2.7.1 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Blocker > Fix For: 2.7.2 > > Attachments: HADOOP-12451-branch-2.1.patch > > > HADOOP-11464 reinstates cygwin support. In the process, it sets HADOOP_HOME > explicitly in hadoop-config.sh without checking if it has already been set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12451) One should be able to set HADOOP_HOME outside as well
[ https://issues.apache.org/jira/browse/HADOOP-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14938219#comment-14938219 ] Karthik Kambatla commented on HADOOP-12451: --- This breaks some of our internal tests. I deployed the change and verified the tests pass. > One should be able to set HADOOP_HOME outside as well > - > > Key: HADOOP-12451 > URL: https://issues.apache.org/jira/browse/HADOOP-12451 > Project: Hadoop Common > Issue Type: Bug > Components: conf >Affects Versions: 2.7.1 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Blocker > Attachments: HADOOP-12451-branch-2.1.patch > > > HADOOP-11464 reinstates cygwin support. In the process, it sets HADOOP_HOME > explicitly in hadoop-config.sh without checking if it has already been set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-12451) One should be able to set HADOOP_HOME outside as well
[ https://issues.apache.org/jira/browse/HADOOP-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-12451: -- Status: Patch Available (was: Open) > One should be able to set HADOOP_HOME outside as well > - > > Key: HADOOP-12451 > URL: https://issues.apache.org/jira/browse/HADOOP-12451 > Project: Hadoop Common > Issue Type: Bug > Components: conf >Affects Versions: 2.7.1 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Blocker > Attachments: HADOOP-12451-branch-2.1.patch > > > HADOOP-11464 reinstates cygwin support. In the process, it sets HADOOP_HOME > explicitly in hadoop-config.sh without checking if it has already been set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12451) One should be able to set HADOOP_HOME outside as well
[ https://issues.apache.org/jira/browse/HADOOP-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14939125#comment-14939125 ] Karthik Kambatla commented on HADOOP-12451: --- Thanks for the review, ATM. I'll go ahead and commit this tomorrow at noon PT. [~cnauroth] - are you able to take a quick look at the patch before then? > One should be able to set HADOOP_HOME outside as well > - > > Key: HADOOP-12451 > URL: https://issues.apache.org/jira/browse/HADOOP-12451 > Project: Hadoop Common > Issue Type: Bug > Components: conf >Affects Versions: 2.7.1 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Blocker > Attachments: HADOOP-12451-branch-2.1.patch > > > HADOOP-11464 reinstates cygwin support. In the process, it sets HADOOP_HOME > explicitly in hadoop-config.sh without checking if it has already been set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-12451) One should be able to set HADOOP_HOME outside as well
[ https://issues.apache.org/jira/browse/HADOOP-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-12451: -- Attachment: HADOOP-12451-branch-2.1.patch Simple patch that adds a check to see if HADOOP_HOME has already been set, similar to how HADOOP_HDFS_HOME is handled. [~cnauroth] - can you please review the patch since you know the latest details on how to ensure it runs with cygwin? > One should be able to set HADOOP_HOME outside as well > - > > Key: HADOOP-12451 > URL: https://issues.apache.org/jira/browse/HADOOP-12451 > Project: Hadoop Common > Issue Type: Bug > Components: conf >Affects Versions: 2.7.1 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Blocker > Attachments: HADOOP-12451-branch-2.1.patch > > > HADOOP-11464 reinstates cygwin support. In the process, it sets HADOOP_HOME > explicitly in hadoop-config.sh without checking if it has already been set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
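The "check whether it has already been set" pattern the patch applies (mirroring how HADOOP_HDFS_HOME is handled) is the usual default-if-unset expansion. A sketch with a made-up default path — the real hadoop-config.sh derives its default from the script's own location:

```shell
#!/bin/sh
# Respect a HADOOP_HOME the caller already exported; only fall back to a
# default when it is unset or empty. /usr/lib/hadoop is a made-up default.
export HADOOP_HOME="${HADOOP_HOME:-/usr/lib/hadoop}"
echo "using HADOOP_HOME=$HADOOP_HOME"
```

With this idiom, `HADOOP_HOME=/opt/custom bin/hadoop ...` keeps the caller's value, while an invocation with nothing exported gets the script's default.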
[jira] [Created] (HADOOP-12451) One should be able to set HADOOP_HOME outside as well
Karthik Kambatla created HADOOP-12451: - Summary: One should be able to set HADOOP_HOME outside as well Key: HADOOP-12451 URL: https://issues.apache.org/jira/browse/HADOOP-12451 Project: Hadoop Common Issue Type: Bug Components: conf Affects Versions: 2.7.1 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker HADOOP-11464 reinstates cygwin support. In the process, it sets HADOOP_HOME explicitly in hadoop-config.sh without checking if it has already been set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12313) Possible NPE in JvmPauseMonitor.stop()
[ https://issues.apache.org/jira/browse/HADOOP-12313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694751#comment-14694751 ] Karthik Kambatla commented on HADOOP-12313: --- I believe I created a JIRA a long time ago to allow for pause/resume or stop/start semantics for services. Any takers? Possible NPE in JvmPauseMonitor.stop() -- Key: HADOOP-12313 URL: https://issues.apache.org/jira/browse/HADOOP-12313 Project: Hadoop Common Issue Type: Bug Reporter: Rohith Sharma K S Assignee: Gabor Liptak Priority: Critical Attachments: HADOOP-12313.2.patch, HADOOP-12313.3.patch, YARN-4035.1.patch It is observed that after YARN-4019 some tests are failing in TestRMAdminService with null pointer exceptions in build [build failure |https://builds.apache.org/job/PreCommit-YARN-Build/8792/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt] {noformat} Running org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService Tests run: 19, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 11.541 sec FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService testModifyLabelsOnNodesWithDistributedConfigurationDisabled(org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService) Time elapsed: 0.132 sec ERROR! 
java.lang.NullPointerException: null at org.apache.hadoop.util.JvmPauseMonitor.stop(JvmPauseMonitor.java:86) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStop(ResourceManager.java:601) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.stopActiveServices(ResourceManager.java:983) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToStandby(ResourceManager.java:1038) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1085) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.service.AbstractService.close(AbstractService.java:250) at org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService.testModifyLabelsOnNodesWithDistributedConfigurationDisabled(TestRMAdminService.java:824) testRemoveClusterNodeLabelsWithDistributedConfigurationEnabled(org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService) Time elapsed: 0.121 sec ERROR! 
java.lang.NullPointerException: null at org.apache.hadoop.util.JvmPauseMonitor.stop(JvmPauseMonitor.java:86) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStop(ResourceManager.java:601) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.stopActiveServices(ResourceManager.java:983) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToStandby(ResourceManager.java:1038) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1085) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.service.AbstractService.close(AbstractService.java:250) at org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService.testRemoveClusterNodeLabelsWithDistributedConfigurationEnabled(TestRMAdminService.java:867) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-12313) Possible NPE in JvmPauseMonitor.stop()
[ https://issues.apache.org/jira/browse/HADOOP-12313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-12313: -- Priority: Critical (was: Major) Possible NPE in JvmPauseMonitor.stop() -- Key: HADOOP-12313 URL: https://issues.apache.org/jira/browse/HADOOP-12313 Project: Hadoop Common Issue Type: Bug Reporter: Rohith Sharma K S Assignee: Gabor Liptak Priority: Critical Attachments: HADOOP-12313.2.patch, HADOOP-12313.3.patch, YARN-4035.1.patch It is observed that after YARN-4019 some tests are failing in TestRMAdminService with null pointer exceptions in build [build failure |https://builds.apache.org/job/PreCommit-YARN-Build/8792/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt] {noformat} unning org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService Tests run: 19, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 11.541 sec FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService testModifyLabelsOnNodesWithDistributedConfigurationDisabled(org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService) Time elapsed: 0.132 sec ERROR! 
java.lang.NullPointerException: null at org.apache.hadoop.util.JvmPauseMonitor.stop(JvmPauseMonitor.java:86) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStop(ResourceManager.java:601) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.stopActiveServices(ResourceManager.java:983) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToStandby(ResourceManager.java:1038) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1085) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.service.AbstractService.close(AbstractService.java:250) at org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService.testModifyLabelsOnNodesWithDistributedConfigurationDisabled(TestRMAdminService.java:824) testRemoveClusterNodeLabelsWithDistributedConfigurationEnabled(org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService) Time elapsed: 0.121 sec ERROR! 
java.lang.NullPointerException: null at org.apache.hadoop.util.JvmPauseMonitor.stop(JvmPauseMonitor.java:86) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStop(ResourceManager.java:601) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.stopActiveServices(ResourceManager.java:983) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToStandby(ResourceManager.java:1038) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1085) at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) at org.apache.hadoop.service.AbstractService.close(AbstractService.java:250) at org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService.testRemoveClusterNodeLabelsWithDistributedConfigurationEnabled(TestRMAdminService.java:867) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-12180) Move ResourceCalculatorPlugin from YARN to Common
[ https://issues.apache.org/jira/browse/HADOOP-12180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-12180: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) Thanks Chris for working on this. Just committed to trunk and branch-2. Move ResourceCalculatorPlugin from YARN to Common - Key: HADOOP-12180 URL: https://issues.apache.org/jira/browse/HADOOP-12180 Project: Hadoop Common Issue Type: Improvement Components: util Reporter: Chris Douglas Assignee: Chris Douglas Fix For: 2.8.0 Attachments: HADOOP-12180.000.patch, HADOOP-12180.001.patch, HADOOP-12180.002.patch, HADOOP-12180.003.patch, HADOOP-12180.004.patch Some of the monitoring functions could be moved from YARN to Common for easier sharing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-12180) Move ResourceCalculatorPlugin from YARN to Common
[ https://issues.apache.org/jira/browse/HADOOP-12180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620833#comment-14620833 ] Karthik Kambatla commented on HADOOP-12180: --- +1. Checking this in. Move ResourceCalculatorPlugin from YARN to Common - Key: HADOOP-12180 URL: https://issues.apache.org/jira/browse/HADOOP-12180 Project: Hadoop Common Issue Type: Improvement Components: util Reporter: Chris Douglas Assignee: Chris Douglas Attachments: HADOOP-12180.000.patch, HADOOP-12180.001.patch, HADOOP-12180.002.patch, HADOOP-12180.003.patch, HADOOP-12180.004.patch Some of the monitoring functions could be moved from YARN to Common for easier sharing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HADOOP-11878) FileContext.java # fixRelativePart should check for not null for a more informative exception
[ https://issues.apache.org/jira/browse/HADOOP-11878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14619109#comment-14619109 ] Karthik Kambatla commented on HADOOP-11878: --- By the way, I checked: none of the direct callers check for NPE, and their javadocs don't necessarily mention throwing an NPE if the path is null either. There is a possibility of some indirect caller catching NPE in case the path is null, but I would think that is pretty unlikely. FileContext.java # fixRelativePart should check for not null for a more informative exception - Key: HADOOP-11878 URL: https://issues.apache.org/jira/browse/HADOOP-11878 Project: Hadoop Common Issue Type: Bug Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula Labels: BB2015-05-TBR Attachments: HADOOP-11878-002.patch, HADOOP-11878-003.patch, HADOOP-11878-004.patch, HADOOP-11878.patch Following will come when job failed and deletion service trying to delete the log files 2015-04-27 14:56:17,113 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : null 2015-04-27 14:56:17,113 ERROR org.apache.hadoop.yarn.server.nodemanager.DeletionService: Exception during execution of task in DeletionService java.lang.NullPointerException at org.apache.hadoop.fs.FileContext.fixRelativePart(FileContext.java:274) at org.apache.hadoop.fs.FileContext.delete(FileContext.java:761) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.deleteAsUser(DefaultContainerExecutor.java:457) at org.apache.hadoop.yarn.server.nodemanager.DeletionService$FileDeletionTask.run(DeletionService.java:293) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-11878) FileContext.java # fixRelativePart should check for not null for a more informative exception
[ https://issues.apache.org/jira/browse/HADOOP-11878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated HADOOP-11878: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) Thanks for looking into this, [~brahma]. Just committed this to trunk and branch-2. FileContext.java # fixRelativePart should check for not null for a more informative exception - Key: HADOOP-11878 URL: https://issues.apache.org/jira/browse/HADOOP-11878 Project: Hadoop Common Issue Type: Bug Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula Labels: BB2015-05-TBR Fix For: 2.8.0 Attachments: HADOOP-11878-002.patch, HADOOP-11878-003.patch, HADOOP-11878-004.patch, HADOOP-11878.patch Following will come when job failed and deletion service trying to delete the log fiels 2015-04-27 14:56:17,113 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : null 2015-04-27 14:56:17,113 ERROR org.apache.hadoop.yarn.server.nodemanager.DeletionService: Exception during execution of task in DeletionService java.lang.NullPointerException at org.apache.hadoop.fs.FileContext.fixRelativePart(FileContext.java:274) at org.apache.hadoop.fs.FileContext.delete(FileContext.java:761) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.deleteAsUser(DefaultContainerExecutor.java:457) at org.apache.hadoop.yarn.server.nodemanager.DeletionService$FileDeletionTask.run(DeletionService.java:293) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HADOOP-11878) FileContext.java # fixRelativePart should check for not null for a more informative exception
[ https://issues.apache.org/jira/browse/HADOOP-11878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated HADOOP-11878:
--------------------------------------
    Summary: FileContext.java # fixRelativePart should check for not null for a more informative exception  (was: NPE in FileContext.java # fixRelativePart(Path p))

> FileContext.java # fixRelativePart should check for not null for a more informative exception
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-11878
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11878
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Brahma Reddy Battula
>            Assignee: Brahma Reddy Battula
>              Labels: BB2015-05-TBR
>         Attachments: HADOOP-11878-002.patch, HADOOP-11878-003.patch, HADOOP-11878-004.patch, HADOOP-11878.patch
>
> The following occurs when a job fails and the deletion service tries to delete the log files:
> {code}
> 2015-04-27 14:56:17,113 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : null
> 2015-04-27 14:56:17,113 ERROR org.apache.hadoop.yarn.server.nodemanager.DeletionService: Exception during execution of task in DeletionService
> java.lang.NullPointerException
>         at org.apache.hadoop.fs.FileContext.fixRelativePart(FileContext.java:274)
>         at org.apache.hadoop.fs.FileContext.delete(FileContext.java:761)
>         at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.deleteAsUser(DefaultContainerExecutor.java:457)
>         at org.apache.hadoop.yarn.server.nodemanager.DeletionService$FileDeletionTask.run(DeletionService.java:293)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HADOOP-11878) FileContext.java # fixRelativePart should check for not null for a more informative exception
[ https://issues.apache.org/jira/browse/HADOOP-11878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14619086#comment-14619086 ]

Karthik Kambatla commented on HADOOP-11878:
-------------------------------------------
+1. Checking this in.
[jira] [Updated] (HADOOP-12180) Move ResourceCalculatorPlugin from YARN to Common
[ https://issues.apache.org/jira/browse/HADOOP-12180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated HADOOP-12180:
--------------------------------------
    Target Version/s: 2.8.0  (was: 3.0.0)

> Move ResourceCalculatorPlugin from YARN to Common
> -------------------------------------------------
>
>                 Key: HADOOP-12180
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12180
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: util
>            Reporter: Chris Douglas
>            Assignee: Chris Douglas
>         Attachments: HADOOP-12180.000.patch, HADOOP-12180.001.patch, HADOOP-12180.002.patch, HADOOP-12180.003.patch
>
> Some of the monitoring functions could be moved from YARN to Common for easier sharing.
[jira] [Commented] (HADOOP-12180) Move ResourceCalculatorPlugin from YARN to Common
[ https://issues.apache.org/jira/browse/HADOOP-12180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14619301#comment-14619301 ]

Karthik Kambatla commented on HADOOP-12180:
-------------------------------------------
Patch looks mostly good, just one question. SysInfo#newInstance - if OS type is not Linux or Windows, is that a security issue? Would RuntimeException be more appropriate than SecurityException?
[jira] [Commented] (HADOOP-12180) Move ResourceCalculatorPlugin from YARN to Common
[ https://issues.apache.org/jira/browse/HADOOP-12180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14619389#comment-14619389 ]

Karthik Kambatla commented on HADOOP-12180:
-------------------------------------------
What do you think of {{UnsupportedOperationException}} or a custom {{UnknownOSException}}?
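[Editor's note] The exception choice debated above can be sketched as follows. This is a hypothetical stand-in, not the actual SysInfo code from the HADOOP-12180 patch; the class and method names are illustrative only. The point is that an unknown OS is a capability gap, so UnsupportedOperationException fits better than SecurityException.

```java
// Hypothetical sketch of an OS-dispatching factory. Not the real
// SysInfo implementation; names are assumptions for illustration.
public class SysInfoFactory {
    interface SysInfo {
        long getPhysicalMemorySize();
    }

    static SysInfo newInstance(String osName) {
        String os = osName.toLowerCase();
        if (os.startsWith("linux")) {
            return () -> 0L;   // placeholder Linux implementation
        }
        if (os.startsWith("windows")) {
            return () -> 0L;   // placeholder Windows implementation
        }
        // An unrecognized platform is not a security violation, so an
        // UnsupportedOperationException describes the failure better.
        throw new UnsupportedOperationException(
            "Could not determine OS: " + osName);
    }

    public static void main(String[] args) {
        System.out.println(newInstance("Linux") != null);
        try {
            newInstance("Solaris");
        } catch (UnsupportedOperationException e) {
            System.out.println("unsupported: " + e.getMessage());
        }
    }
}
```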
[jira] [Commented] (HADOOP-12180) Move ResourceCalculatorPlugin from YARN to Common
[ https://issues.apache.org/jira/browse/HADOOP-12180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616995#comment-14616995 ]

Karthik Kambatla commented on HADOOP-12180:
-------------------------------------------
For YARN-3332 and other related work, it would be simpler if trunk and branch-2 are the same.
[jira] [Commented] (HADOOP-12186) ActiveStandbyElector shouldn't call monitorLockNodeAsync multiple times
[ https://issues.apache.org/jira/browse/HADOOP-12186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615230#comment-14615230 ]

Karthik Kambatla commented on HADOOP-12186:
-------------------------------------------
Should we include this in 2.7.2 as well?

> ActiveStandbyElector shouldn't call monitorLockNodeAsync multiple times
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-12186
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12186
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 2.7.1
>            Reporter: zhihai xu
>            Assignee: zhihai xu
>             Fix For: 2.8.0
>         Attachments: HADOOP-12186.000.patch
>
> ActiveStandbyElector shouldn't call {{monitorLockNodeAsync}} before the StatCallback for the previous {{zkClient.exists}} is received. We saw an RM shutdown because ActiveStandbyElector's retries of monitorLockNodeAsync exceeded the limit. Based on the logs below, it looks like multiple {{monitorLockNodeAsync}} calls are made at the same time due to back-to-back SyncConnected events. The current code doesn't prevent {{zkClient.exists}} from being called before the AsyncCallback.StatCallback for the previous {{zkClient.exists}} is received, so the retry for {{monitorLockNodeAsync}} sometimes doesn't work correctly.
> {code}
> 2015-07-01 19:24:12,806 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 6674ms for sessionid 0x14e47693cc20007, closing socket connection and attempting reconnect
> 2015-07-01 19:24:12,919 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session disconnected. Entering neutral mode...
> 2015-07-01 19:24:14,704 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server node-1.internal/192.168.123.3:2181. Will not attempt to authenticate using SASL (unknown error)
> 2015-07-01 19:24:14,704 INFO org.apache.zookeeper.ClientCnxn: Socket connection established, initiating session, client: /192.168.123.3:43487, server: node-1.internal/192.168.123.3:2181
> 2015-07-01 19:24:14,707 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server node-1.internal/192.168.123.3:2181, sessionid = 0x14e47693cc20007, negotiated timeout = 1
> 2015-07-01 19:24:14,712 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session connected.
> 2015-07-01 19:24:21,374 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 6667ms for sessionid 0x14e47693cc20007, closing socket connection and attempting reconnect
> 2015-07-01 19:24:21,477 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session disconnected. Entering neutral mode...
> 2015-07-01 19:24:22,640 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server node-1.internal/192.168.123.3:2181. Will not attempt to authenticate using SASL (unknown error)
> 2015-07-01 19:24:22,640 INFO org.apache.zookeeper.ClientCnxn: Socket connection established, initiating session, client: /192.168.123.3:43526, server: node-1.internal/192.168.123.3:2181
> 2015-07-01 19:24:22,641 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server node-1.internal/192.168.123.3:2181, sessionid = 0x14e47693cc20007, negotiated timeout = 1
> 2015-07-01 19:24:22,642 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session connected.
> 2015-07-01 19:24:29,310 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 6669ms for sessionid 0x14e47693cc20007, closing socket connection and attempting reconnect
> 2015-07-01 19:24:29,413 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session disconnected. Entering neutral mode...
> 2015-07-01 19:24:30,738 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server node-1.internal/192.168.123.3:2181. Will not attempt to authenticate using SASL (unknown error)
> 2015-07-01 19:24:30,739 INFO org.apache.zookeeper.ClientCnxn: Socket connection established, initiating session, client: /192.168.123.3:43574, server: node-1.internal/192.168.123.3:2181
> 2015-07-01 19:24:30,739 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server node-1.internal/192.168.123.3:2181, sessionid = 0x14e47693cc20007, negotiated timeout = 1
> 2015-07-01 19:24:30,740 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session connected.
> 2015-07-01 19:24:37,409 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 6670ms for sessionid 0x14e47693cc20007, closing socket connection and attempting reconnect
> 2015-07-01 19:24:37,512 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session disconnected. Entering neutral mode...
> 2015-07-01 19:24:38,979 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server node-1.internal/192.168.123.3:2181.
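[Editor's note] The fix direction the issue describes (do not issue a new {{zkClient.exists}} until the StatCallback for the previous one has returned) can be sketched with a simple in-flight guard. This is an illustrative sketch with hypothetical names, not the HADOOP-12186 patch itself; the actual ZooKeeper call is replaced by a counter.

```java
// Illustrative guard against overlapping async monitor calls.
// Hypothetical names; the real code would call zkClient.exists(...)
// where the counter increment is.
import java.util.concurrent.atomic.AtomicBoolean;

public class MonitorGuard {
    private final AtomicBoolean monitorInFlight = new AtomicBoolean(false);
    int issued = 0;  // counts how many async calls actually went out

    // Called on every SyncConnected event; back-to-back events must not
    // stack up multiple outstanding exists() calls.
    void monitorLockNodeAsync() {
        if (!monitorInFlight.compareAndSet(false, true)) {
            return;  // previous call still awaiting its StatCallback
        }
        issued++;
        // zkClient.exists(path, watcher, statCallback, null) would go here
    }

    // Invoked from the StatCallback once ZooKeeper responds.
    void onStatCallback() {
        monitorInFlight.set(false);
    }

    public static void main(String[] args) {
        MonitorGuard g = new MonitorGuard();
        g.monitorLockNodeAsync();
        g.monitorLockNodeAsync();  // suppressed: callback not yet received
        g.onStatCallback();
        g.monitorLockNodeAsync();  // allowed again
        System.out.println(g.issued);  // prints 2
    }
}
```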
[jira] [Commented] (HADOOP-12180) Move ResourceCalculatorPlugin from YARN to Common
[ https://issues.apache.org/jira/browse/HADOOP-12180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616226#comment-14616226 ]

Karthik Kambatla commented on HADOOP-12180:
-------------------------------------------
Any reason we are targeting only trunk and not branch-2?
[jira] [Commented] (HADOOP-10584) ActiveStandbyElector goes down if ZK quorum become unavailable
[ https://issues.apache.org/jira/browse/HADOOP-10584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587180#comment-14587180 ]

Karthik Kambatla commented on HADOOP-10584:
-------------------------------------------
We run into this occasionally on our test clusters. From my previous investigation, the patch I posted should help. However, I couldn't test it because I couldn't find a way to reproduce the problem. It should be okay to punt this to 2.7.2.

> ActiveStandbyElector goes down if ZK quorum become unavailable
> --------------------------------------------------------------
>
>                 Key: HADOOP-10584
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10584
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 2.4.0
>            Reporter: Karthik Kambatla
>            Assignee: Karthik Kambatla
>            Priority: Critical
>         Attachments: hadoop-10584-prelim.patch, rm.log
>
> ActiveStandbyElector retries operations a few times. If the ZK quorum itself is down, the elector goes down and the daemons have to be brought up again. Instead, it should log the fact that it is unable to talk to ZK, call becomeStandby on its client, and continue to attempt connecting to ZK.
[jira] [Commented] (HADOOP-11878) NPE in FileContext.java # fixRelativePart(Path p)
[ https://issues.apache.org/jira/browse/HADOOP-11878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14582208#comment-14582208 ]

Karthik Kambatla commented on HADOOP-11878:
-------------------------------------------
As others have mentioned here, we should figure out why the directory is null and fix the root cause in Yarn. Can we use YARN-3793 to do that? Please feel free to take it up. That said, it would be nice to make the exception easier to decipher and document it at every place FileContext#fixRelativePart is used. And, I believe checkNotNull is more appropriate than checkArgument.
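[Editor's note] The kind of change discussed in this thread can be sketched as follows. This is a hedged illustration, not the committed HADOOP-11878 patch: the method body and message are assumptions, and the JDK's Objects.requireNonNull stands in for Guava's Preconditions.checkNotNull, which plays the same role.

```java
// Sketch: fail fast with an informative message instead of letting a
// null path produce a bare NullPointerException deep inside FileContext.
import java.util.Objects;

public class PathCheckDemo {
    // Hypothetical simplified stand-in for FileContext#fixRelativePart.
    static String fixRelativePart(String path) {
        // checkNotNull-style validation: the message tells the caller
        // what was null, which a plain NPE at line 274 does not.
        Objects.requireNonNull(path,
            "path cannot be null; did the caller pass an unset directory?");
        return path.startsWith("/") ? path : "/working/" + path;
    }

    public static void main(String[] args) {
        System.out.println(fixRelativePart("logs"));  // prints /working/logs
        try {
            fixRelativePart(null);
        } catch (NullPointerException e) {
            System.out.println(e.getMessage());
        }
    }
}
```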
[jira] [Updated] (HADOOP-10584) ActiveStandbyElector goes down if ZK quorum become unavailable
[ https://issues.apache.org/jira/browse/HADOOP-10584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated HADOOP-10584:
--------------------------------------
    Attachment: rm.log

Ran into this on one of our test clusters. Looks like this needs fixing.
[jira] [Updated] (HADOOP-10584) ActiveStandbyElector goes down if ZK quorum become unavailable
[ https://issues.apache.org/jira/browse/HADOOP-10584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated HADOOP-10584:
--------------------------------------
    Target Version/s: 2.7.1  (was: 2.6.0)
[jira] [Created] (HADOOP-11779) Fix pre-commit builds to execute the right set of tests
Karthik Kambatla created HADOOP-11779:
-----------------------------------------

             Summary: Fix pre-commit builds to execute the right set of tests
                 Key: HADOOP-11779
                 URL: https://issues.apache.org/jira/browse/HADOOP-11779
             Project: Hadoop Common
          Issue Type: Bug
          Components: scripts
    Affects Versions: 2.7.0
            Reporter: Karthik Kambatla
            Priority: Critical

I have noticed that our pre-commit builds could end up running the wrong set of unit tests for patches. For instance, YARN-3412 changes only YARN files, but the tests were run against one of the MR modules. I suspect there is a race condition when there are multiple builds executing on the same node, or remnants from a previous run are getting picked up.
[jira] [Commented] (HADOOP-11779) Fix pre-commit builds to execute the right set of tests
[ https://issues.apache.org/jira/browse/HADOOP-11779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388841#comment-14388841 ]

Karthik Kambatla commented on HADOOP-11779:
-------------------------------------------
Ravi - thanks for pointing that out. Nice to see it being reworked. Hope that takes care of this too, but I guess we'll only know once we start using the new version.
[jira] [Updated] (HADOOP-11447) Add a more meaningful toString method to SampleStat and MutableStat
[ https://issues.apache.org/jira/browse/HADOOP-11447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated HADOOP-11447:
--------------------------------------
      Resolution: Fixed
   Fix Version/s: 2.8.0
          Status: Resolved  (was: Patch Available)

Thanks for your reviews, Robert and Steve. Just committed this to trunk and branch-2.

> Add a more meaningful toString method to SampleStat and MutableStat
> -------------------------------------------------------------------
>
>                 Key: HADOOP-11447
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11447
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: metrics
>    Affects Versions: 2.6.0
>            Reporter: Karthik Kambatla
>            Assignee: Karthik Kambatla
>            Priority: Minor
>             Fix For: 2.8.0
>         Attachments: hadoop-11447-1.patch, hadoop-11447-2.patch
>
> SampleStat and MutableStat don't override the toString method. A more meaningful implementation could help with debugging.
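[Editor's note] A minimal sketch of the idea behind this issue, assuming nothing about the committed patch: overriding toString on a running-stat holder so log lines show the aggregates instead of the default Object identity string. The class name and fields here are hypothetical, not the real SampleStat.

```java
// Hypothetical running-stat holder illustrating a debugging-friendly
// toString override; not the actual SampleStat/MutableStat code.
public class SampleStatDemo {
    private long n;
    private double sum;
    private double min = Double.MAX_VALUE;
    private double max = -Double.MAX_VALUE;

    void add(double v) {
        n++;
        sum += v;
        min = Math.min(min, v);
        max = Math.max(max, v);
    }

    @Override
    public String toString() {
        double mean = (n == 0) ? 0.0 : sum / n;
        return String.format("SampleStat{n=%d, mean=%.2f, min=%.2f, max=%.2f}",
            n, mean, min, max);
    }

    public static void main(String[] args) {
        SampleStatDemo s = new SampleStatDemo();
        s.add(1.0);
        s.add(3.0);
        // prints SampleStat{n=2, mean=2.00, min=1.00, max=3.00}
        System.out.println(s);
    }
}
```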
[jira] [Commented] (HADOOP-11492) Bump up curator version to 2.7.1
[ https://issues.apache.org/jira/browse/HADOOP-11492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355506#comment-14355506 ]

Karthik Kambatla commented on HADOOP-11492:
-------------------------------------------
Thanks for drafting it, [~busbey]. Just marked it an incompatible change and copied your notes to Release Notes. I am not sure how these release notes actually show up in the release notes of the release, but will follow up at 2.7 release time.

PS: I can't wait for us to isolate classpaths.

> Bump up curator version to 2.7.1
> --------------------------------
>
>                 Key: HADOOP-11492
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11492
>             Project: Hadoop Common
>          Issue Type: Task
>    Affects Versions: 2.6.0
>            Reporter: Karthik Kambatla
>            Assignee: Arun Suresh
>             Fix For: 2.7.0
>         Attachments: hadoop-11492-1.patch, hadoop-11492-2.patch, hadoop-11492-3.patch, hadoop-11492-3.patch
>
> Curator 2.7.1 got released recently and contains CURATOR-111 that YARN-2716 requires.
> PS: Filing a common JIRA so folks from other sub-projects also notice this change and shout out if there are any reservations.
[jira] [Updated] (HADOOP-11492) Bump up curator version to 2.7.1
[ https://issues.apache.org/jira/browse/HADOOP-11492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated HADOOP-11492:
--------------------------------------
    Release Note:
Apache Curator version change: Apache Hadoop has updated the version of Apache Curator used from 2.6.0 to 2.7.1. This change should be binary and source compatible for the majority of downstream users. Notable exceptions:
# Binary incompatible change: org.apache.curator.utils.PathUtils.validatePath(String) changed return types. Downstream users of this method will need to recompile.
# Source incompatible change: org.apache.curator.framework.recipes.shared.SharedCountReader added a method to its interface definition. Downstream users with custom implementations of this interface can continue without binary compatibility problems but will need to modify their source code to recompile.
# Source incompatible change: org.apache.curator.framework.recipes.shared.SharedValueReader added a method to its interface definition. Downstream users with custom implementations of this interface can continue without binary compatibility problems but will need to modify their source code to recompile.

Downstream users are reminded that while the Hadoop community will attempt to avoid egregious incompatible dependency changes, there is currently no policy around when Hadoop's exposed dependencies will change across versions (ref http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html#Java_Classpath).

    Hadoop Flags: Incompatible change,Reviewed  (was: Reviewed)
[jira] [Commented] (HADOOP-11447) Add a more meaningful toString method to SampleStat and MutableStat
[ https://issues.apache.org/jira/browse/HADOOP-11447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350488#comment-14350488 ]

Karthik Kambatla commented on HADOOP-11447:
-------------------------------------------
[~ste...@apache.org] - ping. [~rkanter] - are you comfortable with me committing this without a +1 from Steve?
[jira] [Commented] (HADOOP-11447) Add a more meaningful toString method to SampleStat and MutableStat
[ https://issues.apache.org/jira/browse/HADOOP-11447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344171#comment-14344171 ]

Karthik Kambatla commented on HADOOP-11447:
-------------------------------------------
[~ste...@apache.org] - are you okay with the latest patch here?
[jira] [Reopened] (HADOOP-10584) ActiveStandbyElector goes down if ZK quorum become unavailable
[ https://issues.apache.org/jira/browse/HADOOP-10584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla reopened HADOOP-10584:
---------------------------------------
Reopening to investigate it more. If anyone wants to pick this up, they are more than welcome.
[jira] [Commented] (HADOOP-11612) Workaround for Curator's ChildReaper requiring Guava 15+
[ https://issues.apache.org/jira/browse/HADOOP-11612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328325#comment-14328325 ]

Karthik Kambatla commented on HADOOP-11612:
-------------------------------------------
bq. What's the best way to reflect that?
Add javadoc comments, mark the class Private-Unstable and doomed, file a follow-up blocker JIRA targeting 3.0.0.

> Workaround for Curator's ChildReaper requiring Guava 15+
> --------------------------------------------------------
>
>                 Key: HADOOP-11612
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11612
>             Project: Hadoop Common
>          Issue Type: Task
>    Affects Versions: 2.8.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: HADOOP-11612.001.patch, HADOOP-11612.002.patch
>
> HADOOP-11492 upped the Curator version to 2.7.1, which makes the {{ChildReaper}} class use a method that only exists in newer versions of Guava (we have 11.0.2, and it needs 15+). As a workaround, we can copy the {{ChildReaper}} class into hadoop-common and make a minor modification to allow it to work with Guava 11. The {{ChildReaper}} is used by Curator to clean up old lock znodes. Curator locks are needed by YARN-2942.
[jira] [Commented] (HADOOP-11467) KerberosAuthenticator can connect to a non-secure cluster
[ https://issues.apache.org/jira/browse/HADOOP-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313264#comment-14313264 ]

Karthik Kambatla commented on HADOOP-11467:
-------------------------------------------
Nits:
# A new variable httpok is not required, we could just use the condition directly.
# The if-condition and return seems very complicated and hard to read. Is there a way to simplify this more?

> KerberosAuthenticator can connect to a non-secure cluster
> ---------------------------------------------------------
>
>                 Key: HADOOP-11467
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11467
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 2.6.0
>            Reporter: Robert Kanter
>            Assignee: Yongjun Zhang
>            Priority: Critical
>         Attachments: HADOOP-11467.001.patch, HADOOP-11467.002.patch
>
> While looking at HADOOP-10895, we discovered that the {{KerberosAuthenticator}} can authenticate with a non-secure cluster, even without falling back. The problematic code is here:
> {code:java}
> if (conn.getResponseCode() == HttpURLConnection.HTTP_OK) { // - A
>   LOG.debug("JDK performed authentication on our behalf.");
>   // If the JDK already did the SPNEGO back-and-forth for
>   // us, just pull out the token.
>   AuthenticatedURL.extractToken(conn, token);
>   return;
> } else if (isNegotiate()) { // - B
>   LOG.debug("Performing our own SPNEGO sequence.");
>   doSpnegoSequence(token);
> } else { // - C
>   LOG.debug("Using fallback authenticator sequence.");
>   Authenticator auth = getFallBackAuthenticator();
>   // Make sure that the fall back authenticator have the same
>   // ConnectionConfigurator, since the method might be overridden.
>   // Otherwise the fall back authenticator might not have the information
>   // to make the connection (e.g., SSL certificates)
>   auth.setConnectionConfigurator(connConfigurator);
>   auth.authenticate(url, token);
> }
> {code}
> Sometimes the JVM does the SPNEGO for us, and path A is used. However, if the {{KerberosAuthenticator}} tries to talk to a non-secure cluster, path A also succeeds in this case.
> More details can be found in this comment: https://issues.apache.org/jira/browse/HADOOP-10895?focusedCommentId=14247476&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14247476
> We've actually dealt with this before. HADOOP-8883 tried to fix a related problem by adding another condition to path A that would look for a header. However, the JVM hides this header, making path A never occur. We reverted this change in HADOOP-10078, and didn't realize that there was still a problem until now.
[jira] [Commented] (HADOOP-11492) Bump up curator version to 2.7.1
[ https://issues.apache.org/jira/browse/HADOOP-11492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305948#comment-14305948 ]

Karthik Kambatla commented on HADOOP-11492:
-------------------------------------------
Checking this in.
[jira] [Updated] (HADOOP-11492) Bump up curator version to 2.7.1
[ https://issues.apache.org/jira/browse/HADOOP-11492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated HADOOP-11492:
--------------------------------------
    Assignee: Arun Suresh  (was: Karthik Kambatla)