Re: Planning Hadoop 2.6.1 release
Thanks Vinod for creating the list!

@Akira, let’s sync up offline on how to take this forward in terms of the release process.

Okay.

Thanks,
Akira

On 7/16/15 10:24, Vinod Kumar Vavilapalli wrote:

Alright, I’d like to make progress while the issue is hot.

I created a label to discuss the candidate list of patches: https://issues.apache.org/jira/issues/?jql=labels%20%3D%202.6.1-candidate

Next steps, I’ll do the following:
- Review 2.7 and 2.8 blocker/critical tickets, see what makes sense for 2.6.1, and add those as candidates.
- I haven’t reviewed the current list yet; the seed list is from this email thread. Will review them.
- I also have a bunch of patches that I’d like to include; will update them right away.

Others, please look at the current list and let me know what else you’d like to include.

I’d like to keep this ‘candidate-collection’ cycle to a max of a week and then start the release process. @Akira, let’s sync up offline on how to take this forward in terms of the release process.

Thanks
+Vinod

On Jul 15, 2015, at 1:12 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote:

Got pinged on a recent thread on this one.

As I mentioned there, I had many offline discussions re 2.6.1. The biggest problem I found offline was about what bug-fixes are acceptable and what aren’t for everyone wishing to consume 2.6.1. Given the number of bug-fixes that went into 2.7.x and into branch-2.8, figuring out a set of patches that is acceptable for everyone is a huge challenge, which kind of stalled my attempts.

Thanks
+Vinod

On Jul 1, 2015, at 12:41 PM, Sean Busbey bus...@cloudera.com wrote:

Any update on a release plan for 2.6.1?

On Wed, Jun 10, 2015 at 1:25 AM, Brahma Reddy Battula brahmareddy.batt...@huawei.com wrote:

Hi Vinod, any update on this? Are we planning to release 2.6.1, or can we release 2.7.1 as stable?

Thanks and regards,
Brahma Reddy Battula

From: Zhihai Xu [z...@cloudera.com]
Sent: Wednesday, May 13, 2015 12:04 PM
To: mapreduce-...@hadoop.apache.org
Cc: common-dev@hadoop.apache.org; yarn-...@hadoop.apache.org; hdfs-...@hadoop.apache.org
Subject: Re: Planning Hadoop 2.6.1 release

Hi Akira,

Can we also include YARN-3242? YARN-3242 fixed a critical ZKRMStateStore bug. It will work better with YARN-2992.

thanks
zhihai

On Tue, May 12, 2015 at 10:38 PM, Akira AJISAKA ajisa...@oss.nttdata.co.jp wrote:

Thanks all for collecting jiras for the 2.6.1 release. In addition, I'd like to include the following:

* HADOOP-11343. Overflow is not properly handled in calculating final iv for AES CTR
* YARN-2874. Dead lock in DelegationTokenRenewer which blocks RM to execute any further apps
* YARN-2992. ZKRMStateStore crashes due to session expiry
* YARN-3013. AMRMClientImpl does not update AMRM token properly
* YARN-3369. Missing NullPointer check in AppSchedulingInfo causes RM to die
* MAPREDUCE-6303. Read timeout when retrying a fetch error can be fatal to a reducer

All of these are marked as blocker bugs for 2.7.0 but are not fixed in 2.6.0.

Regards,
Akira

On 5/4/15 11:15, Brahma Reddy Battula wrote:

Hello Vinod,

I am thinking, can we include HADOOP-11491 also? Without this jira, harfs will not be usable when the cluster is installed in HA mode and you try to get a FileContext like below:

    Path path = new Path("har:///archivedLogs/application_1428917727658_0005-application_1428917727658_0008-1428927448352.har");
    FileSystem fs = path.getFileSystem(new Configuration());
    path = fs.makeQualified(path);
    FileContext fc = FileContext.getFileContext(path.toUri(), new Configuration());

Thanks and regards,
Brahma Reddy Battula

From: Chris Nauroth [cnaur...@hortonworks.com]
Sent: Friday, May 01, 2015 4:32 AM
To: mapreduce-...@hadoop.apache.org; common-dev@hadoop.apache.org; yarn-...@hadoop.apache.org; hdfs-...@hadoop.apache.org
Subject: Re: Planning Hadoop 2.6.1 release

Thank you, Arpit. In addition, I suggest we include the following:

HADOOP-11333. Fix deadlock in DomainSocketWatcher when the notification pipe is full
HADOOP-11604. Prevent ConcurrentModificationException while closing domain sockets during shutdown of DomainSocketWatcher thread.
HADOOP-11648. Set DomainSocketWatcher thread name explicitly
HADOOP-11802. DomainSocketWatcher thread terminates sometimes after there is an I/O error during requestShortCircuitShm

HADOOP-11604 and 11648 are not critical by themselves, but they are pre-requisites to getting a clean cherry-pick of 11802, which we believe finally fixes the root cause of this issue.

--Chris Nauroth

On 4/30/15, 3:55 PM, Arpit Agarwal
Re: [Test-Patch TLP] consensus on naming
Hi Allen

Just to be clear: yetus is about the layer between Jenkins and the unit tests. Many Apache communities don’t bother looking at Jenkins and its reporting because a) it’s extremely noisy when you’re testing literally hundreds to thousands of patches a week (plus full builds!) and b) why log into something when one can have the results sent (passive vs. active)? Plus it’s much easier to run a pre-existing script across a ton of different source bases than work on integrating xyz tool into an existing source tree.

Good points. This layer between Jenkins and the unit tests seems useful.

That said, I did look at tap4j last year when I was looking to integrate bats into Hadoop’s test framework. As someone who isn’t particularly proficient in Java, it wasn’t clear to me how exactly I would turn bats TAP output into something JUnit could process using tap4j.

You can use something like this Perl script [1] to convert the output of bats TAP to JUnit. I tested it locally and the XML looked good. It would be possible to use tap4j for that too, but I think it would not be very efficient unless you already had a JVM running or needed to integrate it with Jenkins.

Cheers
Bruno

[1] http://taint.org/2008/03/26/124602a.html

From: Allen Wittenauer a...@altiscale.com
To: common-dev@hadoop.apache.org; Bruno P. Kinoshita brunodepau...@yahoo.com.br
Sent: Wednesday, July 15, 2015 4:43 AM
Subject: Re: [Test-Patch TLP] consensus on naming

On Jul 14, 2015, at 3:08 AM, Bruno P. Kinoshita brunodepau...@yahoo.com.br wrote:

Hi

Has anyone considered using TAP (Test Anything Protocol) for test reporting? [1][2]

disclaimer: I'm the maintainer of the Jenkins TAP plug-in and tap4j Java library

Just to be clear: yetus is about the layer between Jenkins and the unit tests. Many Apache communities don’t bother looking at Jenkins and its reporting because a) it’s extremely noisy when you’re testing literally hundreds to thousands of patches a week (plus full builds!) and b) why log into something when one can have the results sent (passive vs. active)? Plus it’s much easier to run a pre-existing script across a ton of different source bases than work on integrating xyz tool into an existing source tree.

That said, I did look at tap4j last year when I was looking to integrate bats into Hadoop’s test framework. As someone who isn’t particularly proficient in Java, it wasn’t clear to me how exactly I would turn bats TAP output into something JUnit could process using tap4j.
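
For readers wondering what "turning bats TAP output into something JUnit could process" involves, the following is a rough sketch only. It does not use tap4j and is not the Perl script linked above; the class name TapToJUnit, the method convert, and the exact XML attributes emitted are assumptions made for illustration. It handles only numbered "ok" / "not ok" lines and "# skip" directives and ignores everything else in the TAP stream.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical, minimal TAP-to-JUnit-XML converter; an illustration, not tap4j. */
public class TapToJUnit {

    // Matches numbered TAP result lines such as "ok 1 description" or "not ok 2 - description".
    private static final Pattern RESULT =
        Pattern.compile("^(not )?ok\\s+(\\d+)\\s*-?\\s*(.*)$");

    // Matches a TAP skip directive such as "# skip reason" (case-insensitive).
    private static final Pattern SKIP = Pattern.compile("(?i)#\\s*skip");

    /** Converts TAP text into one JUnit-style testsuite element. */
    public static String convert(String suiteName, String tap) {
        List<String> cases = new ArrayList<>();
        int failures = 0;
        int skipped = 0;
        for (String line : tap.split("\\R")) {
            Matcher m = RESULT.matcher(line.trim());
            if (!m.matches()) {
                continue; // plan lines ("1..N"), diagnostics and blank lines are ignored
            }
            String description = m.group(3);
            Matcher skip = SKIP.matcher(description);
            boolean isSkipped = skip.find();
            if (isSkipped) {
                description = description.substring(0, skip.start()).trim();
            }
            String name = escape(description.isEmpty() ? "test " + m.group(2) : description);
            if (isSkipped) {
                skipped++;
                cases.add("  <testcase name=\"" + name + "\"><skipped/></testcase>\n");
            } else if (m.group(1) != null) {
                failures++;
                cases.add("  <testcase name=\"" + name
                    + "\"><failure message=\"not ok\"/></testcase>\n");
            } else {
                cases.add("  <testcase name=\"" + name + "\"/>\n");
            }
        }
        StringBuilder xml = new StringBuilder();
        xml.append("<testsuite name=\"").append(escape(suiteName))
           .append("\" tests=\"").append(cases.size())
           .append("\" failures=\"").append(failures)
           .append("\" skipped=\"").append(skipped).append("\">\n");
        for (String testCase : cases) {
            xml.append(testCase);
        }
        xml.append("</testsuite>\n");
        return xml.toString();
    }

    /** Escapes the characters that are unsafe inside an XML attribute value. */
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        String tap = "1..3\nok 1 addition works\nnot ok 2 - subtraction works\n"
            + "ok 3 needs network # skip offline\n";
        System.out.print(convert("bats", tap));
    }
}
```

A real converter would also need to carry over the diagnostic lines that follow a failing test, which this sketch drops.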
[jira] [Created] (HADOOP-12241) Add specification of FileStatus
Masatake Iwasaki created HADOOP-12241:

Summary: Add specification of FileStatus
Key: HADOOP-12241
URL: https://issues.apache.org/jira/browse/HADOOP-12241
Project: Hadoop Common
Issue Type: Sub-task
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HADOOP-12242) Add TOC to filesystem.md
Masatake Iwasaki created HADOOP-12242:

Summary: Add TOC to filesystem.md
Key: HADOOP-12242
URL: https://issues.apache.org/jira/browse/HADOOP-12242
Project: Hadoop Common
Issue Type: Sub-task
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Minor
Re: [DISCUSS] Additional maintenance releases for Hadoop 2.y versions
+1 for Steve's suggestion. I agree completely that everything should be finalized before committing.

-Vinay

On Jul 16, 2015 2:29 PM, Steve Loughran ste...@hortonworks.com wrote:

1. I agree that the bar for patches going in should be very high: there's always the risk of some subtle regression. The more patches, the higher the risk, the more traumatic the update.

2. I like the idea of having a list of proposed candidate patches, all of which can be reviewed and discussed before going in.

On 16 Jul 2015, at 02:43, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote:

https://issues.apache.org/jira/issues/?jql=labels%20%3D%202.6.1-candidate

Link is https://issues.apache.org/jira/browse/YARN-3575?jql=labels%20%3D%202.6.1-candidate

3. Maybe we should have some guidelines of what isn't going to get in except in very, very special cases:
- any change to classpath/dependencies
- any change to the signature of an API, including exception types text
- changes to wire formats

4. We could also consider driving patches based on those that downstream redistributors of Hadoop felt were important enough to backport. That's Cloudera as well as us, Amazon if they filed JIRAs, Microsoft, + others. Ideally patches that have been tested and released, so there's a high chance regressions would have surfaced already.

5. Then there's the "these broke HBase" changes; Vinod already has HADOOP-11710 in there, as an example.

6. And of course, any security issue patch should go in.

Overall then: the expectation should be that patches won't go in by default, unless viewed as critical. We have to be ruthless, and people shouldn't commit things without getting approval from others.

-Steve
Re: [DISCUSS] Additional maintenance releases for Hadoop 2.y versions
1. I agree that the bar for patches going in should be very high: there's always the risk of some subtle regression. The more patches, the higher the risk, the more traumatic the update.

2. I like the idea of having a list of proposed candidate patches, all of which can be reviewed and discussed before going in.

On 16 Jul 2015, at 02:43, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote:

https://issues.apache.org/jira/issues/?jql=labels%20%3D%202.6.1-candidate

Link is https://issues.apache.org/jira/browse/YARN-3575?jql=labels%20%3D%202.6.1-candidate

3. Maybe we should have some guidelines of what isn't going to get in except in very, very special cases:
- any change to classpath/dependencies
- any change to the signature of an API, including exception types text
- changes to wire formats

4. We could also consider driving patches based on those that downstream redistributors of Hadoop felt were important enough to backport. That's Cloudera as well as us, Amazon if they filed JIRAs, Microsoft, + others. Ideally patches that have been tested and released, so there's a high chance regressions would have surfaced already.

5. Then there's the "these broke HBase" changes; Vinod already has HADOOP-11710 in there, as an example.

6. And of course, any security issue patch should go in.

Overall then: the expectation should be that patches won't go in by default, unless viewed as critical. We have to be ruthless, and people shouldn't commit things without getting approval from others.

-Steve
Re: 2.7.2 release plan
Thanks Vinod for starting the 2.7.2 release plan.

The focus obviously is to have blocker issues [2], bug-fixes and *no* features / improvements.

Can we adopt the plan Karthik mentioned in the "Additional maintenance releases for Hadoop 2.y versions" thread? That way we can include not only blocker but also critical bug fixes in the 2.7.2 release.

In addition, branch-2.7 is a special case (2.7.1 is the first stable release). Therefore I'm thinking we can include major bug fixes as well.

Regards,
Akira

On 7/16/15 04:13, Vinod Kumar Vavilapalli wrote:

Hi all,

Thanks everyone for the push on 2.7.1! Branch-2.7 is now open for commits to a 2.7.2 release. JIRA also now has a 2.7.2 version for all the sub-projects.

Continuing the previous 2.7.1 thread on steady maintenance releases [1], we should follow up 2.7.1 with a 2.7.2 within 4 weeks. Earlier I tried a 2-3 week cycle for 2.7.1, but it seems to be impractical given the community size. So, I propose we target a release by the end of 4 weeks from now, starting the release close-down within 2-3 weeks.

The focus obviously is to have blocker issues [2], bug-fixes and *no* features / improvements.

I need help from all committers in automatically merging in any patch that fits the above criterion into 2.7.2 instead of only on trunk or 2.8.

Thoughts?

Thanks,
+Vinod

[1] A 2.7.1 release to follow up 2.7.0 http://markmail.org/message/zwzze6cqqgwq4rmw
[2] 2.7.2 release blockers: https://issues.apache.org/jira/issues/?filter=12332867
Re: [DISCUSS] Additional maintenance releases for Hadoop 2.y versions
On Thu, Jul 16, 2015 at 4:59 AM, Steve Loughran ste...@hortonworks.com wrote:

1. I agree that the bar for patches going in should be very high: there's always the risk of some subtle regression. The more patches, the higher the risk, the more traumatic the update.

2. I like the idea of having a list of proposed candidate patches, all of which can be reviewed and discussed before going in.

+1 from me too.

On 16 Jul 2015, at 02:43, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote:

https://issues.apache.org/jira/issues/?jql=labels%20%3D%202.6.1-candidate

Link is https://issues.apache.org/jira/browse/YARN-3575?jql=labels%20%3D%202.6.1-candidate

3. Maybe we should have some guidelines of what isn't going to get in except in very, very special cases:
- any change to classpath/dependencies

Sounds good. We should work towards making this a requirement within a major release, at least for dependencies on the client side.

- any change to the signature of an API, including exception types text
- changes to wire formats

These two should hold for minor releases also, no?

4. We could also consider driving patches based on those that downstream redistributors of Hadoop felt were important enough to backport. That's Cloudera as well as us, Amazon if they filed JIRAs, Microsoft, + others. Ideally patches that have been tested and released, so there's a high chance regressions would have surfaced already.

5. Then there's the "these broke HBase" changes; Vinod already has HADOOP-11710 in there, as an example.

6. And of course, any security issue patch should go in.

Overall then: the expectation should be that patches won't go in by default, unless viewed as critical. We have to be ruthless, and people shouldn't commit things without getting approval from others.

-Steve

--
Karthik Kambatla
Software Engineer, Cloudera Inc.
http://five.sentenc.es
Re: [DISCUSS] Additional maintenance releases for Hadoop 2.y versions
On Thu, Jul 16, 2015 at 9:17 AM, Karthik Kambatla ka...@cloudera.com wrote:

On Thu, Jul 16, 2015 at 4:59 AM, Steve Loughran ste...@hortonworks.com wrote:

- any change to the signature of an API, including exception types text
- changes to wire formats

These two should hold for minor releases also, no?

At the risk of derailing this thread: no, definitely not. "Any change" would include backwards-compatible additions / changes. Using this stricter restriction is great for patch releases, since it means that a user can safely move onto a newer patch release with the assurance that, if some regression should show up, they can move back to an earlier patch release without the risk that changes made in their application since upgrading won't work due to reliance on an addition.

--
Sean
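
The rollback concern in this message can be made concrete with a small sketch. Everything in it is hypothetical: ClientV1 and ClientV2 stand in for the same library class as shipped in two successive patch releases, and the reflective lookup merely imitates link-time resolution (a real application compiled against the newer release would instead fail at runtime with NoSuchMethodError after a rollback).

```java
/** Hypothetical illustration of why additive API changes make patch rollbacks risky. */
public class PatchRollbackDemo {

    /** Stand-in for a library class as shipped in an earlier patch release. */
    public static class ClientV1 {
        public String read() {
            return "data";
        }
    }

    /** The same class in a later patch release, with a backwards-compatible addition. */
    public static class ClientV2 {
        public String read() {
            return "data";
        }

        public String readFully() {
            return "all data";
        }
    }

    /** Returns true if an application calling methodName would resolve it against the given library class. */
    public static boolean links(Class<?> library, String methodName) {
        try {
            library.getMethod(methodName);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // After upgrading, the application starts relying on the addition.
        System.out.println("newer patch release resolves readFully: "
            + links(ClientV2.class, "readFully"));
        // Rolling back because of a regression now breaks the application.
        System.out.println("older patch release resolves readFully: "
            + links(ClientV1.class, "readFully"));
    }
}
```

The addition is compatible in the forward direction only: code written against the earlier release keeps working on the later one, but not the reverse, which is the property the stricter patch-release rule is meant to protect.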
Re: [DISCUSS] Additional maintenance releases for Hadoop 2.y versions
On Thu, Jul 16, 2015 at 10:42 AM, Sean Busbey bus...@cloudera.com wrote:

On Thu, Jul 16, 2015 at 9:17 AM, Karthik Kambatla ka...@cloudera.com wrote:

On Thu, Jul 16, 2015 at 4:59 AM, Steve Loughran ste...@hortonworks.com wrote:

- any change to the signature of an API, including exception types text
- changes to wire formats

These two should hold for minor releases also, no?

At the risk of derailing this thread: no, definitely not. "Any change" would include backwards-compatible additions / changes. Using this stricter restriction is great for patch releases, since it means that a user can safely move onto a newer patch release with the assurance that, if some regression should show up, they can move back to an earlier patch release without the risk that changes made in their application since upgrading won't work due to reliance on an addition.

I am not sure I understand the need for a restriction on source and binary backwards-compatible API changes.

--
Sean

--
Karthik Kambatla
Software Engineer, Cloudera Inc.
http://five.sentenc.es
Re: [Test-Patch TLP] consensus on naming
On Jul 15, 2015, at 11:22 PM, Bruno P. Kinoshita brunodepau...@yahoo.com.br wrote:

Good points. This layer between Jenkins and the unit tests seems useful.

Thanks! We think so too! It’s why we’re working on pulling the code out of Hadoop to make it more generalized for lots of different projects.

You can use something like this Perl script [1] to convert the output of bats TAP to JUnit. I tested it locally and the XML looked good. It would be possible to use tap4j for that too, but I think it would not be very efficient unless you already had a JVM running or needed to integrate it with Jenkins.

JVM startup costs aren’t a problem because we’d want to do this inside maven. So it’d be *extremely* useful if tap4j’s maven plugin had a hook that said “convert this directory of TAP-formatted files to JUnit XML”. [We post-process the JUnit XML.]

I’ll take a look at the Perl code.

Thanks!
Re: 2.7.2 release plan
I'd be comfortable with inclusion of any doc-only patch in minor releases. There is a lot of value to end users in pushing documentation fixes as quickly as possible, and they don't bear the same risk of regressions or incompatibilities as code changes.

--Chris Nauroth

On 7/16/15, 12:38 AM, Tsuyoshi Ozawa oz...@apache.org wrote:

Hi,

thank you for starting the discussion about the 2.7.2 release.

The focus obviously is to have blocker issues [2], bug-fixes and *no* features / improvements.

I've committed YARN-3170, which is an improvement of documentation. I thought documentation pages that fit into branch-2.7 could be included easily. Should I revert it?

I need help from all committers in automatically merging in any patch that fits the above criterion into 2.7.2 instead of only on trunk or 2.8.

Sure, I'll try my best.

That way we can include not only blocker but also critical bug fixes to 2.7.2 release.

As Vinod mentioned, we should also apply major bug fixes into branch-2.7.

Thanks,
- Tsuyoshi

On Thu, Jul 16, 2015 at 3:52 PM, Akira AJISAKA ajisa...@oss.nttdata.co.jp wrote:

Thanks Vinod for starting the 2.7.2 release plan.

The focus obviously is to have blocker issues [2], bug-fixes and *no* features / improvements.

Can we adopt the plan Karthik mentioned in the "Additional maintenance releases for Hadoop 2.y versions" thread? That way we can include not only blocker but also critical bug fixes in the 2.7.2 release.

In addition, branch-2.7 is a special case (2.7.1 is the first stable release). Therefore I'm thinking we can include major bug fixes as well.

Regards,
Akira

On 7/16/15 04:13, Vinod Kumar Vavilapalli wrote:

Hi all,

Thanks everyone for the push on 2.7.1! Branch-2.7 is now open for commits to a 2.7.2 release. JIRA also now has a 2.7.2 version for all the sub-projects.

Continuing the previous 2.7.1 thread on steady maintenance releases [1], we should follow up 2.7.1 with a 2.7.2 within 4 weeks. Earlier I tried a 2-3 week cycle for 2.7.1, but it seems to be impractical given the community size. So, I propose we target a release by the end of 4 weeks from now, starting the release close-down within 2-3 weeks.

The focus obviously is to have blocker issues [2], bug-fixes and *no* features / improvements.

I need help from all committers in automatically merging in any patch that fits the above criterion into 2.7.2 instead of only on trunk or 2.8.

Thoughts?

Thanks,
+Vinod

[1] A 2.7.1 release to follow up 2.7.0 http://markmail.org/message/zwzze6cqqgwq4rmw
[2] 2.7.2 release blockers: https://issues.apache.org/jira/issues/?filter=12332867
[jira] [Created] (HADOOP-12243) Grep command used in test-patch and smart-apply-patch should be GNU's one, not POSIX
Kengo Seki created HADOOP-12243:

Summary: Grep command used in test-patch and smart-apply-patch should be GNU's one, not POSIX
Key: HADOOP-12243
URL: https://issues.apache.org/jira/browse/HADOOP-12243
Project: Hadoop Common
Issue Type: Sub-task
Components: yetus
Reporter: Kengo Seki

test-patch and smart-apply-patch use grep -o, but POSIX does not support that option.

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/grep.html#tag_20_55_04
[jira] [Created] (HADOOP-12244) recover broken rebase?
Allen Wittenauer created HADOOP-12244:

Summary: recover broken rebase?
Key: HADOOP-12244
URL: https://issues.apache.org/jira/browse/HADOOP-12244
Project: Hadoop Common
Issue Type: Sub-task
Affects Versions: HADOOP-12111
Reporter: Allen Wittenauer

One of the Jenkins hosts is failing during the git rebase that happens in bootstrap. We should probably do something better than just fail.
[jira] [Reopened] (HADOOP-9882) Trunk doesn't compile
[ https://issues.apache.org/jira/browse/HADOOP-9882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Casile reopened HADOOP-9882:

I now have an opposite-ish problem, I think. I followed all of the directions, but now my protoc is at 3.0.0 and hadoop is still looking for 2.5.0. I tried modifying hadoop-project/pom.xml ... but that just caused an issue finding the 3.0.0 on the java and jar. oi!!

Even if I can get it to allow me to use the 3.0.0 ... is there any way to be sure I won't cause issues down the road? I am building on my notebook and only do basic unit testing there ... another x86-64 Linux system is the target. Not sure if the protoc level is really that crucial ... but I guess if it is, I'm not going to help matters any.

Trunk doesn't compile

Key: HADOOP-9882
URL: https://issues.apache.org/jira/browse/HADOOP-9882
Project: Hadoop Common
Issue Type: Bug
Components: build
Reporter: Jean-Baptiste Onofré

Currently, trunk does not compile (in hadoop-common-project/hadoop-common module):

[ERROR] Failed to execute goal org.apache.hadoop:hadoop-maven-plugins:3.0.0-SNAPSHOT:protoc (compile-protoc) on project hadoop-common: org.apache.maven.plugin.MojoExecutionException: protoc version is 'libprotoc 2.4.1', expected version is '2.5.0' - [Help 1]

I'm going to fix that.
[jira] [Created] (HADOOP-12246) Apache Hadoop should be listed under big-data and hadoop categories
Ajoy Bhatia created HADOOP-12246:

Summary: Apache Hadoop should be listed under big-data and hadoop categories
Key: HADOOP-12246
URL: https://issues.apache.org/jira/browse/HADOOP-12246
Project: Hadoop Common
Issue Type: Task
Components: site
Reporter: Ajoy Bhatia
Assignee: Ajoy Bhatia
Priority: Trivial

In the Projects by category list on https://projects.apache.org/projects.html?category , Apache Hadoop is listed in the database category only. The Apache Hadoop project should also be listed under the big-data and hadoop categories.
[jira] [Created] (HADOOP-12245) References to misspelled REMAINING_QUATA in FileSystemShell.md
Gera Shegalov created HADOOP-12245:

Summary: References to misspelled REMAINING_QUATA in FileSystemShell.md
Key: HADOOP-12245
URL: https://issues.apache.org/jira/browse/HADOOP-12245
Project: Hadoop Common
Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Gera Shegalov

HADOOP-10378 fixed the help message but there is still a reference to the misspelled REMAINING_QUATA:

{code}
[tw-mbp-gshegalov hadoop-common (trunk)]$ ack QUATA .
hadoop-common-project/hadoop-common/src/site/markdown/FileSystemShell.md
177:The output columns with -count -q are: QUOTA, REMAINING\_QUATA, SPACE\_QUOTA, REMAINING\_SPACE\_QUOTA, DIR\_COUNT, FILE\_COUNT, CONTENT\_SIZE, PATHNAME
{code}