[jira] [Created] (HADOOP-12111) Split test-patch off into its own TLP
Allen Wittenauer created HADOOP-12111:

Summary: Split test-patch off into its own TLP
Key: HADOOP-12111
URL: https://issues.apache.org/jira/browse/HADOOP-12111
Project: Hadoop Common
Issue Type: Bug
Components: yetus
Reporter: Allen Wittenauer

Given test-patch's tendency to get forked into a variety of different projects, it makes a lot of sense to make an Apache TLP so that everyone can benefit from a common code base.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HADOOP-12110) Consolidate usage of JSON libraries
[ https://issues.apache.org/jira/browse/HADOOP-12110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Yang resolved HADOOP-12110.
Resolution: Invalid

Opened for the wrong project. Sorry, close as invalid.

> Consolidate usage of JSON libraries
>
> Key: HADOOP-12110
> URL: https://issues.apache.org/jira/browse/HADOOP-12110
> Project: Hadoop Common
> Issue Type: Bug
> Reporter: Eric Yang
> Assignee: Eric Yang
>
> Chukwa uses JSON jar from json.org and also json-simple from googlecode. It would be nice if we only use one implementation of JSON to be consistent. Mindev JSON-smart was also considered as replacement for JSON simple to improve performance, but it doesn't handle some characters correctly. Therefore, it's best to use JSON Simple.
[jira] [Created] (HADOOP-12110) Consolidate usage of JSON libraries
Eric Yang created HADOOP-12110:

Summary: Consolidate usage of JSON libraries
Key: HADOOP-12110
URL: https://issues.apache.org/jira/browse/HADOOP-12110
Project: Hadoop Common
Issue Type: Bug
Reporter: Eric Yang
Assignee: Eric Yang

Chukwa uses JSON jar from json.org and also json-simple from googlecode. It would be nice if we only use one implementation of JSON to be consistent. Mindev JSON-smart was also considered as replacement for JSON simple to improve performance, but it doesn't handle some characters correctly. Therefore, it's best to use JSON Simple.
[jira] [Created] (HADOOP-12109) Distcp of file > 5GB to swift fails with HTTP 413 error
Phil D'Amore created HADOOP-12109:

Summary: Distcp of file > 5GB to swift fails with HTTP 413 error
Key: HADOOP-12109
URL: https://issues.apache.org/jira/browse/HADOOP-12109
Project: Hadoop Common
Issue Type: Bug
Components: fs/swift
Affects Versions: 2.6.0
Reporter: Phil D'Amore

Trying to use distcp to copy a file more than 5GB to swift fs results in a stack like the following:

15/06/01 20:58:57 ERROR util.RetriableCommand: Failure in Retriable command: Copying hdfs://xxx:8020/path/to/random-5Gplus.dat to swift://xxx/5Gplus.dat
Invalid Response: Method COPY on http://xxx:8080/v1/AUTH_fb7a8901dd8d4c8dba27f5e5d55a46a9/test/.distcp.tmp.attempt_local1097967418_0001_m_00_0 failed, status code: 413, status line: HTTP/1.1 413 Request Entity Too Large
COPY http://xxx:8080/v1/AUTH_fb7a8901dd8d4c8dba27f5e5d55a46a9/test/.distcp.tmp.attempt_local1097967418_0001_m_00_0 => 413 : Request Entity Too Large
The body of your request was too large for this server.
    at org.apache.hadoop.fs.swift.http.SwiftRestClient.buildException(SwiftRestClient.java:1502)
    at org.apache.hadoop.fs.swift.http.SwiftRestClient.perform(SwiftRestClient.java:1403)
    at org.apache.hadoop.fs.swift.http.SwiftRestClient.copyObject(SwiftRestClient.java:923)
    at org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystemStore.copyObject(SwiftNativeFileSystemStore.java:765)
    at org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystemStore.rename(SwiftNativeFileSystemStore.java:617)
    at org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem.rename(SwiftNativeFileSystem.java:577)
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.promoteTmpToTarget(RetriableFileCopyCommand.java:220)
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:137)
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:100)
    at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
    at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:280)
    at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:252)
    at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:50)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

It looks like the problem actually occurs in the rename operation which happens after the copy. The rename is implemented as a copy/delete, and this secondary copy looks like it's not done in a way that breaks up the file into smaller chunks.

It looks like the following bug:

https://bugs.launchpad.net/sahara/+bug/1428941

It does not look like the fix for this is incorporated into hadoop's swift client.
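Editorial sketch: the report above says the rename's server-side copy is not broken into smaller chunks. The Python below is only an illustration of the size check and segment planning a chunk-aware rename would need. It is not the Hadoop Swift client's code (which is Java), and every name in it is hypothetical. The 5 GiB threshold is an assumption taken from the issue title; the real per-object limit is a server-side setting and may differ.

```python
# Assumption: the store rejects single objects (and single COPY requests)
# larger than 5 GiB, as suggested by the issue title. Real clusters may
# be configured differently.
ASSUMED_MAX_OBJECT_BYTES = 5 * 1024 ** 3


def needs_segmented_copy(size_bytes, limit=ASSUMED_MAX_OBJECT_BYTES):
    """Return True when one COPY request would exceed the assumed limit."""
    return size_bytes > limit


def plan_segments(size_bytes, segment_bytes):
    """Split an object of size_bytes into (offset, length) pairs.

    Each pair describes one segment that could be copied on its own,
    so no single request carries more than segment_bytes.
    """
    if segment_bytes <= 0:
        raise ValueError("segment_bytes must be positive")
    segments = []
    offset = 0
    while offset < size_bytes:
        length = min(segment_bytes, size_bytes - offset)
        segments.append((offset, length))
        offset += length
    return segments
```

For example, `plan_segments(10, 4)` yields three segments covering offsets 0, 4 and 8; a real fix would also have to write a manifest tying the segments together, which this sketch leaves out.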
Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)
On Mon, Jun 22, 2015 at 1:03 PM, Nick Dimiduk wrote: > On Mon, Jun 22, 2015 at 12:43 PM, Colin P. McCabe > wrote: > > > You mentioned that "most of our project will be focused on shell > > scripts" I guess based on the existing test-patch code. Allen did a > > lot of good work in this area recently. I am curious if you evaluated > > languages such as Python or Node.js for this use-case. Shell scripts > > can get a little... tricky beyond a certain size. On the other hand, > > if we are standardizing on shell, which shell and which version? > > Perhaps bash 3.5+? > > > > I'll also add that shell is not helpful for a cross-platform set of > tooling. I recently added a daemon to Apache Phoenix; an explicit > requirement was Windows support. I ended up implementing a solution in > python because that environment is platform-agnostic and still systems-y > enough. I think this is something this project should seriously consider. > In my opinion, historically, test-patch hasn't needed to be cross platform because the only first class development environment for Hadoop has been Linux. Growing beyond this could absolutely be one focus of Yetus should that be a consensus goal of the community. The seed of the project, though, is today's test-patch, which is implemented in bash. That's where we are today. Language "discussions" (smile) can and should be forward looking. On Mon, Jun 22, 2015 at 1:03 PM, Nick Dimiduk wrote: > On Mon, Jun 22, 2015 at 12:43 PM, Colin P. McCabe > wrote: > > > You mentioned that "most of our project will be focused on shell > > scripts" I guess based on the existing test-patch code. Allen did a > > lot of good work in this area recently. I am curious if you evaluated > > languages such as Python or Node.js for this use-case. Shell scripts > > can get a little... tricky beyond a certain size. On the other hand, > > if we are standardizing on shell, which shell and which version? > > Perhaps bash 3.5+? 
> > > > I'll also add that shell is not helpful for a cross-platform set of > tooling. I recently added a daemon to Apache Phoenix; an explicit > requirement was Windows support. I ended up implementing a solution in > python because that environment is platform-agnostic and still systems-y > enough. I think this is something this project should seriously consider. > > -n > > On Tue, Jun 16, 2015 at 7:55 PM, Sean Busbey wrote: > > > I'm going to try responding to several things at once here, so > apologies > > if > > > I miss anyone and sorry for the long email. :) > > > > > > > > > On Tue, Jun 16, 2015 at 3:44 PM, Steve Loughran < > ste...@hortonworks.com> > > > wrote: > > > > > >> I think it's good to have a general build/test process projects can > > share, > > >> so +1 to pulling it out. You should get help from others. > > >> > > >> regarding incubation, it is a lot of work, especially for something > > that's > > >> more of an in-house tool than an artifact to release and redistribute. > > >> > > >> You can't just use apache labs or the build project's repo to work on > > this? > > >> > > >> if you do want to incubate, we may want to nominate the hadoop project > > as > > >> the monitoring PMC, rather than incubator@. > > >> > > >> -steve > > >> > > >> > > > Important note: we're proposing a board resolution that would directly > > pull > > > this code base out into a new TLP; there'd be no incubator, we'd just > > > continue building community and start making releases. > > > > > > The proposed PMC believes the tooling we're talking about has direct > > > applicability to projects well outside of the ASF. Lot's of other open > > > source projects run on community contributions and have a general need > > for > > > better QA tools. Given that problem set and the presence of a community > > > working to solve it, there's no reason this needs to be treated as an > > > in-house build project. 
We certainly want to be useful to ASF projects > > and > > > getting them on-board given our current optimization for ASF infra will > > > certainly be easier, but we're not limited to that (and our current > > > prerequisites, a CI tool and jira or github, are pretty broadly > > available). > > > > > > > > > On Tue, Jun 16, 2015 at 10:13 AM, Nick Dimiduk > > wrote: > > > > > >> > > >> Since we're tossing out names, how about Apache Bootstrap? It's a > > >> meta-project to help other projects get off the ground, after all. > > >> > > > > > > > > > There's already a web development framework named Bootstrap[1]. It's > also > > > used by several ASF projects, so I think it best to avoid the > confusion. > > > > > > The name is, of course, up to the proposed PMC. As a bit of background, > > the > > > current name Yetus fulfills Allen's desire to have something shell > > related > > > and my desire to have a project that starts with Y (there are currently > > no > > > ASF projects that start with Y). The universe of names that fill in > these > > > two is very small, AFAICT. I did a brie
[jira] [Resolved] (HADOOP-12108) Erroneous behavior of use of wildcard character ( * ) in ls command of hdfs
[ https://issues.apache.org/jira/browse/HADOOP-12108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Prakash resolved HADOOP-12108.
Resolution: Invalid

Thanks Aman! Steve is right. You do need to use quotes when there is already a file on the local file system which would match the wildcard

> Erroneous behavior of use of wildcard character ( * ) in ls command of hdfs
>
> Key: HADOOP-12108
> URL: https://issues.apache.org/jira/browse/HADOOP-12108
> Project: Hadoop Common
> Issue Type: Bug
> Reporter: Aman Goyal
> Priority: Critical
>
> If you have following directories in your LOCAL file system
> /data/hadoop/sample/00/contents1.txt
> /data/hadoop/sample/01/contents2.txt
> and following directories in hdfs :
> /data/hadoop/sample/00/contents1.txt
> /data/hadoop/sample/01/contents2.txt
> /data/hadoop/sample/02/contents3.txt
> suppose you run the following hdfs ls command:
> hdfs dfs -ls -R /data/hadoop/sample/*
> the paths that are printed have a reference to local paths, and only 00 & 01 directories get listed.
> this happens only when wildcard ( * ) character is used in input paths.
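Editorial sketch: the resolution above comes down to the local shell expanding an unquoted wildcard before the hdfs command ever sees it. The Python below imitates that behaviour so the effect can be seen without a cluster. It assumes a POSIX-like shell with default settings (a pattern with no local match is passed through unchanged); `shell_expand` and `demo` are hypothetical helper names, not part of any Hadoop tool.

```python
import glob
import os
import tempfile


def shell_expand(arg):
    """Imitate default shell globbing for one unquoted argument.

    If the pattern matches local paths, the command receives those
    paths; if nothing matches, the pattern is passed through as-is.
    """
    matches = sorted(glob.glob(arg))
    return matches if matches else [arg]


def demo():
    """Build local dirs 00 and 01, then compare unquoted vs quoted args."""
    with tempfile.TemporaryDirectory() as root:
        for name in ("00", "01"):
            os.makedirs(os.path.join(root, "sample", name))
        pattern = os.path.join(root, "sample", "*")
        unquoted = shell_expand(pattern)  # what the command gets without quotes
        quoted = [pattern]                # quoting delivers the literal pattern
    return unquoted, quoted
```

In the unquoted case the command is handed only the two local directories, which is why 02 never shows up in the listing; in the quoted case the literal pattern reaches hdfs, which can then match it against its own namespace.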
Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)
On Mon, Jun 22, 2015 at 12:43 PM, Colin P. McCabe wrote: > You mentioned that "most of our project will be focused on shell > scripts" I guess based on the existing test-patch code. Allen did a > lot of good work in this area recently. I am curious if you evaluated > languages such as Python or Node.js for this use-case. Shell scripts > can get a little... tricky beyond a certain size. On the other hand, > if we are standardizing on shell, which shell and which version? > Perhaps bash 3.5+? > I'll also add that shell is not helpful for a cross-platform set of tooling. I recently added a daemon to Apache Phoenix; an explicit requirement was Windows support. I ended up implementing a solution in python because that environment is platform-agnostic and still systems-y enough. I think this is something this project should seriously consider. -n On Tue, Jun 16, 2015 at 7:55 PM, Sean Busbey wrote: > > I'm going to try responding to several things at once here, so apologies > if > > I miss anyone and sorry for the long email. :) > > > > > > On Tue, Jun 16, 2015 at 3:44 PM, Steve Loughran > > wrote: > > > >> I think it's good to have a general build/test process projects can > share, > >> so +1 to pulling it out. You should get help from others. > >> > >> regarding incubation, it is a lot of work, especially for something > that's > >> more of an in-house tool than an artifact to release and redistribute. > >> > >> You can't just use apache labs or the build project's repo to work on > this? > >> > >> if you do want to incubate, we may want to nominate the hadoop project > as > >> the monitoring PMC, rather than incubator@. > >> > >> -steve > >> > >> > > Important note: we're proposing a board resolution that would directly > pull > > this code base out into a new TLP; there'd be no incubator, we'd just > > continue building community and start making releases. 
> > > > The proposed PMC believes the tooling we're talking about has direct > > applicability to projects well outside of the ASF. Lot's of other open > > source projects run on community contributions and have a general need > for > > better QA tools. Given that problem set and the presence of a community > > working to solve it, there's no reason this needs to be treated as an > > in-house build project. We certainly want to be useful to ASF projects > and > > getting them on-board given our current optimization for ASF infra will > > certainly be easier, but we're not limited to that (and our current > > prerequisites, a CI tool and jira or github, are pretty broadly > available). > > > > > > On Tue, Jun 16, 2015 at 10:13 AM, Nick Dimiduk > wrote: > > > >> > >> Since we're tossing out names, how about Apache Bootstrap? It's a > >> meta-project to help other projects get off the ground, after all. > >> > > > > > > There's already a web development framework named Bootstrap[1]. It's also > > used by several ASF projects, so I think it best to avoid the confusion. > > > > The name is, of course, up to the proposed PMC. As a bit of background, > the > > current name Yetus fulfills Allen's desire to have something shell > related > > and my desire to have a project that starts with Y (there are currently > no > > ASF projects that start with Y). The universe of names that fill in these > > two is very small, AFAICT. I did a brief suitability search and didn't > find > > any blockers. > > > > > > On Tue, Jun 16, 2015 at 11:59 AM, Allen Wittenauer > > wrote: > > > >> > >> Since a couple of people have brought it up: > >> > >> I think the release question is probably one of the big question > >> marks. Other than tar balls, how does something like this actually get > >> used downstream? 
> >> > >> For test-patch, in particular, I have a few thoughts on this: > >> > >> Short term: > >> > >> * Projects that want to move RIGHT NOW would modify their > Jenkins > >> jobs to checkout from the Yetus repo (preferably at a well known tag or > >> branch) in one directory and their project repo in another directory. > Then > >> it’s just a matter of passing the correct flags to test-patch. This is > >> pretty much how I’ve been personally running test-patch for about 6 > months > >> now. Under Jenkins, we’ve seen this work with NiFi (incubating) already. > >> > >> * Create a stub version of test-patch that projects could check > >> into their repo, replacing the existing test-patch. This stub version > >> would git clone from either ASF or github and then execute test-patch > >> accordingly on demand. With the correct smarts, it could make sure it > has > >> a cached version to prevent continual clones. > >> > >> Longer term: > >> > >> * I’ve been toying with the idea of (ab)using Java repos and > >> packaging as a transportation layer, either in addition or in > combination > >> with something like a maven plugin. Something like this would clearly > be > >> better for offline usage and/or to lower the network tra
Re: [DISCUSS] More Maintenance Releases
+1 for creating a maintenance release with a more rapid release cadence and more effort put into stability backports. I think this would really be great for the project.

Colin

On Mon, Jun 22, 2015 at 2:43 AM, Akira AJISAKA wrote:
> Hi everyone,
>
> At Hadoop Summit, I joined the HDFS BoF and heard from Jason Lowe that Apache Hadoop developers at Yahoo!, Twitter, and other non-distributors work very hard to maintain Hadoop by cherry-picking patches to their own branches.
>
> I want to share the work with the community. If we can cherry-pick bug fix patches and have more maintenance releases, it would make not only users happy but also developers who work very hard to stabilize their own branches.
>
> To have more maintenance releases, I propose two changes:
>
> * Major/Minor/Trivial bug fixes can be cherry-picked
> * (Roughly) Monthly maintenance release
>
> I would like to start the work from branch-2.6. If the change is accepted by the community, I'm willing to work on the maintenance, as a release manager.
>
> Best regards,
> Akira
Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)
+1 for making this a separate project. We've always struggled with a lot of forks of the test-patch code and perhaps this project can help create something that works well for multiple projects. Bypassing the incubator seems kind of weird (I didn't know that was an option) but I will let other people with more experience in the ASF comment on that. You mentioned that "most of our project will be focused on shell scripts" I guess based on the existing test-patch code. Allen did a lot of good work in this area recently. I am curious if you evaluated languages such as Python or Node.js for this use-case. Shell scripts can get a little... tricky beyond a certain size. On the other hand, if we are standardizing on shell, which shell and which version? Perhaps bash 3.5+? Also, what will be the mechanism for customizing this for each project? Ideally the customizations needed would be small so we could share the most code. cheers, Colin On Tue, Jun 16, 2015 at 7:55 PM, Sean Busbey wrote: > I'm going to try responding to several things at once here, so apologies if > I miss anyone and sorry for the long email. :) > > > On Tue, Jun 16, 2015 at 3:44 PM, Steve Loughran > wrote: > >> I think it's good to have a general build/test process projects can share, >> so +1 to pulling it out. You should get help from others. >> >> regarding incubation, it is a lot of work, especially for something that's >> more of an in-house tool than an artifact to release and redistribute. >> >> You can't just use apache labs or the build project's repo to work on this? >> >> if you do want to incubate, we may want to nominate the hadoop project as >> the monitoring PMC, rather than incubator@. >> >> -steve >> >> > Important note: we're proposing a board resolution that would directly pull > this code base out into a new TLP; there'd be no incubator, we'd just > continue building community and start making releases. 
> > The proposed PMC believes the tooling we're talking about has direct > applicability to projects well outside of the ASF. Lot's of other open > source projects run on community contributions and have a general need for > better QA tools. Given that problem set and the presence of a community > working to solve it, there's no reason this needs to be treated as an > in-house build project. We certainly want to be useful to ASF projects and > getting them on-board given our current optimization for ASF infra will > certainly be easier, but we're not limited to that (and our current > prerequisites, a CI tool and jira or github, are pretty broadly available). > > > On Tue, Jun 16, 2015 at 10:13 AM, Nick Dimiduk wrote: > >> >> Since we're tossing out names, how about Apache Bootstrap? It's a >> meta-project to help other projects get off the ground, after all. >> > > > There's already a web development framework named Bootstrap[1]. It's also > used by several ASF projects, so I think it best to avoid the confusion. > > The name is, of course, up to the proposed PMC. As a bit of background, the > current name Yetus fulfills Allen's desire to have something shell related > and my desire to have a project that starts with Y (there are currently no > ASF projects that start with Y). The universe of names that fill in these > two is very small, AFAICT. I did a brief suitability search and didn't find > any blockers. > > > On Tue, Jun 16, 2015 at 11:59 AM, Allen Wittenauer > wrote: > >> >> Since a couple of people have brought it up: >> >> I think the release question is probably one of the big question >> marks. Other than tar balls, how does something like this actually get >> used downstream? 
>> >> For test-patch, in particular, I have a few thoughts on this: >> >> Short term: >> >> * Projects that want to move RIGHT NOW would modify their Jenkins >> jobs to checkout from the Yetus repo (preferably at a well known tag or >> branch) in one directory and their project repo in another directory. Then >> it’s just a matter of passing the correct flags to test-patch. This is >> pretty much how I’ve been personally running test-patch for about 6 months >> now. Under Jenkins, we’ve seen this work with NiFi (incubating) already. >> >> * Create a stub version of test-patch that projects could check >> into their repo, replacing the existing test-patch. This stub version >> would git clone from either ASF or github and then execute test-patch >> accordingly on demand. With the correct smarts, it could make sure it has >> a cached version to prevent continual clones. >> >> Longer term: >> >> * I’ve been toying with the idea of (ab)using Java repos and >> packaging as a transportation layer, either in addition or in combination >> with something like a maven plugin. Something like this would clearly be >> better for offline usage and/or to lower the network traffic. >> > > It's important that the project follow ASF guidelines on publishing > releases[2]. So long as we publish rel
Re: [DISCUSS] More Maintenance Releases
+1 for the idea of maintenance releases.

Considering the amount of code changes done in trunk and branch-2, cherry-picking may not be easy and straightforward for all issues.

I would love to help in cherry-picking the fixes and reviewing them. I would also love to help in the release process.

Regards,
Vinay

On Mon, Jun 22, 2015 at 9:49 PM, Allen Wittenauer wrote:
>
> If 2.6 is the target, someone will have to verify that any cherry-picked patches actually work with JDK6 since the PMC voted to officially kill backward compatibility in a minor release. It’s going to be easier and probably smarter to fix 2.7 if that’s really desired. [1]
>
> Frankly, I’d rather see effort spent on stabilizing trunk and ditching the now broken branch-2. We’re approaching the 4 year anniversary of 0.23.0’s release (which later begat 2.x, which is already past the 3 year mark). It’s hard to claim health when it’s been so long since a branch off of trunk was cut and turned into something official.
>
> [1] Kengo and I are hard at work getting multiJDK testing working in Yetus, but it’s not quite ready for prime time. :( It could certainly help here, but… it’s not very stable yet.
>
> On Jun 22, 2015, at 7:50 AM, Karthik Kambatla wrote:
>
> > Thanks for starting this thread, Akira.
> >
> > +1 to more maintenance releases. More stable upstream releases avoids duplicating cherry-pick work across consumers/vendors, and shows the maturity of the project to users.
> >
> > I see value in backporting blocker/critical issues, but have mixed feelings about doing the same for major/minor/trivial issues. IMO, every commit has non-zero potential to introduce other bugs. Depending on the kind of fix (say, documentation), it might be okay to include these non-critical fixes.
> > One approach could be to allow all bug fixes for 2.x.1, blocker/critical > > for 2.x.2, blocker for 2.x.3 (or something along those lines) to ensure > > increasing stability of maintenance releases? > > > > I am also +1 to any committer picking up RM duties for a maintenance > > release. It is healthy to have more people participate in the release > > process, so long as we have some method to maintenance release madness. > > > > A committer (who is not yet a PMC member) could be a Release Manager, but > > his vote is not binding for the release. I RM-ed the 2.5.x releases as a > > committer. RM-ing a release and voting non-binding could be a good way to > > remind the PMC to include the committer in PMC :) > > > > Cheers > > Karthik > > > > On Mon, Jun 22, 2015 at 4:36 AM, Tsuyoshi Ozawa > wrote: > > > >> Hi Akira, > >> > >> Thank you for starting interesting topic. +1 on the idea of More > >> Maintenance Releases for old branches. It would be good if this > >> activity is more coupled with Apache Yetus for users. > >> > >> BTW, I don't know one of committers, who is not PMC, can be a release > >> manager. Does anyone know about this? It's described in detail as > >> follows: http://hadoop.apache.org/bylaws#Decision+Making > >> > >>> Release Manager > >>> A Release Manager (RM) is a committer who volunteers to produce a > >> Release Candidate according to HowToRelease. > >>> > >>> Project Management Committee > >>> Deciding what is distributed as products of the Apache Hadoop project. > >> In particular all releases must be approved by the PMC > >> > >> Thanks, > >> - Tsuyoshi > >> > >> On Mon, Jun 22, 2015 at 6:43 PM, Akira AJISAKA > >> wrote: > >>> Hi everyone, > >>> > >>> In Hadoop Summit, I joined HDFS BoF and heard from Jason Lowe that > Apache > >>> Hadoop developers at Yahoo!, Twitter, and other non-distributors work > >> very > >>> hard to maintenance Hadoop by cherry-picking patches to their own > >> branches. 
> >>> > >>> I want to share the work with the community. If we can cherry-pick bug > >> fix > >>> patches and have more maintenance releases, it'd be very happy not only > >> for > >>> users but also for developers who work very hard for stabilizing their > >> own > >>> branches. > >>> > >>> To have more maintenance releases, I propose two changes: > >>> > >>> * Major/Minor/Trivial bug fixes can be cherry-picked > >>> * (Roughly) Monthly maintenance release > >>> > >>> I would like to start the work from branch-2.6. If the change will be > >>> accepted by the community, I'm willing to work for the maintenance, as > a > >>> release manager. > >>> > >>> Best regards, > >>> Akira > >> > > > > > > > > -- > > Karthik Kambatla > > Software Engineer, Cloudera Inc. > > > > http://five.sentenc.es > >
Re: [DISCUSS] More Maintenance Releases
If 2.6 is the target, someone will have to verify that any cherry-picked patches actually work with JDK6 since the PMC voted to officially kill backward compatibility in a minor release. It’s going to be easier and probably smarter to fix 2.7 if that’s really desired. [1]

Frankly, I’d rather see effort spent on stabilizing trunk and ditching the now broken branch-2. We’re approaching the 4 year anniversary of 0.23.0’s release (which later begat 2.x, which is already past the 3 year mark). It’s hard to claim health when it’s been so long since a branch off of trunk was cut and turned into something official.

[1] Kengo and I are hard at work getting multiJDK testing working in Yetus, but it’s not quite ready for prime time. :( It could certainly help here, but… it’s not very stable yet.

On Jun 22, 2015, at 7:50 AM, Karthik Kambatla wrote:

> Thanks for starting this thread, Akira.
>
> +1 to more maintenance releases. More stable upstream releases avoids duplicating cherry-pick work across consumers/vendors, and shows the maturity of the project to users.
>
> I see value in backporting blocker/critical issues, but have mixed feelings about doing the same for major/minor/trivial issues. IMO, every commit has non-zero potential to introduce other bugs. Depending on the kind of fix (say, documentation), it might be okay to include these non-critical fixes. One approach could be to allow all bug fixes for 2.x.1, blocker/critical for 2.x.2, blocker for 2.x.3 (or something along those lines) to ensure increasing stability of maintenance releases?
>
> I am also +1 to any committer picking up RM duties for a maintenance release. It is healthy to have more people participate in the release process, so long as we have some method to maintenance release madness.
>
> A committer (who is not yet a PMC member) could be a Release Manager, but his vote is not binding for the release. I RM-ed the 2.5.x releases as a committer.
RM-ing a release and voting non-binding could be a good way to > remind the PMC to include the committer in PMC :) > > Cheers > Karthik > > On Mon, Jun 22, 2015 at 4:36 AM, Tsuyoshi Ozawa wrote: > >> Hi Akira, >> >> Thank you for starting interesting topic. +1 on the idea of More >> Maintenance Releases for old branches. It would be good if this >> activity is more coupled with Apache Yetus for users. >> >> BTW, I don't know one of committers, who is not PMC, can be a release >> manager. Does anyone know about this? It's described in detail as >> follows: http://hadoop.apache.org/bylaws#Decision+Making >> >>> Release Manager >>> A Release Manager (RM) is a committer who volunteers to produce a >> Release Candidate according to HowToRelease. >>> >>> Project Management Committee >>> Deciding what is distributed as products of the Apache Hadoop project. >> In particular all releases must be approved by the PMC >> >> Thanks, >> - Tsuyoshi >> >> On Mon, Jun 22, 2015 at 6:43 PM, Akira AJISAKA >> wrote: >>> Hi everyone, >>> >>> In Hadoop Summit, I joined HDFS BoF and heard from Jason Lowe that Apache >>> Hadoop developers at Yahoo!, Twitter, and other non-distributors work >> very >>> hard to maintenance Hadoop by cherry-picking patches to their own >> branches. >>> >>> I want to share the work with the community. If we can cherry-pick bug >> fix >>> patches and have more maintenance releases, it'd be very happy not only >> for >>> users but also for developers who work very hard for stabilizing their >> own >>> branches. >>> >>> To have more maintenance releases, I propose two changes: >>> >>> * Major/Minor/Trivial bug fixes can be cherry-picked >>> * (Roughly) Monthly maintenance release >>> >>> I would like to start the work from branch-2.6. If the change will be >>> accepted by the community, I'm willing to work for the maintenance, as a >>> release manager. >>> >>> Best regards, >>> Akira >> > > > > -- > Karthik Kambatla > Software Engineer, Cloudera Inc. 
> > http://five.sentenc.es
Re: [DISCUSS] More Maintenance Releases
Thanks for starting this thread, Akira. +1 to more maintenance releases. More stable upstream releases avoid duplicating cherry-pick work across consumers/vendors and show the maturity of the project to users.

I see value in backporting blocker/critical issues, but have mixed feelings about doing the same for major/minor/trivial issues. IMO, every commit has non-zero potential to introduce other bugs. Depending on the kind of fix (say, documentation), it might be okay to include these non-critical fixes. One approach could be to allow all bug fixes for 2.x.1, blocker/critical for 2.x.2, blocker for 2.x.3 (or something along those lines) to ensure increasing stability of maintenance releases?

I am also +1 to any committer picking up RM duties for a maintenance release. It is healthy to have more people participate in the release process, so long as we have some method to the maintenance release madness.

A committer (who is not yet a PMC member) could be a Release Manager, but his vote is not binding for the release. I RM-ed the 2.5.x releases as a committer. RM-ing a release and voting non-binding could be a good way to remind the PMC to include the committer in the PMC :)

Cheers
Karthik

--
Karthik Kambatla
Software Engineer, Cloudera Inc.
http://five.sentenc.es
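Karthik's tiered suggestion above (all bug fixes for 2.x.1, blocker/critical for 2.x.2, blocker only for 2.x.3) can be sketched as a small lookup. This is only an illustration of the idea, not an agreed policy: the names `ALLOWED_PRIORITIES` and `may_backport` are hypothetical, the priority names are the usual JIRA ones, and the behavior for releases after 2.x.3 is an assumption, since the message leaves it open ("or something along those lines").

```python
# Hypothetical encoding of the tiered idea: the later the maintenance
# release, the higher the priority a bug fix needs in order to be backported.
ALLOWED_PRIORITIES = {
    1: {"Blocker", "Critical", "Major", "Minor", "Trivial"},  # 2.x.1: all bug fixes
    2: {"Blocker", "Critical"},                                # 2.x.2
    3: {"Blocker"},                                            # 2.x.3
}


def may_backport(priority, maintenance_number):
    """Return True if a bug fix of this priority fits the 2.x.N release."""
    if maintenance_number < 1:
        raise ValueError("maintenance releases start at 2.x.1")
    # Assumption: releases after 2.x.3 stay as strict as 2.x.3.
    tier = min(maintenance_number, max(ALLOWED_PRIORITIES))
    return priority in ALLOWED_PRIORITIES[tier]


print(may_backport("Minor", 1))     # a minor fix for 2.x.1
print(may_backport("Major", 2))     # a major fix for 2.x.2
print(may_backport("Blocker", 4))   # a blocker for a later release
```

A table like this makes the trade-off explicit: each later release in a line accepts a narrower set of changes, which is the "increasing stability" the suggestion aims for.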
Re: [DISCUSS] More Maintenance Releases
More maintenance releases would be excellent.

If y'all are going to make more releases on the 2.6 line, please consider backporting HADOOP-11710, as without it HBase is unusable on top of HDFS encryption. It's been inconvenient that the fix is only available in a non-production release line.

-Sean
[jira] [Created] (HADOOP-12108) Erroneous behavior of use of wildcard character ( * ) in ls command of hdfs
Aman Goyal created HADOOP-12108:
-----------------------------------

Summary: Erroneous behavior of use of wildcard character ( * ) in ls command of hdfs
Key: HADOOP-12108
URL: https://issues.apache.org/jira/browse/HADOOP-12108
Project: Hadoop Common
Issue Type: Bug
Reporter: Aman Goyal
Priority: Critical

If you have the following directories in your LOCAL file system:

    /data/hadoop/sample/00/contents1.txt
    /data/hadoop/sample/01/contents2.txt

and the following directories in HDFS:

    /data/hadoop/sample/00/contents1.txt
    /data/hadoop/sample/01/contents2.txt
    /data/hadoop/sample/02/contents3.txt

and you run the following hdfs ls command:

    hdfs dfs -ls -R /data/hadoop/sample/*

then the paths that are printed refer to the local paths, and only the 00 and 01 directories get listed. This happens only when the wildcard (*) character is used in the input paths.
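The report above does not name a cause. One plausible explanation, stated here as an assumption rather than a confirmed diagnosis, is that the unquoted `*` is expanded by the local shell against the local file system before the `hdfs` command ever sees it. The sketch below imitates that expansion with Python's `glob` module on a temporary directory; the helper names `build_local_tree` and `expand_like_shell` are hypothetical, and no HDFS call is made.

```python
import glob
import os
import tempfile


def build_local_tree(root):
    """Create sample/00 and sample/01 under root, like the local layout in the report."""
    for sub, name in (("00", "contents1.txt"), ("01", "contents2.txt")):
        directory = os.path.join(root, "sample", sub)
        os.makedirs(directory)
        with open(os.path.join(directory, name), "w") as handle:
            handle.write("placeholder\n")


def expand_like_shell(pattern):
    """Approximate shell pathname expansion of an unquoted argument.

    If the pattern matches local paths, the sorted matches are substituted;
    if nothing matches, the pattern is passed through unchanged.
    """
    matches = sorted(glob.glob(pattern))
    return matches if matches else [pattern]


with tempfile.TemporaryDirectory() as root:
    build_local_tree(root)
    pattern = os.path.join(root, "sample", "*")

    unquoted_args = expand_like_shell(pattern)  # arguments seen for sample/*
    quoted_args = [pattern]                     # arguments seen for 'sample/*'

    expanded_names = [os.path.basename(path) for path in unquoted_args]
    quoted_names = [os.path.basename(path) for path in quoted_args]
    print("unquoted:", expanded_names)
    print("quoted:  ", quoted_names)
```

Under that assumption, the command would only ever be asked about the locally matching `00` and `01` paths, which fits the behavior described. Quoting the argument, as in `hdfs dfs -ls -R '/data/hadoop/sample/*'`, would hand the pattern to HDFS unexpanded.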
Re: [DISCUSS] More Maintenance Releases
Hi Akira,

Thank you for starting an interesting topic. +1 on the idea of More Maintenance Releases for old branches. It would be good if this activity were more closely coupled with Apache Yetus for users.

BTW, I don't know whether a committer who is not a PMC member can be a release manager. Does anyone know about this? The roles are described in detail as follows: http://hadoop.apache.org/bylaws#Decision+Making

> Release Manager
> A Release Manager (RM) is a committer who volunteers to produce a Release Candidate according to HowToRelease.
>
> Project Management Committee
> Deciding what is distributed as products of the Apache Hadoop project. In particular all releases must be approved by the PMC

Thanks,
- Tsuyoshi
[DISCUSS] More Maintenance Releases
Hi everyone,

At Hadoop Summit, I joined the HDFS BoF and heard from Jason Lowe that Apache Hadoop developers at Yahoo!, Twitter, and other non-distributors work very hard to maintain Hadoop by cherry-picking patches to their own branches.

I want to share that work with the community. If we can cherry-pick bug fix patches and have more maintenance releases, it would benefit not only users but also the developers who work very hard to stabilize their own branches.

To have more maintenance releases, I propose two changes:

* Major/Minor/Trivial bug fixes can be cherry-picked
* (Roughly) Monthly maintenance release

I would like to start the work from branch-2.6. If the change is accepted by the community, I'm willing to work on the maintenance as a release manager.

Best regards,
Akira