[jira] [Created] (HADOOP-12111) Split test-patch off into its own TLP

2015-06-22 Thread Allen Wittenauer (JIRA)
Allen Wittenauer created HADOOP-12111:
-

 Summary: Split test-patch off into its own TLP
 Key: HADOOP-12111
 URL: https://issues.apache.org/jira/browse/HADOOP-12111
 Project: Hadoop Common
  Issue Type: Bug
  Components: yetus
Reporter: Allen Wittenauer


Given test-patch's tendency to get forked into a variety of different projects, 
it makes a lot of sense to make an Apache TLP so that everyone can benefit from 
a common code base.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HADOOP-12110) Consolidate usage of JSON libraries

2015-06-22 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang resolved HADOOP-12110.

Resolution: Invalid

Opened for the wrong project.  Sorry, closing as invalid.

> Consolidate usage of JSON libraries
> ---
>
> Key: HADOOP-12110
> URL: https://issues.apache.org/jira/browse/HADOOP-12110
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Eric Yang
>Assignee: Eric Yang
>
> Chukwa uses the JSON jar from json.org and also json-simple from googlecode.
> It would be nice to use only one JSON implementation, for consistency.
> Minidev json-smart was also considered as a replacement for json-simple to
> improve performance, but it doesn't handle some characters correctly.
> Therefore, it's best to use json-simple.





[jira] [Created] (HADOOP-12110) Consolidate usage of JSON libraries

2015-06-22 Thread Eric Yang (JIRA)
Eric Yang created HADOOP-12110:
--

 Summary: Consolidate usage of JSON libraries
 Key: HADOOP-12110
 URL: https://issues.apache.org/jira/browse/HADOOP-12110
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Eric Yang
Assignee: Eric Yang


Chukwa uses the JSON jar from json.org and also json-simple from googlecode.
It would be nice to use only one JSON implementation, for consistency.
Minidev json-smart was also considered as a replacement for json-simple to
improve performance, but it doesn't handle some characters correctly.
Therefore, it's best to use json-simple.





[jira] [Created] (HADOOP-12109) Distcp of file > 5GB to swift fails with HTTP 413 error

2015-06-22 Thread Phil D'Amore (JIRA)
Phil D'Amore created HADOOP-12109:
-

 Summary: Distcp of file > 5GB to swift fails with HTTP 413 error
 Key: HADOOP-12109
 URL: https://issues.apache.org/jira/browse/HADOOP-12109
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/swift
Affects Versions: 2.6.0
Reporter: Phil D'Amore


Trying to use distcp to copy a file larger than 5GB to the swift fs results in a
stack trace like the following:

15/06/01 20:58:57 ERROR util.RetriableCommand: Failure in Retriable command: Copying hdfs://xxx:8020/path/to/random-5Gplus.dat to swift://xxx/5Gplus.dat
Invalid Response: Method COPY on http://xxx:8080/v1/AUTH_fb7a8901dd8d4c8dba27f5e5d55a46a9/test/.distcp.tmp.attempt_local1097967418_0001_m_00_0 failed, status code: 413, status line: HTTP/1.1 413 Request Entity Too Large
COPY http://xxx:8080/v1/AUTH_fb7a8901dd8d4c8dba27f5e5d55a46a9/test/.distcp.tmp.attempt_local1097967418_0001_m_00_0 => 413 : Request Entity Too Large The body of your request was too large for this server.
    at org.apache.hadoop.fs.swift.http.SwiftRestClient.buildException(SwiftRestClient.java:1502)
    at org.apache.hadoop.fs.swift.http.SwiftRestClient.perform(SwiftRestClient.java:1403)
    at org.apache.hadoop.fs.swift.http.SwiftRestClient.copyObject(SwiftRestClient.java:923)
    at org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystemStore.copyObject(SwiftNativeFileSystemStore.java:765)
    at org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystemStore.rename(SwiftNativeFileSystemStore.java:617)
    at org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem.rename(SwiftNativeFileSystem.java:577)
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.promoteTmpToTarget(RetriableFileCopyCommand.java:220)
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:137)
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:100)
    at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
    at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:280)
    at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:252)
    at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:50)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

It looks like the problem actually occurs in the rename operation which happens 
after the copy.  The rename is implemented as a copy/delete, and this secondary 
copy looks like it's not done in a way that breaks up the file into smaller 
chunks.  

It looks like the following bug:

https://bugs.launchpad.net/sahara/+bug/1428941

It does not look like the fix for this is incorporated into hadoop's swift 
client.
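
To illustrate the kind of chunking the report says is missing, here is a minimal Python sketch. It is not the Hadoop Swift client's code. It assumes a 5 GiB limit on a single copy request (the threshold seen in this report) and a hypothetical 1 GiB segment size; the function name `plan_copy` and its return shape are invented for illustration.

```python
from typing import List, Tuple

GIB = 1024 ** 3
# Assumption: the object store rejects single-request copies above 5 GiB.
MAX_SINGLE_COPY_BYTES = 5 * GIB
# Hypothetical segment size chosen for this example only.
SEGMENT_BYTES = 1 * GIB


def plan_copy(size: int,
              limit: int = MAX_SINGLE_COPY_BYTES,
              segment: int = SEGMENT_BYTES) -> List[Tuple[int, int]]:
    """Return the (offset, length) ranges needed to copy an object.

    A single range means one server-side copy is enough; several ranges
    mean the object has to be copied segment by segment.
    """
    if size < 0:
        raise ValueError("size must be non-negative")
    if size <= limit:
        return [(0, size)]
    ranges = []
    offset = 0
    while offset < size:
        length = min(segment, size - offset)
        ranges.append((offset, length))
        offset += length
    return ranges


if __name__ == "__main__":
    # An object slightly over 6 GiB is split into segments.
    for offset, length in plan_copy(6 * GIB + 10):
        print(offset, length)
```

A rename implemented as copy/delete could consult a plan like this before issuing the copy, instead of always sending one COPY request.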





Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)

2015-06-22 Thread Andrew Purtell
On Mon, Jun 22, 2015 at 1:03 PM, Nick Dimiduk  wrote:

> On Mon, Jun 22, 2015 at 12:43 PM, Colin P. McCabe 
> wrote:
>
> > You mentioned that "most of our project will be focused on shell
> > scripts" I guess based on the existing test-patch code.  Allen did a
> > lot of good work in this area recently.  I am curious if you evaluated
> > languages such as Python or Node.js for this use-case.  Shell scripts
> > can get a little... tricky beyond a certain size.  On the other hand,
> > if we are standardizing on shell, which shell and which version?
> > Perhaps bash 3.5+?
> >
>
> I'll also add that shell is not helpful for a cross-platform set of
> tooling. I recently added a daemon to Apache Phoenix; an explicit
> requirement was Windows support. I ended up implementing a solution in
> python because that environment is platform-agnostic and still systems-y
> enough. I think this is something this project should seriously consider.
>

In my opinion, historically, test-patch hasn't needed to be cross platform
because the only first class development environment for Hadoop has been
Linux. Growing beyond this could absolutely be one focus of Yetus should
that be a consensus goal of the community. The seed of the project, though,
is today's test-patch, which is implemented in bash. That's where we are
today. Language "discussions" (smile) can and should be forward looking.



[jira] [Resolved] (HADOOP-12108) Erroneous behavior of use of wildcard character ( * ) in ls command of hdfs

2015-06-22 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash resolved HADOOP-12108.
---
Resolution: Invalid

Thanks Aman! Steve is right. You do need to quote the path when there is already
a file on the local file system that would match the wildcard.

> Erroneous behavior of use of wildcard character ( * ) in ls command of hdfs 
> 
>
> Key: HADOOP-12108
> URL: https://issues.apache.org/jira/browse/HADOOP-12108
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Aman Goyal
>Priority: Critical
>
> If you have the following paths in your LOCAL file system:
> /data/hadoop/sample/00/contents1.txt
> /data/hadoop/sample/01/contents2.txt
> and the following paths in HDFS:
> /data/hadoop/sample/00/contents1.txt
> /data/hadoop/sample/01/contents2.txt
> /data/hadoop/sample/02/contents3.txt
> and you run the following hdfs ls command:
> hdfs dfs -ls -R /data/hadoop/sample/*
> the paths that are printed refer to the local paths, and only the 00 and 01
> directories get listed.
> This happens only when the wildcard ( * ) character is used in input paths.
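
The behaviour described above comes from the shell, which expands an unquoted wildcard against the local file system before the hdfs command ever sees it. The Python sketch below imitates that step under the assumption of default bash behaviour, where a pattern with no local match is passed through literally. The helper `shell_expand` is hypothetical and exists only for this illustration.

```python
import glob
import os
import tempfile
from typing import List


def shell_expand(arg: str, quoted: bool) -> List[str]:
    """Mimic how a POSIX shell treats one argument (default bash behaviour).

    Quoted arguments are passed through untouched. Unquoted arguments are
    replaced by the matching LOCAL paths; if nothing matches locally, the
    pattern is passed through literally.
    """
    if quoted:
        return [arg]
    matches = sorted(glob.glob(arg))
    return matches if matches else [arg]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # The local tree has only 00 and 01, as in the report.
        for name in ("00", "01"):
            os.makedirs(os.path.join(root, "sample", name))
        pattern = os.path.join(root, "sample", "*")
        print("unquoted:", shell_expand(pattern, quoted=False))
        print("quoted:  ", shell_expand(pattern, quoted=True))
```

With the pattern quoted, for example `hdfs dfs -ls -R '/data/hadoop/sample/*'`, the literal pattern reaches the command and the local directory contents no longer influence the result.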





Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)

2015-06-22 Thread Nick Dimiduk
On Mon, Jun 22, 2015 at 12:43 PM, Colin P. McCabe 
wrote:

> You mentioned that "most of our project will be focused on shell
> scripts" I guess based on the existing test-patch code.  Allen did a
> lot of good work in this area recently.  I am curious if you evaluated
> languages such as Python or Node.js for this use-case.  Shell scripts
> can get a little... tricky beyond a certain size.  On the other hand,
> if we are standardizing on shell, which shell and which version?
> Perhaps bash 3.5+?
>

I'll also add that shell is not helpful for a cross-platform set of
tooling. I recently added a daemon to Apache Phoenix; an explicit
requirement was Windows support. I ended up implementing a solution in
python because that environment is platform-agnostic and still systems-y
enough. I think this is something this project should seriously consider.

-n

On Tue, Jun 16, 2015 at 7:55 PM, Sean Busbey  wrote:
> > I'm going to try responding to several things at once here, so apologies if
> > I miss anyone and sorry for the long email. :)
> >
> >
> > On Tue, Jun 16, 2015 at 3:44 PM, Steve Loughran
> > wrote:
> >
> >> I think it's good to have a general build/test process projects can share,
> >> so +1 to pulling it out. You should get help from others.
> >>
> >> regarding incubation, it is a lot of work, especially for something that's
> >> more of an in-house tool than an artifact to release and redistribute.
> >>
> >> You can't just use apache labs or the build project's repo to work on this?
> >>
> >> if you do want to incubate, we may want to nominate the hadoop project as
> >> the monitoring PMC, rather than incubator@.
> >>
> >> -steve
> >>
> >>
> > Important note: we're proposing a board resolution that would directly pull
> > this code base out into a new TLP; there'd be no incubator, we'd just
> > continue building community and start making releases.
> >
> > The proposed PMC believes the tooling we're talking about has direct
> > applicability to projects well outside of the ASF. Lots of other open
> > source projects run on community contributions and have a general need for
> > better QA tools. Given that problem set and the presence of a community
> > working to solve it, there's no reason this needs to be treated as an
> > in-house build project. We certainly want to be useful to ASF projects and
> > getting them on-board given our current optimization for ASF infra will
> > certainly be easier, but we're not limited to that (and our current
> > prerequisites, a CI tool and jira or github, are pretty broadly available).
> >
> >
> > On Tue, Jun 16, 2015 at 10:13 AM, Nick Dimiduk  wrote:
> >
> >>
> >> Since we're tossing out names, how about Apache Bootstrap? It's a
> >> meta-project to help other projects get off the ground, after all.
> >>
> >
> >
> > There's already a web development framework named Bootstrap[1]. It's also
> > used by several ASF projects, so I think it best to avoid the confusion.
> >
> > The name is, of course, up to the proposed PMC. As a bit of background, the
> > current name Yetus fulfills Allen's desire to have something shell related
> > and my desire to have a project that starts with Y (there are currently no
> > ASF projects that start with Y). The universe of names that fill in these
> > two is very small, AFAICT. I did a brief suitability search and didn't find
> > any blockers.
> >
> >
> >  On Tue, Jun 16, 2015 at 11:59 AM, Allen Wittenauer
> >  wrote:
> >
> >>
> >> Since a couple of people have brought it up:
> >>
> >> I think the release question is probably one of the big question
> >> marks.  Other than tar balls, how does something like this actually get
> >> used downstream?
> >>
> >> For test-patch, in particular, I have a few thoughts on this:
> >>
> >> Short term:
> >>
> >> * Projects that want to move RIGHT NOW would modify their Jenkins
> >> jobs to checkout from the Yetus repo (preferably at a well known tag or
> >> branch) in one directory and their project repo in another directory.  Then
> >> it’s just a matter of passing the correct flags to test-patch.  This is
> >> pretty much how I’ve been personally running test-patch for about 6 months
> >> now. Under Jenkins, we’ve seen this work with NiFi (incubating) already.
> >>
> >> * Create a stub version of test-patch that projects could check
> >> into their repo, replacing the existing test-patch.  This stub version
> >> would git clone from either ASF or github and then execute test-patch
> >> accordingly on demand.  With the correct smarts, it could make sure it has
> >> a cached version to prevent continual clones.
> >>
> >> Longer term:
> >>
> >> * I’ve been toying with the idea of (ab)using Java repos and
> >> packaging as a transportation layer, either in addition or in combination
> >> with something like a maven plugin.  Something like this would clearly be
> >> better for offline usage and/or to lower the network tra

Re: [DISCUSS] More Maintenance Releases

2015-06-22 Thread Colin P. McCabe
+1 for creating a maintenance release with a more rapid release
cadence and more effort put into stability backports.  I think this
would really be great for the project.

Colin

On Mon, Jun 22, 2015 at 2:43 AM, Akira AJISAKA
 wrote:
> Hi everyone,
>
> At Hadoop Summit, I joined the HDFS BoF and heard from Jason Lowe that Apache
> Hadoop developers at Yahoo!, Twitter, and other non-distributors work very
> hard to maintain Hadoop by cherry-picking patches to their own branches.
>
> I want to share the work with the community. If we can cherry-pick bug fix
> patches and have more maintenance releases, it would benefit not only
> users but also developers who work very hard to stabilize their own
> branches.
>
> To have more maintenance releases, I propose two changes:
>
> * Major/Minor/Trivial bug fixes can be cherry-picked
> * (Roughly) Monthly maintenance release
>
> I would like to start the work from branch-2.6. If the change is
> accepted by the community, I'm willing to work on the maintenance as a
> release manager.
>
> Best regards,
> Akira


Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)

2015-06-22 Thread Colin P. McCabe
+1 for making this a separate project.  We've always struggled with a
lot of forks of the test-patch code and perhaps this project can help
create something that works well for multiple projects.

Bypassing the incubator seems kind of weird (I didn't know that was an
option) but I will let other people with more experience in the ASF
comment on that.

You mentioned that "most of our project will be focused on shell
scripts" I guess based on the existing test-patch code.  Allen did a
lot of good work in this area recently.  I am curious if you evaluated
languages such as Python or Node.js for this use-case.  Shell scripts
can get a little... tricky beyond a certain size.  On the other hand,
if we are standardizing on shell, which shell and which version?
Perhaps bash 3.5+?
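
As a small illustration of the "which shell and which version" question, the Python sketch below parses the first line of `bash --version` output and compares it with a minimum. The minimum of 3.2 is a placeholder assumption, not something the thread decides (bash releases went from 3.2 to 4.0), and the helper names are hypothetical.

```python
import re
from typing import Optional, Tuple

# Hypothetical minimum; the thread does not settle on a version.
MIN_BASH = (3, 2)


def parse_bash_version(version_output: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from the output of `bash --version`."""
    match = re.search(r"version (\d+)\.(\d+)", version_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def bash_is_supported(version_output: str,
                      minimum: Tuple[int, int] = MIN_BASH) -> bool:
    """True when the reported version is at least `minimum`."""
    version = parse_bash_version(version_output)
    return version is not None and version >= minimum


if __name__ == "__main__":
    # Sample text in the format bash prints; not captured from a real host.
    sample = "GNU bash, version 4.3.11(1)-release (x86_64-pc-linux-gnu)"
    print(parse_bash_version(sample), bash_is_supported(sample))
```

A check of this kind at startup would let a tool fail early with a clear message instead of breaking midway on an unsupported shell feature.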

Also, what will be the mechanism for customizing this for each
project?  Ideally the customizations needed would be small so we could
share the most code.

cheers,
Colin


On Tue, Jun 16, 2015 at 7:55 PM, Sean Busbey  wrote:
> I'm going to try responding to several things at once here, so apologies if
> I miss anyone and sorry for the long email. :)
>
>
> On Tue, Jun 16, 2015 at 3:44 PM, Steve Loughran 
> wrote:
>
>> I think it's good to have a general build/test process projects can share,
>> so +1 to pulling it out. You should get help from others.
>>
>> regarding incubation, it is a lot of work, especially for something that's
>> more of an in-house tool than an artifact to release and redistribute.
>>
>> You can't just use apache labs or the build project's repo to work on this?
>>
>> if you do want to incubate, we may want to nominate the hadoop project as
>> the monitoring PMC, rather than incubator@.
>>
>> -steve
>>
>>
> Important note: we're proposing a board resolution that would directly pull
> this code base out into a new TLP; there'd be no incubator, we'd just
> continue building community and start making releases.
>
> The proposed PMC believes the tooling we're talking about has direct
> applicability to projects well outside of the ASF. Lots of other open
> source projects run on community contributions and have a general need for
> better QA tools. Given that problem set and the presence of a community
> working to solve it, there's no reason this needs to be treated as an
> in-house build project. We certainly want to be useful to ASF projects and
> getting them on-board given our current optimization for ASF infra will
> certainly be easier, but we're not limited to that (and our current
> prerequisites, a CI tool and jira or github, are pretty broadly available).
>
>
> On Tue, Jun 16, 2015 at 10:13 AM, Nick Dimiduk  wrote:
>
>>
>> Since we're tossing out names, how about Apache Bootstrap? It's a
>> meta-project to help other projects get off the ground, after all.
>>
>
>
> There's already a web development framework named Bootstrap[1]. It's also
> used by several ASF projects, so I think it best to avoid the confusion.
>
> The name is, of course, up to the proposed PMC. As a bit of background, the
> current name Yetus fulfills Allen's desire to have something shell related
> and my desire to have a project that starts with Y (there are currently no
> ASF projects that start with Y). The universe of names that fill in these
> two is very small, AFAICT. I did a brief suitability search and didn't find
> any blockers.
>
>
>  On Tue, Jun 16, 2015 at 11:59 AM, Allen Wittenauer 
>  wrote:
>
>>
>> Since a couple of people have brought it up:
>>
>> I think the release question is probably one of the big question
>> marks.  Other than tar balls, how does something like this actually get
>> used downstream?
>>
>> For test-patch, in particular, I have a few thoughts on this:
>>
>> Short term:
>>
>> * Projects that want to move RIGHT NOW would modify their Jenkins
>> jobs to checkout from the Yetus repo (preferably at a well known tag or
>> branch) in one directory and their project repo in another directory.  Then
>> it’s just a matter of passing the correct flags to test-patch.  This is
>> pretty much how I’ve been personally running test-patch for about 6 months
>> now. Under Jenkins, we’ve seen this work with NiFi (incubating) already.
>>
>> * Create a stub version of test-patch that projects could check
>> into their repo, replacing the existing test-patch.  This stub version
>> would git clone from either ASF or github and then execute test-patch
>> accordingly on demand.  With the correct smarts, it could make sure it has
>> a cached version to prevent continual clones.
>>
>> Longer term:
>>
>> * I’ve been toying with the idea of (ab)using Java repos and
>> packaging as a transportation layer, either in addition or in combination
>> with something like a maven plugin.  Something like this would clearly be
>> better for offline usage and/or to lower the network traffic.
>>
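
The "stub version" idea above (clone on demand, reuse a cached copy) can be sketched roughly as follows in Python. Everything specific here is an assumption: the thread names no repository URL, tag, or cache policy, so `REPO_URL`, `REF`, `MAX_AGE_SECONDS`, and the helper names are placeholders. The sketch only decides which git command would be needed; it does not run it.

```python
import os
import tempfile
import time
from typing import List, Optional

# Hypothetical values: the thread gives no URL, tag, or cache lifetime.
REPO_URL = "https://example.invalid/yetus.git"
REF = "some-release-tag"
MAX_AGE_SECONDS = 24 * 60 * 60


def cache_is_fresh(cache_dir: str, max_age: float = MAX_AGE_SECONDS,
                   now: Optional[float] = None) -> bool:
    """True if cache_dir holds a clone that was updated recently enough."""
    marker = os.path.join(cache_dir, ".git")
    if not os.path.isdir(marker):
        return False
    now = time.time() if now is None else now
    return (now - os.path.getmtime(marker)) <= max_age


def clone_command(cache_dir: str, repo_url: str = REPO_URL,
                  ref: str = REF) -> List[str]:
    """Build the git command that fetches a shallow copy at a known ref."""
    return ["git", "clone", "--depth", "1", "--branch", ref,
            repo_url, cache_dir]


def plan(cache_dir: str) -> List[str]:
    """Return the command to run first, or [] when the cache can be reused."""
    return [] if cache_is_fresh(cache_dir) else clone_command(cache_dir)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as cache:
        # Nothing is cached yet, so a clone command is printed.
        print(plan(cache))
```

Pinning the clone to a well-known tag or branch, as the message suggests, keeps every run of the stub on the same version of the tooling.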
>
> It's important that the project follow ASF guidelines on publishing
> releases[2]. So long as we publish rel

Re: [DISCUSS] More Maintenance Releases

2015-06-22 Thread Vinayakumar B
+1 for the idea of maintenance releases.

Considering the amount of code changes made in trunk and branch-2,
cherry-picking may not be easy or straightforward for every issue.

I would love to help in cherry-picking the fixes and reviewing them.

I would also love to help with the release process.


Regards,
Vinay

On Mon, Jun 22, 2015 at 9:49 PM, Allen Wittenauer  wrote:

>
> If 2.6 is the target, someone will have to verify that any
> cherry-picked patches actually work with JDK6 since the PMC voted to
> officially kill backward compatibility in a minor release. It’s going to be
> easier and probably smarter to fix 2.7 if that’s really desired. [1]
>
> Frankly, I’d rather see effort spent on stabilizing trunk and
> ditching the now broken branch-2.  We’re approaching the 4 year anniversary
> of 0.23.0’s release (which later begat 2.x, which is already past the 3
> year mark).  It’s hard to claim health when it’s been so long since a branch
> off of trunk was cut and turned into something official.
>
> [1] Kengo and I are hard at work getting multiJDK testing working in
> Yetus, but it’s not quite ready for prime time. :( It could certainly help
> here, but… it’s not very stable yet.
>
> On Jun 22, 2015, at 7:50 AM, Karthik Kambatla  wrote:
>
> > Thanks for starting this thread, Akira.
> >
> > +1 to more maintenance releases. More stable upstream releases avoid
> > duplicating cherry-pick work across consumers/vendors, and show the
> > maturity of the project to users.
> >
> > I see value in backporting blocker/critical issues, but have mixed feelings
> > about doing the same for major/minor/trivial issues. IMO, every commit has
> > non-zero potential to introduce other bugs. Depending on the kind of fix
> > (say, documentation), it might be okay to include these non-critical fixes.
> > One approach could be to allow all bug fixes for 2.x.1, blocker/critical
> > for 2.x.2, blocker for 2.x.3 (or something along those lines) to ensure
> > increasing stability of maintenance releases?
> >
> > I am also +1 to any committer picking up RM duties for a maintenance
> > release. It is healthy to have more people participate in the release
> > process, so long as we have some method to maintenance release madness.
> >
> > A committer (who is not yet a PMC member) could be a Release Manager, but
> > his vote is not binding for the release. I RM-ed the 2.5.x releases as a
> > committer. RM-ing a release and voting non-binding could be a good way to
> > remind the PMC to include the committer in PMC :)
> >
> > Cheers
> > Karthik
> >
> > On Mon, Jun 22, 2015 at 4:36 AM, Tsuyoshi Ozawa  wrote:
> >
> >> Hi Akira,
> >>
> >> Thank you for starting an interesting topic. +1 on the idea of More
> >> Maintenance Releases for old branches. It would be good if this
> >> activity is more coupled with Apache Yetus for users.
> >>
> >> BTW, I don't know whether a committer who is not a PMC member can be a
> >> release manager. Does anyone know about this?  It's described in detail as
> >> follows: http://hadoop.apache.org/bylaws#Decision+Making
> >>
> >>> Release Manager
> >>> A Release Manager (RM) is a committer who volunteers to produce a
> >>> Release Candidate according to HowToRelease.
> >>>
> >>> Project Management Committee
> >>> Deciding what is distributed as products of the Apache Hadoop project.
> >>> In particular all releases must be approved by the PMC
> >>
> >> Thanks,
> >> - Tsuyoshi
> >>
> >> On Mon, Jun 22, 2015 at 6:43 PM, Akira AJISAKA
> >>  wrote:
> >>> Hi everyone,
> >>>
> >>> At Hadoop Summit, I joined the HDFS BoF and heard from Jason Lowe that
> >>> Apache Hadoop developers at Yahoo!, Twitter, and other non-distributors
> >>> work very hard to maintain Hadoop by cherry-picking patches to their own
> >>> branches.
> >>>
> >>> I want to share the work with the community. If we can cherry-pick bug
> >>> fix patches and have more maintenance releases, it would benefit not only
> >>> users but also developers who work very hard to stabilize their own
> >>> branches.
> >>>
> >>> To have more maintenance releases, I propose two changes:
> >>>
> >>> * Major/Minor/Trivial bug fixes can be cherry-picked
> >>> * (Roughly) Monthly maintenance release
> >>>
> >>> I would like to start the work from branch-2.6. If the change is
> >>> accepted by the community, I'm willing to work on the maintenance as a
> >>> release manager.
> >>>
> >>> Best regards,
> >>> Akira
> >>
> >
> >
> >
> > --
> > Karthik Kambatla
> > Software Engineer, Cloudera Inc.
> > 
> > http://five.sentenc.es
>
>


Re: [DISCUSS] More Maintenance Releases

2015-06-22 Thread Allen Wittenauer

If 2.6 is the target, someone will have to verify that any 
cherry-picked patches actually work with JDK6 since the PMC voted to officially 
kill backward compatibility in a minor release. It’s going to be easier and 
probably smarter to fix 2.7 if that’s really desired. [1]
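
One narrow part of "verify that patches actually work with JDK6" can be automated by checking the bytecode level of compiled classes. The Python sketch below reads the class-file header, where major version 50 corresponds to Java 6 and 51 to Java 7. It is only an illustration: it does not detect use of newer library APIs, and the helper names are hypothetical.

```python
import struct
from typing import Tuple

# Class-file major versions: 50 is Java 6, 51 is Java 7.
JAVA6_MAJOR = 50


def class_file_version(data: bytes) -> Tuple[int, int]:
    """Return (major, minor) read from the first 8 bytes of a .class file."""
    if len(data) < 8:
        raise ValueError("too short to be a class file")
    # Header layout: u4 magic, u2 minor_version, u2 major_version (big-endian).
    magic, minor, major = struct.unpack(">IHH", data[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    return major, minor


def runs_on_java6(data: bytes) -> bool:
    """True when the bytecode level can be loaded by a Java 6 JVM."""
    major, _ = class_file_version(data)
    return major <= JAVA6_MAJOR


if __name__ == "__main__":
    java7_header = bytes.fromhex("cafebabe00000033")
    print(class_file_version(java7_header), runs_on_java6(java7_header))
```

Running the actual test suite on a JDK6 runtime remains the real verification; a header check like this only catches classes compiled for a newer target.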

Frankly, I’d rather see effort spent on stabilizing trunk and ditching 
the now broken branch-2.  We’re approaching the 4 year anniversary of 0.23.0’s 
release (which later begat 2.x, which is already past the 3 year mark).  It’s 
hard to claim health when it’s been so long since a branch off of trunk was cut 
and turned into something official.  

[1] Kengo and I are hard at work getting multiJDK testing working in Yetus, but 
it’s not quite ready for prime time. :( It could certainly help here, but… it’s 
not very stable yet.

On Jun 22, 2015, at 7:50 AM, Karthik Kambatla  wrote:

> Thanks for starting this thread, Akira.
> 
> +1 to more maintenance releases. More stable upstream releases avoid
> duplicating cherry-pick work across consumers/vendors, and show the
> maturity of the project to users.
> 
> I see value in backporting blocker/critical issues, but have mixed feelings
> about doing the same for major/minor/trivial issues. IMO, every commit has
> non-zero potential to introduce other bugs. Depending on the kind of fix
> (say, documentation), it might be okay to include these non-critical fixes.
> One approach could be to allow all bug fixes for 2.x.1, blocker/critical
> for 2.x.2, blocker for 2.x.3 (or something along those lines) to ensure
> increasing stability of maintenance releases?
> 
> I am also +1 to any committer picking up RM duties for a maintenance
> release. It is healthy to have more people participate in the release
> process, so long as we have some method to maintenance release madness.
> 
> A committer (who is not yet a PMC member) could be a Release Manager, but
> his vote is not binding for the release. I RM-ed the 2.5.x releases as a
> committer. RM-ing a release and voting non-binding could be a good way to
> remind the PMC to include the committer in PMC :)
> 
> Cheers
> Karthik
> 
> On Mon, Jun 22, 2015 at 4:36 AM, Tsuyoshi Ozawa  wrote:
> 
>> Hi Akira,
>> 
>> Thank you for starting an interesting topic. +1 on the idea of More
>> Maintenance Releases for old branches. It would be good if this
>> activity is more coupled with Apache Yetus for users.
>> 
>> BTW, I don't know whether a committer who is not a PMC member can be a
>> release manager. Does anyone know about this?  It's described in detail as
>> follows: http://hadoop.apache.org/bylaws#Decision+Making
>> 
>>> Release Manager
>>> A Release Manager (RM) is a committer who volunteers to produce a
>>> Release Candidate according to HowToRelease.
>>> 
>>> Project Management Committee
>>> Deciding what is distributed as products of the Apache Hadoop project.
>>> In particular all releases must be approved by the PMC
>> 
>> Thanks,
>> - Tsuyoshi
>> 
>> On Mon, Jun 22, 2015 at 6:43 PM, Akira AJISAKA
>>  wrote:
>>> Hi everyone,
>>> 
>>> At Hadoop Summit, I joined the HDFS BoF and heard from Jason Lowe that Apache
>>> Hadoop developers at Yahoo!, Twitter, and other non-distributors work very
>>> hard to maintain Hadoop by cherry-picking patches to their own branches.
>>> 
>>> I want to share the work with the community. If we can cherry-pick bug fix
>>> patches and have more maintenance releases, it would benefit not only
>>> users but also developers who work very hard to stabilize their own
>>> branches.
>>> 
>>> To have more maintenance releases, I propose two changes:
>>> 
>>> * Major/Minor/Trivial bug fixes can be cherry-picked
>>> * (Roughly) Monthly maintenance release
>>> 
>>> I would like to start the work from branch-2.6. If the change is
>>> accepted by the community, I'm willing to work on the maintenance as a
>>> release manager.
>>> 
>>> Best regards,
>>> Akira
>> 
> 
> 
> 
> -- 
> Karthik Kambatla
> Software Engineer, Cloudera Inc.
> 
> http://five.sentenc.es



Re: [DISCUSS] More Maintenance Releases

2015-06-22 Thread Karthik Kambatla
Thanks for starting this thread, Akira.

+1 to more maintenance releases. More stable upstream releases avoid
duplicating cherry-pick work across consumers/vendors, and show the
maturity of the project to users.

I see value in backporting blocker/critical issues, but have mixed feelings
about doing the same for major/minor/trivial issues. IMO, every commit has
non-zero potential to introduce other bugs. Depending on the kind of fix
(say, documentation), it might be okay to include these non-critical fixes.
One approach could be to allow all bug fixes for 2.x.1, blocker/critical
for 2.x.2, blocker for 2.x.3 (or something along those lines) to ensure
increasing stability of maintenance releases?
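
As an illustration only, the tiered idea above could be written down as a
small lookup. This is a sketch in Python under stated assumptions: the
priority names mirror JIRA's, while ALLOWED_BY_PATCH_LEVEL, may_backport, and
the blocker-only fallback for releases after 2.x.3 are hypothetical and not
project policy.

```python
# Hypothetical encoding of the tiered backport idea; not project policy.
PRIORITIES = ("trivial", "minor", "major", "critical", "blocker")

ALLOWED_BY_PATCH_LEVEL = {
    1: set(PRIORITIES),          # 2.x.1: all bug fixes
    2: {"blocker", "critical"},  # 2.x.2: blocker/critical only
    3: {"blocker"},              # 2.x.3: blocker only
}


def may_backport(patch_level, priority):
    """Return True if a bug fix of this priority fits release 2.x.<patch_level>."""
    if patch_level < 1:
        raise ValueError("maintenance releases start at patch level 1")
    priority = priority.lower()
    if priority not in PRIORITIES:
        raise ValueError("unknown priority: " + priority)
    # Assumption: releases after 2.x.3 stay blocker-only.
    allowed = ALLOWED_BY_PATCH_LEVEL.get(patch_level, {"blocker"})
    return priority in allowed


print(may_backport(1, "Minor"))
print(may_backport(2, "Major"))
print(may_backport(3, "Blocker"))
```

A table like this makes the "increasing stability" property easy to check:
each later patch level accepts a subset of what the previous one accepted.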

I am also +1 to any committer picking up RM duties for a maintenance
release. It is healthy to have more people participate in the release
process, so long as we have some method to the maintenance release madness.

A committer (who is not yet a PMC member) could be a Release Manager, but
their vote is not binding for the release. I RM-ed the 2.5.x releases as a
committer. RM-ing a release and voting non-binding could be a good way to
remind the PMC to include the committer in the PMC :)

Cheers
Karthik

On Mon, Jun 22, 2015 at 4:36 AM, Tsuyoshi Ozawa  wrote:

> Hi Akira,
>
> Thank you for starting this interesting topic. +1 on the idea of more
> maintenance releases for old branches. It would be good if this
> activity were more closely coupled with Apache Yetus for users.
>
> BTW, I don't know whether a committer who is not a PMC member can be a
> release manager. Does anyone know about this? The bylaws describe the
> roles as follows: http://hadoop.apache.org/bylaws#Decision+Making
>
> > Release Manager
> > A Release Manager (RM) is a committer who volunteers to produce a
> > Release Candidate according to HowToRelease.
> >
> > Project Management Committee
> > Deciding what is distributed as products of the Apache Hadoop project.
> > In particular all releases must be approved by the PMC
>
> Thanks,
> - Tsuyoshi
>
> On Mon, Jun 22, 2015 at 6:43 PM, Akira AJISAKA
>  wrote:
> > Hi everyone,
> >
> > At Hadoop Summit, I joined the HDFS BoF and heard from Jason Lowe that
> > Apache Hadoop developers at Yahoo!, Twitter, and other non-distributors
> > work very hard to maintain Hadoop by cherry-picking patches to their own
> > branches.
> >
> > I want to share that work with the community. If we can cherry-pick
> > bug-fix patches and have more maintenance releases, it would benefit not
> > only users but also developers who work very hard to stabilize their own
> > branches.
> >
> > To have more maintenance releases, I propose two changes:
> >
> > * Major/Minor/Trivial bug fixes can be cherry-picked
> > * (Roughly) Monthly maintenance release
> >
> > I would like to start the work from branch-2.6. If the change is
> > accepted by the community, I'm willing to work on the maintenance
> > releases as a release manager.
> >
> > Best regards,
> > Akira
>



-- 
Karthik Kambatla
Software Engineer, Cloudera Inc.

http://five.sentenc.es


Re: [DISCUSS] More Maintenance Releases

2015-06-22 Thread Sean Busbey
More maintenance releases would be excellent.


If y'all are going to make more releases on the 2.6 line, please consider
backporting HADOOP-11710 as without it HBase is unusable on top of HDFS
encryption. It's been inconvenient that the fix is only available in a
non-production release line.

-Sean

On Mon, Jun 22, 2015 at 6:36 AM, Tsuyoshi Ozawa  wrote:

> Hi Akira,
>
> Thank you for starting this interesting topic. +1 on the idea of more
> maintenance releases for old branches. It would be good if this
> activity were more closely coupled with Apache Yetus for users.
>
> BTW, I don't know whether a committer who is not a PMC member can be a
> release manager. Does anyone know about this? The bylaws describe the
> roles as follows: http://hadoop.apache.org/bylaws#Decision+Making
>
> > Release Manager
> > A Release Manager (RM) is a committer who volunteers to produce a
> > Release Candidate according to HowToRelease.
> >
> > Project Management Committee
> > Deciding what is distributed as products of the Apache Hadoop project.
> > In particular all releases must be approved by the PMC
>
> Thanks,
> - Tsuyoshi
>
> On Mon, Jun 22, 2015 at 6:43 PM, Akira AJISAKA
>  wrote:
> > Hi everyone,
> >
> > At Hadoop Summit, I joined the HDFS BoF and heard from Jason Lowe that
> > Apache Hadoop developers at Yahoo!, Twitter, and other non-distributors
> > work very hard to maintain Hadoop by cherry-picking patches to their own
> > branches.
> >
> > I want to share that work with the community. If we can cherry-pick
> > bug-fix patches and have more maintenance releases, it would benefit not
> > only users but also developers who work very hard to stabilize their own
> > branches.
> >
> > To have more maintenance releases, I propose two changes:
> >
> > * Major/Minor/Trivial bug fixes can be cherry-picked
> > * (Roughly) Monthly maintenance release
> >
> > I would like to start the work from branch-2.6. If the change is
> > accepted by the community, I'm willing to work on the maintenance
> > releases as a release manager.
> >
> > Best regards,
> > Akira
>



-- 
Sean


[jira] [Created] (HADOOP-12108) Erroneous behavior of use of wildcard character ( * ) in ls command of hdfs

2015-06-22 Thread Aman Goyal (JIRA)
Aman Goyal created HADOOP-12108:
---

 Summary: Erroneous behavior of use of wildcard character ( * ) in ls command of hdfs
 Key: HADOOP-12108
 URL: https://issues.apache.org/jira/browse/HADOOP-12108
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Aman Goyal
Priority: Critical


If you have the following paths in your LOCAL file system:
/data/hadoop/sample/00/contents1.txt
/data/hadoop/sample/01/contents2.txt

and the following paths in HDFS:
/data/hadoop/sample/00/contents1.txt
/data/hadoop/sample/01/contents2.txt
/data/hadoop/sample/02/contents3.txt

and you run the following hdfs ls command:
hdfs dfs -ls -R /data/hadoop/sample/*

then the paths that are printed refer to the local paths, and only the
00 and 01 directories get listed.

This happens only when the wildcard (*) character is used in the input paths.
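
One possible explanation, offered here as an assumption rather than a
confirmed diagnosis, is that an unquoted * is expanded by the local shell
against the local file system before the hdfs command ever runs. The Python
sketch below mimics that default shell behavior with a temporary directory
standing in for /data/hadoop/sample; expand_like_shell is a hypothetical
helper, not part of Hadoop.

```python
import glob
import os
import tempfile


def expand_like_shell(pattern):
    """Mimic a POSIX shell's default handling of an unquoted glob:
    return the sorted local matches, or the pattern itself if none match."""
    matches = sorted(glob.glob(pattern))
    return matches if matches else [pattern]


# Recreate the LOCAL layout from the report under a temporary root.
root = tempfile.mkdtemp()
for sub in ("00", "01"):
    os.makedirs(os.path.join(root, "sample", sub))

# Only 00 and 01 exist locally, so only those two paths would be handed
# to the command as arguments.
local_args = expand_like_shell(os.path.join(root, "sample", "*"))
print([os.path.basename(p) for p in local_args])

# With no local match, the pattern is passed through unchanged.
unmatched = os.path.join(root, "missing", "*")
passthrough_args = expand_like_shell(unmatched)
print(passthrough_args == [unmatched])
```

If this is the cause, quoting the pattern (for example
'/data/hadoop/sample/*') should stop the local shell from expanding it, so
that the pattern itself reaches the hdfs command.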





Re: [DISCUSS] More Maintenance Releases

2015-06-22 Thread Tsuyoshi Ozawa
Hi Akira,

Thank you for starting this interesting topic. +1 on the idea of more
maintenance releases for old branches. It would be good if this
activity were more closely coupled with Apache Yetus for users.

BTW, I don't know whether a committer who is not a PMC member can be a
release manager. Does anyone know about this? The bylaws describe the
roles as follows: http://hadoop.apache.org/bylaws#Decision+Making

> Release Manager
> A Release Manager (RM) is a committer who volunteers to produce a Release 
> Candidate according to HowToRelease.
>
> Project Management Committee
> Deciding what is distributed as products of the Apache Hadoop project. In 
> particular all releases must be approved by the PMC

Thanks,
- Tsuyoshi

On Mon, Jun 22, 2015 at 6:43 PM, Akira AJISAKA
 wrote:
> Hi everyone,
>
> At Hadoop Summit, I joined the HDFS BoF and heard from Jason Lowe that
> Apache Hadoop developers at Yahoo!, Twitter, and other non-distributors work
> very hard to maintain Hadoop by cherry-picking patches to their own branches.
>
> I want to share that work with the community. If we can cherry-pick bug-fix
> patches and have more maintenance releases, it would benefit not only users
> but also developers who work very hard to stabilize their own branches.
>
> To have more maintenance releases, I propose two changes:
>
> * Major/Minor/Trivial bug fixes can be cherry-picked
> * (Roughly) Monthly maintenance release
>
> I would like to start the work from branch-2.6. If the change is
> accepted by the community, I'm willing to work on the maintenance
> releases as a release manager.
>
> Best regards,
> Akira


[DISCUSS] More Maintenance Releases

2015-06-22 Thread Akira AJISAKA

Hi everyone,

At Hadoop Summit, I joined the HDFS BoF and heard from Jason Lowe that
Apache Hadoop developers at Yahoo!, Twitter, and other non-distributors
work very hard to maintain Hadoop by cherry-picking patches to their
own branches.

I want to share that work with the community. If we can cherry-pick
bug-fix patches and have more maintenance releases, it would benefit not
only users but also developers who work very hard to stabilize their
own branches.

To have more maintenance releases, I propose two changes:

* Major/Minor/Trivial bug fixes can be cherry-picked
* (Roughly) Monthly maintenance release

I would like to start the work from branch-2.6. If the change is
accepted by the community, I'm willing to work on the maintenance
releases as a release manager.


Best regards,
Akira