Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)

2015-06-16 Thread Kengo Seki
+1. From a user's viewpoint, the recent improvements to test-patch have made
my work really efficient:
for example, quick feedback from skipping unnecessary tests, automated
build environment setup thanks to Docker support, automated patch download
from JIRA, automated shellcheck and whitespace checks, etc.
I believe these ideas are worth spreading, as a TLP, to other projects that
face the same problems, such as a long QA process.

2015-06-16 15:08 GMT+09:00 Chris Douglas cdoug...@apache.org:

 +1 A separate project sounds great. It'd be great to have more
 standard tooling across the ecosystem.

 As a practical matter, how should projects consume releases? -C

 On Mon, Jun 15, 2015 at 4:47 PM, Sean Busbey bus...@cloudera.com wrote:
  Oof. I had meant to push on this again but life got in the way and now the
  June board meeting is upon us. Sorry everyone. In the event that this ends
  up contentious, hopefully one of the copied communities can give us a
  branch to work in.

  I know everyone is busy, so here's the short version of this email: I'd
  like to move some of the code currently in Hadoop (test-patch) into a new
  TLP focused on QA tooling. I'm not sure what the best format for priming
  this conversation is. ORC filled in the incubator project proposal
  template, but I'm not sure how much that confused the issue. So to start,
  I'll just write what I'm hoping we can accomplish in general terms here.

  All software development projects that are community based (that is,
  accepting outside contributions) face a common QA problem for vetting
  incoming contributions. Hadoop is fortunate enough to be sufficiently
  popular that the weight of the problem drove tool development (i.e.
  test-patch). That tool is generalizable enough that a bunch of other TLPs
  have adopted their own forks. Unfortunately, in most projects this kind of
  QA work is an enabler rather than a primary concern, so the tooling is
  often worked on ad hoc and few improvements are shared across projects.
  Since the tooling itself is never a primary concern, any improvement made
  is rarely reused outside of ASF projects.

  Over the last couple of months a few of us have been working on
  generalizing the tooling present in the Hadoop code base (because it was
  the most mature out of all those in the various projects) and it's reached
  a point where we think we can start bringing on other downstream users.
  This means we need to start establishing things like a release cadence and
  to grow the new contributors we have to handle more project
  responsibility. Personally, I think that means it's time to move out from
  under Hadoop to drive things as our own community. Eventually, I hope the
  community can help draw in a group of folks traditionally underrepresented
  in ASF projects, namely QA and operations folks.

  I think test-patch by itself has enough scope to justify a project. Having
  a solid set of build tools that are customizable to fit the norms of
  different software communities is a bunch of work. Making it work well in
  both the context of automated test systems like Jenkins and for individual
  developers is even more work. We could easily also take over maintenance of
  things like shelldocs, since test-patch is the primary consumer of that
  currently but it's generally useful tooling.

  In addition to test-patch, I think the proposed project has some future
  growth potential. Given some adoption of test-patch to prove utility, the
  project could build on the ties it makes to start building tools to help
  projects do their own longer-run testing. Note that I'm talking about the
  tools to build QA processes and not a particular set of tested components.
  Specifically, I think the ChaosMonkey work that's in HBase should be
  generalizable as a fault injection framework (either based on that code or
  something like it). Doing this for arbitrary software is obviously very
  difficult, and a part of easing that will be to make (and then favor)
  tooling to allow projects to have operational glue that looks the same.
  Namely, the shell work that's been done in hadoop-functions.sh would be a
  great foundational layer that could bring good daemon handling practices to
  a whole slew of software projects. In the event that these frameworks and
  tools get adopted by parts of the Hadoop ecosystem, that could make the job
  of, e.g., Bigtop substantially easier.

  I've reached out to a few folks who have been involved in the current
  test-patch work or expressed interest in helping out on getting it used in
  other projects. Right now, the proposed PMC would be (alphabetical by last
  name):

  * Andrew Bayer (ASF member, incubator pmc, bigtop pmc, flume pmc, jclouds
  pmc, sqoop pmc, all around Jenkins expert)
  * Sean Busbey (ASF member, accumulo pmc, hbase pmc)
  * Nick Dimiduk (hbase pmc, phoenix pmc)
  * Chris Nauroth (ASF member, incubator pmc, hadoop pmc)
  * Andrew Purtell  (ASF 

Re: Protocol Buffers version

2015-06-16 Thread Allen Wittenauer

On Jun 16, 2015, at 2:54 AM, Steve Loughran ste...@hortonworks.com wrote:

 
 One reason at least: PB 2.5.0 has no support for Solaris SPARC. 2.6.1 does.
 
 to be ruthless, that's not enough reason to upgrade branch-2, due to the 
 transitive pain it causes all the way down.

Not in branch-2, but certainly in trunk.  

Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)

2015-06-16 Thread Nick Dimiduk
I think this is a great idea! Having just gone through the process of
getting Phoenix up to speed with precommits, it would be really nice to
have a place to go other than forking/hacking someone else's work. For the
same project, I recently integrated its first daemon service. This meant
adding a bunch of service-y Python code (multi-platform support is required)
which I only sort of trust. Again, it would be great to have an explicit
resource for this kind of thing in the ecosystem. I expect Calcite and Kylin
will be following along shortly.

Since we're tossing out names, how about Apache Bootstrap? It's a
meta-project to help other projects get off the ground, after all.

-n

On Monday, June 15, 2015, Sean Busbey bus...@cloudera.com wrote:

 Oof. I had meant to push on this again but life got in the way and now the
 June board meeting is upon us. Sorry everyone. In the event that this ends
 up contentious, hopefully one of the copied communities can give us a
 branch to work in.

 I know everyone is busy, so here's the short version of this email: I'd
 like to move some of the code currently in Hadoop (test-patch) into a new
 TLP focused on QA tooling. I'm not sure what the best format for priming
 this conversation is. ORC filled in the incubator project proposal
 template, but I'm not sure how much that confused the issue. So to start,
 I'll just write what I'm hoping we can accomplish in general terms here.

 All software development projects that are community based (that is,
 accepting outside contributions) face a common QA problem for vetting
 incoming contributions. Hadoop is fortunate enough to be sufficiently
 popular that the weight of the problem drove tool development (i.e.
 test-patch). That tool is generalizable enough that a bunch of other TLPs
 have adopted their own forks. Unfortunately, in most projects this kind of
 QA work is an enabler rather than a primary concern, so the tooling is
 often worked on ad hoc and few improvements are shared across projects.
 Since the tooling itself is never a primary concern, any improvement made
 is rarely reused outside of ASF projects.

 Over the last couple of months a few of us have been working on generalizing
 the tooling present in the Hadoop code base (because it was the most mature
 out of all those in the various projects) and it's reached a point where we
 think we can start bringing on other downstream users. This means we need
 to start establishing things like a release cadence and to grow the new
 contributors we have to handle more project responsibility. Personally, I
 think that means it's time to move out from under Hadoop to drive things as
 our own community. Eventually, I hope the community can help draw in a
 group of folks traditionally underrepresented in ASF projects, namely QA
 and operations folks.

 I think test-patch by itself has enough scope to justify a project. Having
 a solid set of build tools that are customizable to fit the norms of
 different software communities is a bunch of work. Making it work well in
 both the context of automated test systems like Jenkins and for individual
 developers is even more work. We could easily also take over maintenance of
 things like shelldocs, since test-patch is the primary consumer of that
 currently but it's generally useful tooling.

 In addition to test-patch, I think the proposed project has some future
 growth potential. Given some adoption of test-patch to prove utility, the
 project could build on the ties it makes to start building tools to help
 projects do their own longer-run testing. Note that I'm talking about the
 tools to build QA processes and not a particular set of tested components.
 Specifically, I think the ChaosMonkey work that's in HBase should be
 generalizable as a fault injection framework (either based on that code or
 something like it). Doing this for arbitrary software is obviously very
 difficult, and a part of easing that will be to make (and then favor)
 tooling to allow projects to have operational glue that looks the same.
 Namely, the shell work that's been done in hadoop-functions.sh would be a
 great foundational layer that could bring good daemon handling practices to
 a whole slew of software projects. In the event that these frameworks and
 tools get adopted by parts of the Hadoop ecosystem, that could make the job
 of, e.g., Bigtop substantially easier.

 I've reached out to a few folks who have been involved in the current
 test-patch work or expressed interest in helping out on getting it used in
 other projects. Right now, the proposed PMC would be (alphabetical by last
 name):

 * Andrew Bayer (ASF member, incubator pmc, bigtop pmc, flume pmc, jclouds
 pmc, sqoop pmc, all around Jenkins expert)
 * Sean Busbey (ASF member, accumulo pmc, hbase pmc)
 * Nick Dimiduk (hbase pmc, phoenix pmc)
 * Chris Nauroth (ASF member, incubator pmc, hadoop pmc)
 * Andrew Purtell  (ASF member, incubator pmc, bigtop pmc, hbase pmc,
 phoenix pmc)
 * Allen Wittenauer 

Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)

2015-06-16 Thread Tsuyoshi Ozawa
+1 on the idea.

It would be great if tests for dependency management, multiple
branches, and distributed environments could be done in the project. One
discussion point is how Hadoop depends on Yetus, including the
development cycles. It's a good time to rethink what can be done to
make Hadoop better.

Thanks,
- Tsuyoshi

On Tue, Jun 16, 2015 at 8:47 AM, Sean Busbey bus...@cloudera.com wrote:
 Oof. I had meant to push on this again but life got in the way and now the
 June board meeting is upon us. Sorry everyone. In the event that this ends
 up contentious, hopefully one of the copied communities can give us a
 branch to work in.

 I know everyone is busy, so here's the short version of this email: I'd
 like to move some of the code currently in Hadoop (test-patch) into a new
 TLP focused on QA tooling. I'm not sure what the best format for priming
 this conversation is. ORC filled in the incubator project proposal
 template, but I'm not sure how much that confused the issue. So to start,
 I'll just write what I'm hoping we can accomplish in general terms here.

 All software development projects that are community based (that is,
 accepting outside contributions) face a common QA problem for vetting
 incoming contributions. Hadoop is fortunate enough to be sufficiently
 popular that the weight of the problem drove tool development (i.e.
 test-patch). That tool is generalizable enough that a bunch of other TLPs
 have adopted their own forks. Unfortunately, in most projects this kind of
 QA work is an enabler rather than a primary concern, so the tooling is
 often worked on ad hoc and few improvements are shared across projects.
 Since the tooling itself is never a primary concern, any improvement made
 is rarely reused outside of ASF projects.

 Over the last couple of months a few of us have been working on generalizing
 the tooling present in the Hadoop code base (because it was the most mature
 out of all those in the various projects) and it's reached a point where we
 think we can start bringing on other downstream users. This means we need
 to start establishing things like a release cadence and to grow the new
 contributors we have to handle more project responsibility. Personally, I
 think that means it's time to move out from under Hadoop to drive things as
 our own community. Eventually, I hope the community can help draw in a
 group of folks traditionally underrepresented in ASF projects, namely QA
 and operations folks.

 I think test-patch by itself has enough scope to justify a project. Having
 a solid set of build tools that are customizable to fit the norms of
 different software communities is a bunch of work. Making it work well in
 both the context of automated test systems like Jenkins and for individual
 developers is even more work. We could easily also take over maintenance of
 things like shelldocs, since test-patch is the primary consumer of that
 currently but it's generally useful tooling.

 In addition to test-patch, I think the proposed project has some future
 growth potential. Given some adoption of test-patch to prove utility, the
 project could build on the ties it makes to start building tools to help
 projects do their own longer-run testing. Note that I'm talking about the
 tools to build QA processes and not a particular set of tested components.
 Specifically, I think the ChaosMonkey work that's in HBase should be
 generalizable as a fault injection framework (either based on that code or
 something like it). Doing this for arbitrary software is obviously very
 difficult, and a part of easing that will be to make (and then favor)
 tooling to allow projects to have operational glue that looks the same.
 Namely, the shell work that's been done in hadoop-functions.sh would be a
 great foundational layer that could bring good daemon handling practices to
 a whole slew of software projects. In the event that these frameworks and
 tools get adopted by parts of the Hadoop ecosystem, that could make the job
 of, e.g., Bigtop substantially easier.

 I've reached out to a few folks who have been involved in the current
 test-patch work or expressed interest in helping out on getting it used in
 other projects. Right now, the proposed PMC would be (alphabetical by last
 name):

 * Andrew Bayer (ASF member, incubator pmc, bigtop pmc, flume pmc, jclouds
 pmc, sqoop pmc, all around Jenkins expert)
 * Sean Busbey (ASF member, accumulo pmc, hbase pmc)
 * Nick Dimiduk (hbase pmc, phoenix pmc)
 * Chris Nauroth (ASF member, incubator pmc, hadoop pmc)
 * Andrew Purtell  (ASF member, incubator pmc, bigtop pmc, hbase pmc,
 phoenix pmc)
 * Allen Wittenauer (hadoop committer)

 That PMC gives us several members and a bunch of folks familiar with the
 ASF. Combined with the code already existing in Apache spaces, I think that
 gives us sufficient justification for a direct board proposal.

 The planned project name is Apache Yetus. It's an archaic genus of sea
 snail and most of our project will 

[jira] [Created] (HADOOP-12093) test-patch findbugs fails on branch-based pre-commit runs

2015-06-16 Thread Sangjin Lee (JIRA)
Sangjin Lee created HADOOP-12093:


 Summary: test-patch findbugs fails on branch-based pre-commit runs
 Key: HADOOP-12093
 URL: https://issues.apache.org/jira/browse/HADOOP-12093
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 3.0.0
Reporter: Sangjin Lee


On our branch development JIRAs (YARN-2928), we are starting to see findbugs 
checks fail consistently. The relevant message:

{noformat}
findbugs baseline for /home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build
  Running findbugs in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice
/home/jenkins/tools/maven/latest/bin/mvn clean test findbugs:findbugs -DskipTests -DhadoopPatchProcess > /home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/patchprocess/YARN-2928FindBugsOutputhadoop-yarn-server-timelineservice.txt 2>&1
Exception in thread "main" java.io.FileNotFoundException: /home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/patchprocess/YARN-2928FindbugsWarningshadoop-yarn-server-timelineservice.xml (No such file or directory)
        at java.io.FileInputStream.open(Native Method)
        at java.io.FileInputStream.<init>(FileInputStream.java:146)
        at edu.umd.cs.findbugs.SortedBugCollection.progessMonitoredInputStream(SortedBugCollection.java:1231)
        at edu.umd.cs.findbugs.SortedBugCollection.readXML(SortedBugCollection.java:308)
        at edu.umd.cs.findbugs.SortedBugCollection.readXML(SortedBugCollection.java:295)
        at edu.umd.cs.findbugs.workflow.Filter.main(Filter.java:712)
Pre-patch YARN-2928 findbugs is broken?
{noformat}

See YARN-3706 and YARN-3792 for instance.





Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)

2015-06-16 Thread Allen Wittenauer

Since a couple of people have brought it up:

I think the release question is probably one of the big question marks. 
 Other than tarballs, how does something like this actually get used 
downstream?

For test-patch, in particular, I have a few thoughts on this:

Short term:

* Projects that want to move RIGHT NOW would modify their Jenkins jobs 
to checkout from the Yetus repo (preferably at a well known tag or branch) in 
one directory and their project repo in another directory.  Then it’s just a 
matter of passing the correct flags to test-patch.  This is pretty much how 
I’ve been personally running test-patch for about 6 months now. Under Jenkins, 
we’ve seen this work with NiFi (incubating) already.

* Create a stub version of test-patch that projects could check into 
their repo, replacing the existing test-patch.  This stub version would git 
clone from either ASF or github and then execute test-patch accordingly on 
demand.  With the correct smarts, it could make sure it has a cached version to 
prevent continual clones.
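
A minimal sketch of what such a stub could look like, assuming a hypothetical
repo URL, tag, and in-repo script path (none of these are published
interfaces yet):

    #!/usr/bin/env bash
    # Hypothetical stub a project could check in as dev-support/test-patch.sh.
    # It keeps a cached clone of the (assumed) Yetus repo pinned to a known
    # tag, then hands off to the real test-patch with the caller's flags.
    YETUS_REPO="${YETUS_REPO:-https://git-wip-us.apache.org/repos/asf/yetus.git}"  # assumed URL
    YETUS_TAG="${YETUS_TAG:-rel/0.1.0}"                                            # assumed tag
    CACHE_DIR="${HOME}/.cache/yetus"

    # Clone once and reuse the cached copy to prevent continual clones.
    if [[ ! -d "${CACHE_DIR}" ]]; then
      git clone "${YETUS_REPO}" "${CACHE_DIR}"
    fi
    git -C "${CACHE_DIR}" fetch --tags
    git -C "${CACHE_DIR}" checkout "${YETUS_TAG}"

    # "$@" carries the project-specific flags through to the real script.
    exec "${CACHE_DIR}/test-patch.sh" "$@"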

Longer term:

* I’ve been toying with the idea of (ab)using Java repos and packaging 
as a transportation layer, either in addition to or in combination with something 
like a maven plugin.  Something like this would clearly be better for offline 
usage and/or to lower the network traffic.


It’s probably worth pointing out that plugins can get sucked in from 
outside the Yetus dir structure, so project specific bits can remain in those 
projects.  This would mean that, e.g., if ambari decides they want to change 
the dependency ordering such that ambari-metrics always gets built first, 
that’s completely doable without the Yetus project getting involved.  This is 
particularly relevant for things like the Dockerfile where projects would 
almost certainly want to dictate their build and test time dependencies.  
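
To illustrate the shape of that, a hypothetical project-local plugin might
look like the sketch below; the hook names are assumptions about the plugin
contract, for illustration only, not the verbatim test-patch API:

    # Hypothetical plugin kept in the project's own repo and pulled in from
    # outside the Yetus dir structure. All function/hook names are illustrative.
    add_plugin ambari_order           # assumed registration hook

    function ambari_order_filefilter  # assumed per-changed-file hook
    {
      local filename=$1
      # Only queue this plugin when ambari-metrics files change.
      if [[ ${filename} =~ ^ambari-metrics/ ]]; then
        add_test ambari_order         # assumed helper that queues the test
      fi
    }

    function ambari_order_preapply    # assumed pre-build hook
    {
      # Force ambari-metrics to build first, per the example above.
      (cd ambari-metrics && mvn -q install -DskipTests)
    }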

[jira] [Created] (HADOOP-12094) TestCount Fails

2015-06-16 Thread Akira AJISAKA (JIRA)
Akira AJISAKA created HADOOP-12094:
--

 Summary: TestCount Fails
 Key: HADOOP-12094
 URL: https://issues.apache.org/jira/browse/HADOOP-12094
 Project: Hadoop Common
  Issue Type: Bug
  Components: test
Reporter: Akira AJISAKA


TestCount#processPathWithQuotasByQTVH and 
TestCount#processPathWithQuotasByStorageTypesHeader fail on trunk.





[jira] [Resolved] (HADOOP-12094) TestCount fails

2015-06-16 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA resolved HADOOP-12094.

Resolution: Duplicate

 TestCount fails
 ---

 Key: HADOOP-12094
 URL: https://issues.apache.org/jira/browse/HADOOP-12094
 Project: Hadoop Common
  Issue Type: Bug
  Components: test
Reporter: Akira AJISAKA
 Attachments: org.apache.hadoop.fs.shell.TestCount.txt


 TestCount#processPathWithQuotasByQTVH and 
 TestCount#processPathWithQuotasByStorageTypesHeader fail on trunk.





Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)

2015-06-16 Thread Steve Loughran
I think it's good to have a general build/test process projects can share, so 
+1 to pulling it out. You should get help from others. 

regarding incubation, it is a lot of work, especially for something that's more 
of an in-house tool than an artifact to release and redistribute.

You can't just use Apache Labs or the build project's repo to work on this? 

if you do want to incubate, we may want to nominate the hadoop project as the 
monitoring PMC, rather than incubator@. 

-steve

 On 16 Jun 2015, at 17:59, Allen Wittenauer a...@altiscale.com wrote:
 
 
 Since a couple of people have brought it up:
 
   I think the release question is probably one of the big question marks. 
  Other than tarballs, how does something like this actually get used 
 downstream?
 
   For test-patch, in particular, I have a few thoughts on this:
 
 Short term:
 
   * Projects that want to move RIGHT NOW would modify their Jenkins jobs 
 to checkout from the Yetus repo (preferably at a well known tag or branch) in 
 one directory and their project repo in another directory.  Then it’s just a 
 matter of passing the correct flags to test-patch.  This is pretty much how 
 I’ve been personally running test-patch for about 6 months now. Under 
 Jenkins, we’ve seen this work with NiFi (incubating) already.
 
   * Create a stub version of test-patch that projects could check into 
 their repo, replacing the existing test-patch.  This stub version would git 
 clone from either ASF or github and then execute test-patch accordingly on 
 demand.  With the correct smarts, it could make sure it has a cached version 
 to prevent continual clones.
 
 Longer term:
 
   * I’ve been toying with the idea of (ab)using Java repos and packaging 
 as a transportation layer, either in addition to or in combination with 
 something like a maven plugin.  Something like this would clearly be better 
 for offline usage and/or to lower the network traffic.
 
 
   It’s probably worth pointing out that plugins can get sucked in from 
 outside the Yetus dir structure, so project specific bits can remain in those 
 projects.  This would mean that, e.g., if ambari decides they want to change 
 the dependency ordering such that ambari-metrics always gets built first, 
 that’s completely doable without the Yetus project getting involved.  This is 
 particularly relevant for things like the Dockerfile where projects would 
 almost certainly want to dictate their build and test time dependencies.  



Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)

2015-06-16 Thread Sean Busbey
I'm going to try responding to several things at once here, so apologies if
I miss anyone and sorry for the long email. :)


On Tue, Jun 16, 2015 at 3:44 PM, Steve Loughran ste...@hortonworks.com
wrote:

 I think it's good to have a general build/test process projects can share,
 so +1 to pulling it out. You should get help from others.

 regarding incubation, it is a lot of work, especially for something that's
 more of an in-house tool than an artifact to release and redistribute.

 You can't just use Apache Labs or the build project's repo to work on this?

 if you do want to incubate, we may want to nominate the hadoop project as
 the monitoring PMC, rather than incubator@.

 -steve


Important note: we're proposing a board resolution that would directly pull
this code base out into a new TLP; there'd be no incubator, we'd just
continue building community and start making releases.

The proposed PMC believes the tooling we're talking about has direct
applicability to projects well outside of the ASF. Lots of other open
source projects run on community contributions and have a general need for
better QA tools. Given that problem set and the presence of a community
working to solve it, there's no reason this needs to be treated as an
in-house build project. We certainly want to be useful to ASF projects, and
getting them on board will certainly be easier given our current optimization
for ASF infra, but we're not limited to that (and our current prerequisites,
a CI tool and JIRA or GitHub, are pretty broadly available).


On Tue, Jun 16, 2015 at 10:13 AM, Nick Dimiduk ndimi...@apache.org wrote:


 Since we're tossing out names, how about Apache Bootstrap? It's a
 meta-project to help other projects get off the ground, after all.



There's already a web development framework named Bootstrap[1]. It's also
used by several ASF projects, so I think it best to avoid the confusion.

The name is, of course, up to the proposed PMC. As a bit of background, the
current name Yetus fulfills Allen's desire to have something shell-related
and my desire to have a project that starts with Y (there are currently no
ASF projects that start with Y). The universe of names that satisfy both
is very small, AFAICT. I did a brief suitability search and didn't find
any blockers.


 On Tue, Jun 16, 2015 at 11:59 AM, Allen Wittenauer a...@altiscale.com
 wrote:


 Since a couple of people have brought it up:

 I think the release question is probably one of the big question
 marks.  Other than tarballs, how does something like this actually get
 used downstream?

 For test-patch, in particular, I have a few thoughts on this:

 Short term:

 * Projects that want to move RIGHT NOW would modify their Jenkins
 jobs to checkout from the Yetus repo (preferably at a well known tag or
 branch) in one directory and their project repo in another directory.  Then
 it’s just a matter of passing the correct flags to test-patch.  This is
 pretty much how I’ve been personally running test-patch for about 6 months
 now. Under Jenkins, we’ve seen this work with NiFi (incubating) already.

 * Create a stub version of test-patch that projects could check
 into their repo, replacing the existing test-patch.  This stub version
 would git clone from either ASF or github and then execute test-patch
 accordingly on demand.  With the correct smarts, it could make sure it has
 a cached version to prevent continual clones.

 Longer term:

 * I’ve been toying with the idea of (ab)using Java repos and
 packaging as a transportation layer, either in addition to or in combination
 with something like a maven plugin.  Something like this would clearly be
 better for offline usage and/or to lower the network traffic.


It's important that the project follow ASF guidelines on publishing
releases[2]. So long as we publish releases to the distribution directory I
think we'd be fine having folks work off of the corresponding tag. I'm not
sure there's much reason to do that, however. A Jenkins job can just as
easily grab a release tarball as a git tag and we're not talking about a
large amount of stuff. The kind of build setup that Chris N mentioned is
also totally doable now that there's a build description DSL for Jenkins[3].
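
As a rough sketch of the tarball route (the download URL, version, and flags
below are assumptions for illustration, since no release exists yet), a
Jenkins "Execute shell" step could look like:

    # Hypothetical Jenkins shell step: pin a release tarball rather than a
    # git tag. URL, version, and flags are placeholders.
    YETUS_VERSION=0.1.0
    curl -fSL -o "yetus-${YETUS_VERSION}.tar.gz" \
      "https://www.apache.org/dist/yetus/${YETUS_VERSION}/yetus-${YETUS_VERSION}.tar.gz"
    tar xzf "yetus-${YETUS_VERSION}.tar.gz"

    # Run the pre-commit check against the project checkout in the workspace;
    # HADOOP-12345 stands in for whatever issue the job is testing.
    "yetus-${YETUS_VERSION}/test-patch.sh" --basedir="${WORKSPACE}/repo" HADOOP-12345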

For individual developers, I don't see any reason we can't package things
up as a tool, similar to how findbugs or shellcheck work. We can make OS
packages (or homebrew for OS X) if we want to make stand alone installation
on developer machines real easy. Those same packages could be installed on
the ASF build machines, provided some ASF project wanted to make use of
Yetus.

Having releases will incur some turnaround time when folks want to see
fixes, but that's a trade-off around release cadence we can work out longer
term.

I would like to have one or two projects that can work off of the bleeding
edge repo, but we'd have to get that to mesh with foundation policy. My gut
tells me we should be 

Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)

2015-06-16 Thread Chris Douglas
+1 A separate project sounds great. It'd be great to have more
standard tooling across the ecosystem.

As a practical matter, how should projects consume releases? -C

On Mon, Jun 15, 2015 at 4:47 PM, Sean Busbey bus...@cloudera.com wrote:
 Oof. I had meant to push on this again but life got in the way and now the
 June board meeting is upon us. Sorry everyone. In the event that this ends
 up contentious, hopefully one of the copied communities can give us a
 branch to work in.

 I know everyone is busy, so here's the short version of this email: I'd
 like to move some of the code currently in Hadoop (test-patch) into a new
 TLP focused on QA tooling. I'm not sure what the best format for priming
 this conversation is. ORC filled in the incubator project proposal
 template, but I'm not sure how much that confused the issue. So to start,
 I'll just write what I'm hoping we can accomplish in general terms here.

 All software development projects that are community based (that is,
 accepting outside contributions) face a common QA problem for vetting
 incoming contributions. Hadoop is fortunate enough to be sufficiently
 popular that the weight of the problem drove tool development (i.e.
 test-patch). That tool is generalizable enough that a bunch of other TLPs
 have adopted their own forks. Unfortunately, in most projects this kind of
 QA work is an enabler rather than a primary concern, so the tooling is
 often worked on ad hoc and few improvements are shared across projects.
 Since the tooling itself is never a primary concern, any improvement made
 is rarely reused outside of ASF projects.

 Over the last couple of months a few of us have been working on generalizing
 the tooling present in the Hadoop code base (because it was the most mature
 out of all those in the various projects) and it's reached a point where we
 think we can start bringing on other downstream users. This means we need
 to start establishing things like a release cadence and to grow the new
 contributors we have to handle more project responsibility. Personally, I
 think that means it's time to move out from under Hadoop to drive things as
 our own community. Eventually, I hope the community can help draw in a
 group of folks traditionally underrepresented in ASF projects, namely QA
 and operations folks.

 I think test-patch by itself has enough scope to justify a project. Having
 a solid set of build tools that are customizable to fit the norms of
 different software communities is a bunch of work. Making it work well in
 both the context of automated test systems like Jenkins and for individual
 developers is even more work. We could easily also take over maintenance of
 things like shelldocs, since test-patch is the primary consumer of that
 currently but it's generally useful tooling.

 In addition to test-patch, I think the proposed project has some future
 growth potential. Given some adoption of test-patch to prove utility, the
 project could build on the ties it makes to start building tools to help
 projects do their own longer-run testing. Note that I'm talking about the
 tools to build QA processes and not a particular set of tested components.
 Specifically, I think the ChaosMonkey work that's in HBase should be
 generalizable as a fault injection framework (either based on that code or
 something like it). Doing this for arbitrary software is obviously very
 difficult, and a part of easing that will be to make (and then favor)
 tooling to allow projects to have operational glue that looks the same.
 Namely, the shell work that's been done in hadoop-functions.sh would be a
 great foundational layer that could bring good daemon handling practices to
 a whole slew of software projects. In the event that these frameworks and
 tools get adopted by parts of the Hadoop ecosystem, that could make the job
 of, e.g., Bigtop substantially easier.

 I've reached out to a few folks who have been involved in the current
 test-patch work or expressed interest in helping out on getting it used in
 other projects. Right now, the proposed PMC would be (alphabetical by last
 name):

 * Andrew Bayer (ASF member, incubator pmc, bigtop pmc, flume pmc, jclouds
 pmc, sqoop pmc, all around Jenkins expert)
 * Sean Busbey (ASF member, accumulo pmc, hbase pmc)
 * Nick Dimiduk (hbase pmc, phoenix pmc)
 * Chris Nauroth (ASF member, incubator pmc, hadoop pmc)
 * Andrew Purtell  (ASF member, incubator pmc, bigtop pmc, hbase pmc,
 phoenix pmc)
 * Allen Wittenauer (hadoop committer)

 That PMC gives us several members and a bunch of folks familiar with the
 ASF. Combined with the code already existing in Apache spaces, I think that
 gives us sufficient justification for a direct board proposal.

 The planned project name is Apache Yetus. It's an archaic genus of sea
 snail and most of our project will be focused on shell scripts.

 N.b.: this does not mean that the Hadoop community would _have_ to rely on
 the new TLP, but I hope that once we have a release 

Re: Apache Hadoop 2.7 Windows 7 x64 - Failing Tests TestKerberosAuthenticator

2015-06-16 Thread Neeraj Vaidya
Hi All,
I just cleared the Kerberos tickets cached on my machine using "klist purge"
before starting the build.
This solved the issue for me. Hope that helps anyone else facing a similar
issue.

Regards,
Neeraj


On Tue, 16/6/15, Neeraj Vaidya neeraj.vai...@yahoo.co.in wrote:

 Subject: Re: Apache Hadoop 2.7 Windows 7 x64 - Failing Tests TestKerberosAuthenticator
 To: common-dev@hadoop.apache.org common-dev@hadoop.apache.org
 Date: Tuesday, 16 June, 2015, 6:43 AM

 Hi,
 Can you please help me with my issue described below? I am re-sending this
 email as I probably sent the first one before my subscription to this list
 was confirmed. Sorry about that.

 Regards
 Neeraj

 On 15/06/2015, at 4:05 PM, Neeraj Vaidya neeraj.vai...@yahoo.co.in wrote:

  Hi,

  I have been trying to build Hadoop 2.7 on my Windows 7 64-bit laptop. I
  have installed all the pre-requisites mentioned in the BUILDING.txt file.

  However, when my build reaches the tests for the hadoop-auth module, it
  keeps failing with errors related to timeouts in the Kerberos
  authentication tests. See the SNIPPET below. The surefire report for this
  test is attached herewith.

  Can you please let me know if/where I am going wrong? I have used the
  following command to build: mvn package -Pdist -Pdocs -Psrc -Dtar
  
 
  SNIPPET OF ERROR PRINTED ON SCREEN

  Running org.apache.hadoop.security.authentication.client.TestAuthenticatedURL
  Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.361 sec - in org.apache.hadoop.security.authentication.client.TestAuthenticatedURL
  Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support was removed in 8.0
  Running org.apache.hadoop.security.authentication.client.TestKerberosAuthenticator
  Tests run: 14, Failures: 0, Errors: 10, Skipped: 0, Time elapsed: 701.875 sec <<< FAILURE! - in org.apache.hadoop.security.authentication.client.TestKerberosAuthenticator

  testNotAuthenticated[0](org.apache.hadoop.security.authentication.client.TestKerberosAuthenticator)  Time elapsed: 70.586 sec  <<< ERROR!
  java.lang.Exception: test timed out after 60000 milliseconds
          at sun.security.krb5.Credentials.acquireDefaultNativeCreds(Native Method)
          at sun.security.krb5.Credentials.acquireDefaultCreds(Credentials.java:427)
          at sun.security.krb5.Credentials.acquireTGTFromCache(Credentials.java:295)
          at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:665)
          at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617)
          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          at java.lang.reflect.Method.invoke(Method.java:497)
          at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
          at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
          at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
          at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
          at java.security.AccessController.doPrivileged(Native Method)
          at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
          at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
          at sun.security.jgss.GSSUtil.login(GSSUtil.java:255)
          at sun.security.jgss.krb5.Krb5Util.getTicket(Krb5Util.java:158)
          at sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:335)
          at sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:331)
          at java.security.AccessController.doPrivileged(Native Method)
          at sun.security.jgss.krb5.Krb5InitCredential.getTgt(Krb5InitCredential.java:330)
          at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:145)
          at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
          at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
          at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
          at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
          at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
          at sun.security.jgss.spnego.SpNegoContext.GSS_initSecContext(SpNegoContext.java:875)
          at sun.security.jgss.spnego.SpNegoContext.initSecContext(SpNegoContext.java:317)
          at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:248)
          at 

[jira] [Created] (HADOOP-12091) Issues with directories handling

2015-06-16 Thread Gil Vernik (JIRA)
Gil Vernik created HADOOP-12091:
---

 Summary: Issues with directories handling
 Key: HADOOP-12091
 URL: https://issues.apache.org/jira/browse/HADOOP-12091
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs/swift
Reporter: Gil Vernik
Assignee: Gil Vernik


OpenStack Swift has no notion of directories. In Swift everything is an
object, stored in some container that belongs to an account.
The current implementation contains a lot of code that handles directory
structure in Swift, in particular code that treats a zero-length object as a
directory.
While that may be true in certain cases, there are also many cases where
directory handling causes problems and greatly reduces performance.
For example, if a Swift container has dozens of objects and one of them has
zero length, the Swift driver thinks it's a directory and reports it to the
upper layer as a directory. As a consequence, this leads to various
exceptions and crashes on the client side / upper Hadoop layers.

The purpose of this JIRA is to make directory handling in the driver optional
and configurable. The driver will behave the same by default, but there will
be a configurable option that disables directory handling, so that everything
is an object, even those with zero length.

This will cover cases where clients don't care about directory structure.





[jira] [Created] (HADOOP-12092) Issues with sub-directories in Swift

2015-06-16 Thread Gil Vernik (JIRA)
Gil Vernik created HADOOP-12092:
---

 Summary: Issues with sub-directories in Swift
 Key: HADOOP-12092
 URL: https://issues.apache.org/jira/browse/HADOOP-12092
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs/swift
Reporter: Gil Vernik
Assignee: Gil Vernik


OpenStack Swift has no notion of directories or sub-directories. In Swift
everything is an object, stored in a container that belongs to an account.
OpenStack Swift allows objects to have delimiters, and users can then access
and filter those objects using the delimiter.

A very good explanation appears here:
http://docs.rackspace.com/files/api/v1/cf-devguide/content/Pseudo-Hierarchical_Folders_Directories-d1e1580.html

The current driver has a lot of code that creates nested directories as
zero-length objects. While that might be needed in some cases, in general it
is wrong when working with Swift and strongly affects the performance of the
driver.

The goal of this JIRA is to make sub-directory generation a configurable
option. There will be an option to disable sub-directory generation, which
will greatly improve performance.

Example: a client performs PUT account/container/a/b/c/d/e/f/g.txt and the
driver is configured not to use sub-directories in Swift; then only one
object, a/b/c/d/e/f/g.txt, will be created in the container.
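
A usage sketch of how such a switch might look; the configuration key below
is purely hypothetical and does not exist in the current driver:

    # All names here are illustrative: "fs.swift.service.myswift.flat.namespace"
    # is a hypothetical key, not a real fs/swift configuration property.
    hadoop fs \
      -D fs.swift.service.myswift.flat.namespace=true \
      -put g.txt swift://container.myswift/a/b/c/d/e/f/g.txt
    # Expected result under this proposal: exactly one object,
    # "a/b/c/d/e/f/g.txt", and no zero-length pseudo-directory objects
    # for a/, a/b/, and so on.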






Build failed in Jenkins: Hadoop-Common-trunk #1528

2015-06-16 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Common-trunk/1528/changes

Changes:

[arp] HDFS-8607. TestFileCorruption doesn't work as expected. (Contributed by 
Walter Su)

[vinodkv] HADOOP-12001. Fixed LdapGroupsMapping to include configurable Posix 
UID and GID attributes during the search. Contributed by Patrick White.

[vinodkv] HADOOP-12001. Moving CHANGES.txt up into 2.8.

[aajisaka] MAPREDUCE-6363. [NNBench] Lease mismatch error when running with 
multiple mappers. Contributed by Brahma Reddy Battula.

[aajisaka] MAPREDUCE-6396. TestPipeApplication fails by NullPointerException. 
Contributed by Brahma Reddy Battula.

[szetszwo] HDFS-8576.  Lease recovery should return true if the lease can be 
released and the file can be closed.  Contributed by J.Andreina

[szetszwo] HDFS-8540.  Mover should exit with NO_MOVE_BLOCK if no block can be 
moved.  Contributed by surendra singh lilhore

[szetszwo] Move HDFS-8540 to 2.8 in CHANGES.txt.

[szetszwo] HDFS-8361. Choose SSD over DISK in block placement.

[ozawa] YARN-3711. Documentation of ResourceManager HA should explain 
configurations about listen addresses. Contributed by Masatake Iwasaki.

[wheat9] HDFS-8592. SafeModeException never get unwrapped. Contributed by 
Haohui Mai.

[devaraj] YARN-3789. Improve logs for LeafQueue#activateApplications(). 
Contributed

--
[...truncated 5202 lines...]
Running org.apache.hadoop.crypto.TestCryptoStreamsNormal
Tests run: 14, Failures: 0, Errors: 0, Skipped: 8, Time elapsed: 6.804 sec - in 
org.apache.hadoop.crypto.TestCryptoStreamsNormal
Running org.apache.hadoop.crypto.random.TestOsSecureRandom
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.581 sec - in 
org.apache.hadoop.crypto.random.TestOsSecureRandom
Running org.apache.hadoop.crypto.random.TestOpensslSecureRandom
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.175 sec - in 
org.apache.hadoop.crypto.random.TestOpensslSecureRandom
Running org.apache.hadoop.crypto.TestCryptoStreamsWithJceAesCtrCryptoCodec
Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 23.063 sec - 
in org.apache.hadoop.crypto.TestCryptoStreamsWithJceAesCtrCryptoCodec
Running org.apache.hadoop.crypto.TestOpensslCipher
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.198 sec - in 
org.apache.hadoop.crypto.TestOpensslCipher
Running org.apache.hadoop.crypto.TestCryptoStreamsWithOpensslAesCtrCryptoCodec
Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 21.699 sec - 
in org.apache.hadoop.crypto.TestCryptoStreamsWithOpensslAesCtrCryptoCodec
Running org.apache.hadoop.crypto.TestCryptoStreams
Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 21.259 sec - 
in org.apache.hadoop.crypto.TestCryptoStreams
Running org.apache.hadoop.service.TestServiceLifecycle
Tests run: 18, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.372 sec - in 
org.apache.hadoop.service.TestServiceLifecycle
Running org.apache.hadoop.service.TestCompositeService
Tests run: 18, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.246 sec - in 
org.apache.hadoop.service.TestCompositeService
Running org.apache.hadoop.service.TestGlobalStateChangeListener
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.212 sec - in 
org.apache.hadoop.service.TestGlobalStateChangeListener
Running org.apache.hadoop.ha.TestActiveStandbyElectorRealZK
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 10.489 sec - in 
org.apache.hadoop.ha.TestActiveStandbyElectorRealZK
Running org.apache.hadoop.ha.TestHealthMonitor
Exception: java.lang.RuntimeException thrown from the UncaughtExceptionHandler 
in thread Health Monitor for DummyHAService #3
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.114 sec - in 
org.apache.hadoop.ha.TestHealthMonitor
Running org.apache.hadoop.ha.TestZKFailoverController
Tests run: 19, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 43.268 sec - 
in org.apache.hadoop.ha.TestZKFailoverController
Running org.apache.hadoop.ha.TestZKFailoverControllerStress
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 95.151 sec - in 
org.apache.hadoop.ha.TestZKFailoverControllerStress
Running org.apache.hadoop.ha.TestActiveStandbyElector
Tests run: 23, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.933 sec - in 
org.apache.hadoop.ha.TestActiveStandbyElector
Running org.apache.hadoop.ha.TestSshFenceByTcpPort
Tests run: 4, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 3.71 sec - in 
org.apache.hadoop.ha.TestSshFenceByTcpPort
Running org.apache.hadoop.ha.TestHAAdmin
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.445 sec - in 
org.apache.hadoop.ha.TestHAAdmin
Running org.apache.hadoop.ha.TestFailoverController
Tests run: 17, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.507 sec - in 
org.apache.hadoop.ha.TestFailoverController
Running org.apache.hadoop.ha.TestShellCommandFencer
Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, 

Build failed in Jenkins: Hadoop-common-trunk-Java8 #230

2015-06-16 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-common-trunk-Java8/230/changes

Changes:

[arp] HDFS-8607. TestFileCorruption doesn't work as expected. (Contributed by 
Walter Su)

[vinodkv] HADOOP-12001. Fixed LdapGroupsMapping to include configurable Posix 
UID and GID attributes during the search. Contributed by Patrick White.

[vinodkv] HADOOP-12001. Moving CHANGES.txt up into 2.8.

[aajisaka] MAPREDUCE-6363. [NNBench] Lease mismatch error when running with 
multiple mappers. Contributed by Brahma Reddy Battula.

[aajisaka] MAPREDUCE-6396. TestPipeApplication fails by NullPointerException. 
Contributed by Brahma Reddy Battula.

[szetszwo] HDFS-8576.  Lease recovery should return true if the lease can be 
released and the file can be closed.  Contributed by J.Andreina

[szetszwo] HDFS-8540.  Mover should exit with NO_MOVE_BLOCK if no block can be 
moved.  Contributed by surendra singh lilhore

[szetszwo] Move HDFS-8540 to 2.8 in CHANGES.txt.

[szetszwo] HDFS-8361. Choose SSD over DISK in block placement.

[ozawa] YARN-3711. Documentation of ResourceManager HA should explain 
configurations about listen addresses. Contributed by Masatake Iwasaki.

[wheat9] HDFS-8592. SafeModeException never get unwrapped. Contributed by 
Haohui Mai.

[devaraj] YARN-3789. Improve logs for LeafQueue#activateApplications(). 
Contributed

--
[...truncated 5580 lines...]
Running org.apache.hadoop.io.TestEnumSetWritable
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.431 sec - in 
org.apache.hadoop.io.TestEnumSetWritable
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.io.TestMapWritable
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.185 sec - in 
org.apache.hadoop.io.TestMapWritable
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.io.TestBooleanWritable
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.185 sec - in 
org.apache.hadoop.io.TestBooleanWritable
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.io.TestBytesWritable
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.183 sec - in 
org.apache.hadoop.io.TestBytesWritable
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.io.TestSequenceFile
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 8.373 sec - in 
org.apache.hadoop.io.TestSequenceFile
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.io.TestTextNonUTF8
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.162 sec - in 
org.apache.hadoop.io.TestTextNonUTF8
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.io.TestObjectWritableProtos
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.275 sec - in 
org.apache.hadoop.io.TestObjectWritableProtos
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.io.TestDefaultStringifier
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.328 sec - in 
org.apache.hadoop.io.TestDefaultStringifier
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.io.retry.TestRetryProxy
Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.189 sec - in 
org.apache.hadoop.io.retry.TestRetryProxy
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.io.retry.TestDefaultRetryPolicy
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.337 sec - in 
org.apache.hadoop.io.retry.TestDefaultRetryPolicy
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.io.retry.TestFailoverProxy
Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.035 sec - in 
org.apache.hadoop.io.retry.TestFailoverProxy
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.io.nativeio.TestNativeIO
Tests run: 18, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.587 sec - in 
org.apache.hadoop.io.nativeio.TestNativeIO
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.io.nativeio.TestSharedFileDescriptorFactory
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.186 sec - in 
org.apache.hadoop.io.nativeio.TestSharedFileDescriptorFactory
Java HotSpot(TM) 64-Bit Server VM warning: ignoring 

Re: Maven always detects changes - Is this a Docker 'feature'?

2015-06-16 Thread Steve Loughran
Your clocks are probably confused.

ant -diagnostics actually measures clock drift between 
System.currentTimeMillis() and the timestamps coming off the tmp dir. You 
should do the same with files touched in target/
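
Something like the following would do it; a quick sketch that assumes a GNU
userland (stat -c) and a throwaway probe file under target/:

    # Compare the system clock against the mtime of a freshly touched file.
    # Maven decides staleness from these timestamps, so any real drift here
    # would explain the constant recompilation.
    touch target/.drift-probe
    now=$(date +%s)
    mtime=$(stat -c %Y target/.drift-probe)
    echo "clock drift: $((now - mtime)) seconds"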


 On 15 Jun 2015, at 23:31, Colin P. McCabe cmcc...@apache.org wrote:
 
 Hi Darrell,
 
 Sorry, I'm not familiar with this feature of Maven.  Perhaps try
 asking on the Apache Maven mailing list?
 
 best,
 Colin
 
 On Fri, May 22, 2015 at 8:34 AM, Darrell Taylor
 darrell.tay...@gmail.com wrote:
 Hi,
 
 Is it normal behaviour for maven to detect changes when I run tests with no
 changes?
 
 e.g.
 $ mvn test -Dtest=TestDFSShell -nsu -o
 ...
 [INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @
 hadoop-hdfs ---
 [INFO] Changes detected - recompiling the module!
 [INFO] Compiling 576 source files to
 /home/darrell/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/classes
 ...
 
 Then I run the same command again without touching anything else and it
 compiles everything again.  It's getting rather tedious.
 
 I am running this from inside the docker container.
 
 Any help appreciated.
 
 Thanks
 Darrell.



[jira] [Resolved] (HADOOP-12091) Issues with directories handling in Swift

2015-06-16 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-12091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-12091.
-
Resolution: Duplicate

 Issues with directories handling in Swift
 -

 Key: HADOOP-12091
 URL: https://issues.apache.org/jira/browse/HADOOP-12091
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs/swift
Reporter: Gil Vernik
Assignee: Gil Vernik

 OpenStack Swift has no notion of directories. In Swift everything is an
 object, stored in some container that belongs to an account.
 The current implementation contains a lot of code that handles directory
 structure in Swift, in particular code that treats a zero-length object as a
 directory.
 While that may be true in certain cases, there are also many cases where
 directory handling causes problems and greatly reduces performance.
 For example, if a Swift container has dozens of objects and one of them has
 zero length, the Swift driver thinks it's a directory and reports it to the
 upper layer as a directory. As a consequence, this leads to various
 exceptions and crashes on the client side / upper Hadoop layers.
 The purpose of this JIRA is to make directory handling in the driver optional
 and configurable. The driver will behave the same by default, but there will
 be a configurable option that disables directory handling, so that everything
 is an object, even those with zero length.
 This will cover cases where clients don't care about directory structure.





Re: Protocol Buffers version

2015-06-16 Thread Steve Loughran

 On 15 Jun 2015, at 22:31, Colin P. McCabe cmcc...@apache.org wrote:
 
 On Mon, Jun 15, 2015 at 7:24 AM, Allen Wittenauer a...@altiscale.com wrote:
 
 On Jun 12, 2015, at 1:03 PM, Alan Burlison alan.burli...@oracle.com wrote:
 
 On 14/05/2015 18:41, Chris Nauroth wrote:
 
 As a reminder though, the community probably would want to see a strong
 justification for the upgrade in terms of features or performance or
 something else.  Right now, I'm not seeing a significant benefit for us
 based on my reading of their release notes.  I think it's worthwhile to
 figure this out first.  Otherwise, there is a risk that any testing work
 turns out to be a wasted effort.
 
 One reason at least: PB 2.5.0 has no support for Solaris SPARC. 2.6.1 does.

to be ruthless, that's not enough reason to upgrade branch-2, due to the 
transitive pain it causes all the way down.

 
 
That's a pretty good reason.
 
Some of us had a discussion at Summit about effectively forking 
 protobuf and making it an Apache TLP.  This would give us a chance to get 
 out from under Google's blind spot, guarantee better compatibility across 
 the ecosystem, etc, etc.
 
It is sounding more and more like that's really what needs to happen.
 
 I agree that it would be nice if the protobuf project avoided making
 backwards-incompatible API changes within a minor release.  But in
 practice, we have had the same issues with Jackson, Guava, jets3t, and
 other dependencies.  Nearly every important Hadoop dependency has made
 backwards-incompatible API changes within a minor release of the
 dependency... and that's one reason we are using such old versions of
 everything.  I don't think PB deserves to be singled out as much as it
 has been.

I think it does deserve it, as it was such an all-or-nothing change. Guava, well,
we may keep it at 11.0, but we've made sure there are no classes used which
aren't in the latest versions. Even where we depend on artifacts which need
later versions (curator-2.7.1), we've addressed the version problem by verifying
that you can actually rebuild curator with guava-11.0 with everything working
(curator-x-discovery doesn't compile, but we don't use that). So we know that
unless a bit of curator uses reflection, we can run it against 11.x. And if
someone wants to use a later version of Guava + hadoop-common, they can swap it
in and hadoop will still work. Which is important, as on Java 8u45+ you do need
a recent Guava.

In contrast, protobuf needed a coordinated update across everything: every
project which had checked in their generated protobuf files had to rebuild and
check them in, which guaranteed they could no longer work with protobuf 2.4.

Jackson? Its brokenness wasn't so obvious: if we'd known, I wouldn't have let
the update go in. It's now on the risk list and I don't see us updating it for a
long time.

  I think the work going on now to implement CLASSPATH
 isolation in Hadoop will really be beneficial here because we will be
 able to upgrade without worrying about these problems.


+1


Re: What is the limit to the number of properties set in the configuration object

2015-06-16 Thread Steve Loughran
they also get sent over the wire with things like job submissions, so can make 
things slower.

in my little grumpy project, https://github.com/steveloughran/grumpy , I 
actually stuck the groovy scripts into the config files as strings, so they'd 
be submitted as jobs; 

the mapper & reducer would simply read the config, parse it as a method under 
the mapper context, then run it

https://github.com/steveloughran/grumpy/blob/master/src/main/groovy/org/apache/hadoop/grumpy/scripted/ScriptedMapper.groovy


 On 15 Jun 2015, at 22:35, Colin P. McCabe cmcc...@apache.org wrote:
 
 Much like zombo.com, the only limit is yourself.
 
 But huge Configuration objects are going to be really inefficient, so
 I would look elsewhere for storing lots of data.
 
 best,
 Colin
 
 On Fri, Jun 12, 2015 at 7:30 PM, Sitaraman Vilayannur
 vrsitaramanietfli...@gmail.com wrote:
 Thanks Allen, what is the total size limit?
 Sitaraman
 
 
 On Fri, Jun 12, 2015 at 10:53 PM, Allen Wittenauer a...@altiscale.com 
 wrote:
 
 
 On Jun 12, 2015, at 12:37 AM, Sitaraman Vilayannur 
 vrsitaramanietfli...@gmail.com wrote:
 
 Hi,
 What is the limit on the number of properties that can be set using
 set(String s1, String s2) on the Configuration object for hadoop?
 Is this limit configurable if so what is the maximum that can be set?
 
 It's a total-size-of-the-conf limit, not a number-of-properties limit.
 
In general, you shouldn't pack it full of stuff as calling
 Configuration is expensive.  Use a side-input/distributed cache file for
 mass quantities of bits.
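
For example, rather than set()-ing a large blob into the Configuration, the
data can ride along as a cached side file; a sketch where the jar, class, and
file names are all placeholders:

    # Ship bulk data via the distributed cache (-files is a standard
    # GenericOptionsParser option) instead of packing it into the conf.
    # (Assumes com.example.MyJob runs through ToolRunner so that the
    # generic options are parsed.)
    hadoop jar my-app.jar com.example.MyJob \
      -files hdfs:///data/big-lookup-table.txt \
      /input /output
    # Tasks then read "big-lookup-table.txt" from their working directory,
    # and the Configuration stays small and cheap to serialize.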