Re: upstream jenkins build broken?

2015-03-11 Thread Sean Busbey
You could rely on a destructive git clean call instead of maven to do the
directory removal.
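Concretely, that suggestion might look like the following sketch (assumed to run from the job's workspace root; `target` stands in for the build directory Jenkins failed to delete, and the permission restore comes first because a mode-500 directory cannot be emptied):

```shell
# Restore owner permissions so the permission-stripped test directories
# can actually be removed (a dir without the write bit cannot be emptied).
chmod -R u+rwx target 2>/dev/null || true
# Destructive clean: -f force, -d also untracked directories, -x also
# ignored files -- everything not tracked by git goes away.
git clean -fdx
```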

-- 
Sean
On Mar 11, 2015 4:11 PM, Colin McCabe cmcc...@alumni.cmu.edu wrote:

 Is there a maven plugin or setting we can use to simply remove
 directories that have no executable permissions on them?  Clearly we
 have the permission to do this from a technical point of view (since
 we created the directories as the jenkins user), it's simply that the
 code refuses to do it.

 Otherwise I guess we can just fix those tests...

 Colin

 On Tue, Mar 10, 2015 at 2:43 PM, Lei Xu l...@cloudera.com wrote:
  Thanks a lot for looking into HDFS-7722, Chris.
 
  In HDFS-7722:
  TestDataNodeVolumeFailureXXX tests reset data dir permissions in
 TearDown().
  TestDataNodeHotSwapVolumes reset permissions in a finally clause.
 
  Also I ran mvn test several times on my machine and all tests passed.
 
  However, since in DiskChecker#checkDirAccess():
 
  private static void checkDirAccess(File dir) throws DiskErrorException {
    if (!dir.isDirectory()) {
      throw new DiskErrorException("Not a directory: " + dir.toString());
    }

    checkAccessByFileMethods(dir);
  }
 
  One potentially safer alternative is replacing the data dir with a regular
  file to simulate disk failures.
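A sketch of that alternative (the path below is a placeholder, not the real test data dir): replacing the directory with a plain file makes DiskChecker's isDirectory() check fail, with no permission changes left behind for a dying JUnit process to forget to restore.

```shell
# Simulate a failed disk by swapping the data dir for a regular file.
DATA_DIR=$(mktemp -d)/data3        # placeholder path for illustration
mkdir -p "$DATA_DIR"
rm -r "$DATA_DIR" && touch "$DATA_DIR"   # directory replaced by a plain file
# DiskChecker-style check now fails without any chmod to undo:
test -d "$DATA_DIR" || echo "Not a directory: $DATA_DIR"
```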
 
  On Tue, Mar 10, 2015 at 2:19 PM, Chris Nauroth cnaur...@hortonworks.com
 wrote:
  TestDataNodeHotSwapVolumes, TestDataNodeVolumeFailure,
  TestDataNodeVolumeFailureReporting, and
  TestDataNodeVolumeFailureToleration all remove executable permissions
 from
  directories like the one Colin mentioned to simulate disk failures at
 data
  nodes.  I reviewed the code for all of those, and they all appear to be
  doing the necessary work to restore executable permissions at the end of
  the test.  The only recent uncommitted patch I've seen that makes
 changes
  in these test suites is HDFS-7722.  That patch still looks fine
 though.  I
  don't know if there are other uncommitted patches that changed these
 test
  suites.
 
  I suppose it's also possible that the JUnit process unexpectedly died
  after removing executable permissions but before restoring them.  That
  always would have been a weakness of these test suites, regardless of
 any
  recent changes.
 
  Chris Nauroth
  Hortonworks
  http://hortonworks.com/
 
 
 
 
 
 
  On 3/10/15, 1:47 PM, Aaron T. Myers a...@cloudera.com wrote:
 
 Hey Colin,
 
 I asked Andrew Bayer, who works with Apache Infra, what's going on with
 these boxes. He took a look and concluded that some perms are being set
 in
 those directories by our unit tests which are precluding those files
 from
 getting deleted. He's going to clean up the boxes for us, but we should
 expect this to keep happening until we can fix the test in question to
 properly clean up after itself.
 
 To help narrow down which commit it was that started this, Andrew sent
 me
 this info:
 
 /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data3/ has
 500 perms, so I'm guessing that's the problem. Been that way since 9:32
 UTC
 on March 5th.
 
 --
 Aaron T. Myers
 Software Engineer, Cloudera
 
 On Tue, Mar 10, 2015 at 1:24 PM, Colin P. McCabe cmcc...@apache.org
 wrote:
 
  Hi all,
 
  A very quick (and not thorough) survey shows that I can't find any
  jenkins jobs that succeeded from the last 24 hours.  Most of them seem
  to be failing with some variant of this message:
 
  [ERROR] Failed to execute goal
  org.apache.maven.plugins:maven-clean-plugin:2.5:clean (default-clean)
  on project hadoop-hdfs: Failed to clean project: Failed to delete
 
 

  /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data3
  - [Help 1]
 
  Any ideas how this happened?  Bad disk, unit test setting wrong
  permissions?
 
  Colin
 
 
 
 
 
  --
  Lei (Eddy) Xu
  Software Engineer, Cloudera



Re: about CHANGES.txt

2015-03-13 Thread Sean Busbey
So long as you include the issue number, you can automate pulling the type
from jira directly instead of putting it in the message.
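That automation might be sketched as follows; the key extraction is straightforward, and the commented query assumes JIRA's public REST v2 issue endpoint (jq required for the live call):

```shell
# Print the first JIRA issue key found in a commit subject, if any.
issue_key() {
  echo "$1" | grep -oE '(HADOOP|HDFS|YARN|MAPREDUCE)-[0-9]+' | head -n1
}

# Usage against live JIRA (requires network and jq; endpoint shape
# assumed from JIRA's REST v2 API):
#   key=$(issue_key "HADOOP-11746. Improve test-patch.")
#   curl -s "https://issues.apache.org/jira/rest/api/2/issue/${key}?fields=issuetype" \
#     | jq -r '.fields.issuetype.name'
issue_key "HADOOP-11746. Improve test-patch."
```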

On Fri, Mar 13, 2015 at 4:49 PM, Yongjun Zhang yzh...@cloudera.com wrote:

 Hi,

 I found that changing CHANGES.txt when committing a jira is error prone
 because of the different sections in the file, and sometimes we forget
 about changing this file.

 After all, git log would indicate the history of a branch. I wonder if we
 could switch to a new method:

 1. When committing, ensure the message includes the type of the jira: New
 Feature, Bug Fixes, Improvement, etc.

 2. No longer need to make changes to CHANGES.txt for each commit

 3. Before releasing a branch, create the CHANGES.txt by using the git log
 command for the given branch.
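Step 3 could be as small as this sketch (the ref names are placeholders, and it assumes the commit-subject convention from step 1):

```shell
# Regenerate CHANGES.txt for a release from git history alone.
# PREV_TAG and BRANCH are placeholders for the real release refs.
PREV_TAG=${PREV_TAG:-release-2.6.0}
BRANCH=${BRANCH:-branch-2.6}
git log --no-merges --pretty=format:'%s' "${PREV_TAG}..${BRANCH}" \
  | grep -E '^(HADOOP|HDFS|YARN|MAPREDUCE)-[0-9]+' > CHANGES.txt
```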

 Thanks.

 --Yongjun




-- 
Sean


Re: Reviving HADOOP-7435: Making Jenkins pre-commit build work with branches

2015-03-04 Thread Sean Busbey
+1

If we can make things look like HBase support for precommit testing on
branches (HBASE-12944), that would make it easier for new and occasional
contributors who might end up working in other ecosystem projects. AFAICT,
Jonathan's proposal for branch names in patch names does this.



On Wed, Mar 4, 2015 at 3:41 PM, Karthik Kambatla ka...@cloudera.com wrote:

 Thanks for reviving this on email, Vinod. Newer folks like me might not be
 aware of this JIRA/effort.

 This would be wonderful to have so (1) we know the status of release
 branches (branch-2, etc.) and also (2) feature branches (YARN-2928).
 Jonathan's or Matt's proposal for including branch name looks reasonable to
 me.

 If none has any objections, I think we can continue on JIRA and get this
 in.

 On Wed, Mar 4, 2015 at 1:20 PM, Vinod Kumar Vavilapalli 
 vino...@hortonworks.com wrote:

  Hi all,
 
  I'd like us to revive the effort at
  https://issues.apache.org/jira/browse/HADOOP-7435 to make precommit
  builds being able to work with branches. Having the Jenkins verify
 patches
  on branches is very useful even if there may be relaxed review oversight
 on
  the said-branch.
 
  Unless there are objections, I'd request help from Giri who already has a
  patch sitting there for more than a year before. This may need us to
  collectively agree on some convention - the last comment says that the
  branch patch name should be in some format for this to work.
 
  Thanks,
  +Vinod
 



 --
 Karthik Kambatla
 Software Engineer, Cloudera Inc.
 
 http://five.sentenc.es




-- 
Sean


Re: about CHANGES.txt

2015-03-18 Thread Sean Busbey
 should switch to using your way, and
 save
  committer's effort of taking care of CHANGES.txt (quite some save
 IMO).
  Hope more people can share their thoughts.
 
  Thanks.
 
  --Yongjun
 
  On Fri, Mar 13, 2015 at 4:45 PM, Allen Wittenauer a...@altiscale.com
  wrote:
 
 
  I think the general consensus is don’t include the changes.txt file
 in
  your commit. It won’t be correct for both branches if such a commit
 is
  destined for both. (No, the two branches aren’t the same.)
 
  No, git log isn’t more accurate.  The problems are:
 
  a) cherry picks
  b) branch mergers
  c) “whoops i missed something in that previous commit”
  d) no identification of what type of commit it was without hooking
 into
  JIRA anyway.
 
  This is why I prefer building the change log from JIRA.  We already
  build
  release notes from JIRA, BTW.  (Not that anyone appears to read them
  given
  the low quality of our notes…)  Anyway, here’s what I’ve been
  building/using as changes.txt and release notes:
 
  https://github.com/aw-altiscale/hadoop-release-metadata
 
  I try to update these every day. :)
 
  On Mar 13, 2015, at 4:07 PM, Yongjun Zhang yzh...@cloudera.com
  wrote:
 
  Thanks Esteban, I assume this report gets info purely from the jira
  database, but not git log of a branch, right?
 
  I hope we get the info from git log of a release branch because
  that'd
  be
  more accurate.
 
  --Yongjun
 
  On Fri, Mar 13, 2015 at 3:11 PM, Esteban Gutierrez 
  este...@cloudera.com
 
  wrote:
 
  JIRA already provides a report:
 
 
 
 
 
 https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12327179&styleName=Html&projectId=12310240
 
 
  cheers,
  esteban.
 
 
 
 
  --
  Cloudera, Inc.
 
 
  On Fri, Mar 13, 2015 at 3:01 PM, Sean Busbey bus...@cloudera.com
 
  wrote:
 
  So long as you include the issue number, you can automate pulling
  the
  type
  from jira directly instead of putting it in the message.
 
  On Fri, Mar 13, 2015 at 4:49 PM, Yongjun Zhang 
  yzh...@cloudera.com
  wrote:
 
  Hi,
 
  I found that changing CHANGES.txt when committing a jira is
 error
  prone
  because of the different sections in the file, and sometimes we
  forget
  about changing this file.
 
  After all, git log would indicate the history of a branch. I
  wonder if
  we
  could switch to a new method:
 
   1. When committing, ensure the message includes the type of the
   jira: New Feature, Bug Fixes, Improvement, etc.
 
  2. No longer need to make changes to CHANGES.txt for each commit
 
   3. Before releasing a branch, create the CHANGES.txt by using the
   git log command for the given branch.
 
  Thanks.
 
  --Yongjun
 
 
 
 
  --
  Sean
 
 
 
 
 
 
 
 




-- 
Sean


Re: committing HADOOP-11746 test-patch improvements

2015-04-22 Thread Sean Busbey
On Wed, Apr 22, 2015 at 2:10 AM, Allen Wittenauer a...@altiscale.com wrote:



 * There have been a few runs which seems to indicate that *something* is
 destroying the artifact directory in the middle of  runs…. which is very
 very odd and something I hadn’t seen in any of my testing.  In any case, I
 clearly need to add some safety code here to report back that something
 went awry and report back which node, console, etc this happened on.
 Someone more familiar with the Jenkins setup might be able to shed some
 light on why that might happen. All of these runs appear to be on H3, so
 might be related? Impacted issues with this have been:

 - HDFS-8200 (https://builds.apache.org/job/PreCommit-HDFS-Build/10335/)
 - HDFS-8147 (https://builds.apache.org/job/PreCommit-HDFS-Build/10338/)
 - YARN-3301 (https://builds.apache.org/job/PreCommit-YARN-Build/7441/)


From the HDFS precommit build:


 PATCHPROCESS=${WORKSPACE}/../patchprocess
 mkdir -p ${PATCHPROCESS}


Working on directories outside of the workspace for the job is not good,
though I'm not sure if that's the source of the issue. Do I need to
coordinate fixing this with anyone?
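A sketch of the implied fix: keep the scratch directory inside the Jenkins workspace rather than a sibling of it, so per-job cleanup can't clobber another job's run.

```shell
# WORKSPACE is set by Jenkins; default to $PWD for local runs.
WORKSPACE=${WORKSPACE:-$PWD}
# was: PATCHPROCESS=${WORKSPACE}/../patchprocess  (outside the workspace)
PATCHPROCESS=${WORKSPACE}/patchprocess
mkdir -p "${PATCHPROCESS}"
```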

-- 
Sean


Re: Set minimum version of Hadoop 3 to JDK 8

2015-04-21 Thread Sean Busbey
A few options:

* Only change the builds for master to use jdk8
* build with both jdk7 and jdk8 by copying jobs
* build with both jdk7 and jdk8 using a jenkins matrix build
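Whichever option is chosen, each boils down to running the same build once per JAVA_HOME; a local sketch (JDK paths are placeholders for wherever the build slaves install jdk7/jdk8, and the real mvn invocation is left commented):

```shell
# Run the same build under each JDK by switching JAVA_HOME.
build_with_jdk() {
  export JAVA_HOME="$1"
  echo "building with JAVA_HOME=${JAVA_HOME}"
  # mvn clean install -DskipTests   # the real build step, elided in this sketch
}

for jdk in /usr/lib/jvm/java-7-openjdk /usr/lib/jvm/java-8-openjdk; do
  build_with_jdk "$jdk"
done
```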

Robert, if you'd like help with any of these please send me a ping off-list.

On Tue, Apr 21, 2015 at 8:19 PM, Vinod Kumar Vavilapalli 
vino...@hortonworks.com wrote:

 We don't want JDK 8-only code going into the branch-2 line. Moving Jenkins to
 1.8 right away would hide such code; how do we address that?

 Thanks,
 +Vinod

 On Apr 21, 2015, at 5:54 PM, Robert Kanter rkan...@cloudera.com wrote:

  Sure, I'll try to change the Jenkins builds to 1.8 first.
 
  On Tue, Apr 21, 2015 at 3:31 PM, Andrew Wang andrew.w...@cloudera.com
  wrote:
 
  Hey Robert,
 
  As a first step, could we try switching all our precommit and nightly
  builds over to use 1.8? This is a prerequisite for HADOOP-11858, and
 safe
  to do in any case since it'll still target 1.7.
 
  I'll note that HADOOP-10530 details the pain Steve went through
 switching
  us to JDK7. Might be some lessons learned about how to do this
 transition
  more smoothly.
 
  Thanks,
  Andrew
 
  On Tue, Apr 21, 2015 at 3:15 PM, Robert Kanter rkan...@cloudera.com
  wrote:
 
  + yarn-dev, hdfs-dev, mapred-dev
 
  On Tue, Apr 21, 2015 at 3:14 PM, Robert Kanter rkan...@cloudera.com
  wrote:
 
  Hi all,
 
  Moving forward on some of the discussions on Hadoop 3, I've created
  HADOOP-11858 to set the minimum version of Hadoop 3 to JDK 8.  I just
  wanted to let everyone know in case there's some reason we shouldn't
 go
  ahead with this.
 
  thanks
  - Robert
 
 
 




-- 
Sean


Re: upstream jenkins build broken?

2015-06-06 Thread Sean Busbey
Hi Folks!

After working on test-patch with other folks for the last few months, I
think we've reached the point where we can make the fastest progress
towards the goal of a general use pre-commit patch tester by spinning
things into a project focused on just that. I think we have a mature enough
code base and a sufficient fledgling community, so I'm going to put
together a tlp proposal.

Thanks for the feedback thus far from use within Hadoop. I hope we can
continue to make things more useful.

-Sean

On Wed, Mar 11, 2015 at 5:16 PM, Sean Busbey bus...@cloudera.com wrote:

 HBase's dev-support folder is where the scripts and support files live.
 We've only recently started adding anything to the maven builds that's
 specific to jenkins[1]; so far it's diagnostic stuff, but that's where I'd
 add in more if we ran into the same permissions problems y'all are having.

 There's also our precommit job itself, though it isn't large[2]. AFAIK, we
 don't properly back this up anywhere, we just notify each other of changes
 on a particular mail thread[3].

 [1]: https://github.com/apache/hbase/blob/master/pom.xml#L1687
 [2]: https://builds.apache.org/job/PreCommit-HBASE-Build/ (they're all
 red because I just finished fixing mvn site running out of permgen)
 [3]: http://s.apache.org/NT0


 On Wed, Mar 11, 2015 at 4:51 PM, Chris Nauroth cnaur...@hortonworks.com
 wrote:

 Sure, thanks Sean!  Do we just look in the dev-support folder in the HBase
 repo?  Is there any additional context we need to be aware of?

 Chris Nauroth
 Hortonworks
 http://hortonworks.com/






 On 3/11/15, 2:44 PM, Sean Busbey bus...@cloudera.com wrote:

 +dev@hbase
 
 HBase has recently been cleaning up our precommit jenkins jobs to make
 them
 more robust. From what I can tell our stuff started off as an earlier
 version of what Hadoop uses for testing.
 
 Folks on either side open to an experiment of combining our precommit
 check
 tooling? In principle we should be looking for the same kinds of things.
 
 Naturally we'll still need different jenkins jobs to handle different
 resource needs and we'd need to figure out where stuff eventually lives,
 but that could come later.
 
 On Wed, Mar 11, 2015 at 4:34 PM, Chris Nauroth cnaur...@hortonworks.com
 
 wrote:
 
  The only thing I'm aware of is the failOnError option:
 
 
 
  http://maven.apache.org/plugins/maven-clean-plugin/examples/ignoring-errors.html
 
 
  I prefer that we don't disable this, because ignoring different kinds
 of
  failures could leave our build directories in an indeterminate state.
 For
  example, we could end up with an old class file on the classpath for
 test
  runs that was supposedly deleted.
 
  I think it's worth exploring Eddy's suggestion to try simulating
 failure
  by placing a file where the code expects to see a directory.  That
 might
  even let us enable some of these tests that are skipped on Windows,
  because Windows allows access for the owner even after permissions have
  been stripped.
 
  Chris Nauroth
  Hortonworks
  http://hortonworks.com/
 
 
 
 
 
 
  On 3/11/15, 2:10 PM, Colin McCabe cmcc...@alumni.cmu.edu wrote:
 
  Is there a maven plugin or setting we can use to simply remove
  directories that have no executable permissions on them?  Clearly we
  have the permission to do this from a technical point of view (since
  we created the directories as the jenkins user), it's simply that the
  code refuses to do it.
  
  Otherwise I guess we can just fix those tests...
  
  Colin
  
  On Tue, Mar 10, 2015 at 2:43 PM, Lei Xu l...@cloudera.com wrote:
   Thanks a lot for looking into HDFS-7722, Chris.
  
   In HDFS-7722:
   TestDataNodeVolumeFailureXXX tests reset data dir permissions in
  TearDown().
   TestDataNodeHotSwapVolumes reset permissions in a finally clause.
  
   Also I ran mvn test several times on my machine and all tests
 passed.
  
   However, since in DiskChecker#checkDirAccess():
  
   private static void checkDirAccess(File dir) throws DiskErrorException {
     if (!dir.isDirectory()) {
       throw new DiskErrorException("Not a directory: " + dir.toString());
     }

     checkAccessByFileMethods(dir);
   }
  
   One potentially safer alternative is replacing the data dir with a
   regular file to simulate disk failures.
  
   On Tue, Mar 10, 2015 at 2:19 PM, Chris Nauroth
  cnaur...@hortonworks.com wrote:
   TestDataNodeHotSwapVolumes, TestDataNodeVolumeFailure,
   TestDataNodeVolumeFailureReporting, and
   TestDataNodeVolumeFailureToleration all remove executable
 permissions
  from
   directories like the one Colin mentioned to simulate disk failures
 at
  data
   nodes.  I reviewed the code for all of those, and they all appear
 to be
   doing the necessary work to restore executable permissions at the
 end
  of
    the test.  The only recent uncommitted patch I've seen that makes
  changes
   in these test suites is HDFS-7722.  That patch still looks fine
  though.  I
    don't know

[DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)

2015-06-06 Thread Sean Busbey
Sorry for the resend. I figured this deserves a [DISCUSS] flag.



On Sat, Jun 6, 2015 at 10:39 PM, Sean Busbey bus...@cloudera.com wrote:

 Hi Folks!

 After working on test-patch with other folks for the last few months, I
 think we've reached the point where we can make the fastest progress
 towards the goal of a general use pre-commit patch tester by spinning
 things into a project focused on just that. I think we have a mature enough
 code base and a sufficient fledgling community, so I'm going to put
 together a tlp proposal.

 Thanks for the feedback thus far from use within Hadoop. I hope we can
 continue to make things more useful.

 -Sean

 On Wed, Mar 11, 2015 at 5:16 PM, Sean Busbey bus...@cloudera.com wrote:

 HBase's dev-support folder is where the scripts and support files live.
 We've only recently started adding anything to the maven builds that's
 specific to jenkins[1]; so far it's diagnostic stuff, but that's where I'd
 add in more if we ran into the same permissions problems y'all are having.

 There's also our precommit job itself, though it isn't large[2]. AFAIK,
 we don't properly back this up anywhere, we just notify each other of
 changes on a particular mail thread[3].

 [1]: https://github.com/apache/hbase/blob/master/pom.xml#L1687
 [2]: https://builds.apache.org/job/PreCommit-HBASE-Build/ (they're all
 red because I just finished fixing mvn site running out of permgen)
 [3]: http://s.apache.org/NT0


 On Wed, Mar 11, 2015 at 4:51 PM, Chris Nauroth cnaur...@hortonworks.com
 wrote:

 Sure, thanks Sean!  Do we just look in the dev-support folder in the
 HBase
 repo?  Is there any additional context we need to be aware of?

 Chris Nauroth
 Hortonworks
 http://hortonworks.com/






 On 3/11/15, 2:44 PM, Sean Busbey bus...@cloudera.com wrote:

 +dev@hbase
 
 HBase has recently been cleaning up our precommit jenkins jobs to make
 them
 more robust. From what I can tell our stuff started off as an earlier
 version of what Hadoop uses for testing.
 
 Folks on either side open to an experiment of combining our precommit
 check
 tooling? In principle we should be looking for the same kinds of things.
 
 Naturally we'll still need different jenkins jobs to handle different
 resource needs and we'd need to figure out where stuff eventually lives,
 but that could come later.
 
 On Wed, Mar 11, 2015 at 4:34 PM, Chris Nauroth 
 cnaur...@hortonworks.com
 wrote:
 
  The only thing I'm aware of is the failOnError option:
 
 
 
  http://maven.apache.org/plugins/maven-clean-plugin/examples/ignoring-errors.html
 
 
  I prefer that we don't disable this, because ignoring different kinds
 of
  failures could leave our build directories in an indeterminate state.
 For
  example, we could end up with an old class file on the classpath for
 test
  runs that was supposedly deleted.
 
  I think it's worth exploring Eddy's suggestion to try simulating
 failure
  by placing a file where the code expects to see a directory.  That
 might
  even let us enable some of these tests that are skipped on Windows,
  because Windows allows access for the owner even after permissions
 have
  been stripped.
 
  Chris Nauroth
  Hortonworks
  http://hortonworks.com/
 
 
 
 
 
 
  On 3/11/15, 2:10 PM, Colin McCabe cmcc...@alumni.cmu.edu wrote:
 
  Is there a maven plugin or setting we can use to simply remove
  directories that have no executable permissions on them?  Clearly we
  have the permission to do this from a technical point of view (since
  we created the directories as the jenkins user), it's simply that the
  code refuses to do it.
  
  Otherwise I guess we can just fix those tests...
  
  Colin
  
  On Tue, Mar 10, 2015 at 2:43 PM, Lei Xu l...@cloudera.com wrote:
   Thanks a lot for looking into HDFS-7722, Chris.
  
   In HDFS-7722:
   TestDataNodeVolumeFailureXXX tests reset data dir permissions in
  TearDown().
   TestDataNodeHotSwapVolumes reset permissions in a finally clause.
  
   Also I ran mvn test several times on my machine and all tests
 passed.
  
   However, since in DiskChecker#checkDirAccess():
  
   private static void checkDirAccess(File dir) throws DiskErrorException {
     if (!dir.isDirectory()) {
       throw new DiskErrorException("Not a directory: " + dir.toString());
     }

     checkAccessByFileMethods(dir);
   }
  
   One potentially safer alternative is replacing the data dir with a
   regular file to simulate disk failures.
  
   On Tue, Mar 10, 2015 at 2:19 PM, Chris Nauroth
  cnaur...@hortonworks.com wrote:
   TestDataNodeHotSwapVolumes, TestDataNodeVolumeFailure,
   TestDataNodeVolumeFailureReporting, and
   TestDataNodeVolumeFailureToleration all remove executable
 permissions
  from
   directories like the one Colin mentioned to simulate disk failures
 at
  data
   nodes.  I reviewed the code for all of those, and they all appear
 to be
   doing the necessary work to restore executable permissions at the
 end
  of
   the test.  The only

Re: Where to put some code examples?

2015-06-23 Thread Sean Busbey
Could they go under dev-support?

On Tue, Jun 23, 2015 at 4:29 PM, Ray Chiang rchi...@cloudera.com wrote:

 So, as far as I can see, Hadoop has the main developer area for core Hadoop
 code, unit tests in the test directories, user scripts (like
 hadoop/mapred/yarn), and build scripts.

 I've got some utilities that are really for Hadoop contributors.  These
 serve two purposes:

1. These are just generally useful as private API examples
2. They have some utility for developer purposes (e.g. the random .jhist
generator I'm working on for MAPREDUCE-6376)

 Does anyone have suggestions for where such code bits (and possibly
 corresponding scripts) should go?

 -Ray




-- 
Sean


Re: [DISCUSS] More Maintenance Releases

2015-06-23 Thread Sean Busbey
On Tue, Jun 23, 2015 at 12:30 PM, Andrew Wang andrew.w...@cloudera.com
wrote:

 There's no reason we have to choose 2.6 xor 2.7. If we have willing RMs and
 enough PMCs who will vote on releases, there's no reason we can't maintain
 both.

 However, based on the discussion at Hadoop Summit with Yahoo and Twitter,
 their interest is primarily in 2.6, and Daryn mentioned the need to get 2.6
 stable before they can move to 2.7. So, if we want to help out these big
 users, it seems like we should focus on maintaining 2.6.

 Allen also brought up the issue of JDK6. I see a few options (ranked best
 to worst in my eyes):

 * Add multi-JDK support to test-patch
 * Keep using JDK7 for precommit, and keep an eye on a nightly JDK6 run
 * Drop support for JDK6 in 2.6.x, since no one is using it anymore

 #3 is the easiest and probably fine for 95% of users, but doing a big
 compat break is not how I'd want to kick off a stable release line. #2
 isn't too bad if we don't want to wait for #1.



I believe work on adding multi-JDK support to test-patch is ongoing. It should
be high on the list once we get our dev branch up.


-- 
Sean


Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)

2015-06-24 Thread Sean Busbey
Hi Folks!

Work in a feature branch is now being tracked by HADOOP-12111.

On Thu, Jun 18, 2015 at 10:07 PM, Sean Busbey bus...@cloudera.com wrote:

 It looks like we have consensus.

 I'll start drafting up a proposal for the next board meeting (July 15th).
 Once we work out the name I'll submit a PODLINGNAMESEARCH jira to track
 that we did due diligence on whatever we pick.

 In the mean time, Hadoop PMC would y'all be willing to host us in a branch
 so that we can start prepping things now? We would want branch commit
 rights for the proposed new PMC.


 -Sean


 On Mon, Jun 15, 2015 at 6:47 PM, Sean Busbey bus...@cloudera.com wrote:

 Oof. I had meant to push on this again but life got in the way and now
 the June board meeting is upon us. Sorry everyone. In the event that this
 ends up contentious, hopefully one of the copied communities can give us a
 branch to work in.

 I know everyone is busy, so here's the short version of this email: I'd
 like to move some of the code currently in Hadoop (test-patch) into a new
 TLP focused on QA tooling. I'm not sure what the best format for priming
 this conversation is. ORC filled in the incubator project proposal
 template, but I'm not sure how much that confused the issue. So to start,
 I'll just write what I'm hoping we can accomplish in general terms here.

 All software development projects that are community based (that is,
 accepting outside contributions) face a common QA problem for vetting
 in-coming contributions. Hadoop is fortunate enough to be sufficiently
 popular that the weight of the problem drove tool development (i.e.
 test-patch). That tool is generalizable enough that a bunch of other TLPs
 have adopted their own forks. Unfortunately, in most projects this kind of
 QA work is an enabler rather than a primary concern, so often the tooling
 is worked on ad hoc, and few shared improvements happen across projects. Since
 the tooling itself is never a primary concern, any improvement made is rarely
 reused outside of ASF projects.

 Over the last couple months a few of us have been working on generalizing
 the tooling present in the Hadoop code base (because it was the most mature
 out of all those in the various projects) and it's reached a point where we
 think we can start bringing on other downstream users. This means we need
 to start establishing things like a release cadence and to grow the new
 contributors we have to handle more project responsibility. Personally, I
 think that means it's time to move out from under Hadoop to drive things as
 our own community. Eventually, I hope the community can help draw in a
 group of folks traditionally underrepresented in ASF projects, namely QA
 and operations folks.

 I think test-patch by itself has enough scope to justify a project.
 Having a solid set of build tools that are customizable to fit the norms of
 different software communities is a bunch of work. Making it work well in
 both the context of automated test systems like Jenkins and for individual
 developers is even more work. We could easily also take over maintenance of
 things like shelldocs, since test-patch is the primary consumer of that
 currently but it's generally useful tooling.

 In addition to test-patch, I think the proposed project has some future
 growth potential. Given some adoption of test-patch to prove utility, the
 project could build on the ties it makes to start building tools to help
 projects do their own longer-run testing. Note that I'm talking about the
 tools to build QA processes and not a particular set of tested components.
 Specifically, I think the ChaosMonkey work that's in HBase should be
 generalizable as a fault injection framework (either based on that code or
 something like it). Doing this for arbitrary software is obviously very
 difficult, and a part of easing that will be to make (and then favor)
 tooling to allow projects to have operational glue that looks the same.
 Namely, the shell work that's been done in hadoop-functions.sh would be a
 great foundational layer that could bring good daemon handling practices to
 a whole slew of software projects. In the event that these frameworks and
 tools get adopted by parts of the Hadoop ecosystem, that could make the job
 of i.e. Bigtop substantially easier.

 I've reached out to a few folks who have been involved in the current
 test-patch work or expressed interest in helping out on getting it used in
 other projects. Right now, the proposed PMC would be (alphabetical by last
 name):

 * Andrew Bayer (ASF member, incubator pmc, bigtop pmc, flume pmc, jclouds
 pmc, sqoop pmc, all around Jenkins expert)
 * Sean Busbey (ASF member, accumulo pmc, hbase pmc)
 * Nick Dimiduk (hbase pmc, phoenix pmc)
 * Chris Nauroth (ASF member, incubator pmc, hadoop pmc)
 * Andrew Purtell  (ASF member, incubator pmc, bigtop pmc, hbase pmc,
 phoenix pmc)
 * Allen Wittenauer (hadoop committer)

 That PMC gives us several members and a bunch of folks familiar

Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)

2015-06-18 Thread Sean Busbey
It looks like we have consensus.

I'll start drafting up a proposal for the next board meeting (July 15th).
Once we work out the name I'll submit a PODLINGNAMESEARCH jira to track
that we did due diligence on whatever we pick.

In the mean time, Hadoop PMC would y'all be willing to host us in a branch
so that we can start prepping things now? We would want branch commit
rights for the proposed new PMC.


-Sean


On Mon, Jun 15, 2015 at 6:47 PM, Sean Busbey bus...@cloudera.com wrote:

 Oof. I had meant to push on this again but life got in the way and now the
 June board meeting is upon us. Sorry everyone. In the event that this ends
 up contentious, hopefully one of the copied communities can give us a
 branch to work in.

 I know everyone is busy, so here's the short version of this email: I'd
 like to move some of the code currently in Hadoop (test-patch) into a new
 TLP focused on QA tooling. I'm not sure what the best format for priming
 this conversation is. ORC filled in the incubator project proposal
 template, but I'm not sure how much that confused the issue. So to start,
 I'll just write what I'm hoping we can accomplish in general terms here.

 All software development projects that are community based (that is,
 accepting outside contributions) face a common QA problem for vetting
 in-coming contributions. Hadoop is fortunate enough to be sufficiently
 popular that the weight of the problem drove tool development (i.e.
 test-patch). That tool is generalizable enough that a bunch of other TLPs
 have adopted their own forks. Unfortunately, in most projects this kind of
 QA work is an enabler rather than a primary concern, so often the tooling
 is worked on ad hoc, and few shared improvements happen across projects. Since
 the tooling itself is never a primary concern, any improvement made is rarely
 reused outside of ASF projects.

 Over the last couple months a few of us have been working on generalizing
 the tooling present in the Hadoop code base (because it was the most mature
 out of all those in the various projects) and it's reached a point where we
 think we can start bringing on other downstream users. This means we need
 to start establishing things like a release cadence and to grow the new
 contributors we have to handle more project responsibility. Personally, I
 think that means it's time to move out from under Hadoop to drive things as
 our own community. Eventually, I hope the community can help draw in a
 group of folks traditionally underrepresented in ASF projects, namely QA
 and operations folks.

 I think test-patch by itself has enough scope to justify a project. Having
 a solid set of build tools that are customizable to fit the norms of
 different software communities is a bunch of work. Making it work well in
 both the context of automated test systems like Jenkins and for individual
 developers is even more work. We could easily also take over maintenance of
 things like shelldocs, since test-patch is the primary consumer of that
 currently but it's generally useful tooling.

 In addition to test-patch, I think the proposed project has some future
 growth potential. Given some adoption of test-patch to prove utility, the
 project could build on the ties it makes to start building tools to help
 projects do their own longer-run testing. Note that I'm talking about the
 tools to build QA processes and not a particular set of tested components.
 Specifically, I think the ChaosMonkey work that's in HBase should be
 generalizable as a fault injection framework (either based on that code or
 something like it). Doing this for arbitrary software is obviously very
 difficult, and a part of easing that will be to make (and then favor)
 tooling to allow projects to have operational glue that looks the same.
 Namely, the shell work that's been done in hadoop-functions.sh would be a
 great foundational layer that could bring good daemon handling practices to
 a whole slew of software projects. In the event that these frameworks and
 tools get adopted by parts of the Hadoop ecosystem, that could make the job
 of, e.g., Bigtop substantially easier.

 I've reached out to a few folks who have been involved in the current
 test-patch work or expressed interest in helping out on getting it used in
 other projects. Right now, the proposed PMC would be (alphabetical by last
 name):

 * Andrew Bayer (ASF member, incubator pmc, bigtop pmc, flume pmc, jclouds
 pmc, sqoop pmc, all around Jenkins expert)
 * Sean Busbey (ASF member, accumulo pmc, hbase pmc)
 * Nick Dimiduk (hbase pmc, phoenix pmc)
 * Chris Nauroth (ASF member, incubator pmc, hadoop pmc)
 * Andrew Purtell  (ASF member, incubator pmc, bigtop pmc, hbase pmc,
 phoenix pmc)
 * Allen Wittenauer (hadoop committer)

 That PMC gives us several members and a bunch of folks familiar with the
 ASF. Combined with the code already existing in Apache spaces, I think that
 gives us sufficient justification for a direct board proposal.

 The planned

Re: [DISCUSS] More Maintenance Releases

2015-06-22 Thread Sean Busbey
More maintenance releases would be excellent.


If y'all are going to make more releases on the 2.6 line, please consider
backporting HADOOP-11710 as without it HBase is unusable on top of HDFS
encryption. It's been inconvenient that the fix is only available in a
non-production release line.

-Sean

On Mon, Jun 22, 2015 at 6:36 AM, Tsuyoshi Ozawa oz...@apache.org wrote:

 Hi Akira,

 Thank you for starting this interesting topic. +1 on the idea of more
 maintenance releases for old branches. It would be good if this
 activity were coupled more closely with Apache Yetus for users.

 BTW, I don't know whether a committer who is not on the PMC can be a
 release manager. Does anyone know about this?  It's described in detail as
 follows: http://hadoop.apache.org/bylaws#Decision+Making

  Release Manager
  A Release Manager (RM) is a committer who volunteers to produce a
 Release Candidate according to HowToRelease.
 
  Project Management Committee
  Deciding what is distributed as products of the Apache Hadoop project.
 In particular all releases must be approved by the PMC

 Thanks,
 - Tsuyoshi

 On Mon, Jun 22, 2015 at 6:43 PM, Akira AJISAKA
 ajisa...@oss.nttdata.co.jp wrote:
  Hi everyone,
 
  At Hadoop Summit, I joined the HDFS BoF and heard from Jason Lowe that
  Apache Hadoop developers at Yahoo!, Twitter, and other non-distributors
  work very hard to maintain Hadoop by cherry-picking patches to their own
  branches.
 
  I want to share the work with the community. If we can cherry-pick bug fix
  patches and have more maintenance releases, it would be a great help not
  only for users but also for developers who work very hard to stabilize
  their own branches.
 
  To have more maintenance releases, I propose two changes:
 
  * Major/Minor/Trivial bug fixes can be cherry-picked
  * (Roughly) Monthly maintenance release
 
  I would like to start the work from branch-2.6. If the change is accepted
  by the community, I'm willing to work on the maintenance as a release
  manager.
 
  Best regards,
  Akira




-- 
Sean


Re: F 6/19: Jenkins clogged up

2015-06-19 Thread Sean Busbey
Thanks for the heads up.

On Fri, Jun 19, 2015 at 1:43 PM, Chris Nauroth cnaur...@hortonworks.com
wrote:

 Hi everyone,

 I was just in contact with Apache infrastructure.  Jenkins wasn't running
 jobs for a while, so there is a large backlog in the queue now (over 200
 jobs).  Infra has fixed the problems, so jobs are running now, but our
 test-patch runs might have to sit in the queue a long time today.

 --Chris Nauroth




-- 
Sean


Re: Pre-integration tests failing

2015-06-24 Thread Sean Busbey
On Tue, Jun 23, 2015 at 10:43 PM, Alan Burlison alan.burli...@oracle.com
wrote:

 On 24/06/2015 04:22, Sean Busbey wrote:

  Probably not (barring maven attempting to grab SNAPSHOT versions of other
 modules while building).

 What are the machine specs like? The complete unit test set requires a
 fair
 bit of machine power (i.e. more than my laptop can handle).


 The Linux machine is pretty old; it's a 4-core Opteron with 8GB mem. I
 haven't attempted test runs on Solaris yet as I know they won't complete
 successfully.


I would try things out on a heavier machine then. I know that I've gotten
clean test runs when using a proper server, but never have on my 2 core /
8GB mem laptop.

This is an area where we could do a better job of setting expectations for
contributors, but I'm not sure we have good stats about what kind of build
hardware is needed for a full build. Hopefully it's less than the H*
builds.apache machines. :)


-- 
Sean


Re: Where to put some code examples?

2015-06-24 Thread Sean Busbey
On Wed, Jun 24, 2015 at 2:10 AM, Ray Chiang rchi...@cloudera.com wrote:

 Thanks, dev-support sounds good.  The only question I have is that there
 isn't a pom.xml there now.  Is that something we'd want to have there?  And
 should it at least be linked to the main build via some option, like -Pdev?


I was working under the assumption that they'd be independent project poms
that a dev would have to actively change directories to use.

If they're hooked into the main build, I'd say add a new module instead. We
already have a few foo-examples modules, so maybe flag it as
foo-internal-examples to distinguish from things downstream users should
be looking to for guidance.

-- 
Sean


Re: Protocol Buffers version

2015-06-15 Thread Sean Busbey
Anyone have a read on how the protobuf folks would feel about that? Apache
has a history of not accepting projects that are non-amicable forks.

On Mon, Jun 15, 2015 at 9:24 AM, Allen Wittenauer a...@altiscale.com wrote:


 On Jun 12, 2015, at 1:03 PM, Alan Burlison alan.burli...@oracle.com
 wrote:

  On 14/05/2015 18:41, Chris Nauroth wrote:
 
  As a reminder though, the community probably would want to see a strong
  justification for the upgrade in terms of features or performance or
  something else.  Right now, I'm not seeing a significant benefit for us
  based on my reading of their release notes.  I think it's worthwhile to
  figure this out first.  Otherwise, there is a risk that any testing work
  turns out to be a wasted effort.
 
  One reason at least: PB 2.5.0 has no support for Solaris SPARC. 2.6.1
 does.


 That's a pretty good reason.

 Some of us had a discussion at Summit about effectively forking
 protobuf and making it an Apache TLP.  This would give us a chance to get
 out from under Google's blind spot, guarantee better compatibility across
 the ecosystem, etc, etc.

 It is sounding more and more like that's really what needs to
 happen.




-- 
Sean


Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)

2015-06-16 Thread Sean Busbey
I'm going to try responding to several things at once here, so apologies if
I miss anyone and sorry for the long email. :)


On Tue, Jun 16, 2015 at 3:44 PM, Steve Loughran ste...@hortonworks.com
wrote:

 I think it's good to have a general build/test process projects can share,
 so +1 to pulling it out. You should get help from others.

 regarding incubation, it is a lot of work, especially for something that's
 more of an in-house tool than an artifact to release and redistribute.

 You can't just use apache labs or the build project's repo to work on this?

 if you do want to incubate, we may want to nominate the hadoop project as
 the monitoring PMC, rather than incubator@.

 -steve


Important note: we're proposing a board resolution that would directly pull
this code base out into a new TLP; there'd be no incubator, we'd just
continue building community and start making releases.

The proposed PMC believes the tooling we're talking about has direct
applicability to projects well outside of the ASF. Lots of other open
source projects run on community contributions and have a general need for
better QA tools. Given that problem set and the presence of a community
working to solve it, there's no reason this needs to be treated as an
in-house build project. We certainly want to be useful to ASF projects, and
getting them on board will certainly be easier given our current
optimization for ASF infra, but we're not limited to that (and our current
prerequisites, a CI tool and JIRA or GitHub, are pretty broadly available).


On Tue, Jun 16, 2015 at 10:13 AM, Nick Dimiduk ndimi...@apache.org wrote:


 Since we're tossing out names, how about Apache Bootstrap? It's a
 meta-project to help other projects get off the ground, after all.



There's already a web development framework named Bootstrap[1]. It's also
used by several ASF projects, so I think it best to avoid the confusion.

The name is, of course, up to the proposed PMC. As a bit of background, the
current name Yetus fulfills Allen's desire to have something shell related
and my desire to have a project that starts with Y (there are currently no
ASF projects that start with Y). The universe of names that satisfy both is
very small, AFAICT. I did a brief suitability search and didn't find
any blockers.


 On Tue, Jun 16, 2015 at 11:59 AM, Allen Wittenauer a...@altiscale.com
 wrote:


 Since a couple of people have brought it up:

 I think the release question is probably one of the big question
 marks.  Other than tar balls, how does something like this actually get
 used downstream?

 For test-patch, in particular, I have a few thoughts on this:

 Short term:

 * Projects that want to move RIGHT NOW would modify their Jenkins
 jobs to checkout from the Yetus repo (preferably at a well known tag or
 branch) in one directory and their project repo in another directory.  Then
 it’s just a matter of passing the correct flags to test-patch.  This is
 pretty much how I’ve been personally running test-patch for about 6 months
 now. Under Jenkins, we’ve seen this work with NiFi (incubating) already.

 * Create a stub version of test-patch that projects could check
 into their repo, replacing the existing test-patch.  This stub version
 would git clone from either ASF or github and then execute test-patch
 accordingly on demand.  With the correct smarts, it could make sure it has
 a cached version to prevent continual clones.
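
 For illustration, a minimal sketch of that stub idea (the repo URL, cache
 path, and flag handling here are my own assumptions, not a spec; a real
 stub would need more smarts):

```shell
#!/usr/bin/env bash
# Hypothetical stub a project could check in as dev-support/test-patch.sh.
# It keeps a cached clone of the shared tooling and delegates to the real
# test-patch there. The URL and paths below are illustrative assumptions.
YETUS_REPO="${YETUS_REPO:-https://github.com/apache/yetus.git}"
CACHE_DIR="${CACHE_DIR:-${HOME}/.cache/yetus}"

stub_test_patch() {
  # With --dry-run, only print what would happen (used for this sketch).
  local dry_run=""
  if [ "$1" = "--dry-run" ]; then dry_run=yes; shift; fi
  if [ -d "${CACHE_DIR}/.git" ]; then
    echo "using cached clone at ${CACHE_DIR}"
  else
    echo "cloning ${YETUS_REPO} into ${CACHE_DIR}"
    [ -n "${dry_run}" ] || git clone --depth 1 "${YETUS_REPO}" "${CACHE_DIR}"
  fi
  echo "running: ${CACHE_DIR}/test-patch.sh $*"
  [ -n "${dry_run}" ] || "${CACHE_DIR}/test-patch.sh" "$@"
}

stub_test_patch --dry-run HADOOP-12345
```

 The cache check is what prevents continual clones; a fuller version would
 also refresh the cache against a well-known tag or branch.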

 Longer term:

 * I’ve been toying with the idea of (ab)using Java repos and
 packaging as a transportation layer, either in addition or in combination
 with something like a maven plugin.  Something like this would clearly be
 better for offline usage and/or to lower the network traffic.


It's important that the project follow ASF guidelines on publishing
releases[2]. So long as we publish releases to the distribution directory I
think we'd be fine having folks work off of the corresponding tag. I'm not
sure there's much reason to do that, however. A Jenkins job can just as
easily grab a release tarball as a git tag and we're not talking about a
large amount of stuff. The kind of build setup that Chris N mentioned is
also totally doable now that there's a build description DSL for Jenkins[3].

For individual developers, I don't see any reason we can't package things
up as a tool, similar to how findbugs or shellcheck work. We can make OS
packages (or homebrew for OS X) if we want to make stand alone installation
on developer machines real easy. Those same packages could be installed on
the ASF build machines, provided some ASF project wanted to make use of
Yetus.

Having releases will incur some turnaround time when folks want to see
fixes, but that's a trade-off around release cadence we can work out longer
term.

I would like to have one or two projects that can work off of the bleeding
edge repo, but we'd have to get that to mesh with foundation policy. My gut
tells me we should be 

Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)

2015-06-15 Thread Sean Busbey
Thank you for making a more digestible version, Allen. :)

If you're interested in soliciting feedback from other projects, I created
ASF short links to this thread in common-dev and hbase:


* http://s.apache.org/yetus-discuss-hadoop
* http://s.apache.org/yetus-discuss-hbase

While I agree that it's important to get feedback from ASF projects that
might find this useful, I can say that recently I've been involved in the
non-ASF project YCSB and both the pretest and better shell stuff would be
immensely useful over there.

On Mon, Jun 15, 2015 at 10:36 PM, Allen Wittenauer a...@altiscale.com wrote:


 I'm clearly +1 on this idea.  As part of the rewrite in Hadoop of
 test-patch, it was amazing to see how far and wide this bit of code has
 spread.  So I see consolidating everyone's efforts as a huge win for a
 large number of projects.  (Especially considering how many I saw suffering
 from a variety of identified bugs!)

 But….

 I think it's important for people involved in those other projects
 to speak up and voice an opinion as to whether this is useful.

 To summarize:

 In the short term, a single location to get/use a precommit patch
 tester rather than everyone building/supporting their own in their spare
 time.

  FWIW, we've already got the code base modified to be pluggable.
 We've written some basic/simple plugins that support Hadoop, HBase, Tajo,
 Tez, Pig, and Flink.  For HBase and Flink, this does include their custom
 checks.  Adding support for other projects shouldn't be hard.  Simple
 projects take almost no time after seeing the basic pattern.

 I think it's worthwhile highlighting that means support for both
 JIRA and GitHub as well as Ant and Maven from the same code base.

 Longer term:

 Well, we clearly have ideas of things that we want to do. Adding
 more features to test-patch (review board? gradle?) is obvious. But what
 about teasing apart and generalizing some of the other shell bits from
 projects? From a common library for building CLI tools, to fault injection,
 to release documentation creation tools, to …  I'd even like to see us get
 as advanced as a "run this program to auto-generate daemon stop/start bits"
 capability.

 I had a few chats with people about this idea at Hadoop Summit.
 What's truly exciting are the ideas that people had once they realized what
 kinds of problems we're trying to solve.  It's always amazing the problems
 that projects have that could be solved by these types of solutions.  Let's
 stop hiding our cool toys in this area.

 So, what feedback and ideas do you have in this area?  Are you a
 yay or a nay?


 On Jun 15, 2015, at 4:47 PM, Sean Busbey bus...@cloudera.com wrote:

  Oof. I had meant to push on this again but life got in the way and now
 the
  June board meeting is upon us. Sorry everyone. In the event that this
 ends
  up contentious, hopefully one of the copied communities can give us a
  branch to work in.
 
  I know everyone is busy, so here's the short version of this email: I'd
  like to move some of the code currently in Hadoop (test-patch) into a new
  TLP focused on QA tooling. I'm not sure what the best format for priming
  this conversation is. ORC filled in the incubator project proposal
  template, but I'm not sure how much that confused the issue. So to start,
  I'll just write what I'm hoping we can accomplish in general terms here.
 
  All software development projects that are community based (that is,
  accepting outside contributions) face a common QA problem for vetting
  in-coming contributions. Hadoop is fortunate enough to be sufficiently
  popular that the weight of the problem drove tool development (i.e.
  test-patch). That tool is generalizable enough that a bunch of other TLPs
  have adopted their own forks. Unfortunately, in most projects this kind of
  QA work is an enabler rather than a primary concern, so the tooling is
  often worked on ad hoc and few improvements are shared across projects.
  Since the tooling itself is never a primary concern, any improvement made
  is rarely reused outside of ASF projects.
 
  Over the last couple months a few of us have been working on generalizing
  the tooling present in the Hadoop code base (because it was the most
 mature
  out of all those in the various projects) and it's reached a point where
 we
  think we can start bringing on other downstream users. This means we need
  to start establishing things like a release cadence and to grow the new
  contributors we have to handle more project responsibility. Personally, I
  think that means it's time to move out from under Hadoop to drive things
 as
  our own community. Eventually, I hope the community can help draw in a
  group of folks traditionally underrepresented in ASF projects, namely QA
  and operations folks.
 
  I think test-patch by itself has enough scope to justify a project.
 Having
  a solid set of build tools that are customizable to fit the norms

Re: set up jenkins test for branch-2

2015-06-15 Thread Sean Busbey
On Mon, Jun 15, 2015 at 1:39 PM, Yongjun Zhang yzh...@cloudera.com wrote:

 Thanks Sean and Allen!

 I was not aware that there is already a way to trigger branch-2 tests.
 Good to know.

 There are multiple solutions here:

 1. When posting patches, we can post two versions of each patch, one for
 trunk and one for branch-2, to trigger pre-commit Jenkins tests for both
 branches. This would help catch issues before committing. However, that
 puts more load on the testing machines, so alternatively we could defer
 the branch-2 testing until after committing to trunk and before committing
 to branch-2.

 2. Only post a patch for trunk, and cherry-pick to branch-2 when
 committing. We can set up a periodic Jenkins test (such as nightly) for
 branch-2 to catch problems. But, as Allen pointed out, that's basically
 being ignored by us.



I wouldn't worry about the load on the test machines. If you do, a third
option is for committers to use test-patch on their local machine after
making the branch-2 version of the patch.
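
A rough sketch of that third option as a dry run (the test-patch flag shown
is an assumption and varies by version; this only prints the commands a
committer would run, it doesn't run them):

```shell
# Print the local pre-commit steps for checking a branch-2 version of a
# patch before pushing. Dry-run sketch only; flag names are assumptions.
local_precommit_plan() {
  local patch="$1" branch="$2"
  echo "git checkout ${branch}"
  echo "dev-support/test-patch.sh --dirty-workspace ${patch}"
}

local_precommit_plan HDFS-9999-branch-2.001.patch branch-2
```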

-- 
Sean


Re: set up jenkins test for branch-2

2015-06-14 Thread Sean Busbey
Pre-commit will already test on branch-2, provided you follow the patch
naming guidelines.

there is also a branch-2 specific jenkins job:
https://builds.apache.org/view/H-L/view/Hadoop/job/Hadoop-branch2/

I'd suggest starting by looking at that job and filing jiras to address
whatever the failures are. May 14th was the last time it wasn't marked as
failed and that build was unstable, so there's probably a good deal of work.
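
As a sketch of how the naming guideline routes a patch (the JIRA numbers
below are made up, and the exact accepted suffixes are defined by the
Hadoop wiki, so treat these patterns as an approximation):

```shell
# Derive the target branch from a patch attachment's file name, mirroring
# the pre-commit naming guideline: no branch suffix means trunk, while a
# "-branch-2" component routes the run to branch-2.
patch_branch() {
  case "$1" in
    *-branch-2*.patch) echo "branch-2" ;;
    *.patch)           echo "trunk" ;;
    *)                 echo "unknown" ;;
  esac
}

patch_branch HDFS-9999.001.patch           # prints: trunk
patch_branch HDFS-9999-branch-2.001.patch  # prints: branch-2
```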

On Sun, Jun 14, 2015 at 3:00 PM, Yongjun Zhang yzh...@cloudera.com wrote:

 Hi,

 We touched this topic before but it was put on hold. I'd like to bring it
 to our attention again.

 From time to time we saw changes that work fine in trunk but not branch-2,
 and we don't catch the issue in a timely manner. The difference between
 trunk and branch-2 is sufficient to justify periodic jenkins test and even
 pre-commit test for branch-2.

 I created https://issues.apache.org/jira/browse/INFRA-9226 earlier but I'm
 not sure who are the right folks to take care of it.

 Any one could help follow-up?

 Thanks a lot and best regards,

 --Yongjun




-- 
Sean


Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)

2015-06-15 Thread Sean Busbey
Oof. I had meant to push on this again but life got in the way and now the
June board meeting is upon us. Sorry everyone. In the event that this ends
up contentious, hopefully one of the copied communities can give us a
branch to work in.

I know everyone is busy, so here's the short version of this email: I'd
like to move some of the code currently in Hadoop (test-patch) into a new
TLP focused on QA tooling. I'm not sure what the best format for priming
this conversation is. ORC filled in the incubator project proposal
template, but I'm not sure how much that confused the issue. So to start,
I'll just write what I'm hoping we can accomplish in general terms here.

All software development projects that are community based (that is,
accepting outside contributions) face a common QA problem for vetting
in-coming contributions. Hadoop is fortunate enough to be sufficiently
popular that the weight of the problem drove tool development (i.e.
test-patch). That tool is generalizable enough that a bunch of other TLPs
have adopted their own forks. Unfortunately, in most projects this kind of
QA work is an enabler rather than a primary concern, so the tooling is often
worked on ad hoc and few improvements are shared across projects. Since the
tooling itself is never a primary concern, any improvement made is rarely
reused outside of ASF projects.

Over the last couple months a few of us have been working on generalizing
the tooling present in the Hadoop code base (because it was the most mature
out of all those in the various projects) and it's reached a point where we
think we can start bringing on other downstream users. This means we need
to start establishing things like a release cadence and to grow the new
contributors we have to handle more project responsibility. Personally, I
think that means it's time to move out from under Hadoop to drive things as
our own community. Eventually, I hope the community can help draw in a
group of folks traditionally underrepresented in ASF projects, namely QA
and operations folks.

I think test-patch by itself has enough scope to justify a project. Having
a solid set of build tools that are customizable to fit the norms of
different software communities is a bunch of work. Making it work well in
both the context of automated test systems like Jenkins and for individual
developers is even more work. We could easily also take over maintenance of
things like shelldocs, since test-patch is the primary consumer of that
currently but it's generally useful tooling.

In addition to test-patch, I think the proposed project has some future
growth potential. Given some adoption of test-patch to prove utility, the
project could build on the ties it makes to start building tools to help
projects do their own longer-run testing. Note that I'm talking about the
tools to build QA processes and not a particular set of tested components.
Specifically, I think the ChaosMonkey work that's in HBase should be
generalizable as a fault injection framework (either based on that code or
something like it). Doing this for arbitrary software is obviously very
difficult, and a part of easing that will be to make (and then favor)
tooling to allow projects to have operational glue that looks the same.
Namely, the shell work that's been done in hadoop-functions.sh would be a
great foundational layer that could bring good daemon handling practices to
a whole slew of software projects. In the event that these frameworks and
tools get adopted by parts of the Hadoop ecosystem, that could make the job
of, e.g., Bigtop substantially easier.

I've reached out to a few folks who have been involved in the current
test-patch work or expressed interest in helping out on getting it used in
other projects. Right now, the proposed PMC would be (alphabetical by last
name):

* Andrew Bayer (ASF member, incubator pmc, bigtop pmc, flume pmc, jclouds
pmc, sqoop pmc, all around Jenkins expert)
* Sean Busbey (ASF member, accumulo pmc, hbase pmc)
* Nick Dimiduk (hbase pmc, phoenix pmc)
* Chris Nauroth (ASF member, incubator pmc, hadoop pmc)
* Andrew Purtell  (ASF member, incubator pmc, bigtop pmc, hbase pmc,
phoenix pmc)
* Allen Wittenauer (hadoop committer)

That PMC gives us several members and a bunch of folks familiar with the
ASF. Combined with the code already existing in Apache spaces, I think that
gives us sufficient justification for a direct board proposal.

The planned project name is Apache Yetus. It's an archaic genus of sea
snail and most of our project will be focused on shell scripts.

N.b.: this does not mean that the Hadoop community would _have_ to rely on
the new TLP, but I hope that once we have a release that can be evaluated
there'd be enough benefit to strongly encourage it.

This has mostly been focused on scope and community issues, and I'd love to
talk through any feedback on that. Additionally, are there any other points
folks want to make sure are covered before we have a resolution?

On Sat, Jun 6, 2015

Re: JIRA admin question: Moving a bug from Hadoop to YARN

2015-05-27 Thread Sean Busbey
You might have to do things in two steps:

* move from HADOOP tracker to YARN tracker
* convert from YARN jira to subtask of YARN-3719

On Wed, May 27, 2015 at 10:52 AM, Ted Yu yuzhih...@gmail.com wrote:

 When you click the More button, you should see an action called Move.

 You can move the existing JIRA.

 FYI

 On Wed, May 27, 2015 at 8:47 AM, Alan Burlison alan.burli...@oracle.com
 wrote:

  HADOOP-11952
  Native compilation on Solaris fails on Yarn due to use of FTS
 
  is actually a YARN bug, not a Hadoop one and should be moved under the
  top-level Solaris/YARN Jira:
 
  YARN-3719 Improve Solaris support in YARN
 
  Is that possible or do I have to close the current bug and open a fresh
  one against YARN, copying everything across manually?
 
  Thanks,
 
  --
  Alan Burlison
  --
 




-- 
Sean


edit rights on the hadoop wiki

2015-05-27 Thread Sean Busbey
Hi!

Can I get edit rights on the Hadoop wiki?

-- 
Sean


Re: Does repository work correctly?

2015-07-06 Thread Sean Busbey
there appears to be an outage currently.

https://issues.apache.org/jira/browse/INFRA-9934

On Mon, Jul 6, 2015 at 3:02 PM, Colin P. McCabe cmcc...@apache.org wrote:

 I am getting the same error now.  Did we ever find the root cause of
 this problem?

 cmccabe@keter:~/hadoop2 git push
 Counting objects: 76, done.
 Delta compression using up to 4 threads.
 Compressing objects: 100% (12/12), done.
 Writing objects: 100% (15/15), 1.24 KiB | 0 bytes/s, done.
 Total 15 (delta 10), reused 0 (delta 0)
 remote: You are not authorized to edit this repository.
 remote:
 To https://git-wip-us.apache.org/repos/asf/hadoop.git
  ! [remote rejected] trunk -> trunk (pre-receive hook declined)
 error: failed to push some refs to
 'https://git-wip-us.apache.org/repos/asf/hadoop.git'

 Colin


 On Tue, Feb 24, 2015 at 6:24 PM, Tsuyoshi Ozawa oz...@apache.org wrote:
 It works well now! Maybe someone fixed the issue... thank you very much!
 
  - Tsuyoshi
 
  On Wed, Feb 25, 2015 at 11:03 AM, Tsuyoshi Ozawa oz...@apache.org
 wrote:
  Hi,
 
  I tried to commit YARN-3247, but it failed because of an error with
  message You are not authorized to edit this repository.
  I confirmed that it worked 10 hours ago. Can someone commit well now?
 
  Username for 'https://git-wip-us.apache.org': ozawa
  Password for 'https://oz...@git-wip-us.apache.org':
  Counting objects: 19, done.
  Delta compression using up to 4 threads.
  Compressing objects: 100% (14/14), done.
  Writing objects: 100% (19/19), 3.01 KiB | 0 bytes/s, done.
  Total 19 (delta 8), reused 0 (delta 0)
  remote: You are not authorized to edit this repository.
  remote:
  To https://git-wip-us.apache.org/repos/asf/hadoop.git
   ! [remote rejected] trunk -> trunk (pre-receive hook declined)
  error: failed to push some refs to
  'https://git-wip-us.apache.org/repos/asf/hadoop.git'
 
  Thanks,
  - Tsuyoshi




-- 
Sean


Re: Planning Hadoop 2.6.1 release

2015-08-05 Thread Sean Busbey
If we haven't frozen yet, HDFS-8850 is a straightforward fix that is
currently only in 2.8+ and would benefit 2.6 and 2.7.

On Wed, Aug 5, 2015 at 2:56 PM, Junping Du j...@hortonworks.com wrote:

 I would like to nominate YARN-3832 as a 2.6.1 candidate; it is critical,
 and I also saw it happen recently on a 2.6.0 cluster. Hope this is not
 too late.

 Thanks,

 Junping
 
 From: Rich Haase rha...@pandora.com
 Sent: Thursday, August 06, 2015 1:52 AM
 To: mapreduce-...@hadoop.apache.org; yarn-...@hadoop.apache.org
 Cc: common-dev@hadoop.apache.org; hdfs-...@hadoop.apache.org
 Subject: Re: Planning Hadoop 2.6.1 release

 +1 to add those fixes.


 Rich Haase | Sr. Software Engineer | Pandora
 m 303.887.1146 | rha...@pandora.com




 On 8/5/15, 11:42 AM, Wangda Tan wheele...@gmail.com wrote:

 Can we add the following two fixes to 2.6.1?
 
 https://issues.apache.org/jira/browse/YARN-2922 and
 https://issues.apache.org/jira/browse/YARN-3487.
 
 They're not fatal issues, but they can cause lots of issues in a large
 cluster.
 
 Thanks,
 Wangda
 
 
 On Mon, Aug 3, 2015 at 1:21 PM, Sangjin Lee sj...@apache.org wrote:
 
  See my later update in the thread. HDFS-7704 is in the list.
 
  Thanks,
  Sangjin
 
  On Mon, Aug 3, 2015 at 1:19 PM, Vinod Kumar Vavilapalli 
  vino...@hortonworks.com wrote:
 
   Makes sense, it was caused by HDFS-7704 which got into 2.7.0 only and
 is
   not part of the candidate list. Removed HDFS-7916 from the list.
  
   Thanks
   +Vinod
  
On Jul 24, 2015, at 6:32 PM, Sangjin Lee sj...@apache.org wrote:
   
Out of the JIRAs we proposed, please remove HDFS-7916. I don't
 think it
applies to 2.6.
   
Thanks,
Sangjin
  
  
 




-- 
Sean


Re: [DISCUSS] Additional maintenance releases for Hadoop 2.y versions

2015-07-16 Thread Sean Busbey
On Thu, Jul 16, 2015 at 9:17 AM, Karthik Kambatla ka...@cloudera.com
wrote:

 On Thu, Jul 16, 2015 at 4:59 AM, Steve Loughran ste...@hortonworks.com
 wrote:


  -any change to the signature of an API, including exception types  text
  -changes to wire formats
 

 These two should hold for minor releases also, no?


At the risk of derailing this thread: no, definitely not. "Any change" would
include backwards-compatible additions. Using this stricter restriction is
great for patch releases, since it means a user can safely move onto a newer
patch release with the assurance that, if some regression shows up, they can
move back to an earlier patch release without the risk that changes made in
their application since upgrading rely on an addition and therefore break.


-- 
Sean


Re: [DISCUSS] Additional maintenance releases for Hadoop 2.y versions

2015-07-15 Thread Sean Busbey
Why not just have the discussion here? It seems integral to the matter of
having more maintenance releases on those versions.

On Wed, Jul 15, 2015 at 11:39 AM, Karthik Kambatla ka...@cloudera.com
wrote:

 I believe there was general consensus to do more maintenance releases, as
 witnessed in the other thread.

 There have been discussions on what should go into 2.x.1, 2.x.2, etc., but
 I don't think we have a clear proposal. It would be nice to put that
 together, so committers know which branches to commit patches to. Otherwise,
 release managers will have to look through branch-2 and pull in the fixes.
 Either approach is fine by me, but would be nice to start a [DISCUSS]
 thread on how to go about this.

 Any takers?

 On Wed, Jul 15, 2015 at 8:57 AM, Sangjin Lee sjl...@gmail.com wrote:

  Strong +1 for having a 2.6.1 release. I understand Vinod has been trying
 to
  get that effort going but it's been stalled a little bit. It would be
 good
  to rekindle that effort.
 
  Companies with big hadoop 2.x deployments (including mine) have always
  tried to stabilize a 2.x release by testing/collecting/researching
 critical
  issues on the release. Each would come up with its own set of fixes to
  backport. We would also communicate it via offline channels. During the
  hadoop summit, we thought it would be great if we all came together and
  create a public stability/bugfix release on top of 2.x (2.6.1 for 2.6 for
  example) with all the critical issues fixed.
 
  Thanks,
  Sangjin
 
 
  On Tue, Jul 14, 2015 at 10:42 PM, Tsuyoshi Ozawa oz...@apache.org
 wrote:
 
   Thank you for the notification. Trying to back port bug fixes.
  
   - Tsuyoshi
  
   On Wed, Jul 15, 2015 at 3:45 AM, Sean Busbey bus...@cloudera.com
  wrote:
Hi Hadoopers!
   
Over in HBase we've been discussing the impact of our dependencies on
  our
downstream users. As our most fundamental dependency, Hadoop plays a
  big
role in the operational cost of running an HBase instance.
   
Currently the HBase 1.y release line supports Hadoop 2.4, 2.5, and
   2.6[1].
We don't drop Hadoop minor release lines in minor releases so we are
unlikely to remove anything from this set until HBase 2.0, probably at
 the
   end
of 2015 / start of 2016 (and currently we plan to continue supporting
  at
least 2.4 for HBase 2.0 [2]). Lately we've been discussing updating
 our
shipped binaries to Hadoop 2.6, following some stability testing by
  part
   of
our community[3]. Unfortunately, 2.6.0 in particular has a couple of
  bugs
that could destroy HBase clusters should users decide to turn on HDFS
encryption[4]. Our installation instructions tell folks to replace
  these
jars with the version of Hadoop they are actually running, but not
 all
users follow those instructions so we want to minimize the pain for
  them.
   
Regular maintenance releases are key to keeping operational burdens
 low
   for
our downstream users; we don't want them to be forced to choose
 between
living with broken systems and stomaching the risk of upgrades across
minor/major version numbers. Looking back over the three
 aforementioned
Hadoop versions, 2.6 hasn't had a patch release since 2.6.0 came out
 in
   Nov
2014, when 2.5 had its last patch release as well. Hadoop 2.4 looks
 to
   be a
year without a release[5]. On our discussion of shipping Hadoop 2.6
binaries, one of your PMC members mentioned that with continued work
 on
   the
2.7 line y'all weren't planning any additional releases of the
 earlier
minor versions[6].
   
The HBase community requests that Hadoop pick up making bug-fix-only
   patch
releases again on a regular schedule[7]. Preferably on the 2.6 line
 and
preferably monthly. We realize that given the time gap since 2.6.0 it
   will
likely take a bit to get 2.6.1 together, but after that it should
 take
   much
less effort to continue.
   
[1]: http://hbase.apache.org/book.html#hadoop
[2]: http://s.apache.org/ReP
[3]: HBASE-13339
[4]: HADOOP-11674 and HADOOP-11710
[5]: http://hadoop.apache.org/releases.html
[6]: http://s.apache.org/MTY
[7]: http://s.apache.org/ViP
   
--
Sean
  
 



 --
 Karthik Kambatla
 Software Engineer, Cloudera Inc.
 
 http://five.sentenc.es




-- 
Sean


Re: [DISCUSS] Additional maintenance releases for Hadoop 2.y versions

2015-07-15 Thread Sean Busbey
Why not just include all backwards compatible bug fixes?

Alternatively, why not appoint a Release Manager for the minor release line
and then allow them to arbitrate when there's disagreement about inclusion?
This has worked well in the HBase community.

On Wed, Jul 15, 2015 at 3:49 PM, Karthik Kambatla ka...@cloudera.com
wrote:

 As I proposed in the other thread, how about we adopting the following
 model:

 x.y.1 releases have all Blocker, Critical, Major bug fixes applied to the
 next minor release.
 x.y.2 releases have all Blocker, Critical bug fixes applied to the next
 minor release.
 x.y.3 releases have all Blocker bug fixes applied to next minor release.

 Here I am assuming there are no security-fix-only or other urgent releases.

 We could apply this approach for 2.7.x onwards, and do an adhoc 2.6
 release.

 On Wed, Jul 15, 2015 at 12:59 PM, Vinod Kumar Vavilapalli 
 vino...@hortonworks.com wrote:

  Yeah, I started a thread while back on this one (
  http://markmail.org/message/sbykjn5xgnksh6wg) and had many offline
  discussions re 2.6.1.
 
  The biggest problem I found offline was about what bug-fixes are
  acceptable and what aren’t for everyone wishing to consume 2.6.1. Given
 the
  number of bug-fixes that went into 2.7.x and into branch-2.8, figuring
 out
  a set of patches that is acceptable for everyone is a huge challenge
 which
  kind of stalled my attempts.
 
  Thanks
  +Vinod
 
 
   On Jul 15, 2015, at 8:57 AM, Sangjin Lee sjl...@gmail.com wrote:
  
   Strong +1 for having a 2.6.1 release. I understand Vinod has been
 trying
  to
   get that effort going but it's been stalled a little bit. It would be
  good
   to rekindle that effort.
  
   Companies with big hadoop 2.x deployments (including mine) have always
   tried to stabilize a 2.x release by testing/collecting/researching
  critical
   issues on the release. Each would come up with its own set of fixes to
   backport. We would also communicate it via offline channels. During the
   hadoop summit, we thought it would be great if we all came together and
   create a public stability/bugfix release on top of 2.x (2.6.1 for 2.6
 for
   example) with all the critical issues fixed.
  
   Thanks,
   Sangjin
  
  
   On Tue, Jul 14, 2015 at 10:42 PM, Tsuyoshi Ozawa oz...@apache.org
  wrote:
  
   Thank you for the notification. Trying to back port bug fixes.
  
   - Tsuyoshi
  
   On Wed, Jul 15, 2015 at 3:45 AM, Sean Busbey bus...@cloudera.com
  wrote:
   Hi Hadoopers!
  
   Over in HBase we've been discussing the impact of our dependencies on
  our
   downstream users. As our most fundamental dependency, Hadoop plays a
  big
   role in the operational cost of running an HBase instance.
  
   Currently the HBase 1.y release line supports Hadoop 2.4, 2.5, and
   2.6[1].
   We don't drop Hadoop minor release lines in minor releases so we are
    unlikely to remove anything from this set until HBase 2.0, probably at
 the
   end
   of 2015 / start of 2016 (and currently we plan to continue supporting
  at
   least 2.4 for HBase 2.0 [2]). Lately we've been discussing updating
 our
   shipped binaries to Hadoop 2.6, following some stability testing by
  part
   of
   our community[3]. Unfortunately, 2.6.0 in particular has a couple of
  bugs
   that could destroy HBase clusters should users decide to turn on HDFS
   encryption[4]. Our installation instructions tell folks to replace
  these
   jars with the version of Hadoop they are actually running, but not
 all
   users follow those instructions so we want to minimize the pain for
  them.
  
   Regular maintenance releases are key to keeping operational burdens
 low
   for
   our downstream users; we don't want them to be forced to choose
 between
   living with broken systems and stomaching the risk of upgrades across
   minor/major version numbers. Looking back over the three
 aforementioned
   Hadoop versions, 2.6 hasn't had a patch release since 2.6.0 came out
 in
   Nov
   2014, when 2.5 had its last patch release as well. Hadoop 2.4 looks
 to
   be a
   year without a release[5]. On our discussion of shipping Hadoop 2.6
   binaries, one of your PMC members mentioned that with continued work
 on
   the
   2.7 line y'all weren't planning any additional releases of the
 earlier
   minor versions[6].
  
   The HBase community requests that Hadoop pick up making bug-fix-only
   patch
   releases again on a regular schedule[7]. Preferably on the 2.6 line
 and
   preferably monthly. We realize that given the time gap since 2.6.0 it
   will
    likely take a bit to get 2.6.1 together, but after that it should
 take
   much
   less effort to continue.
  
   [1]: http://hbase.apache.org/book.html#hadoop
   [2]: http://s.apache.org/ReP
   [3]: HBASE-13339
   [4]: HADOOP-11674 and HADOOP-11710
   [5]: http://hadoop.apache.org/releases.html
   [6]: http://s.apache.org/MTY
   [7]: http://s.apache.org/ViP
  
   --
   Sean
  
 
 


 --
 Karthik Kambatla
 Software Engineer, Cloudera Inc

Re: [Test-Patch TLP] consensus on naming

2015-07-12 Thread Sean Busbey
sure. what did you have in mind?

Last time it was discussed we were going to wait to overhaul the repo until
we have a new repo to move to.

On Sat, Jul 11, 2015 at 7:40 AM, Steve Loughran ste...@hortonworks.com
wrote:

 +1,

 could you structure the source tree/build so that adding new modules is
 easy?


  On 10 Jul 2015, at 06:08, Kengo Seki sek...@gmail.com wrote:
 
  +1 for Yetus. Simple and distinctive.
 
  On Friday, July 10, 2015, Kengo Seki sek...@gmail.com wrote:
 
 
 
  On Wednesday, July 8, 2015, Tsuyoshi Ozawa oz...@apache.org
  javascript:_e(%7B%7D,'cvml','oz...@apache.org'); wrote:
 
  Hi Sean,
 
  +1 for Yetus since it sounds good name to me.
 
  Thanks
  - Tsuyoshi
 
  On Wed, Jul 8, 2015 at 2:42 PM, Sean Busbey bus...@cloudera.com
 wrote:
  Hi folks!
 
  It's almost time for the July board meeting, so we need to get the
 ball
  rolling on the proposal for a new TLP focused on QA[1].
 
   One issue outstanding from the original discussion is consensus on a
  name.
  We need to get some consensus together so that I can start verifying
  that
  the name is usable for an ASF project via a podling name search jira
  [2].
 
  A brief review, so far the only stated naming preferences are Allen
 W's
  desire for something shell related and my desire for a project name
  beginning with the letter Y.
 
  We have a proposed name of Yetus; it's an archaic genus of sea snail
 and
  from my initial review should be usable.
 
  Any other strong feelings on naming? Any specific objections to Yetus?
 
  [1]: http://s.apache.org/yetus-discuss-hadoop
  [2]: https://issues.apache.org/jira/browse/PODLINGNAMESEARCH/
 
  --
  Sean
 
 




-- 
Sean


[DISCUSS] Additional maintenance releases for Hadoop 2.y versions

2015-07-14 Thread Sean Busbey
Hi Hadoopers!

Over in HBase we've been discussing the impact of our dependencies on our
downstream users. As our most fundamental dependency, Hadoop plays a big
role in the operational cost of running an HBase instance.

Currently the HBase 1.y release line supports Hadoop 2.4, 2.5, and 2.6[1].
We don't drop Hadoop minor release lines in minor releases so we are
unlikely to remove anything from this set until HBase 2.0, probably at the end
of 2015 / start of 2016 (and currently we plan to continue supporting at
least 2.4 for HBase 2.0 [2]). Lately we've been discussing updating our
shipped binaries to Hadoop 2.6, following some stability testing by part of
our community[3]. Unfortunately, 2.6.0 in particular has a couple of bugs
that could destroy HBase clusters should users decide to turn on HDFS
encryption[4]. Our installation instructions tell folks to replace these
jars with the version of Hadoop they are actually running, but not all
users follow those instructions so we want to minimize the pain for them.

Regular maintenance releases are key to keeping operational burdens low for
our downstream users; we don't want them to be forced to choose between
living with broken systems and stomaching the risk of upgrades across
minor/major version numbers. Looking back over the three aforementioned
Hadoop versions, 2.6 hasn't had a patch release since 2.6.0 came out in Nov
2014, when 2.5 had its last patch release as well. Hadoop 2.4 looks to be a
year without a release[5]. On our discussion of shipping Hadoop 2.6
binaries, one of your PMC members mentioned that with continued work on the
2.7 line y'all weren't planning any additional releases of the earlier
minor versions[6].

The HBase community requests that Hadoop pick up making bug-fix-only patch
releases again on a regular schedule[7]. Preferably on the 2.6 line and
preferably monthly. We realize that given the time gap since 2.6.0 it will
likely take a bit to get 2.6.1 together, but after that it should take much
less effort to continue.

[1]: http://hbase.apache.org/book.html#hadoop
[2]: http://s.apache.org/ReP
[3]: HBASE-13339
[4]: HADOOP-11674 and HADOOP-11710
[5]: http://hadoop.apache.org/releases.html
[6]: http://s.apache.org/MTY
[7]: http://s.apache.org/ViP

-- 
Sean


Re: Github integration for Hadoop

2015-10-29 Thread Sean Busbey
It looks like there's pretty good consensus. Why do we need a VOTE thread?

Perhaps better for someone to submit a patch with proposed text for
the contribution guide[1]?

-Sean

[1]: http://wiki.apache.org/hadoop/HowToContribute

On Thu, Oct 29, 2015 at 2:01 PM, Xiaoyu Yao  wrote:
> +1, should we start a vote on this?
>
>
>
>
> On 10/29/15, 11:54 AM, "Ashish"  wrote:
>
>>+1
>>
>>On Thu, Oct 29, 2015 at 11:51 AM, Mingliang Liu  wrote:
>>> +1 (non-binding)
>>>
>>> Mingliang Liu
>>> Member of Technical Staff - HDFS,
>>> Hortonworks Inc.
>>> m...@hortonworks.com
>>>
>>>
>>>
 On Oct 29, 2015, at 10:55 AM, Hitesh Shah  wrote:

 +1 on supporting patch contributions through github pull requests.

 — Hitesh

 On Oct 29, 2015, at 10:47 AM, Owen O'Malley  wrote:

> All,
>  For code & patch review, many of the newer projects are using the Github
> pull request integration. You can read about it here:
>
> https://blogs.apache.org/infra/entry/improved_integration_between_apache_and
>
> It basically lets you:
> * have mirroring between comments on pull requests and jira
> * lets you close pull requests
> * have mirroring between pull request comments and the Apache mail lists
>
> Thoughts?
> .. Owen


>>>
>>
>>
>>
>>--
>>thanks
>>ashish
>>
>>Blog: http://www.ashishpaliwal.com/blog
>>My Photo Galleries: http://www.pbase.com/ashishpaliwal
>>



-- 
Sean


Re: Github integration for Hadoop

2015-11-13 Thread Sean Busbey
Hi Colin!

If Yetus is working on an issue and can't tell what the intended branch is,
it points folks to project-specific contribution guides.

For Hadoop, the patch naming for specific branches should be covered in
this section of Hadoop's contribution guide:

http://wiki.apache.org/hadoop/HowToContribute#Naming_your_patch

Yetus will actually support a little bit more than that guide suggests. If
a project doesn't define a URL to point people at for help in naming
patches we default to this guide:

https://yetus.apache.org/documentation/latest/precommit-patchnames/
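
The convention described in those guides boils down to encoding the issue key
and (optionally) a target branch into the patch filename. As a rough,
hedged illustration only (the regex below is a guess for exposition, not
Yetus's actual parsing logic, and the accepted forms are defined by the
linked guides):

```python
import re

# Hypothetical pattern for names like HADOOP-12345-branch-2.7.001.patch.
# This is NOT Yetus's real implementation, just a sketch of the idea.
PATCH_NAME = re.compile(
    r"^(?P<issue>[A-Z]+-\d+)"               # JIRA issue key, e.g. HADOOP-12345
    r"(?:[-.](?P<branch>branch-[\w.]+?))?"  # optional target branch
    r"(?:\.(?P<rev>\d{3}))?"                # optional 3-digit revision
    r"\.patch$"
)

def target_branch(filename, default="trunk"):
    """Return the branch a patch filename appears to target, else the default.

    Returns None if the name doesn't look like a patch at all.
    """
    m = PATCH_NAME.match(filename)
    if not m:
        return None
    return m.group("branch") or default

print(target_branch("HADOOP-12345-branch-2.7.001.patch"))  # branch-2.7
print(target_branch("HADOOP-12345.001.patch"))             # trunk
```

The point of the convention is that precommit automation can pick the right
base branch without a human in the loop.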



On Fri, Nov 13, 2015 at 8:05 PM, Colin P. McCabe  wrote:

> Thanks, Allen, I wasn't aware that Yetus now supported testing for
> other branches.  Is there documentation about how to name the branch
> so it gets tested?
>
> best,
> Colin
>
> On Fri, Nov 13, 2015 at 7:52 AM, Allen Wittenauer 
> wrote:
> >
> >> On Nov 12, 2015, at 10:55 AM, Colin P. McCabe 
> wrote:
> >>
> >> gerrit has a button on the UI to cherry-pick to different branches.
> >> The button creates separate "gerrit changes" which you can then
> >> commit.  Eventually we could hook those up to Jenkins-- something
> >> which we've never been able to do for different branches with the
> >> patch-file-based workflow.
> >
> >
> > If you’re saying what I think you’re saying, people have been
> able to submit patches via JIRA patch file attachment to major branches for
> a few months now. Yetus closes the loop and supports pretty much any branch
> or git hash.  (Github PRs also go to their respective branch or git hash as
> well.)
>



-- 
Sean


Re: Github integration for Hadoop

2015-11-03 Thread Sean Busbey
On Sun, Nov 1, 2015 at 12:52 PM, Allen Wittenauer  wrote:
>
>> On Nov 1, 2015, at 6:05 AM, Tsuyoshi Ozawa  wrote:
>>
>> Thank you for starting this discussion. It's good for us to rethink
>> our workflow to grow community.
>> However, at the moment, my concern is that we can put more pressure on
>> Yetus community if we move the main workflows into github.
>>
>> Allen and Sean, what do you think? Is it good timing for Yetus community?
>
>
> There’s nothing for Yetus to do here; support for github PRs has been 
> there since August. I regularly use it to pull down my own PRs against my own 
> hadoop branch to test with.
>
> The only work is for someone to modify the Jenkins job that runs 
> test-patch to either support both JIRA and GH or to create another job.  
> That’s on the Hadoop community to do.

Just one additional Yetus point here: no new jenkins job is needed if
folks leave a comment with a link to their PR and set the jira status
to "patch available".

-- 
Sean


Re: Github integration for Hadoop

2015-10-30 Thread Sean Busbey
On Fri, Oct 30, 2015 at 1:22 PM, Allen Wittenauer  wrote:
>
>> * Have we tried our precommit on PRs yet? Does it work for multiple
>> branches? Is there a way to enforce rebase+squash vs. merge on the PR,
>> since, per Allen, Yetus requires one commit to work?
>
>
> I don’t know about the Jenkins-side of things (e.g., how does Jenkins 
> trigger a build?).  As far as Yetus is concerned, here’s the functionality 
> that has been written:
>
> * Pull patches from Github by PR #
> * Comment on patches in Github, given credentials
> * Comment on specific lines in Github, given credentials
> * Test patches against the branch/repo that the pull request is 
> against
> * GH<->JIRA intelligence such that if a PR mentions an issue as the 
> leading text in the subject line or an issue mentions a PR in the comments, 
> pull from github and put a comment in both places (again, given credentials)


Jenkins builds are all driven off of the "Precommit Admin" job that
does a JQL search for enabled projects with open patches:

https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-Admin/

I believe with the current integration, that means we'll find and test
any github PRs that are mentioned in a jira ticket that is in
PATCH_AVAILABLE status.
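
As a hedged sketch of what such a JQL search might look like (the exact
query the Precommit-Admin job runs isn't shown in this thread; the form
below is an illustrative guess using the "Patch Available" status mentioned
above):

```python
def precommit_jql(projects):
    """Build a JQL search for issues awaiting precommit testing.

    Illustrative only: the real Precommit-Admin job's query may differ.
    Sorting the project list keeps the query string deterministic.
    """
    names = ", ".join(sorted(projects))
    return 'project in ({}) AND status = "Patch Available"'.format(names)

print(precommit_jql(["HADOOP", "HDFS", "YARN"]))
# project in (HADOOP, HDFS, YARN) AND status = "Patch Available"
```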

At some point we'll need an exemplar jenkins trigger job that just
activates on open PRs, at least for ASF projects. But no such job
exists now.


-- 
Sean


Re: IMPORTANT: automatic changelog creation

2015-07-08 Thread Sean Busbey
On Jul 8, 2015 2:13 AM, Tsuyoshi Ozawa oz...@apache.org wrote:

 +1, thanks Allen and Andrew for taking lots of effort!

  Is there any possibility that we can restrict someone from editing the
  issue in jira once it's marked as closed after release?

 Vinay's comment seems worth considering to me. What do you think?


Mistakes happen, even during the release process.

Presuming the set of folks who can edit closed tickets is already
restricted to contributors, why not assume any edits are the community
making things more accurate?

-- 
Sean


Re: [DISCUSS] project for pre-commit patch testing (was Re: upstream jenkins build broken?)

2015-07-11 Thread Sean Busbey
As mentioned on HADOOP-12111, there is now an incubator-style proposal:
http://wiki.apache.org/incubator/YetusProposal

On Wed, Jun 24, 2015 at 9:41 AM, Sean Busbey bus...@cloudera.com wrote:

 Hi Folks!

 Work in a feature branch is now being tracked by HADOOP-12111.

 On Thu, Jun 18, 2015 at 10:07 PM, Sean Busbey bus...@cloudera.com wrote:

 It looks like we have consensus.

 I'll start drafting up a proposal for the next board meeting (July 15th).
 Once we work out the name I'll submit a PODLINGNAMESEARCH jira to track
 that we did due diligence on whatever we pick.

 In the mean time, Hadoop PMC would y'all be willing to host us in a
 branch so that we can start prepping things now? We would want branch
 commit rights for the proposed new PMC.


 -Sean


 On Mon, Jun 15, 2015 at 6:47 PM, Sean Busbey bus...@cloudera.com wrote:

 Oof. I had meant to push on this again but life got in the way and now
 the June board meeting is upon us. Sorry everyone. In the event that this
 ends up contentious, hopefully one of the copied communities can give us a
 branch to work in.

 I know everyone is busy, so here's the short version of this email: I'd
 like to move some of the code currently in Hadoop (test-patch) into a new
 TLP focused on QA tooling. I'm not sure what the best format for priming
 this conversation is. ORC filled in the incubator project proposal
 template, but I'm not sure how much that confused the issue. So to start,
 I'll just write what I'm hoping we can accomplish in general terms here.

 All software development projects that are community based (that is,
 accepting outside contributions) face a common QA problem for vetting
 in-coming contributions. Hadoop is fortunate enough to be sufficiently
 popular that the weight of the problem drove tool development (i.e.
 test-patch). That tool is generalizable enough that a bunch of other TLPs
 have adopted their own forks. Unfortunately, in most projects this kind of
 QA work is an enabler rather than a primary concern, so often the tooling
 is worked on ad hoc and little shared improvement happens across projects. Since
 the tooling itself is never a primary concern, any improvement made is rarely reused
 outside of ASF projects.

 Over the last couple months a few of us have been working on
 generalizing the tooling present in the Hadoop code base (because it was
 the most mature out of all those in the various projects) and it's reached
 a point where we think we can start bringing on other downstream users.
 This means we need to start establishing things like a release cadence and
 to grow the new contributors we have to handle more project responsibility.
 Personally, I think that means it's time to move out from under Hadoop to
 drive things as our own community. Eventually, I hope the community can
 help draw in a group of folks traditionally underrepresented in ASF
 projects, namely QA and operations folks.

 I think test-patch by itself has enough scope to justify a project.
 Having a solid set of build tools that are customizable to fit the norms of
 different software communities is a bunch of work. Making it work well in
 both the context of automated test systems like Jenkins and for individual
 developers is even more work. We could easily also take over maintenance of
 things like shelldocs, since test-patch is the primary consumer of that
 currently but it's generally useful tooling.

 In addition to test-patch, I think the proposed project has some future
 growth potential. Given some adoption of test-patch to prove utility, the
 project could build on the ties it makes to start building tools to help
 projects do their own longer-run testing. Note that I'm talking about the
 tools to build QA processes and not a particular set of tested components.
 Specifically, I think the ChaosMonkey work that's in HBase should be
 generalizable as a fault injection framework (either based on that code or
 something like it). Doing this for arbitrary software is obviously very
 difficult, and a part of easing that will be to make (and then favor)
 tooling to allow projects to have operational glue that looks the same.
 Namely, the shell work that's been done in hadoop-functions.sh would be a
 great foundational layer that could bring good daemon handling practices to
 a whole slew of software projects. In the event that these frameworks and
 tools get adopted by parts of the Hadoop ecosystem, that could make the job
 of i.e. Bigtop substantially easier.

 I've reached out to a few folks who have been involved in the current
 test-patch work or expressed interest in helping out on getting it used in
 other projects. Right now, the proposed PMC would be (alphabetical by last
 name):

 * Andrew Bayer (ASF member, incubator pmc, bigtop pmc, flume pmc,
 jclouds pmc, sqoop pmc, all around Jenkins expert)
 * Sean Busbey (ASF member, accumulo pmc, hbase pmc)
 * Nick Dimiduk (hbase pmc, phoenix pmc)
 * Chris Nauroth (ASF member, incubator pmc, hadoop pmc

[Test-Patch TLP] consensus on naming

2015-07-07 Thread Sean Busbey
Hi folks!

It's almost time for the July board meeting, so we need to get the ball
rolling on the proposal for a new TLP focused on QA[1].

Once issue outstanding from the original discussion is consensus on a name.
We need to get some consensus together so that I can start verifying that
the name is usable for an ASF project via a podling name search jira [2].

A brief review, so far the only stated naming preferences are Allen W's
desire for something shell related and my desire for a project name
beginning with the letter Y.

We have a proposed name of Yetus; it's an archaic genus of sea snail and
from my initial review should be usable.

Any other strong feelings on naming? Any specific objections to Yetus?

[1]: http://s.apache.org/yetus-discuss-hadoop
[2]: https://issues.apache.org/jira/browse/PODLINGNAMESEARCH/

-- 
Sean


Re: [YETUS] let's start getting some releases out the door

2015-08-28 Thread Sean Busbey
Just to provide an alternative (maybe one that needs a new thread): is
there a reason that a version-info plugin in Yetus is a better idea than
trying to get it into the Maven project itself?

On Fri, Aug 28, 2015 at 10:09 AM, Sean Busbey bus...@cloudera.com wrote:

 How about the Version-Info maven plugin?


 https://github.com/apache/hadoop/blob/trunk/hadoop-maven-plugins//src/main/java/org/apache/hadoop/maven/plugin/versioninfo/VersionInfoMojo.java

 It seems generally useful outside of Hadoop and it helps projects be
 specific when referencing their versions in other places (like if we add a
 maven plugin that runs releasedocmaker, or when building their api docs).

 On Thu, Aug 27, 2015 at 3:53 PM, Allen Wittenauer a...@altiscale.com
 wrote:


 On Aug 27, 2015, at 8:21 AM, Sean Busbey bus...@cloudera.com wrote:
 
  Allen, you've been chugging away at the build support changes in
  test-patch[4]. That looks like it's going to change a bunch of stuff.
  Should we wait for it to land to have an initial release?

 I think so, yes.  It’s getting very very close to finished.
 Adding gradle & scala support requires a lot of deep changes with some
 major API changes to go with them.   I’d rather get those in sooner rather
 than later, since those changes open the door to add even more build tools
 and compiled language support in a much easier fashion.

  Thoughts? Concerns? Other things folks want to make sure get done? Maybe
  the build question for test-patch[5]? ;)

 HADOOP-12298 is definitely a blocker.  Otherwise builds break and
 the output is weird with mvn site.




 --
 Sean




-- 
Sean


Re: [YETUS] let's start getting some releases out the door

2015-08-28 Thread Sean Busbey
How about the Version-Info maven plugin?

https://github.com/apache/hadoop/blob/trunk/hadoop-maven-plugins//src/main/java/org/apache/hadoop/maven/plugin/versioninfo/VersionInfoMojo.java

It seems generally useful outside of Hadoop and it helps projects be
specific when referencing their versions in other places (like if we add a
maven plugin that runs releasedocmaker, or when building their api docs).

On Thu, Aug 27, 2015 at 3:53 PM, Allen Wittenauer a...@altiscale.com wrote:


 On Aug 27, 2015, at 8:21 AM, Sean Busbey bus...@cloudera.com wrote:
 
  Allen, you've been chugging away at the build support changes in
  test-patch[4]. That looks like it's going to change a bunch of stuff.
  Should we wait for it to land to have an initial release?

 I think so, yes.  It’s getting very very close to finished. Adding
 gradle & scala support requires a lot of deep changes with some major API
 changes to go with them.   I’d rather get those in sooner rather than
 later, since those changes open the door to add even more build tools and
 compiled language support in a much easier fashion.

  Thoughts? Concerns? Other things folks want to make sure get done? Maybe
  the build question for test-patch[5]? ;)

 HADOOP-12298 is definitely a blocker.  Otherwise builds break and
 the output is weird with mvn site.




-- 
Sean


[YETUS] Yetus TLP approved

2015-09-17 Thread Sean Busbey
Hi Folks!

At yesterday's ASF board meeting the Apache Yetus TLP was approved. There's
still some ASF Infra work to get done[1] before we can start transitioning
our mailing list, jira, and code over.

Thanks to all the folks in Hadoop who've helped us along this process. I
look forward to our communities maintaining a healthy working relationship
in the future.

[1]: https://issues.apache.org/jira/browse/INFRA-10447

-- 
Sean


Re: [YETUS] Yetus TLP approved

2015-09-18 Thread Sean Busbey
The Apache Yetus dev list is now active:

http://mail-archives.apache.org/mod_mbox/yetus-dev/

The first post there has a pointer to the rest of the project
resources and the status of getting things set up.

On Thu, Sep 17, 2015 at 10:59 AM, Sean Busbey <bus...@cloudera.com> wrote:
> Hi Folks!
>
> At yesterday's ASF board meeting the Apache Yetus TLP was approved. There's
> still some ASF Infra work to get done[1] before we can start transitioning
> our mailing list, jira, and code over.
>
> Thanks to all the folks in Hadoop who've helped us along this process. I
> look forward to our communities maintaining a healthy working relationship
> in the future.
>
> [1]: https://issues.apache.org/jira/browse/INFRA-10447
>
> --
> Sean



-- 
Sean


Re: continuing releases on Apache Hadoop 2.6.x

2015-11-20 Thread Sean Busbey
Early december would be great, presuming the RC process doesn't take too
long. By then it will already have been over a month since the 2.6.2 release and
I'm sure the folks contributing the 18 patches we already have in would
like to see their work out there.

On Fri, Nov 20, 2015 at 7:51 AM, Junping Du  wrote:

> +1. Early Dec sounds too early for a 2.6.3 release given we only have 18
> patches since the recently released 2.6.2.
> We should nominate more fixes and wait a while for the feedback on 2.6.2.
>
> Thanks,
>
> Junping
> 
> From: Vinod Vavilapalli 
> Sent: Thursday, November 19, 2015 11:34 PM
> To: yarn-...@hadoop.apache.org
> Cc: common-dev@hadoop.apache.org; hdfs-...@hadoop.apache.org;
> mapreduce-...@hadoop.apache.org
> Subject: Re: continuing releases on Apache Hadoop 2.6.x
>
> I see 18 JIRAs across the sub-projects as of now in 2.6.3. Seems like we
> will have a reasonable number of fixes if we start an RC early december.
>
> In the mean while, we should also review 2.7.3 and 2.8.0 blocker /
> critical list and see if it makes sense to backport any of those into 2.6.3.
>
> +Vinod
>
>
> On Nov 17, 2015, at 5:10 PM, Sangjin Lee wrote:
>
> I'd like to pick up this email discussion again. It is time that we started
> thinking about the next release in the 2.6.x line. IMO we want to walk the
> balance between maintaining a reasonable release cadence and getting a good
> amount of high-quality fixes. The timeframe is a little tricky as the
> holidays are approaching. If we have enough fixes accumulated in
> branch-2.6, some time early December might be a good target for cutting the
> first release candidate. Once we miss that window, I think we are looking
> at next January. I'd like to hear your thoughts on this.
>
>


-- 
Sean


Re: hadoop-build-tools/src/main/resources/META-INF/

2016-06-20 Thread Sean Busbey
file a jira please and I'll take a look.

On Fri, Jun 17, 2016 at 4:10 PM, Xiao Chen <x...@cloudera.com> wrote:
> Thanks Steve for reporting the issue and Sean for the suggestion. This is
> indeed from HADOOP-12893 (blush).
>
> I'm no maven expert so appreciate any recommendations.
>
> The reason for the current way is that, for the L (LICENSE file) to be packed into a jar,
> it seems that maven remote resource plugin (which named itself to be the
> typical Apache licensing way) requires the files to be under
> src/main/resources. This was mentioned in their example, and I wasn't able
> to trick it to pack things not in there. I wish there were more examples to
> help in our case.
>
> So, in HADOOP-12893 I put a step to copy the L into
> hadoop-build-tools/src/main/resources dir, to allow it to get packed into the
> jar. I thought about symlink but don't think it's a good way for Windows
> builds.
>
> It's not committed because we don't want an extra copy of L; we could list
> it in .gitignore.
>
>
> P.S. Tried a bit with Sean's suggestion of making it under
> target/generated-sources, but couldn't get the plugin to include it. I'm
> happy to try out more elegant solutions if you have any suggestions.
>
> Thanks!
>
>
> -Xiao
>
> On Fri, Jun 17, 2016 at 7:34 AM, Sean Busbey <bus...@cloudera.com> wrote:
>>
>> If it's generated and we're following The Maven Way, it should be in
>> target. probably in target/generated-sources
>>
>> On Fri, Jun 17, 2016 at 9:33 AM, Steve Loughran <ste...@hortonworks.com>
>> wrote:
>> >
>> > I see (presumably from the licensing work), that I'm now getting
>> > hadoop-build-tools/src/main/resources/META-INF/ as an untracked directory.
>> >
>> > If this is generated, should it be in the source tree? And if so, should
>> > it be committed, or listed in .gitignore?
>> >
>> > -
>> > To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>> > For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>> >
>>
>>
>>
>> --
>> busbey
>>
>>
>
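For reference, the resource-bundle setup Xiao describes is driven by configuration along these lines. This is only a sketch: the plugin coordinates are real, but the include paths are my assumptions about the approach, not Hadoop's actual hadoop-build-tools POM.

```xml
<!-- Sketch: publish files under src/main/resources as a remote-resources
     bundle so downstream modules can pull them into their jars.
     Parameter names follow the plugin's "bundle" goal docs; verify
     against the plugin version in use. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-remote-resources-plugin</artifactId>
  <executions>
    <execution>
      <goals>
        <goal>bundle</goal>
      </goals>
      <configuration>
        <!-- The default resources directory is
             ${basedir}/src/main/resources, which is why the L & N files
             must be copied there before packaging. -->
        <includes>
          <include>META-INF/LICENSE.txt</include>
          <include>META-INF/NOTICE.txt</include>
        </includes>
      </configuration>
    </execution>
  </executions>
</plugin>
```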



-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: hadoop-build-tools/src/main/resources/META-INF/

2016-06-17 Thread Sean Busbey
If it's generated and we're following The Maven Way, it should be in
target. probably in target/generated-sources

On Fri, Jun 17, 2016 at 9:33 AM, Steve Loughran  wrote:
>
> I see (presumably from the licensing work), that I'm now getting  
> hadoop-build-tools/src/main/resources/META-INF/ as an untracked directory.
>
> If this is generated, should it be in the source tree? And if so, should it 
> be committed, or listed in .gitignore?
>
>



-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[DISCUSS] 2.7.3 release?

2016-03-29 Thread Sean Busbey
It's been about 2 months since 2.7.2 came out, and there are currently
135 issues in resolved status with a fixVersion of 2.7.3:

  * 32 Common
  * 52 HDFS
  * 33 YARN
  * 18 MAPREDUCE

Could we move towards a release soon?

-- 
busbey


Re: 2.7.3 release plan

2016-04-07 Thread Sean Busbey
On Wed, Apr 6, 2016 at 6:26 PM, Colin McCabe  wrote:
> In general, the only bundled native component I can see is lz4.  I guess
> debatably we should add tree.h to the NOTICE file as well, since it came
> from BSD and is licensed under that license.
>
> Please keep in mind bundling means "included in the source tree", NOT
> "downloaded during the build process."  "mvn package" dumps a ton of
> jars in the build directory, but these dependencies aren't considered
> bundled since their source does not appear in our git repo.  Similarly,
> linking against a library doesn't make it "bundled", nor does dlopening
> symbols in that library.
>
> The big omission is that we have a lot of Javascript source files in our
> source tree that do not appear in LICENSE or NOTICE.  I agree that we
> should address those before making a new release.
>
> best,
>
>

Each artifact that the PMC publishes must abide by the ASF licensing
policy. That includes

* Source release artifact
* any convenience binary artifacts placed on dist.apache
* any convenience jars put into the ASF Nexus repository

That likely means that we bundle much more than just what's in the source tree.

(Though this sounds like we're getting off topic for the 2.7.3 release plan.)

-- 
busbey


Re: Branch policy question

2016-03-22 Thread Sean Busbey
VOTE threads tend to get more eyes than random JIRAs.

On Tue, Mar 22, 2016 at 1:23 PM, Andrew Wang 
wrote:

> A branch sounds fine, but how are we going to get 3 +1's to merge it? If
> it's hard to find one reviewer, seems even harder to find two.
>
> On Tue, Mar 22, 2016 at 10:56 AM, Allen Wittenauer <
> allenwittena...@yahoo.com.invalid> wrote:
>
> >
> > > On Mar 22, 2016, at 10:49 AM, larry mccay 
> wrote:
> > >
> > > That sounds like a reasonable approach and valid use of branches to me.
> > >
> > > Perhaps a set of functional tests could be provided/identified that
> would
> > > help the review process by showing backward compatibility along with
> new
> > > extensions for things like dynamic commands?
> > >
> >
> > This is going into trunk, so no need for backward compatibility.
> >
>



-- 
busbey


Re: [DISCUSS] 2.7.3 release?

2016-03-29 Thread Sean Busbey
Could we train up some additional volunteers to act as release
managers? We shouldn't put all the weight of running maintenance
releases on just you Vinod.

On Tue, Mar 29, 2016 at 12:33 PM, Vinod Kumar Vavilapalli
<vino...@apache.org> wrote:
> Sorry, I got busy offline on various things for a couple of months - so 2.7.3 
> suffered.
>
> 2.8.0 also suffered, but because javadocs etc are completely broken on 
> branch-2, and I had spent quite a bit of time narrowing down all the 
> backports needed, sigh.
>
> Will spend some time this week and the next.
>
> +Vinod
>
>> On Mar 29, 2016, at 9:18 AM, Sean Busbey <bus...@cloudera.com> wrote:
>>
>> It's been about 2 months since 2.7.2 came out, and there are currently
>> 135 issues in resolved status with a fixVersion of 2.7.3:
>>
>>  * 32 Common
>>  * 52 HDFS
>>  * 33 YARN
>>  * 18 MAPREDUCE
>>
>> Could we move towards a release soon?
>>
>> --
>> busbey
>>
>



-- 
busbey


Re: 2.7.3 release plan

2016-03-31 Thread Sean Busbey
As of 2 days ago, there were already 135 jiras associated with 2.7.3,
if *any* of them end up introducing a regression the inclusion of
HDFS-8791 means that folks will have cluster downtime in order to back
things out. If that happens to any substantial number of downstream
folks, or any particularly vocal downstream folks, then it is very
likely we'll lose the remaining trust of operators for rolling out
maintenance releases. That's a pretty steep cost.

Please do not include HDFS-8791 in any 2.6.z release. Folks having to
be aware that an upgrade from e.g. 2.6.5 to 2.7.2 will fail is an
unreasonable burden.

I agree that this fix is important, I just think we should either cut
a version of 2.8 that includes it or find a way to do it that gives an
operational path for rolling downgrade.

On Thu, Mar 31, 2016 at 10:10 AM, Junping Du <j...@hortonworks.com> wrote:
> Thanks for bringing up this topic, Sean.
> When I released our latest Hadoop release 2.6.4, the patch for HDFS-8791
> hadn't been committed yet, which is why we didn't discuss this earlier.
> I remember that in the JIRA discussion we treated this layout change as a Blocker
> bug fixing a significant performance regression, not a normal
> performance improvement. And I believe the HDFS community already did their best,
> with care and patience, to deliver the fix and other related patches (like the
> upgrade fix in HDFS-8578). Taking HDFS-8578 as an example, you can see 30+
> rounds of patch review back and forth by senior committers, not to mention the
> outstanding performance test data in HDFS-8791.
> I would trust our HDFS committers' judgement to land HDFS-8791 on 2.7.3. 
> However, that needs Vinod's final confirmation who serves as RM for 
> branch-2.7. In addition, I didn't see any blocker issue to bring it into 
> 2.6.5 now.
> Just my 2 cents.
>
> Thanks,
>
> Junping
>
> 
> From: Sean Busbey <bus...@cloudera.com>
> Sent: Thursday, March 31, 2016 2:57 PM
> To: hdfs-...@hadoop.apache.org
> Cc: Hadoop Common; yarn-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org
> Subject: Re: 2.7.3 release plan
>
> A layout change in a maintenance release sounds very risky. I saw some
> discussion on the JIRA about those risks, but the consensus seemed to
> be "we'll leave it up to the 2.6 and 2.7 release managers." I thought
> we did RMs per release rather than per branch? No one claiming to be a
> release manager ever spoke up AFAICT.
>
> Should this change be included? Should it go into a special 2.8
> release as mentioned in the ticket?
>
> On Thu, Mar 31, 2016 at 1:45 AM, Akira AJISAKA
> <ajisa...@oss.nttdata.co.jp> wrote:
>> Thank you Vinod!
>>
>> FYI: 2.7.3 will be a bit special release.
>>
>> HDFS-8791 bumped up the datanode layout version,
>> so rolling downgrade from 2.7.3 to 2.7.[0-2]
>> is impossible. We can rollback instead.
>>
>> https://issues.apache.org/jira/browse/HDFS-8791
>> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html
>>
>> Regards,
>> Akira
>>
>>
>> On 3/31/16 08:18, Vinod Kumar Vavilapalli wrote:
>>>
>>> Hi all,
>>>
>>> Got nudged about 2.7.3. Was previously waiting for 2.6.4 to go out (which
>>> did go out mid February). Got a little busy since.
>>>
>>> Following up the 2.7.2 maintenance release, we should work towards a
>>> 2.7.3. The focus obviously is to have blocker issues [1], bug-fixes and *no*
>>> features / improvements.
>>>
>>> I hope to cut an RC in a week - giving enough time for outstanding blocker
>>> / critical issues. Will start moving out any tickets that are not blockers
>>> and/or won’t fit the timeline - there are 3 blockers and 15 critical tickets
>>> outstanding as of now.
>>>
>>> Thanks,
>>> +Vinod
>>>
>>> [1] 2.7.3 release blockers:
>>> https://issues.apache.org/jira/issues/?filter=12335343
>>>
>>
>
>
>
> --
> busbey



-- 
busbey


Re: 2.7.3 release plan

2016-03-31 Thread Sean Busbey
A layout change in a maintenance release sounds very risky. I saw some
discussion on the JIRA about those risks, but the consensus seemed to
be "we'll leave it up to the 2.6 and 2.7 release managers." I thought
we did RMs per release rather than per branch? No one claiming to be a
release manager ever spoke up AFAICT.

Should this change be included? Should it go into a special 2.8
release as mentioned in the ticket?

On Thu, Mar 31, 2016 at 1:45 AM, Akira AJISAKA
 wrote:
> Thank you Vinod!
>
> FYI: 2.7.3 will be a bit special release.
>
> HDFS-8791 bumped up the datanode layout version,
> so rolling downgrade from 2.7.3 to 2.7.[0-2]
> is impossible. We can rollback instead.
>
> https://issues.apache.org/jira/browse/HDFS-8791
> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html
>
> Regards,
> Akira
>
>
> On 3/31/16 08:18, Vinod Kumar Vavilapalli wrote:
>>
>> Hi all,
>>
>> Got nudged about 2.7.3. Was previously waiting for 2.6.4 to go out (which
>> did go out mid February). Got a little busy since.
>>
>> Following up the 2.7.2 maintenance release, we should work towards a
>> 2.7.3. The focus obviously is to have blocker issues [1], bug-fixes and *no*
>> features / improvements.
>>
>> I hope to cut an RC in a week - giving enough time for outstanding blocker
>> / critical issues. Will start moving out any tickets that are not blockers
>> and/or won’t fit the timeline - there are 3 blockers and 15 critical tickets
>> outstanding as of now.
>>
>> Thanks,
>> +Vinod
>>
>> [1] 2.7.3 release blockers:
>> https://issues.apache.org/jira/issues/?filter=12335343
>>
>



-- 
busbey


Re: ASF OS X Build Infrastructure

2016-05-20 Thread Sean Busbey
Some talk about the MSDN-for-committers program recently passed by on a private
list. It's still active, it just changed homes within Microsoft. The
info should still be in the committer repo. If something is amiss
please let me know and I'll pipe up to the folks already plugged in to
confirming it's active.

On Fri, May 20, 2016 at 12:13 PM, Chris Nauroth
 wrote:
> It's very disappointing to see that vanish.  I'm following up to see if I
> can learn more about what happened or if I can do anything to help
> reinstate it.
>
> --Chris Nauroth
>
>
>
>
> On 5/20/16, 6:11 AM, "Steve Loughran"  wrote:
>
>>
>>> On 20 May 2016, at 10:40, Lars Francke  wrote:
>>>

>>>> Regarding lack of personal access to anything but Linux, I'll take this as
>>>> an opportunity to remind everyone that ASF committers (not just limited to
>>>> Hadoop committers) are entitled to a free MSDN license, which can get you
>>>> a Windows VM for validating Windows issues and any patches that touch
>>>> cross-platform concerns, like the native code.  Contributors who are not
>>>> committers still might struggle to get access to Windows, but all of us
>>>> reviewing and committing patches do have access.

>>>
>>> Actually, from all I can tell this MSDN offer has been discontinued for
>>> now. All the information has been removed from the committers repo. Do
>>>you
>>> have any more up to date information on this?
>>>
>>
>>
>>That's interesting.
>>
>>I did an SVN update and it went away... looks like something happened on
>>April 26
>>
>>No idea, though the svn log has a bit of detail
>>
>>
>>
>
>
>



-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: Compatibility guidelines for toString overrides

2016-05-12 Thread Sean Busbey
As a downstream user of Hadoop, it would be much clearer if the
toString functions included the appropriate annotations to say they're
non-public, evolving, or whatever.

Most downstream users of Hadoop aren't going to remember in-detail
exceptions to the java API compatibility rules, once they see that a
class is labeled Public/Stable, they're going to presume that applies
to all non-private members.

On Thu, May 12, 2016 at 9:32 AM, Colin McCabe  wrote:
> Hi all,
>
> Recently a discussion came up on HADOOP-13028 about the wisdom of
> overloading S3AInputStream#toString to output statistics information.
> It's a difficult judgement for me to make, since I'm not aware of any
> compatibility guidelines for InputStream#toString.  Do we have
> compatibility guidelines for toString functions?
>
> It seems like the output of toString functions is usually used as a
> debugging aid, rather than as a stable format suitable for UI display or
> object serialization.  Clearly, there are a few cases where we might
> want to specifically declare that a toString method is a stable API.
> However, I think if we attempt to treat the toString output of all
> public classes as stable, we will have greatly increased the API
> surface.  Should we formalize this and declare that toString functions
> are @Unstable, Evolving unless declared otherwise?
>
> best,
> Colin
>
>



-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [VOTE] Merge feature branch HADOOP-12930

2016-05-11 Thread Sean Busbey
+1 (non-binding)

reviewed everything, filed an additional subtask for a very trivial
typo in the docs. should be fine to make a full issue after close and
then fix.

tried merging locally, tried running through new shell tests (both
with and without bats installed), tried making an example custom
command (valid and malformed). everything looks great.

On Mon, May 9, 2016 at 1:26 PM, Allen Wittenauer  wrote:
>
> Hey gang!
>
> I’d like to call a vote to run for 7 days (ending May 16 at 13:30 PT) 
> to merge the HADOOP-12930 feature branch into trunk. This branch was 
> developed exclusively by me as per the discussion two months ago as a way to 
> make what would be a rather large patch hopefully easier to review.  The vast 
> majority of the branch is code movement in the same file, additional license 
> headers, maven assembly hooks for distribution, and variable renames. Not a 
> whole lot of new code, but a big diff file nonetheless.
>
> This branch modifies the ‘hadoop’, ‘hdfs’, ‘mapred’, and ‘yarn’ 
> commands to allow for subcommands to be added or modified at runtime.  This 
> allows for individual users or entire sites to tweak the execution 
> environment to suit their local needs.  For example, it has been a practice 
> for some locations to change the distcp jar out for a custom one.  Using this 
> functionality, it is possible that the ‘hadoop distcp’ command could run the 
> local version without overwriting the bundled jar and for existing 
> documentation (read: results from Internet searches) to work as written 
> without modification. This has the potential to be a huge win, especially for:
>
> * advanced end users looking to supplement the Apache Hadoop 
> experience
> * operations teams that may be able to leverage existing 
> documentation without having to maintain local “exception” docs
> * development groups wanting an easy way to trial 
> experimental features
>
> Additionally, this branch includes the following, related changes:
>
> * Adds the first unit tests for the ‘hadoop’ command
> * Adds the infrastructure for hdfs script testing and the 
> first unit test for the ‘hdfs’ command
> * Modifies the hadoop-tools components to be dynamic rather 
> than hard coded
> * Renames the shell profiles for hdfs, mapred, and yarn to be 
> consistent with other bundled profiles, including the ones introduced in this 
> branch
>
> Documentation, including a ‘hello world’-style example, is in the 
> UnixShellGuide markdown file.  (Of course!)
>
>  I am at ApacheCon this week if anyone wants to discuss in-depth.
>
> Thanks!
>
> P.S.,
>
> There are still two open sub-tasks.  These are blocked by other 
> issues so that we may add unit testing to the shell code in those respective 
> areas.  I’ll convert them to full issues after HADOOP-12930 is closed.
>
>
>
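The dynamic-subcommand mechanism Allen describes can be sketched as a plain shell function. The `hadoop_subcommand_<name>` convention below is my reading of the UnixShellGuide this branch adds, and the tiny driver stands in for the real `hadoop` wrapper so the sketch runs on its own:

```shell
#!/usr/bin/env bash
# Sketch: a user-supplied subcommand, written as a shell-profile function.
# The naming convention is assumed from the UnixShellGuide; verify there.
hadoop_subcommand_hello() {
  echo "hello from a user-supplied subcommand: $*"
}

# Stand-in for the dispatch the real 'hadoop' script would do at runtime:
run_subcommand() {
  "hadoop_subcommand_$1" "${@:2}"
}

run_subcommand hello world
# prints: hello from a user-supplied subcommand: world
```

In the real mechanism such a function would live in a shell profile picked up by the `hadoop` wrapper, rather than being dispatched by hand as here.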



-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: Yes/No newbie question on contributing

2016-07-26 Thread Sean Busbey
The current HowToContribute guide expressly tells folks that they
should ensure all the tests run and pass before and after their
change.

Sounds like we're due for an update if the expectation is now that
folks should be using -DskipTests and running tests on particular modules.
Maybe we could instruct folks on running the same checks we'll do in
the automated precommit builds?

On Tue, Jul 26, 2016 at 1:47 PM, Vinod Kumar Vavilapalli
 wrote:
> The short answer is that it is expected to pass without any errors.
>
> On branch-2.x, that command passes cleanly without any errors though it takes 
> north of 10 minutes. Note that I run it with -DskipTests - you don’t want to 
> wait for all the unit tests to run, that’ll take too much time. I expect 
> trunk to be the same too.
>
> Which branch are you running this against? What errors are you seeing? If it 
> is unit-tests you are talking about, you can instead run with skipTests, run 
> only specific tests or all tests in the module you are touching, make sure 
> they pass and then let Jenkins infrastructure run the remaining tests when 
> you submit the patch.
>
> +Vinod
>
>> On Jul 26, 2016, at 11:41 AM, Martin Rosse  wrote:
>>
>> Hi,
>>
>> In the How To Contribute doc, it says:
>>
>> "Try getting the project to build and test locally before writing code"
>>
>> So, just to be 100% certain before I keep troubleshooting things, this
>> means I should be able to run
>>
>> mvn clean install -Pdist -Dtar
>>
>> without getting any failures or errors at all...none...zero, right?
>>
>> I am surprised at how long this is taking as errors keep cropping up.
>> Should I just expect it to really take many hours (already at 10+) to work
>> through these issues? I am setting up a dev environment on an Ubuntu 14.04
>> 64-bit desktop from the AWS marketplace running on EC2.
>>
>> It would seem it's an obvious YES answer, but given the time investment
>> I've been making I just wanted to be absolutely sure before continuing.
>>
>> I thought it possible that maybe some errors, depending on their nature,
>> can be overlooked, and that other developers may be doing that in practice.
>> And hence perhaps I should as well to save time. Yes or No??
>>
>> Thank you,
>>
>> Martin
>
>
>



-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk

2016-07-21 Thread Sean Busbey
thanks for bringing this up! big +1 on upgrading dependencies for 3.0.

I have an updated patch for HADOOP-11804 ready to post this week. I've
been updating HBase's master branch to try to make use of it, but
could use some other reviews.
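For downstream projects like HBase, the end goal of the HADOOP-11804 shaded-client work is to depend on a shaded artifact instead of hadoop-client's full transitive tree. A hypothetical consumer POM fragment — artifact IDs taken from the shaded-client proposal, version left as a property since nothing had shipped yet:

```xml
<!-- Sketch: consuming the proposed shaded client artifacts.
     hadoop-client-api: compile-time classes only;
     hadoop-client-runtime: shaded third-party deps, runtime scope. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client-api</artifactId>
  <version>${hadoop.version}</version>
</dependency>
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client-runtime</artifactId>
  <version>${hadoop.version}</version>
  <scope>runtime</scope>
</dependency>
```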

On Thu, Jul 21, 2016 at 4:30 AM, Tsuyoshi Ozawa  wrote:
> Hi developers,
>
> I'd like to discuss how to make progress towards dependency
> management in the Apache Hadoop trunk code, since there has been lots of work
> on updating dependencies in parallel. Summarizing recent work and
> activities as follows:
>
> 0) Currently, we have merged the minimum dependency updates needed to make
> Hadoop JDK-8 compatible (compilable and runnable on JDK-8).
> 1) After that, some people suggested that we should update the other
> dependencies on trunk (e.g. protobuf, netty, jackson, etc.).
> 2) In parallel, Sangjin and Sean are working on classpath isolation:
> HADOOP-13070, HADOOP-11804 and HADOOP-11656.
>
> The main problems we try to solve in the activities above are as follows:
>
> * 1) tries to solve dependency hell between user-level jars and
> system (Hadoop)-level jars.
> * 2) tries to solve the problem of updating old libraries.
>
> IIUC, 1) and 2) look unrelated, but they are related in fact. 2) tries
> to separate class loaders between client-side dependencies and
> server-side dependencies in Hadoop, so we can change the policy for
> updating libraries after doing 2). We can also decide which libraries
> can be shaded after 2).
>
> Hence, IMHO, the straightforward way to go is to do 2) first.
> After that, we can update both client-side and server-side
> dependencies based on the new policy (maybe we should discuss what kinds of
> incompatibility are acceptable, and which are not).
>
> Thoughts?
>
> Thanks,
> - Tsuyoshi
>
>



-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [DISCUSS] Official Docker Image at release time

2016-07-21 Thread Sean Busbey
Folks might want to take a look at

https://issues.apache.org/jira/browse/HBASE-12721

and its associated review board posting of aggregate work. One of the
HBase community members has been creating infra (geared towards testing
ATM) for spinning up clusters of docker images. From what I
understand, right now that includes hdfs, hbase, and yarn
clusters.

On Thu, Jul 21, 2016 at 5:07 AM, Tsuyoshi Ozawa  wrote:
> Hi Kai,
>
> Thanks for your positive feedback!
>
>> I think also providing Dockerfile in some flexible form (like template
> file or fetching configuration from env variables?)
> is useful because Docker image has fixed configuration and there is no room
> unless overriding with FROM.
>
> Sounds like a good idea. What do you think about the different roles of
> master and workers? Should we prepare separate Dockerfiles for each
> role?
>
> Thanks,
> - Tsuyoshi
>
> On Tuesday, 19 July 2016, Sasaki Kai  wrote:
>
>> Hi Tsuyoshi
>>
>> Official docker image of Hadoop sounds very good to me.
>> In my use case, we usually use Docker image of Hadoop as PoC or
>> development cluster because it is easy to deploy and modify.
>> Currently we use below Docker image or Dockerfile to launch our
>> development cluster.
>> https://hub.docker.com/r/sequenceiq/hadoop-docker/
>>
>> But it does not use the latest Hadoop package and dependencies. There is
>> often some time lag to catch up and updating.
>>
>> So there is a reason to provide Docker image and Dockerfile from
>> community. It enables developers to try Hadoop easily.
>> I think also providing Dockerfile in some flexible form (like template
>> file or fetching configuration from env variables?)
>> is useful because Docker image has fixed configuration and there is no
>> room unless overriding with FROM.
>> I assume developers who want to try new Hadoop also would like to modify
>> configuration or dependencies.
>> This can be achieved by flexible Dockerfile which uses configuration
>> passed from environment variables.
>>
>> Thanks
>>
>> Kai Sasaki
>>
>>
>> On Jul 19, 2016, at 4:46 PM, Tsuyoshi Ozawa wrote:
>>
>> Hi developers,
>>
>> Klaus mentioned the availability of an official docker image of Apache
>> Hadoop. Is it time that we start to distribute an official docker
>> image at release time?
>>
>>
>> http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E
>>
>> Thoughts?
>>
>> Thanks,
>> - Tsuyoshi
>>
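Kai's suggestion — one image whose configuration is rendered from environment variables rather than baked in — could look roughly like this. Everything here is an assumption for illustration: the base image, the download URL layout, and the `entrypoint.sh` script that would template the *-site.xml files from the environment.

```dockerfile
# Sketch only: an env-driven Hadoop image, not an official layout.
FROM openjdk:8-jdk
ARG HADOOP_VERSION=2.7.3
RUN curl -fsSL \
      "https://archive.apache.org/dist/hadoop/common/hadoop-${HADOOP_VERSION}/hadoop-${HADOOP_VERSION}.tar.gz" \
    | tar -xz -C /opt \
 && ln -s "/opt/hadoop-${HADOOP_VERSION}" /opt/hadoop
ENV HADOOP_HOME=/opt/hadoop \
    PATH="${PATH}:/opt/hadoop/bin"
# entrypoint.sh (hypothetical) renders core-site.xml etc. from env vars
# at container start, so one image can serve master and worker roles.
COPY entrypoint.sh /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
```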



-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: Setting JIRA fix versions for 3.0.0 releases

2016-07-21 Thread Sean Busbey
> Longer-term, I assume the 2.x line is not ending with 2.8. So we'd still
> have the issue of things committed for 2.9.0 that will be appearing for the
> first time in 3.0.0-alpha1. Assuming a script exists to fix up 2.9 JIRAs,
> it's only incrementally more work to also fix up 2.8 and other unreleased
> versions too.

This. This is why I've been a bit confused about folks not wanting to
invest the time in getting multiple major branches working correctly.
With my "Hadoop community" hat on, I see that we struggle with
maintaining multiple maintenance lines just within 2.y, and I worry how
we're going to manage once 3.y is going.

With my downstream "HBase community" hat on, I'm pretty sure I still
need there to be 2.y releases. AFAIK the HBase community hasn't
made any plans to e.g. abandon Hadoop 2.y versions in our next major
release and we'd be very sad if all future features we needed added to
e.g. HDFS forced our users to upgrade across a major version.


On Thu, Jul 21, 2016 at 2:33 PM, Andrew Wang  wrote:
> I really, really want a 3.0.0-alpha1 ASAP, since it's basically impossible
> for downstreams to test incompat changes and new features without a release
> artifact. I've been doing test builds, and branch-3.0.0-alpha1 is ready for
> an RC besides possibly this fix version issue.
>
> I'm not too worried about splitting community bandwidth, for the following
> reasons:
>
> * 3.0.0-alpha1 is very explicitly an alpha, which means no quality or
> compatibility guarantees. It needs less vetting than a 2.x release.
> * Given that 3.0.0 is still in alpha, there aren't many true showstopper
> bugs. Most blockers I see apply to both 2.x and 3.0.0.
> * Community bandwidth isn't zero-sum. This particularly applies to people
> working on features that are only present in trunk, like EC, shell script
> rewrite, etc.
>
> Longer-term, I assume the 2.x line is not ending with 2.8. So we'd still
> have the issue of things committed for 2.9.0 that will be appearing for the
> first time in 3.0.0-alpha1. Assuming a script exists to fix up 2.9 JIRAs,
> it's only incrementally more work to also fix up 2.8 and other unreleased
> versions too.
>
> Best,
> Andrew
>
> On Thu, Jul 21, 2016 at 11:53 AM, Vinod Kumar Vavilapalli <
> vino...@apache.org> wrote:
>
>> The L & N fixes just went out, I’m working to push out 2.7.3 - running
>> into a Nexus issue. Once that goes out, I’ll immediately do a 2.8.0.
>>
>> Like I requested before in one of the 3.x threads, can we just line up
>> 3.0.0-alpha1 right behind 2.8.0?
>>
>> That simplifies most of this confusion, we can avoid splitting the
>> bandwidth from the community on fixing blockers / vetting these concurrent
>> releases. Waiting a little more for 3.0.0 alpha to avoid most of this is
>> worth it, IMO.
>>
>> Thanks
>> +Vinod
>>
>> > On Jul 21, 2016, at 11:34 AM, Andrew Wang 
>> wrote:
>> >
>> > Hi all,
>> >
>> > Since we're planning to spin releases off of both branch-2 and trunk, the
>> > changelog for 3.0.0-alpha1 based on JIRA information isn't accurate. This
>> > is because historically, we've only set 2.x fix versions, and 2.8.0 and
>> > 2.9.0 and etc have not been released. So there's a whole bunch of changes
>> > which will show up for the first time in 3.0.0-alpha1.
>> >
>> > I think I can write a script to (carefully) add 3.0.0-alpha1 to these
>> > JIRAs, but I figured I'd give a heads up here in case anyone felt
>> > differently. I can also update the HowToCommit page to match.
>> >
>> > Thanks,
>> > Andrew
>>
>>



-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk

2016-07-22 Thread Sean Busbey
My work on HADOOP-11804 *only* helps processes that sit outside of YARN. :)

On Fri, Jul 22, 2016 at 10:48 AM, Allen Wittenauer
<a...@effectivemachines.com> wrote:
>
> Does any of this work actually help processes that sit outside of YARN?
>
>> On Jul 21, 2016, at 12:29 PM, Sean Busbey <bus...@cloudera.com> wrote:
>>
>> thanks for bringing this up! big +1 on upgrading dependencies for 3.0.
>>
>> I have an updated patch for HADOOP-11804 ready to post this week. I've
>> been updating HBase's master branch to try to make use of it, but
>> could use some other reviews.
>>
>> On Thu, Jul 21, 2016 at 4:30 AM, Tsuyoshi Ozawa <oz...@apache.org> wrote:
>>> Hi developers,
>>>
>>> I'd like to discuss how to make an advance towards dependency
>>> management in Apache Hadoop trunk code since there has been lots work
>>> about updating dependencies in parallel. Summarizing recent works and
>>> activities as follows:
>>>
>>> 0) Currently, we have merged minimum update dependencies for making
>>> Hadoop JDK-8 compatible(compilable and runnable on JDK-8).
>>> 1) After that, some people suggest that we should update the other
>>> dependencies on trunk (e.g. protobuf, netty, jackson, etc.).
>>> 2) In parallel, Sangjin and Sean are working on classpath isolation:
>>> HADOOP-13070, HADOOP-11804 and HADOOP-11656.
>>>
>>> The main problems we try to solve in the activities above are as follows:
>>>
>>> * 1) tries to solve dependency hell between user-level jar and
>>> system(Hadoop)-level jar.
>>> * 2) tries to solve updating old libraries.
>>>
>>> IIUC, 1) and 2) look unrelated, but they are related in fact. 2) tries
>>> to separate class loaders between client-side dependencies and
>>> server-side dependencies in Hadoop, so we can change the policy for
>>> updating libraries after doing 2). We can also decide which libraries
>>> can be shaded after 2).
>>>
>>> Hence, IMHO, the straightforward path is to do 2) first. After that,
>>> we can update both client-side and server-side dependencies based on
>>> a new policy (maybe we should discuss what kinds of incompatibility
>>> are acceptable and which are not).
>>>
>>> Thoughts?
>>>
>>> Thanks,
>>> - Tsuyoshi
>>>
>>> -
>>> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
>>> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>>>
>>
>>
>>
>> --
>> busbey
>>
>> -
>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>>
>
>
> -
> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>



-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org
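The classpath isolation discussed in this thread (HADOOP-11804 / HADOOP-13070) centers on separating user-level class loading from the system (Hadoop) classpath. As a rough illustration of the idea — not Hadoop's actual implementation — a minimal "child-first" class loader might look like this:

```java
import java.net.URL;
import java.net.URLClassLoader;

/**
 * Sketch of child-first (parent-last) class loading: user-supplied jars
 * are consulted before the system classpath, so an application can ship
 * its own versions of libraries without colliding with the server's.
 */
public class ChildFirstClassLoader extends URLClassLoader {
    public ChildFirstClassLoader(URL[] userJars, ClassLoader systemLoader) {
        super(userJars, systemLoader);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                try {
                    // Look in the user's jars first...
                    c = findClass(name);
                } catch (ClassNotFoundException e) {
                    // ...and only fall back to the parent (system)
                    // classpath when the user jars don't have it.
                    c = super.loadClass(name, false);
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

Real implementations additionally force `java.*` and other designated "system" packages to always resolve through the parent loader; this sketch omits that bookkeeping for brevity.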



Re: Setting JIRA fix versions for 3.0.0 releases

2016-07-21 Thread Sean Busbey
On Thu, Jul 21, 2016 at 4:32 PM, Vinod Kumar Vavilapalli
 wrote:
>> I really, really want a 3.0.0-alpha1 ASAP, since it's basically impossible 
>> for downstreams to test incompat changes and new features without a release 
>> artifact. I've been doing test builds, and branch-3.0.0-alpha1 is ready for 
>> an RC besides possibly this fix version issue.
>
> Not arguing against the need for an alpha release, the question is if it can 
> wait till after 2.8 gets done.
>
> Orthogonally, do we have a report of the incompatible changes? Like the one I 
> generated for some of the branch-2 releases using late jdiff work from Li Lu 
> etc. We should do this and fix any inadvertent incompatibilities. Without 
> seeing this list of incompatibilities, why even make an alpha release and 
> force downstream components to discover issues that can be identified through 
> running reports.
>

I can come up with this, at least for source/binary API compatibility,
provided folks don't mind if I use the Java API Compliance Checker[1]
instead of jdiff.

I'm already familiar with using it quickly, especially with the
Audience Annotations, from my work in HBase.

Do you want this check from some particular branch-2 release? It
matters since the releases along branch-2 have themselves had some
noise[2].

[1]: https://github.com/lvc/japi-compliance-checker
[2]: http://abi-laboratory.pro/java/tracker/timeline/hadoop/

-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2016-07-21 Thread Sean Busbey
they've been puppetizing the Y! hosted machines, so it's probably
related. Can someone point me to the exact jenkins job? I think I
fixed this for HBase once already.

On Thu, Jul 21, 2016 at 3:31 PM, Allen Wittenauer
 wrote:
>
> I've already asked on builds and infra if it was related to some other 
> changes they made.  As usual, radio silence.
>
>  On the flip side, the PPC machines are making builds and in many ways might 
> be more valuable.  e.g., pointing out that the leveldb changes in YARN have 
> pretty much completely broken it out of the box on non-x86.
>
>
>> On Jul 21, 2016, at 1:01 PM, Akira Ajisaka  
>> wrote:
>>
>> Hi folks,
>>
>> qbt for trunk+JDK8 on Linux/x86 is failing by
>>
>> executable '/home/jenkins/tools/maven/latest3//bin/mvn' for 'maven' does not 
>> exist.
>>
>> Would someone fix this?
>>
>> Regards,
>> Akira
>>
>> On 7/21/16 00:41, Apache Jenkins Server wrote:
>>> For more details, see 
>>> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/101/
>>>
>>> [Jul 20, 2016 7:18:32 AM] (vvasudev) YARN-5309. Fix SSLFactory truststore 
>>> reloader thread leak in
>>> [Jul 20, 2016 12:42:51 PM] (stevel) HADOOP-13212 Provide an option to set 
>>> the socket buffers in
>>> [Jul 20, 2016 3:36:36 PM] (vinodkv) YARN-5340. Fixed a race condition in 
>>> RollingLevelDBTimelineStore that
>>> [Jul 20, 2016 5:44:11 PM] (aajisaka) HDFS-10425. Clean up NNStorage and 
>>> TestSaveNamespace. Contributed by
>>> [Jul 20, 2016 11:51:01 PM] (aajisaka) YARN-4883. Make consistent operation 
>>> name in AdminService. Contributed
>>> [Jul 20, 2016 11:57:55 PM] (wang) HADOOP-13383. Update release notes for 
>>> 3.0.0-alpha1.
>>> [Jul 21, 2016 3:38:20 AM] (lei) HADOOP-12928. Update netty to 3.10.5.Final 
>>> to sync with zookeeper. (lei)
>>> [Jul 21, 2016 6:50:47 AM] (rohithsharmaks) YARN-1126. Add validation of 
>>> users input nodes-states options to nodes
>>> [Jul 21, 2016 7:17:27 AM] (rohithsharmaks) YARN-5092. 
>>> TestRMDelegationTokens fails intermittently. Contributed by
>>>
>>>
>>>
>>>
>>>
>>>
>>> -
>>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>>> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>>>
>>
>>
>> -
>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>>
>
>
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>



-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2016-07-21 Thread Sean Busbey
Okay, I have something hacked together as of this build:

https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/109

It's terrible, but it will get us through the period when infra rolls
puppet out across the H* hosts. I can fix it again after that. (The
puppet rollout might also break all the precommit jobs, just a heads up.)

On Thu, Jul 21, 2016 at 4:54 PM, Akira Ajisaka
<ajisa...@oss.nttdata.co.jp> wrote:
> Failing jenkins job
> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/
>
> -Akira
>
>
> On 7/21/16 14:43, Sean Busbey wrote:
>>
>> they've been puppetizing the Y! hosted machines, so it's probably
>> related. Can someone point me to the exact jenkins job? I think I
>> fixed this for HBase once already.
>>
>> On Thu, Jul 21, 2016 at 3:31 PM, Allen Wittenauer
>> <a...@effectivemachines.com> wrote:
>>>
>>>
>>> I've already asked on builds and infra if it was related to some other
>>> changes they made.  As usual, radio silence.
>>>
>>>  On the flip side, the PPC machines are making builds and in many ways
>>> might be more valuable.  e.g., pointing out that the leveldb changes in YARN
>>> have pretty much completely broken it out of the box on non-x86.
>>>
>>>
>>>> On Jul 21, 2016, at 1:01 PM, Akira Ajisaka <ajisa...@oss.nttdata.co.jp>
>>>> wrote:
>>>>
>>>> Hi folks,
>>>>
>>>> qbt for trunk+JDK8 on Linux/x86 is failing by
>>>>
>>>> executable '/home/jenkins/tools/maven/latest3//bin/mvn' for 'maven' does
>>>> not exist.
>>>>
>>>> Would someone fix this?
>>>>
>>>> Regards,
>>>> Akira
>>>>
>>>> On 7/21/16 00:41, Apache Jenkins Server wrote:
>>>>>
>>>>> For more details, see
>>>>> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/101/
>>>>>
>>>>> [Jul 20, 2016 7:18:32 AM] (vvasudev) YARN-5309. Fix SSLFactory
>>>>> truststore reloader thread leak in
>>>>> [Jul 20, 2016 12:42:51 PM] (stevel) HADOOP-13212 Provide an option to
>>>>> set the socket buffers in
>>>>> [Jul 20, 2016 3:36:36 PM] (vinodkv) YARN-5340. Fixed a race condition
>>>>> in RollingLevelDBTimelineStore that
>>>>> [Jul 20, 2016 5:44:11 PM] (aajisaka) HDFS-10425. Clean up NNStorage and
>>>>> TestSaveNamespace. Contributed by
>>>>> [Jul 20, 2016 11:51:01 PM] (aajisaka) YARN-4883. Make consistent
>>>>> operation name in AdminService. Contributed
>>>>> [Jul 20, 2016 11:57:55 PM] (wang) HADOOP-13383. Update release notes
>>>>> for 3.0.0-alpha1.
>>>>> [Jul 21, 2016 3:38:20 AM] (lei) HADOOP-12928. Update netty to
>>>>> 3.10.5.Final to sync with zookeeper. (lei)
>>>>> [Jul 21, 2016 6:50:47 AM] (rohithsharmaks) YARN-1126. Add validation of
>>>>> users input nodes-states options to nodes
>>>>> [Jul 21, 2016 7:17:27 AM] (rohithsharmaks) YARN-5092.
>>>>> TestRMDelegationTokens fails intermittently. Contributed by
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> -
>>>>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>>>>> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>>>>>
>>>>
>>>>
>>>> -
>>>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>>>> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>>>>
>>>
>>>
>>> -
>>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>>> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>>>
>>
>>
>>
>



-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2016-07-21 Thread Sean Busbey
Infra's going to start trying out the puppet stuff on H2 in about an
hour to get a better idea of what might break when the full roll out
happens. If anyone notices something amiss, just give me a ping (or
try to update labels to use "yahoo-not-h2" instead of "Hadoop")

On Thu, Jul 21, 2016 at 6:49 PM, Sean Busbey <bus...@cloudera.com> wrote:
> Okay, I have something hacked together as of this build:
>
> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/109
>
> It's terrible, but it will get us through the period when infra rolls
> puppet out across the H* hosts. I can fix it again after that. (The
> puppet rollout might also break all the precommit jobs, just a heads up.)
>
> On Thu, Jul 21, 2016 at 4:54 PM, Akira Ajisaka
> <ajisa...@oss.nttdata.co.jp> wrote:
>> Failing jenkins job
>> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/
>>
>> -Akira
>>
>>
>> On 7/21/16 14:43, Sean Busbey wrote:
>>>
>>> they've been puppetizing the Y! hosted machines, so it's probably
>>> related. Can someone point me to the exact jenkins job? I think I
>>> fixed this for HBase once already.
>>>
>>> On Thu, Jul 21, 2016 at 3:31 PM, Allen Wittenauer
>>> <a...@effectivemachines.com> wrote:
>>>>
>>>>
>>>> I've already asked on builds and infra if it was related to some other
>>>> changes they made.  As usual, radio silence.
>>>>
>>>>  On the flip side, the PPC machines are making builds and in many ways
>>>> might be more valuable.  e.g., pointing out that the leveldb changes in 
>>>> YARN
>>>> have pretty much completely broken it out of the box on non-x86.
>>>>
>>>>
>>>>> On Jul 21, 2016, at 1:01 PM, Akira Ajisaka <ajisa...@oss.nttdata.co.jp>
>>>>> wrote:
>>>>>
>>>>> Hi folks,
>>>>>
>>>>> qbt for trunk+JDK8 on Linux/x86 is failing by
>>>>>
>>>>> executable '/home/jenkins/tools/maven/latest3//bin/mvn' for 'maven' does
>>>>> not exist.
>>>>>
>>>>> Would someone fix this?
>>>>>
>>>>> Regards,
>>>>> Akira
>>>>>
>>>>> On 7/21/16 00:41, Apache Jenkins Server wrote:
>>>>>>
>>>>>> For more details, see
>>>>>> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/101/
>>>>>>
>>>>>> [Jul 20, 2016 7:18:32 AM] (vvasudev) YARN-5309. Fix SSLFactory
>>>>>> truststore reloader thread leak in
>>>>>> [Jul 20, 2016 12:42:51 PM] (stevel) HADOOP-13212 Provide an option to
>>>>>> set the socket buffers in
>>>>>> [Jul 20, 2016 3:36:36 PM] (vinodkv) YARN-5340. Fixed a race condition
>>>>>> in RollingLevelDBTimelineStore that
>>>>>> [Jul 20, 2016 5:44:11 PM] (aajisaka) HDFS-10425. Clean up NNStorage and
>>>>>> TestSaveNamespace. Contributed by
>>>>>> [Jul 20, 2016 11:51:01 PM] (aajisaka) YARN-4883. Make consistent
>>>>>> operation name in AdminService. Contributed
>>>>>> [Jul 20, 2016 11:57:55 PM] (wang) HADOOP-13383. Update release notes
>>>>>> for 3.0.0-alpha1.
>>>>>> [Jul 21, 2016 3:38:20 AM] (lei) HADOOP-12928. Update netty to
>>>>>> 3.10.5.Final to sync with zookeeper. (lei)
>>>>>> [Jul 21, 2016 6:50:47 AM] (rohithsharmaks) YARN-1126. Add validation of
>>>>>> users input nodes-states options to nodes
>>>>>> [Jul 21, 2016 7:17:27 AM] (rohithsharmaks) YARN-5092.
>>>>>> TestRMDelegationTokens fails intermittently. Contributed by
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> -
>>>>>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>>>>>> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>>>>>>
>>>>>
>>>>>
>>>>> -
>>>>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>>>>> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>>>>>
>>>>
>>>>
>>>> -
>>>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>>>> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>>>>
>>>
>>>
>>>
>>
>
>
>
> --
> busbey



-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: Setting JIRA fix versions for 3.0.0 releases

2016-07-26 Thread Sean Busbey
Yes, the Java API Compliance Checker allows specifying Annotations to
pare down where incompatible changes happen. It was added some time
ago based on feedback from the Apache HBase project.

The limitations I've found are: 1) at least earlier versions only
supported annotations at the class level (which rules out method-level
annotations like VisibleForTesting), 2) sometimes it will include more
restricted classes if they are used in less restrictive APIs (e.g. if
an IA.Public class makes use of an IA.Private class in a method
signature).

At least when we've used it in HBase, these limitations have been very
easy to spot and explain in a small amount of text. I expect I will be
able to do the same with Hadoop. If we'd like to automate this, the
author has been very responsive to feature requests thus far.
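The annotation-based filtering described above can be illustrated in plain Java. The @Public/@Private annotations below are stand-ins modeled on Hadoop's org.apache.hadoop.classification.InterfaceAudience annotations, and the filter is a sketch of the idea, not the compliance checker's actual code:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.ArrayList;
import java.util.List;

public class AudienceFilterDemo {
    // Stand-ins modeled on Hadoop's InterfaceAudience annotations; the
    // real ones live in org.apache.hadoop.classification.
    @Retention(RetentionPolicy.RUNTIME) public @interface Public {}
    @Retention(RetentionPolicy.RUNTIME) public @interface Private {}

    // Hypothetical example classes, not real Hadoop types.
    @Public public static class FileSystemApi {}
    @Private public static class NameNodeInternals {}
    public static class Unannotated {}

    /** Keep only the classes annotated for public consumption. */
    public static List<Class<?>> publicSurface(Class<?>... candidates) {
        List<Class<?>> out = new ArrayList<>();
        for (Class<?> c : candidates) {
            // Class-level check only -- mirroring the limitation that
            // method-level annotations aren't consulted.
            if (c.isAnnotationPresent(Public.class)) {
                out.add(c);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Only the @Public class survives the filter.
        System.out.println(publicSurface(
            FileSystemApi.class, NameNodeInternals.class, Unannotated.class));
    }
}
```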


On Mon, Jul 25, 2016 at 3:47 PM, Vinod Kumar Vavilapalli
 wrote:
> Actually, I wouldn’t trust this report as it stands today at all.
>
> I quickly glanced the report, looking for what it highlights as
> incompatible. But the ones that I saw have private / visible for testing
> annotations. Other than acting as useless evidence for those lashing out on
> branch-2, this won’t do much good.
>
> Whenever we start working towards switching to this tool, it should
> incorporate the same exclude-annotations logic that the jdiff code-path does
> today. Do you think that is possible?
>
> Thanks
> +Vinod
>
> On Jul 22, 2016, at 4:53 PM, Vinod Kumar Vavilapalli 
> wrote:
>
> Do you want this check from some particular branch-2 release? It
> matters since the releases along branch-2 have themselves had some
> noise[2].
>
> [1]: https://github.com/lvc/japi-compliance-checker
> 
> [2]: http://abi-laboratory.pro/java/tracker/timeline/hadoop/
> 
>
> --
> busbey
>
>



-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: Setting JIRA fix versions for 3.0.0 releases

2016-07-26 Thread Sean Busbey
Just so I don't waste time chasing my tail, should I interpret this
email and the associated JIRA as the PMC preferring I not spend
volunteer time providing a compatibility breakdown as previously
discussed?

On Mon, Jul 25, 2016 at 7:54 PM, Wangda Tan <wheele...@gmail.com> wrote:
> I just filed ticket https://issues.apache.org/jira/browse/HADOOP-13423 to
> track running JDIFF on trunk and analyze results for Hadoop-common. I will
> work on that and keep the JIRA and this thread updated. We need to do the
> same work for YARN/MR/HDFS.
>
> On Mon, Jul 25, 2016 at 5:47 PM, Wangda Tan <wheele...@gmail.com> wrote:
>>
>> I agree with what Vinod mentioned: we need to revisit all incompatible
>> changes and revert unnecessary ones, even if we don't have any
>> compatibility guarantees between 2.x and 3.x. Making users less
>> frustrated while trying 3.x is always the better option to me.
>>
>> To achieve this we need to run jdiff for trunk and look at results. I
>> would suggest to do this before 3.0.0-alpha1 release.
>>
>> In addition, for people who will try the 3.0.0-alpha1 release artifacts, a
>> guide to migrating from 2.x to 3.x would be very helpful, and it can also
>> help people better understand what has changed (just like
>> http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html)
>>
>> Thoughts?
>>
>> Thanks,
>> Wangda
>>
>>
>> On Thu, Jul 21, 2016 at 2:41 PM, Sean Busbey <bus...@cloudera.com> wrote:
>>>
>>> On Thu, Jul 21, 2016 at 4:32 PM, Vinod Kumar Vavilapalli
>>> <vino...@apache.org> wrote:
>>> >> I really, really want a 3.0.0-alpha1 ASAP, since it's basically
>>> >> impossible for downstreams to test incompat changes and new features 
>>> >> without
>>> >> a release artifact. I've been doing test builds, and branch-3.0.0-alpha1 
>>> >> is
>>> >> ready for an RC besides possibly this fix version issue.
>>> >
>>> > Not arguing against the need for an alpha release, the question is if
>>> > it can wait till after 2.8 gets done.
>>> >
>>> > Orthogonally, do we have a report of the incompatible changes? Like the
>>> > one I generated for some of the branch-2 releases using late jdiff work 
>>> > from
>>> > Li Lu etc. We should do this and fix any inadvertent incompatibilities.
>>> > Without seeing this list of incompatibilities, why even make an alpha
>>> > release and force downstream components to discover issues that can be
>>> > identified through running reports.
>>> >
>>>
>>> I can come up with this, at least for source/binary API compatibility,
>>> provided folks don't mind if I use the Java API Compliance Checker[1]
>>> instead of jdiff.
>>>
>>> I'm already familiar with using it quickly, especially with the
>>> Audience Annotations, from my work in HBase.
>>>
>>> Do you want this check from some particular branch-2 release? It
>>> matters since the releases along branch-2 have themselves had some
>>> noise[2].
>>>
>>> [1]: https://github.com/lvc/japi-compliance-checker
>>> [2]: http://abi-laboratory.pro/java/tracker/timeline/hadoop/
>>>
>>> --
>>> busbey
>>>
>>> -
>>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>>> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>>>
>>
>



-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [DISCUSS] 2.6.x line releases

2016-07-20 Thread Sean Busbey
The HBase community would like more 2.6.z releases.

On Wed, Jul 20, 2016 at 2:00 PM, Ravi Prakash  wrote:
> We for one are not using 2.6.*
>
> On Tue, Jul 19, 2016 at 1:21 PM, Sangjin Lee  wrote:
>
>> It's been a while since we had a release on the 2.6.x line. Is it time to
>> get ready for a 2.6.5 release? Are folks using and relying on releases on
>> 2.6.x? If there is enough interest, I could take that on. Let me know.
>>
>> I also want to gauge the community's interest in maintaining the 2.6.x
>> line. How long do we maintain this line? What would be a sensible EOL
>> policy? Note that as the main code lines start diverging (java version,
>> features, etc.), the cost of maintaining multiple release lines does go up.
>> I'd love to hear your thoughts.
>>
>> Regards,
>> Sangjin
>>



-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [DISCUSS] Official Docker Image at release time

2016-07-19 Thread Sean Busbey
FWIW, there is a "blessed" apache area on docker hub now, and it's
just an INFRA request to point out the needed Dockerfile in the repo.

PMCs can also request write access to bintray hosting of docker images
for PMC members.

Info on INFRA-8441, example on INFRA-12019.

A Docker image that starts up a pseudo-distributed instance would be
great for new folks. I don't know if it's worth the investment to
build images that do more complex multi-instance deployments (though
there is some fledgling work in HBase for tooling that would do this
for us).

On Tue, Jul 19, 2016 at 2:46 AM, Tsuyoshi Ozawa  wrote:
> Hi developers,
>
> Klaus mentioned the availability of an official docker image of Apache
> Hadoop. Is it time that we start to distribute an official docker
> image at release time?
>
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E
>
> Thoughts?
>
> Thanks,
> - Tsuyoshi
>
> -
> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>



-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: Jenkins Node Labelling Documentation

2016-08-04 Thread Sean Busbey
On Thu, Aug 4, 2016 at 4:16 PM, Gav <gmcdon...@apache.org> wrote:
>
>
> On Fri, Aug 5, 2016 at 3:14 AM, Sean Busbey <bus...@cloudera.com> wrote:
>>
>> > Why? yahoo-not-h2 is really not required since H2 is the same as all the
>> > other H* nodes.
>>
>> The yahoo-not-h2 label exists because the H2 node was misconfigured
>> for a long time and would fail builds as a result.
>
>
> Yes I know, but now its not, so is no longer needed.
>
>>
>> What label will
>> jobs that are currently configured to avoid H2 be migrated to? Will
>> they be migrated automatically?
>
>
> Currently I'm asking that projects make the move themselves. Most jobs
> would be fine as they have multiple labels, so they just need to drop
> the yahoo-not-h2 label to give them access to H2. If, when I drop the
> label, I see jobs with it in use, I'll remove it.
>

I don't see a label I can move to that covers the same machines as the
current yahoo-not-h2 nodes and H2. It looks like a union of "hadoop"
and "docker" would do it, but "docker" is going away. Also I have to
have a single label for use in multi-configuration builds or jenkins
will treat the two labels as an axis for test selection rather than as
just a restriction for where the jobs can run. I could try to go back
to using an expression, but IIRC that gave us things like *s in the
path used for tests, which was not great.

Can we maybe expand the Hadoop label? (or would "beefy" cover the set?)

If the H* nodes are all the same, why do we need the labels HDFS,
MapReduce, Pig, Falcon, Tez, and ZooKeeper in addition to the Hadoop
label?


-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Where do we track our Jenkins stuff

2016-08-05 Thread Sean Busbey
Apologies, but I haven't managed to figure out where we track our changes
to Jenkins.

The ASF build infra has some changes to Java and Maven installations coming
and a fair number of Hadoop related jobs need to be updated.

If I take on doing said updates, do I track it in a JIRA? Or is there some
particular mailing list thread or a wiki page or something?

-- 
Sean Busbey


Re: Jenkins Node Labelling Documentation

2016-08-04 Thread Sean Busbey
> Why? yahoo-not-h2 is really not required since H2 is the same as all the 
> other H* nodes.

The yahoo-not-h2 label exists because the H2 node was misconfigured
for a long time and would fail builds as a result. What label will
jobs that are currently configured to avoid H2 be migrated to? Will
they be migrated automatically?

> The 'docker' label references installed software and should be dropped. We 
> have and will continue to install docker wherever it is required.

How do we determine where it's required? If I have a job that relies
on docker being installed, do I just get to have it run unlabeled?

On Thu, Aug 4, 2016 at 4:18 AM, Gav  wrote:
> Hi All,
>
> Following on from my earlier mails regarding Java, Maven and Ant
> consolidations, I thought
> you might like a page detailing the Jenkins Labels and which nodes they
> belong to.
>
> I've put it up here :-
>
> https://cwiki.apache.org/confluence/display/INFRA/Jenkins+node+labels
>
> I hope you find it useful.
>
> In addition I propose to remove a couple of redundant labels to make
> choosing a label
> easier.
>
> Proposal is to remove labels yahoo-not-h2, ubuntu and docker. Why?
> yahoo-not-h2 is really not required since H2 is the same as all the other
> H* nodes. ubuntu is a copy of Ubuntu and both are identical.
> The 'docker' label references installed software and should be dropped. We
> have and will continue to install docker wherever it is required.
>
> If no objections I'll remove these labels in ~2 weeks time on 19th August
>
> HTH
>
> Gav... (ASF Infrastructure Team)



-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: Where do we track our Jenkins stuff

2016-08-06 Thread Sean Busbey
Hi folks!

Any response here? The notice from infra about projects needing to update
for Maven and Java had a three day migration period.

I'm happy to just make something up, but I want to make sure it's useful
for the other person or people who currently update build stuff.

-- 
Sean Busbey
On Aug 5, 2016 08:36, "Sean Busbey" <bus...@cloudera.com> wrote:

> Apologies, but I haven't managed to figure out where we track our changes
> to Jenkins.
>
> The ASF build infra has some changes to Java and Maven installations
> coming and a fair number of Hadoop related jobs need to be updated.
>
> If I take on doing said updates, do I track it in a JIRA? Or is there some
> particular mailing list thread or a wiki page or something?
>
> --
> Sean Busbey
>


Re: Where do we track our Jenkins stuff

2016-08-06 Thread Sean Busbey
Okay, I'll get something together today.

Thanks for the response Chris!

-- 
Sean Busbey
On Aug 6, 2016 10:48, "Chris Nauroth" <cnaur...@hortonworks.com> wrote:

> Hello Sean,
>
> Thank you for bringing this up.  The lack of response could indicate that
> we’re not doing enough recently to maintain our Jenkins jobs.  I’m not
> aware of any existing up-to-date tracking for these jobs.  Starting a wiki
> page to catalog all of our jobs sounds like a great step toward improving
> on this.
>
> Thank you for your help with this!
>
> --Chris Nauroth
>
> On 8/6/16, 6:27 AM, "Sean Busbey" <bus...@cloudera.com> wrote:
>
> Hi folks!
>
> Any response here? The notice from infra about projects needing to
> update
> for Maven and Java had a three day migration period.
>
> I'm happy to just make something up, but I want to make sure it's
> useful
> for the other person or people who currently update build stuff.
>
> --
> Sean Busbey
> On Aug 5, 2016 08:36, "Sean Busbey" <bus...@cloudera.com> wrote:
>
> > Apologies, but I haven't managed to figure out where we track our
> changes
> > to Jenkins.
> >
> > The ASF build infra has some changes to Java and Maven installations
> > coming and a fair number of Hadoop related jobs need to be updated.
> >
> > If I take on doing said updates, do I track it in a JIRA? Or is
> there some
> > particular mailing list thread or a wiki page or something?
> >
> > --
> > Sean Busbey
> >
>
>
>


Re: Jenkins Node Labelling Documentation

2016-08-08 Thread Sean Busbey
I'm trying to transition jobs off of the yahoo-not-h2 label, but again
I don't see a single label I can use that covers an appropriate set of
nodes.

Can we expand Hadoop to include H10 and H11? Can we come up with a
label that covers both the H* and the physical ubuntu hosts that have
been puppetized?

On Thu, Aug 4, 2016 at 5:05 PM, Gav <gmcdon...@apache.org> wrote:
>
>
> On Fri, Aug 5, 2016 at 7:52 AM, Sean Busbey <bus...@cloudera.com> wrote:
>>
>> On Thu, Aug 4, 2016 at 4:16 PM, Gav <gmcdon...@apache.org> wrote:
>> >
>> >
>> > On Fri, Aug 5, 2016 at 3:14 AM, Sean Busbey <bus...@cloudera.com> wrote:
>> >>
>> >> > Why? yahoo-not-h2 is really not required since H2 is the same as all
>> >> > the
>> >> > other H* nodes.
>> >>
>> >> The yahoo-not-h2 label exists because the H2 node was misconfigured
>> >> for a long time and would fail builds as a result.
>> >
>> >
>> > Yes I know, but now its not, so is no longer needed.
>> >
>> >>
>> >> What label will
>> >> jobs that are currently configured to avoid H2 be migrated to? Will
>> >> they be migrated automatically?
>> >
>> >
>> > Currently I'm asking that projects make the move themselves. Most
>> > jobs would be fine as they have multiple labels, so they just need
>> > to drop the yahoo-not-h2 label to give them access to H2. If, when
>> > I drop the label, I see jobs with it in use, I'll remove it.
>> >
>>
>> I don't see a label I can move to that covers the same machines as the
>> current yahoo-not-h2 nodes and H2. It looks like a union of "hadoop"
>> and "docker" would do it, but "docker" is going away. Also I have to
>> have a single label for use in multi-configuration builds or jenkins
>> will treat the two labels as an axis for test selection rather than as
>> just a restriction for where the jobs can run. I could try to go back
>> to using an expression, but IIRC that gave us things like *s in the
>> path used for tests, which was not great.
>>
>> Can we maybe expand the Hadoop label? (or would "beefy" cover the set?)
>>
>> If the H* nodes are all the same, why do we need the labels HDFS,
>> MapReduce, Pig, Falcon, Tez, and ZooKeeper in addition to the Hadoop
>> label?
>
>
> I was thinking the same yep, those could all go too imho but wanted to
> discuss
> that one seperately.
>
> Gav...
>
>>
>>
>>
>> --
>> busbey
>
>



-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: HADOOP-13410

2016-08-09 Thread Sean Busbey
The ServiceLoader API won't load services out of the unpacked version, right?

On Tue, Aug 9, 2016 at 11:00 AM, Sangjin Lee  wrote:
> I'd like to get feedback from the community (especially those who might
> remember this) on HADOOP-13410:
> https://issues.apache.org/jira/browse/HADOOP-13410
>
> It appears that Hadoop's RunJar adds the original jar to the app's
> classpath even though the unjarred contents of the jar are in the
> classpath. As long as the file is a jar (or its variant), this seems
> completely superfluous. My suspicion is that the line of code that adds the
> jar to the classpath may have been left there by accident.
>
> Could anyone confirm this? Does anyone see an issue with removing the jar
> from the classpath? I've tested the fix with a couple of simple apps, and I
> didn't see a problem.
>
> Thanks,
> Sangjin



-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org
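Sean's ServiceLoader question is easy to check empirically: ServiceLoader discovers providers through META-INF/services entries on the classpath, and it works the same whether that entry comes from a jar or from an exploded directory (as RunJar's temp directory would be, provided the META-INF contents survive the unjar). A self-contained sketch — Greeter and HelloGreeter are hypothetical names, not Hadoop classes:

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ServiceLoader;

public class ServiceLoaderDemo {
    /** Hypothetical service interface standing in for a real plugin point. */
    public interface Greeter { String greet(); }

    /** Provider implementation; needs a public no-arg constructor. */
    public static class HelloGreeter implements Greeter {
        public String greet() { return "hello"; }
    }

    public static void main(String[] args) throws IOException {
        // Build an *exploded directory* classpath entry containing only
        // the META-INF/services registration -- no jar involved.
        Path dir = Files.createTempDirectory("exploded");
        Path services = dir.resolve("META-INF/services");
        Files.createDirectories(services);
        Files.write(services.resolve(Greeter.class.getName()),
            HelloGreeter.class.getName().getBytes(StandardCharsets.UTF_8));

        // A loader whose classpath is the exploded dir; the classes
        // themselves resolve through the parent (this demo's) loader.
        URL[] cp = { dir.toUri().toURL() };
        try (URLClassLoader loader =
                 new URLClassLoader(cp, ServiceLoaderDemo.class.getClassLoader())) {
            for (Greeter g : ServiceLoader.load(Greeter.class, loader)) {
                System.out.println(g.greet());
            }
        }
    }
}
```

Running this prints "hello", showing that the services registration resolves from a directory entry; so the question for HADOOP-13410 reduces to whether the unjarred tree preserves META-INF/services.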



Re: [DICUSS] Upgrading Guice to 4.0(HADOOP-12064)

2016-06-29 Thread Sean Busbey
At the very least, I'm running through an updated shaded hadoop client
this week[1] (HBase is my test application and it wandered onto some
private things that broke in branch-2). And Sangjin has a good lead on
an lower-short-term-cost incremental improvement for runtime isolation
of apps built on yarn/mapreduce[2]. He's been patiently waiting for
more review feedback.


[1]: https://issues.apache.org/jira/browse/HADOOP-11804
[2]: https://issues.apache.org/jira/browse/HADOOP-13070

On Wed, Jun 29, 2016 at 12:33 PM, Vinod Kumar Vavilapalli
 wrote:
> My strong expectation is that we’ll have a version of classpath isolation in 
> our first release of 3.x. I’m planning to spend some cycles right away on 
> this.
>
> Assuming classpath isolation gets in, it is reasonable to bump up our 
> dependencies like Jetty / Guice to the latest stable versions.
>
> Thanks
> +Vinod
>
>> On Jun 27, 2016, at 6:01 AM, Tsuyoshi Ozawa  wrote:
>>
>> Hi developers,
>>
>> I plan to upgrade the Google Guice dependency on trunk. The change
>> also includes asm and cglib upgrades.
>> I checked following points:
>>
>> * Both HDFS and YARN UIs work well.
>> * All webIU-related tests pass as described on HADOOP-12064.
>> * Ran mapreduce job, and it works well.
>>
>> https://issues.apache.org/jira/browse/HADOOP-12064
>>
>> Do you have any concerns or opinions?  I would like to merge it to
>> trunk this Friday if there are no objections.
>>
>> Best,
>> - Tsuyoshi
>>
>> -
>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>>
>>
>
>
> -
> To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
>



-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [VOTE] Release Apache Hadoop 3.0.0-alpha1 RC0

2016-08-31 Thread Sean Busbey
It's also the key Andrew has in the project's KEYS file:

http://www.apache.org/dist/hadoop/common/KEYS
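For anyone hitting Eric's problem below, the full verification flow can be sketched as follows. The tarball name is a placeholder, and the KEYS download and gpg steps are shown only as comments since they need network access and the real release artifacts; only the checksum step is simulated with a local file:

```shell
set -e
cd "$(mktemp -d)"
# Placeholder artifact name for the sketch.
TARBALL=hadoop-3.0.0-alpha1.tar.gz

# 1. Import the release manager's key from the project KEYS file:
#      curl -O https://www.apache.org/dist/hadoop/common/KEYS
#      gpg --import KEYS
# 2. Verify the detached signature shipped next to the tarball:
#      gpg --verify "${TARBALL}.asc" "${TARBALL}"
# 3. Verify the published checksum (simulated here with a local file):
printf 'release bits' > "$TARBALL"
sha256sum "$TARBALL" > "${TARBALL}.sha256"
sha256sum -c "${TARBALL}.sha256"
```

The last command prints `<tarball>: OK` when the bits match the published digest; a signature check against an imported KEYS-file key is what ties the artifact back to the release manager.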



On Tue, Aug 30, 2016 at 4:12 PM, Andrew Wang  wrote:
> Hi Eric, thanks for trying this out,
>
> I tried this gpg command to get my key, seemed to work:
>
> # gpg --keyserver pgp.mit.edu --recv-keys 7501105C
> gpg: requesting key 7501105C from hkp server pgp.mit.edu
> gpg: /root/.gnupg/trustdb.gpg: trustdb created
> gpg: key 7501105C: public key "Andrew Wang (CODE SIGNING KEY) <
> andrew.w...@cloudera.com>" imported
> gpg: no ultimately trusted keys found
> gpg: Total number processed: 1
> gpg:   imported: 1  (RSA: 1)
>
> Also found via search:
> http://pgp.mit.edu/pks/lookup?search=wang%40apache.org&op=index
>
>
> On Tue, Aug 30, 2016 at 2:06 PM, Eric Badger  wrote:
>
>> I don't know why my email client keeps getting rid of all of my spacing.
>> Resending the same email so that it is actually legible...
>>
>> All on OSX 10.11.6:
>> - Verified the hashes. However, Andrew, I don't know where to find your
>> public key, so I wasn't able to verify that they were signed by you.
>> - Built from source
>> - Deployed a pseudo-distributed cluster
>> - Ran a few sample jobs
>> - Poked around the RM UI
>> - Poked around the attached website locally via the tarball
>>
>>
>> I did find one odd thing, though. It could be a misconfiguration on my
>> system, but I've never had this problem before with other releases (though
>> I deal almost exclusively in 2.x and so I imagine things might be
>> different). When I run a sleep job, I do not see any
>> diagnostics/logs/counters printed out by the client. Initially I ran the
>> job like I would on 2.7 and it failed (because I had not set
>> yarn.app.mapreduce.am.env and mapreduce.admin.user.env), but I didn't see
>> anything until I looked at the RM UI. There I was able to see all of the
>> logs for the failed job and diagnose the issue. Then, once I fixed my
>> parameters and ran the job again, I still didn't see any
>> diagnostics/logs/counters.
>>
>>
>> ebadger@foo: env | grep HADOOP
>> HADOOP_HOME=/Users/ebadger/Downloads/hadoop-3.0.0-alpha1-src/hadoop-dist/target/hadoop-3.0.0-alpha1/
>> HADOOP_CONF_DIR=/Users/ebadger/conf
>> ebadger@foo: $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/share/hadoop/
>> mapreduce/hadoop-mapreduce-client-jobclient-3.0.0-alpha1-tests.jar sleep
>> -Dyarn.app.mapreduce.am.env="HADOOP_MAPRED_HOME=$HADOOP_HOME"
>> -Dmapreduce.admin.user.env="HADOOP_MAPRED_HOME=$HADOOP_HOME" -mt 1 -rt 1
>> -m 1 -r 1
>> WARNING: log4j.properties is not found. HADOOP_CONF_DIR may be incomplete.
>> ebadger@foo:
>>
>>
>> After running the above command, the RM UI showed a successful job, but as
>> you can see, I did not have anything printed onto the command line.
>> Hopefully this is just a misconfiguration on my part, but I figured that I
>> would point it out just in case.
>>
>>
>> Thanks,
>>
>>
>> Eric
>>
>>
>>

Re: Upgrading Hadoop dependencies and catching potential incompatibilities for HBase

2016-09-28 Thread Sean Busbey
On Wed, Sep 28, 2016 at 1:55 PM, Enis Söztutar  wrote:
> Can Hadoop please shade ALL of the dependencies (including PB) in Hadoop-3
> so that we do not have this mess going forward.
>
> Enis


I'm working on this under HADOOP-11804, using HBase as my test application.

-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[NOTICE] breaking precommit checks

2016-11-08 Thread Sean Busbey
Hi folks!

a host of precommit checks are currently timing out due to an update
to our job configs (the timeout is currently set to 50 minutes).

I'm in the process of giving things more time based on our historic
usage, but if your check fails in the mean time and

1) the total run time is close to 50 minutes

2) the jenkins job when you click through shows status "aborted"

then please resubmit your patch.

-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: H9 build slave is bad

2017-03-08 Thread Sean Busbey
Is this HADOOP-13951?

On Tue, Mar 7, 2017 at 8:32 PM, Andrew Wang  wrote:
> A little ping that H9 hit the same error again, and I'm again going to
> clean it out. One more time and I'll ask infra about either removing or
> reimaging this node.
>
> On Mon, Mar 6, 2017 at 2:12 PM, Allen Wittenauer 
> wrote:
>
>>
>> > On Mar 6, 2017, at 1:57 PM, Andrew Wang 
>> wrote:
>> >
>> > I'll leave it there so it's ready for next time. If this keeps happening
>> on H9, then I'm going to ask infra to reimage it. FWIW I haven't seen this
>> on our internal unit test runs, so it points to an H9-specific issue.
>>
>> I’ve seen test data cause failures on quite a few nodes over the
>> past year or two.  I just usually fixed it without telling anyone since
>> there never seemed to be much interest in the problems.  That said, I’ve
>> mostly stopped babysitting the hadoop builds on the ASF infra.



-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: H9 build slave is bad

2017-03-09 Thread Sean Busbey
On Wed, Mar 8, 2017 at 2:04 PM, Allen Wittenauer
<a...@effectivemachines.com> wrote:
>
>> On Mar 8, 2017, at 9:34 AM, Sean Busbey <bus...@cloudera.com> wrote:
>>
>> Is this HADOOP-13951?
>
> Almost certainly.  Here's the run that broke it again:
>
> https://builds.apache.org/job/PreCommit-HDFS-Build/18591
>
> Likely something in the HDFS-7240 branch or with this patch that's 
> doing Bad Things (tm).
>
>

Okay, I'll try to reprioritize getting at least the first-pass "don't
let things break" part of the solution we discussed in place during my
rec time this weekend.

-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: H9 build slave is bad

2017-03-10 Thread Sean Busbey
All the precommit builds should now be doing the correct thing to make
sure we don't render nodes useless. They don't flag the problem yet,
and someone will still need to run the "cleanup" job on nodes that
broke before the Jenkins runs picked up the new configuration changes.

Probably best if we move to HADOOP-13951 if we need more discussion.

On Thu, Mar 9, 2017 at 5:16 PM, Allen Wittenauer
 wrote:
>
>> On Mar 9, 2017, at 2:15 PM, Andrew Wang  wrote:
>>
>> H9 is again eating our builds.
>>
>
> H0: https://builds.apache.org/job/PreCommit-HDFS-Build/18652/console
> H6: https://builds.apache.org/job/PreCommit-HDFS-Build/18646/console
>



-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2017-04-17 Thread Sean Busbey
disallowing force pushes to trunk was done back in:

* August 2014: INFRA-8195
* February 2016: INFRA-11136
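The recovery Jason describes below (cherry-picking the dropped commit by its hash, recorded in the commit email) can be demonstrated end-to-end in a throwaway repository; all names here are made up for the demo:

```shell
set -e
cd "$(mktemp -d)"
git init -q repo && cd repo
# Base commit so there is something to reset back to.
git -c user.email=a@b -c user.name=demo commit -q --allow-empty -m base
echo fix > MAPREDUCE-6673.txt
git add .
git -c user.email=a@b -c user.name=demo commit -q -m "MAPREDUCE-6673 demo"
LOST=$(git rev-parse HEAD)            # hash you'd recover from the commit email
git reset -q --hard HEAD~1            # simulate the force-push clobbering trunk
git -c user.email=a@b -c user.name=demo cherry-pick "$LOST" >/dev/null
cat MAPREDUCE-6673.txt                # the "lost" change is back
```

The point is that a clobbered commit is still reachable by hash (and via `git reflog` on any clone that had it), so as long as the hash was recorded somewhere, a cherry-pick restores it.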

On Mon, Apr 17, 2017 at 11:18 AM, Jason Lowe
 wrote:
> I found at least one commit that was dropped, MAPREDUCE-6673.  I was able to 
> cherry-pick the original commit hash since it was recorded in the commit 
> email.
> This begs the question of why we're allowing force pushes to trunk.  I 
> thought we asked to have that disabled the last time trunk was accidentally 
> clobbered?
> Jason
>
>
> On Monday, April 17, 2017 10:18 AM, Arun Suresh  
> wrote:
>
>
>  Hi
>
> I had the Apr-14 eve version of trunk on my local machine. I've pushed that.
> Don't know if anything was committed over the weekend though.
>
> Cheers
> -Arun
>
> On Mon, Apr 17, 2017 at 7:17 AM, Anu Engineer 
> wrote:
>
>> Hi Allen,
>>
>> https://issues.apache.org/jira/browse/INFRA-13902
>>
>> That happened with ozone branch too. It was an inadvertent force push.
>> Infra has advised us to force push the latest branch if you have it.
>>
>> Thanks
>> Anu
>>
>>
>> On 4/17/17, 7:10 AM, "Allen Wittenauer"  wrote:
>>
>> >Looks like someone reset HEAD back to Mar 31.
>> >
>> >Sent from my iPad
>> >
>> >> On Apr 16, 2017, at 12:08 AM, Apache Jenkins Server <
>> jenk...@builds.apache.org> wrote:
>> >>
>> >> For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/378/
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> -1 overall
>> >>
>> >>
>> >> The following subsystems voted -1:
>> >>docker
>> >>
>> >>
>> >> Powered by Apache Yetus 0.5.0-SNAPSHOT  http://yetus.apache.org
>> >>
>> >>
>> >>
>> >> -
>> >> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>> >> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>> >
>> >
>> >-
>> >To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
>> >For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>> >
>> >
>>
>>
>
>
>



-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: zstd compression

2017-07-17 Thread Sean Busbey
I know that the HBase community is also looking at what to do about
our inclusion of zstd. We've had it in releases since late 2016. My
plan was to request that they relicense it.

Perhaps the Hadoop PMC could join HBase in the request?

On Sun, Jul 16, 2017 at 8:11 PM, Allen Wittenauer
 wrote:
>
> It looks like HADOOP-13578 added Facebook's zstd compression codec.  
> Unfortunately, that codec is using the same 3-clause BSD (LICENSE file) + 
> patent grant license (PATENTS file) that React is using and RocksDB was using.
>
> Should that code get reverted?
>
>
>
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>



-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: Pre-Commit build is failing

2017-07-25 Thread Sean Busbey
-dev@yetus to bcc, since I think this is a Hadoop issue and not a yetus
issue.

Please review/commit HADOOP-14686 (which I am providing as a
volunteer/contributor on the Hadoop project).

On Tue, Jul 25, 2017 at 7:54 PM, Allen Wittenauer 
wrote:

>
> Again: just grab the .gitignore file from trunk and update it in
> branch-2.7. It hasn't been touched (outside of one patch) in years.  The
> existing jobs should then work.
>
> The rest of this stuff, yes, I know and yes it's intentional.  The
> directory structure was inherited from the original jobs that Nigel set up
> with the old version of test-patch.  Maybe some day I'll fix it.  But
> that's a project for a different day.  In order to fix it, it means taking
> down the patch testing for Hadoop while I work it out.  You'll notice that
> all of the other Yetus jobs for Hadoop have a much different layout.
>
>
>
>
> > On Jul 25, 2017, at 7:24 PM, suraj acharya  wrote:
> >
> > Hi,
> >
> > Seems like the issue was incorrect/unclean checkout.
> > I made a few changes[1] to the directories the checkout happens in, and
> > it is now running.
> > Of course, this build[2] will take some time to run, but at the moment,
> it is running maven install.
> >
> > I am not sure who sets up / manages the Jenkins job for HDFS and don't want
> to change that, but I will keep the dummy job around for a couple of days
> in case anyone wants to see.
> > Also, I see that y'all were using the master branch of Yetus. If there
> is no patch there that you need, then I would recommend
> using the latest stable release, version 0.5.0.
> >
> > If you have more questions, feel free to ping dev@yetus.
> > Hope this helps.
> >
> > [1]: https://builds.apache.org/job/PreCommit-HDFS-Build-Suraj-Copy/configure
> > [2]: https://builds.apache.org/job/PreCommit-HDFS-Build-Suraj-Copy/12/console
> >
> > -Suraj Acharya
> >
> > On Tue, Jul 25, 2017 at 6:57 PM, suraj acharya 
> wrote:
> > For anyone looking. I created another job here. [1].
> > Set it with debug to see the issue.
> > The error is being seen here[2].
> > From the looks of it, it looks like, the way the checkout is happening
> is not very clean.
> > I will continue to look at it, but in case anyone wants to jump in.
> >
> > [1] : https://builds.apache.org/job/PreCommit-HDFS-Build-Suraj-Copy/
> > [2] : https://builds.apache.org/job/PreCommit-HDFS-Build-Suraj-Copy/11/console
> >
> > -Suraj Acharya
> >
> > On Tue, Jul 25, 2017 at 6:28 PM, Konstantin Shvachko <
> shv.had...@gmail.com> wrote:
> > Hi Yetus developers,
> >
> > We cannot build Hadoop branch-2.7 anymore. Here is a recent example of a
> > failed build:
> > https://builds.apache.org/job/PreCommit-HDFS-Build/20409/console
> >
> > It seems the build is failing because Yetus cannot apply the patch from
> the
> > jira.
> >
> > ERROR: HDFS-11896 does not apply to branch-2.7.
> >
> > As far as I understand this is a Yetus problem. Probably in 0.3.0.
> > I can apply this patch successfully, but the Yetus test-patch.sh script
> > clearly failed to apply it. Cannot say why, because Yetus does not report it.
> > I also ran Hadoop's test-patch.sh script locally and it passed
> successfully
> > on branch-2.7.
> >
> > Could anybody please take a look and help fix the build?
> > This would be very helpful for the release (2.7.4) process.
> >
> > Thanks,
> > --Konst
> >
> > On Mon, Jul 24, 2017 at 10:41 PM, Konstantin Shvachko <
> shv.had...@gmail.com>
> > wrote:
> >
> > > Or should we backport the entire HADOOP-11917?
> > >
> > > Thanks,
> > > --Konst
> > >
> > > On Mon, Jul 24, 2017 at 6:56 PM, Konstantin Shvachko <
> shv.had...@gmail.com
> > > > wrote:
> > >
> > >> Allen,
> > >>
> > >> Should we add "patchprocess/" to .gitignore, is that the problem for
> 2.7?
> > >>
> > >> Thanks,
> > >> --Konstantin
> > >>
> > >> On Fri, Jul 21, 2017 at 6:24 PM, Konstantin Shvachko <
> > >> shv.had...@gmail.com> wrote:
> > >>
> > >>> What stuff? Is there a jira?
> > >>> It did work like a week ago. Is it a new Yetus requirement.
> > >>> Anyways I can commit a change to fix the build on our side.
> > >>> Just need to know what is missing.
> > >>>
> > >>> Thanks,
> > >>> --Konst
> > >>>
> > >>> On Fri, Jul 21, 2017 at 5:50 PM, Allen Wittenauer <
> > >>> a...@effectivemachines.com> wrote:
> > >>>
> > 
> >  > On Jul 21, 2017, at 5:46 PM, Konstantin Shvachko <
> >  shv.had...@gmail.com> wrote:
> >  >
> >  > + d...@yetus.apache.org
> >  >
> >  > Guys, could you please take a look. Seems like Yetus problem with
> >  > pre-commit build for branch-2.7.
> > 
> > 
> >  branch-2.7 is missing stuff in .gitignore.
> > >>>
> > >>>
> > >>>
> > >>
> > >
> >
> >
>
>
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional 

Re: zstd compression

2017-07-25 Thread Sean Busbey
Nope. Once I found out HBase's use was compliant as an optional runtime
dependency I stopped looking.

On Jul 24, 2017 7:22 PM, "Andrew Wang" <andrew.w...@cloudera.com> wrote:

> I think it'd still be worth asking FB to relicense zstandard. Being able
> to bundle it in the release would make it easier to use, since I doubt
> there are zstandard packages in the default OS repos.
>
> Sean, have you already filed an issue with zstandard?
>
> On Mon, Jul 17, 2017 at 1:30 PM, Jason Lowe <jl...@yahoo-inc.com.invalid>
> wrote:
>
>> I think we are OK to leave support for the zstd codec in the Hadoop code
>> base.  I asked Chris Mattman for clarification, noting that the support for
>> the zstd codec requires the user to install the zstd headers and libraries
>> and then configure it to be included in the native Hadoop build.  The
>> Hadoop releases are not shipping any zstd code (e.g.: headers or libraries)
>> nor does it require zstd as a mandatory dependency.  Here's what he said:
>>
>>
>> On Monday, July 17, 2017 11:07 AM, Chris Mattmann <mattm...@apache.org>
>> wrote:
>>
>> > Hi Jason,
>> >
>> > This sounds like an optional dependency on a Cat-X software. This isn’t
>> the only type of compression
>> > that is allowed within Hadoop, correct? If it is truly optional and you
>> have gone to that level of detail
>> > below to make the user opt in, and if we are not shipping zstd with our
>> products (source code releases),
>> > then this is an acceptable usage.
>> >
>> > Cheers,
>> > Chris
>>
>>
>> So I think we are in the clear with respect to zstd usage as long as we
>> keep it as an optional codec where the user needs to get the headers and
>> libraries for zstd and configure it into the native Hadoop build.
>>
>> Jason
>>
>> On Monday, July 17, 2017 9:44 AM, Sean Busbey <bus...@cloudera.com>
>> wrote:
>>
>>
>>
>> I know that the HBase community is also looking at what to do about
>>
>> our inclusion of zstd. We've had it in releases since late 2016. My
>>
>> plan was to request that they relicense it.
>>
>>
>> Perhaps the Hadoop PMC could join HBase in the request?
>>
>>
>> On Sun, Jul 16, 2017 at 8:11 PM, Allen Wittenauer
>>
>> <a...@effectivemachines.com> wrote:
>>
>> >
>>
>> > It looks like HADOOP-13578 added Facebook's zstd compression
>> codec.  Unfortunately, that codec is using the same 3-clause BSD (LICENSE
>> file) + patent grant license (PATENTS file) that React is using and RocksDB
>> was using.
>>
>> >
>>
>> > Should that code get reverted?
>>
>> >
>>
>> >
>>
>> >
>>
>> > -
>>
>> > To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>>
>> > For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>>
>> >
>>
>>
>>
>>
>> --
>>
>> busbey
>>
>>
>> -
>>
>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>>
>> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>>
>> -
>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>>
>>
>


[DISCUSS] Can we make our precommit test robust to dependency changes while staying usable?

2017-09-14 Thread Sean Busbey
Moving discussion here from HADOOP-14654.

Short synopsis:

* HADOOP-14654 updated commons-httpclient to a new patch release in
hadoop-project
* Precommit checked the modules that changed (i.e. not many)
* nightly had Azure support break due to a change in behavior.

Is this just the cost of our approach to precommit vs post commit testing?

One approach: run dependency:list for each module, and for the modules whose
list changes with the patch applied, run their tests.

This will cause a slew of tests to run when dependencies change. For the
change in HADOOP-14654 probably we'd just have to run at the top level.
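A rough sketch of that check, with the `mvn dependency:list` step shown only as a comment and two canned dependency lists standing in for the per-module before/after snapshots:

```shell
cd "$(mktemp -d)"
# Per module, before and after applying the patch, you'd snapshot the
# resolved dependencies with something like:
#   mvn -q dependency:list -DoutputFile=deps-before.txt -DappendOutput=false
# Canned stand-ins for those two snapshots (note the httpclient bump):
printf 'commons-httpclient:commons-httpclient:jar:3.1\ncom.google.guava:guava:jar:11.0.2\n' > deps-before.txt
printf 'commons-httpclient:commons-httpclient:jar:3.1.1\ncom.google.guava:guava:jar:11.0.2\n' > deps-after.txt

# Any module whose list changed gets its full test suite queued.
if diff -q deps-before.txt deps-after.txt >/dev/null; then
  echo "no dependency change: module-scoped precommit is enough"
else
  echo "dependencies changed: run this module's full test suite"
fi
```

For a hadoop-project-level bump like HADOOP-14654, essentially every module's list changes, which is how this approach degenerates into the top-level run mentioned above.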

Steve L and I had some more details about things we could do on the ticket
if folks are interested.


-- 
busbey


Re: Trunk fails

2017-09-20 Thread Sean Busbey
On Wed, Sep 20, 2017 at 5:12 AM, Steve Loughran 
wrote:

>
> >
>
> What we could do is have a patch submission process which says "if you are
> playing with packaging, you must declare at the time of patch submission
> that you have run a full mvn clean install". And a commit process which
> says "if you commit a patch which changes the packaging, you need to do a
> build before a test"
>
> This is a variant of what we expect for the hadoop-aws and hadoop-azure
> clients where the submitter has to state the endpoint they ran the
> integration test suite against. Committer is expected to rerun the test
> suite locally before the commit too, for safety
>
> And we should all be trying 'mvn package -Pdist,native" regularly too, and
> playing with the new scripts. We need to find the issues before anyone else
>
> For all this to work, of course, we need reproducible builds. I see my
> mornings build is asking for "json-smart-2.3-SNAPSHOT.pom" as well as the
> doxia stuff. Why is so much -SNAPSHOT stuff getting in? I don't even see a
> ref for json-smart in our POMs



For stability of packaging changes, we could also ask committers to include
in relevant commits a piece of commit metadata in the message that we issue
from a Jenkins job, i.e. give the job a patch and run the full nightly QBT
against the tree with the patch in place.

-- 
busbey


Re: [DISCUSS] Can we make our precommit test robust to dependency changes while staying usable?

2017-09-14 Thread Sean Busbey


On 2017-09-14 15:36, Chris Douglas  wrote: 
> This has gotten bad enough that people are dismissing legitimate test
> failures among the noise.
> 
> On Thu, Sep 14, 2017 at 1:20 PM, Allen Wittenauer
>  wrote:
> > Someone should probably invest some time into integrating the HBase 
> > flaky test code a) into Yetus and then b) into Hadoop.
> 
> What does the HBase flaky test code do? Another extension to
> test-patch could run all new/modified tests multiple times, and report
> to JIRA if any run fails.
> 

The current HBase stuff segregates untrusted tests by looking through nightly 
test runs to find things that fail intermittently. We then don't include those 
tests in either nightly or precommit runs. We have a different job that just 
runs the untrusted tests and, if they start passing, removes them from the list.

There's also a project used by Solr called "BeastIT" that runs parallel 
copies of a given test a large number of times to reveal flaky 
tests.

Getting either/both of those into Yetus and used here would be a huge 
improvement.
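Whatever the tooling, the core loop in both approaches is just repeated execution plus classification. A minimal sketch, where the fake test that fails only on run 3 is a stand-in for a real invocation such as `mvn test -Dtest=TestFoo`:

```shell
RUNS=5
FAILS=0
for i in $(seq "$RUNS"); do
  # Stand-in for running the real test once; this one fails only on
  # run 3 to mimic an intermittent failure.
  if [ "$i" -eq 3 ]; then
    FAILS=$((FAILS + 1))
  fi
done

# Classify: some-but-not-all failures is the "untrusted" bucket.
if [ "$FAILS" -gt 0 ] && [ "$FAILS" -lt "$RUNS" ]; then
  echo "intermittent: quarantine as untrusted"
elif [ "$FAILS" -eq 0 ]; then
  echo "stable: keep in normal precommit/nightly runs"
else
  echo "consistently failing: likely a real breakage"
fi
```

The HBase flow described above additionally re-runs the quarantined bucket on its own schedule and promotes tests back once they stop failing.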

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [DISCUSS] Can we make our precommit test robust to dependency changes while staying usable?

2017-09-14 Thread Sean Busbey
> Committers MUST check the qbt output after a commit.  They MUST make sure
their commit didn’t break something new.

How do we make this easier / more likely to happen?

For example, I don't see any notice on HADOOP-14654 that the qbt
post-commit failed. Is this a timing thing? Did Steve L just notice the
break before we could finish the 10 hours it takes to get qbt done?

How solid would qbt have to be for us to do something drastic like
auto-revert changes after a failure?


On Thu, Sep 14, 2017 at 11:05 AM, Allen Wittenauer <a...@effectivemachines.com
> wrote:

>
> > On Sep 14, 2017, at 8:03 AM, Sean Busbey <bus...@cloudera.com> wrote:
> >
> > * HADOOP-14654 updated commons-httpclient to a new patch release in
> > hadoop-project
> > * Precommit checked the modules that changed (i.e. not many)
> > * nightly had Azure support break due to a change in behavior.
>
> OK, so it worked as coded/designed.
>
> > Is this just the cost of our approach to precommit vs post commit
> testing?
>
> Yes.  It’s a classic speed vs. size computing problem.
>
> test-patch: quick but only runs a subset of tests
> qbt: comprehensive but takes a very long time
>
> Committers MUST check the qbt output after a commit.  They MUST
> make sure their commit didn’t break something new.
>
> > One approach: do a dependency:list of each module and for those that
> show a
> > change with the patch we run tests there.
>
> As soon as you change something like junit, you’re running over
> everything …
>
> Plus, let’s get real: there is a large contingent of committers
> that barely take the time to read or even comprehend the current Yetus
> output.  Adding *more* output is the last thing we want to do.
>
> > This will cause a slew of tests to run when dependencies change. For the
> > change in HADOOP-14654 probably we'd just have to run at the top level.
>
> … e.g., exactly what qbt does for 10+ hours every night.
>
> It’s important to also recognize that we need to be “good
> citizens” in the ASF. If we can do dependency checking in one 10 hour
> streak vs. several, that reduces the load on the ASF build infrastructure.
>
>
>


-- 
busbey


Re: [DISCUSS] Can we make our precommit test robust to dependency changes while staying usable?

2017-09-14 Thread Sean Busbey
On Thu, Sep 14, 2017 at 4:23 PM, Andrew Wang 
wrote:

>
> >
> > I discussed this on yetus-dev a while back and Allen thought it'd be
> non-trivial:
>
> https://lists.apache.org/thread.html/552ad614d1b3d5226a656b60c0108457bcaa1219fb9ad985f8750ba1@%3Cdev.yetus.apache.org%3E
>
> I unfortunately don't have the test-patch.sh expertise to dig into this.
>
>
>
Hurm. Getting something generic certainly sounds like a lot of work, but
getting something that works specifically with Maven maybe not. Lemme see
if I can describe what I think the pieces look like in a jira.


[DISCUSS] moving to Apache Yetus Audience Annotations

2017-09-22 Thread Sean Busbey
When Apache Yetus formed, it started with several key pieces of Hadoop that
looked reusable. In addition to our contribution testing infra, the project
also stood up a version of our audience annotations for delineating the
public facing API[1].

I recently got the Apache HBase community onto the Yetus version of those
annotations rather than their internal fork of the Hadoop ones[2]. It
wasn't pretty, mostly a lot of blind sed followed by spot checking and
reliance on automated tests.
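The mechanical part of such a migration can be sketched like this; the package rename (org.apache.hadoop.classification to org.apache.yetus.audience) matches what HBASE-17823 did, and the single demo file stands in for a whole source tree:

```shell
cd "$(mktemp -d)"
# One demo source file using the Hadoop-internal annotation package.
cat > Foo.java <<'EOF'
import org.apache.hadoop.classification.InterfaceAudience;
@InterfaceAudience.Public
public class Foo {}
EOF

# The "blind sed": rewrite every source file in place.
find . -name '*.java' -exec \
  sed -i 's/org\.apache\.hadoop\.classification/org.apache.yetus.audience/g' {} +

grep import Foo.java
```

After the sweep, the spot checking and automated test runs are what catch the places a blind rewrite can't, such as string literals or reflection that mention the old package name.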

What do folks think about making the jump ourselves? I'd be happy to work
through things, either as one unreviewable monster or per-module
transitions (though a piece-meal approach might complicate our javadoc
situation).


[1]: http://yetus.apache.org/documentation/0.5.0/interface-classification/
[2]: https://issues.apache.org/jira/browse/HBASE-17823

-- 
busbey


Re: [DISCUSS] moving to Apache Yetus Audience Annotations

2017-09-22 Thread Sean Busbey
I'd refer to it as an incompatible change; we expressly label the
annotations as IA.Public.

If you think it's too late to get in for 3.0, I can make a jira and put it
on the back burner for when trunk goes to 4.0?

On Fri, Sep 22, 2017 at 12:49 PM, Andrew Wang <andrew.w...@cloudera.com>
wrote:

> Is this itself an incompatible change? I imagine the bytecode will be
> different.
>
> I think we're too late to do this for beta1 given that I want to cut an
> RC0 today.
>
> On Fri, Sep 22, 2017 at 7:03 AM, Sean Busbey <bus...@cloudera.com> wrote:
>
>> When Apache Yetus formed, it started with several key pieces of Hadoop
>> that
>> looked reusable. In addition to our contribution testing infra, the
>> project
>> also stood up a version of our audience annotations for delineating the
>> public facing API[1].
>>
>> I recently got the Apache HBase community onto the Yetus version of those
>> annotations rather than their internal fork of the Hadoop ones[2]. It
>> wasn't pretty, mostly a lot of blind sed followed by spot checking and
>> reliance on automated tests.
>>
>> What do folks think about making the jump ourselves? I'd be happy to work
>> through things, either as one unreviewable monster or per-module
>> transitions (though a piece-meal approach might complicate our javadoc
>> situation).
>>
>>
>> [1]: http://yetus.apache.org/documentation/0.5.0/interface-classification/
>> [2]: https://issues.apache.org/jira/browse/HBASE-17823
>>
>> --
>> busbey
>>
>
>


-- 
busbey


Re: Moving Java Forward Faster

2017-09-07 Thread Sean Busbey
ugh. this will be rough for cross-jdk compatibility, unless they update the
target jre options of javac to support more than the last 2 major versions.

> Question: Does GPL licensing of the JDK/JVM affect us negatively?

Nope. all the openjdk bits we rely on were already going to be under the
GPLv2 with CE, since the alternative is the Oracle Binary Code License[1],
which is also in Cat-X[2] but for not being an Open Source license. In any
case things built for Java are covered under the "platform" exception to
the Cat-X designation[3], since depending Java is considered unavoidable
for a java project.



[1]: http://www.jcp.org/aboutJava/communityprocess/licenses/SE7_RIv2.doc
[2]: http://apache.org/legal/resolved#category-x as "BCL"
[3]: http://apache.org/legal/resolved#platform

On Thu, Sep 7, 2017 at 9:29 AM, larry mccay  wrote:

> Interesting.
> Thanks for sharing this, Allen.
>
> Question: Does GPL licensing of the JDK/JVM affect us negatively?
>
>
> On Thu, Sep 7, 2017 at 10:14 AM, Allen Wittenauer <
> a...@effectivemachines.com>
> wrote:
>
> >
> >
> > > Begin forwarded message:
> > >
> > > From: "Rory O'Donnell" 
> > > Subject: Moving Java Forward Faster
> > > Date: September 7, 2017 at 2:12:45 AM PDT
> > > To: "strub...@yahoo.de >> Mark Struberg" 
> > > Cc: rory.odonn...@oracle.com, abdul.kolarku...@oracle.com,
> > balchandra.vai...@oracle.com, dalibor.to...@oracle.com,
> bui...@apache.org
> > > Reply-To: bui...@apache.org
> > >
> > > Hi Mark & Gavin,
> > >
> > > Oracle is proposing a rapid release model for Java SE going-forward.
> > >
> > > The high points are highlighted below, details of the changes can be
> > found on Mark Reinhold’s blog [1] , OpenJDK discussion email list [2].
> > >
> > > Under the proposed release model, after JDK 9, we will adopt a strict,
> > time-based model with a new major release every six months, update
> releases
> > every quarter, and a long-term support release every three years.
> > >
> > > The new JDK Project will run a bit differently than the past "JDK $N"
> > Projects:
> > >
> > > - The main development line will always be open but fixes,
> enhancements,
> > and features will be merged only when they're nearly finished. The main
> > line will be Feature Complete [3] at all times.
> > >
> > > - We'll continue to use the JEP Process [4] for new features and other
> > significant changes. The bar to target a JEP to a specific release will,
> > however, be higher since the work must be Feature Complete in order to go
> > in. Owners of large or risky features will be strongly encouraged to
> split
> > such features up into smaller and safer parts, to integrate earlier in
> the
> > release cycle, and to publish separate lines of early-access builds prior
> > to integration.
> > >
> > > The JDK Updates Project will run in much the same way as the past "JDK
> > $N" Updates Projects, though update releases will be strictly limited to
> > fixes of security issues, regressions, and bugs in newer features.
> > >
> > > Related to this proposal, we intend to make a few changes in what we
> do:
> > >
> > > - Starting with JDK 9 we'll ship OpenJDK builds under the GPL [5], to
> > make it easier for developers to deploy Java applications to cloud
> > environments. We'll initially publish OpenJDK builds for Linux/x64,
> > followed later by builds for macOS/x64 and Windows/x64.
> > >
> > > - We'll continue to ship proprietary "Oracle JDK" builds, which include
> > "commercial features" [6] such as Java Flight Recorder and Mission
> Control
> > [7], under a click-through binary-code license [8]. Oracle will continue
> to
> > offer paid support for these builds.
> > >
> > > - After JDK 9 we'll open-source the commercial features in order to
> make
> > the OpenJDK builds more attractive to developers and to reduce the
> > differences between those builds and the Oracle JDK. This will take some
> > time, but the ultimate goal is to make OpenJDK and Oracle JDK builds
> > completely interchangeable.
> > >
> > > - Finally, for the long term we'll work with other OpenJDK contributors
> > to establish an open build-and-test infrastructure. This will make it
> > easier to publish early-access builds for features in development, and
> > eventually make it possible for the OpenJDK Community itself to publish
> > authoritative builds of the JDK.
> > >
> > > Questions, comments, feedback to the OpenJDK discuss mailing list [2].
> > >
> > > Rgds, Rory
> > >
> > > [1]https://mreinhold.org/blog/forward-faster
> > > [2]http://mail.openjdk.java.net/pipermail/discuss/2017-
> > September/004281.html
> > > [3]http://openjdk.java.net/projects/jdk8/milestones#Feature_Complete
> > > [4]http://openjdk.java.net/jeps/0
> > > [5]http://openjdk.java.net/legal/gplv2+ce.html
> > > [6]http://www.oracle.com/technetwork/java/javase/terms/
> > products/index.html
> > > [7]http://www.oracle.com/technetwork/java/javaseproducts/mission-
> > 

Re: Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86

2017-10-24 Thread Sean Busbey
Just curious, Junping what would "solid evidence" look like? Is the
supposition here that the memory leak is within HDFS test code rather than
library runtime code? How would such a distinction be shown?

On Tue, Oct 24, 2017 at 4:06 PM, Junping Du  wrote:

> Allen,
>  Do we have any solid evidence to show that the HDFS unit tests going
> through the roof are due to a serious memory leak in HDFS? Normally, I don't
> expect memory leaks to be identified in our UTs - mostly, it (the test JVM
> dying) is just because of test or deployment issues.
>  Unless there is concrete evidence, my concern about a serious memory
> leak in HDFS on 2.8 is relatively low, given that some companies (Yahoo,
> Alibaba, etc.) have deployed 2.8 in large production environments for
> months. Non-serious memory leaks (like forgetting to close a stream on a
> non-critical path, etc.) and other non-critical bugs always happen here
> and there, and we have to live with them.
>
> Thanks,
>
> Junping
>
> 
> From: Allen Wittenauer 
> Sent: Tuesday, October 24, 2017 8:27 AM
> To: Hadoop Common
> Cc: Hdfs-dev; mapreduce-...@hadoop.apache.org; yarn-...@hadoop.apache.org
> Subject: Re: Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86
>
> > On Oct 23, 2017, at 12:50 PM, Allen Wittenauer 
> wrote:
> >
> >
> >
> > With no other information or access to go on, my current hunch is that
> one of the HDFS unit tests is ballooning in memory size.  The easiest way
> to kill a Linux machine is to eat all of the RAM, thanks to overcommit, and
> that’s what this “feels” like.
> >
> > Someone should verify if 2.8.2 has the same issues before a release goes
> out …
>
>
> FWIW, I ran 2.8.2 last night and it has the same problems.
>
> Also: the node didn’t die!  Looking through the workspace (so the
> next run will destroy them), two sets of logs stand out:
>
> https://builds.apache.org/job/hadoop-qbt-branch2-java7-
> linux-x86/ws/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
>
> and
>
> https://builds.apache.org/job/hadoop-qbt-branch2-java7-
> linux-x86/ws/sourcedir/hadoop-hdfs-project/hadoop-hdfs/
>
> It looks like my hunch is correct:  RAM usage in the HDFS unit tests is
> going through the roof.  It’s also interesting how MANY log files there
> are.  Is surefire not picking up that jobs are dying?  Maybe not if memory
> is getting tight.
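On the surefire question: one knob worth checking while chasing this is the heap each forked test JVM gets. A generic maven-surefire-plugin fragment looks like the following (the values are illustrative, not Hadoop's actual pom settings):

```xml
<!-- Cap each forked test JVM's heap so a leaking test fails fast with an
     OOM and a heap dump, instead of eating the node's RAM. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <forkCount>1</forkCount>
    <reuseForks>false</reuseForks>
    <argLine>-Xmx2g -XX:+HeapDumpOnOutOfMemoryError</argLine>
  </configuration>
</plugin>
```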
>
> Anyway, at this point, branch-2.8 and higher are probably fubar’d.
> Additionally, I’ve filed YETUS-561 so that Yetus-controlled Docker
> containers can have their RAM limits set in order to prevent more nodes
> going catatonic.
>
>
>
> -
> To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
>
>
>
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>
>


-- 
busbey


Re: [DISCUSS] Branch Proposal: HADOOP 15407: ABFS

2018-05-15 Thread Sean Busbey
apologies, copying back in common-dev@ with my question about the code.

On Tue, May 15, 2018 at 2:36 PM, Sean Busbey <bus...@cloudera.com> wrote:

> >  Internal constraints prevented this feature from being developed in
> Apache, so we want to ensure that all the code is discussed, maintainable,
> and documented by the community before it merges.
>
> Has this code gone through ASF IP Clearance already?
>
> On Tue, May 15, 2018 at 10:34 AM, Steve Loughran <ste...@hortonworks.com>
> wrote:
>
>> Hi
>>
>> Chris Douglas and I have a proposal for a short-lived feature branch
>> for the Azure ABFS connector to go into the hadoop-azure package. This will
>> connect to the new Azure storage service, which will ultimately replace the
>> one used by WASB. It's a big patch and, like all storage connectors, will
>> inevitably take time to stabilize (i.e. nobody ever gets seek() right, even
>> when we think we have).
>>
>> Thomas & Esfandiar will do the coding: they've already done the
>> paperwork. Chris, myself & anyone else interested can be involved in the
>> review and testing.
>>
>> Comments?
>>
>> -
>>
>> The initial HADOOP-15407 patch contains a new filesystem client for the
>> forthcoming Azure ABFS, which is intended to replace Azure WASB as the
>> Azure storage layer. The patch is large, as it contains the replacement
>> client, tests, and generated code.
>>
>> We propose a feature branch, so the module can be broken into salient,
>> reviewable chunks. Internal constraints prevented this feature from being
>> developed in Apache, so we want to ensure that all the code is discussed,
>> maintainable, and documented by the community before it merges.
>>
>> To effect this, we also propose adding two developers as branch
>> committers: Thomas Marquardt tm...@microsoft.com<mailto:tmar...@microsoft.com>
>> and Esfandiar Manii esma...@microsoft.com<mailto:esma...@microsoft.com>
>>
>> Beyond normal feature branch activity and merge criteria for FS modules,
>> we want to add another merge criterion for ABFS. Some of the client APIs
>> are not GA. It seems reasonable to require that this client works with
>> public endpoints before it merges to trunk.
>>
>> To test the Blob FS driver, the Blob FS team (including Esfandiar Manii and
>> Thomas Marquardt) in Azure Storage will need the MSDN subscription ID(s)
>> for all reviewers who want to run the tests. The ABFS team will then
>> whitelist the subscription ID(s) for the Blob FS Preview. At that time,
>> future storage accounts created will have the Blob FS endpoint,
>> .dfs.core.windows.net<http://dfs.core.windows.net>, which
>> the Blob FS driver relies on.
>>
>> This is a temporary state during the (current) Private Preview and the
>> early phases of Public Preview. In a few months, the whitelisting will not
>> be required and anyone will be able to create a storage account with access
>> to the Blob FS endpoint.
>>
>> Thomas and Esfandiar have been active in the Hadoop project working on
>> the WASB connector (see https://issues.apache.org/jira
>> /browse/HADOOP-14552). They understand the processes and requirements of
>> the software. Working on the branch directly will let them bring this
>> significant feature into the hadoop-azure module without disrupting
>> existing users.
>>
>
>
>
> --
> busbey
>



-- 
busbey


Re: yetus build rejecting shaded artifacts ....

2018-06-15 Thread Sean Busbey
yep!

I'll walk through how to find it; skip to "tl;dr:" if you just want the answer.

Start with the "Console output" line in the footer of the QABot post

Console output
https://builds.apache.org/job/PreCommit-HADOOP-Build/14777/console

Search the output for "Checking client artifacts". There'll be two
instances, one for the branch and one for the patch. The QABot said
there was a problem with both.

The branch stanza is:



Checking client artifacts on branch




cd /testptch/hadoop
/usr/bin/mvn --batch-mode
-Dmaven.repo.local=/home/jenkins/yetus-m2/hadoop-HADOOP-15407-patch-1
verify -fae --batch-mode -am -pl
hadoop-client-modules/hadoop-client-check-invariants -pl
hadoop-client-modules/hadoop-client-check-test-invariants -pl
hadoop-client-modules/hadoop-client-integration-tests
-Dtest=NoUnitTests -Dmaven.javadoc.skip=true -Dcheckstyle.skip=true
-Dfindbugs.skip=true > /testptch/patchprocess/branch-shadedclient.txt
2>&1

And the patch stanza is:



Checking client artifacts on patch




cd /testptch/hadoop
/usr/bin/mvn --batch-mode
-Dmaven.repo.local=/home/jenkins/yetus-m2/hadoop-HADOOP-15407-patch-1
verify -fae --batch-mode -am -pl
hadoop-client-modules/hadoop-client-check-invariants -pl
hadoop-client-modules/hadoop-client-check-test-invariants -pl
hadoop-client-modules/hadoop-client-integration-tests
-Dtest=NoUnitTests -Dmaven.javadoc.skip=true -Dcheckstyle.skip=true
-Dfindbugs.skip=true > /testptch/patchprocess/patch-shadedclient.txt
2>&1


So the two files we want are "branch-shadedclient.txt" and
"patch-shadedclient.txt"

Next we go from the console output of the build to browse the build artifacts.

Click on the build number in the breadcrumbs at the top of the page,
in this case "#14777"

https://builds.apache.org/job/PreCommit-HADOOP-Build/14777/

click on the "Build Artifacts" link in the center of that page

https://builds.apache.org/job/PreCommit-HADOOP-Build/14777/artifact/

Then follow "out" to get the contents of the patchprocess directory:

https://builds.apache.org/job/PreCommit-HADOOP-Build/14777/artifact/out/

tl;dr:

In this directory listing we'll find both of the logs:

https://builds.apache.org/job/PreCommit-HADOOP-Build/14777/artifact/out/branch-shadedclient.txt
https://builds.apache.org/job/PreCommit-HADOOP-Build/14777/artifact/out/patch-shadedclient.txt
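For reference, the whole hunt collapses to a predictable URL scheme, so you can go straight from the build number (taken from this thread) to the logs:

```shell
# Build the console and artifact-log URLs straight from the build number,
# skipping the click-through; job name and build number are from this thread.
BUILD='https://builds.apache.org/job/PreCommit-HADOOP-Build/14777'
echo "${BUILD}/console"                               # full console output
echo "${BUILD}/artifact/out/branch-shadedclient.txt"  # branch-side mvn log
echo "${BUILD}/artifact/out/patch-shadedclient.txt"   # patch-side mvn log
```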

In both cases it looks like the failure is before we do the shaded client test:

...
[INFO] --- maven-antrun-plugin:1.7:run (common-test-bats-driver) @
hadoop-common ---
...
 [exec] ok 1 hadoop_stop_daemon_changing_pid
 [exec] not ok 2 hadoop_stop_daemon_force_kill
 [exec] # (in test file hadoop_stop_daemon.bats, line 43)
 [exec] #   `[ -f ${TMP}/pidfile ]' failed
 [exec] # bindir:
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/scripts
 [exec] # sh: 0: Can't open
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/scripts/process_with_sigterm_trap.sh

Unfortunately, I don't know what help I can provide on the specific
failure. I haven't really dug into the bats testing of shell scripts
yet.

If someone could file a JIRA about having the shaded client test post
a link to its log in the QABot footer and assign it to me I can take
care of making it easier to find this stuff.


On Fri, Jun 15, 2018 at 6:05 AM, Steve Loughran  wrote:
>
> There's a patch for https://issues.apache.org/jira/browse/HADOOP-15407 which 
> is being rejected due to unrelated tests (probably) and to a failure in the
> shading.
>
> Is there a way to get the output of that specific log?
>
> -steve



-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: site doc cleanup

2018-06-27 Thread Sean Busbey
IMHO dump the docs from the beta release as well. Anyone on an
alpha/beta release should move on to a GA release, and beta1 should
have been API-frozen relative to GA.

3.1.0 was labeled "not ready for production" in its release notes[1].
Seems that means 3.0.3 is the stable3 release?

Speaking with my HBase hat on I'd rather "current" from the sitemap
point at a version folks could reasonably expect HBase to run on top
of. Unfortunately, I think that would likely be 2.9.1 due to ongoing
issues[2].


[1]: 
https://lists.apache.org/thread.html/8313e605c0ed0012f134cce9cc6adca738eea81feccea99c8de87cd9@%3Cgeneral.hadoop.apache.org%3E
[2]: https://issues.apache.org/jira/browse/HBASE-20502
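Steve's proposed cleanup (quoted below) amounts to a couple of shell commands. Here is the same sequence rehearsed in a throwaway directory — directory names are from the du listing in the thread, but the scratch location is made up, so run the real thing from the svn site checkout:

```shell
# Rehearse the cleanup in a scratch directory before touching the svn site.
site=$(mktemp -d)
mkdir -p "$site"/r3.0.0 "$site"/r3.0.0-alpha1 "$site"/r3.0.0-beta1 "$site"/r3.1.0
rm -rf "$site"/r3.0.0-*            # drop all alpha/beta javadoc trees
ln -sfn r3.1.0 "$site"/stable3     # point stable3 at the 3.1 line
ls "$site"                         # r3.0.0  r3.1.0  stable3
```

Note that the `r3.0.0-*` glob catches the alphas and beta1 while leaving the 3.0.0 GA docs in place.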

On Wed, Jun 27, 2018 at 1:02 PM, Steve Loughran  wrote:
> I'm looking at our svn site, and there are a lot of javadocs there, including 
> those for all the 3.0.0-alphas
>
>  du -s -h r3*
> 438M r3.0.0
> 1.2G r3.0.0-alpha1
> 368M r3.0.0-alpha2
> 368M r3.0.0-alpha3
> 374M r3.0.0-alpha4
> 425M r3.0.0-beta1
> 441M r3.0.1
> 441M r3.0.2
> 447M r3.0.3
> 467M r3.1.0
>
>
> I propose: rm -rf docs/3.0.0-* to get rid of all the alpha releases, 
> consistent with the rest of the documentation set.
>
> I also intend to create a symlink stable3 -> r3.1.0 on the basis that, of the
> 3.x line, it's the stable one.
>
> What I'd also like to do is mark that 3.1 as the "current" version in the 
> sitemap, leaving 2.9.1 as the stable branch-2 release that the "stable" link 
> will still point to off there.
>
> Is everyone OK with this? Changes to the Forrest XML will only surface when 
> someone rebuilds the site; I think deleting the 3.0.0-alpha artifacts will 
> happen immediately.
>
> -steve
>



-- 
busbey

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: Do we still have nightly (or even weekly) unit test run for Hadoop projects?

2017-10-19 Thread Sean Busbey
Here's the email from last night to common-dev@hadoop:

https://s.apache.org/ARe1

On Wed, Oct 18, 2017 at 10:42 PM, Akira Ajisaka  wrote:

> Yes, qbt runs nightly and it sends e-mail to dev lists.
> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/
>
> Regards,
> Akira
>
>
> On 2017/10/19 7:54, Wangda Tan wrote:
>
>> Hi,
>>
>> Do we still have nightly (or even weekly) unit test run for Hadoop
>> projects? I couldn't find it on the Jenkins dashboard and I haven't seen
>> reports sent to dev lists for a while.
>>
>> Thanks,
>> Wangda
>>
>>
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>
>


-- 
busbey


Re: PreCommit-Admin fails to fetch issue list from jira

2018-01-16 Thread Sean Busbey
FYI Duo, I believe

https://issues.apache.org/jira/browse/YETUS-594

is also slated to fix this. it just needs a review.

On Tue, Jan 16, 2018 at 4:38 AM, Duo Zhang  wrote:
> Started from this build
>
> https://builds.apache.org/job/PreCommit-Admin/329113/
>
> Have dug a bit; it seems that JIRA now does not allow our query to be run
> by an unauthenticated user: you will get a 400 if you do not log in. This is the
> error page when accessing
>
> https://issues.apache.org/jira/sr/jira.issueviews:searchrequest-xml/12323182/SearchRequest-12323182.xml?tempMax=50
> HTTP Status 400 - A value with ID '12315621' does not exist for the field
> 'project'.
>
> *type* Status report
>
> *message* *A value with ID '12315621' does not exist for the field
> 'project'.*
>
> *description* *The request sent by the client was syntactically incorrect.*
> --
> Apache Tomcat/8.5.6
>
> The project '12315621 ' is Ranger. I do not have the permission to modify
> the original filter, and do not have the permission to create a public
> filter either. So for me there is no way to fix the problem.
>
> Could the Ranger project check if you have changed some permission configs?
> Or could someone who has the permission to create public filters creates a
> new filter without Ranger to see if it works? There are plenty of projects
> which rely on the PreCommit-Admin job.
>
> Thanks.

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [DISCUSS]: securing ASF Hadoop releases out of the box

2018-07-05 Thread Sean Busbey
I really, really like the approach of defaulting to only non-routeable
IPs allowed. it seems like a good tradeoff for complexity of
implementation, pain to reconfigure, and level of protection.
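The default whitelist Todd describes below is easy to picture in code. A minimal sketch using Python's stdlib `ipaddress` module — a hypothetical helper for illustration, not Kudu's actual implementation:

```python
import ipaddress

# The IANA non-routable ranges Todd lists below (private, loopback,
# and link-local blocks).
DEFAULT_WHITELIST = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "127.0.0.0/8", "10.0.0.0/8", "172.16.0.0/12",
        "192.168.0.0/16", "169.254.0.0/16",
    )
]

def is_whitelisted(addr, whitelist=DEFAULT_WHITELIST):
    """Return True if addr falls inside any whitelisted subnet."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in whitelist)

print(is_whitelisted("192.168.1.17"))  # True: typical LAN client
print(is_whitelisted("8.8.8.8"))       # False: routable address, rejected
```

Out of the box this admits LAN and localhost clients while refusing anything reachable from the open internet, which is exactly the tradeoff being praised here.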

On Thu, Jul 5, 2018 at 2:25 PM, Todd Lipcon  wrote:
> The approach we took in Apache Kudu is that, if Kerberos hasn't been
> enabled, we default to a whitelist of subnets. The default whitelist is
> 127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,169.254.0.0/16 which
> matches the IANA "non-routeable IP" subnet list.
>
> In other words, out-of-the-box, you get a deployment that works fine within
> a typical LAN environment, but won't allow some remote hacker to locate
> your cluster and access your data. We thought this was a nice balance
> between "works out of the box without lots of configuration" and "decent
> security". In my opinion a "localhost-only by default" would be be overly
> restrictive since I'd usually be deploying on some datacenter or EC2
> machine and then trying to access it from a client on my laptop.
>
> We released this first a bit over a year ago if my memory serves me, and
> we've had relatively few complaints or questions about it. We also made
> sure that the error message that comes back to clients is pretty
> reasonable, indicating the specific configuration that is disallowing
> access, so if people hit the issue on upgrade they had a clear idea what is
> going on.
>
> Of course it's not foolproof, since as Eric says, you're still likely open
> to the entirety of your corporation, and you may not want that, but as he
> also pointed out, that might be true even if you enable Kerberos
> authentication.
>
> -Todd
>
> On Thu, Jul 5, 2018 at 11:38 AM, Eric Yang  wrote:
>
>> Hadoop's default configuration aims for user friendliness to increase
>> adoption, with security features enabled one by one.  This approach is most
>> problematic for security because the system can be compromised before all
>> security features are turned on.
>> Larry's proposal will add some safety by reminding the system admin if
>> security is disabled.  However, reducing the number of security config knobs
>> is likely required for the banner idea to work without writing too much
>> guessing logic to determine whether the UI is secured.
>> Penetration test can provide better insights of what hasn't been secured to
>> improve the next release.  Thankfully most Hadoop vendors have done this
>> work periodically to help the community secure Hadoop.
>>
>> Plenty of companies advertise: if you want security, use Kerberos.  This
>> statement is not entirely true.  Kerberos makes security more difficult to
>> crack for external parties, but it shouldn't be the only method to secure
>> Hadoop.  When the Kerberos environment is larger than the Hadoop cluster,
>> anyone within the Kerberos environment can access the Hadoop cluster freely
>> without restriction.  In large-scale enterprises or for some cloud vendors
>> that sublet their resources, this might not be acceptable.
>>
>> From my point of view, a secure Hadoop release must default all settings
>> to localhost only and allow users to add more hosts through an authorized
>> white list of servers.  This will keep the security perimeter in check.  All
>> wild-card ACLs will need to be removed or defaulted to the current
>> user/current host only.  Proxy user/host ACL lists must be enforced on HTTP
>> channels.  This is basically realigning the default configuration to a
>> single-node cluster or firewalled configuration.
>>
>> Regards,
>> Eric
>>
>> On 7/5/18, 8:24 AM, "larry mccay"  wrote:
>>
>> Hi Steve -
>>
>> This is a long overdue DISCUSS thread!
>>
>> Perhaps the UIs can very visibly state (in red) "WARNING: UNSECURED UI
>> ACCESS - OPEN TO COMPROMISE" - maybe even force a click through the warning
>> to get to the page, like SSL exceptions in the browser do?
>> Similar tactic for UI access without SSL?
>> A new AuthenticationFilter can be added to the filter chains that blocks
>> API calls unless explicitly configured to be open, and obviously log a
>> similar message?
>>
>> thanks,
>>
>> --larry
>>
>>
>>
>>
>> On Wed, Jul 4, 2018 at 11:58 AM, Steve Loughran <
>> ste...@hortonworks.com>
>> wrote:
>>
>> > Bitcoins are profitable enough to justify writing malware to run on
>> Hadoop
>> > clusters & schedule mining jobs: there have been a couple of
>> incidents of
>> > this in the wild, generally going in through no security, well known
>> > passwords, open ports.
>> >
>> > Vendors of Hadoop-related products get to deal with their lockdown
>> > themselves, which they often do by installing kerberos from the
>> outset,
>> > making users make up their own password for admin accounts, etc.
>> >
>> > The ASF releases though: we just provide something insecure out the
>> box
>> > and some docs saying "use kerberos if you want security"
>> >
>> > What can we do here?
>> >
