Guys, this is a pretty long email with all the details I can think of on how Bigtop can help the stabilization efforts of Hadoop 2.x. A lot of this information is required background. I really, really encourage everyone who's thinking of contributing to this effort to read it through. Once again, I do apologize for its size.
Matt, Andrew, you both brought up very good points, so let me summarize a few things wrt. Bigtop. I'm also CCing the Bigtop dev ML so that everybody who's interested in pitching in can discuss the matter further over there.

On Wed, May 15, 2013 at 9:25 PM, Andrew Purtell <apurt...@apache.org> wrote:
> The other comment on this thread that suggests ASF governance structures
> being inadequate for negotiating changes in a large ecosystem might be on to
> something, but at the same time Apache BigTop may be an effective ASF-native
> answer to that.

That is my sincere hope as well.

Of course, Apache Bigtop is a project in its own right with its own release schedules, community of users, etc. What we are developing is not really an integration testsuite for Hadoop; it just so happens that without a stable Hadoop base we can't really deliver much. Hence we have a huge vested interest in having a predictable schedule for the stable releases of Hadoop. We also have all the interest in the world to help Hadoop achieve that.

At the same time we're a very small project juggling ~18 different open source components, trying to put them into a coherent distribution. I don't think it is realistic to expect us to be able to do all the work that ideally we would need to do in order to provide the most feedback for the Hadoop stabilization exercise.

Still, it would be really unfortunate if we all just gave up on this collective goal. Ideally we can all pitch in to the extent we believe in the need for a stable Hadoop 2.x code line out there. I'll elaborate on what exactly Bigtop can contribute a bit later, and I would expect all the folks who'd be willing to pitch in in a particular area to reach out to us either here or on the Bigtop ML.

On Wed, May 15, 2013 at 4:54 PM, Matt Foley <ma...@apache.org> wrote:
> Roman, what is your model for how test results from Bigtop should feed back
> into Hadoop-2 development?
> With the understanding that (a) software does have bugs, and (b) you're not
> going to get an SLA on community-sponsored software,
> what are your ideas for how to close the loop better?
>
> Would "CI" runs of Bigtop against branch-2 be feasible, as Arun suggests?
> How should we accomodate changes in individual components (Hadoop Core, but
> others as well) that may require changes in one or more other components?
> How does Bigtop keep doing a viable nightly build in that chaotic
> environment?
> Is this a previously solved problem?

All excellent questions! Here's my laundry list of what Bigtop can offer today:

#0 a publicly available continuous integration Jenkins instance that runs on EC2 (because of Cloudera's gracious support of our project) and ties the rest of the Bigtop infrastructure together:
    http://bigtop01.cloudera.org:8080/
The benefit of having this infrastructure in the open is pretty clear -- just like with builds.apache.org, if there are failures/etc. anybody who's interested can jump on it and start making progress.

#1 a continuous integration build of all the components comprising the 'current' trunk of Apache Bigtop, all the way up to producing easy to install packages for the following Linux platforms:
    http://bigtop01.cloudera.org:8080/view/Bigtop-trunk/job/Bigtop-trunk-Repository/
Basically the above link allows one to install nightly builds of the Apache Bigtop Hadoop distribution as easily as typing 'yum install hadoop-conf-pseudo'

#2 a potential for 'tracking' builds, all the way to packages, of each individual component:
    http://bigtop01.cloudera.org:8080/view/Upstream-tests/
Basically this allows one to easily install the base, fully tested distribution of Hadoop (let's say Bigtop 0.5.0), upgrade just one component and see how it fares. Currently these builds are ad hoc, but I'm trying to work with the respective upstream communities to figure out what branches of development they would be interested in testing that way.
This is one of the things that Arun and I talked about wrt. hooking up Bigtop Jenkins to branch-2 on a continuous basis. I wish I had time to do that, but I honestly don't. I might in a few weeks, but again, if anybody is willing to pitch in and help -- that'll be greatly appreciated.

#3 a collection of puppet recipes that allow one to deploy a packaged Bigtop distro (either from #1 or #2) on a fully distributed cluster.

#4 an existing collection of integration tests (~200) for all the components we've got in our stack:
    http://s.apache.org/UX8

#5 continuous integration Jenkins jobs that deploy our trunk builds on a nightly basis in 2 configurations (secure and unsecure) over 4 fully distributed nodes running as EC2 VMs:
    http://bigtop01.cloudera.org:8080/view/Deployment/

#6 continuous (although currently disabled) nightly runs of all the tests from #4 on the two clusters deployed as part of #5. E.g.:
    http://bigtop01.cloudera.org:8080/view/Bigtop-trunk/job/Bigtop-trunk-smokecluster/14/testReport/

At this point we barely have the resources to cover the minimum it takes to maintain #1-#6. Here are the areas where we have gaps in coverage (especially in the context of Hadoop 2.x):

* there are currently no unit tests in ASF of all the Hadoop ecosystem projects running against a full transitive closure of the Bigtop components. This is actually pretty tricky to accomplish on ASF infrastructure since it requires a combinatorial explosion in the number of Maven artifacts that get published.

* closer to the point of this thread, there are very few Hadoop ecosystem projects that currently run unit tests against Hadoop 2.x

* our integration tests (#4) could be greatly improved upon (what tests couldn't!) and we really would like to have at least 10x the current number to feel more comfortable.

* it would be awesome to integrate with the rest of the system-level tests that may be available in the community.
The model citizen in that respect is Apache Pig, where they made their tests flexible enough that we can run a subset of them against a real cluster:
    http://bigtop01.cloudera.org:8080/view/Bigtop-trunk/job/Bigtop-trunk-smokecluster/14/testReport/org.apache.pig.test.pigunit/
I wish more projects came with tests like that out of the box. One particularly sore point in that respect for me is all of the integration tests that now exist for HBase 0.95. Running them on Bigtop infrastructure has been on my list of things to do, but I honestly haven't had a shred of time to make it happen.

* if we start making progress on the things above we will definitely run into the issue of not having enough eyeballs to even triage the failures coming out of these test runs.

That's about what I have on my wish list. Feel free to add to it. Also, feel free to pitch in to help on any of the issues.

I don't think I have anything more to add to this thread. I'll wait to hear back from those of you who are interested in helping.

Thanks,
Roman.