[Removed cross-posts to Hadoop dev lists]

Hi Shubham.

These are good points - thanks for bringing them up! The front page needs to
be improved for sure. For one, a link to the Wiki needs to be added and a
bunch of technical information needs to be migrated to the Bigtop wiki.

As for the TODO list: most Apache projects tend to use JIRA to track the
issues that need to be addressed. For instance, we have a dozen or so tickets
related to the wiki and documentation pages. I have just added one for
landing page improvements, etc.
    http://is.gd/qRhiqM

In general - any contributions to the project are highly appreciated. Please
go ahead and pick the tickets you see unassigned in the Bigtop project on JIRA
    https://issues.apache.org/jira/browse/BIGTOP

clone the workspace
    http://git-wip-us.apache.org/repos/asf/bigtop.git

and bring it on! Looking forward to your first patch!

Regards,
  Cos


On Mon, Jun 03, 2013 at 12:25PM, Shubham Goyal wrote:
> Hi Roman,
> 
> Although your email is pretty big, it's really very informative. I went
> through Bigtop's website, wiki, archives and other links. I have the
> suggestions below. I think adding these points will help newbies who wish to
> contribute to the project:
> 
> 1) It would be good if we could add more details on how a person can
> contribute to the project, how to get started, and which areas are open
> for contributions right now.
> 2) I think adding a *TODO* page is a good idea. On this page we can list
> the things that need to be done. It can start with the simple items and
> move towards more complex ones. This will help new contributors to pick an
> item and work on it.
> 
> To start with, I have the above two suggestions. I am now planning to
> contribute to the Bigtop project. Can you please let me know the current
> *TODOS* or how I should start contributing to the project?
> 
> Thanks in advance.
> Shubham
> 
> 
> On Sat, May 18, 2013 at 4:59 AM, Roman Shaposhnik <[email protected]> wrote:
> 
> > Guys, this is a pretty long email with all the details
> > I can think of on how Bigtop can help stabilization efforts of
> > Hadoop 2.x. A lot of this information is required background.
> > I really, really encourage everyone who's thinking of
> > contributing to this effort to read it up. Once again,
> > I do apologize for its size.
> >
> > Matt, Andrew,
> >
> > you both brought up very good points, so let me summarize
> > a few things wrt. Bigtop. I'm also CCing the Bigtop dev ML
> > so that everybody who's interested in pitching in can
> > discuss the matter further over there.
> >
> > On Wed, May 15, 2013 at 9:25 PM, Andrew Purtell <[email protected]>
> > wrote:
> > > The other comment on this thread that suggests ASF governance structures
> > > being inadequate for negotiating changes in a large ecosystem might be on to
> > > something, but at the same time Apache BigTop may be an effective ASF-native
> > > answer to that.
> >
> > That is my sincere hope as well. Of course, Apache Bigtop is a project in its
> > own right with its own release schedules, community of users, etc. What we
> > are developing is not really an integration testsuite for Hadoop, it just so
> > happens that without a stable Hadoop base we can't really deliver much. Hence
> > we have a huge vested interest in having a predictable schedule for the stable
> > releases of Hadoop. We also have all the interest in the world to help Hadoop
> > achieve that.
> >
> > At the same time we're a very small project juggling ~18 different open source
> > components trying to put them into a coherent distribution. I don't think it is
> > realistic to expect us to be able to do all the work that ideally we would need
> > to do in order to provide the most feedback for the Hadoop stabilization
> > exercise.
> >
> > At the same time it would be really unfortunate if we all just give up on this
> > collective goal. Ideally we can all pitch in to the extent we believe in the
> > need for a stable Hadoop 2.x code line out there. I'll elaborate on
> > what exactly Bigtop can contribute a bit later and I would expect all the
> > folks who'd be willing to pitch in in a particular area to reach out to us
> > either here or on the Bigtop ML.
> >
> > On Wed, May 15, 2013 at 4:54 PM, Matt Foley <[email protected]> wrote:
> > > Roman, what is your model for how test results from Bigtop should feed back
> > > into Hadoop-2 development?
> > > With the understanding that (a) software does have bugs, and (b) you're not
> > > going to get an SLA on community-sponsored software,
> > > what are your ideas for how to close the loop better?
> > >
> > > Would "CI" runs of Bigtop against branch-2 be feasible, as Arun suggests?
> > > How should we accommodate changes in individual components (Hadoop Core, but
> > > others as well) that may require changes in one or more other components?
> > > How does Bigtop keep doing a viable nightly build in that chaotic
> > > environment?
> > > Is this a previously solved problem?
> >
> > All excellent questions! Here's my laundry list of what Bigtop can offer today:
> >     #0 a publicly available continuous integration Jenkins instance that
> >          runs on EC2 (because of Cloudera's gracious support of our project)
> >          and ties the rest of the Bigtop infrastructure together:
> >              http://bigtop01.cloudera.org:8080/
> >
> >          The benefit of this infrastructure in the open is pretty clear -- just
> >          like with builds.apache.org, if there are failures etc., anybody who's
> >          interested can jump on it and start making progress.
> >
> >     #1 a continuous integration build of all the components comprising the
> >          'current' trunk of Apache Bigtop all the way up to producing easy to
> >          install packages for the following Linux platforms:
> >              http://bigtop01.cloudera.org:8080/view/Bigtop-trunk/job/Bigtop-trunk-Repository/
> >          Basically the above link allows one to install nightly builds of the
> >          Apache Bigtop Hadoop distribution as easily as typing
> >          'yum install hadoop-conf-pseudo'
> >
> >     #2 a potential for 'tracking' builds all the way to packages of each
> >          individual component:
> >              http://bigtop01.cloudera.org:8080/view/Upstream-tests/
> >          Basically this allows one to easily install the base, fully tested
> >          distribution of Hadoop (let's say Bigtop 0.5.0), upgrade just one
> >          component and see how it fares. Currently these builds are ad hoc,
> >          but I'm trying to work with the respective upstream communities to
> >          figure out what branches of development they would be interested in
> >          testing that way.
> >
> >          This is one of the things that Arun and I talked about wrt. hooking
> >          up Bigtop Jenkins to the branch-2 on a continuous basis. I wish I had
> >          time to do that, but I honestly simply don't. I might in a few weeks,
> >          but again, if anybody is willing to pitch in and help -- that'll be
> >          greatly appreciated.
> >
> >     #3 a collection of puppet recipes that allow one to deploy a packaged
> >          Bigtop distro (either from #1 or #2) on a fully distributed cluster.
> >
> >     #4 an existing collection of integration tests (~200) for all the
> >          components we've got in our stack: http://s.apache.org/UX8
> >
> >     #5 continuous integration Jenkins jobs that deploy our trunk builds on a
> >          nightly basis in 2 configurations, secure and unsecure, over 4 fully
> >          distributed nodes running as EC2 VMs:
> >               http://bigtop01.cloudera.org:8080/view/Deployment/
> >
> >     #6 continuous (although currently disabled) nightly runs of all the tests
> >          from #4 on the two clusters deployed as part of #5. E.g.:
> >              http://bigtop01.cloudera.org:8080/view/Bigtop-trunk/job/Bigtop-trunk-smokecluster/14/testReport/
> >
> > At this point we barely have resources to cover the minimum it takes
> > to maintain #1-#6. Here are the areas where we have gaps in coverage
> > (especially in the context of Hadoop 2.x):
> >      * there are currently no unit tests in ASF of all the Hadoop ecosystem
> >        projects running against a full transitive closure of the Bigtop
> >        components. This is actually pretty tricky to accomplish on ASF
> >        infrastructure since it requires a combinatoric explosion of the # of
> >        Maven artifacts that get published.
> >
> >      * closer to the point of this thread, there are very few Hadoop ecosystem
> >        projects that currently run unit tests against Hadoop 2.x
> >
> >      * our integration tests (#4) could be greatly improved upon (what tests
> >        couldn't!) and we really would like to have at least 10x the current
> >        amount to feel more comfortable.
> >
> >     * it would be awesome to integrate with the rest of the system-level tests
> >       that may be available in the community. The model citizen in that respect
> >       is Apache Pig where they made their tests flexible enough so that we
> >       can run a subset of them against a real cluster:
> >           http://bigtop01.cloudera.org:8080/view/Bigtop-trunk/job/Bigtop-trunk-smokecluster/14/testReport/org.apache.pig.test.pigunit/
> >       I wish more projects came with tests like that out of the box. One
> >       particularly sore point in that respect for me is all of the HBase
> >       integration tests that now exist for HBase 0.95. It has been on my list
> >       of things to start running on Bigtop infrastructure, but I honestly
> >       haven't had a shred of time to make it happen.
> >
> >     * if we start making progress on things above we will definitely run into
> >       issues of not having enough eyeballs to even triage the issues coming
> >       out of these test runs.
> >
> > That's about what I have on my wish list. Feel free to add to it. Also, feel
> > free to pitch in to help on any of the issues.
> >
> > I don't think I have anything more to add to this thread. I'll wait to hear
> > back from those of you who are interested in helping.
> >
> > Thanks,
> > Roman.
> >
