Re: Bigtop contributions [Was: [VOTE] - Release 2.0.5-beta]

Shubham Goyal Fri, 07 Jun 2013 03:32:47 -0700

Hi Cos,

Thanks for replying and considering my suggestions.
Bigtop is pretty new to me, and I am currently trying to understand its
architecture and code.
I am interested in contributing patches and hope to start doing that
once I have an understanding of Bigtop's functionality and code.


Thanks,
Shubham


On Thu, Jun 6, 2013 at 12:10 AM, Konstantin Boudnik <[email protected]> wrote:

> [Removed cross-posts to Hadoop dev lists]
>
> Hi Shubham.
>
> These are good points - thanks for bringing them up! The front page needs to
> be improved for sure. For one, a link to the Wiki needs to be added and a
> bunch of technical information needs to be migrated to the Bigtop wiki.
>
> As for a TODO list: most Apache projects tend to use JIRA to track the
> issues that need to be addressed. For instance, we have a dozen or so tickets
> related to the wiki and documentation pages. I have just added one for
> landing page improvements, etc.
>     http://is.gd/qRhiqM
>
> In general - any contributions to the project are highly appreciated. Please
> go ahead and pick the tickets you see unassigned in the Bigtop project on
> JIRA
>     https://issues.apache.org/jira/browse/BIGTOP
>
> clone the workspace
>     http://git-wip-us.apache.org/repos/asf/bigtop.git
>
> and bring it on! Looking forward to your first patch!
>
> Regards,
>   Cos
>
>
> On Mon, Jun 03, 2013 at 12:25PM, Shubham Goyal wrote:
> > Hi Roman,
> >
> > Although your email is pretty long, it's really very informative. I went
> > through Bigtop's website, wiki, archives and other links. I have the
> > suggestions below. I think adding these points will help newbies who wish to
> > contribute to the project:
> >
> > 1) It would be good if we could add more details on how a person can
> > contribute to the project, how he can get started and what the areas are
> > where he can start contributing right now.
> > 2) I think adding a *TODO* page would be a good idea. On this page we can
> > list the things that need to be done. It can start with the simple items and
> > move towards more complex ones. This will help new contributors to pick an
> > item and work on it.
> >
> > To start with, I have the above two suggestions. I am now planning to
> > contribute to the Bigtop project. Can you please let me know the current
> > *TODOS* or how I should start contributing to the project?
> >
> > Thanks in advance.
> > Shubham
> >
> >
> > On Sat, May 18, 2013 at 4:59 AM, Roman Shaposhnik <[email protected]>
> wrote:
> >
> > > Guys, this is a pretty long email with all the details
> > > I can think of on how Bigtop can help stabilization efforts of
> > > Hadoop 2.x. A lot of this information is required background.
> > > I really, really encourage everyone who's thinking of
> > > contributing to this effort to read it up. Once again,
> > > I do apologize for its size.
> > >
> > > Matt, Andrew,
> > >
> > > you both brought up very good points, so let me summarize
> > > a few things wrt. Bigtop. I'm also CCing Bigtop dev ML
> > > so that everybody who's interested in pitching in could
> > > discuss the matter further over there.
> > >
> > > On Wed, May 15, 2013 at 9:25 PM, Andrew Purtell <[email protected]>
> > > wrote:
> > > > The other comment on this thread that suggests ASF governance
> structures
> > > > being inadequate for negotiating changes in a large ecosystem might
> be
> > > on to
> > > > something, but at the same time Apache BigTop may be an effective
> > > ASF-native
> > > > answer to that.
> > >
> > > That is my sincere hope as well. Of course, Apache Bigtop is a project
> in
> > > its
> > > own right with its own release schedules, community of users, etc.
> What we
> > > are developing is not really an integration testsuite for Hadoop, it
> > > just so happens
> > > that without a stable Hadoop base we can't really deliver much. Hence
> we
> > > have a huge vested interest in having a predictable schedule for the
> stable
> > > releases of Hadoop. We also have all the interest in the world to help
> > > Hadoop
> > > achieve that.
> > >
> > > At the same time we're a very small project juggling ~18 different open
> > > source
> > > components trying to put them into a coherent distribution. I don't
> think
> > > it is
> > > realistic to expect us to be able to do all the work that ideally we
> would
> > > need
> > > to do in order to provide the most feedback for the Hadoop
> > > stabilization exercise.
> > >
> > > At the same time it would be really unfortunate if we all just give up
> on
> > > this
> > > collective goal. Ideally we can all pitch in to the extent we believe
> in
> > > the
> > > need for having a stable Hadoop 2.x code line out there. I'll elaborate
> on
> > > what exactly bigtop can contribute a bit later and I would expect all
> the
> > > folks who'd be willing to pitch in in the particular area to reach out
> to
> > > us
> > > either here or on bigtop ML.
> > >
> > > On Wed, May 15, 2013 at 4:54 PM, Matt Foley <[email protected]> wrote:
> > > > Roman, what is your model for how test results from Bigtop should
> feed
> > > back
> > > > into Hadoop-2 development?
> > > > With the understanding that (a) software does have bugs, and (b)
> you're
> > > not
> > > > going to get an SLA on community-sponsored software,
> > > > what are your ideas for how to close the loop better?
> > > >
> > > > Would "CI" runs of Bigtop against branch-2 be feasible, as Arun
> suggests?
> > > > How should we accommodate changes in individual components (Hadoop
> Core,
> > > but
> > > > others as well) that may require changes in one or more other
> components?
> > > > How does Bigtop keep doing a viable nightly build in that chaotic
> > > > environment?
> > > > Is this a previously solved problem?
> > >
> > > All excellent questions! Here's my laundry list of what Bigtop can
> offer
> > > today:
> > >     #0 a publicly available continuous integration Jenkins instance
> that
> > >          runs on EC2 (because of Cloudera's gracious support of our
> > > project)
> > >          and ties the rest of the bigtop infrastructure together:
> > >              http://bigtop01.cloudera.org:8080/
> > >
> > >          The benefit of this infrastructure in the open is pretty
> clear --
> > > just
> > >           like with builds.apache.org if there are failures/etc.
> anybody
> > > who's
> > >           interested can jump on it and start making progress.
> > >
> > >     #1 a continuous integration build of all the components comprising
> the
> > >          'current' trunk of Apache Bigtop all the way up to producing
> easy
> > > to
> > >          install packages for the following Linux platforms:
> > >
> > >
> http://bigtop01.cloudera.org:8080/view/Bigtop-trunk/job/Bigtop-trunk-Repository/
> > >          Basically the above link allows one to install nightly builds
> > > of Apache Bigtop
> > >          Hadoop distribution as easily as typing 'yum install
> > > hadoop-conf-pseudo'
> > >
> > >     #2 a potential for  'tracking' builds all the way to packages of
> > > each individual
> > >          component:
> http://bigtop01.cloudera.org:8080/view/Upstream-tests/
> > >          Basically this allows one to easily install the base, fully
> > > tested distribution
> > >          of Hadoop (let's say Bigtop 0.5.0), upgrade just one component
> > > and see how
> > >          it fares. Currently these builds are ad hoc, but I'm trying
> > > to work with respective
> > >          upstream communities to figure out what branches of
> > > development they would
> > >          be interested in testing that way.
> > >
> > >          This is one of the things that Arun and I talked about wrt.
> > > hooking up Bigtop
> > >          Jenkins to the branch-2 on a continuous basis. I wish I had
> > > time to do that, but
> > >          I honestly simply don't. I might in a few weeks, but again,
> > > if anybody is willing
> > >          to pitch in and help -- that'll be greatly appreciated.
> > >
> > >     #3 a collection of puppet recipes that allow one to deploy
> > > packaged Bigtop distro
> > >          (either from #1 or #2) on a fully distributed cluster.
> > >
> > >     #4 an existing collection of integration tests (~200) for all the
> > > components
> > >          we've got in our stack: http://s.apache.org/UX8
> > >
> > >     #5 continuous integration Jenkins jobs that deploy our trunk
> > > builds on a nightly
> > >          basis in 2 configurations: secure and unsecure one over 4
> > > fully distributed nodes
> > >          running as EC2 VMs:
> > >               http://bigtop01.cloudera.org:8080/view/Deployment/
> > >
> > >     #6 continuous (although currently disabled) nightly runs of all
> > > the tests from #4
> > >          on two clusters deployed as part of #5. E.g.:
> > >
> > >
> > >
> http://bigtop01.cloudera.org:8080/view/Bigtop-trunk/job/Bigtop-trunk-smokecluster/14/testReport/
> > >
> > > At this point we barely have resources to cover the minimum it takes
> > > to maintain #1-#6. Here are the areas where we have gaps in coverage
> > > (especially in the context of Hadoop 2.x):
> > >      * there are currently no unit tests in ASF of all the Hadoop
> ecosystem
> > >        projects running against a full transitive closure of the
> > > Bigtop components.
> > >        This is actually pretty tricky to accomplish on ASF
> > > infrastructure since it
> > >        requires a combinatoric explosion of the # of Maven artifacts
> > > that get published.
> > >
> > >      * closer to the point of this thread, there are very few
> > > Hadoop ecosystem
> > >        projects that currently run unit tests against Hadoop 2.x
> > >
> > >      * our integration tests  (#4) could be greatly improved upon (what
> > > tests
> > >        couldn't!) and we really would like to have at least 10x of the
> > > current
> > >        amount to feel more comfortable.
> > >
> > >     * it would be awesome to integrate with the rest of the
> system-level
> > > tests
> > >       that may be available in the community. The model citizen in that
> > > respect
> > >       is Apache Pig where they made their tests flexible enough so
> that we
> > >       can run a subset of them against a real cluster:
> > >
> > >
> http://bigtop01.cloudera.org:8080/view/Bigtop-trunk/job/Bigtop-trunk-smokecluster/14/testReport/org.apache.pig.test.pigunit/
> > >       I wish more projects came with tests like that out of the box.
> > > One particularly
> > >       sore point in that respect for me is all of the HBase
> > > integration tests that
> > >       now exist for HBase 0.95. It has been on my list of things to
> > > start running
> > >       on Bigtop infrastructure, but I honestly haven't had a shred of
> > > time to make
> > >       it happen.
> > >
> > >     * if we start making progress on things above we will definitely
> > > run into issues
> > >       of not having enough eyeballs to even triage the issues coming
> > > out of these
> > >       test runs.
> > >
> > > That's about what I have on my wish list. Feel free to add to it. Also,
> > > feel
> > > free to pitch in to help on any of the issues.
> > >
> > > I don't think I have anything more to add to this thread. I'll wait to
> > > hear back
> > > from those of you who are interested in helping.
> > >
> > > Thanks,
> > > Roman.
> > >
>
