Re: The combinatorial problem of testing hadoop ecosystem tools

Roman Shaposhnik Mon, 10 Jun 2013 11:29:05 -0700

On Mon, Jun 10, 2013 at 11:10 AM, Jay Vyas <[email protected]> wrote:
> Hi again big top !
>
> It would be nice if one could specify certain distribution components
> (for example, hbase 0.94.7, with hadoop 1.x.x, etc...) in my BigTop
> deployments to test matrices of interoperable components.


Well, this is a way more complicated problem than it sounds. Basically,
you can't just take the components and plug them in. You have to
build a complete Bigtop stack with these components so that the
transitive dependencies get processed correctly.

As long as you're willing to build and deploy custom stacks you
should be fine though.
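
To put a rough number on that: every combination in such a matrix is
a separate full stack build. A minimal sketch, assuming made-up
candidate version lists (only hbase 0.94.7 comes from your mail; the
rest are placeholders, not a Bigtop BOM):

```python
from itertools import product

# Hypothetical candidate versions per component; illustrative only,
# not an actual Bigtop BOM.
candidates = {
    "hadoop": ["1.x", "2.x"],
    "hbase": ["0.94.7", "0.95.x"],
    "hive": ["0.10.x", "0.11.x"],
}

def stack_matrix(candidates):
    """Return every possible stack as a dict of component -> version."""
    names = sorted(candidates)
    return [
        dict(zip(names, combo))
        for combo in product(*(candidates[name] for name in names))
    ]

stacks = stack_matrix(candidates)
# Each entry would need its own complete build, deploy and test cycle.
print(len(stacks), "full stack builds needed")
```

The count is the product of the list lengths, so adding one more
version of one more component multiplies the work instead of adding
to it.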

Now, as far as Hadoop 1.x is concerned -- Bigtop has moved on
to Hadoop 2.x. I'm not sure how much will still be applicable.

> As we all know, it's common for different people to run different versions of
> ecosystem components without integrating them.  For example, maybe someone
> will have an old version of HIVE running on a new

One of the very important points of the Bigtop charter is to be
a community-driven place where the most useful stack for a
Hadoop-based big data management platform gets decided.
Basically, as a community, it is our goal to have the stack
of components that works for everybody. Having a common
stack amplifies our testing significantly since the entire
community is testing the 'same stuff'.

Now, this is not to say that companies/individuals shouldn't
attempt spinning their own stacks -- but then, you're essentially
forking away from the rest of the community and you're kind
of on your own.

Finally, we vote on the BOM (bill of materials) for each release
of Bigtop. Bigtop 0.7.0 BOM suggestions/voting will open
up sometime next week. We'd love to have your feedback.

> Is this commonly done or has anyone worked on the combinatorial ecosystem
> match up testing problem in BigTop ?  It seems like it might be tricky to
> select and deal with the fact that ecosystem tools are constantly being
> upgraded at different rates.

We're no different from a Linux distribution. Just as the Fedora or Debian
community gets to decide what versions of the kernel and gcc are going
to end up in its next release -- we get to decide the same and execute
on it. Just like with Linux there will be users who want to run their
own version of gcc, but this is a dangerous step to take -- once you've
forked away from the distro version you're sort of on your own as
far as bugfixes, etc. are concerned.

Thanks,
Roman.
