On 08/12/2011 05:51 PM, Roman Shaposhnik wrote:
> Guys,
>
> I've noticed that a few Apache projects that Bigtop is integrating seem to
> have packaging infrastructure bundled in the trunk (Hadoop proper and
> Pig come to mind right away, but there could be others, I guess).
>
> So far, Bigtop has been solving a problem of providing a point of
> integration and packaging for projects that didn't have any of that.
> Now that some started to solve that same problem in an incompatible
> way, what policy would make the most sense for us moving forward?
>
> Thoughts?
>
> Thanks,
> Roman.
OK, I'll bite:

I think it is too early to think about that and we should wait for these
packaging efforts to be more battle tested. Until then, we should focus
on what is already in Bigtop.

From a technical point of view, last time I checked, the Hadoop and Pig
packaging efforts only covered RPMs for RHEL/CentOS-like platforms (I
remember seeing a few things that wouldn't work on anything other than
RHEL/CentOS-like systems; maybe that has changed since). Bigtop, on the
other hand, comes from CDH, and even though it carries some history and
cruft, it works and has been deployed in a wide variety of production
clusters across a broad range of environments and GNU/Linux
distributions. So I would rather work on cleaning up, improving, or
bringing new features to Bigtop than on helping other packaging efforts
catch up with what Bigtop has been doing since day one.

Then there is the integration part. Each project has a different point
of view on how to organize its layout, which OSes to support, and how to
do things, and we need some room to adjust for that. I would even argue
we should allow ourselves to patch build and security issues, but that
is out of scope for this discussion.
We also need to be able to pick and choose versions of each project to
ensure they work well with each other. Bigtop is about the whole thing
and not each individual part. Besides, nothing prevents having different
packaging efforts with different objectives and focus.
But we should also be able to quickly release fixes for packaging. For
example, I was told that not being able to build on Fedora and openSUSE
is not a blocker for Hadoop 0.20.204. If we depended on its packaging,
this would be disastrous from a packaging point of view (but it's fine
in a Hadoop context, given that no one has really complained about it).
So keeping our packaging work outside of the projects allows each
project to focus on its priorities while we focus on ours, without
having to convince each project every time (*).

But it is a good thing that these projects are starting to think about
packaging issues and becoming aware of them. This will improve their
quality and make our lives easier as well. And nothing prevents us from
collaborating, sending patches, or reusing good ideas/code.


(*) I am also lobbying for building and testing our packaging against
the trunk of each project so we can proactively ensure releases are of
good enough quality for us, rather than waiting for a release to go out,
filing bugs/patches, and waiting for the next release to be able to use
it (if no other issues have been introduced). But all of this is
pointless until we get some builds going.
