On 08/12/2011 05:51 PM, Roman Shaposhnik wrote:
> Guys,
>
> I've noticed that a few Apache projects that Bigtop is integrating seem to
> have packaging infrastructure bundled in the trunk (Hadoop proper and
> Pig come to mind right away, but there could be others, I guess).
>
> So far, Bigtop has been solving a problem of providing a point of
> integration and packaging for projects that didn't have any of that.
> Now that some started to solve that same problem in an incompatible
> way, what policy would make the most sense for us moving forward?
>
> Thoughts?
>
> Thanks,
> Roman.

OK, I'll bite:

I think it is too early to think about that, and we should wait for these packaging efforts to be more battle-tested. Until then, we should focus on what is already in BigTop.

From a technical point of view, last time I checked, the Hadoop and Pig packaging efforts only target RPMs for RHEL/CentOS-like platforms (I remember seeing a few things that wouldn't work on anything other than RHEL/CentOS-like systems; maybe that has changed since). BigTop, on the other hand, comes from CDH, and even though it carries some history and cruft, it works and has been deployed in a wide variety of production clusters across a broad range of environments and GNU/Linux distributions. So I would rather work on cleaning up, improving, or bringing new features to BigTop than on helping other packaging efforts catch up on what BigTop has been doing since day one.

Then there is the integration part. Each project has a different point of view on how to organize its layout, which OSes to support, and how to do things, and we need some room to adjust for that. I would even argue we should allow ourselves to patch build and security issues, but that is outside the scope of this discussion. We also need to be able to pick and choose versions of each project to ensure they work well with each other. Bigtop is about the whole thing and not each individual part.

Besides, nothing prevents having different packaging efforts with different objectives and focus. But we should also be able to release packaging fixes quickly. For example, I was told that not being able to build on Fedora and openSUSE is not a blocker for Hadoop 0.20.204. If we were depending on its packaging, this would be disastrous from a packaging point of view (but it's fine in a Hadoop context, given that no one has really complained about it). So keeping our packaging work outside of the projects allows each project to focus on its priorities while we focus on ours, without having to convince each project, each time (*).

But it is a good thing that these projects are starting to think about packaging issues and becoming aware of them. This will improve their quality and make our lives easier as well. And nothing prevents us from collaborating, sending patches, or reusing good ideas/code.

(*) I am also lobbying for building and testing our packaging against the trunk of each project, so we can proactively ensure releases are of good enough quality for us, rather than waiting for a release to go out, filing bugs/patches, and waiting for the next release to be able to use it (if no other issues have been introduced). But all of this is pointless until we get some builds going.
