Sure, Cos. I will put something up next week.


> On Apr 30, 2015, at 3:30 PM, Konstantin Boudnik <[email protected]> wrote:
> 
> Andrew, 
> 
> do you mind putting this on our wiki? Such a great and well-put explanation!
> I am sure it will help a lot of new contributors to get up to speed much
> quicker!
> 
> Thanks!
>  Cos
> 
>> On Tue, Apr 28, 2015 at 04:01PM, Andrew Purtell wrote:
>> Inline
>> 
>> On Tue, Apr 28, 2015 at 8:51 AM, Andre Kelpe <[email protected]>
>> wrote:
>> 
>>> Hi,
>>> 
>>> I am currently learning the ins and outs of Bigtop to work on the Cascading
>>> integration (https://issues.apache.org/jira/browse/BIGTOP-1766). I have a
>>> few questions around packaging in Bigtop:
>>> 
>>> 1) Most Linux distros have packaging guidelines that should be followed.
>>> Does Bigtop follow any set of rules in particular? Is there a linting tool
>>> for spec files etc.?
>> 
>> This is distro specific. Red Hat family distributions (RHEL, Fedora, CentOS,
>> Amazon Linux) offer 'rpmlint'. You can install it and run it by hand. From
>> personal experience, if you build deb packages on Ubuntu, the package build
>> will run the lintian tool automatically.
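
As a rough sketch of running the lint step by hand: the helper below picks
rpmlint or lintian from the file extension and runs it if it is installed.
The function names and the extension rule are assumptions made for
illustration, not part of Bigtop; only the two tool names come from the
answer above.

```python
import shutil
import subprocess

def lint_command(path):
    """Pick a lint tool from the file extension (illustrative rule, not Bigtop policy)."""
    if path.endswith((".spec", ".rpm")):
        return ["rpmlint", path]
    if path.endswith((".deb", ".changes", ".dsc")):
        return ["lintian", path]
    raise ValueError(f"no known lint tool for {path!r}")

def run_lint(path):
    """Run the chosen tool and return its exit code; fail clearly if it is missing."""
    cmd = lint_command(path)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} is not installed")
    return subprocess.run(cmd, check=False).returncode
```

For example, run_lint("zookeeper.spec") would invoke rpmlint on that spec
file, assuming rpmlint is on the PATH.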
>> 
>> 
>>> 2) Related to 1): Does Bigtop require packages to follow a certain directory
>>> layout? Our tools are currently meant to be untarred and used as is. If
>>> Bigtop requires them to be split across the file system, we will have to
>>> work on that upstream before they can be included.
>> 
>> Yes, broadly speaking we follow the Linux Standard Base (LSB). A typical
>> package build happens in four steps. We move files around in the third step
>> to make packages conform more closely to the LSB. Let me take you through
>> one package as an example:
>> 
>> Step 1. Download source tarball from the software release site and expand
>> it.
>> 
>> Step 2. do-component-build
>> 
>> Here, for example, see what we do for ZooKeeper:
>> https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/zookeeper/do-component-build
>> . We first normalize dependency versions according to the release BOM, then
>> kick off a build of the component's binary artifacts.
>> 
>> Step 3. install_<component>.sh
>> 
>> Again let's look at the ZK package:
>> https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/zookeeper/install_zookeeper.sh
>> . Here we take the resulting tarball from the component build, expand it,
>> and relocate various types of files so that the layout is more LSB-like.
>> 
>> Step 4. Native packager
>> 
>> Finally we hand off the expanded and munged result from step 3 to the
>> native packager. For ZK, the RPM specfile used is here:
>> https://github.com/apache/bigtop/blob/master/bigtop-packages/src/rpm/zookeeper/SPECS/zookeeper.spec
>> . The Debian package control files are here:
>> https://github.com/apache/bigtop/tree/master/bigtop-packages/src/deb/zookeeper
>> 
>> 
>> 
>> 
>>> 3) I noticed that the packages are built from source instead of re-using
>>> binary releases. Is that a strict requirement or does it just happen to be
>>> that way? For the Cascading integration I was planning on downloading our
>>> binary releases so that Bigtop ships with the same bits as our SDK.
>> 
>> We typically build packages from source so we can normalize dependencies.
>> For example, if a given Bigtop release ships with Hadoop 2.6.0 but the
>> Cascading SDK includes 2.5.1 artifacts, this would be ugly at best and
>> broken at worst.
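
The mismatch described above can be sketched as a simple check of bundled
versions against the release BOM. The dictionaries and the find_mismatches
helper are hypothetical and only illustrate the idea; the version numbers are
the ones from the example.

```python
def find_mismatches(bom, bundled):
    """Return {artifact: (bundled_version, bom_version)} for every disagreement."""
    return {
        name: (version, bom[name])
        for name, version in bundled.items()
        if name in bom and bom[name] != version
    }

# Versions taken from the example above; the dict layout is hypothetical.
bom = {"hadoop": "2.6.0"}
sdk = {"hadoop": "2.5.1"}
print(find_mismatches(bom, sdk))  # {'hadoop': ('2.5.1', '2.6.0')}
```

Building from source lets the build pick up the BOM version directly, so a
check like this would come back empty.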
>> 
>> 
>>> 4) What is your take on packaging standalone libraries? I noticed that most
>>> parts of Bigtop are tools in the broader sense, something one can invoke on
>>> the command line, but there is also a package for Apache Crunch, which is a
>>> library. What is the reasoning here? Would it make sense to build packages
>>> for libraries in the Cascading ecosystem?
>> I'm not sure we have anything that amounts to a policy here. Crunch isn't
>> the only case. We package the DataFu library of UDFs for Pig. We package
>> the Phoenix SQL skin add-on for HBase. We also package Tez, which is a YARN
>> application requiring Hadoop, and although it could be useful on its own
>> it's meant to be picked up and used by the Hive and Pig packages.
>> 
>> If a champion for a component shows up we will give it a look. We could
>> absolutely build a core Cascading package and then a number of library or
>> add-on packages, if that's how you would like to set things up as champion
>> or maintainer of same.
>> 
>> 
>> 
>>> Thanks for your answers!
>>> 
>>> - André
>>> 
>>> --
>>> André Kelpe
>>> [email protected]
>>> http://concurrentinc.com
>> 
>> 
>> 
>> -- 
>> Best regards,
>> 
>>   - Andy
>> 
>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>> (via Tom White)
