Erich Schubert created BIGTOP-713:
-------------------------------------

             Summary: use newer debhelper and source format 3.0 (quilt) for 
Debian and Ubuntu packaging
                 Key: BIGTOP-713
                 URL: https://issues.apache.org/jira/browse/BIGTOP-713
             Project: Bigtop
          Issue Type: Improvement
          Components: Debian
    Affects Versions: 0.5.0
            Reporter: Erich Schubert
            Priority: Minor


debhelper can automate many common tasks in Debian package creation.

The current packages use an old style of debhelper that is often unnecessarily 
complicated, which makes it harder to fix things.

For example, current Hadoop (0.23.3) does not compile on Debian because of the 
new GCC version. The fix is a simple "#include <unistd.h>" in the 
HadoopPipes.cc file.

Modern Debian packaging with "quilt" has an excellent mechanism for managing 
such patches. However, in order to use this with the current Bigtop packaging, 
one has to:
1. create debian/source/format to use "3.0 (quilt)",
2. manually add quilt patching to the debian/rules targets,
3. make sure the .debian.tar.gz is also copied instead of the old .diff.gz.
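
To illustrate the first step, here is a minimal shell sketch. The function 
name setup_quilt_format and the patch name in the comments are made up for 
illustration and are not part of Bigtop. As far as I know, dpkg-source applies 
the patches listed in debian/patches/series on its own when it unpacks a 
"3.0 (quilt)" source package, so the series file is where a fix like the one 
above would be registered.

```shell
#!/bin/sh
# Sketch only: prepare an unpacked source tree for format "3.0 (quilt)".
# setup_quilt_format is a hypothetical helper; $1 is the package directory.
setup_quilt_format() {
    pkgdir="$1"
    mkdir -p "$pkgdir/debian/source" "$pkgdir/debian/patches"
    # Declare the source format (step 1 above).
    printf '3.0 (quilt)\n' > "$pkgdir/debian/source/format"
    # Patches are applied in the order listed here; start with an empty list.
    touch "$pkgdir/debian/patches/series"
}

# With quilt installed, a fix would then typically be recorded like this
# (the patch name is hypothetical, the file path is left open on purpose):
#   export QUILT_PATCHES=debian/patches
#   quilt new add-unistd-include.patch
#   quilt add <path to HadoopPipes.cc>
#   ... edit the file ...
#   quilt refresh
```

Calling setup_quilt_format with the unpacked source directory as its only 
argument creates the two files; nothing else in the tree is touched.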

You will be surprised how many things debhelper does well on its own, with a 
rules file consisting of little more than the automagic:

%:
        dh $@

Furthermore, "java-wrappers" is a Debian and Ubuntu package that helps with 
setting up classpaths and choosing the JVM. It can do all of bigtop-utils and 
more, and it is used by other Java packages. IMHO it should be preferred 
instead.

If the packaging were more Debian-standard, it would be a lot easier to get 
the packages accepted into Debian mainline at some point. It may even be 
desirable to build the various hadoop components (-common, -yarn etc.) 
independently if they are isolated well enough upstream.

Don't get me wrong. I think the packages are pretty good already. In 
particular, I like the split into namenode and datanode packages and the use 
of update-alternatives, for example. I just found it rather hard to get a grip 
on the process and to get my fixes into the package. For example, I had to 
manually set JAVA_HOME before building, some build dependencies were missing 
(cmake, but it probably is a new requirement), and some paths have changed 
(probably the yarn promotion to a top level project?).
I understand that you want to have as much common code for all distributions as 
possible, as opposed to having per-distribution packaging. However, if every 
project uses its own specific version of java-wrappers and build process, 
things will not really be better than if it is at least consistent across the 
various distributions.
But ideally, there should be very little packaging code needed anyway, and most 
things should be done by an appropriate installation process upstream.

And seriously, /usr/lib/hadoop/lib is a **mess**. There even is a package in 
there with a "*" in the file name. Plus, a lot of these jars are available in 
Debian, and could be shared across packages if the packages let the 
distribution manage them instead of shipping their own copies...

Even within the Bigtop packages this leads to a totally unnecessary overlap:

995720 Sep 25 14:18 /usr/lib/hadoop-hdfs/lib/snappy-java-1.0.3.2.jar
995720 Sep 25 14:18 /usr/lib/hadoop-mapreduce/lib/snappy-java-1.0.3.2.jar
995720 Sep 25 14:18 /usr/lib/hadoop-yarn/lib/snappy-java-1.0.3.2.jar
[...]
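
A listing like the one above can be produced with a few lines of shell. This 
is only a sketch: find_duplicate_jars is a name I made up, and it compares 
files by POSIX cksum (CRC plus size), so the groups it prints are candidates 
for identical content rather than proof of it.

```shell
#!/bin/sh
# Sketch only: print jar files that share the same checksum and size.
# find_duplicate_jars is a hypothetical helper; arguments are directories.
find_duplicate_jars() {
    # cksum prints "<crc> <size> <path>" for each file.
    find "$@" -type f -name '*.jar' -exec cksum {} + |
        LC_ALL=C sort |
        awk '{
            key = $1 " " $2
            if (key == prev) {
                # Print the first member of a group once, then each repeat.
                if (!printed) print prevline
                print $0
                printed = 1
            } else {
                printed = 0
            }
            prev = key
            prevline = $0
        }'
}
```

For the case above one would pass the three directories 
/usr/lib/hadoop-hdfs/lib, /usr/lib/hadoop-mapreduce/lib and 
/usr/lib/hadoop-yarn/lib as arguments.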

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
