Hi Arrow community! I am a Debian Developer looking to package Arrow officially in Debian as a dependency for a specific tool I want to get into Debian as well.
I have a working package based on the JFrog packaging groundwork [0], but I had to make various changes, mostly to avoid downloading dependencies from the Internet (which is not allowed during the Debian build process). In practice that meant setting -DARROW_DEPENDENCY_SOURCE=SYSTEM and enabling or disabling features based on what we have in Debian and what we don't. The result is at [1].

It looks like I can build all packages that the JFrog packaging builds without problems (at least on amd64); the build log is attached. The only exceptions are ORC and S3 support, which are missing because the ORC library [2] and the AWS C++ SDK [3] are not packaged yet. Apart from that, everything appears to work.

Just so you know, nothing has been officially uploaded yet. The package is still in preparation and has only been used internally within my organization so far.

Being quite far along in the packaging process, I have some questions:

1.) Would somebody from the upstream team be interested in collaborating to keep Arrow maintained in Debian? I would be able to review updates and sponsor uploads.

2.) One quite scary thing left is documenting all copyright and license occurrences in the codebase. There appears to be a fair bit of embedded code coming from various sources, with varying levels of modification. The debian/copyright file in the JFrog packaging only contains a number of TODOs, so I guess it is still up to me to finish it before I can think about doing an upload. Is the LICENSE.txt in the Arrow source root directory complete, i.e. does it list _all_ third-party licenses and copyright holders in the release tarball? If so, I could use it as a template and just reformat it as required by Debian. That would be good to know; otherwise it would mean a lot of digging, and I would probably still miss something. Missed license or copyright holder mentions are the most common reason why new packages are rejected during the initial, mandatory manual review, BTW, so I'd like to avoid unnecessary review iterations ;)

Thanks!

Best regards
Sascha

[0] https://apache.jfrog.io/artifactory/arrow/debian/pool/bullseye/main/a/apache-arrow/apache-arrow_4.0.0-1.debian.tar.xz
[1] https://salsa.debian.org/satta/arrow/-/tree/master/debian
[2] https://github.com/apache/orc
[3] https://github.com/aws/aws-sdk-cpp
arrow-build-log.txt.xz
Description: application/xz