Adar Dembo created KUDU-2990:
--------------------------------

             Summary: Kudu can't distribute libnuma (dependency of memkind)
                 Key: KUDU-2990
                 URL: https://issues.apache.org/jira/browse/KUDU-2990
             Project: Kudu
          Issue Type: Bug
          Components: util
    Affects Versions: 1.10.0, 1.11.0, 1.12.0
            Reporter: Adar Dembo


I noticed in [this 
commit|https://github.com/apache/kudu/commit/973e5cdf8fbcedcdcc659d980f3a3a69dc4f109f]
 that libnuma (a dependency of memkind) is licensed under the LGPL. This means 
that we can't distribute it as per the [ASF 3rd party license 
policy|https://www.apache.org/legal/resolved.html#category-x].

Some background: memkind was added as a new thirdparty dependency in 1.10.0. It 
replaced the libraries provided by [PMDK|https://pmem.io/pmdk/], and is used to 
power our generic non-volatile memory cache implementation, which can be 
configured as a replacement for the standard DRAM-based block cache.

I spent some time looking into whether our use of memkind actually calls into 
libnuma, and unfortunately I think the answer is yes: when we map a pmem region 
via memkind, it creates an arena from which to do allocations, and that arena 
allocates some per-CPU data structures. The precise number of structures is 
derived from a call into libnuma.

We'll need to find a creative solution to this problem. Some ideas:
# Restrict libnuma to build time and expect it on the host system at runtime. 
We do this for some libraries already, like libsasl. I see libnuma installed on 
my laptop (Ubuntu 18) as well as on CentOS 6.6 and 7.3 machines we use for 
development. On my laptop the reverse dependencies look significant enough that 
it's likely installed by default, but I can't guarantee that everywhere, nor is 
it guaranteed for all sorts of funky container images users will no doubt put 
Kudu in.
# Like #1 but also patch memkind to dlopen() libnuma so that if it can't be 
found, whatever memkind function is currently running returns an error. That's 
a much better failure mode than "the Kudu process can't start", but it's 
unclear how much work this would be.
# Make the NVM cache implementation fully optional and excise it from the 
default Kudu distribution. I say "fully optional" because it's already somewhat 
optional: the CMake logic allows for it (and memkind, and libnuma) to not exist 
on macOS where that stuff apparently just doesn't work. Still, this would be 
frustrating for users who wish to use the NVM cache out of the box.
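
To make idea #2 more concrete, here is a minimal sketch of the dlopen() approach. It is not memkind or Kudu code: the helper name NumaConfiguredCpusOrError is hypothetical, the soname "libnuma.so.1" is an assumption about what the host system provides, and numa_num_configured_cpus() is used only as a representative libnuma entry point (I haven't confirmed it is the call memkind makes). The point is the failure mode: a missing library turns into an error return rather than a process that can't start.

```cpp
#include <dlfcn.h>
#include <cstdio>

// Signature of libnuma's numa_num_configured_cpus(), declared in <numa.h>.
using NumaCpuCountFn = int (*)();

// Hypothetical helper (not part of memkind or Kudu).
// Returns the CPU count reported by libnuma, or -1 if libnuma or the
// symbol cannot be resolved at runtime.
int NumaConfiguredCpusOrError() {
  // Assumption: the host provides libnuma under this soname.
  void* handle = dlopen("libnuma.so.1", RTLD_NOW | RTLD_LOCAL);
  if (handle == nullptr) {
    const char* err = dlerror();
    std::fprintf(stderr, "libnuma unavailable: %s\n", err ? err : "unknown");
    return -1;
  }

  // Resolve the symbol lazily instead of linking against libnuma.
  void* sym = dlsym(handle, "numa_num_configured_cpus");
  if (sym == nullptr) {
    const char* err = dlerror();
    std::fprintf(stderr, "symbol lookup failed: %s\n", err ? err : "unknown");
    dlclose(handle);
    return -1;
  }

  int cpus = reinterpret_cast<NumaCpuCountFn>(sym)();
  dlclose(handle);
  return cpus;
}
```

A caller would treat a return value of -1 as "NVM cache unavailable on this host" and fall back to the DRAM-based block cache or report a configuration error. On older glibc versions this needs to be linked with -ldl.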

I'm not sure what needs to happen with 1.10.0 (the first release with the 
libnuma dependency) or with 1.11.0 (about to be released). Could someone with 
more experience in ASF legal matters weigh in?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
