[
https://issues.apache.org/jira/browse/KUDU-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967391#comment-16967391
]
Piotr Balcer commented on KUDU-2990:
------------------------------------
I'm Michal's colleague from the Persistent Memory Development Kit (PMDK) team.
The vast majority of libraries shipped as part of PMDK, including memkind, have
non-optional LGPL dependencies on Linux-based platforms. The reason is simple:
the Linux kernel's convenience user-space interfaces are themselves sometimes
distributed as LGPL libraries, as is the case with libnuma.
In this concrete case, libmemkind needs to call mbind(), which is necessary for
any NUMA-based memory manipulation. The mbind() syscall interface is declared
and implemented in libnuma, and without this dependency we would have to resort
to raw syscalls and re-defining the kernel interfaces (see
[https://github.com/numactl/numactl/blob/master/syscall.c]). It's unlikely that
we would be persuaded to do so without a very important reason.
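To give a sense of what bypassing libnuma would involve, here is a minimal
sketch in Python (via ctypes) of calling mbind() as a raw syscall. It is an
illustration only, not code from memkind or numactl: the helper name
raw_mbind_default is hypothetical, and the syscall numbers are hard-coded
assumptions for x86_64 and aarch64 Linux, which is the kind of kernel-interface
redefinition a wrapper library normally hides.

```python
import ctypes
import mmap
import platform
import sys

# Assumed syscall numbers for mbind(2); they differ per architecture and
# would normally come from kernel headers rather than being hard-coded.
SYS_MBIND = {"x86_64": 237, "aarch64": 235}
MPOL_DEFAULT = 0  # "use the default policy", as defined in <linux/mempolicy.h>


def raw_mbind_default(length=mmap.PAGESIZE):
    """Apply MPOL_DEFAULT to a scratch mapping using a raw mbind() syscall.

    Returns 0 on success, an errno value on failure, or None when the
    platform is not one this sketch knows about.
    """
    nr = SYS_MBIND.get(platform.machine())
    if not sys.platform.startswith("linux") or nr is None:
        return None

    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long

    region = mmap.mmap(-1, length)            # page-aligned anonymous memory
    view = ctypes.c_char.from_buffer(region)  # needed to obtain its address
    try:
        addr = ctypes.addressof(view)
        rc = libc.syscall(
            ctypes.c_long(nr),
            ctypes.c_void_p(addr),
            ctypes.c_ulong(length),
            ctypes.c_long(MPOL_DEFAULT),
            ctypes.c_void_p(None),  # no node mask is needed for MPOL_DEFAULT
            ctypes.c_ulong(0),      # maxnode
            ctypes.c_uint(0),       # flags
        )
        return 0 if rc == 0 else ctypes.get_errno()
    finally:
        del view        # release the buffer export so the mapping can close
        region.close()
```

On a Linux host the call returns 0 on success, or an errno value such as EPERM
or ENOSYS when the syscall is filtered or the kernel lacks NUMA support.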
Also, notice that libnuma isn't some obscure library. All software that wishes
to explicitly manage NUMA-awareness on multi-socket systems needs a libnuma
dependency. This means that no ASF project that ships on Linux can do so
without re-implementing Linux user-space interfaces.
The same problem occurs with our other software that relies on Linux kernel
interfaces that ship in the form of LGPL-licensed libraries. For example,
the entire Linux libnvdimm subsystem is controlled through the LGPL-licensed
libndctl ([https://github.com/pmem/ndctl]). This is a hard dependency on Linux
for most of PMDK's libraries. And obviously, there's glibc...
Having said all that, long-term, we could modularize libmemkind so that the
various dependencies are optional, and functionality ships as separate plugins,
e.g., to allocate persistent memory, you'd need base libmemkind and a
libmemkind-pmem plugin. This would solve the licensing issue *in this
particular case*, but it's just an idea right now and it would take a
considerable amount of time to implement.
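As a rough illustration of what an optional libnuma dependency could look like
(in the spirit of the dlopen() idea in the issue description quoted below),
here is a minimal Python sketch. It is not how memkind works today. The helper
names load_libnuma and try_configured_cpus are hypothetical; the only libnuma
calls used are numa_available() and numa_num_configured_cpus(). Python's ctypes
loads the library at runtime, so a missing libnuma turns into an error value
rather than a process that fails to start.

```python
import ctypes
import ctypes.util


def load_libnuma():
    """Return a handle to libnuma, or None if it cannot be found or loaded."""
    name = ctypes.util.find_library("numa")
    if name is None:
        return None
    try:
        return ctypes.CDLL(name)
    except OSError:
        return None


def try_configured_cpus():
    """Return (cpu_count, None) on success or (None, error_message) on failure.

    The error-return style mirrors a C function that reports failure to its
    caller instead of aborting when the optional library is absent.
    """
    lib = load_libnuma()
    if lib is None:
        return None, "libnuma not found on this host"
    try:
        lib.numa_available.restype = ctypes.c_int
        lib.numa_num_configured_cpus.restype = ctypes.c_int
        if lib.numa_available() < 0:
            return None, "NUMA is not available on this system"
        return lib.numa_num_configured_cpus(), None
    except AttributeError:
        # The library loaded but does not export the expected symbols.
        return None, "libnuma is missing an expected symbol"


count, error = try_configured_cpus()
if error is None:
    print("configured CPUs:", count)
else:
    print("NUMA features disabled:", error)
```

A caller would check the error value and disable only the NUMA-dependent
feature, leaving the rest of the program usable.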
> Kudu can't distribute libnuma (dependency of memkind)
> -----------------------------------------------------
>
> Key: KUDU-2990
> URL: https://issues.apache.org/jira/browse/KUDU-2990
> Project: Kudu
> Issue Type: Bug
> Components: util
> Affects Versions: 1.10.0, 1.11.0, 1.12.0
> Reporter: Adar Dembo
> Assignee: Adar Dembo
> Priority: Blocker
>
> I noticed in [this
> commit|https://github.com/apache/kudu/commit/973e5cdf8fbcedcdcc659d980f3a3a69dc4f109f]
> that libnuma (a dependency of memkind) is licensed under the LGPL. This
> means that we can't distribute it as per the [ASF 3rd party license
> policy|https://www.apache.org/legal/resolved.html#category-x].
> Some background: memkind was added as a new thirdparty dependency in 1.10.0.
> It replaced the libraries provided by [PMDK|https://pmem.io/pmdk/], and is
> used to power our generic non-volatile memory cache implementation, which can
> be configured as a replacement for the standard DRAM-based block cache.
> I spent some time looking into whether our use of memkind actually calls into
> libnuma and unfortunately I think the answer is yes: when we map a pmem
> region via memkind, it creates an arena with which to do allocations, and
> that allocates some per-CPU data structures. The precise number of structures
> is derived from a call into libnuma.
> We'll need to find a creative solution to this problem. Some ideas:
> # Restrict libnuma to build time and expect it on the host system at runtime.
> We do this for some libraries already, like libsasl. I see libnuma installed
> on my laptop (Ubuntu 18) as well as on CentOS 6.6 and 7.3 machines we use for
> development. On my laptop the reverse dependencies look significant enough
> that it's likely installed by default, but I can't guarantee that everywhere,
> nor is it guaranteed for all sorts of funky container images users will no
> doubt put Kudu in.
> # Like #1 but also patch memkind to dlopen() libnuma so that if it can't be
> found, whatever memkind function is currently running returns an error.
> That's a much better failure mode than "the Kudu process can't start", but
> it's unclear how much work this would be.
> # Make the NVM cache implementation fully optional and excise it from the
> default Kudu distribution. I say "fully optional" because it's already
> somewhat optional: the CMake logic allows for it (and memkind, and libnuma)
> to not exist on macOS where that stuff apparently just doesn't work. Still,
> this would be frustrating for users who wish to use the NVM cache out of the
> box.
> I'm not sure what needs to happen to 1.10.0 (first release with the libnuma
> dependency) and to 1.11.0 (imminently releasing). Could someone with more
> experience in ASF legal matters weigh in?