[ 
https://issues.apache.org/jira/browse/KUDU-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968169#comment-16968169
 ] 

Adar Dembo commented on KUDU-2990:
----------------------------------

[~pbalcer] thanks for the feedback. Agreed that, given your description, 
libnuma sounds like a core dependency of memkind, and as such, shouldn't be 
dlopen'ed. To be compliant with ASF policy we then have two choices:
# dlopen() memkind at runtime if the NVM cache is desired, thus avoiding any 
dependency on memkind or libnuma.
# At build time, statically link against memkind and dynamically link against 
libnuma. Never ship libnuma, and tell Kudu users that they must have it 
installed on their systems or Kudu won't start.

Given that the NVM cache is an optional feature rather than a core part of 
Kudu, I'm inclined to go with option #1, which is what I implemented in the 
linked code review. The downside is that users will have to install memkind on 
their systems themselves, but that's not so bad given that NVM hardware is 
still rare and is generally deployed on newer distros where a recent memkind is 
readily available. We can always revisit this decision in the future.
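
To illustrate the shape of option #1, here is a minimal sketch. It is not the 
code from the linked review. The memkind function names are the library's 
public ones; the MemkindApi struct, the Resolve() and LoadMemkind() helpers, 
and the "libmemkind.so.0" soname in the usage note are assumptions made for 
illustration.

```cpp
#include <dlfcn.h>

#include <cstddef>
#include <string>

// Opaque handle type; memkind's public header declares memkind_t as a
// pointer to this struct.
struct memkind;

// Hypothetical holder for the few memkind entry points this sketch resolves.
struct MemkindApi {
  void* lib = nullptr;
  int (*create_pmem)(const char* dir, size_t max_size, memkind** kind) = nullptr;
  int (*destroy_kind)(memkind* kind) = nullptr;
  void* (*malloc_fn)(memkind* kind, size_t size) = nullptr;
  void (*free_fn)(memkind* kind, void* ptr) = nullptr;
};

// Looks up one symbol in an already opened library. On failure, describes
// the problem in *err and returns false.
template <typename Fn>
bool Resolve(void* lib, const char* name, Fn* out, std::string* err) {
  dlerror();  // Clear any stale error state before the lookup.
  void* sym = dlsym(lib, name);
  if (sym == nullptr) {
    const char* msg = dlerror();
    *err = std::string("could not resolve ") + name;
    if (msg != nullptr) {
      *err += ": ";
      *err += msg;
    }
    return false;
  }
  *out = reinterpret_cast<Fn>(sym);
  return true;
}

// Tries to load memkind at runtime. Returns false (leaving *api untouched)
// if the library or any required symbol is missing, so the caller can
// report an error instead of failing at process startup.
bool LoadMemkind(const char* soname, MemkindApi* api, std::string* err) {
  void* lib = dlopen(soname, RTLD_NOW | RTLD_LOCAL);
  if (lib == nullptr) {
    const char* msg = dlerror();
    *err = (msg != nullptr) ? msg : "dlopen failed";
    return false;
  }
  MemkindApi loaded;
  loaded.lib = lib;
  if (!Resolve(lib, "memkind_create_pmem", &loaded.create_pmem, err) ||
      !Resolve(lib, "memkind_destroy_kind", &loaded.destroy_kind, err) ||
      !Resolve(lib, "memkind_malloc", &loaded.malloc_fn, err) ||
      !Resolve(lib, "memkind_free", &loaded.free_fn, err)) {
    dlclose(lib);
    return false;
  }
  *api = loaded;
  return true;
}
```

A caller would invoke LoadMemkind("libmemkind.so.0", &api, &err) only when the 
NVM cache is requested. If it returns false, the error can be surfaced to the 
user and the rest of the process is unaffected, since nothing links against 
memkind or libnuma at build time.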

bq. I checked the version of memkind available in popular distros, CentOS 7 and 
8 both include libmemkind in version 1.7.

Thanks for the correction. I had checked on CentOS 7.3 (the machine I have 
locally), but I can confirm that memkind 1.7 is available in CentOS 7.7.


> Kudu can't distribute libnuma (dependency of memkind)
> -----------------------------------------------------
>
>                 Key: KUDU-2990
>                 URL: https://issues.apache.org/jira/browse/KUDU-2990
>             Project: Kudu
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 1.10.0, 1.11.0, 1.12.0
>            Reporter: Adar Dembo
>            Assignee: Adar Dembo
>            Priority: Blocker
>
> I noticed in [this 
> commit|https://github.com/apache/kudu/commit/973e5cdf8fbcedcdcc659d980f3a3a69dc4f109f]
>  that libnuma (a dependency of memkind) is licensed under the LGPL. This 
> means that we can't distribute it as per the [ASF 3rd party license 
> policy|https://www.apache.org/legal/resolved.html#category-x].
> Some background: memkind was added as a new thirdparty dependency in 1.10.0. 
> It replaced the libraries provided by [PMDK|https://pmem.io/pmdk/], and is 
> used to power our generic non-volatile memory cache implementation, which can 
> be configured as a replacement for the standard DRAM-based block cache.
> I spent some time looking into whether our use of memkind actually calls into 
> libnuma and unfortunately I think the answer is yes: when we map a pmem 
> region via memkind, it creates an arena with which to do allocations, and 
> that allocates some per-CPU data structures. The precise number of structures 
> is derived from a call into libnuma.
> We'll need to find a creative solution to this problem. Some ideas:
> # Restrict libnuma to build time and expect it on the host system at runtime. 
> We do this for some libraries already, like libsasl. I see libnuma installed 
> on my laptop (Ubuntu 18) as well as on CentOS 6.6 and 7.3 machines we use for 
> development. On my laptop the reverse dependencies look significant enough 
> that it's likely installed by default, but I can't guarantee that everywhere, 
> nor is it guaranteed for all sorts of funky container images users will no 
> doubt put Kudu in.
> # Like #1 but also patch memkind to dlopen() libnuma so that if it can't be 
> found, whatever memkind function is currently running returns an error. 
> That's a much better failure mode than "the Kudu process can't start", but 
> it's unclear how much work this would be.
> # Make the NVM cache implementation fully optional and excise it from the 
> default Kudu distribution. I say "fully optional" because it's already 
> somewhat optional: the CMake logic allows for it (and memkind, and libnuma) 
> to not exist on macOS where that stuff apparently just doesn't work. Still, 
> this would be frustrating for users who wish to use the NVM cache out of the 
> box.
> I'm not sure what needs to happen to 1.10.0 (the first release with the 
> libnuma dependency) and to 1.11.0 (about to be released). Could someone with 
> more experience in ASF legal matters weigh in?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)