[
https://issues.apache.org/jira/browse/ARROW-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated ARROW-4422:
----------------------------------
Labels: pull-request-available (was: )
> [Plasma] Enforce memory limit in plasma, rather than relying on
> dlmalloc_set_footprint_limit
> --------------------------------------------------------------------------------------------
>
> Key: ARROW-4422
> URL: https://issues.apache.org/jira/browse/ARROW-4422
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++, Plasma (C++)
> Affects Versions: 0.12.0
> Reporter: Anurag Khandelwal
> Assignee: Anurag Khandelwal
> Priority: Minor
> Labels: pull-request-available
> Fix For: 0.13.0
>
>
> Currently, Plasma relies on dlmalloc_set_footprint_limit to limit the memory
> utilization for Plasma Store. This is restrictive because:
> * It restricts Plasma to dlmalloc, which supports limiting memory footprint,
> as opposed to other, potentially more performant malloc implementations
> (e.g., jemalloc)
> * dlmalloc_set_footprint_limit does not guarantee that the limit set by it is
> the amount of _usable_ memory. As such, we might trigger evictions much
> earlier than hitting this limit, e.g., due to fragmentation or metadata
> overheads.
> To overcome this, we can impose the memory limit at Plasma by tracking the
> number of bytes allocated and freed using malloc and free calls. Whenever the
> allocation reaches the set limit, we fail any subsequent allocations (i.e.,
> return NULL from malloc). This allows Plasma to not be tied to dlmalloc, and
> also provides more accurate tracking of memory allocation/capacity.
> Caveat: We will need to make sure that the mmapped files live on a file
> system that is somewhat larger (depending on the malloc implementation) than the
> Plasma memory limit to account for the extra memory required due to
> fragmentation/metadata overheads.
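
A minimal sketch of the accounting scheme described above, assuming a thin
wrapper around the underlying allocator. The class name LimitedAllocator and
its methods are hypothetical and are not taken from the Plasma code base;
std::malloc and std::free stand in for whichever malloc implementation is
used. This version rejects any request that would push usage past the limit,
which is one possible reading of "fail any subsequent allocations".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical wrapper (not Plasma's actual API): tracks the number of bytes
// handed out and refuses requests that would exceed a fixed limit.
class LimitedAllocator {
 public:
  explicit LimitedAllocator(std::size_t limit) : limit_(limit), allocated_(0) {}

  // Returns nullptr if the request does not fit in the remaining budget,
  // or if the underlying allocator itself fails.
  void* Allocate(std::size_t bytes) {
    // allocated_ never exceeds limit_, so the subtraction cannot underflow.
    if (bytes > limit_ - allocated_) {
      return nullptr;
    }
    void* pointer = std::malloc(bytes);
    if (pointer != nullptr) {
      allocated_ += bytes;
    }
    return pointer;
  }

  // The caller passes the size back, since plain free() does not report it.
  void Free(void* pointer, std::size_t bytes) {
    if (pointer == nullptr) {
      return;
    }
    assert(bytes <= allocated_);
    std::free(pointer);
    allocated_ -= bytes;
  }

  std::size_t allocated() const { return allocated_; }
  std::size_t limit() const { return limit_; }

 private:
  std::size_t limit_;      // maximum number of bytes that may be outstanding
  std::size_t allocated_;  // bytes currently outstanding
};
```

Because the counter only sees requested sizes, it does not include the
allocator's own fragmentation or metadata overhead, which is why the backing
file system needs the extra headroom mentioned in the caveat.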