Hi,

posix_memalign() in memory_pool.cc of libarrow-dev resolves to
jemalloc's posix_memalign() (je_posix_memalign()) because
libarrow-dev is built with ARROW_JEMALLOC=ON (the default) and
memory_pool.cc defines JEMALLOC_MANGLE:
https://github.com/apache/arrow/blob/master/cpp/src/arrow/memory_pool.cc#L53
So we can't replace that allocator with mimalloc via LD_PRELOAD.
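
As a rough illustration (this is not Arrow's or jemalloc's actual
code), the effect of a mangling macro can be sketched as below.
demo_je_posix_memalign(), demo_prefixed_calls and AllocateAligned()
are hypothetical names for this sketch only; the stand-in just counts
calls and forwards to libc so that the example runs without jemalloc.

```cpp
#include <cassert>
#include <cstdint>
#include <stdlib.h>  // posix_memalign(), free() (POSIX)

// Hypothetical stand-in for a prefixed allocator entry point such as
// je_posix_memalign(). It counts calls and forwards to libc.
static int demo_prefixed_calls = 0;

extern "C" int demo_je_posix_memalign(void** out, size_t alignment,
                                      size_t size) {
  ++demo_prefixed_calls;
  return posix_memalign(out, alignment, size);  // macro not defined yet
}

// What a mangling header effectively does: from here on, every use of
// the name posix_memalign in this translation unit is rewritten by the
// preprocessor to the prefixed name.
#define posix_memalign demo_je_posix_memalign

// Hypothetical caller, standing in for an allocation routine.
void* AllocateAligned(size_t alignment, size_t size) {
  void* ptr = nullptr;
  // Expands to demo_je_posix_memalign(&ptr, alignment, size).
  const int err = posix_memalign(&ptr, alignment, size);
  if (err != 0) return nullptr;
  assert(reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0);
  return ptr;
}
```

Because the rewrite happens at compile time, the object code for
AllocateAligned() refers to the prefixed symbol and not to
posix_memalign. An LD_PRELOADed library that only exports
posix_memalign is therefore never consulted for that call.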

The comment for JEMALLOC_MANGLE in memory_pool.cc says
"Needed to support jemalloc 3 and 4", but we bundle
jemalloc 5.2.1 now. So we can remove JEMALLOC_MANGLE.

Could you open an issue on Jira
https://issues.apache.org/jira/browse/ARROW to add support
for overriding the system memory pool's allocator with
LD_PRELOAD? (Do you want to work on this?)


Thanks,
-- 
kou

In <cack8hr5ltedfwrat3flsdp1hq5bsoj+dcilvqjdzpdome29...@mail.gmail.com>
  "Custom default C++ memory pool on Linux, and/or interception/auditing of 
system pool" on Tue, 14 Jun 2022 09:06:51 -0500,
  John Muehlhausen <j...@jgm.org> wrote:

> Hello,
> 
> This comment is regarding installation with `apt` on ubuntu 18.04 ...
> `libarrow-dev/bionic,now 8.0.0-1 amd64`
> 
> I'm a bit confused about the memory pool situation:
> 
> * I run with `ARROW_DEFAULT_MEMORY_POOL=system` and check that
> `arrow::default_memory_pool()->backend_name() ==
> arrow::system_memory_pool()->backend_name()`
> 
> * I then LD_PRELOAD a customized (*) mimalloc according to the directions
> at the mimalloc git repo and things like `strm->Reset(INT32_MAX);` seem not
> to be hitting it... I figured that is a big enough chunk to jostle it into
> doing something... `BufferOutputStream::Create(INT32_MAX)` is also not
> intercepted by mimalloc.  Is the "system" pool somehow going around the
> typical allocation interfaces on linux?  I built my own .so and linked it
> to the app and malloc() is getting intercepted.
> 
> * `arrow::mimalloc_memory_pool(&mmmp);` does return something... but
> apparently not "my" mimalloc ... statically linked?
> 
> * what is going on in Arrow with constructor (pre-main()) allocations?
> Some of this does hit my LD_PRELOADed mimalloc
> 
> * any way to get symbols for the apt-installed libs or would I need to
> build from source to get backtrace with symbols? (for chasing down sources
> of allocations)
> 
> * what is the C++ lib equivalent of the following from the Python code?  I
> figure I could stop trying to understand the built-in/default allocators if
> I could just replace them... but this may also intersect with my question
> about constructors.  Maybe I'd have to make sure my constructor runs first
> to perform the switch-a-roo before anything else tries to use the default
> pool?
> 
> ```
> namespace py {
> 
> static std::mutex memory_pool_mutex;
> static MemoryPool* default_python_pool = nullptr;
> 
> void set_default_memory_pool(MemoryPool* pool) {
>   std::lock_guard<std::mutex> guard(memory_pool_mutex);
>   default_python_pool = pool;
> }
> ```
> 
> 
> (*) the mimalloc customization: the main app has a weak reference that ends
> up defined by the LD_PRELOAD mimalloc, where the function so-supplied
> allows the app to install a function pointer (back to the main app) that
> gets called (if defined) at various interesting points in mimalloc
> 
> 
> Thanks,
> John
