A minimal build using the following seems to have solved my problem.  The
various no-builtin params are guesswork based largely on alloc-override.c
from mimalloc.  It would be nice if someone documented somewhere how to
turn off classes of builtins for each popular compiler or if this received
compiler support (e.g. -fno-builtingroup-allocation)... turning off ALL
builtins seems too heavy-handed.

cmake -E env CFLAGS="-fno-builtin-malloc -fno-builtin-calloc
-fno-builtin-realloc -fno-builtin-free -fno-builtin-reallocf
-fno-builtin-malloc_size -fno-builtin-malloc_usable_size
-fno-builtin-valloc -fno-builtin-vfree -fno-builtin-malloc_good_size
-fno-builtin-posix_memalign -fno-builtin-aligned_alloc -fno-builtin-cfree
-fno-builtin-pvalloc -fno-builtin-reallocarray -fno-builtin-reallocarr
-fno-builtin-memalign -fno-builtin-_aligned_malloc
-fno-builtin-__libc_malloc -fno-builtin-__libc_calloc
-fno-builtin-__libc_realloc -fno-builtin-__libc_free
-fno-builtin-__libc_cfree -fno-builtin-__libc_valloc
-fno-builtin-__libc_pvalloc -fno-builtin-__libc_memalign
-fno-builtin-__posix_memalign -fno-builtin-operator_new
-fno-builtin-operator_delete" CXXFLAGS="-fno-builtin-malloc
-fno-builtin-calloc -fno-builtin-realloc -fno-builtin-free
-fno-builtin-reallocf -fno-builtin-malloc_size
-fno-builtin-malloc_usable_size -fno-builtin-valloc -fno-builtin-vfree
-fno-builtin-malloc_good_size -fno-builtin-posix_memalign
-fno-builtin-aligned_alloc -fno-builtin-cfree -fno-builtin-pvalloc
-fno-builtin-reallocarray -fno-builtin-reallocarr -fno-builtin-memalign
-fno-builtin-_aligned_malloc -fno-builtin-__libc_malloc
-fno-builtin-__libc_calloc -fno-builtin-__libc_realloc
-fno-builtin-__libc_free -fno-builtin-__libc_cfree
-fno-builtin-__libc_valloc -fno-builtin-__libc_pvalloc
-fno-builtin-__libc_memalign -fno-builtin-__posix_memalign
-fno-builtin-operator_new -fno-builtin-operator_delete" cmake --preset
ninja-debug-minimal -DARROW_JEMALLOC=OFF -DARROW_MIMALLOC=OFF
-DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_INSTALL_PREFIX=/usr/local ..

On Tue, Jun 14, 2022 at 12:36 PM John Muehlhausen <j...@jgm.org> wrote:

> My best guess at this moment is that the Arrow lib I'm using was built
> with a compiler that had something like __builtin_posix_memalign in effect
> ??
>
> I say this because deploying __builtin_malloc has the same deleterious
> effect on my own .so
>
> On Tue, Jun 14, 2022 at 10:53 AM John Muehlhausen <j...@jgm.org> wrote:
>
>> I'm using ARROW_DEFAULT_MEMORY_POOL=system
>>
>> Based on a review of memory_pool.cc I expect this to become
>> posix_memalign calls on Linux
>>
>> When I call posix_memalign in a .so that I created and linked with my
>> app, using LD_PRELOAD=/usr/local/lib/libmimalloc.so to run the app, these
>> calls get forwarded to mi_posix_memalign (I know because I threw a printf
>> in there and rebuilt mimalloc)... note, I'm not talking about Arrow's
>> built-in mimalloc.
>>
>> Maybe Arrow's mimalloc is keeping the LD_PRELOAD of my custom mimalloc
>> from taking effect?  How is mimalloc included in Arrow?  When I
>> call arrow::mimalloc_memory_pool() I do get an Ok status, so it is in the
>> build I'm using from `apt`
>>
>> -John
>>
>> On Tue, Jun 14, 2022 at 10:37 AM Weston Pace <weston.p...@gmail.com>
>> wrote:
>>
>>> Sorry, that should have said "when Arrow builds jemalloc".  Here is
>>> the command we send down (from ThirdPartyToolchain.cmake):
>>>
>>> ```
>>> JEMALLOC_CONFIGURE_COMMAND
>>> "--prefix=${JEMALLOC_PREFIX}"
>>> "--libdir=${JEMALLOC_LIB_DIR}"
>>> "--with-jemalloc-prefix=je_arrow_"
>>> "--with-private-namespace=je_arrow_private_"
>>> "--without-export"
>>> "--disable-shared"
>>> # Don't override operator new()
>>> "--disable-cxx"
>>> "--disable-libdl"
>>> # See https://github.com/jemalloc/jemalloc/issues/1237
>>> "--disable-initial-exec-tls"
>>> ${EP_LOG_OPTIONS})
>>> list(APPEND
>>> ```
>>>
>>> On Tue, Jun 14, 2022 at 5:35 AM Weston Pace <weston.p...@gmail.com>
>>> wrote:
>>> >
>>> > I can try and give a more detailed answer later in the week but the
>>> > gist of it is that Arrow manages all "buffer allocations" with a
>>> > memory pool.  These are the allocations for the actual data in the
>>> > arrays.  These are the allocations that use the memory pool configured
>>> > by ARROW_DEFAULT_MEMORY_POOL.
>>> >
>>> > To avoid interfering with the user's allocations Arrow does not
>>> > configure the system allocator at all.  So when Arrow builds it alters
>>> > it slightly (using cmake variables I think) to be specific to Arrow.
>>> > This might make it a bit tricky to get debug symbols for jemalloc but
>>> > you could always build Arrow in debug mode and intercept the methods
>>> > in memory_pool.cc if your focus is tracking allocations.
>>> >
>>> > Arrow still uses the system allocator for all non-buffer allocations.
>>> > So, for example, when reading in a large IPC file, the majority of the
>>> > data will be allocated by Arrow's memory pool.  However, the schema,
>>> > and the wrapper array object itself will be allocated by the system
>>> > allocator.  This is probably why switching the system allocator to
>>> > jemalloc shows some, but not all, Arrow allocations happening there.
>>> >
>>> > On Tue, Jun 14, 2022 at 5:28 AM John Muehlhausen <j...@jgm.org> wrote:
>>> > >
>>> > > A code review has demonstrated that Arrow uses posix_memalign ... I do
>>> > > believe mimalloc preload is "catching" this but I didn't tool it with my
>>> > > customization.  Still interested in any guidance on the other points
>>> > > raised, and sorry for some of this being noise.
>>> > >
>>> > > -John
>>> > >
>>> > > On Tue, Jun 14, 2022 at 9:06 AM John Muehlhausen <j...@jgm.org> wrote:
>>> > >
>>> > > > Hello,
>>> > > >
>>> > > > This comment is regarding installation with `apt` on ubuntu 18.04 ...
>>> > > > `libarrow-dev/bionic,now 8.0.0-1 amd64`
>>> > > >
>>> > > > I'm a bit confused about the memory pool situation:
>>> > > >
>>> > > > * I run with `ARROW_DEFAULT_MEMORY_POOL=system` and check that
>>> > > > `arrow::default_memory_pool()->backend_name() ==
>>> > > > arrow::system_memory_pool()->backend_name()`
>>> > > >
>>> > > > * I then LD_PRELOAD a customized (*) mimalloc according to the directions
>>> > > > at the mimalloc git repo and things like `strm->Reset(INT32_MAX);` seem not
>>> > > > to be hitting it... I figured that is a big enough chunk to jostle it into
>>> > > > doing something... `BufferOutputStream::Create(INT32_MAX)` is also not
>>> > > > intercepted by mimalloc.  Is the "system" pool somehow going around the
>>> > > > typical allocation interfaces on linux?  I built my own .so and linked it
>>> > > > to the app and malloc() is getting intercepted.
>>> > > >
>>> > > > * `arrow::mimalloc_memory_pool(&mmmp);` does return something... but
>>> > > > apparently not "my" mimalloc ... statically linked?
>>> > > >
>>> > > > * what is going on in Arrow with constructor (pre-main()) allocations?
>>> > > > Some of this does hit my LD_PRELOADed mimalloc
>>> > > >
>>> > > > * any way to get symbols for the apt-installed libs or would I need to
>>> > > > build from source to get backtrace with symbols? (for chasing down sources
>>> > > > of allocations)
>>> > > >
>>> > > > * what is the C++ lib equivalent of the following from the Python code?  I
>>> > > > figure I could stop trying to understand the built-in/default allocators if
>>> > > > I could just replace them... but this may also intersect with my question
>>> > > > about constructors.  Maybe I'd have to make sure my constructor runs first
>>> > > > to perform the switch-a-roo before anything else tries to use the default
>>> > > > pool?
>>> > > >
>>> > > > ```
>>> > > > namespace py {
>>> > > >
>>> > > > static std::mutex memory_pool_mutex;
>>> > > > static MemoryPool* default_python_pool = nullptr;
>>> > > >
>>> > > > void set_default_memory_pool(MemoryPool* pool) {
>>> > > >   std::lock_guard<std::mutex> guard(memory_pool_mutex);
>>> > > >   default_python_pool = pool;
>>> > > > }
>>> > > > ```
>>> > > >
>>> > > >
>>> > > > (*) the mimalloc customization: the main app has a weak reference that
>>> > > > ends up defined by the LD_PRELOAD mimalloc, where the function so-supplied
>>> > > > allows the app to install a function pointer (back to the main app) that
>>> > > > gets called (if defined) at various interesting points in mimalloc
>>> > > >
>>> > > >
>>> > > > Thanks,
>>> > > > John
>>> > > >
>>>
>>
