pitrou opened a new issue, #50083:
URL: https://github.com/apache/arrow/issues/50083

   ### Describe the enhancement requested
   
   The [memray memory profiler](https://github.com/bloomberg/memray) works by 
interposing certain dynamic symbols in the profiled process, replacing them 
with its own functions that collect memory allocation data. To the best of my 
knowledge, it currently only recognizes system C calls such as `malloc`, 
`mmap`...
   
   When a third-party allocator like mimalloc or jemalloc is being used, as 
Arrow does by default, memray does not see the logical allocation calls made 
through these allocators' APIs (because they are not interposed), but only 
the raw memory reservations that they issue using system routines.
   
   This can lead people using memray to think that a given Arrow workload (or 
any workload using such allocators, really) uses an inordinate amount of 
memory, while the reported memory mostly represents non-committed virtual 
memory that the allocator keeps for performance reasons. Concrete example in 
GH-40301: we allocate a number of 1 KiB buffers from mimalloc, but memray sees 
a similar number of 64 MiB calls to `mmap`.
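
   To make the gap between the two figures concrete, here is a toy accounting 
model. It is only a sketch: `ToyHeapStats`, its fields and its 
segment-per-reservation policy are made up for illustration and are not how 
mimalloc actually manages memory.

```cpp
#include <cstddef>

// Toy model (hypothetical, NOT mimalloc's real policy): a heap that reserves
// address space in fixed-size segments and carves small blocks out of them.
struct ToyHeapStats {
  std::size_t segment_size;         // size of one raw reservation (think: one mmap)
  std::size_t reserved_bytes = 0;   // what a system-call-level profiler observes
  std::size_t logical_bytes = 0;    // what the application actually requested
  std::size_t free_in_segment = 0;  // unused room left in the current segment

  explicit ToyHeapStats(std::size_t seg_size) : segment_size(seg_size) {}

  // Record a logical allocation of n bytes (assumes n <= segment_size).
  void Allocate(std::size_t n) {
    if (n > free_in_segment) {
      // The current segment cannot serve the request: reserve a new one.
      reserved_bytes += segment_size;
      free_in_segment = segment_size;
    }
    free_in_segment -= n;
    logical_bytes += n;
  }
};
```

   In this model, a single 1 KiB allocation on a fresh heap with 64 MiB 
segments already shows up as 64 MiB reserved. The model does not try to 
reproduce the exact pattern from GH-40301; it only shows why the reserved 
figure can dwarf the logical one.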
   
   We [discussed](https://github.com/bloomberg/memray/issues/577) how to 
enhance memray so as to account for the corresponding logical allocations, 
and we came to the conclusion that this requires Arrow to expose API calls 
that can be dynamically interposed. Since we typically build against a static 
`libmimalloc.a`, the mimalloc symbols cannot be exposed (at least, I cannot 
seem to get this to work on Ubuntu). This means we need to define our own 
symbols wrapping the mimalloc APIs.
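
   A minimal sketch of what such wrapper symbols could look like. The names 
`sketch_arrow_allocate` / `sketch_arrow_deallocate` and the `SKETCH_EXPORT` 
macro are hypothetical, and `std::malloc` / `std::free` stand in for the 
mimalloc calls so that the snippet compiles on its own.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical export macro for this sketch only; Arrow has its own
// visibility macros, which are deliberately not used here.
#if defined(_WIN32)
#define SKETCH_EXPORT __declspec(dllexport)
#else
#define SKETCH_EXPORT __attribute__((visibility("default")))
#endif

extern "C" {

// Hypothetical wrapper symbols with C linkage and default visibility, so that
// they end up in the shared library's dynamic symbol table.
// In a real build the bodies would forward to the statically linked
// allocator; std::malloc / std::free are stand-ins (an assumption).
SKETCH_EXPORT void* sketch_arrow_allocate(std::size_t size) {
  return std::malloc(size);
}

SKETCH_EXPORT void sketch_arrow_deallocate(void* ptr) {
  std::free(ptr);
}

}  // extern "C"
```

   Whether calls made from inside the same shared library can actually be 
interposed also depends on how the library is compiled and linked (for 
example, inlining or binding symbols locally would bypass the dynamic symbol 
table), so that would need checking on each platform.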
   
   
   ### Component(s)
   
   C++


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
