jvanstraten commented on a change in pull request #12116:
URL: https://github.com/apache/arrow/pull/12116#discussion_r809153416



##########
File path: cpp/src/arrow/device.h
##########
@@ -223,4 +236,55 @@ class ARROW_EXPORT CPUMemoryManager : public MemoryManager 
{
 ARROW_EXPORT
 std::shared_ptr<MemoryManager> default_cpu_memory_manager();
 
+/// A memory manager that uses the immutable zeros interface of the given 
memory pool,
+/// rather than the normal mutable buffer interface.

Review comment:
       > And to me 'managing' means 'can read the data backed by the buffer', 
not necessarily 'will allocate you exactly the same buffer'.
   
   But that's what `Device` represents according to its 
[docstring](https://github.com/apache/arrow/blob/74f512260fa69903feac61e1287f6954a3d98204/cpp/src/arrow/device.h#L36-L39)?
 In contrast, you can have multiple memory managers per device, expressly 
intended to [provide memory management 
primitives](https://github.com/apache/arrow/blob/74f512260fa69903feac61e1287f6954a3d98204/cpp/src/arrow/device.h#L81-L85)
 for some particular allocation method, such as for some particular memory 
pool. The allocation method for immutable zero buffers is fundamentally 
different, so I stand by my interpretation of those docstrings that immutable 
zero buffers warrant a new memory manager class. However, I certainly can't 
vouch for the correctness of those docstrings or the usefulness/applicability 
of this layer if defined as such.
   
   Furthermore, if the intention of associating a `MemoryManager` with a buffer 
is only to specify which device can actually access it and how (which seems 
very sensible to me) rather than also allowing similar buffers to be allocated 
(which indeed does not seem very useful to me), the association shouldn't have 
been a memory manager but a `Device`. After all, the only other information 
provided by `MemoryManager` on top of its associated `Device` is how to 
allocate buffers. The rest of its methods should IMO be moved to `Device`. If 
`Device` should be kept clean of memory-related stuff, there should be an 
intermediate layer called `MemoryRegion` or `AddressSpace` or something that 
decouples allocation strategies from address space/the significance and 
accessibility of a pointer. In general a device can have multiple address 
spaces, anyway (such as NUMA nodes on CPU, different memory banks on an 
accelerator, etc.). Whether `MemoryManager` is still relevant on top of 
`MemoryPool` at t
 hat point is disputable, but I guess the biggest difference is that 
`MemoryPool` operates on pointers to bytes, whereas `MemoryManager` yields 
buffers that clean themselves up via RAII.
   
   Nevertheless, I feel like this discussion is getting to the point where it 
shouldn't be held in a github thread on a mostly unrelated PR (what *would* be 
the right place? a JIRA issue?), and in general I don't really have the 
bandwidth or the use cases right now to dive further into this rabbit hole. So, 
for now, I've just reverted the additions I made relating to `MemoryManager` 
stuff, and we (or other people) can delve into this if/when it becomes 
relevant. 976dcf6




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to