willtemperley commented on issue #154: URL: https://github.com/apache/arrow-swift/issues/154#issuecomment-4204021628
@abandy, thanks for the feedback! To clarify, when I mentioned IPC, I was specifically referring to the [Arrow IPC Format](https://arrow.apache.org/docs/cpp/api/ipc.html#reading-ipc-streams-and-files).

Yes, ArrowBuffer is flexible in terms of the data it points to (heterogeneity). However, my concern is with the backing storage and lifecycle management. Currently, ArrowBuffer is hardcoded to be either a pointer to memory that Swift must deallocate or a pointer whose lifecycle the user manages. This creates a bottleneck for "true" zero-copy in two major scenarios:

- Memory-Mapped Files: We currently cannot wrap a memory-mapped file (e.g., via Data(contentsOf:options: .mappedIfSafe)) in an ArrowBuffer without losing the ability for the buffer to manage its own lifecycle. If we want to read a large IPC file without loading it into the heap, we need a buffer type that knows to unmap, or to hold a reference to the mapping object, rather than just calling deallocate().
- Custom Memory Pools: Unlike Arrow C++, which uses a MemoryManager abstraction to allow buffers to reside on different devices (like GPUs) or in specialized allocators, the Swift ArrowBuffer is strictly tied to a CPU raw pointer.

In short: Arrow C++ allows different Buffer implementations (like PoolBuffer, MmapBuffer, or CudaBuffer), while ArrowSwift currently has only one Buffer implementation, which is tied to a heap pointer.

I believe we need to move toward a design (likely a class-based hierarchy or an ownership-tracking model) that allows these different storage backends to exist under the same ArrowBuffer interface used by ArrowData.

I’d be happy to meet and discuss how we can modernize this to support these use cases while keeping the API stable for C interop!
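To make the ownership-tracking idea a bit more concrete, here is a minimal sketch of how two storage backends could sit behind one buffer interface. Every name in it (`BufferStorage`, `HeapStorage`, `MappedFileStorage`, `BufferView`, `byteSum`) is hypothetical and does not exist in ArrowSwift today. It also assumes `NSData` rather than `Data` for the mapping, because `NSData.bytes` stays valid for as long as the object is alive.

```swift
import Foundation

// Hypothetical sketch only: none of these types are part of ArrowSwift.

/// An owner of bytes. Releasing the last reference releases the memory
/// in whatever way is appropriate for the backend.
protocol BufferStorage: AnyObject {
    var baseAddress: UnsafeRawPointer { get }
    var length: Int { get }
}

/// Heap memory that Swift allocated and therefore must deallocate.
final class HeapStorage: BufferStorage {
    private let pointer: UnsafeMutableRawPointer
    let length: Int

    init(copying bytes: [UInt8]) {
        // Allocate at least one byte so the pointer is always valid.
        let raw = UnsafeMutableRawPointer.allocate(byteCount: max(bytes.count, 1), alignment: 64)
        bytes.withUnsafeBytes { source in
            if let base = source.baseAddress {
                raw.copyMemory(from: base, byteCount: bytes.count)
            }
        }
        pointer = raw
        length = bytes.count
    }

    var baseAddress: UnsafeRawPointer { UnsafeRawPointer(pointer) }

    deinit { pointer.deallocate() }
}

/// A file mapping. Nothing is deallocated by hand: dropping the NSData
/// reference lets Foundation release the mapping.
final class MappedFileStorage: BufferStorage {
    private let mapping: NSData

    init(url: URL) throws {
        mapping = try NSData(contentsOf: url, options: .mappedIfSafe)
    }

    var baseAddress: UnsafeRawPointer { mapping.bytes }
    var length: Int { mapping.length }
}

/// A slice of some storage. It holds a strong reference to its owner,
/// so the bytes stay alive for as long as any view exists.
struct BufferView {
    let storage: any BufferStorage
    let offset: Int
    let length: Int

    init(storage: any BufferStorage, offset: Int = 0, length: Int? = nil) {
        let resolved = length ?? (storage.length - offset)
        precondition(offset >= 0 && resolved >= 0 && offset + resolved <= storage.length,
                     "view is out of bounds")
        self.storage = storage
        self.offset = offset
        self.length = resolved
    }

    func withUnsafeBytes<R>(_ body: (UnsafeRawBufferPointer) throws -> R) rethrows -> R {
        try withExtendedLifetime(storage) {
            try body(UnsafeRawBufferPointer(start: storage.baseAddress + offset, count: length))
        }
    }
}

/// Consumers see only BufferView and never need to know which backend is behind it.
func byteSum(_ view: BufferView) -> Int {
    view.withUnsafeBytes { bytes in
        bytes.reduce(0) { total, byte in total + Int(byte) }
    }
}
```

With this shape, a view over `MappedFileStorage` and a view over `HeapStorage` can be passed to the same code, and a GPU or pool-backed storage could in principle be added as another conforming class. Whether a protocol, a class hierarchy, or something closer to the C data interface's release callback fits ArrowSwift best is exactly what I'd like to discuss.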
