This is a request for comments from the Arrow developer community. I’m reaching out to make the Arrow community aware of work that my team at Micron has recently open-sourced.

Because of the Compute Express Link (CXL) standard, sharable disaggregated memory is coming: memory that is shared by multiple nodes in a cluster. Arrow and the other zero-copy formats are a great fit for shared memory if a natural enough access method emerges.
That’s where famfs comes in. (Famfs stands for Fabric-Attached Memory File System.) Famfs supports formatting shared memory as a file system that can be simultaneously mounted from multiple hosts. Putting zero-copy data frames in famfs files allows jobs across a cluster to memory-map data frames from a single copy in shared memory. This has the potential to deduplicate memory while reducing or avoiding sharding and shuffling overheads.

Famfs files can be memory-mapped and used without awareness that the files are “special” (though creating famfs files does require special steps). Memory mapping a famfs file provides direct access to the memory, with no copying through the page cache.

Famfs was published in February as a Linux kernel patch <https://lore.kernel.org/linux-fsdevel/ze5edu3jblefw...@dread.disaster.area/T/#m27639915e97443186b3ade9d1e94423bc58e6e22> and a user space CLI and library; all are available on GitHub <https://github.com/cxl-micron-reskit/famfs/blob/master/README.md>. The kernel patch set has been received seriously; if legitimate use cases are demonstrated, we expect it will make its way into mainline Linux, and we intend to step up and maintain it.

Famfs is already usable with shared disaggregated memory (though this memory is not commercially available yet). Conventional memory can be shared among virtual machines today, to build (admittedly scaled down) POCs.

I am looking for the following feedback:

- Any questions are welcome, on or off-list.
- Please tell us what sorts of workflows you might try with famfs shared memory if you had it. We are looking for ways to demonstrate use cases.
- Help us get the word out. Are there people, groups, forums or conferences where we should introduce this capability?
- If you are interested in testing famfs, please do, and let me know how we can help.

Micron’s interest is in enabling an ecosystem where shared memory is practically usable. If famfs is successful, other access methods will surely emerge.
Famfs is our attempt to enable shared memory via an existing, ubiquitous interface, making it easy to use without having to adopt new abstractions in advance.

Thanks for reading,
John Groves
Micron