Hey Wes, I just wanted to check in on this work. Have there been any updates to the Arrow "data frame" project worth sharing?

Thanks,
Eric

-----Original Message-----
From: Wes McKinney <wesmck...@gmail.com>
Sent: Tuesday, May 21, 2019 8:17 AM
To: dev@arrow.apache.org
Subject: Re: [DISCUSS] Developing a "data frame" subproject in the Arrow C++ libraries

On Tue, May 21, 2019, 8:43 AM Antoine Pitrou <anto...@python.org> wrote:

>
> On 21/05/2019 at 13:42, Wes McKinney wrote:
> > hi Antoine,
> >
> > On Tue, May 21, 2019 at 5:48 AM Antoine Pitrou <anto...@python.org> wrote:
> >>
> >> Hi Wes,
> >>
> >> How does copy-on-write play together with memory-mapped data? It
> >> seems that, depending on whether the memory map has several
> >> concurrent users (a condition which may be timing-dependent), we
> >> will either persist changes on disk or make them ephemeral in
> >> memory. That doesn't sound very user-friendly, IMHO.
> >
> > With memory-mapping, any Buffer is sliced from the parent MemoryMap
> > [1] so mutating the data on disk using this interface wouldn't be
> > possible with the way that I've framed it.
>
> Hmm... I always forget that SliceBuffer returns a read-only view.
>

The more important issue is that parent_ is non-null. The idea is that no
mutation is allowed if we reason that another Buffer object has access to
the address space of interest. I think this style of copy-on-write is a
reasonable compromise that prevents most kinds of defensive copying.

> Regards
>
> Antoine.
>
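
For readers following the thread, the copy-on-write rule described above (no in-place mutation when a buffer has a parent, or when another Buffer may see the same address space) can be illustrated with a small toy model. This is only a sketch in Python, not the Arrow C++ API: `ToyBuffer`, `slice`, `is_mutable`, and `set_byte` are hypothetical names invented for illustration, and the sharing check is a simple counter rather than whatever mechanism the real proposal would use.

```python
class ToyBuffer:
    """Hypothetical stand-in for a buffer; NOT the Arrow Buffer class."""

    def __init__(self, storage, offset=0, length=None, parent=None):
        self._storage = storage  # bytearray, possibly shared with slices
        self._offset = offset
        self._length = len(storage) - offset if length is None else length
        self.parent = parent     # non-None for a sliced view
        self._slices = 0         # slices handed out (toy: never decremented)

    def slice(self, offset, length):
        """Return a view over the same storage, remembering its parent."""
        self._slices += 1
        return ToyBuffer(self._storage, self._offset + offset, length, parent=self)

    def is_mutable(self):
        # Assumption: in-place mutation is safe only when this buffer is
        # not a view and no view of it has been handed out.
        return self.parent is None and self._slices == 0

    def to_bytes(self):
        return bytes(self._storage[self._offset:self._offset + self._length])


def set_byte(buf, index, value):
    """Set one byte, mutating in place when safe and copying otherwise."""
    if not 0 <= index < buf._length:
        raise IndexError(index)
    if buf.is_mutable():
        buf._storage[buf._offset + index] = value
        return buf
    # Another buffer may observe this memory: write to a private copy.
    fresh = ToyBuffer(bytearray(buf.to_bytes()))
    fresh._storage[index] = value
    return fresh


owner = ToyBuffer(bytearray(b"abcd"))
same = set_byte(owner, 0, ord("z"))      # sole owner: mutated in place
view = owner.slice(1, 2)                 # view has a parent
copied = set_byte(view, 0, ord("Q"))     # shared: a copy is modified instead
```

In this toy model a slice of a memory-mapped region would always have a parent, so writes through it would never reach the underlying file; they would land in a private copy.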