On Sat, 2016-02-06 at 16:56 -0600, Elliot Hallmark wrote:
> Hi all,
>
> I have a program that uses resizable arrays. I already over-provision
> the arrays and use slices, but every now and then the data outgrows
> that array and it needs to be resized.
>
> Now, I would like to have these arrays shared between processes
> spawned via multiprocessing (for fast interprocess communication
> purposes, not for parallelizing work on an array). I don't care
> about mapping to a file on disk, and I don't want disk I/O happening.
> I don't care (really) about data being copied in memory on resize.
> I *do* want the array to be resized "in place", so that the child
> processes can still access the arrays from the object they were
> initialized with.
>
> I can share arrays easily using arrays that are backed by memmap,
> i.e.:
>
> ```
> # Source: http://github.com/rainwoodman/sharedmem
> import mmap
> import numpy
>
> class anonymousmemmap(numpy.memmap):
>     def __new__(subtype, shape, dtype=numpy.uint8, order='C'):
>         descr = numpy.dtype(dtype)
>         _dbytes = descr.itemsize
>
>         shape = numpy.atleast_1d(shape)
>         size = 1
>         for k in shape:
>             size *= k
>
>         bytes = int(size*_dbytes)
>
>         if bytes > 0:
>             mm = mmap.mmap(-1, bytes)
>         else:
>             mm = numpy.empty(0, dtype=descr)
>         self = numpy.ndarray.__new__(subtype, shape, dtype=descr,
>                                      buffer=mm, order=order)
>         self._mmap = mm
>         return self
>
>     def __array_wrap__(self, outarr, context=None):
>         return numpy.ndarray.__array_wrap__(
>             self.view(numpy.ndarray), outarr, context)
> ```
>
> This cannot be resized because it does not own its own data
> (ValueError: cannot resize this array: it does not own its data).
> (numpy.memmap has this same issue [0], even if I set refcheck to
> False and even though the docs say otherwise [1]).
>
> arr._mmap.resize(x) fails because it is anonymous (error: [Errno 9]
> Bad file descriptor). If I create a file and use that fileno to
> create the memmap, then I can resize `arr._mmap` but the array itself
> is not resized.
>
> Is there a way to accomplish what I want? Or, do I just need to
> figure out a way to communicate new arrays to the child processes?
>
I guess the answer is no, but the first question should be whether you
can create a new array viewing the same data that is just larger. Since
you have the mmap, that would be creating a new view into it.

I.e. your "array" would be the memmap, and to use it, you always rewrap
it into a new numpy array.

Other than that, you would have to mess with the internal ndarray
structure, since this kind of operation appears rather unsafe.

- Sebastian

> Thanks,
> Elliot
>
> [0] https://github.com/numpy/numpy/issues/4198
>
> [1] http://docs.scipy.org/doc/numpy/reference/generated/numpy.memmap.resize.html
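
A minimal sketch of the rewrapping idea from the reply, under some assumptions that are not in the thread: the maximum capacity is fixed up front, the first 8 bytes of the anonymous mmap hold the current length so that every process sees a "resize", and the children are created by fork on a Unix-like system so that they inherit the mapping. The class name `SharedGrowable` and its methods are hypothetical, and there is no locking.

```python
import mmap

import numpy as np


class SharedGrowable:
    """Hypothetical helper: one over-provisioned anonymous mmap that is
    re-wrapped into a fresh ndarray view whenever it is used."""

    HEADER = 8  # bytes reserved at the start for the current length (int64)

    def __init__(self, capacity, dtype=np.float64):
        self.dtype = np.dtype(dtype)
        self.capacity = int(capacity)
        if self.capacity <= 0:
            raise ValueError("capacity must be positive")
        nbytes = self.HEADER + self.capacity * self.dtype.itemsize
        # Anonymous mapping: no file on disk; the memory starts zero-filled,
        # so the stored length is initially 0.
        self._mmap = mmap.mmap(-1, nbytes)
        # One-element view onto the header that stores the current length.
        self._length = np.ndarray((1,), dtype=np.int64, buffer=self._mmap)

    def resize(self, n):
        """Change the logical length; no data is moved or copied."""
        if not 0 <= n <= self.capacity:
            raise ValueError("requested length exceeds the fixed capacity")
        self._length[0] = n

    def array(self):
        """Return a new ndarray view of the current logical length."""
        n = int(self._length[0])
        return np.ndarray((n,), dtype=self.dtype,
                          buffer=self._mmap, offset=self.HEADER)


buf = SharedGrowable(capacity=10)
buf.resize(3)
a = buf.array()
a[:] = [1.0, 2.0, 3.0]

buf.resize(5)      # "grow in place": only the stored length changes
b = buf.array()    # rewrap: a new, larger view over the same memory
```

Views taken before a resize, such as `a` here, keep their old shape, so each process would call `array()` again whenever it needs the current extent rather than holding on to a view.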
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion