It seems pickle keeps track of references for basic Python types:

    import pickle

    x = [1]
    y = [x]
    x, y = pickle.loads(pickle.dumps((x, y)))
    x.append(2)
    print(y)
    >>> [[1, 2]]

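A minimal sketch of why the list case works, using only the standard library: the sharing is tracked per pickling call, so it survives when both objects go through the same `dumps()` but not when they are pickled separately. The variable names below are just for illustration.

```python
import pickle

x = [1]
y = [x]

# One dumps() call: the pickler's memo records x the first time it is
# written, so the second occurrence is stored as a back-reference.
x1, y1 = pickle.loads(pickle.dumps((x, y)))
shared_in_one_call = y1[0] is x1

# Two dumps() calls: each call starts with an empty memo, so the two
# payloads know nothing about each other and x is serialized twice.
x2 = pickle.loads(pickle.dumps(x))
y2 = pickle.loads(pickle.dumps(y))
shared_across_calls = y2[0] is x2

print(shared_in_one_call, shared_across_calls)
```

Only the first round trip gives back a `y` whose element is the very same list as `x`.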
NumPy arrays behave differently: references are forgotten after pickle/unpickle, so shared objects do not remain shared. Based on the quote below, this could be considered a bug in numpy/pickle.

    Object sharing (references to the same object in different places):
    This is similar to self-referencing objects; pickle stores the object
    once, and ensures that all other references point to the master copy.
    Shared objects remain shared, which can be very important for mutable
    objects.

link <https://docs.python.org/2.0/lib/module-pickle.html>

Another example with ndarrays:

    import pickle
    import numpy as np

    x = np.arange(5)
    y = x[::-1]
    x, y = pickle.loads(pickle.dumps((x, y)))
    x[0] = 9
    print(y)
    >>> [4 3 2 1 0]

Before pickling, the two arrays share the exact same object for the data buffer (although "object" might not be the right word here).

On Tue, Oct 25, 2016 at 7:28 PM, Robert Kern <robert.k...@gmail.com> wrote:

> On Tue, Oct 25, 2016 at 3:07 PM, Stephan Hoyer <sho...@gmail.com> wrote:
> >
> > On Tue, Oct 25, 2016 at 1:07 PM, Nathaniel Smith <n...@pobox.com> wrote:
> >>
> >> Concretely, what do would you suggest should happen with:
> >>
> >> base = np.zeros(100000000)
> >> view = base[:10]
> >>
> >> # case 1
> >> pickle.dump(view, file)
> >>
> >> # case 2
> >> pickle.dump(base, file)
> >> pickle.dump(view, file)
> >>
> >> # case 3
> >> pickle.dump(view, file)
> >> pickle.dump(base, file)
> >>
> >> ?
> >
> > I see what you're getting at here. We would need a rule for when to
> > include the base in the pickle and when not to. Otherwise,
> > pickle.dump(view, file) always contains data from the base pickle, even
> > with view is much smaller than base.
> >
> > The safe answer is "only use views in the pickle when base is already
> > being pickled", but that isn't possible to check unless all the arrays are
> > together in a custom container. So, this isn't really feasible for NumPy.
>
> It would be possible with a custom Pickler/Unpickler since they already
> keep track of objects previously (un)pickled.
> That would handle [base, view] okay but not [view, base], so it's
> probably not going to be all that useful outside of special situations.
> It would make a neat recipe, but I probably would not provide it in
> numpy itself.
>
> --
> Robert Kern
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
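For what it's worth, here is a rough sketch of the kind of recipe Robert describes. It is only an illustration under some assumptions: it needs Python 3.8+ for `Pickler.reducer_override`, the class name `ViewAwarePickler` and the helper `dumps_with_views` are made up for this example and are not part of numpy, and it only handles plain `ndarray` views whose base is a C-contiguous `ndarray`. A view is rebuilt on top of its base only when the base has already gone through the same pickler; otherwise it is pickled as an independent copy, as usual.

```python
import io
import pickle

import numpy as np


class ViewAwarePickler(pickle.Pickler):
    """Hypothetical pickler that keeps views attached to an already-pickled base."""

    def __init__(self, file, protocol=None):
        super().__init__(file, protocol)
        # id(array) -> array, for every data-owning array handled so far.
        # Holding the array itself keeps its id() from being reused.
        self._owners = {}

    def reducer_override(self, obj):
        if type(obj) is not np.ndarray:
            return NotImplemented
        base = obj.base
        if base is None:
            # obj owns its data: remember it, then pickle it the normal way.
            self._owners[id(obj)] = obj
            return NotImplemented
        if (type(base) is np.ndarray and base.flags.c_contiguous
                and id(base) in self._owners):
            # Byte offset of the view's first element inside the base buffer.
            offset = (obj.__array_interface__["data"][0]
                      - base.__array_interface__["data"][0])
            # Rebuild as np.ndarray(shape, dtype, buffer, offset, strides).
            # Pickling `base` here hits the pickler's memo, so no data is copied.
            return np.ndarray, (obj.shape, obj.dtype, base, offset, obj.strides)
        # Base not seen yet (the [view, base] order): fall back to a copy.
        return NotImplemented


def dumps_with_views(obj):
    """Hypothetical helper: pickle obj to bytes with ViewAwarePickler."""
    buf = io.BytesIO()
    ViewAwarePickler(buf, protocol=pickle.HIGHEST_PROTOCOL).dump(obj)
    return buf.getvalue()


x = np.arange(5)
y = x[::-1]

# [base, view] order: the view is rebuilt on top of the restored base.
x1, y1 = pickle.loads(dumps_with_views((x, y)))
x1[0] = 9
base_first_shared = np.shares_memory(x1, y1)

# [view, base] order: the base has not been seen yet, so the view is copied.
y2, x2 = pickle.loads(dumps_with_views((y, x)))
view_first_shared = np.shares_memory(x2, y2)

print(base_first_shared, view_first_shared)
```

No custom Unpickler is needed in this sketch, because the view is stored as an ordinary call to `np.ndarray` whose buffer argument is a memo reference to the base. As Robert says, it only helps for the [base, view] order, and separate `pickle.dump` calls (cases 2 and 3 above) each start with a fresh memo, so they would not share anything either.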