[issue5084] unpickling does not intern attribute names
Changes by Antoine Pitrou pit...@free.fr: -- stage: needs patch - patch review ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5084 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5084] unpickling does not intern attribute names
Antoine Pitrou pit...@free.fr added the comment: Thanks, I'll take a look very soon. -- assignee: alexandre.vassalotti - pitrou ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5084 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5084] unpickling does not intern attribute names
Antoine Pitrou pit...@free.fr added the comment: Committed in r72223, r72224. Thanks! -- resolution: - fixed status: open - closed ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5084 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5084] unpickling does not intern attribute names
Antoine Pitrou pit...@free.fr added the comment: (s/string/still/, of course...) -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5084 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5084] unpickling does not intern attribute names
Antoine Pitrou pit...@free.fr added the comment: Jake, are you string working on a test? -- priority: - normal stage: - needs patch versions: +Python 2.7, Python 3.1 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5084 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5084] unpickling does not intern attribute names
Jake McGuire j...@youtube.com added the comment: This fell through the cracks. But the following unit test seems like it does the right thing (fails with the old module, works with the new ones). -- Added file: http://bugs.python.org/file13835/pickletester.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5084 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5084] unpickling does not intern attribute names
Jake McGuire j...@youtube.com added the comment: Ugh. Clearly I didn't check to see if it worked or not, as it didn't even compile. A new diff, tested and verified to work, is attached. I'll work on creating a test. Added file: http://bugs.python.org/file13178/cPickle.c.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5084 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5084] unpickling does not intern attribute names
Changes by Jake McGuire j...@youtube.com: Removed file: http://bugs.python.org/file12882/cPickle.c.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5084 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5084] unpickling does not intern attribute names
Changes by Jake McGuire j...@youtube.com: Removed file: http://bugs.python.org/file13169/cPickle.c.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5084 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5084] unpickling does not intern attribute names
Jake McGuire j...@youtube.com added the comment: The fromstring/asstring dance was due to my incomplete understanding of refcounting. PyDict_Next returns a borrowed reference but PyString_InternInPlace expects an owned reference. Thanks to Kirk McDonald, I have a new patch that does the refcounting correctly. What sort of test did you have in mind? Added file: http://bugs.python.org/file13169/cPickle.c.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5084 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5084] unpickling does not intern attribute names
Antoine Pitrou pit...@free.fr added the comment: In your patch, I'm not sure where the `name` variable is coming from. Have you checked it works? ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5084 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5084] unpickling does not intern attribute names
Antoine Pitrou pit...@free.fr added the comment: To give an example of what the test could check: class C(object): ... def __init__(self): ...self.some_long_attribute_name = 5 ... c = C() c.__dict__ {'some_long_attribute_name': 5} sorted(map(id, c.__dict__)) [140371243499696] import pickle d = pickle.loads(pickle.dumps(c, -1)) d __main__.C object at 0x7faaba1b0390 d.__dict__ {'some_long_attribute_name': 5} sorted(map(id, d.__dict__)) [140371243501232] The `sorted(map(id, d.__dict__))` should have been the same before and after pickling. ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5084 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5084] unpickling does not intern attribute names
Changes by Collin Winter coll...@gmail.com: -- nosy: +collinwinter, jyasskin ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5084 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5084] unpickling does not intern attribute names
Antoine Pitrou pit...@free.fr added the comment: Why do you call PyString_AsString() followed by PyString_FromString()? Strings are immutable so you shouldn't neek to take a copy. Besides, it would be nice to add a test. -- nosy: +pitrou ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5084 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5084] unpickling does not intern attribute names
Changes by Jesús Cea Avión j...@jcea.es: -- nosy: +jcea ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5084 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5084] unpickling does not intern attribute names
New submission from Jake McGuire j...@youtube.com: Instance attribute names are normally interned - this is done in PyObject_SetAttr (among other places). Unpickling (in pickle and cPickle) directly updates __dict__ on the instance object. This bypasses the interning so you end up with many copies of the strings representing your attribute names, which wastes a lot of space, both in RAM and in pickles of sequences of objects created from pickles. Note that the native python memcached client uses pickle to serialize objects. import pickle class C(object): ... def __init__(self, x): ... self.long_attribute_name = x ... len(pickle.dumps([pickle.loads(pickle.dumps(C(None), pickle.HIGHEST_PROTOCOL)) for i in range(100)], pickle.HIGHEST_PROTOCOL)) 3658 len(pickle.dumps([C(None) for i in range(100)], pickle.HIGHEST_PROTOCOL)) 1441 Interning the strings on unpickling makes the pickles smaller, and at least for cPickle actually makes unpickling sequences of many objects slightly faster. I have included proposed patches to cPickle.c and pickle.py, and would appreciate any feedback. -- components: Library (Lib) files: cPickle.c.diff keywords: patch messages: 80670 nosy: jakemcguire severity: normal status: open title: unpickling does not intern attribute names type: resource usage Added file: http://bugs.python.org/file12879/cPickle.c.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5084 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5084] unpickling does not intern attribute names
Changes by Jake McGuire j...@youtube.com: Added file: http://bugs.python.org/file12880/pickle.py.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5084 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5084] unpickling does not intern attribute names
Gabriel Genellina gagsl-...@yahoo.com.ar added the comment: Either my browser got crazy, or you uploaded the same patch (.py) twice. -- nosy: +gagenellina ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5084 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5084] unpickling does not intern attribute names
Changes by Jake McGuire j...@youtube.com: Removed file: http://bugs.python.org/file12879/cPickle.c.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5084 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5084] unpickling does not intern attribute names
Changes by Jake McGuire j...@youtube.com: Added file: http://bugs.python.org/file12882/cPickle.c.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5084 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5084] unpickling does not intern attribute names
Alexandre Vassalotti alexan...@peadrop.com added the comment: The patch for cPickle doesn't do the same thing as the pickle one. In the cPickle one, you are only interning slot attributes, which, I believe, is not what you intended. :-) -- assignee: - alexandre.vassalotti nosy: +alexandre.vassalotti ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5084 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5084] unpickling does not intern attribute names
Jake McGuire j...@youtube.com added the comment: Are you sure? I may not have enough context in my diff, which I should have done against an anonymous svn checkout, but it looks like the slot attributes get set several lines after my diff. while (PyDict_Next(slotstate, ...)) as opposed to the while (PyDict_Next(state, ...)) in my change... -jake On Jan 27, 2009, at 6:54 PM, Alexandre Vassalotti wrote: Alexandre Vassalotti alexan...@peadrop.com added the comment: The patch for cPickle doesn't do the same thing as the pickle one. In the cPickle one, you are only interning slot attributes, which, I believe, is not what you intended. :-) -- assignee: - alexandre.vassalotti nosy: +alexandre.vassalotti ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5084 ___ ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5084 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue5084] unpickling does not intern attribute names
Alexandre Vassalotti alexan...@peadrop.com added the comment: Oh, you are right. I was looking at py3k's version of pickle, which uses PyDict_Update instead of a while loop; that is why I got confused. ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5084 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com