On Tue, Jan 27, 2009 at 10:43 AM, "Martin v. Löwis" <mar...@v.loewis.de> wrote:
>> Interning the strings on unpickling makes the pickles smaller, and at
>> least for cPickle actually makes unpickling sequences of many objects
>> slightly faster.  I have included proposed patches to cPickle.c and
>> pickle.py, and would appreciate any feedback.
>
> Please submit patches always to the bug tracker.
>
> On the proposed change: While it is fairly unintrusive, I would like to
> propose a different approach - pickle interned strings specially. The
> marshal module already uses this approach, and it should extend to
> pickle (although it would probably require a new protocol).
>
> On pickling, inspect each string and check whether it is interned. If
> so, emit a different code, and record it into the object id dictionary.
> On a second occurrence of the string, only pickle a backward reference.
> (Alternatively, check whether pickling the same string a second time
> would be more compact).
>
> On unpickling, support the new code to intern the result strings;
> subsequent references to it will go to the standard backreferencing
> algorithm.
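
For reference, the backward-reference mechanism mentioned above can be
observed with the existing pickle memo. This is only a rough sketch in
present-day Python (sys.intern rather than the old builtin); the string
contents and the helper name build() are made up for illustration, and
it does not implement the proposed new opcode.

```python
import pickle
import sys


def build(parts):
    # Hypothetical helper: join() returns a newly built str, so two calls
    # give equal strings that are separate objects in CPython.
    return "".join(parts)


a = build(["customer", "_account_id"])
b = build(["customer", "_account_id"])

# Equal but distinct objects: the pickler memoizes by object identity,
# so the string data is written twice.
separate = pickle.dumps([a, b], protocol=2)

# After interning, both list items are the same object, so the second
# occurrence is written as a short memo reference instead.
shared = pickle.dumps([sys.intern(a), sys.intern(b)], protocol=2)

print(len(separate), len(shared))
```

The second pickle should come out shorter, which is the size saving
under discussion.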

Hm. This would change the pickling format though. Wouldn't just
interning (short) strings on unpickling be simpler?
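
A minimal sketch of what interning on unpickling could look like,
written as a post-processing pass in present-day Python rather than as
a change inside the unpickler itself. The names intern_short_strings,
loads_interned and the 32-character cutoff are assumptions made for
illustration, not part of the proposed patches.

```python
import pickle
import sys

MAX_INTERN_LEN = 32  # assumed cutoff for what counts as a "short" string


def intern_short_strings(obj, max_len=MAX_INTERN_LEN):
    """Rebuild plain containers, replacing short strings by interned ones."""
    if type(obj) is str:
        return sys.intern(obj) if len(obj) <= max_len else obj
    if type(obj) is list:
        return [intern_short_strings(x, max_len) for x in obj]
    if type(obj) is tuple:
        return tuple(intern_short_strings(x, max_len) for x in obj)
    if type(obj) is dict:
        return {
            intern_short_strings(k, max_len): intern_short_strings(v, max_len)
            for k, v in obj.items()
        }
    return obj  # other types are left untouched in this sketch


def loads_interned(data, max_len=MAX_INTERN_LEN):
    # The pickle format is unchanged; only the loaded result is processed.
    return intern_short_strings(pickle.loads(data), max_len)


# Records whose keys are equal strings built as separate objects.
records = [{"".join(["na", "me"]): i} for i in range(3)]
restored = loads_interned(pickle.dumps(records))
keys = [next(iter(d)) for d in restored]
```

After loading, every entry in keys refers to the same string object, and
no new protocol is needed to read or write the data.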

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev