> Interning the strings on unpickling makes the pickles smaller, and at
> least for cPickle actually makes unpickling sequences of many objects
> slightly faster. I have included proposed patches to cPickle.c and
> pickle.py, and would appreciate any feedback.

Please always submit patches to the bug tracker.

On the proposed change: while it is fairly unintrusive, I would like to
propose a different approach: pickle interned strings specially. The
marshal module already uses this approach, and it should extend to
pickle (although it would probably require a new protocol).

On pickling, inspect each string and check whether it is interned. If
so, emit a different code, and record the string in the object id
dictionary. On a second occurrence of the string, only pickle a
backward reference. (Alternatively, check whether pickling the same
string a second time would be more compact.)

On unpickling, support the new code by interning the resulting string;
subsequent references to it will go through the standard backreferencing
algorithm.

Regards,
Martin
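
The scheme described above can be illustrated with a toy encoder. This is only a sketch under stated assumptions, not the pickle format or a proposed patch: the opcode names ISTR, STR and REF, the functions toy_dump and toy_load, and the is_interned predicate are all hypothetical. Pure Python has no portable way to ask whether a string is interned (the C code can check a flag on the string object), so the caller supplies the predicate here. The toy memo tracks only interned strings, whereas the real pickler memoizes other objects as well.

```python
import sys


def toy_dump(strings, is_interned):
    """Encode a list of strings as (opcode, payload) tuples.

    is_interned is a caller-supplied predicate (an assumption of this
    sketch; the C implementation would inspect the string object itself).
    """
    memo = {}  # id(string) -> memo index, standing in for the object id dictionary
    ops = []
    for s in strings:
        if id(s) in memo:
            # Second occurrence: emit only a backward reference.
            ops.append(("REF", memo[id(s)]))
        elif is_interned(s):
            # First occurrence of an interned string: special opcode + memo entry.
            memo[id(s)] = len(memo)
            ops.append(("ISTR", s))
        else:
            ops.append(("STR", s))
    return ops


def toy_load(ops):
    """Decode the tuples produced by toy_dump."""
    memo = []  # memo index -> string
    out = []
    for opcode, payload in ops:
        if opcode == "ISTR":
            s = sys.intern(payload)  # re-intern on load
            memo.append(s)
            out.append(s)
        elif opcode == "REF":
            out.append(memo[payload])  # ordinary backreference lookup
        else:
            out.append(payload)
    return out


key = sys.intern("customer_name")
data = [key, "free text", key, key]
ops = toy_dump(data, is_interned=lambda s: s is key)
loaded = toy_load(ops)
```

In this run the repeated key is written once as ISTR and then twice as REF, and the loaded list holds the same string object at each of those positions.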