On 20/09/13 15:33, Benjamin Peterson wrote:
> Well, the pickler should memoize bytes objects if you have lots of
> the same one in a pickle...

Only if they are the very same object, not different bytes objects
with the same value. Pickle doesn't compare "a==b" but "id(a)==id(b)".

Yes, I know that "a==b" would break mutable objects. It is just an
example. I don't want to pursue that path; pickle is already
appallingly slow.

In my project I will do the redundancy removal in my own way, as
explained in another message in this thread. Example:

* Original pickle: 14416284 bytes

* Pickle with "interned" strings: 3004880 bytes (quite an improvement,
  but this is particular to my case: I have a lot of string
  duplication here. The pickle also loads a bit faster.)

* Pickle including an extra dictionary of "interned" strings, created
  using the "interned.setdefault(object,object)" pattern: 5126587
  bytes. Sniff. Could I do this more compactly?

-- 
Jesús Cea Avión
j...@jcea.es - http://www.jcea.es/
Twitter: @jcea
jabber / xmpp:j...@jabber.org
"Things are not so easy"
"My name is Dump, Core Dump"
"Love is placing your happiness in the happiness of another" - Leibniz

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
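
For readers following the thread, the identity-based memoization and the
setdefault "interning" pattern discussed in the message can be sketched
roughly as below. This is only an illustration, assuming CPython 3 and the
default pickle protocol; the helper name intern_values and the sample
payload are made up for the example and are not taken from the original
project, and the sizes will differ from the figures quoted in the message.

```python
import pickle


def intern_values(items, table=None):
    """Return a list in which equal values share a single object.

    Hypothetical helper: uses the table.setdefault(obj, obj) pattern,
    so the first object seen for each value becomes the canonical one.
    """
    if table is None:
        table = {}
    return [table.setdefault(item, item) for item in items]


# Many bytes objects with the same value but different identities.
# bytes(bytearray(...)) builds a fresh object on every call.
distinct = [bytes(bytearray(b"some repeated payload")) for _ in range(1000)]

# Same values, but every element is now the very same object.
shared = intern_values(distinct)

# The pickler memoizes by identity, so only the shared list benefits:
# repeated references become short memo lookups instead of full copies.
size_distinct = len(pickle.dumps(distinct))
size_shared = len(pickle.dumps(shared))

print("distinct objects:", size_distinct, "bytes")
print("shared objects:  ", size_shared, "bytes")
```

Note that the interning table is only used while building the data and is
not itself pickled in this sketch; whether that is workable depends on how
the real data is produced.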