It's time to start thinking about pickle compatibility between 2.x and 3.0. The main problem is the 2.x str type -- it doesn't have a true equivalent in 3.0.
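
To make the problem concrete, here is a minimal sketch of how one 2.x-written pickle can be read in more than one way. The payload is an assumed sample: the protocol-2 bytes a 2.x program would write for the 8-bit str 'caf\xc3\xa9'. The calls rely on the `encoding` and `errors` keyword arguments that `pickle.loads` accepts in released Python 3 versions; treat this as an illustration of the trade-off, not a recommendation of any particular encoding.

```python
import pickle

# Assumed sample: the protocol-2 pickle a 2.x program would write for the
# 8-bit str 'caf\xc3\xa9' (the UTF-8 bytes of "café").
PY2_PICKLE = b"\x80\x02U\x05caf\xc3\xa9q\x00."

# Reading the 2.x str as text forces a choice of encoding, and the
# pickle itself gives no hint which one is right.
as_utf8 = pickle.loads(PY2_PICKLE, encoding="utf-8")
as_latin1 = pickle.loads(PY2_PICKLE, encoding="latin-1")

# Reading the 2.x str as binary data returns a bytes object instead.
as_bytes = pickle.loads(PY2_PICKLE, encoding="bytes")

# A wrong guess fails under strict error handling...
try:
    pickle.loads(PY2_PICKLE, encoding="ascii", errors="strict")
    strict_ascii_failed = False
except UnicodeDecodeError:
    strict_ascii_failed = True

# ...unless a more lenient errors setting is requested.
lenient = pickle.loads(PY2_PICKLE, encoding="ascii", errors="replace")

print(repr(as_utf8), repr(as_latin1), repr(as_bytes), strict_ascii_failed)
```

Note that the latin-1 call succeeds silently but yields different text than the UTF-8 call, so a successful load does not show that the encoding was the intended one.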
When 3.0 encounters a 'str' object in a pickle written by 2.x, it has two choices: trying to convert it to a 3.0 (unicode) str object by applying some encoding, or interpreting it as a 3.0 bytes object.

The latter would be trivial, but likely wrong, as the 2.x program that wrote the pickle would likely have meant it to be a text string (although there are certainly cases where binary data gets pickled as well, in which case bytes is of course the correct translation). Since in 3.0 bytes don't interact with text strings the way str interacts with unicode in 2.x, receiving bytes is somewhat inconvenient for the 3.0 program.

OTOH, applying an encoding gives us the painful choice of deciding what encoding to use -- the input pickle doesn't give us any hints, and as indicated we're not even sure that text was intended.

I could leave this all up to the 3.0 application, which would have to explicitly "fix up" any bytes in the pickle it receives if it wants to. Alternatively, I could add an encoding option to the pickle loading APIs (and for full flexibility an errors option as well) so that at least simple text-based applications might have a chance of reading, with minimal changes, the data that they themselves wrote before they were ported to 3.0 (only the unpickling calls would have to be modified).

Do people here think it's worth it? Think of any place where you currently are using pickles. What would your 3.0 porting strategy likely be? Would not having automatic decoding of pickled 8-bit strings be a major burden?

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
