Guido van Rossum wrote:
> I've been thinking a bit about a focus for the 2.6 release.
>
> We are now officially starting parallel development of 2.6 and 3.0. I
> really don't expect that we'll be able to merge the easily into the
> 3.0 branch much longer, so effectively 3.0 will be a fork of 2.5.
>
> I wonder if it would make sense to focus in 2.6 on making porting of
> 2.6 code to 3.0 easier, rather than trying to introduce new features
> in 2.6. We've done releases without new language features before;
> notable 2.3 didn't add anything new (except making a few __future__
> imports redundant) and concentrated on bugfixes, performance, and
> library additions.

I've been thinking about the transition to unicode strings, and I want
to put forward a notion that might allow the transition to be done
gradually instead of all at once.

The idea would be to temporarily introduce a new name for 8-bit
strings - let's call it "ascii". An "ascii" object would be exactly the
same as today's 8-bit strings.

The 'str' builtin symbol would be bound to 'ascii' by default, but you
could rebind it to 'unicode' if you wanted to default to wide strings:

    str = ascii      # Selects 8-bit strings by default
    str = unicode    # Selects unicode strings by default

In order to make the transition, what you would do is temporarily
remove the 'str' symbol from the builtin namespace, and then migrate
all of the code -- replacing any library reference to 'str' with a
reference to 'ascii' *or* updating that function to deal with unicode
strings. Any code that still refers to 'str' will fail while the name
is missing, which is what tells you it has not been migrated yet.

Once you get all of the unit tests running again, you can re-introduce
'str', but now you know that since none of the libraries refer to 'str'
directly, you can safely change its definition.

All of this could be done while retaining compatibility with existing
3rd party code - as long as 'str = ascii' is defined. So you turn it on
to run your Python programs, and turn it off when you want to work on
3.0 migration.

The next step (which would not be backwards compatible) would be to
gradually remove 'ascii' from the code base -- wherever that name
occurs, it would be a signal that the function needs to be updated to
use 'unicode' instead.

Finally, once the last occurrence of 'ascii' is removed, the final step
is to do a search and replace of all occurrences of 'unicode' with
'str'.

I know this seems roundabout, and is more work than doing it all in one
shot. However, I know from past experience that the trickiest part of
doing a pervasive change to a code base like this is just keeping track
of what parts have been migrated and what parts have not.

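To make the "vacate the name" step concrete, here is a rough sketch of
how one might detect code that still refers to 'str'. It is only an
illustration, not a proposed implementation: it runs on a current
Python 3 interpreter, so 'bytes' and the built-in 'str' stand in for
the proposed 'ascii' and 'unicode' types, and the names
make_namespace, references_str, UNMIGRATED and MIGRATED are all made up
for the example.

```python
import builtins

# Stand-ins (assumptions): bytes plays the proposed 'ascii' type,
# and Python 3's str plays 'unicode'.
ascii_type = bytes
unicode_type = str


def make_namespace(default_str=None):
    """Return globals in which 'str' is vacated, or rebound to default_str."""
    ns_builtins = dict(vars(builtins))
    del ns_builtins["str"]                 # vacate the old name
    ns_builtins["ascii"] = ascii_type      # shadows Python 3's ascii() here only
    ns_builtins["unicode"] = unicode_type
    if default_str is not None:
        ns_builtins["str"] = default_str   # re-introduce 'str' with a chosen meaning
    return {"__builtins__": ns_builtins}


# Two hypothetical library functions: one not yet migrated, one migrated.
UNMIGRATED = "def label(x):\n    return str(x)\n"
MIGRATED = "def label(x):\n    return unicode(x)\n"


def references_str(source):
    """True if calling label() fails because 'str' has been vacated."""
    ns = make_namespace()
    exec(source, ns)
    try:
        ns["label"](42)
    except NameError:
        return True
    return False


if __name__ == "__main__":
    print(references_str(UNMIGRATED))
    print(references_str(MIGRATED))
```

With 'str' missing, the unmigrated function raises NameError as soon as
it is called, so a test run points directly at the code that still
needs attention. Passing default_str=unicode_type to make_namespace
corresponds to re-introducing 'str' once nothing depends on the old
meaning.
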
Many times in the past I've changed the definition of a ubiquitous type
by temporarily renaming it, thus vacating the old name so that it can
be defined anew, without conflict.

-- Talin