On 6/13/07, Ron Adam <[EMAIL PROTECTED]> wrote:
> Well I can see where a str8() type with an __incoded_with__ attribute could
> be useful.  It would use a bit more memory, but it won't be the
> default/primary string type anymore so maybe it's ok.
>
> Then bytes can be bytes, and unicode can be unicode, and str8 can be
> encoded strings for interfacing with the outside non-unicode world.  Or
> something like that. <shrug>

Hm... Requiring each str8 instance to have an encoding might be a
problem -- it means you can't just create one from a bytes object.
What would be the use of this information? What would happen on
concatenation? On slicing? (Slicing can break the encoding!)

> Attached both the str8 repr as s"..." and s'...', and the latest
> no_raw_escape patch which I think is complete now and should apply with no
> problems.

I like the str8 repr patch enough to check it in.

> I tracked the random fails I am having in test_tokenize.py down to it doing
> a round trip on random test_*.py files.  If one of those files has a
> problem it causes test_tokenize.py to fail also.  So I added a line to the
> test to output the file name it does the round trip on so those can be
> fixed as they are found.
>
> Let me know if it needs to be adjusted or something doesn't look right.

Well, I'm still philosophically uneasy with r'\' being a valid string
literal, for various reasons (one being that writing a string parser
becomes harder and harder). I definitely want r'\u1234' to be a
6-character string, however. Do you have a patch that does just that?
(We can argue over the rest later in a larger forum.)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
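
To make the slicing and concatenation concerns concrete, here is a minimal sketch. It assumes present-day Python 3 `bytes` as a stand-in for the proposed str8 type (str8 itself is not used here), and the helper `decodes_cleanly` is a hypothetical name introduced only for this illustration. The last lines show the raw-string behaviour asked for above, as it works in current Python 3.

```python
# Illustration only: Python 3 bytes stand in for the proposed str8 type.
text = "na\u00efve"               # the i-diaeresis needs two bytes in UTF-8
encoded = text.encode("utf-8")    # 6 bytes for 5 characters

# Slicing by byte offset can cut a multi-byte character in half.
head = encoded[:3]                # ends in the middle of the 2-byte sequence


def decodes_cleanly(data, encoding="utf-8"):
    """Hypothetical helper: True if data is valid in the given encoding."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


# Concatenating differently encoded pieces leaves no single honest label:
# the result is not valid UTF-8, and decoding it as Latin-1 gives mojibake.
mixed = "\u00e9".encode("utf-8") + "\u00e9".encode("latin-1")

# In a raw string, \u is not interpreted, so this literal has 6 characters.
raw = r'\u1234'

print(decodes_cleanly(encoded), decodes_cleanly(head), decodes_cleanly(mixed))
print(len(raw))
```

An encoding attribute on each instance would have to be either dropped or re-validated after every such slice or concatenation, which is the cost the questions above point at.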
