Greg Ewing <[EMAIL PROTECTED]> writes:
> That places a burden on all creators of strings to ensure
> that they are in the minimal format, which could be
> inconvenient for some operations, e.g. taking a substring
> could require making an extra pass to re-code the data.
Yes, but taking a substring already takes time linear in the length
of the substring.
Allocating a string from a C array of wide characters (which
determines the format from the contents) will be written once and
called as a function.
Most strings are ASCII, so most of the time there is no need to check
whether the substring could become even narrower.
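
A rough sketch of the point above, in Python rather than C so it is
easy to try. The names CompactString and narrowest_width are made up
for illustration, and the 1/2/4-byte thresholds (0xFF, 0xFFFF) are my
assumption about what the widths would be, not a settled design. It
only shows that a substring of an already-narrow string needs no
second pass, while a substring of a wide string is rescanned.

```python
def narrowest_width(code_points):
    """Return the bytes per character (1, 2 or 4) needed for the widest code point."""
    width = 1
    for cp in code_points:
        if cp > 0xFFFF:
            return 4          # cannot get any wider, stop scanning
        if cp > 0xFF:
            width = 2
    return width


class CompactString:
    """Hypothetical string that always records its narrowest width."""

    def __init__(self, code_points, width=None):
        self.code_points = list(code_points)
        # The width is computed from the contents unless the caller
        # already knows it (e.g. a substring of a 1-byte string).
        self.width = narrowest_width(self.code_points) if width is None else width

    def substring(self, start, stop):
        piece = self.code_points[start:stop]   # linear in len(piece) anyway
        if self.width == 1:
            # Source is already the narrowest format: no rescan needed.
            return CompactString(piece, width=1)
        # Source was wide: the piece may have lost its only wide character.
        return CompactString(piece)


s = CompactString(map(ord, "abc\u20acdef"))
head = s.substring(0, 3)     # "abc", narrowed back to 1 byte per character
middle = s.substring(2, 5)   # "c\u20acd", still needs 2 bytes per character
```

The rescan happens only in the uncommon case where the source string
was wide to begin with.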
> It would also preclude the possibility of representing
> a substring as a view.
If views were implemented on the level of C pointers, then views would
not have the property of being in the canonical representation wrt.
character width. I think it's still valuable to use a more compact
representation if it would benefit most strings.
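
To make the trade-off concrete, here is a small illustration using
Python's array and memoryview as stand-ins for C buffers and
pointers. This is only an analogy, not a proposed implementation: the
view shares the parent's 2-byte units even though its own characters
would fit in 1 byte, whereas the copied substring can be re-encoded
narrowly.

```python
from array import array

# A "wide" string: 2-byte units, because of the euro sign (U+20AC).
wide = array("H", [ord(c) for c in "abc\u20acdef"])

# A view of the first three characters: shares the buffer, no copy,
# so it necessarily keeps the parent's item size.
view = memoryview(wide)[0:3]

# A canonical substring: copy the characters and re-encode them in
# the narrowest unit that fits (1 byte here).
compact = array("B", view.tolist())
```

The view is cheaper to create but is not in the canonical width; the
copy is canonical but costs a pass over the data.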
> I don't see any great advantage given by this restriction
> anyway.
Keeping the canonical representation is not very important. It just
ensures that the advantage of having a more compact representation
is taken as often as possible, even if the string has been cut from
another string which contained a wide character.
--
__("< Marcin Kowalczyk
\__/ [EMAIL PROTECTED]
^^ http://qrnik.knm.org.pl/~qrczak/
_______________________________________________
Python-3000 mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-3000