> Only the first such call on a given string, though - the idea is to use lazy
> decoding, not to avoid decoding altogether. Most manipulations (len, indexing,
> slicing, concatenation, etc.) would require decoding to at least UCS-2 (or
> perhaps UCS-4).
My two cents:

For len() you can compute the length at string construction and store it in the string object (which is immutable). For example, if the string is constructed by concatenation, then computing the resulting length should be trivial. Even when real computation is needed, it plays nicer with the CPU cache since the data has to be there anyway.

As for concatenation, recoding can be avoided if the strings to be concatenated use the same internal encoding (assuming that encoding does not hold internal state). Given that in many cases the strings will come from similar sources (and thus use the same internal encoding), it may be an interesting optimization.

Regards

Antoine.
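
P.S. A rough sketch of the two ideas above, in case it helps the discussion. Everything here is hypothetical: `LazyStr`, `STATELESS_ENCODINGS` and the choice of UTF-8 as the fallback encoding are names and assumptions made up for illustration, not anything that exists in CPython. Encoding names are compared literally, with no alias normalization.

```python
# Hypothetical whitelist of encodings whose byte strings can be joined
# directly: no BOM and no shift state carried between characters.
STATELESS_ENCODINGS = {"ascii", "latin-1", "utf-8"}


class LazyStr:
    """Illustrative immutable string that keeps its original encoded bytes."""

    __slots__ = ("_data", "_encoding", "_length")

    def __init__(self, data, encoding, length=None):
        self._data = bytes(data)
        self._encoding = encoding
        if length is None:
            # Real computation: done once, while the data is being touched anyway.
            length = len(self._data.decode(encoding))
        self._length = length  # cached; the object is never mutated afterwards

    def __len__(self):
        # No decoding needed here, the length was stored at construction.
        return self._length

    def __str__(self):
        # Decoding happens only when the text is actually needed.
        return self._data.decode(self._encoding)

    def __add__(self, other):
        if not isinstance(other, LazyStr):
            return NotImplemented
        same = self._encoding == other._encoding
        if same and self._encoding in STATELESS_ENCODINGS:
            # Same stateless encoding: join the bytes and add the lengths,
            # with no recoding at all.
            return LazyStr(self._data + other._data,
                           self._encoding,
                           self._length + other._length)
        # Otherwise fall back to decoding both sides and re-encoding
        # (UTF-8 is an arbitrary choice for this sketch).
        text = str(self) + str(other)
        return LazyStr(text.encode("utf-8"), "utf-8", len(text))


a = LazyStr("h\u00e9llo".encode("utf-8"), "utf-8")
b = LazyStr(" w\u00f6rld".encode("utf-8"), "utf-8")
c = a + b          # fast path: bytes joined, lengths added
print(len(c), str(c))
```

The whitelist is there because of the "no internal state" caveat: joining, say, two BOM-prefixed UTF-16 byte strings would leave a stray BOM in the middle of the result, so such encodings have to take the slow path.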
