Before we can decide on the internal representation of our unicode objects, we need to decide on their external interface. My thoughts so far:
* Most transformation and testing methods (.lower(), .islower(), etc.) can be copied directly from 2.x. They need no special implementation to perform reasonably.
* Indexing and slicing are the big issue. Do we need constant-time integer slicing? .find() could be changed to return a token that could be used as a constant-time offset. Incrementing the token would have linear cost, but that's no big deal if the offsets are always small.
* Grapheme clusters, words, lines, and other groupings: do we need or want ways to slice based on them too?
* Cheap slicing and concatenation (between O(1) and O(log(n))): do we want to support them? Now would be the time.

-- 
Adam Olsen, aka Rhamphoryncus
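
P.S. To make the token idea from the second bullet concrete, here is a minimal sketch, assuming a UTF-8 internal representation (only one of the possible choices). Utf8Str, Offset, advance() and slice() are hypothetical names made up for illustration, not a proposed API. The point is only that .find() can hand back a token that locates a position without a scan, while stepping the token forward costs time linear in the number of code points stepped over.

```python
class Offset:
    """Hypothetical opaque token: wraps a byte position in the UTF-8 buffer."""
    __slots__ = ("_byte",)

    def __init__(self, byte):
        self._byte = byte


class Utf8Str:
    """Hypothetical string type that stores its text as UTF-8 bytes."""

    def __init__(self, text):
        self._data = text.encode("utf-8")

    def find(self, sub):
        """Return an Offset token for the first match, or None if absent."""
        # A match of valid UTF-8 inside valid UTF-8 always starts on a
        # code point boundary, so the byte position is a safe token.
        pos = self._data.find(sub.encode("utf-8"))
        return None if pos < 0 else Offset(pos)

    def advance(self, token, count=1):
        """Move a token forward by `count` code points; cost is O(count)."""
        pos = token._byte
        for _ in range(count):
            if pos >= len(self._data):
                raise IndexError("offset past end of string")
            pos += 1
            # Skip UTF-8 continuation bytes (bit pattern 10xxxxxx).
            while pos < len(self._data) and (self._data[pos] & 0xC0) == 0x80:
                pos += 1
        return Offset(pos)

    def slice(self, start, stop=None):
        """Slice between two tokens; the boundaries are found without a scan."""
        end = len(self._data) if stop is None else stop._byte
        return self._data[start._byte:end].decode("utf-8")


s = Utf8Str("naïve café au lait")
start = s.find("café")
end = s.advance(start, 4)      # step over 4 code points, not 4 bytes
print(s.slice(start, end))     # prints "café"
```

Integer indexing on such a type would still need a linear scan, which is exactly the trade-off the second bullet asks about.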
