> IronPython and Jython can retain UTF-16 as their native form if that
> makes interop cleaner, but in doing so they need to ensure that basic
> operations like indexing and len work in terms of code points, not
> code units, if they are to conform.

That means that they won't conform, period. There is no efficient,
maintainable implementation strategy to achieve that property, and
it may well take years until somebody provides an efficient but
unmaintainable implementation.

> Does this make sense, or have I completely misunderstood things?

You seem to assume it is ok for Jython/IronPython to provide indexing in
O(n). It is not.
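
To illustrate why code-point indexing over a UTF-16 buffer ends up linear,
here is a rough sketch in Python. It is not how Jython or IronPython store
strings; utf16_units and code_point_at are names I made up for this
example, and the list of ints merely stands in for a UTF-16 array.

```python
def utf16_units(s):
    """Return the UTF-16 code units of s as a list of ints (illustration only)."""
    data = s.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def code_point_at(units, index):
    """Return the code point at position `index`, counted in code points.

    Surrogate pairs make code points variable-width, so the offset of the
    index-th code point is unknown until everything before it is scanned.
    """
    if index < 0:
        raise IndexError("negative indexes are not handled in this sketch")
    i = 0
    remaining = index
    while i < len(units):
        u = units[i]
        # A high surrogate followed by a low surrogate encodes one code point.
        is_pair = (
            0xD800 <= u <= 0xDBFF
            and i + 1 < len(units)
            and 0xDC00 <= units[i + 1] <= 0xDFFF
        )
        if remaining == 0:
            if is_pair:
                return 0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            return u
        i += 2 if is_pair else 1
        remaining -= 1
    raise IndexError("code point index out of range")


units = utf16_units("a\U0001F600b")
print(len(units))                    # number of code units
print(hex(code_point_at(units, 1)))  # the non-BMP character
```

Avoiding the scan would take some auxiliary index, or a flag recording
that the string has no surrogates at all, and that extra machinery is
where the maintenance cost comes from.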

However, non-conformance may not be that much of an issue. They fail to
conform in so many other aspects (not supporting Python 3, for example,
or not supporting the C API) that they may well choose to ignore such a
minor requirement if there were one. For BMP strings, they conform
fine, and it may well be that Jython users either don't have non-BMP
strings, or don't care whether len() or indexing of their non-BMP
strings is "correct".
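
A small sketch of the BMP versus non-BMP difference, assuming a CPython
whose str is indexed by code point (3.3 or later). utf16_len is a
hypothetical helper that merely counts the code units a UTF-16
implementation would report from len().

```python
def utf16_len(s):
    """Number of UTF-16 code units needed for s (two bytes per unit)."""
    return len(s.encode("utf-16-le")) // 2


bmp = "na\u00efve"        # every character is inside the BMP
non_bmp = "a\U0001F600"   # one character outside the BMP

# For BMP-only strings the two counts agree.
print(len(bmp), utf16_len(bmp))

# Outside the BMP each character takes a surrogate pair, so the counts diverge.
print(len(non_bmp), utf16_len(non_bmp))
```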

Python-Dev mailing list