On Wed, Sep 17, 2014 at 3:55 AM, Jim Baker <jim.ba...@python.org> wrote:
> Of course, if you do actually have a smuggled isolated low surrogate
> FOLLOWED by a smuggled isolated high surrogate - guess what, the only
> interpretation is a codepoint. Or perhaps more likely garbage. Of course it
> doesn't happen so often, so maybe we are fine with the occasional bug ;)
>
> I personally suspect that we will resolve this by also supporting UCS-4 as a
> representation in Jython 3.x for such Unicode strings, albeit with the
> limitation that we have simply moved the problem to when we try to call Java
> methods taking java.lang.String objects.
>

That'll cost efficiency, of course, but it'll guarantee correctness.
And maybe, just maybe, you'll be able to put some pressure on Java
itself to start supporting UCS-4 natively...

One can dream.

ChrisA
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Reply via email to