On 8/31/06, Guido van Rossum <[EMAIL PROTECTED]> wrote:
On 8/31/06, Paul Prescod <[EMAIL PROTECTED]> wrote:
> On 8/31/06, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> > (Adding back py3k list assuming you just forgot it)
>
> Yes, thanks. Gmail's UI really optimizes the "Reply To" operation over "Reply
> To All."
>
> > > Plus, it sounds like you're proposing that the encodings of the underlying
> > > data would leak through to the application. As I understood Fredrik's
> > > model, the intention was to treat the encoding as an implementation detail.
> > > If it works well, this could be an important differentiator for Python
> > > (versus Java) as Unicode already is (versus Ruby).
> >
> > *Only* for UTF-16, which I consider a necessary evil since we can't
> > rewrite the Java and .NET standards.
>
> I see what you're getting at.
>
> I'd say that decoding UTF-16 data in CPython and PyPy should (by default)
> create true Unicode characters. Jython and IronPython could create
> surrogates and characters when necessary. When you run the program in
> CPython you'll get better behaviour than in Jython/IronPython. Maybe there
> could be a way to make CPython run like Jython and IronPython if you wanted
> 100% absolute compatibility between the environments. I think that we agree
> that it would be unfortunate if CPython copied Java and .NET to its own
> detriment. It's also not inconceivable that Java and .NET might evolve a
> 4-byte mode in the long term.

I think it would be best to do this as a CPython configuration option
just like it's done today. You can choose 4-byte or 2-byte Unicode
(essentially UCS-4 or UTF-16) in order to be compatible with other
packages on the platform. Yes, 4-byte gives better Unicode support.
But 2-byte may be more compatible with other stuff on the platform.
Too bad .NET and Java don't have this option. :-)
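
For readers who want the 2-byte versus 4-byte difference made concrete, here is a minimal sketch. It assumes a present-day Python 3 interpreter, and the helper names (utf16_code_units, utf32_code_units, surrogate_pair) are hypothetical, chosen only for illustration. It shows that a character outside the Basic Multilingual Plane occupies one 32-bit unit but two 16-bit units (a surrogate pair), which is the detail that leaks into string lengths and indexing on UTF-16 platforms.

```python
# Illustration only: helper names are hypothetical, not part of any proposal.

def utf16_code_units(text: str) -> int:
    """Count the 16-bit code units needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def utf32_code_units(text: str) -> int:
    """Count the 32-bit code units needed to store text as UTF-32 (UCS-4)."""
    return len(text.encode("utf-32-le")) // 4


def surrogate_pair(code_point: int) -> tuple[int, int]:
    """Split a code point above U+FFFF into its UTF-16 high/low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return high, low


# 'a' followed by U+1D11E MUSICAL SYMBOL G CLEF, which lies outside the BMP.
sample = "a\U0001D11E"

print(utf32_code_units(sample))   # one unit per character
print(utf16_code_units(sample))   # the clef needs two units
print([hex(u) for u in surrogate_pair(0x1D11E)])
```

A 4-byte build reports the length of such a string in characters, while a 2-byte (UTF-16) representation reports it in code units, so the two can disagree for the same data.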

The current model is a hack (and I wrote the PEP!).

If you decide to go to all of the effort and expense of polymorphic strings, I cannot understand why a user should be forced to choose between 16-bit and 32-bit strings AT BUILD TIME. PEP 261 says that the reason for the build-time solution is:
"[The alternate solutions] ... would require a much more 
complex implementation than the accepted solution. ...
Guido is not willing to undertake the implementation right
now. ...This PEP represents least-effort solution."
Fair enough. A world of finite resources. But I would be very annoyed if my ISP had installed a Python version that could magically handle 8-bit and 16-bit strings efficiently but I had to ask them to install a special version to handle 32-bit strings at all. Obviously build-time configuration is the least flexible of all available options.

 Paul Prescod

_______________________________________________
Python-3000 mailing list
Python-3000@python.org
http://mail.python.org/mailman/listinfo/python-3000