At 03:53 PM 6/27/2010 +1000, Nick Coghlan wrote:
We could talk about this even longer, but the most effective way
forward is going to be a patch that improves the URL parsing
situation.

Certainly, it's the only practical solution for the immediate problems in 3.2.

I only mentioned that I "hate the idea" because I'd be more comfortable if it was explicitly declared to be a temporary hack to work around the absence of a string coercion protocol, due to the moratorium on language changes.

But, since the moratorium *is* in effect, I'll try to make this my last post on string protocols for a while... and maybe wait until I've looked at the code (str/bytes C implementations) in more detail and can make a more concrete proposal for what the protocol would be and how it would work. (Not to mention closer to the end of the moratorium.)


There are a *very small* number of APIs where it is appropriate to be polymorphic

This is only true if you focus exclusively on bytes vs. unicode, rather than the general issue that it's currently impractical to pass *any* sort of user-defined string type through code that you don't directly control (stdlib or third-party).


The virtues of a separate poly_str type are that:
1. It can be simple and implemented in Python, dispatching to str or
bytes as appropriate (probably in the strings module)
2. No chance of impacting the performance of the core interpreter (as
builtins are not affected)

Note that adding a string coercion protocol isn't going to change core performance for existing cases, since any place where the protocol would be invoked would be a code branch that either throws an error or *already* falls back to some other protocol (e.g. the buffer protocol).


3. Lower impact if it turns out to have been a bad idea

How many protocols have been added that turned out to be bad ideas? The only ones that have been removed in 3.x, IIRC, are three-way compare, slice-specific operations, and __coerce__... and I'm going to miss __cmp__. ;-)

However, IIUC, the reason these protocols were dropped isn't because they were "bad ideas". Rather, they're things that can be implemented in terms of a finer-grained protocol. i.e., if you want __cmp__ or __getslice__ or __coerce__, you can always implement them via a mixin that converts the newer fine-grained protocols into invocations of the older protocol. (As I plan to do for __cmp__ in the handful of places I use it.)

At the moment, however, this isn't possible for multi-string operations outside of __add__/__radd__ and comparison -- the coercion rules are hard-wired and can't be overridden by user-defined types.

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Reply via email to