At 02:15 PM 4/6/2006, Guido van Rossum wrote:
>On 4/6/06, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> > I don't want to discourage people from working out their own ideas,
> > but a lot of the stuff that's being discussed here about protocol
> > adaptation is already implemented in PyProtocols.
>
>That's great. I believe that if we derive things from first principles
>here and arrive at the same choices as PyProtocols, we haven't wasted
>anybody's time;
I agree; that's why I said "I don't want to discourage people from working out ... ideas". I just wanted to point out that for those who prefer to learn from worked-out examples, there are some available. One reason I like to steal designs from *built* systems is that making something real forces lots of tradeoffs into the open that don't come up when you're just brainstorming.

That having been said, please consider this a "spoiler warning": the rest of this message reveals where I think you're all going to eventually end up. If it'll spoil anyone's fun and/or learning to know this ahead of time, don't read the rest of this message. :)

Hint: premature optimization is the root of all evil. I was actually trying to hint in my previous message that it would be valuable to learn from PyProtocols' example by *not* repeating it! I was not suggesting that PyProtocols is a model to which you should all aspire; rather, consider why I wish I hadn't bothered to build it in the first place!

>At the moment I'm doing the same for overloadable functions. However,
>instead of doing it in the privacy of a hotel room in a temp directory
>on a disconnected laptop, I'll blog about it. There'll be a lot of
>thinking aloud, a couple of dead alleys and red herrings, and every
>once in a while I'll have to buy a clue from PyProtocols. But I really
>see no other way to do it; when I look at the completed PyProtocols my
>eyes just start glazing over.

No criticism of your methods was intended or implied; it's just that I personally hate struggling to reinvent something that's already been done. More specifically, I wish somebody could've pointed *me* to PyProtocols' docs before I wrote them, or better yet, sat me down, showed me why generic, uh, overloadable functions were an even better idea, and sold me on why I shouldn't bother writing PyProtocols at all but jump straight to the good stuff. :)

This is particularly so in the area of multiple-argument dispatch, as some people have been bringing up implementation ideas that I see as painful dead-ends that took me months to chew my way out of. So I've actually been doing an *excellent* job of sitting on my hands so far and not stealing anybody's learning opportunities. :)

However, continuing in the vein of "things I wished somebody had told me", I would say that the biggest hurdle to implementing a good multiple-argument dispatch system, IMO, is realizing that there is only one basic algorithm that works, and that there are no shortcuts around it. You can find lots of ways to make the basic algorithm efficient (indexing, dispatch trees, partial evaluation, etc.), but there is no way around the algorithm itself. The good news, however, is that once you accept that there is only one basic algorithm (eliminate the non-matching cases and prioritize what's left; there's a stripped-down sketch of this below), you find that there are plenty of implementation shortcuts *within* it, especially if you have single-dispatch functions available to build the multiple-dispatch implementation. You just can't directly shoehorn multiple-dispatch into single-dispatch, although you can waste a *lot* of time trying. I literally could've had RuleDispatch done maybe two years sooner if I had spent all the time I wasted looking for something "easier" to implement than the Chambers & Chen optimization of the One True Algorithm on just implementing it instead.

Anyway, the point of my comment was not to interfere with anybody's learning; it was just to mention the things I would have wished for when I was trying to work this out myself.
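For concreteness, here's the stripped-down shape of that one basic algorithm. None of this is PyProtocols or RuleDispatch code; it's a from-scratch sketch with made-up names, and it punts entirely on ambiguity detection and on all the indexing that makes a real implementation fast:

    def _at_least_as_specific(types1, types2):
        """True if each type in types1 is a subclass of the matching type in types2."""
        return all(issubclass(t1, t2) for t1, t2 in zip(types1, types2))

    class SimpleGeneric:
        """A deliberately naive multiple-dispatch function."""

        def __init__(self):
            self.cases = []     # list of (arg_types, implementation) pairs

        def overload(self, *arg_types):
            """Decorator: register an implementation for the given argument types."""
            def register(func):
                self.cases.append((arg_types, func))
                return func
            return register

        def __call__(self, *args):
            # Step 1: eliminate the non-matching cases.
            applicable = [
                (types, func) for types, func in self.cases
                if len(types) == len(args)
                and all(isinstance(arg, t) for arg, t in zip(args, types))
            ]
            if not applicable:
                raise TypeError("no applicable method for %r" % (args,))
            # Step 2: prioritize what's left; here we just keep the most
            # specific signature seen (a real system also detects ambiguities).
            best_types, best_func = applicable[0]
            for types, func in applicable[1:]:
                if _at_least_as_specific(types, best_types):
                    best_types, best_func = types, func
            return best_func(*args)

    # Tiny usage example:
    describe = SimpleGeneric()

    @describe.overload(object, object)
    def describe_default(a, b):
        return "two objects"

    @describe.overload(int, int)
    def describe_ints(a, b):
        return "two ints"

    describe(1, 2)       # -> "two ints"
    describe(1, "x")     # -> "two objects"

Everything else (Chambers & Chen indexing, dispatch trees, and so on) is a way of getting the same answer without scanning every case on every call.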
I'll go back to sitting on my hands now, except for occasional drooling at the idea of a writable func_ast attribute. :)

Actually, I like the func_ast idea so much that it tempts me to write something that would emulate it, like a pair of set_ast and get_ast functions for Python 2.x. (I could use PyPy's abstract interpretation approach to turn bytecode back into a pseudo-AST, perhaps.) It would be useful as a proof-of-concept to demonstrate how simple it would be to implement trivial overloading, and then full multiple/predicate dispatch, using AST manipulation, because all this mucking about with dictionaries is way too forest-for-the-trees if your goal is to understand the *applications* for overloading.

Overloadable functions are easiest to understand first (IMO) in terms of syntactic transforms. After you've got a grasp at that level, you can come up with other optimizations. Better still, with the new AST branch and PyPy, we now have a real chance of just optimizing the ASTs (or compiling them to C) directly, rather than painfully working out a bunch of low-level MRO and registry issues. Those are costly premature optimizations, and I wish I had not been sucked into that trap. If I had it to do over, I would proceed directly to AST manipulation, and then work on ways to make it faster (like the Chambers & Chen-inspired dispatch indexes, and moving the AST manipulations to C after they were tested).

So I would gently suggest, now that we have overloaded functions on the table, that it might be better to focus on "what should construct X translate to in naive Python code" than to worry about the implementation feasibility of optimized code.

One of the interesting findings of Chambers and Chen was that a significant number of overloadable functions "in the wild" have a very small number of methods, and then there are a few that have gobs and gobs of methods. My focus on premature optimization thus blinded me to the fact that the most efficient way to deal with those common cases is just to translate them into a simple set of if-then statements (there's a sketch of this below). The cases where you need big guns like registries are much fewer and further between. Indeed, if you examine PEAK, Zope, or Twisted, you will also find that most interfaces have only a handful of adapter implementations (or direct method implementations in classes), and then there are some that have zillions.

And -- here's the key bit -- if you have a tool that lets you handle those common cases (of only a few distinct methods), you can then *use* it as a stepping stone to implement the more complex approaches. RuleDispatch makes extensive use of simple single-dispatch functions to bootstrap the multiple-dispatch implementation. If I had a multiple-dispatch implementation whose only drawback was that it was slow when a function had more than a handful of overloads, then creating an *efficient* multi-dispatch system on top of it would be a piece of cake, especially if you make the "overloading function" itself overloadable. That is, once the system is bootstrapped, you just overload "overload()" to use the more efficient implementation approaches whenever the number of overloads reaches a certain threshold.

(Analogy: 'type' is its own type in today's Python, but it's built on a simpler C-level notion of types as a data structure with function pointers, which it needs in order to be bootstrapped. In the same way, you can bootstrap overload() using an extremely simple form of overloading, not far removed from monkeypatching.)
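To make the "what should it translate to in naive Python code" point concrete, here's one possible reading of it for a little pretty-printer. The helper names are made up for illustration, and I'm deliberately not proposing any surface syntax here:

    # However the overload declarations end up being spelled, the *naive*
    # translation is nothing more exotic than an if/elif chain of
    # isinstance() tests, most specific cases first:

    def pprint_object(ob):
        return repr(ob)

    def pprint_list(ob):
        return "[" + ", ".join(pprint(x) for x in ob) + "]"

    def pprint_dict(ob):
        return "{" + ", ".join(
            "%s: %s" % (pprint(k), pprint(v)) for k, v in ob.items()) + "}"

    def pprint(ob):
        if isinstance(ob, dict):
            return pprint_dict(ob)
        elif isinstance(ob, list):
            return pprint_list(ob)
        else:
            return pprint_object(ob)

    # Adding an overload just means inserting another branch (or, with a
    # writable func_ast, splicing another test into the function's AST).
    # For the common case of only a handful of methods this is both the
    # simplest and roughly the fastest thing you can do; registries and
    # dispatch trees only pay for themselves when the branch count gets big.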
The reason I haven't done much on documenting RuleDispatch's internals to date is that I basically got to a point where the implementation I have, although relatively efficient for large overload sets, is not as efficient as the naive approach (unoptimized if-then trees) for the common cases, and isn't as customizable or as fast as I'd like for the "zillions of methods" cases. Its internals are hard-wired in ways that make it hard to add more special-case optimizations. Thus, RuleDispatch is effectively optimized for an excluded middle, where you have enough rules that the naive approach is too slow, but not so many rules that you need advanced customization features or speed that matches what can be done by hand-tuning. In other words: for use cases that don't actually exist! (The same problem exists to a lesser degree for PyProtocols.)

Thus, I've pretty much decided that I need to go back and reimplement the RuleDispatch internals so that they are themselves based on generic functions; my plan was to use a naive approach to code generation for those functions, and then "overload the overload() function".

In truth, my suggestion for Py3K is that simply providing a writable func_ast, plus a simple overloading implementation that uses it and is itself overloadable, seems like the simplest thing that could possibly work, and it allows unlimited implementation-level optimizations. For example, let's say that naive if-then overloading is too slow for pickle() or pprint(). So you overload "overload()" so that overloads registered for pickle() modify a registry instead of modifying code. Voila! (There's a rough sketch of this at the end of this message.)

So this is why I think that worrying about protocol objects and MROs and registries is losing the forest for the trees. It's easier to think about those things when you're grounded in specific use cases. And an important feature of dynamic overloading is that you get to change your mind about how things work: just create another overload! This is something that was totally not obvious to me before I got to a certain point of working on RuleDispatch. It took a while for my thinking to shift to a world where overloading is *cheap*. In today's Python, overloading requires interfaces or registries or adaptation; it's not as simple a thing, so you don't use it unless you know you have to. But if you can overload functions as easily as breathing, then overloadability can be the *default* case, and then you realize that any *single* implementation of a function is necessarily a premature optimization! And just as important, trying to only ever design *one* implementation of an algorithm is necessarily a wasteful compromise!

You see, overloading allows you to write correct algorithms and then overload the abstract operations for a specific implementation scenario. Almost like PyPy generating a custom VM -- you can create a new "back-end" for any given special case. So special cases aren't special enough to break the rules -- but with overloading, you can bend the rules to accommodate them. Or as Uncle Timmy put it, "1) Everything is a trivial special case of something else, and 2) death is a bunch of blue spheres." :)
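Here's that rough sketch. None of these names exist in RuleDispatch or anywhere else; they're made up to show the shape of the idea. Instead of literally overloading overload(), the registration strategy is just swapped per-function, which is the same idea in miniature, and a real version would splice if-then tests into the function's AST rather than doing a linear scan:

    class NaiveOverloadable:
        """Default strategy: a linear scan over (type, implementation) cases.

        This stands in for the "naive if-then" translation; with a writable
        func_ast, the same cases would be spliced into the function body.
        """
        def __init__(self, default):
            self.default = default
            self.cases = []

        def add_case(self, typ, func):
            self.cases.append((typ, func))

        def __call__(self, ob, *args):
            for typ, func in self.cases:
                if isinstance(ob, typ):
                    return func(ob, *args)
            return self.default(ob, *args)

    class RegistryOverloadable(NaiveOverloadable):
        """Swapped-in strategy: an exact-type registry, one dict lookup per call."""
        def __init__(self, default):
            NaiveOverloadable.__init__(self, default)
            self.registry = {}

        def add_case(self, typ, func):
            self.registry[typ] = func

        def __call__(self, ob, *args):
            return self.registry.get(type(ob), self.default)(ob, *args)

    def overload(generic, typ):
        """Decorator: register an implementation.  *How* it gets stored is
        the generic's own business, which is the whole point."""
        def register(func):
            generic.add_case(typ, func)
            return func
        return register

    # pickle() opts into the registry strategy; the people registering
    # overloads (and the call sites) don't have to change at all:
    pickle = RegistryOverloadable(lambda ob: repr(ob))

    @overload(pickle, int)
    def pickle_int(ob):
        return "I%d\n." % ob    # toy output, purely for illustration

    pickle(42)       # -> 'I42\n.'
    pickle("hi")     # -> "'hi'" (falls back to the default)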