At 05:54 PM 1/12/05 -0500, Clark C. Evans wrote:
| String -> PathName -> File
| String -> StringIO -> File
Okay, after reading your and Ian's posts and thinking about them some more, I've learned some really interesting things.
First, adapter abuse is *extremely* attractive to someone new to the concept -- so from here on out I'm going to forget about the idea that we can teach people to avoid this solely by telling them "the right way to do it" up front.
The second, much subtler point I noticed from your posts, was that *adapter abuse tends to sooner or later result in adapter diamonds*.
And that is particularly interesting because the way that I learned how NOT to abuse adapters was by getting slapped upside the head by PyProtocols, which pointed out when adapter diamonds had resulted!
Now, that's not because I'm a genius who put the error in because I realized that adapter abuse causes diamonds. I didn't really understand adapter abuse until *after* I got enough errors to be able to have a good intuition about what "as a" really means.
Now, I'm not claiming that adapter abuse inevitably results in a detectable ambiguity, and certainly not that it does so instantaneously. Nor am I claiming that every ambiguity PyProtocols reports is a real problem; some may be perfectly harmless. So, an adaptation ambiguity is a lot like a PyChecker warning: it might be a horrible problem, or it might be that you are just doing something a little unusual.
But the thing I find interesting is that, even with just the diamonds I ended up creating on my own, I was able to infer an intuitive concept of "as a", even though I hadn't fully verbalized the concepts prior to this lengthy debate with Alex forcing me to single-step through my thought processes.
What that suggests to me is that it might well be safe enough in practice to let new users of adaptation whack their hand with the mallet now and then, given that *now* it's possible to give a much better explanation of "as a" than it was before.
Also, consider this... The larger the adapter network, the *greater* the probability that adapter abuse will create an ambiguity -- which could mean faster learning.
If the ambiguity error is easily looked up in documentation that explains the as-a concept and the intended working of adaptation, so much the better. But in the worst case of a false alarm (the ambiguity was harmless), you just resolve the ambiguity and move on.
| Originally, Python may ship with the String->StringIO and
| StringIO->File adapters pre-loaded, and if my code relied upon this
| transitive chain, the following would work just wonderfully,
|
|     def parse(file: File): ...
|     parse("helloworld")
|
| by parsing "helloworld" content via a StringIO intermediate object.
| But then, let's say a new component "pathutils" registers another
| adapter pair:
|
|     String->PathName and PathName->File
|
| This ambiguity causes a few problems:
|
| - How does one determine which adapter path to use?
| - If a different path is picked, what sort of subtle bugs occur?
| - If the default path isn't what you want, how do you specify the
|   other path?
The *real* problem here isn't the ambiguity; it's that PathName->File is "adapter abuse". However, the fact that it results in an ambiguity is a useful clue to fixing the problem. Each time I sat down with one of these detected ambiguities, I learned better how to define sensible interfaces and meaningful adaptation. I would not have learned these things by simply not having transitive adaptation.
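To make the diamond concrete, here's a minimal sketch of the kind of check involved (every name in it is invented for illustration, and PyProtocols' real algorithm is different): enumerate the adapter paths between two endpoints, and complain when there's more than one:

    adapters = {}    # source -> list of directly adaptable targets

    def register(src, dst):
        adapters.setdefault(src, []).append(dst)

    def paths(src, dst, seen=()):
        # enumerate every acyclic adapter chain from src to dst
        if src == dst:
            yield (dst,)
            return
        for mid in adapters.get(src, ()):
            if mid not in seen:
                for rest in paths(mid, dst, seen + (src,)):
                    yield (src,) + rest

    register("String", "StringIO"); register("StringIO", "File")
    register("String", "PathName"); register("PathName", "File")

    routes = list(paths("String", "File"))
    if len(routes) > 1:
        raise TypeError("ambiguous adaptation: %r" % (routes,))
    # TypeError: ambiguous adaptation: [('String', 'StringIO', 'File'),
    #                                   ('String', 'PathName', 'File')]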
| As I think these things through, I'm realizing that registered
| adapters really should be 100% accurate (i.e., no information loss,
| complete substitutability), because a registered adapter that seems
| pragmatically useful in one place could mess up unrelated code, since
| registered adapters have global effects.
I think this isn't all that useful: it's unrealistic to assume that adapters are always perfect. If transitive adaptation is even permitted, it should be unambiguous. Demanding that adaptation be 100% perfect is a matter of perspective; I think String->StringIO and StringIO->File are perfectly pure.
The next thing that I realized from your posts is that there's another education issue for people who haven't used adaptation, and that's just how precisely interfaces need to be specified.
For example, we've all been talking about StringIO like it means something, but really we need to talk about whether it's being used to read or write or both. There's a reason why PEAK and Zope tend to have interface names like 'IComponentFactory' and 'IStreamSource' and other oddball names you'd normally not give to a concrete class. An interface has to be really specific -- in the degenerate case an interface can end up being just one method. In fact, I think that something like 10-15% of interfaces in PEAK have only one method; I don't know if it's that high for Zope and Twisted, although I do know that small interfaces (5 or fewer methods) are pretty normal.
What this also suggests to me is that maybe adaptation and interfaces are the wrong solution to the problems we've been trying to solve with them -- adding more objects to solve the problems created by having lots of objects. :)
As a contrasting example, consider the Dylan language. The Dylan concept of a "protocol" is a set of generic functions that can be called on any number of object types. This is just like an interface, but inside-out... maybe you could call it an "outerface". :)
The basic idea is that a file protocol would consist of functions like 'read(stream,byteCount)'. If you implement a new file-like type, you "add a method" to the 'read' generic function that implements 'read' for your type. If a type already exists that you'd like to use 'read' with, you can implement the new method yourself.
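Here's a toy model of that in Python, so the mechanics are concrete. 'generic', 'read', and 'StringBuffer' are all invented for this sketch, and real Dylan dispatch is much richer; this just dispatches on the type of the first argument:

    class generic:
        # a deliberately tiny generic function: single-argument dispatch
        def __init__(self, default):
            self.default = default
            self.methods = {}               # type -> implementation
        def when(self, typ):
            def decorator(func):            # "add a method" for typ
                self.methods[typ] = func
                return func
            return decorator
        def __call__(self, obj, *args):
            for cls in type(obj).__mro__:
                if cls in self.methods:
                    return self.methods[cls](obj, *args)
            return self.default(obj, *args)

    @generic
    def read(stream, byte_count):
        raise TypeError("no read() implementation for %r" % type(stream))

    # the author of a new file-like type just adds a method to read():
    class StringBuffer:
        def __init__(self, data):
            self.data, self.pos = data, 0

    @read.when(StringBuffer)
    def read_string_buffer(stream, byte_count):
        result = stream.data[stream.pos:stream.pos + byte_count]
        stream.pos += byte_count
        return result

    read(StringBuffer("hello world"), 5)    # -> 'hello'

Note that StringBuffer itself never had to spell its method 'read'; the registration, not the name, is what ties it to the operation.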
There are some important ramifications there. First, there's no requirement to implement a complete interface; the system is already reduced to *operations* rather than interfaces. Second, a different choice of method names isn't a reason to need more interfaces and adapters. As more implementations of some basic idea (like stream-ness) exist, it becomes more and more natural to *share* common generic functions and put them in the stdlib, even without any concrete implementation for them, because they now form a standard "meeting point" for other libraries.
Third, Ka-Ping Yee has been arguing that Python should be able to define interfaces that contain abstract implementation. Well, generic functions can actually *do* this in a straightforward fashion; just define the default implementation of that operation as delegating to other operations. There still needs to be some way to "bottom out" so you don't end up with endless recursive delegation -- although you could perhaps just catch the recursion error and inspect the traceback to tell the user, "must implement one of these operations for type X". (And this could perhaps be done automatically if you can declare that this delegating implementation is an "abstract method".)
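Continuing the toy sketch from before (it assumes the 'generic', 'read', and 'StringBuffer' definitions above): 'readline' can ship with an abstract default that delegates to 'read', so implementing 'read' for a type buys you 'readline' for free, and the delegation bottoms out in read()'s own default, which raises TypeError rather than recursing forever:

    @generic
    def readline(stream):
        # abstract default implementation: build up a line via read()
        line = ""
        while True:
            c = read(stream, 1)
            line += c
            if not c or c == "\n":
                return line

    readline(StringBuffer("two\nlines"))    # -> 'two\n', via read() alone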
Fourth, and this is *really* interesting (but also rather lengthy to explain)... if all functions are generic (just using a fast-path for the nominal case of only one implementation), then you can actually construct adapters automatically, knowing precisely when an operation is "safe".
Let me explain. Suppose that we have a type, SomeType. It doesn't matter whether this type is concrete or an interface; we really don't care. The point is that this type defines some operations, and there is an outside operation 'foo' that relies on some set of those operations.
We then have OtherType, a concrete type we want to pass to 'foo'. All we need in order to make it work is to *extend the generic functions in SomeType with methods that take a different 'self' type*! Then, the operation 'adapt(instOfOtherType, SomeType)' can assemble a simple proxy containing methods for just the generic functions that have an implementation available for OtherType.
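With the toy machinery, such an 'adapt()' is a few lines and needs no hand-written adapter class anywhere. ('write' and 'FILE_OPS' are invented here just to show an unimplemented operation being left off the proxy.)

    @generic
    def write(stream, data):
        raise TypeError("no write() implementation for %r" % type(stream))

    FILE_OPS = {"read": read, "write": write}

    def adapt(obj, ops=FILE_OPS):
        # build a proxy exposing only the operations implemented for obj
        def bind(impl, target):
            def method(*args):
                return impl(target, *args)
            return method
        class Proxy:
            pass
        proxy = Proxy()
        for name, gf in ops.items():
            for cls in type(obj).__mro__:
                if cls in gf.methods:
                    setattr(proxy, name, bind(gf.methods[cls], obj))
                    break
        return proxy

    f = adapt(StringBuffer("hello world"))
    f.read(5)              # -> 'hello'
    hasattr(f, "write")    # -> False: write() was never implemented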
The result of this is that now any type can be the basis for an interface, which is very intuitive. That is, I can say, "implement file.read()" for my object, and somebody who has an argument declared as "file" will be able to use my object as long as they only need the operations I've implemented. However, unlike using method names alone, we have unambiguous semantics, because all operations are grounded in some fixed type or location of definition that specifies the *meaning* of that operation.
Another benefit of this approach is that it lessens the need for transitive adaptation, because over time people converge towards using common operations, rather than continually reinventing new ones. In this approach, all "adaptation" is endpoint to endpoint, but there are rarely any actual adapters involved, unless a set of related operations actually requires keeping some state. Instead, you simply define an implementation of an operation for some concrete type.
I'm running out of time to explore this idea further, alas. Up to this point, what I'm proposing would work *beautifully* for adaptations that don't require the adapter to add state to the underlying object, and ought to be intuitively obvious, given an appropriate syntax. E.g.:
    class StringIO:
        def read(self, bytes) implements file.read:
            # etc...
could be used to indicate the simple case where you are conforming to an existing operation definition. A third-party definition of the same thing might look like this:
    def file.read(self: StringIO, bytes):
        return self.read(bytes)
Assuming, of course, that that's the syntax for adding an implementation to an existing operation.
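In the toy machinery from earlier, that third-party declaration is just one more registration -- and notably, it works even for a type you don't own, like the built-in 'str':

    @read.when(str)
    def read_str(self, byte_count):
        # ground plain strings in the same read() operation
        return self[:byte_count]

    read("helloworld", 5)    # -> 'hello'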
Hm. You know, I think the stateful adapter problem could be solved too, if *properties* were also operations. For example, if 'file.fileno' was implemented as a set of three generic functions (get/set/delete), then you could maybe do something like:
    class socket:
        # internally declare that our fileno has the semantics
        # of file.fileno:
        fileno: int implements file.fileno
or maybe just:
class socket implements file: ...
could be shorthand for saying that anything with the same name as what's in 'file' has the same semantics. OTOH, that could break between Python versions if a new operation were added to 'file', so, verbose as the blow-by-blow declarations are, maybe they'd be semantically safer.
Anyway, if we were a third party externally declaring the correspondence between socket.fileno and file.fileno, we could say:
    # declare how to get a file.fileno for a socket instance
    def file.fileno.__get__(self: socket):
        return self.fileno
Now, there isn't any need to have a separate "adapter" to store additional state; with appropriate name mangling it can be stored in the unadapted object, if you like.
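Here's how that might look in the toy machinery, with 'fileno_get' standing in for the get-operation of 'file.fileno' (the mangled attribute name and the fake descriptor numbers are, again, pure invention):

    import itertools
    _next_fake_fd = itertools.count(1000).__next__

    @generic
    def fileno_get(obj):
        raise TypeError("no fileno for %r" % type(obj))

    @fileno_get.when(StringBuffer)
    def stringbuffer_fileno(obj):
        # stateful adaptation with no adapter object: the state lives
        # on the unadapted object, under a mangled attribute name
        if not hasattr(obj, "_fileno__state"):
            obj._fileno__state = _next_fake_fd()
        return obj._fileno__state

    buf = StringBuffer("x")
    fileno_get(buf)    # -> 1000, remembered on buf for next time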
This isn't a fully thought-out proposal; it's all a fairly spur-of-the-moment idea. I've been playing with generic functions for a while now, but only recently started doing any "heavy lifting" with them. However, in one instance, I refactored a PEAK module from 400+ lines of implementation (plus 8 interfaces and lots of adaptation) down to just 140 lines of implementation and one interface -- with the interface being pure documentation. And the end result was more flexible than the original code. So since then I've been considering whether adaptation is really the be-all end-all for this sort of thing, and Clark's and Ian's posts made me start thinking about it even more seriously.
(One interesting data point: the number of languages with some kind of pattern matching, "guards" or other generic function variants seems to be growing, while Java (via Eclipse) is the only other system I know of that has anything remotely like PEP 246.)
So maybe the *real* answer here is that we should be looking at solutions that might prevent the problems that adapters are meant to solve, from arising in the first place! Generic functions might be a good place to look for one, although the downside is that they might make Python look like a whole new language. OTOH, type declarations might do that anyway.
A big plus, by the way, of the generic function approach is that it does away with the requirement for interfaces altogether, except as a semantic grouping of operations. Lots of people dislike interfaces, and after all this discussion about how perfect interface-to-interface adaptation has to be, I'm personally becoming a lot less enamored with interfaces too!
In general, Python seems to like to let "natural instinct" prevail. What could be more natural than saying "this is how to implement a such-and-such method like what that other guy's got"? It ain't transitive, but if everybody tends to converge on a common "other guy" to define stuff in terms of (like 'file' in the stdlib), then you don't *need* transitivity in the long run, except for fairly specialized situations like pluggable IDEs (e.g. Eclipse) that need to dynamically connect chains between different plugins. Even there, the need could be minimized by most operations grounding in "official" abstract types. And abstract methods -- like a 'file.readline()' implementation for any object that supports 'file.read()' -- could possibly take care of most of the rest.
Generic functions are undoubtedly more complex to implement than PEP 246 adaptation. My generic function implementation comprises 3323 lines of Python, and it actually *uses* PEP 246 adaptation internally for many things, although with more work it could probably do without it.
However, almost half of those lines of code are consumed by a mini-compiler and mini-interpreter for Python expressions; a built-in implementation of generic functions might be able to get away without having those parts, or at least not so many of them. Also, my implementation supports full predicate dispatch, not just multimethod dispatch, so there's probably even more code that could be eliminated if it was decided not to do the whole nine yards.
Back on the downside, this looks like an invitation to another "language vs. stdlib" debate, since PEP 246 in and of itself is pure library. OTOH, Guido's changing the language to add type declarations anyway, and generic functions are an excellent use case for them. Since he's going to be flamed for changing the language anyway, he might as well be hanged for a sheep as for a goat. :)
Oh, and back on the upside again, it *might* be easier to implement actual type checking with this technique than with PEP 246, because if I write a method expecting a 'file' and somebody calls it with a 'Foo' instance, I can maybe now look at the file operations actually used by the method, and then see if there's an implementation for e.g. 'file.read' defined anywhere for 'Foo'. And, comparable type checking algorithms are more likely to already exist for other languages that include generic functions, than to exist for PEP 246-style adaptation.
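With the toy machinery, for instance, that check collapses to a registry lookup per operation the method actually uses (a sketch, obviously, not a real type-checking algorithm):

    def supports(gf, cls):
        # is an implementation of this operation registered for cls?
        return any(c in gf.methods for c in cls.__mro__)

    supports(read, StringBuffer)    # -> True
    supports(read, dict)            # -> False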
Okay, I'm really out of time now. Hate to dump this in as a possible spoiler on PEP 246, because I was just as excited as Alex about the possibility of it going in. But this whole debate has made me even less enamored of adaptation, and more interested in finding a cleaner, more intuitive way to do it.