The volume of these discussions is (as expected) growing beyond any reasonable bounds; I hope the BDFL can find time to read them but I'm starting to doubt he will. Since obviously we're not going to convince each other, and it seems to me we're at least getting close to pinpointing our differences, maybe we should try to jointly develop an "executive summary" of our differences and briefly stated pros and cons -- a PEP is _supposed_ to have such a section, after all.
Yes, hopefully we will have sufficient convergence to do that soon. For example, I'm going to stop arguing against the use case for Liskov violation, and try looking at alternative implementations. If those don't work out, I'll stop objecting to that item altogether.
the effects of private inheritance could be simulated by delegation to a private auxiliary class, but the extra indirections and complications aren't negligible costs in terms of code complexity and maintainability.
Ah. Well, in PEAK, delegation of methods or even read-only attributes is trivial:
    from peak.api import binding   # PEAK's public API module (assumed import)

    class SomeObj(object):
        meth1 = meth2 = meth3 = binding.Delegate('_delegatee')
        _delegatee = binding.Make(OtherClass)
This class will create a private instance of OtherClass for a given SomeObj instance the first time meth1, meth2, or meth3 is retrieved from that instance.
I bring this up not to say that people should use PEAK for this, but to explain why my perspective was biased; I'm so used to doing this that I tend to forget it's nontrivial if you don't already have these sorts of descriptors available.
Maybe the ability to ``fake'' __class__ can help, but right now I don't see how, because setting __class__ isn't fake at all -- it really affects object behavior and type:
...
So, it doesn't seem to offer a way to fake out isinstance only, without otherwise affecting behavior.
    Python 2.3.4 (#53, May 25 2004, 21:17:02) [MSC v.1200 32 bit (Intel)] on win32
    Type "copyright", "credits" or "license()" for more information.
    >>> class Phony(object):
    ...     def getClass(self):
    ...         return Dummy
    ...     __class__ = property(getClass)
    ...
    >>> class Dummy:
    ...     pass
    ...
    >>> Phony().__class__
    <class __main__.Dummy at 0x00F4CE70>
    >>> isinstance(Phony(), Dummy)
    True
Unfortunately, this still doesn't really help, because isinstance() seems to apply to a union of __class__ and type:
    >>> isinstance(Phony(), Phony)
    True
So, lying about __class__ doesn't fix the issue, because the object is still considered an instance of its real class -- unless adapt() uses only __class__ and doesn't use isinstance().
I can give no example at all in which adapting to a concrete class is a _good_ idea, and I tried to indicate that in the PEP. I just believe that if adaptation does not offer the possibility of using concrete classes as protocols, but rather requires the usage as protocols of some specially blessed 'interface' objects or whatever, then PEP 246 will never fly, (a) because it would then require waiting for the interface thingies to appear, and (b) because people will find it pragmatically useful to just reuse the same classes as protocols too, and too limiting to have to design protocols specifically instead.
Okay, I strongly disagree on this point, because there are people using zope.interface and PyProtocols today, and they are *not* using concrete classes. If PEP 246 were to go into Python 2.5 without interface types, all that would change is that Zope and PyProtocols would check to see if there is an adapt() in builtins and, if not, install their own version.
PEP 246 would certainly be more useful *with* some kind of interface type, but Guido has strongly implied that PEP 246 won't be going in *without* some kind of interface type, so it seems to me academic to say that PEP 246 needs adaptation to concrete types based on isinstance().
In fact, maybe we should drop isinstance() from PEP 246 altogether, and only use __conform__ and __adapt__ to implement adaptation. Thus, to say that you conform to a concrete type, you have to implement __conform__. If this is done, then an abstract base used as an interface can have a __conform__ that answers 'self' for the abstract base used as a protocol, and a Liskov-violating subclass can return 'None' for the abstract base. Inheritance of __conform__ will do the rest.
This approach allows concrete classes and Liskov violations, but simplifies adapt() since it drops the need for isinstance and for the Liskov exception. Further, we could have a default object.__conform__ that does the isinstance check. Then, a Liskov-violating subclass just overrides that __conform__ to block the inheritance it wants to block.
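To make that concrete, here's a minimal sketch of how it might fit together. (Adaptable stands in for object, since we can't actually patch object.__conform__ from pure Python; the adapt() shown is deliberately simplified and omits __adapt__, and all the class names are hypothetical.)

    from types import ClassType   # classic-class type, Python 2.x

    class Adaptable(object):
        """Stand-in for the proposed default object.__conform__."""
        def __conform__(self, protocol):
            # Default rule: an object conforms to any class it is an
            # instance of; returning None means "does not conform".
            if isinstance(protocol, (type, ClassType)) and \
               isinstance(self, protocol):
                return self
            return None

    class AbstractBase(Adaptable):
        pass

    class LiskovSub(AbstractBase):
        def __conform__(self, protocol):
            # Refuse the AbstractBase protocol (the Liskov violation);
            # defer to the default rule for everything else.
            if protocol is AbstractBase:
                return None
            return super(LiskovSub, self).__conform__(protocol)

    def adapt(obj, protocol, default=None):
        # Simplified: no isinstance() fast path, no LiskovViolation;
        # __conform__ does all the work.
        result = obj.__conform__(protocol)
        if result is not None:
            return result
        return default

    assert adapt(AbstractBase(), AbstractBase) is not None
    assert adapt(LiskovSub(), LiskovSub) is not None
    assert adapt(LiskovSub(), AbstractBase) is None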
This approach can't work with a separately-distributed PEP 246 implementation, but it should work quite well for a built-in implementation, and it's backward compatible with the semantics expected by "old" PEP 246 implementations. It means that all objects will have a tp_conform slot that will have to be called, but in most cases it's just going to be a roundabout way of calling isinstance.
For hash, and all kinds of other built-in functions and operations, it *does not matter* whether instance h (of class H) has its own per-instance __hash__ -- H.__hash__ is what gets called anyway. Making adapt work differently gives me the shivers.
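(A quick illustration of that lookup rule, for new-style classes:)

    >>> class H(object):
    ...     def __hash__(self):
    ...         return 42
    ...
    >>> h = H()
    >>> h.__hash__ = lambda: 99    # per-instance attribute...
    >>> hash(h)                    # ...is ignored; H.__hash__ wins
    42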
It's only different because of metaclasses and the absence of tp_conform/tp_adapt issues (assuming the function and module use cases are taken care of by having their tp_conform slots invoke self.__dict__['__conform__'] first).
Anyway, if you adapt a *class* that defines __conform__, you really want to be invoking the *metaclass* __conform__. See Armin Rigo's post re: "metaconfusion" as he calls it.
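A hypothetical sketch of the confusion:

    class Meta(type):
        def __conform__(cls, protocol):
            # What *should* run when the class itself is adapted.
            return None

    class Widget(object):
        __metaclass__ = Meta
        def __conform__(self, protocol):
            # Meant only for *instances* of Widget.
            return None

    # Plain attribute lookup on the class finds the instance-level
    # method (unbound) from Widget.__dict__ -- the "metaconfusion":
    print Widget.__conform__
    # Slot-style lookup on type(Widget) finds the metaclass version,
    # which is what adapt() ought to use when adapting the class:
    print type(Widget).__conform__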
The PEP just said that LiskovViolation would be raised by __conform__ or __adapt__, not that it would be caught by adapt() or that it would be used to control the behavior in that way. Re-reading, I see that you do mention it much farther down. But at the point where __conform__ and __adapt__ are explained, it has not yet been explained that adapt() should catch the error or do anything special with it. It is simply implied by the "to prevent this default behavior" at the end of the section.
If this approach is accepted, the description should be made explicit, because for me at least it required a retroactive re-interpretation of the earlier part of the spec.
OK, I'll add more repetition to the specs, trying to make them more "sequentially readable", even though they were already criticized for repeating some aspects more than once.
It might not be necessary if we agree that the isinstance check should be moved to an object.__conform__ method, and there is no longer a need for a LiskovViolation error to exist.
Basically, we both agree that adaptation must accept some complication to deal with practical real-world issues that are gonna stay around, we just disagree on what those issues are. You appear to think old-style classes will stay around and need to be supported by new core Python functionality, while I think they can be pensioned off;
Currently, exceptions must be classic classes. Do you want to disallow adaptation of exceptions? Are you proposing that ClassType.tp_conform not invoke self.__conform__? I don't see any benefit to omitting that functionality.
you appear to think that programmers' minds will miraculously shift into a mode where they don't need covariance or other Liskov violations, and programmers will happily extract the protocol-ish aspects of their classes into neat pristine protocol objects rather than trying to double-use the classes as protocols too, while I think human nature won't budge much on this respect in the near future.
Well, they're doing it now with Zope and PyProtocols, so it didn't seem like such a big assumption to me. :)
We have now, I hope, amply clarified the roots of our disagreements, so we can wait for BDFL input before the needed PEP 246 rewrites. If his opinions are much closer to yours than to mine, then perhaps the best next step would be to add you as the first author of the PEP and let you perform the next rewrite -- would you be OK with that?
Sure, although I think that if you're willing to not *object* to classic class support, and if we reach agreement on the other issues, it might not be necessary.
I didn't know about the "let the object lie" quirk in isinstance. If that quirk is indeed an intended design feature,
It is; it's in one of the "what's new" feature highlights for either 2.3 or 2.4, I forget which. It was intended to allow proxy objects (like security proxies in Zope 3) to pretend to be an instance of the class they are proxying.
I just grepped through whatsnew23.tex and whatsnew24.tex and could not find it. Can you please help me find the exact spot? Thanks!
Googling "isinstance __class__" returns this as the first hit:
http://mail.python.org/pipermail/python-bugs-list/2003-February/016098.html
Adding "2.3 new" to the query returns this:
http://www.python.org/2.3/highlights.html
which is the "highlights" document I alluded to.
What _have_ you seen called "casting" in Python?
Er, I haven't seen anything called casting in Python, which is why I was confused. :)
Maybe we're using different definitions of "casting"?
I'm most accustomed to the C and Java definitions of casting, so that's probably why I can't see how it relates at all. :)
Well, in C++ you can call (int)x or int(x) with the same semantics -- they're both casts. In C or Java you must use the former syntax, in Python the latter, but they still relate.
Okay, but if you get your definition of "cast" from C and Java then what C++ and Python do are *conversion*, not casting, and what PEP 246 does *is* "casting".
That's why I think there should be no mention of "casting" in the PEP unless you explicitly mention what language you're talking about -- and Python shouldn't be a candidate language. I've been trying to Google references to type casting in Python, and have so far mainly found arguments that Python does not have casting, and one that further asserts that even in C++, "conversion by constructor is not considered a cast." Also, "cast" is of relatively recent vintage in Python documentation; outside of the C API and optional static typing discussions, it seems to have made its debut in a presentation about Python 2.2's changing 'int' and 'str' to type objects.
So, IMO the term has too many uses to add any clarification; it confused me because I thought that in C++ the things you're talking about were called "conversions", not casts.
You could have specified some options (such as the mode) but they took their default value instead ('r' in this case). What's ``lossy'' about accepting defaults?!
Because it means you're making stuff up and tacking it onto the object, not "adapting" the object. As discussed later, this would probably be better called "noisy" adaptation than "lossy".
The adjective "lossy" is overwhelmingly often used in describing compression, and in that context it means, can every bit of the original be recovered (then the compression is lossless) or not (then it's lossy). I can't easily find "lossy" used elsewhere than in compression, it's not even in American Heritage. Still, when you describe a transformation such as 12.3 -> 12 as "lossy", the analogy is quite clear to me. When you so describe the transformation 'foo.txt' -> file('foo.txt'), you've lost me completely: every bit of the original IS still there, as the .name attribute of the file object, so by no stretch of the imagination can I see the "lossiness" -- what bits of information are LOST?
Right, "noisy" is a better word for this; let's move on.
There's a temptation to use it for all kinds of crazy things because it seems cool. However, it takes a while to see that adaptation is just about removing unnecessary accidents-of-incompatibility; it's not a license to transform arbitrary things into arbitrary things. There has to be some *meaning* to a particular adaptation, or the whole concept rapidly degenerates into an undifferentiated mess.
We agree, philosophically. Not sure how the PEP could be enriched to get this across.
A few examples of "good" vs. "bad" adaptation might suffice, if each is accompanied by a brief justification for its classification. The filename/file thing is a good one, int/float or decimal/float is good too. We should present "bad" first, then show how to fix the example to accomplish the intent in a good way. (Like filename->file factory + file->file factory, explicit type conversion for precision-losing conversion, etc.)
(Or else, you decide to "fix" it by disallowing transitive adaptation, which IMO is like cutting off your hand because it hurts when you punch a brick wall. Stop punching brick walls (i.e. using semantic-lossy adaptations), and the problem goes away. But I realize that I'm in the minority here with regards to this opinion.)
I'm not so sure about your being in the minority, having never read for example Guido's opinion in the matter.
I don't know if he has one; I mean that Jim Fulton, Glyph Lefkowitz, and yourself have been outspoken about the "potential danger" of transitive adaptation, apparently based on experience with other systems. (Which seems to me a lot like the "potential danger" of whitespace that people speak of based on bad experiences with Make or Fortran.) There have been comparatively few people who have had been outspoken about the virtues of transitive adaptation, perhaps because for those who use it, it seems quite natural. (I have seen one blog post by someone that was like, "What do you mean those other systems aren't transitive? I thought that was the whole point of adaptation. How else would you do it?")
But, let's take an example of Facade. (Here's the 'later' I kept pointing to;-).
I have three data types / protocols: LotsOfInfo has a bazillion data fields, including personFirstName, personMiddleName, personLastName, ...
PersonName has just two data fields, theFirstName and theLastName.
FullName has three, itsFirst, itsMiddle, itsLast.
The adaptation between such types/protocols has meaning: drop/ignore redundant fields, rename relevant fields, make up missing ones by some convention (empty strings if they have to be strings, None to mean "I dunno" like SQL NULL, etc). But, this *IS* lossy in some cases, in the normal sense: through the facade (simplified interface) I can't access ALL of the bits in the original (information-richer).
Adapting LotsOfInfo -> PersonName is fine; so is LotsOfInfo -> FullName.
Adapting PersonName -> FullName is iffy, because I don't have the deuced middlename information. But that's what NULL aka None is for, so if that's allowed, I can survive.
But going from LotsOfInfo to FullName transitively, by way of PersonName, cannot give the same result as going directly -- the middle name info disappears, because there HAS been a "lossy" step.
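To make the loss concrete, here is a minimal sketch with plain adapter functions (no registry involved, and all the names are hypothetical):

    class LotsOfInfo(object):
        def __init__(self, first, middle, last):
            self.personFirstName = first
            self.personMiddleName = middle
            self.personLastName = last

    class PersonName(object):
        def __init__(self, first, last):
            self.theFirstName = first
            self.theLastName = last

    class FullName(object):
        def __init__(self, first, middle, last):
            self.itsFirst = first
            self.itsMiddle = middle
            self.itsLast = last

    def info_to_person(info):
        # Facade: hide everything but first and last name.
        return PersonName(info.personFirstName, info.personLastName)

    def person_to_full(p):
        # Must invent the middle name; None plays the role of SQL NULL.
        return FullName(p.theFirstName, None, p.theLastName)

    def info_to_full(info):
        # The direct adapter preserves the middle name.
        return FullName(info.personFirstName, info.personMiddleName,
                        info.personLastName)

    info = LotsOfInfo('John', 'Quincy', 'Public')
    print info_to_full(info).itsMiddle                    # 'Quincy'
    print person_to_full(info_to_person(info)).itsMiddle  # None -- lost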
Certainly it is preferable to go direct if it's possible, which is why PyProtocols always converges to the "shortest adapter path". However, if you did *not* have a direct adaptation available from LotsOfInfo to FullName, would it not be *preferable* to have some adaptation than none?
The second point is that conversion from PersonName->FullName is only correct if FullName allows "I don't know" as a valid answer for the middle name. If that's *not* the case, then such a conversion is "noisy" because it is pretending to know the middle name, when that isn't possible.
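(Back to the first point: here's a toy sketch of shortest-path transitive lookup, just to pin down the idea -- this is an illustration, not PyProtocols' actual implementation:)

    from collections import deque

    registry = {}   # (source, target) -> adapter callable

    def register(source, target, adapter):
        registry[(source, target)] = adapter

    def transitive_adapt(obj, source, target):
        # Breadth-first search over registered adapters: the shortest
        # composition path is found first, so a direct adapter always
        # beats a chain through intermediate protocols.
        queue = deque([(source, obj)])
        seen = set([source])
        while queue:
            proto, value = queue.popleft()
            if proto is target:
                return value
            for (src, dst), adapter in registry.items():
                if src is proto and dst not in seen:
                    seen.add(dst)
                    queue.append((dst, adapter(value)))
        return None

With the three adapters from the previous sketch registered, the LotsOfInfo -> PersonName -> FullName chain would be taken only when no direct LotsOfInfo -> FullName adapter exists -- which is precisely when the middle name silently disappears.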
So the issue of "lossy" DOES matter, and I think you muddy things up when you try to apply it to a string -> file adaptation ``by casting'' (opening the file thus named).
Right; as I keep saying, that isn't adaptation, it's conversion. The closest adaptation you can get for the intent is to adapt a string to a file *factory*, that can then be used to open a file.
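A hypothetical sketch of that factory approach:

    class FileFactory(object):
        # All the original information (the name) is preserved;
        # nothing is made up on the object's behalf.
        def __init__(self, name):
            self.name = name
        def open(self, mode='r'):
            # Opening -- and choosing the mode -- is an explicit act.
            return open(self.name, mode)

    def filename_to_factory(name):
        # Adapter from a filename string to the file-factory role.
        return FileFactory(name)

    # f = filename_to_factory('foo.txt').open('r')   # explicit open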
Forbidding lossy adaptation means forbidding facade here; not being allowed to get adaptation from a rich source of information when what's needed is a subset of that info with some renaming and perhaps mixing.
No, it means it's a bad idea to have implicit conversions that result in unintended data loss or "making up" things to fill out data the original data doesn't have. You should explicitly state that you mean to get rid of things, or what things you want to make up.
By the way, the analogy you're drawing between loss of floating point precision and dropping fields from information about a person isn't valid for the definition of "lossy" I'm struggling to clarify. A floating point number is an atomic value, but facts about a person are not made atomic simply by storing them in the same object. So, separating those facts or using only some of them does not lose any relevant semantics.
Forbidding indications of "I don't know" comparable to SQL's NULL (thus forbidding the adaptation PersonName -> FullName) might make the whole scheme incompatible with the common use of relational databases and the like -- probably not acceptable, either.
If your target protocol allows for "I don't know", then consumers of that protocol must be willing to accept "I don't know" for an answer, in which case everything is fine. It's *faking* when you don't know, and the target protocol does *not* allow for not knowing, that is a problem. ("Noisy" adaptation.)
Allowing lossy adaptations, NULLs, _and_ transitivity all together inevitably leads sooner or later to ACCIDENTAL info loss -- the proper adapter to go directly LotsOfInfo -> FullName was not registered, and instead of getting an exception to point out that error, your program limps along having accidentally dropped a piece of information, here the middle name.
But in this case you have explicitly designed a protocol that does not guarantee that you get all the required information! If the information is in fact required, why did you allow it to be null? This makes no sense to me.
OK, but then 12.3 -> 12 should be OK, since the loss of the fractional part IS part of the difference in interfaces, right? And yet it doesn't SMELL like adaptation to me -- which is why I tried to push the issue away with the specific disclaimer about numbers.
The semantics of 12.3 are atomic. Let us say it represents some real-world measurement, 12.3 inches perhaps. In the real world, are those .3 inches somehow separable from the 12? That makes no sense.
IOW, adaptation is all about "as a" relationships from concrete objects to abstract roles, and between abstract roles. Although one may colloquially speak of using a screwdriver "as a" hammer, this is not the case in adaptation. One may use a screwdriver "as a" pounder-of-nails. The difference is that a hammer might also be usable "as a" remover-of-nails. Therefore, there is no general "as a" relationship between pounder-of-nails and remover-of-nails, even though a hammer is usable "as" either one. Thus, it does not make sense to say that a screwdriver is usable "as a" hammer, because this would imply it's also usable to remove nails.
I like the "as a" -- but it can't ignore Facade, I think.
I don't think it's a problem, because 1) your example at least represents facts with relatively independent semantics: you *can* separate a first name from a last name, even though they belong to the same person. And 2) if a target protocol has optional aspects, then lossy adaptation to it is okay by definition. Conversely, if the aspect is *not* optional, then lossy adaptation to it is not acceptable. I don't think there can really be a middle ground; you have to decide whether the information is required or not. If you have a protocol whose semantics cannot provide the required target semantics, then you should explicitly perform the loss or addition of information, rather than doing so implicitly via adaptation.
interface-to-interface adaptation should be reserved for non-lossy, non-noisy adapters.
No Facade, no NULLs? Yes, we disagree about this one: I believe adaptation that occurs by showing just a subset of the info, with renaming etc, is absolutely fine (Facade); and adaptation by using an allowed NULL (say None) to mean "missing information", when going to a "wider" interface, is not pleasant but is sometimes indispensable in the real world -- that's why SQL works in the real world, even though SQL beginners and a few purists hate NULLs with a vengeance.
If you allow for nulls, that's fine -- just be prepared to get them. Real-world databases also have NOT NULL columns for this reason. :)
The points are rather that adaptation that "loses" (actually "hides") some information is something we MUST have;
Agreed.
and adaptation that supplies "I don't know" markers (NULL-like) for some missing information, where that's allowed, is really very desirable.
Also agreed, emphasizing "where that's allowed". The point is, if it's allowed, it's not a problem, is it?
Call this lossy and noisy if you wish, we still can't do without.
No; it's noisy only if the target requires a value and the source has no reasonable way to supply it, requiring you to make something up. And leaving out independent semantics (like first name vs. last name) isn't lossy IMO.
Transitivity is a nice convenience, IF it could be something that an adapter EXPLICITLY claims rather than something just happening by default. I might live with it, grudgingly, if it was the default with some nice easy way to turn it off; my problem with that is -- even if 90% of the cases could afford to be transitive, people will routinely forget to mark the other 10% and mysterious, hard-to-find bugs will result.
Actually, in the cases where I have mistakenly defined a lossy or noisy adaptation, my experience has been that it blows up very rapidly and obviously, often because PyProtocols will detect an adapter ambiguity (two adaptation paths of equal length), and it detects this at adapter registration time, not adaptation time.
However, the more *common* source of a transitivity problem in my experience is in *interface inheritance*, not oddball adapters. As I mentioned previously, a common error is to derive an interface from an interface you require, rather than one you intend your new interface to provide. In the presence of inheritance transitivity (which I have not heard you argue against), this means that you may provide something you don't intend, and therefore allow your interface to be used for something that you didn't intend to guarantee.
Anyway, this problem manifests when you try to adapt something to the base interface, and it works when it really shouldn't. It's more difficult to track down than it ought to be, because looking at the base interface won't tell you anything, and the derived interface might be buried deep in a base class of the concrete object.
But there's no way to positively prevent this class of bugs without prohibiting interface inheritance, which is the most common source of adaptation transitivity bugs in my experience.
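A tiny sketch of that mistake, using plain classes as stand-in interfaces and an isinstance-style adapt():

    class IDataSource(object):
        """Protocol: things data can be read from."""

    class IDataFilter(IDataSource):
        # MISTAKE: derived from IDataSource because a filter *requires*
        # a source -- but inheritance declares that every filter also
        # *provides* IDataSource.
        """Protocol: things that transform data."""

    class UppercaseFilter(IDataFilter):
        pass

    def adapt(obj, protocol, default=None):
        # Inheritance transitivity, via isinstance:
        if isinstance(obj, protocol):
            return obj
        return default

    # Succeeds, but shouldn't: the filter never promised to be a source.
    assert adapt(UppercaseFilter(), IDataSource) is not None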
In PyProtocols docs you specifically warn against adapting from an adapter... yet that's what transitivity intrinsically does!
I warn against *not keeping an original object*, because the original object may be adaptable to things that an adapter is *not*. This is because we don't have an 'IUnknown' to recover the original object, not because of transitivity.
In that case, I generally prefer to be explicit and use conversion rather than using adaptation. For example, if I really mean to truncate the fractional part of a number, I believe it's then appropriate to use 'int(someNumber)' and make it clear that I'm intentionally using a lossy conversion rather than simply treating a number "as an" integer without changing its meaning.
That's how it feels to me FOR NUMBERS, but I can't generalize the feeling to the general case of facade between "records" with many fields of information; see above.
Then perhaps we have made some progress; "records" are typically a collection of facts with independent semantics, while a number is an atomic value. Facts taken in isolation do not alter their semantics, but dropping precision from a value does.
So, to summarize my thoughts from this post:
* Replacing LiskovViolation is possible by dropping type/isinstance checks from adapt(), and adding an isinstance check to object.__conform__; Liskov violators then override __conform__ in their class to return None when asked to conform to a protocol they wish to reject, and return super().__conform__ for all other cases. This achieves your use case while simplifying both the implementation and the usage.
* Classic class support is a must; exceptions are still required to be classic, and even if they weren't in 2.5, backward compatibility should be provided for at least one release.
* Lossy/noisy refer to removing or adding dependent semantics, not independent semantics, so facade-ish adaptation is not lossy or noisy.
* If a target protocol permits NULL, then adaptation that supplies NULL is not noisy or lossy. If it is NOT NULL, then adaptation that supplies NULL is just plain wrong. Either way, there is no issue with transitivity, because either it's allowed or it isn't. (If NULLs aren't allowed, then you should be explicit when you make things up, and not do it implicitly via adaptation.)
* In my experience, incorrectly deriving an interface from another is the most common source of unintended adaptation side-effects, not adapter composition.