Steven D'Aprano <st...@remove-this-cybersource.com.au> writes:

> On Thu, 22 Jan 2009 19:10:05 +0000, Mark Wooding wrote:
>> Well, your claim /was/ just wrong. But if you want to play dumb: the
>> interface is what's documented as being the interface.
>
> But you miss my point.

Evidently.

> We're told Python doesn't have private attributes. We're told that
> we're allowed to "mess with the internals", we're *encouraged* to do
> so: Python gives you the freedom to do so, and any suggestion that
> freedom might be reduced even a tiny bit is fought passionately.

Your deduction skills are faulty.

  * Python gives us the freedom to do so, and we fight to protect that
    freedom -- yes.

  * But interpreting that as encouragement is wrong.

It's permission, not encouragement. If you don't want to, that's fine,
and we won't think less of you. Many things are possible which aren't,
as a general rule, good ideas. Misinterpreting permission as
encouragement will lead you to doing many stupid things.

> When people ask how to implement private attributes, they're often
> told not to bother even using single-underscore names. When it is
> suggested that Python should become stricter, with enforced data
> hiding, the objections come thick and fast: people vehemently say that
> they like Python just the way it is, that they want the ability to
> mess with the internals.
>
> You even argued that you disliked data structures implemented in C and
> preferred those written in Python because you have more ability to
> mess with the private attributes. In context, I had just mentioned
> that lists' internals were inaccessible from Python code. I neglected
> to give an example at the time, but a good example is the current
> length of the list.

Umm... I'm pretty sure that that's available via the `len' function,
which is tied to list.__len__ (via the magic C-implemented-type mangler,
in C). Though it's read-only -- and this is a shame, 'cos it'd be nice
to be able to adjust the length of a list in ways which are more
convenient than

  * deleting or assigning to a trailing slice, or

  * augmenting or assigning to a trailing zero-width slice

(Perl has supported assigning to $#ARRAY for a long time. Maybe that's
a good argument against it.)

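For concreteness, here is a minimal sketch of the two slice idioms just
listed. The helper names `truncate' and `extend_to' are hypothetical,
invented for this illustration; nothing in the standard library goes by
those names.

```python
def truncate(xs, n):
    """Shorten the list xs in place to at most n items."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # Delete the trailing slice; a no-op if xs is already short enough.
    del xs[n:]


def extend_to(xs, n, fill=None):
    """Lengthen the list xs in place to at least n items, padding with fill."""
    # Assign to the zero-width slice at the end of the list.  If n is
    # not larger than len(xs), the right-hand side is empty and nothing
    # changes.
    xs[len(xs):] = [fill] * (n - len(xs))


xs = [1, 2, 3, 4, 5]
truncate(xs, 2)       # drop everything past the first two items
extend_to(xs, 4, 0)   # pad back out to four items with zeros
```

Both helpers mutate the list they are given, which is the nearest
equivalent to setting its length directly.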
> Consider the experience of Microsoft and Apple. No matter how often
> they tell people not to mess with the internals, people do it anyway,
> and always believe that their reason is a good reason.

And Microsoft and Apple can either bend over backwards to preserve
compatibility anyway (which effectively rewards the misbehaviour) or
change the internals. I'd prefer that they did the latter.

There are times when messing with internals is the only way to get
things done; but there's a price to be paid for doing that, and the
price is compatibility. The internals will change in later versions,
and your code will break, in subtle and complex ways. It's not always
an easy decision to make -- but I'm glad it's me that gets to decide,
and not some random who neither knows nor cares much about the problem
I'm trying to solve.

It's also important to bear in mind that programs' lifetimes vary. Some
programs are expected to live for years; some programs only for a week
or so; and some for just long enough to be typed and executed once
(e.g., at the interactive prompt). That Python is useful for all these
kinds of program lifetimes is testament to its designers' skill.

Programmers can, and should!, make different tradeoffs depending on the
expected lifetime of the program they're writing. If I type some hacky
thing at ipython, I know it's going to be executed there and then, and
if the implementation changes tomorrow, I just don't care. If I'm
writing a thing to solve an immediate problem, I won't need it much past
next week, and I'll still probably get away with any awful hacking --
but there's a chance I might reuse the program in a year or so, so I
ought to put a comment in warning the reader of a possible bitrot site.
If I'm writing a thing that's meant to last for years, I need to plan
accordingly, and it's probably not appropriate to hack with internals
without a very good reason.

Making these kinds of decisions isn't easy.
It requires experience, subtle knowledge of how the systems one's using
work, and occasionally a little low cunning. And sometimes one screws
up.

> And Python culture encourages that behaviour (albeit the consequences
> are milder: no buffer overflows or core dumps).
>
> Add to that the culture of Open Source that encourages reading the source
> code. You don't need to buy a book called "Undocumented Tips and Tricks
> for Python" to discover the internals. You just need to read the source
> code.

Indeed. Very useful.

Example: for my cryptographic library bindings, I needed to be able to
convert between Python's `long's and my library's `mp's. I have a
choice between doing it very slowly (using shift and masking operators
on the `long') or fast (by including Python/longintrepr.h and digging
about by hand). I chose to do it the fast way. I'm quite prepared to
rewrite my conversion code (64 lines of it) if the internals change;
that I haven't had to yet indicates that my judgement of the stability
of the internal representation was about right.

The most important point is that, /had/ I turned out to be wrong, I'd
only have myself to blame.

> And then you have at least two places in the standard library where
> _attributes are *explicitly* public:

And documented as being so. It's a convention, with explicitly
documented exceptions. That's a slight shame because it weakens the
convention, but it's not a disaster.

> Given this permissive culture, any responsible library writer must
> assume that if he changes his so-called "private" attributes, he will
> break other people's code.

He will, but he can also assume that the maintainers of that code are
/willing/ to see it break.

This is tough on people who depend on internals by accident. Maybe
they'll learn to be more careful. It's not a pleasant way to learn, but
it's a lesson worth learning anyway.

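A small sketch may make the convention and its documented exceptions
concrete. `collections.namedtuple' is one well-known case where
underscore-prefixed names (`_fields', `_replace') are documented as
public; whether it is one of the two places the quoted text has in mind
is an assumption. The `Account' class and its `_ledger' attribute are
hypothetical, invented purely for illustration.

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
p = Point(1, 2)

# Documented and public, despite the leading underscore.
fields = Point._fields
moved = p._replace(x=10)


class Account:
    """Hypothetical library class: only deposit() and balance() are
    documented as its interface."""

    def __init__(self):
        self._ledger = []   # internal detail; may change without notice

    def deposit(self, amount):
        self._ledger.append(amount)

    def balance(self):
        return sum(self._ledger)


acct = Account()
acct.deposit(50)

# Nothing stops a caller from reaching in.  It works today, and the
# caller owns the breakage if _ledger is ever renamed or restructured.
acct._ledger.append(25)
total = acct.balance()
```

The language treats both kinds of underscore name identically; the only
difference is what the documentation promises.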
> In principle it could break just as much code as if he didn't even
> bother flagging them with a leading underscore, which is probably why
> many people don't even bother with _names.

This comes down to documentation. The Python standard library is
largely quite well documented, and is clear about what assumptions one
can make and what one can't. In the absence of such clear
documentation, we're left with conventions -- _things are likely to
change in future so avoid messing with them if you don't want stuff to
break.

> In other words, if you make it easy for people to mess with your
> internals, if you have a culture that allows and even encourages them
> to mess with your internals, then you don't have internals. Everything
> is de facto public.

And here you've made a semantic leap that I'm afraid I just can't
follow.

> No, cmp() can return an infinite number of values. It just never does,
> at least not yet, but it might. But when Guido himself says that cmp()
> can return three values, can you blame people for acting as if cmp()
> can return three values?

Possibly not! It's worth thinking about codifying the existing practice
and documenting the more constrained behaviour. Note that the C
interface -- the tp_compare slot (Python/C API Reference Manual 10.3) --
/is/ defined to return -1, 0, or +1; so presumably the performance
issues have already been considered.

I think all of this comes down to issues of trust and responsibility.
Python, through its `we're all consenting adults' approach, encourages a
culture where we trust one another to make decisions for ourselves, and
to take responsibility for the consequences of those decisions.
Language features such as attribute (or member) visibility or access
control, on the other hand, imply a culture without trust, and with a
built-in assumption of irresponsibility. That seems rather unpleasant
to me.

Suppose that you write a Python library module and release it.
I find that it's /almost/ the right thing for some program of mine, but
it doesn't quite work properly unless I hack about like so... perfect!
I'm a happy bunny; you've gained a user (maybe that's a good thing,
maybe it isn't!). Now, I've hacked about in your module's internal
stuff: how has this affected you? Answer: not at all; you probably
didn't feel a thing. You release a new version with improved internal
structure and my program breaks: how has this affected you? Answer:
still not at all. How did it affect me? Quite a bit, but then again, I
knew what I was getting into. I gambled and lost; oh, well, that
happens sometimes.

I've not dealt with granularity much yet; but that's easy. Basically,
decisions should be made at the level at which the consequences of those
decisions are felt. This isn't directly practical, but there are
mechanisms to manage it: users generally delegate technical decisions to
the development team; maybe there's a hierarchy in the dev team. And
the team members need to be trusted not to make decisions at the wrong
level. If you can't manage that, then Python probably isn't a good
match for the team; replace one or the other.

Finally, I notice that you completely snipped the part of my reply which
dealt with your ConfigParser module. I'm going to assume that this
means that you accepted that part of my response.

-- [mdw]
--
http://mail.python.org/mailman/listinfo/python-list