Steven D'Aprano <ste...@remove.this.cybersource.com.au> wrote:

> On Mon, 10 Aug 2009 00:32:30 -0700, Douglas Alan wrote:
> > In C++, if I know that the code I'm looking at compiles,
> > then I never need worry that I've misinterpreted what a
> > string literal means.

> If you don't know what your string literals are, you don't
> know what your program does. You can't expect the compiler
> to save you from semantic errors. Adding escape codes into
> the string literal doesn't change this basic truth.

I grow weary of these semantic debates. The bottom line is that
C++'s strategy here catches bugs early on that Python's approach
doesn't. It does so at no additional cost.

From a purely practical point of view, why would any language not
want to adopt a zero-cost approach to catching bugs, even if they
are relatively rare, as early as possible? (Other than the reason
that adopting it *now* is sadly too late.)

Furthermore, Python's strategy here is SPECIFICALLY DESIGNED,
according to the reference manual, to catch bugs. I.e., from the
original posting on this issue:

    Unlike Standard C, all unrecognized escape sequences are left
    in the string unchanged, i.e., the backslash is left in the
    string. (This behavior is useful when debugging: if an escape
    sequence is mistyped, the resulting output is more easily
    recognized as broken.)

If this "feature" is designed to catch bugs, why be half-assed
about it? Especially since there seems to be little valid use case
for allowing programmers to be lazy in their typing here.

> The compiler can't save you from typing 1234 instead of
> 11234, or 31.45 instead of 3.145, or "My darling Ho"
> instead of "My darling Jo", so why do you expect it to
> save you from typing "abc\d" instead of "abc\\d"?

Because in the former cases it can't catch the bug, and in the
latter case, it can.

> Perhaps it can catch *some* errors of that type, but only
> at the cost of extra effort required to defeat the
> compiler (forcing the programmer to type \\d to prevent
> the compiler complaining about \d). I don't think the
> benefit is worth the cost. You and your friend do. Who is
> to say you're right?

Well, Bjarne Stroustrup, for one.

All of these are value judgments, of course, but I truly doubt
that anyone would have been bothered if Python from day one had
behaved the way that C++ does. Additionally, I expect that if
Python had always behaved the way that C++ does, and then today
someone came along and proposed the behavior that Python currently
implements, so that the programmer could sometimes get away with
typing a bit less, such a person would be chided for not
understanding the Zen of Python.

> > You don't have to go running for the manual every time
> > you see code with backslashes, where the upshot might be
> > that the programmer was merely saving themselves some
> > typing.

> Why do you care if there are "funny characters"?

Because, of course, "funny characters" often have interesting
consequences when output. Furthermore, their consequences aren't
always immediately obvious from looking at the source code, unless
you are intimately familiar with the function of the special
characters in question.

For instance, sometimes in the wrong combination, they wedge your
xterm. Etc.

I'm surprised that this needs to be spelled out.

> In C++, if you see an escape you don't recognize, do you
> care?

Yes, of course I do, if I need to know what the program does.

> Do you go running for the manual? If the answer is No,
> then why do it in Python?

The answer is that I do in both cases.

> No. \z *is* a legal escape sequence, it just happens to map to \z.

> If you stop thinking of \z as an illegal escape sequence
> that Python refuses to raise an error for, the problem
> goes away. It's a legal escape sequence that maps to
> backslash + z.

(1) I already used that argument on my friend, and he wasn't
buying it. (Personally, I find the argument technically valid, but
commonsensically invalid. It's a language-lawyer kind of argument,
rather than one that appeals to any notion of real aesthetics.)
(2) That argument disagrees with the Python reference manual,
which explicitly states that "unrecognized escape sequences are
left in the string unchanged", and that the purpose of doing so is
that it "is useful when debugging".

> > "\x" is not a legal escape sequence. Shouldn't it also
> > get left as "\\x"?
>
> No, because it actually is an illegal escape sequence.

What makes it "illegal"? As far as I can tell, it's just another
"unrecognized escape sequence". JavaScript treats it that way. Are
you going to be the one to tell all the JavaScript programmers
that their language can't tell a legal escape sequence from an
illegal one?

> > Well, I think he's more annoyed that if Python is going
> > to be so helpful as to put in the missing "\" for you in
> > "foo\zbar", then it should put in the missing "\" for
> > you in "\". He considers this to be an inconsistency.
>
> (1) There is no missing \ in "foo\zbar".
>
> (2) The problem with "\" isn't a missing backslash, but a
> missing end- quote.

Says who? All of this really depends on your point of view. The
whole morass goes away completely if one adopts C++'s approach
here.

> Python isn't DWIMing here. The rules are simple and straightforward,
> there's no mind-reading or guessing required.

It may not be a complex form of DWIMing, but it's still DWIMing a
bit. Python is figuring that if I typed "\z", then either I must
have really meant to type "\\z", or that I want to see the
backslash when I'm debugging because I made a mistake, or that I'm
just too lazy to type "\\z".

> Is it "a form of DWIMing" to consider 1.234e1 and 12.34
> synonymous?

That's a very different issue, as (1) there are very significant
use cases for both kinds of numerical representations, and (2)
there's often only one obvious way that the number should be
entered, depending on the coding situation.

> What about 86 and 0x44? Is that DWIMing?

See previous comment.
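
For anyone who wants to check the behavior being argued about, here
is a minimal sketch, assuming a Python 3 interpreter. The helper
names parse_literal and is_rejected are made up for this
illustration and are not part of any library. Newer interpreters
emit a warning for unrecognized escapes, which the sketch silences,
and a future version may reject them outright, so treat the
results as describing current behavior only.

```python
import ast
import warnings


def parse_literal(source):
    """Parse the source text of a string literal (quotes included).

    Hypothetical helper: it hides the warning that newer Python
    versions emit for unrecognized escape sequences.
    """
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return ast.literal_eval(source)


def is_rejected(source):
    """Return True if Python refuses to parse the literal at all."""
    try:
        parse_literal(source)
    except (SyntaxError, ValueError):
        return True
    return False


# The source texts are raw strings, so this file itself contains
# no unrecognized escapes.
unrecognized = parse_literal(r'"foo\zbar"')  # backslash is kept
doubled = parse_literal(r'"foo\\zbar"')      # explicit backslash

same = unrecognized == doubled  # both spellings give one string
length = len(unrecognized)      # 8 characters: f o o \ z b a r

# An incomplete \x escape and a lone backslash are both refused.
x_rejected = is_rejected(r'"\x"')
lone_rejected = is_rejected('"' + chr(92) + '"')

print(same, length, x_rejected, lone_rejected)
```

So "\z" is silently accepted and kept, while "\x" and "\" are
errors, which is exactly the asymmetry in dispute.
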
> I'm sure both you and your friend are excellent
> programmers, but you're tossing around DWIM as a
> meaningless term of opprobrium without any apparent
> understand of what DWIM actually is.

I don't know if my friend even knows the term DWIM, other than me
paraphrasing him, but I certainly understand all about the term.
It comes from InterLisp. When DWIM was enabled, your program would
run until it hit an error, and for certain kinds of errors, it
would wait a few seconds for the user to notice the error message,
and if the user didn't tell the program to stop, it would try to
figure out what the user most likely meant, and then continue
running using the computer-generated "fix".

I.e., more or less like continuing on in the face of what the
Python Reference manual refers to as an "unrecognized escape
sequence".

|>ouglas