Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]
Steven D'Aprano st...@remove-this-c...e.com.au wrote:

> Now that I understand what the semantics of cout << "Hello world" are, I don't have any problem with it either. It is a bit weird, "Hello world" >> cout would probably be better, but it's hardly the strangest design in any programming language, and it's probably influenced by input redirection using < in various shells.

I find it strange that you would prefer:

    "Hello world" >> cout

over:

    cout << "Hello world"

The latter seems to me to be more in line with normal assignment:
- Take what is on the right and make the left the same.

I suppose it is because we read from left to right that the first one seems better to you. Another instance of how different we all are.

It goes down to the assembler - there are two schools:

    mov a,b - for Intel-like languages, this means move b to a
    mov a,b - for Motorola-like languages, this means move a to b

Gets confusing sometimes.

- Hendrik

-- http://mail.python.org/mailman/listinfo/python-list
Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]
On Sun, 16 Aug 2009 09:24:36 +0200, Hendrik van Rooyen wrote:

> Steven D'Aprano st...@remove-this-c...e.com.au wrote:
>
>> Now that I understand what the semantics of cout << "Hello world" are, I don't have any problem with it either. It is a bit weird, "Hello world" >> cout would probably be better, but it's hardly the strangest design in any programming language, and it's probably influenced by input redirection using < in various shells.
>
> I find it strange that you would prefer:
>
>     "Hello world" >> cout
>
> over:
>
>     cout << "Hello world"
>
> The latter seems to me to be more in line with normal assignment:
> - Take what is on the right and make the left the same.

I don't like normal assignment. After nearly four decades of mathematics and programming, I'm used to it, but I don't think it is especially good. It confuses beginners to programming: they get one set of behaviour drilled into them in maths class, and then in programming class we use the same notation for something which is almost, but not quite, the same. Consider the difference between:

    y = 3 + x
    x = z

as a pair of mathematics expressions versus as a pair of assignments. What conclusion can you draw about y and z?

Even though it looks funny due to unfamiliarity, I'd love to see the results of a teaching language that used notation like:

    3 + x -> y
    len(alist) -> n
    Widget(1, 2, 3).magic -> obj

etc. for assignment. My prediction is that it would be easier to learn, and just as good for experienced coders. The only downside (apart from unfamiliarity) is that it would be a little bit harder to find the definition of a variable by visually skimming lines of code: your eyes have to zig-zag back and forth to find the end of the line, instead of running straight down the left margin looking for "myvar =". But it should be easy enough to search for "-> myvar".

> I suppose it is because we read from left to right that the first one seems better to you.

Probably.

-- Steven
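A minimal Python sketch of the point about the pair y = 3 + x, x = z read as assignments rather than as equations. The starting values for x and z are hypothetical, chosen only so the two readings give visibly different answers.

```python
# Hypothetical starting values, picked only for illustration.
x = 1
z = 10

y = 3 + x   # y is bound to 4, computed from the value x has right now
x = z       # rebinding x afterwards does not go back and change y

# Read as simultaneous equations, "y = 3 + x" and "x = z" would imply
# y == 3 + z. Read as assignments executed in order, that need not hold.
holds_as_equations = (y == 3 + z)
```

Here y stays 4 while 3 + z is 13, so the conclusion a maths student would draw (y equals 3 + z) does not follow from the two assignments.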
Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]
On Aug 16, 4:22 am, Steven D'Aprano st...@remove-this-cybersource.com.au wrote:

> I don't like normal assignment. After nearly four decades of mathematics and programming, I'm used to it, but I don't think it is especially good. It confuses beginners to programming: they get one set of behaviour drilled into them in maths class, and then in programming class we use the same notation for something which is almost, but not quite, the same. Consider the difference between:
>
>     y = 3 + x
>     x = z
>
> as a pair of mathematics expressions versus as a pair of assignments. What conclusion can you draw about y and z?

Yeah, the syntax most commonly used for assignment today sucks. In the past, it was common to see languages with syntaxes like

    y <- y + 1

or

    y := y + 1

or

    let y = y + 1

But these languages have mostly fallen out of favor. The popular statistical programming language R still uses the

    y <- y + 1

syntax, though.

Personally, my favorite is Lisp, which looks like

    (set! y (+ y 1))

or

    (let ((x 3)
          (y 4))
      (foo x y))

I like to be able to read everything from left to right, and Lisp does that more than any other programming language.

I would definitely not like a language that obscures assignment by moving it over to the right side of lines.

|ouglas
Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]
Steven D'Aprano wrote:

> I don't like normal assignment. After nearly four decades of mathematics and programming, I'm used to it, but I don't think it is especially good. It confuses beginners to programming: they get one set of behaviour drilled into them in maths class, and then in programming class we use the same notation for something which is almost, but not quite, the same. Consider the difference between:
>
>     y = 3 + x
>     x = z
>
> as a pair of mathematics expressions versus as a pair of assignments. What conclusion can you draw about y and z?

What you're saying is true, but it's still a matter of terminology. The symbol = means different things in different contexts, and mathematics and programming are very different ones indeed. The problem is compounded by early languages which lazily confused the two in different contexts, such as (but not exclusive to) BASIC using = for both assignment and equality testing in what are in essence totally unrelated contexts.

> Even though it looks funny due to unfamiliarity, I'd love to see the results of a teaching language that used notation like:
>
>     3 + x -> y
>     len(alist) -> n
>     Widget(1, 2, 3).magic -> obj
>
> etc. for assignment. My prediction is that it would be easier to learn, and just as good for experienced coders.

This really isn't new at all. Reverse the arrow and the relationship to get:

    y <- x + 3

(and use a real arrow rather than ASCII) and that's assignment in APL and a common representation in pseudocode ever since. Change it to := and that's what Pascal used, as well as quite a few mathematical papers dealing with iterative computations, I might add.

Once you get past the point of realizing that you really need to make a distinction between assignment and equality testing, then it's just a matter of choosing two different operators for the job. Whether it's <-/= or :=/= or =/== or ->/= (with reversed behavior for assignment) is really academic and a matter of taste at that point.

Given the history of programming languages, it doesn't really look like the to-be-assigned variable being at the end of the expression is going to get much play, since not a single major one I'm familiar with does it that way, and a lot of them have come up with the same convention independently and haven't seen a need to change.

--
Erik Max Francis m...@alcyone.com http://www.alcyone.com/max/
San Jose, CA, USA 37 18 N 121 57 W AIM/Y!M/Skype erikmaxfrancis
Get there first with the most men. -- Gen. Nathan Bedford Forrest, 1821-1877
Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]
Douglas Alan wrote:

> Personally, my favorite is Lisp, which looks like
>
>     (set! y (+ y 1))

For varying values of "Lisp." `set!` is Scheme.

--
Erik Max Francis m...@alcyone.com http://www.alcyone.com/max/
San Jose, CA, USA 37 18 N 121 57 W AIM/Y!M/Skype erikmaxfrancis
Get there first with the most men. -- Gen. Nathan Bedford Forrest, 1821-1877
Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]
On Aug 16, 4:48 am, Erik Max Francis m...@alcyone.com wrote:

> Douglas Alan wrote:
>
>> Personally, my favorite is Lisp, which looks like
>>
>>     (set! y (+ y 1))
>
> For varying values of "Lisp." `set!` is Scheme.

Yes, I'm well aware! There are probably as many different dialects of Lisp as all other programming languages put together.

|ouglas
Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]
On Sun, 16 Aug 2009 01:41:41 -0700, Douglas Alan wrote:

> I like to be able to read everything from left to right, and Lisp does that more than any other programming language. I would definitely not like a language that obscures assignment by moving it over to the right side of lines.

One could argue that left-assigned-from-right assignment obscures the most important part of the assignment, namely *what* you're assigning, in favour of what you're assigning *to*.

In any case, after half a century of left-from-right assignment, I think it's worth the experiment in a teaching language or three to try it the other way. The closest to this I know of is the family of languages derived from Apple's Hypertalk, where you do assignment with:

    put somevalue into name

(Doesn't COBOL do something similar?)

Beginners found that *very* easy to understand, and it didn't seem to make coding harder for experienced Hypercard developers.

-- Steven
Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]
On Sunday 16 August 2009 12:18:11 Steven D'Aprano wrote:

> In any case, after half a century of left-from-right assignment, I think it's worth the experiment in a teaching language or three to try it the other way. The closest to this I know of is the family of languages derived from Apple's Hypertalk, where you do assignment with:
>
>     put somevalue into name
>
> (Doesn't COBOL do something similar?)

Yup.

    move banana to pineapple.
    move accountnum in inrec to accountnum in outrec.
    move corresponding inrec to outrec.

It should all be upper case of course...

I cannot quite recall, but I have the feeling that in the second form, "of" was also allowed instead of "in", but it has been a while now so I am probably wrong.

The move was powerful - it would do conversions for you based on the types of the operands - it all just worked.

- Hendrik
Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]
Douglas Alan wrote:

> [snip]
> C++ also allows for reading from stdin like so:
>
>     cin >> myVar;
>
> I think the direction of the arrows probably derives from languages like APL, which had notation something like so:
>
>     myVar <- 3
>     [] <- myVar
>
> "<-" was really a little arrow symbol (APL didn't use ascii), and the first line above would assign the value 3 to myVar. In the second line, the "[]" was really a little box symbol and represented the terminal. Assigning to the box would cause the output to be printed on the terminal, so the above would output "3". If you did this:
>
>     [] -> myVar
>
> It would read a value into myVar from the terminal.
>
> APL predates Unix by quite a few years.

No, APL is strictly right-to-left.

    -> x

means "goto x".

Writing to the console is:

    [] <- myVar

Reading from the console is:

    myVar <- []
Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]
On Aug 16, 8:45 am, MRAB pyt...@mrabarnett.plus.com wrote:

> No, APL is strictly right-to-left.
>
>     -> x
>
> means "goto x".
>
> Writing to the console is:
>
>     [] <- myVar
>
> Reading from the console is:
>
>     myVar <- []

Ah, thanks for the correction. It's been 5,000 years since I used APL!

|ouglas
Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]
On Aug 16, 6:18 am, Steven D'Aprano st...@remove-this-cybersource.com.au wrote:

> On Sun, 16 Aug 2009 01:41:41 -0700, Douglas Alan wrote:
>
>> I would definitely not like a language that obscures assignment by moving it over to the right side of lines.
>
> One could argue that left-assigned-from-right assignment obscures the most important part of the assignment, namely *what* you're assigning, in favour of what you're assigning *to*.

The most important things are always the side-effects and the name-bindings.

In a large program, it can be difficult to figure out where a name is defined, or which version of a name a particular line of code is seeing. Consequently languages should always go out of their way to make tracking this as easy as possible.

Side effects are also a huge issue, and a source of many bugs. This is one of the reasons that there are many functional languages that prohibit or discourage side-effects. Side effects should be made as obvious as is feasible.

This is why, for instance, in Scheme, variable assignment has an exclamation mark in it. E.g.,

    (set! x (+ x 1))

The exclamation mark is to make the fact that a side effect is happening there stand out and be immediately apparent. And C++ provides the "const" declaration for similar reasons.

> In any case, after half a century of left-from-right assignment, I think it's worth the experiment in a teaching language or three to try it the other way. The closest to this I know of is the family of languages derived from Apple's Hypertalk, where you do assignment with:
>
>     put somevalue into name

That's okay with me, but only because the statement begins with "put", which lets you know at the very beginning of the line that something very important is happening. You don't have to scan all the way to the right before you notice.

Still, I would prefer

    let name = somevalue

as the "let" gives me the heads up right away, and then immediately after the "let" is the name that I might want to be able to scan for quickly.

|ouglas
Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]
On Sun, 16 Aug 2009 05:05:01 +, Steven D'Aprano wrote:

> Now that I understand what the semantics of cout << "Hello world" are, I don't have any problem with it either. It is a bit weird, "Hello world" >> cout would probably be better,

Placing the stream on the LHS allows the main forms of << to be implemented as methods of the ostream class. C++ only considers the LHS operand when attempting to resolve an infix operator as a method.

Also, << and >> are left-associative, and that cannot be changed by overloading. Having the ostream on the LHS allows the operators to be chained:

    cout << "Hello" << ", " << "world" << endl

equivalent to:

    (((cout << "Hello") << ", ") << "world") << endl

[operator<< returns the ostream as its result.]

Even if you could make >> right-associative, the values would have to be written right-to-left:

    endl >> "world" >> ", " >> "Hello" >> cout

i.e.:

    endl >> ("world" >> (", " >> ("Hello" >> cout)))
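The chaining argument above can be sketched in Python, which also treats << as left-associative and resolves it on the left operand first. The OutStream class is a hypothetical stand-in for ostream written for this illustration; it is not part of any real library.

```python
class OutStream:
    """Hypothetical stand-in for a C++ ostream; collects what is written."""

    def __init__(self):
        self.parts = []

    def __lshift__(self, value):
        # Record the value, then return the stream itself so that the
        # next << in the chain has a stream as its left operand.
        self.parts.append(str(value))
        return self


out = OutStream()
# Parsed as (((out << "Hello") << ", ") << "world") << "\n"
out << "Hello" << ", " << "world" << "\n"
text = "".join(out.parts)

# Writing the parentheses out by hand gives the same result.
explicit = OutStream()
(((explicit << "Hello") << ", ") << "world") << "\n"
```

If __lshift__ returned None instead of self, the second << in the chain would fail, which is the role the bracketed note about operator<< returning the ostream plays in the C++ version.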
Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]
On Friday 14 August 2009 18:11:52 Steven D'Aprano wrote:

> On Fri, 14 Aug 2009 07:07:31 -0700, Aahz wrote:
>
>> "I saw `cout' being shifted "Hello world" times to the left and stopped right there." --Steve Gonedes
>
> Assuming that's something real, and not invented for humour, I presume that's describing something possible in C++. Am I correct? What the hell would it actually do???

It would shift "cout" left "Hello World" times. It is unclear if the shift wraps around or not.

It is similar to a banana *holding his hands apart about a foot* this colour.

- Hendrik
Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]
On Sat, Aug 15, 2009 at 4:47 AM, Hendrik van Rooyen hend...@microcorp.co.za wrote:

> On Friday 14 August 2009 18:11:52 Steven D'Aprano wrote:
>
>> On Fri, 14 Aug 2009 07:07:31 -0700, Aahz wrote:
>>
>>> "I saw `cout' being shifted "Hello world" times to the left and stopped right there." --Steve Gonedes
>>
>> Assuming that's something real, and not invented for humour, I presume that's describing something possible in C++. Am I correct? What the hell would it actually do???
>
> It would shift "cout" left "Hello World" times. It is unclear if the shift wraps around or not.
>
> It is similar to a banana *holding his hands apart about a foot* this colour.
>
> - Hendrik

I think you managed to successfully dereference the null pointer there...

Cheers,
Chris
Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]
On Aug 14, 10:25 pm, Dave Angel da...@ieee.org wrote:

> Benjamin Kaplan wrote:
>
>> On Fri, Aug 14, 2009 at 12:42 PM, Douglas Alan darkwate...@gmail.com wrote:
>>
>>> P.S. Overloading "left shift" to mean "output" does indeed seem a bit sketchy, but in 15 years of C++ programming, I've never seen it cause any confusion or bugs.
>>
>> The only reason it hasn't is because people use it in "Hello World". I bet some newbie C++ programmers get confused the first time they see << used to shift.

People typically get confused by a *lot* of things when they learn a new language. I think the better metric is how people fare with a language feature once they've grown accustomed to the language, and how long it takes them to acquire this familiarity.

> Actually, I've seen it cause confusion, because of operator precedence. The logical shift operators have a fairly high level priority, so sometimes you need parentheses that aren't obvious. Fortunately, most of those cases make compile errors.

I've been programming in C++ so long that for me, if there's any confusion, it's the other way around. I see "<<" or ">>" and I think I/O. I don't immediately think shifting. Fortunately, shifting is a pretty rare operation to actually use, which is perhaps why C++ reclaimed it for I/O.

On the other hand, you are right that the precedence of "<<" is messed up for I/O. I've never seen a real-world case where this causes a bug in C++ code, because the static type-checker always seems to catch the error. In a dynamically typed language, this would be a much more serious problem.

|ouglas

P.S. I find it strange, however, that anyone who is not okay with "abusing" operator overloading in this manner, wouldn't also take umbrage at Python's overloading of "+" to work with strings and lists, etc. Numerical addition and sequence concatenation have entirely different semantics.
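A small Python sketch of the P.S. above, showing one concrete way numeric + and sequence + differ: addition of numbers commutes, concatenation does not. The sample values are arbitrary.

```python
a, b = 2, 3
numeric_commutes = (a + b == b + a)     # addition: order does not matter

s, t = [1, 2], [3]
concat_commutes = (s + t == t + s)      # concatenation: order matters

# One property the two meanings of + do share: lengths add up.
lengths_add = (len(s + t) == len(s) + len(t))
```

This only illustrates the difference in semantics; whether that difference makes the overloading objectionable is the matter of taste the thread is arguing about.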
Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]
I guess the problem is---does it actually matter?

On Fri, Aug 14, 2009 at 10:11 AM, Steven D'Aprano st...@remove-this-cybersource.com.au wrote:

> On Fri, 14 Aug 2009 07:07:31 -0700, Aahz wrote:
>
>> "I saw `cout' being shifted "Hello world" times to the left and stopped right there." --Steve Gonedes
>
> Assuming that's something real, and not invented for humour, I presume that's describing something possible in C++. Am I correct? What the hell would it actually do???
>
> -- Steven
Re: Unrecognized escape sequences in string literals
On Aug 14, 1:55 pm, Steven D'Aprano st...@remove-this-cybersource.com.au wrote:

> Douglas, you and I clearly have a difference of opinion on this. Neither of us have provided even the tiniest amount of objective, replicable, reliable data on the error-proneness of the C++ approach versus that of Python. The supposed superiority of the C++ approach is entirely subjective and based on personal opinion instead of quantitative facts.

Alas, this is true for nearly any engineering methodology or philosophy, which is why, I suppose, Perl, for instance, still has its proponents. It's virtually impossible to prove any thesis, and these things only get decided by endless debate that rages across decades.

> I prefer languages that permit anything that isn't explicitly forbidden, so I'm happy that Python treats non-special escape sequences as valid,

I don't really understand what you mean by this. If Python were to declare that "unrecognized escape sequences" were forbidden, then they would be "explicitly forbidden". Would you then be happy?

If not, why are you not upset that Python won't let me do

    [3, 4, 5] + 2

Some other programming languages I've used certainly do.

> and your attempts to convince me that this goes against the Zen have entirely failed to convince me. As I've done before, I will admit that one consequence of this design is that it makes it hard to introduce new escape sequences to Python. Given that it's vanishingly rare to want to do so,

I'm not so convinced of that in the days of Unicode. If I see, backslash, and then some Kanji character, what am I supposed to make of that? For all I know, that Kanji character might mean newline, and I'm seeing code for a version of Python that was tweaked to be friendly to the Japanese. And in the days where smart hand-held devices are proliferating like crazy, there might be ever-more demand for easy-to-use i/o that lets you control various aspects of those devices.

|ouglas
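For readers who have not tried it, here is a short sketch of what Python does with [3, 4, 5] + 2. The helper name try_add is made up for this example.

```python
def try_add(left, right):
    """Hypothetical helper: attempt left + right and report a TypeError as text."""
    try:
        return left + right
    except TypeError as exc:
        return f"TypeError: {exc}"


rejected = try_add([3, 4, 5], 2)     # list + int is refused by Python
accepted = try_add([3, 4, 5], [2])   # list + list concatenates
```

Python refuses to guess whether the 2 should be appended or added to each element, which is the kind of explicit prohibition being contrasted with the handling of unrecognized escape sequences.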
Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]
On Sat, 15 Aug 2009 20:00:23 -0700, Douglas Alan wrote:

> So, as far as I can tell, Python has no real authority to throw stones at C++ on this little tiny particular issue.

I think you're being a tad over-defensive. I asked a genuine question about a quote in somebody's signature. That's a quote which can be found all over the Internet, and the poster using it has (as far as I know) no official capacity to speak for "Python" -- while Aahz is a high-profile, well-respected Pythonista, he's not Guido.

Now that I understand what the semantics of cout << "Hello world" are, I don't have any problem with it either. It is a bit weird, "Hello world" >> cout would probably be better, but it's hardly the strangest design in any programming language, and it's probably influenced by input redirection using < in various shells.

-- Steven
Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]
On Aug 16, 1:05 am, Steven D'Aprano st...@remove-this-cybersource.com.au wrote:

> On Sat, 15 Aug 2009 20:00:23 -0700, Douglas Alan wrote:
>
>> So, as far as I can tell, Python has no real authority to throw stones at C++ on this little tiny particular issue.
>
> I think you're being a tad over-defensive.

Defensive? Personally, I prefer Python over C++ by about a factor of 100X. I just find it a bit amusing when someone claims that some programming language has a particular fatal flaw, when their own apparently favorite language has the very same issue in an only slightly different form.

> the poster using it has (as far as I know) no official capacity to speak for "Python"

I never thought he did. I wasn't speaking literally, as I'm not under the opinion that any programming language has any literal authority or any literal ability to throw stones.

> Now that I understand what the semantics of cout << "Hello world" are, I don't have any problem with it either. It is a bit weird, "Hello world" >> cout would probably be better, but it's hardly the strangest design in any programming language, and it's probably influenced by input redirection using < in various shells.

C++ also allows for reading from stdin like so:

    cin >> myVar;

I think the direction of the arrows probably derives from languages like APL, which had notation something like so:

    myVar <- 3
    [] <- myVar

"<-" was really a little arrow symbol (APL didn't use ascii), and the first line above would assign the value 3 to myVar. In the second line, the "[]" was really a little box symbol and represented the terminal. Assigning to the box would cause the output to be printed on the terminal, so the above would output "3". If you did this:

    [] -> myVar

It would read a value into myVar from the terminal.

APL predates Unix by quite a few years.

|ouglas
Re: Unrecognized escape sequences in string literals
In article 6e13754c-1fa6-4d1b-8861-146bffec8...@h30g2000vbr.googlegroups.com, Douglas Alan darkwate...@gmail.com wrote:

> My friend begs to differ with the above. It would be much better for debugging if Python generated a parsing error for unrecognized escape sequences, rather than leaving them unchanged. g++ outputs a warning for such escape sequences, for instance. This is what I would consider to be the correct behavior. (Actually, I think it should just generate a fatal parsing error, but a warning is okay too.)

Well, then, the usual response applies: create a patch, discuss it on python-ideas, and see what happens.

(That is, nobody has previously complained so vociferously IIRC, and adding a warning is certainly within the bounds of what's theoretically acceptable.)

--
Aahz (a...@pythoncraft.com) * http://www.pythoncraft.com/
"I saw `cout' being shifted "Hello world" times to the left and stopped right there." --Steve Gonedes
OT Signature quote [was Re: Unrecognized escape sequences in string literals]
On Fri, 14 Aug 2009 07:07:31 -0700, Aahz wrote:

> "I saw `cout' being shifted "Hello world" times to the left and stopped right there." --Steve Gonedes

Assuming that's something real, and not invented for humour, I presume that's describing something possible in C++. Am I correct? What the hell would it actually do???

-- Steven
Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]
On 2009-08-14, Steven D'Aprano st...@remove-this-cybersource.com.au wrote:

> On Fri, 14 Aug 2009 07:07:31 -0700, Aahz wrote:
>
>> "I saw `cout' being shifted "Hello world" times to the left and stopped right there." --Steve Gonedes
>
> Assuming that's something real, and not invented for humour, I presume that's describing something possible in C++. Am I correct?

Yes. In C++, the "<<" operator is overloaded. Judging by the context in which I've seen it used, it does something like write strings to a stream.

> What the hell would it actually do???

IIRC in C++,

    cout << "Hello world";

is equivalent to this in C:

    printf("Hellow world");

or this in Python:

    print "hellow world"

--
Grant Edwards, grante at visi.com
Yow! Bo Derek ruined my life!
Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]
Grant Edwards wrote:

> On 2009-08-14, Steven D'Aprano st...@remove-this-cybersource.com.au wrote:
>
>> On Fri, 14 Aug 2009 07:07:31 -0700, Aahz wrote:
>>
>>> "I saw `cout' being shifted "Hello world" times to the left and stopped right there." --Steve Gonedes
>>
>> Assuming that's something real, and not invented for humour, I presume that's describing something possible in C++. Am I correct?
>
> Yes. In C++, the "<<" operator is overloaded. Judging by the context in which I've seen it used, it does something like write strings to a stream.
>
>> What the hell would it actually do???
>
> IIRC in C++,
>
>     cout << "Hello world";

It also returns cout, so you can chain them:

    cout << "Hello, " << name << '\n';

> is equivalent to this in C:
>
>     printf("Hellow world");
>
> or this in Python:
>
>     print "hellow world"
Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]
On Aug 14, 12:17 pm, Grant Edwards inva...@invalid wrote:

> On 2009-08-14, Steven D'Aprano st...@remove-this-cybersource.com.au wrote:
>
>> On Fri, 14 Aug 2009 07:07:31 -0700, Aahz wrote:
>>
>>> "I saw `cout' being shifted "Hello world" times to the left and stopped right there." --Steve Gonedes
>>
>> Assuming that's something real, and not invented for humour, I presume that's describing something possible in C++. Am I correct?
>
> Yes. In C++, the "<<" operator is overloaded. Judging by the context in which I've seen it used, it does something like write strings to a stream.

There's a persistent rumor that it is *this* very "abuse" of overloading that caused Java to avoid operator overloading altogether. But then Java went and used "+" as the string concatenation operator. Go figure!

|ouglas

P.S. Overloading "left shift" to mean "output" does indeed seem a bit sketchy, but in 15 years of C++ programming, I've never seen it cause any confusion or bugs.
Re: Unrecognized escape sequences in string literals
I think I've spent enough time on this discussion, so I won't be directly responding to any of your recent points -- it's clear that I'm not persuading you that there's any justification for any behaviour for escape sequences other than the way C++ deals with them. That's your prerogative, of course, but I've done enough tilting at windmills for this week, so I'll just make one final comment and then withdraw from an unproductive argument. (I will make an effort to read any final comments you wish to make, so feel free to reply. Just don't expect an answer to any questions.)

Douglas, you and I clearly have a difference of opinion on this. Neither of us have provided even the tiniest amount of objective, replicable, reliable data on the error-proneness of the C++ approach versus that of Python. The supposed superiority of the C++ approach is entirely subjective and based on personal opinion instead of quantitative facts.

I prefer languages that permit anything that isn't explicitly forbidden, so I'm happy that Python treats non-special escape sequences as valid, and your attempts to convince me that this goes against the Zen have entirely failed to convince me. As I've done before, I will admit that one consequence of this design is that it makes it hard to introduce new escape sequences to Python. Given that it's vanishingly rare to want to do so, and that wanting to add backslashes to strings is common, I think that's a reasonable tradeoff. Other languages may make different tradeoffs, and that's fine by me.

-- Steven
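A hedged sketch of the behaviour under debate, for readers following along with an interpreter. It assumes a Python version that still accepts unrecognized escape sequences; some later releases emit a warning for them at compile time, so the literals are parsed through a small hypothetical helper (parse_literal) that silences warnings rather than being written directly in the source.

```python
import warnings


def parse_literal(source):
    """Hypothetical helper: evaluate a string-literal source text quietly."""
    with warnings.catch_warnings():
        # Some interpreter versions warn about unrecognized escapes;
        # ignore that here so only the resulting value is examined.
        warnings.simplefilter("ignore")
        return eval(source)


recognized = parse_literal(r'"\n"')     # a recognized escape: one newline character
unrecognized = parse_literal(r'"\y"')   # not recognized: backslash is kept as-is
```

On such an interpreter, recognized has length 1 while unrecognized is the two characters backslash and y, which is the "left unchanged" behaviour that one side of the thread considers convenient and the other considers error prone.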
Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]
Grant Edwards wrote:

> On 2009-08-14, Steven D'Aprano st...@remove-this-cybersource.com.au wrote:
>
>> What the hell would it actually do???
>
> IIRC in C++,
>
>     cout << "Hello world";
>
> is equivalent to this in C:
>
>     printf("Hellow world");
>
> or this in Python:
>
>     print "hellow world"

Well, plus or minus newlines.

--
Erik Max Francis m...@alcyone.com http://www.alcyone.com/max/
San Jose, CA, USA 37 18 N 121 57 W AIM/Y!M/Skype erikmaxfrancis
It's hard to say what I want my legacy to be when I'm long gone. -- Aaliyah
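The "plus or minus newlines" remark can be checked with a few lines of Python. This sketch uses Python 3's print function (the thread predates it and uses the Python 2 print statement) and in-memory buffers from the standard io module instead of the real stdout.

```python
import io

# Writing directly to a stream adds nothing, like cout << "Hello world".
written = io.StringIO()
written.write("Hello world")

# print appends a newline by default, which the C++ line does not.
printed = io.StringIO()
print("Hello world", file=printed)

# Passing end="" makes print match the bare write.
printed_no_newline = io.StringIO()
print("Hello world", end="", file=printed_no_newline)
```

So the Python line quoted above is the C++ one "plus" a trailing newline, while the C printf call, like the stream write, emits none.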
Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]
On 2009-08-14, Erik Max Francis m...@alcyone.com wrote:

> Grant Edwards wrote:
>
>> On 2009-08-14, Steven D'Aprano st...@remove-this-cybersource.com.au wrote:
>>
>>> What the hell would it actually do???
>>
>> IIRC in C++,
>>
>>     cout << "Hello world";
>>
>> is equivalent to this in C:
>>
>>     printf("Hellow world");
>>
>> or this in Python:
>>
>>     print "hellow world"
>
> Well, plus or minus newlines.

And a few miscellaneous typos...

--
Grant Edwards, grante at visi.com
Yow! I don't understand the HUMOUR of the THREE STOOGES!!
Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]
Grant Edwards wrote:

> On 2009-08-14, Erik Max Francis m...@alcyone.com wrote:
>
>> Grant Edwards wrote:
>>
>>> On 2009-08-14, Steven D'Aprano st...@remove-this-cybersource.com.au wrote:
>>>
>>>> What the hell would it actually do???
>>>
>>> IIRC in C++,
>>>
>>>     cout << "Hello world";
>>>
>>> is equivalent to this in C:
>>>
>>>     printf("Hellow world");
>>>
>>> or this in Python:
>>>
>>>     print "hellow world"
>>
>> Well, plus or minus newlines.
>
> And a few miscellaneous typos...

... and includes and namespaces :-).

--
Erik Max Francis m...@alcyone.com http://www.alcyone.com/max/
San Jose, CA, USA 37 18 N 121 57 W AIM/Y!M/Skype erikmaxfrancis
It's hard to say what I want my legacy to be when I'm long gone. -- Aaliyah
Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]
On Fri, Aug 14, 2009 at 12:42 PM, Douglas Alan darkwate...@gmail.com wrote:

> P.S. Overloading "left shift" to mean "output" does indeed seem a bit sketchy, but in 15 years of C++ programming, I've never seen it cause any confusion or bugs.

The only reason it hasn't is because people use it in "Hello World". I bet some newbie C++ programmers get confused the first time they see << used to shift.
Re: Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]
Benjamin Kaplan wrote: On Fri, Aug 14, 2009 at 12:42 PM, Douglas Alan darkwate...@gmail.com wrote: P.S. Overloading left shift to mean output does indeed seem a bit sketchy, but in 15 years of C++ programming, I've never seen it cause any confusion or bugs. The only reason it hasn't is because people use it in Hello World. I bet some newbie C++ programmers get confused the first time they see << used to shift. Actually, I've seen it cause confusion, because of operator precedence. The shift operators have a fairly high precedence, so sometimes you need parentheses that aren't obvious. Fortunately, most of those cases produce compile errors. C++ has about 17 levels of precedence, plus some confusing associativity rules. And operator overloading does *NOT* change precedence. DaveA
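The precedence point can be tried out in Python as well, since overloading `<<` there does not change its precedence either. What follows is only a rough analogue, not C++: the `Out` class is a hypothetical stand-in for an output stream, invented for this illustration. In both languages `&` binds less tightly than `<<`, so the unparenthesized form groups as `(out << 6) & 3`.

```python
class Out:
    """Toy stand-in for an output stream (hypothetical); records what is 'written'."""

    def __init__(self):
        self.items = []

    def __lshift__(self, value):
        self.items.append(value)
        return self  # returning self allows chaining: out << a << b


out = Out()

out << 1 + 2  # + binds tighter than <<, so the sum is written

try:
    out << 6 & 3  # groups as (out << 6) & 3, and Out defines no & operator
    and_error = None
except TypeError as exc:
    and_error = exc

out << (6 & 3)  # parentheses give the grouping that was probably intended

print(out.items)
print(type(and_error).__name__)
```

Note that the unparenthesized line still writes 6 before the failure, which is the kind of surprise the missing parentheses cause. In C++ the equivalent mistake is usually rejected at compile time, as the message above says.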
Re: Unrecognized escape sequences in string literals
On Aug 12, 7:19 pm, Steven D'Aprano st...@remove-this- cybersource.com.au wrote: You are making an unjustified assumption: \y is not an error. You are making in an unjustified assumption that I ever made such an assumption! My claim is and has always been NOT that \y is inately an error, but rather that treating unrecognized escape sequences as legal escape sequences is error PRONE. While I'm amused that you've made my own point for me, I'm less amused that you seem to be totally incapable of seeing past your parochial language assumptions, Where do you get the notion that my assumptions are in any sense parochial? They come from (1) a great deal of experience programming very reliable software, and (2) having learned at least two dozen different programming languages in my life. I disagree with nearly everything you say in this post. I think that a few points you make have some validity, but the vast majority are based on a superficial and confused understanding of language design principles. Whatever. I've taken two graduate level classes at MIT on programming languages design, and got an A in both classes, and designed my own programming language as a final project, and received an A+. But I guess I don't really know anything about the topic at all. But it's not the only reasonable design choice, and Bash has made a different choice, and Python has made yet a third reasonable choice, and Pascal made yet a fourth reasonable choice. And so did Perl and PHP, and whatever other programming language you happen to mention. In fact, all programming languages are equally good, so we might as well just freeze all language design as it is now. Clearly we can do no better. One party insisting that red is the only logical colour for a car, and that anybody who prefers white or black or blue is illogical, is unacceptable. If having all cars be red saved a lot of lives, or increased gas mileage significantly, then it might very well be the best color for a car. 
But of course, that is not the case. With programming languages, there is much more likely to be an actual fact of the matter on which sorts of language design decisions make programmers more productive on average, and which ones result in more reliable software. I will certainly admit that obtaining objective data on such things is very difficult, but it's a completely different thing than one's color preference for their car. |ouglas
Re: Unrecognized escape sequences in string literals
On Tue, 11 Aug 2009 14:48:24 -0700, Douglas Alan wrote: In any case, my argument has consistently been that Python should have treated undefined escape sequences consistently as fatal errors, A reasonable position to take. I disagree with it, but it is certainly reasonable. not as warnings. I don't know what language you're talking about here, because non-special escape sequences in Python aren't either errors or warnings:

>>> print "ab\cd"
ab\cd

No warning is made, because it's not considered an error that requires a warning. This matches the behaviour of other languages, including C and bash. -- Steven
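A minimal sketch of the behaviour described in the message above, written for Python 3. The helper name `parse_literal` is made up for this example. It evaluates the source text of a literal through `ast.literal_eval` and silences warnings, because recent CPython releases do warn about unrecognized escapes even though the resulting string still keeps the backslash.

```python
import ast
import warnings


def parse_literal(source):
    """Evaluate the source text of a string literal, ignoring any warnings."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return ast.literal_eval(source)


# The raw strings below hold the literal exactly as it would be typed in source code.
unrecognized = parse_literal(r'"ab\cd"')  # \c is not a recognized escape
recognized = parse_literal(r'"ab\td"')    # \t is the tab character

print(len(unrecognized), repr(unrecognized))  # 5 'ab\\cd'
print(len(recognized), repr(recognized))      # 4 'ab\td'
```

The first literal comes out five characters long because the backslash survives; the second is four characters long because `\t` collapses to a single tab.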
Re: Unrecognized escape sequences in string literals
On Tue, 11 Aug 2009 13:20:52 -0700, Douglas Alan wrote: On Aug 11, 2:00 pm, Steven D'Aprano st...@remove-this- cybersource.com.au wrote: test.cpp:1:1: warning: unknown escape sequence '\y' Isn't that a warning, not a fatal error? So what does temp contain? My Annotated C++ Reference Manual is packed, and surprisingly in Stroustrup's Third Edition, there is no mention of the issue in the entire 1,000 pages. But Microsoft to the rescue: If you want a backslash character to appear within a string, you must type two backslashes (\\) (From http://msdn.microsoft.com/en-us/library/69ze775t.aspx) Should I assume that Microsoft's C++ compiler treats it as an error, not a warning? Or is is this *still* undefined behaviour, and MS C++ compiler will happily compile ab\cd whatever it feels like? The question of what any specific C++ does if you ignore the warning is irrelevant, as such behavior in C++ is almost *always* undefined. Hence the warning. So a C++ compiler which follows Python's behaviour would be behaving within the language specifications. I note that the bash shell, which claims to follow C semantics, also does what Python does: $ echo $'a s\trin\g with escapes' a s rin\g with escapes Explain to me again why we're treating underspecified C++ semantics, which may or may not do *exactly* what Python does, as if it were the One True Way of treating escape sequences? -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: Unrecognized escape sequences in string literals
On Tue, 11 Aug 2009 14:29:43 -0700, Douglas Alan wrote: I need to preface this entire post with the fact that I've already used ALL of the arguments that you've provided on my friend before I ever even came here with the topic, and my own arguments on why Python can be considered to be doing the right thing on this issue didn't even convince ME, much less him. When I can't even convince myself with an argument I'm making, then you know there's a problem with it! I hear all your arguments, and to play Devil's Advocate I repeat them, and they don't convince me either. So by your logic, there's obviously a problem with your arguments as well! That problem basically boils down to a deep-seated philosophical disagreement over which philosophy a language should follow in regard to backslash escapes: Anything not explicitly permitted is forbidden versus Anything not explicitly forbidden is permitted Python explicitly permits all escape sequences, with well-defined behaviour, with the only ones forbidden being those explicitly forbidden: * hex escapes with invalid hex digits; * oct escapes with invalid oct digits; * Unicode named escapes with unknown names; * 16- and 32-bit Unicode escapes with invalid hex digits. C++ apparently forbids all escape sequences, with unspecified behaviour if you use a forbidden sequence, except for a handful of explicitly permitted sequences. That's not better, it's merely different. Actually, that's not true -- that the C++ standard forbids a thing, but leaves the consequences of doing that thing unspecified, is clearly a Bad Thing. [...] Apart from the lack of warning, what actually is the difference between Python's behavior and C++'s behavior? That question makes just about as much sense as, Apart from the lack of a fatal error, what actually is the difference between Python's behavior and C++'s? 
This is what I get:

[steve ~]$ cat test.cc
#include <iostream>
int main(int argc, char* argv[])
{
std::cout << "x\yz" << std::endl;
return 0;
}
[steve ~]$ g++ test.cc -o test
test.cc:4:14: warning: unknown escape sequence '\y'
[st...@soy ~]$ ./test
xyz

So on at least one machine in the world, C++ simply strips out backslashes that it doesn't recognise, leaving the suffix. Unfortunately, we can't rely on that, because C++ is underspecified. Fortunately this is not a problem with Python, which does completely specify the behaviour of escape sequences so there are no surprises. [...]

I disagree with your sense of aesthetics. I think that having to write \\y when I want \y just to satisfy a bondage-and-discipline compiler is ugly. That's not to deny that B&D is useful on occasion, but in this case I believe the benefit is negligible, and so even a tiny cost is not worth the pain.

EXPLICIT IS BETTER THAN IMPLICIT.

Quoting the Zen without understanding (especially shouting) doesn't impress anyone. There's nothing implicit about escape sequences. \y is perfectly explicit. Look Ma, there's a backslash, and a y, it gives a backslash and a y! Implicit has an actual meaning. You shouldn't use it as a mere term of opprobrium for anything you don't like.

(2) That argument disagrees with the Python reference manual, which explicitly states that unrecognized escape sequences are left in the string unchanged, and that the purpose for doing so is because it is useful when debugging.

How does it disagree? \y in the source code mapping to \y in the string object is the sequence being left unchanged. And the usefulness of doing so is hardly a disagreement over the fact that it does so.

Because you've stated that \y is a legal escape sequence, while the Python Reference Manual explicitly states that it is an unrecognized escape sequence, and that such unrecognized escape sequences are sources of bugs.

There's that reading comprehension problem again. Unrecognised != illegal.
Useful for debugging != source of bugs. If they were equal, we could fix an awful lot of bugs by throwing away our debugging tools.

Here's the URL to the relevant page: http://www.python.org/doc/2.5.2/ref/strings.html

It seems to me that the behaviour the Python designers were looking to avoid was the case where the coder accidentally inserted a backslash in the wrong place, and the language stripped the backslash out, e.g.: Wanted "a\bcd" but accidentally typed "ab\cd" instead, and got "abcd". (This is what Bash does by design, and at least some C/C++ compilers do, perhaps by accident, perhaps by design.) In that case, with no obvious backslash, the user may not even be aware that there was a problem:

    s = "ab\cd"  # assume the backslash is silently discarded
    assert len(s) == 4
    assert s[2] == 'c'
    assert '\\' not in s

All of these tests would wrongly pass, but with Python's behaviour of leaving the backslash in, they would all fail, and the string is visually distinctive (it has an obvious backslash in it). Now, if you consider that \c should be
Re: Unrecognized escape sequences in string literals
On Aug 12, 3:08 am, Steven D'Aprano ste...@remove.this.cybersource.com.au wrote: On Tue, 11 Aug 2009 14:48:24 -0700, Douglas Alan wrote: In any case, my argument has consistently been that Python should have treated undefined escape sequences consistently as fatal errors, A reasonable position to take. I disagree with it, but it is certainly reasonable. not as warnings. I don't know what language you're talking about here, because non-special escape sequences in Python aren't either errors or warnings: print ab\cd ab\cd I was talking about C++, whose compilers tend to generate warnings for this usage. I think that the C++ compilers I've used take the right approach, only ideally they should be *even* more emphatic, and elevate the problem from a warning to an error. I assume, however, that the warning is a middle ground between doing the completely right thing, and, I assume, maintaining backward compatibility with common C implementations. As Python never had to worry about backward compatibility with C, Python didn't have to walk such a middle ground. On the other hand, *now* it has to worry about backward compatibility with itself. |ouglas -- http://mail.python.org/mailman/listinfo/python-list
Re: Unrecognized escape sequences in string literals
On Aug 12, 3:36 am, Steven D'Aprano ste...@remove.this.cybersource.com.au wrote: On Tue, 11 Aug 2009 13:20:52 -0700, Douglas Alan wrote: My Annotated C++ Reference Manual is packed, and surprisingly in Stroustrup's Third Edition, there is no mention of the issue in the entire 1,000 pages. But Microsoft to the rescue: If you want a backslash character to appear within a string, you must type two backslashes (\\) (From http://msdn.microsoft.com/en-us/library/69ze775t.aspx) Should I assume that Microsoft's C++ compiler treats it as an error, not a warning? In my experience, C++ compilers generally generate warnings for such situations, where they can. (Clearly, they often can't generate warnings for running off the end of an array, which is also undefined, though a really smart C++ compiler might be able to generate a warning in certain such circumstances.) Or is is this *still* undefined behaviour, and MS C++ compiler will happily compile ab\cd whatever it feels like? If it's a decent compiler, it will generate a warning. Who can say with Microsoft, however. It's clearly documented as illegal code, however. The question of what any specific C++ does if you ignore the warning is irrelevant, as such behavior in C++ is almost *always* undefined. Hence the warning. So a C++ compiler which follows Python's behaviour would be behaving within the language specifications. It might be, but there are also *recommendations* in the C++ standard about what to do in such situations, and the recommendations say, I am pretty sure, not to do that, unless the particular compiler in question has to meet some very specific backward compatibility needs. I note that the bash shell, which claims to follow C semantics, also does what Python does: $ echo $'a s\trin\g with escapes' a s rin\g with escapes Really? Not on my computers. (One is a Mac, and the other is a Fedora Core Linux box.) On my computers, bash doesn't seem to have *any* escape sequences, other than \\, \, \$, and \`. 
It seems to treat unknown escape sequences the same as Python does, but as there are only four known escape sequences, and they are all meant merely to guard against string interpolation, and the like, it's pretty darn easy to keep straight. Explain to me again why we're treating underspecified C++ semantics, which may or may not do *exactly* what Python does, as if it were the One True Way of treating escape sequences? I'm not saying that C++ does it right for Python. The right thing for Python to do is to generate an error, as Python doesn't have to deal with all the crazy complexities that C++ has to. |ouglas -- http://mail.python.org/mailman/listinfo/python-list
Re: Unrecognized escape sequences in string literals
On Aug 12, 5:32 am, Steven D'Aprano ste...@remove.this.cybersource.com.au wrote: That problem basically boils down to a deep-seated philosophical disagreement over which philosophy a language should follow in regard to backslash escapes: Anything not explicitly permitted is forbidden versus Anything not explicitly forbidden is permitted No, it doesn't. It boils down to whether a language should: (1) Try it's best to detect errors as early as possible, especially when the cost of doing so is low. (2) Make code as readable as possible, in part by making code as self-evident as possible by mere inspection and by reducing the amount of stuff that you have to memorize. Perl fails miserably in this regard, for instance. (3) To quote Einstein, make everything as simple as possible, and no simpler. (4) Take innately ambiguous things and not force them to be unambiguous by mere fiat. Allowing a programmer to program using a completely arbitrary resolution of unrecognized escape sequences violates all of the above principles. The fact that the meanings of unrecognized escape sequences are ambiguous is proved by the fact that every language seems to treat them somewhat differently, demonstrating that there is no natural intuitive meaning for them. Furthermore, allowing programmers to use unrecognized escape sequences without raising an error violates: (1) Explicit is better than implicit: Python provides a way to explicitly specify that you want a backslash. Every programmer should be encouraged to use Python's explicit mechanism here. (2) Simple is better than complex: Python currently has two classes of ambiguously interpretable escape sequences: unrecognized ones, and illegal ones. Making a single class (i.e. just illegal ones) is simpler. Also, not having to memorize escape sequences that you rarely have need to use is simpler. (3) Readability counts: See above comments on readability. 
(4) Errors should never pass silently: Even the Python Reference Manual indicates that unrecognized escape sequences are a source of bugs. (See more comments on this below.) (5) In the face of ambiguity, refuse the temptation to guess. Every language, other than C++, is taking a guess at what the programmer would find to be most useful expansion for unrecognized escape sequences, and each of the languages is guessing differently. This temptation should be refused! You can argue that once it is in the Reference Manual it is no longer a guess, but that is patently specious, as Perl proves. For instance, the fact that Perl will quietly convert an array into a scalar for you, if you assign the array to a scalar variable is certainly a guess of the sort that this Python koan is referring to. Likewise for an arbitrary interpretation of unrecognized escape sequences. (6) There should be one-- and preferably only one --obvious way to do it. What is the one obvious way to express \\y? It is \\y or \y? Python can easily make one of these ways the one obvious way by making the other one raise an error. (7) Namespaces are one honking great idea -- let's do more of those! Allowing \y to self-expand is intruding into the namespace for special characters that require an escape sequence. C++ apparently forbids all escape sequences, with unspecified behaviour if you use a forbidden sequence, except for a handful of explicitly permitted sequences. That's not better, it's merely different. It *is* better, as it catches errors early on at little cost, and for all the other reasons listed above. Actually, that's not true -- that the C++ standard forbids a thing, but leaves the consequences of doing that thing unspecified, is clearly a Bad Thing. Indeed. But C++ has backward compatibly issues that make any that Python has to deal with, pale in comparison. The recommended behavior for a C++ compiler, however, is to flag the problem as an error or as a warning. 
So on at least one machine in the world, C++ simply strips out backslashes that it doesn't recognize, leaving the suffix. Unfortunately, we can't rely on that, because C++ is underspecified. No, *fortunately* you can't rely on it, forcing you to go fix your code. Fortunately this is not a problem with Python, which does completely specify the behaviour of escape sequences so there are no surprises. It's not a surprise when the C++ compiler issues a warning to you. If you ignore the warning, then you have no one to blame but yourself. Implicit has an actual meaning. You shouldn't use it as a mere term of opprobrium for anything you don't like. Pardon me, but I'm using implicit to mean implicit, and nothing more. Python's behavior here is implicit in the very same way that Perl implicitly converts an array into a scalar for you. (Though that particular Perl behavior is a far bigger wart than Python's behavior is here!) Because you've stated that \y is a legal escape sequence, while the Python Reference Manual explicitly states
Re: Unrecognized escape sequences in string literals
On Wed, 12 Aug 2009 14:21:34 -0700, Douglas Alan wrote: On Aug 12, 5:32 am, Steven D'Aprano ste...@remove.this.cybersource.com.au wrote: That problem basically boils down to a deep-seated philosophical disagreement over which philosophy a language should follow in regard to backslash escapes: Anything not explicitly permitted is forbidden versus Anything not explicitly forbidden is permitted No, it doesn't. It boils down to whether a language should: (1) Try it's best to detect errors as early as possible, especially when the cost of doing so is low. You are making an unjustified assumption: \y is not an error. It is only an error if you think that anything not explicitly permitted is forbidden. While I'm amused that you've made my own point for me, I'm less amused that you seem to be totally incapable of seeing past your parochial language assumptions, even when those assumptions are explicitly pointed out to you. Am I wasting my time engaging you in discussion? There's a lot more I could say, but time is short, so let me just summarise: I disagree with nearly everything you say in this post. I think that a few points you make have some validity, but the vast majority are based on a superficial and confused understanding of language design principles. (I won't justify that claim now, perhaps later, time permitting.) Nevertheless, I think that your ultimate wish -- for \y etc to be considered an error -- is a reasonable design choice, given your assumptions. But it's not the only reasonable design choice, and Bash has made a different choice, and Python has made yet a third reasonable choice, and Pascal made yet a fourth reasonable choice. These are all reasonable choices, all have some good points and some bad points, but ultimately the differences between them are mostly arbitrary personal preference, like the colour of a car. Disagreements over preferences I can live with. 
One party insisting that red is the only logical colour for a car, and that anybody who prefers white or black or blue is illogical, is unacceptable. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: Unrecognized escape sequences in string literals
Steven D'Aprano wrote: On Wed, 12 Aug 2009 14:21:34 -0700, Douglas Alan wrote: On Aug 12, 5:32 am, Steven D'Aprano ste...@remove.this.cybersource.com.au wrote: That problem basically boils down to a deep-seated philosophical disagreement over which philosophy a language should follow in regard to backslash escapes: Anything not explicitly permitted is forbidden versus Anything not explicitly forbidden is permitted No, it doesn't. It boils down to whether a language should: (1) Try it's best to detect errors as early as possible, especially when the cost of doing so is low. You are making an unjustified assumption: \y is not an error. It is only an error if you think that anything not explicitly permitted is forbidden. While I'm amused that you've made my own point for me, I'm less amused that you seem to be totally incapable of seeing past your parochial language assumptions, even when those assumptions are explicitly pointed out to you. Am I wasting my time engaging you in discussion? There's a lot more I could say, but time is short, so let me just summarise: I disagree with nearly everything you say in this post. I think that a few points you make have some validity, but the vast majority are based on a superficial and confused understanding of language design principles. (I won't justify that claim now, perhaps later, time permitting.) Nevertheless, I think that your ultimate wish -- for \y etc to be considered an error -- is a reasonable design choice, given your assumptions. But it's not the only reasonable design choice, and Bash has made a different choice, and Python has made yet a third reasonable choice, and Pascal made yet a fourth reasonable choice. IHMO, it would've been simpler in the long run to say that backslash followed by one of [0-9A-Za-z] is an escape sequence, backslash followed by newline is ignored, and backslash followed by anything else is that something. 
That way there would be a way to introduce additional escape sequences without breaking existing code. -- http://mail.python.org/mailman/listinfo/python-list
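The rule proposed in the message above can be sketched as a small decoder. This is only an illustration under stated assumptions: the `KNOWN` table is a made-up, deliberately tiny set of escapes, and raising an error for a letter or digit that has no assigned meaning yet is one reading of "reserved for future escape sequences", not something the proposal spells out.

```python
import string

# Hypothetical escape table; a real language would define many more entries.
KNOWN = {"n": "\n", "t": "\t", "0": "\0"}
ALNUM = set(string.ascii_letters + string.digits)


def decode(text):
    """Decode backslash escapes in `text` following the proposed three-way rule."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 == len(text):
            raise ValueError("dangling backslash at end of input")
        nxt = text[i + 1]
        if nxt == "\n":
            pass  # backslash followed by newline: both characters are dropped
        elif nxt in ALNUM:
            if nxt not in KNOWN:
                # Assumption: unassigned letters and digits are reserved, so reject them.
                raise ValueError("reserved escape sequence: \\" + nxt)
            out.append(KNOWN[nxt])
        else:
            out.append(nxt)  # backslash followed by anything else is that character
        i += 2
    return "".join(out)


print(repr(decode(r"a\tb")))          # 'a\tb'
print(repr(decode(r"a\\b")))          # 'a\\b'
print(repr(decode(r"say \"hi\"")))    # 'say "hi"'
print(repr(decode("one\\\ntwo")))     # 'onetwo'
```

Under this scheme `\y` can be given a meaning later without changing any string that decoded successfully before, because no previously valid string could have contained it.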
Re: Unrecognized escape sequences in string literals
On Mon, 10 Aug 2009 15:17:24 -0700, Douglas Alan wrote: From: Steven D'Aprano ste...@remove.this.cybersource.com.au wrote: On Mon, 10 Aug 2009 00:32:30 -0700, Douglas Alan wrote: In C++, if I know that the code I'm looking at compiles, then I never need worry that I've misinterpreted what a string literal means. If you don't know what your string literals are, you don't know what your program does. You can't expect the compiler to save you from semantic errors. Adding escape codes into the string literal doesn't change this basic truth. I grow weary of these semantic debates. The bottom line is that C++'s strategy here catches bugs early on that Python's approach doesn't. It does so at no additional cost. From a purely practical point of view, why would any language not want to adopt a zero-cost approach to catching bugs, even if they are relatively rare, as early as possible? Because the cost isn't zero. Needing to write \\ in a string literal when you want \ is a cost, and having to read \\ in source code and mentally translate that to \ is also a cost. By all means argue that it's a cost that is worth paying, but please stop pretending that it's not a cost. Having to remember that \n is a special escape and \y isn't is also a cost, but that's a cost you pay in C++ too, if you want your code to compile. By the way, you've stated repeatedly that \y will compile with a warning in g++. So what precisely do you get if you ignore the warning? What do other C++ compilers do? Apart from the lack of warning, what actually is the difference between Python's behaviour and C++'s behaviour? (Other than the reason that adopting it *now* is sadly too late.) Furthermore, Python's strategy here is SPECIFICALLY DESIGNED, according to the reference manual to catch bugs. I.e., from the original posting on this issue: Unlike Standard C, all unrecognized escape sequences are left in the string unchanged, i.e., the backslash is left in the string. 
(This behavior is useful when debugging: if an escape sequence is mistyped, the resulting output is more easily recognized as broken.) You need to work on your reading comprehension. It doesn't say anything about the motivation for this behaviour, let alone that it was SPECIFICALLY DESIGNED to catch bugs. It says it is useful for debugging. My shoe is useful for squashing poisonous spiders, but it wasn't designed as a poisonous-spider squashing device. The compiler can't save you from typing 1234 instead of 11234, or 31.45 instead of 3.145, or My darling Ho instead of My darling Jo, so why do you expect it to save you from typing abc\d instead of abc\\d? Because in the former cases it can't catch the the bug, and in the latter case, it can. I'm not convinced this is a bug that needs catching, but if you think it is, then that's a reasonable argument. Perhaps it can catch *some* errors of that type, but only at the cost of extra effort required to defeat the compiler (forcing the programmer to type \\d to prevent the compiler complaining about \d). I don't think the benefit is worth the cost. You and your friend do. Who is to say you're right? Well, Bjarne Stroustrup, for one. Then let him design his own language *wink* All of these are value judgments, of course, but I truly doubt that anyone would have been bothered if Python from day one had behaved the way that C++ does. If I'm reading this page correctly, Python does behave as C++ does. Or at least as Larch/C++ does: http://www.cs.ucf.edu/~leavens/larchc++manual/lcpp_47.html In C++, if you see an escape you don't recognize, do you care? Yes, of course I do. If I need to know what the program does. Precisely the same as in Python. Do you go running for the manual? If the answer is No, then why do it in Python? The answer is that I do in both cases. You deleted without answer my next question: And if the answer is Yes, then how is Python worse than C++? 
Seems to me that the answer is It's not worse than C++, it's the same -- in both cases, you have to memorize the special escape sequences, and in both cases, if you see an escape you don't recognize, you need to look it up. No. \z *is* a legal escape sequence, it just happens to map to \z. If you stop thinking of \z as an illegal escape sequence that Python refuses to raise an error for, the problem goes away. It's a legal escape sequence that maps to backslash + z. (1) I already used that argument on my friend, and he wasn't buying it. (Personally, I find the argument technically valid, but commonsensically invalid. It's a language-lawyer kind of argument, rather than one that appeals to any notion of real aesthetics.) I disagree with your sense of aesthetics. I think that having to write \\y when I want \y just to satisfy a bondage-and-discipline compiler is ugly. That's not to deny that BD isn't useful on occasion, but in
Re: Unrecognized escape sequences in string literals
Steven D'Aprano ste...@remove.this.cybersource.com.au (SD) wrote: SD If I'm reading this page correctly, Python does behave as C++ does. Or at SD least as Larch/C++ does: SD http://www.cs.ucf.edu/~leavens/larchc++manual/lcpp_47.html They call them `non-standard escape sequences' for a reason: that they are not in standard C++.

test.cpp:
char* temp = "abc\yz";

TEMP g++ -c test.cpp
test.cpp:1:1: warning: unknown escape sequence '\y'

-- Piet van Oostrum p...@cs.uu.nl URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4] Private email: p...@vanoostrum.org
Re: Unrecognized escape sequences in string literals
Steven D'Aprano wrote: On Mon, 10 Aug 2009 08:21:03 -0700, Douglas Alan wrote: But you're right, it's too late to change this now. Not really. There is a procedure for making non-backwards compatible changes. If you care deeply enough about this, you could agitate for Python 3.2 to raise a PendingDepreciation warning for unexpected escape sequences like \z, Python 3.3 to raise a Depreciation warning, and Python 3.4 to treat it as an error. It may even be possible to skip the PendingDepreciation warning and go straight for Depreciation warning in 3.2. And once it's fully depreciated you have to stop writing it off on your taxes. *wink* ~Ethan~ -- http://mail.python.org/mailman/listinfo/python-list
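The "warn first, error later" path described in the quoted text can be approximated today with the standard `warnings` module. The sketch below assumes an interpreter that already warns about unrecognized escapes such as \z (recent CPython 3 releases do, as far as I know); the helper name `compiles_cleanly` is invented for this example.

```python
import warnings


def compiles_cleanly(source):
    """Return True if `source` compiles when every warning is treated as an error."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        try:
            compile(source, "<example>", "eval")
        except (SyntaxError, Warning):
            # An escalated warning may surface as a SyntaxError or as the
            # warning itself; either way the literal is rejected.
            return False
    return True


# Raw strings hold the literal's source text exactly as typed.
print(compiles_cleanly(r'"a\\z"'))  # explicit backslash
print(compiles_cleanly(r'"a\z"'))   # unrecognized escape
```

A project that wants the stricter behaviour for its own code can get much the same effect by running its test suite with warnings turned into errors.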
Re: Unrecognized escape sequences in string literals
On Tue, 11 Aug 2009 15:50:01 +0200, Piet van Oostrum wrote: Steven D'Aprano ste...@remove.this.cybersource.com.au (SD) wrote: SD If I'm reading this page correctly, Python does behave as C++ does. Or at SD least as Larch/C++ does: SD http://www.cs.ucf.edu/~leavens/larchc++manual/lcpp_47.html They call them `non-standard escape sequences' for a reason: that they are not in standard C++.

test.cpp:
char* temp = "abc\yz";

TEMP g++ -c test.cpp
test.cpp:1:1: warning: unknown escape sequence '\y'

Isn't that a warning, not a fatal error? So what does temp contain? -- Steven
Re: Unrecognized escape sequences in string literals
On Aug 11, 2:00 pm, Steven D'Aprano st...@remove-this- cybersource.com.au wrote: test.cpp:1:1: warning: unknown escape sequence '\y' Isn't that a warning, not a fatal error? So what does temp contain? My Annotated C++ Reference Manual is packed, and surprisingly in Stroustrup's Third Edition, there is no mention of the issue in the entire 1,000 pages. But Microsoft to the rescue: If you want a backslash character to appear within a string, you must type two backslashes (\\) (From http://msdn.microsoft.com/en-us/library/69ze775t.aspx) The question of what any specific C++ does if you ignore the warning is irrelevant, as such behavior in C++ is almost *always* undefined. Hence the warning. |ouglas -- http://mail.python.org/mailman/listinfo/python-list
Re: Unrecognized escape sequences in string literals
Douglas Alan wrote: On Aug 11, 2:00 pm, Steven D'Aprano st...@remove-this-cybersource.com.au wrote: test.cpp:1:1: warning: unknown escape sequence '\y' Isn't that a warning, not a fatal error? So what does temp contain? My Annotated C++ Reference Manual is packed, and surprisingly in Stroustrup's Third Edition, there is no mention of the issue in the entire 1,000 pages. But Microsoft to the rescue: If you want a backslash character to appear within a string, you must type two backslashes (\\) (From http://msdn.microsoft.com/en-us/library/69ze775t.aspx) The question of what any specific C++ does if you ignore the warning is irrelevant, as such behavior in C++ is almost *always* undefined. Hence the warning. |ouglas Almost always undefined? Whereas with Python, and some memorization or a small table/list nearby, you can easily *know* what you will get. Mind you, I'm not really vested in how Python *should* handle backslashes one way or the other, but I am glad it has rules that it follows for consistent results, and I don't have to break out a byte-code editor to find out what's in my string literal. ~Ethan~
Re: Unrecognized escape sequences in string literals
Steven D'Aprano wrote: Because the cost isn't zero. Needing to write \\ in a string literal when you want \ is a cost, I need to preface this entire post with the fact that I've already used ALL of the arguments that you've provided on my friend before I ever even came here with the topic, and my own arguments on why Python can be considered to be doing the right thing on this issue didn't even convince ME, much less him. When I can't even convince myself with an argument I'm making, then you know there's a problem with it! Now back the our regularly scheduled debate: I think that the total cost of all of that extra typing for all the Python programmers in the entire world is now significantly less than the time it took to have this debate. Which would have never happened if Python did things the right way on this issue to begin with. Meaning that we're now at LESS than zero cost for doing things right! And we haven't even yet included all the useless heat that is going to be generated during code reviews and in-house coding standard debates. That's why I stand by Python's motto: THERE SHOULD BE ONE-- AND PREFERABLY ONLY ONE --OBVIOUS WAY TO DO IT. and having to read \\ in source code and mentally translate that to \ is also a cost. For me that has no mental cost. What does have a mental cost is remembering whether \b is an unrecognized escape sequence or not. By all means argue that it's a cost that is worth paying, but please stop pretending that it's not a cost. I'm not pretending. I'm pwning you with logic and common sense! Having to remember that \n is a special escape and \y isn't is also a cost, but that's a cost you pay in C++ too, if you want your code to compile. Ummm, no I don't! I just always use \\ when I want a backslash to appear, and I only think about the more obscure escape sequences if I actually need them, or some code that I am reading has used them. By the way, you've stated repeatedly that \y will compile with a warning in g++. 
So what precisely do you get if you ignore the warning? A program with undefined behavior. That's typically what a warning means from a C++ compiler. (Sometimes it means use of a deprecated feature, though.) What do other C++ compilers do? The Microsoft compilers also consider it to be incorrect code, as I documented in a different post. Apart from the lack of warning, what actually is the difference between Python's behavior and C++'s behavior? That question makes just about as much sense as, Apart from the lack of a fatal error, what actually is the difference between Python's behavior and C++'s? Sure, warnings aren't fatal errors, but if you ignore them, then you are almost always doing something very wrong. (Unless you're building legacy code.) Furthermore, Python's strategy here is SPECIFICALLY DESIGNED, according to the reference manual to catch bugs. I.e., from the original posting on this issue: Unlike Standard C, all unrecognized escape sequences are left in the string unchanged, i.e., the backslash is left in the string. (This behavior is useful when debugging: if an escape sequence is mistyped, the resulting output is more easily recognized as broken.) You need to work on your reading comprehension. It doesn't say anything about the motivation for this behaviour, let alone that it was SPECIFICALLY DESIGNED to catch bugs. It says it is useful for debugging. My shoe is useful for squashing poisonous spiders, but it wasn't designed as a poisonous-spider squashing device. As I have a BS from MIT in BS-ology, I can readily set aside your aspersions to my intellect, and point out the gross errors of your ways: Natural language does not work the way you claim. It is is much more practical, implicit, and elliptical. More specifically, if your shoe came with a reference manual claiming that it was useful for squashing poisonous spiders, then you may now validly assume poisonous spider squashing was a design requirement of the shoe. 
(Or at least it has become one, even if ipso facto.) Furthermore, if it turns out that the shoe is deficient at poisonous spider squashing, and consequently causes you to get bitten by a poisonous spider, then you now have grounds for a lawsuit. Because in the former cases it can't catch the the bug, and in the latter case, it can. I'm not convinced this is a bug that needs catching, but if you think it is, then that's a reasonable argument. All my arguments are reasonable. Perhaps it can catch *some* errors of that type, but only at the cost of extra effort required to defeat the compiler (forcing the programmer to type \\d to prevent the compiler complaining about \d). I don't think the benefit is worth the cost. You and your friend do. Who is to say you're right? Well, Bjarne Stroustrup, for one. Then let him design his own language *wink* Oh, I'm not sure that's such a good idea. He might come up
Re: Unrecognized escape sequences in string literals
On Aug 10, 11:27 pm, Steven D'Aprano ste...@remove.this.cybersource.com.au wrote: On Mon, 10 Aug 2009 08:21:03 -0700, Douglas Alan wrote: But you're right, it's too late to change this now. Not really. There is a procedure for making non-backwards compatible changes. If you care deeply enough about this, you could agitate for Python 3.2 to raise a PendingDeprecationWarning for unexpected escape sequences like \z, How does one do this? Not that I necessarily think that it is an important enough nit to break a lot of existing code. Also, if I agitate for change, then in the future people might actually accurately accuse me of agitating for change, when typically I just come here for a good argument, and I provide a connected series of statements intended to establish a proposition, but in return I receive merely the automatic gainsaying of any statement I make. |ouglas -- http://mail.python.org/mailman/listinfo/python-list
Re: Unrecognized escape sequences in string literals
On Aug 11, 4:38 pm, Ethan Furman et...@stoneleaf.us wrote: Mind you, I'm not really vested in how Python *should* handle backslashes one way or the other, but I am glad it has rules that it follows for consistent results, and I don't have to break out a byte-code editor to find out what's in my string literal. I don't understand your comment. C++ generates a warning if you use an undefined escape sequence, which indicates that your program should be fixed. If the escape sequence isn't undefined, then C++ does the same thing as Python. It would be *even* better if C++ generated a fatal error in this situation. (g++ probably has an option to make warnings fatal, but I don't happen to know what that option is.) g++ might not generate an error so that you can compile legacy C code with it. In any case, my argument has consistently been that Python should have treated undefined escape sequences consistently as fatal errors, not as warnings. |ouglas -- http://mail.python.org/mailman/listinfo/python-list
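The "make it fatal" position argued above can be tried out in later interpreters. This is a minimal sketch, not part of the original thread, and it assumes Python 3.6 or newer, where an unrecognized escape such as \y emits a warning at compile time (DeprecationWarning from 3.6, SyntaxWarning from 3.12). The helper name compiles_cleanly is made up for illustration. It promotes every warning to an error while compiling a snippet of source text.

```python
import warnings


def compiles_cleanly(source):
    """Return True if `source` compiles with every warning promoted to an error.

    Hypothetical helper for illustration; assumes Python 3.6+, where
    unrecognized escape sequences warn at compile time.
    """
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        try:
            compile(source, "<example>", "exec")
        except (SyntaxError, Warning):
            return False
    return True


# The source text is passed as a raw string so this file itself has no bad escapes.
print(compiles_cleanly(r's = "foo\\bar"'))  # True: \\ is a recognized escape
print(compiles_cleanly(r's = "foo\ybar"'))  # False on Python 3.6 and later
```

Running a whole script with the interpreter's -W error option should have a comparable effect, although that is stricter than most projects want, since it promotes unrelated warnings as well.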
Re: Unrecognized escape sequences in string literals
I wrote: But you're right, it's too late to change this now. P.S. But if it weren't too late, I think that your idea to have \s be the escape sequence for a backslash instead of \\ might be a good one. |ouglas -- http://mail.python.org/mailman/listinfo/python-list
Re: Unrecognized escape sequences in string literals
Carl Banks wrote: IOW it's an error-prone mess. It would be better if Python (like C) treated \ consistently as an escape character. (And in raw strings, consistently as a literal.) Agreed. For one thing, if another escape character ever has to be added to the language, that may change the semantics of previously correct strings. If \ followed by a non-special character is treated as an error, that doesn't happen. John Nagle -- http://mail.python.org/mailman/listinfo/python-list
Re: Unrecognized escape sequences in string literals
On Sun, 09 Aug 2009 17:56:55 -0700, Douglas Alan wrote: Steven D'Aprano wrote: Why should a backslash in a string literal be an error? Because in Python, if my friend sees the string foo\xbar\n, he has no idea whether the \x is an escape sequence, or if it is just the characters \x, unless he looks it up in the manual, or tries it out in the REPL, or what have you. Fair enough, but isn't that just another way of saying that if you look at a piece of code and don't know what it does, you don't know what it does unless you look it up or try it out? My friend is adamant that it would be better if he could just look at the string literal and know. He doesn't want to be bothered to have to store stuff like that in his head. He wants to be able to figure out programs just by looking at them, to the maximum degree that that is feasible. I actually sympathize strongly with that attitude. But, honestly, your friend is a programmer (or at least pretends to be one *wink*). You can't be a programmer without memorizing stuff: syntax, function calls, modules to import, quoting rules, blah blah blah. Take C as an example -- there's absolutely nothing about () that says group expressions or call a function and {} that says group a code block. You just have to memorize it. If you don't know what a backslash escape is going to do, why would you use it? I'm sure your friend isn't in the habit of randomly adding backslashes to strings just to see whether it will still compile. This is especially important when reading (as opposed to writing) code. You read somebody else's code, and see foo\xbar\n. Let's say you know it compiles without warning. Big deal -- you don't know what the escape codes do unless you've memorized them. What does \n resolve to? chr(13) or chr(97) or chr(0)? Who knows? Unless you know the rules, you have no idea what is in the string. Allowing \y to resolve to a literal backslash followed by y doesn't change that. 
All it means is that some \c combinations return a single character, and some return two. In comparison to Python, in C++, he can just look foo\xbar\n and know that \x is a special character. (As long as it compiles without warnings under g++.) So what you mean is, he can just look at foo\xbar\n AND COMPILE IT USING g++, and know whether or not \x is a special character. [sarcasm] Gosh. That's an enormous difference from Python, where you have to print the string at the REPL to know what it does. [/sarcasm] Aside: \x isn't a special character: \x ValueError: invalid \x escape However, \xba is: \xba '\xba' len(\xba) 1 ord(\xba) 186 He's particularly annoyed too, that if he types foo\xbar at the REPL, it echoes back as foo\\xbar. He finds that to be some sort of annoying DWIM feature, and if Python is going to have DWIM features, then it should, for example, figure out what he means by \ and not bother him with a syntax error in that case. Now your friend is confused. This is a good thing. Any backslash you see in Python's default string output is *always* an escape: a string with a 'proper' escape \t (tab) a string with a 'proper' escape \t (tab) a string with an 'improper' escape \y (backslash-y) a string with an 'improper' escape \\y (backslash-y) The REPL is actually doing him a favour. It always escapes backslashes, so there is no ambiguity. A backslash is displayed as \\, any other \c is a special character. Of course I think that he's overreacting a bit. :) My point of view is that every language has *some* warts; Python just has a bit fewer than most. It would have been nice, I should think, if this wart had been fixed in Python 3, as I do consider it to be a minor wart. And if anyone had cared enough to raise it a couple of years back, it possibly might have been. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
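The REPL behaviour described in this message can be checked with a short script. This is a sketch assuming a current Python 3 interpreter rather than the Python 2 of the thread, so the values are text strings, and the helper evaluate_literal is a made-up name. It evaluates the literal from raw source text with warnings silenced, because newer interpreters warn about \y.

```python
import ast
import warnings


def evaluate_literal(source):
    """Evaluate string-literal source text, silencing invalid-escape warnings.

    Hypothetical helper for illustration only.
    """
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return ast.literal_eval(source)


hex_escape = evaluate_literal(r'"\xba"')  # a valid two-digit hex escape
unknown = evaluate_literal(r'"\y"')       # an unrecognized escape

print(len(hex_escape), ord(hex_escape))  # 1 186
print(len(unknown))                      # 2
print(repr(unknown))                     # '\\y'
print(repr("a\tb"))                      # 'a\tb'
```

The last two lines show the point made above about the default output: a literal backslash is always displayed doubled, so a single backslash in a repr always introduces an escape.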
Re: Unrecognized escape sequences in string literals
On Sun, 09 Aug 2009 18:34:14 -0700, Carl Banks wrote: Why should a backslash in a string literal be an error? Because the behavior of \ in a string is context-dependent, which means a reader can't know if \ is a literal character or escape character without knowing the context, and it means an innocuous change in context can cause a rather significant change in \. *Any* change in context is significant with escapes. this \nhas two lines If you change the \n to a \t you get a significant difference. If you change the \n to a \y you get a significant difference. Why is the first one acceptable but the second not? IOW it's an error-prone mess. I've never had any errors caused by this. I've never seen anyone write to this newsgroup confused over escape behaviour, or asking for help with an error caused by it, and until this thread, never seen anyone complain about it either. Excuse my cynicism, but I believe that you are using error-prone to mean I don't like this behaviour rather than it causes lots of errors. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: Unrecognized escape sequences in string literals
On Sun, 09 Aug 2009 23:03:14 -0700, John Nagle wrote: if another escape character ever has to be added to the language, that may change the semantics of previously correct strings. And that's the only argument in favour of prohibiting non-special backslash sequences I've seen yet that is even close to convincing. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: Unrecognized escape sequences in string literals
On Aug 10, 2:03 am, Steven D'Aprano ste...@remove.this.cybersource.com.au wrote: On Sun, 09 Aug 2009 17:56:55 -0700, Douglas Alan wrote: Because in Python, if my friend sees the string foo\xbar\n, he has no idea whether the \x is an escape sequence, or if it is just the characters \x, unless he looks it up in the manual, or tries it out in the REPL, or what have you. Fair enough, but isn't that just another way of saying that if you look at a piece of code and don't know what it does, you don't know what it does unless you look it up or try it out? Not really. It's more like saying that easy things should be easy, and hard things should possible. But in this case, Python is making something that should be really easy, a bit harder and more error prone than it should be. In C++, if I know that the code I'm looking at compiles, then I never need worry that I've misinterpreted what a string literal means. At least not if it doesn't have any escape characters in it that I'm not familiar with. But in Python, if I see, \f\o\o\b\a\z, I'm not really sure what I'm seeing, as I surely don't have committed to memory some of the more obscure escape sequences. If I saw this in C++, and I knew that it was in code that compiled, then I'd at least know that there are some strange escape codes that I have to look up. Unlike with Python, it would never be the case in C++ code that the programmer who wrote the code was just too lazy to type in \\f\\o\\o\\b\\a\\z instead. My friend is adamant that it would be better if he could just look at the string literal and know. He doesn't want to be bothered to have to store stuff like that in his head. He wants to be able to figure out programs just by looking at them, to the maximum degree that that is feasible. I actually sympathize strongly with that attitude. But, honestly, your friend is a programmer (or at least pretends to be one *wink*). 
Actually, he's probably written more code than you, me, and ten other random decent programmers put together. As he can slap out massive amounts of code very quickly, he'd prefer not to have crap getting in his way. In the time it takes him to look something up, he might have written another page of code. He's perfectly capable of dealing with crap, as years of writing large programs in Perl and PHP quickly proves, but his whole reason for learning Python, I take it, is so that he will be bothered with less crap and therefore write code even faster. You can't be a programmer without memorizing stuff: syntax, function calls, modules to import, quoting rules, blah blah blah. Take C as an example -- there's absolutely nothing about () that says group expressions or call a function and {} that says group a code block. I don't really think that this is a good analogy. It's like the difference between remembering rules of grammar and remembering English spelling. As a kid, I was the best in my school at grammar, and one of the worst at speling. You just have to memorize it. If you don't know what a backslash escape is going to do, why would you use it? (1) You're looking at code that someone else wrote, or (2) you forget to type \\ instead of \ in your code (or get lazy sometimes), as that is okay most of the time, and you inadvertently get a subtle bug. This is especially important when reading (as opposed to writing) code. You read somebody else's code, and see foo\xbar\n. Let's say you know it compiles without warning. Big deal -- you don't know what the escape codes do unless you've memorized them. What does \n resolve to? chr(13) or chr(97) or chr(0)? Who knows? It *is* a big deal. Or at least a non-trivial deal. It means that you can tell just by looking at the code that there are funny characters in the string, and not just a backslashes. 
You don't have to go running for the manual every time you see code with backslashes, where the upshot might be that the programmer was merely saving themselves some typing. In comparison to Python, in C++, he can just look foo\xbar\n and know that \x is a special character. (As long as it compiles without warnings under g++.) So what you mean is, he can just look at foo\xbar\n AND COMPILE IT USING g++, and know whether or not \x is a special character. I'm not sure that your comments are paying due diligence to full life-cycle software development issues that involve multiple programmers (or even just your own program that you wrote a year ago, and you don't remember all the details of what you did) combined with maintaining and modifying existing code, etc. Aside: \x isn't a special character: \x ValueError: invalid \x escape I think that this all just goes to prove my friend's point! Here I've been programming in Python for more than a decade (not full time, mind you, as I also program in other languages, like C++), and even I didn't know that \xba was an escape sequence, and I inadvertently introduced a subtle bug into my argument
Re: Unrecognized escape sequences in string literals
On Aug 9, 11:10 pm, Steven D'Aprano ste...@remove.this.cybersource.com.au wrote: On Sun, 09 Aug 2009 18:34:14 -0700, Carl Banks wrote: Why should a backslash in a string literal be an error? Because the behavior of \ in a string is context-dependent, which means a reader can't know if \ is a literal character or escape character without knowing the context, and it means an innocuous change in context can cause a rather significant change in \. *Any* change in context is significant with escapes. this \nhas two lines If you change the \n to a \t you get a significant difference. If you change the \n to a \y you get a significant difference. Why is the first one acceptable but the second not? Because when you change \n to \t, you've haven't changed the meaning of the \ character; but when you change \n to \y, you have, and you did so without even touching the backslash. IOW it's an error-prone mess. I've never had any errors caused by this. Thank you for your anecdotal evidence. Here's mine: This has gotten me at least twice, and a compiler complaint would have reduced my bug- hunting time from tens of minutes to ones of seconds. [Aside: it was when I was using Python on Windows for the first time] I've never seen anyone write to this newsgroup confused over escape behaviour, or asking for help with an error caused by it, and until this thread, never seen anyone complain about it either. More anecdotal evidence. Here's mine: I have. Excuse my cynicism, but I believe that you are using error-prone to mean I don't like this behaviour rather than it causes lots of errors. No, I'm using error-prone to mean error-prone. Someone (obviously not you because you're have perfect knowledge of the language and 100% situation awareness at all times) might have a string like abcd\stuv and change it to abcd\tuvw without even thinking about the fact that the s comes after the backslash. 
Worst of all: they might not even notice the error, because the repr of this string is: 'abcd\tuvw' They might not notice that the backslash is single, because (unlike you) mortal fallible human beings don't always register tiny details like a backslash being single when it should be double. Point is, this is a very bad inconsistency. It makes the behavior of \ impossible to learn by analogy; now you have to memorize a list of situations where it behaves one way or another. Carl Banks -- http://mail.python.org/mailman/listinfo/python-list
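The edit described above, from abcd\stuv to abcd\tuvw, can be reproduced as follows. This is a sketch assuming a current Python 3 interpreter; evaluate_literal is a hypothetical helper that evaluates literal source text with warnings silenced, because newer interpreters warn about \s.

```python
import ast
import warnings


def evaluate_literal(source):
    """Evaluate string-literal source text, silencing invalid-escape warnings.

    Hypothetical helper for illustration only.
    """
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return ast.literal_eval(source)


before = evaluate_literal(r'"abcd\stuv"')  # \s is not an escape: backslash kept
after = evaluate_literal(r'"abcd\tuvw"')   # \t is a tab: backslash consumed

print(len(before), repr(before))  # 9 'abcd\\stuv'
print(len(after), repr(after))    # 8 'abcd\tuvw'
print("\\" in before, "\\" in after)  # True False
```

The only visual cue in the repr is a doubled versus a single backslash, which is the detail the message argues is easy to overlook.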
Re: Unrecognized escape sequences in string literals
On Aug 10, 2:10 am, Steven D'Aprano wrote: I've never had any errors caused by this. But you've seen an error caused by this, in this very discussion. I.e., foo\xbar. \xba isn't an escape sequence in any other language that I've used, which is one reason I made this error... Oh, wait a minute -- it *is* an escape sequence in JavaScript. But in JavaScript, while \xba is a special character, \xb is synonymous with xb. The fact that every language seems to treat these things similarly but differently is yet another reason why they should just be treated utterly consistently by all of the languages: I.e., escape sequences that don't have a special meaning should be an error! I've never seen anyone write to this newsgroup confused over escape behaviour, My friend objects strongly to the claim that he is confused by it, so I guess you are right that no one is confused. He just thinks that it violates the beautiful sense of aesthetics that he was assured over and over again Python has. But aesthetics is a non-negligible issue with practical ramifications. (Not that anything can be done about this wart at this point, however.) or asking for help with an error caused by it, and until this thread, never seen anyone complain about it either. Oh, this bothered me too when I first learned Python, and I thought it was stupid. It just didn't bother me enough to complain publicly. Besides, the vast majority of Python noobs don't come here, despite appearances sometimes, and by the time most people get here, they've probably got bigger fish to fry. |ouglas -- http://mail.python.org/mailman/listinfo/python-list
Re: Unrecognized escape sequences in string literals
On Mon, 10 Aug 2009 00:37:33 -0700, Carl Banks wrote: On Aug 9, 11:10 pm, Steven D'Aprano ste...@remove.this.cybersource.com.au wrote: On Sun, 09 Aug 2009 18:34:14 -0700, Carl Banks wrote: Why should a backslash in a string literal be an error? Because the behavior of \ in a string is context-dependent, which means a reader can't know if \ is a literal character or escape character without knowing the context, and it means an innocuous change in context can cause a rather significant change in \. *Any* change in context is significant with escapes. this \nhas two lines If you change the \n to a \t you get a significant difference. If you change the \n to a \y you get a significant difference. Why is the first one acceptable but the second not? Because when you change \n to \t, you've haven't changed the meaning of the \ character; I assume you mean the \ character in the literal, not the (non-existent) \ character in the string. but when you change \n to \y, you have, and you did so without even touching the backslash. Not at all. '\n' maps to the string chr(10). '\y' maps to the string chr(92) + chr(121). In both cases the backslash in the literal have the same meaning: grab the next token (usually a single character, but not always), look it up in a mapping somewhere, and insert the result in the string object being built. (I don't know if the *implementation* is precisely as described, but that's irrelevant. It's still functionally a mapping.) IOW it's an error-prone mess. I've never had any errors caused by this. Thank you for your anecdotal evidence. Here's mine: This has gotten me at least twice, and a compiler complaint would have reduced my bug- hunting time from tens of minutes to ones of seconds. [Aside: it was when I was using Python on Windows for the first time] Okay, that's twice in, how many years have you been programming? I've mistyped xrange as xrnage two or three times. Does that make xrange() an error-prone mess too? Probably not. 
Why is my mistake my mistake, but your mistake the language's fault? [...] Oh, wait, no, I tell I lie -- I *have* seen people reporting bugs here caused by backslashes. They're invariably Windows programmers writing pathnames using backslashes, so I'll give you that one: if you don't know that Python treats backslashes as special in string literals, you will screw up your Windows pathnames. Interestingly, the problem there is not that \y resolves to literal backslash followed by y, but that \t DOESN'T resolve to the expected backslash-t. So it seems to me that the problem for Windows coders is not that \y doesn't raise an error, but the mere existence of backslash escapes. Someone (obviously not you because you're have perfect knowledge of the language and 100% situation awareness at all times) might have a string like abcd\stuv and change it to abcd\tuvw without even thinking about the fact that the s comes after the backslash. Deary me. And they might type 4+15 instead of 4*51, and now arithmetic is an error-prone mess too. If you know of a programming language which can prevent you making semantic errors, please let us all know what it is. If you edit code without thinking, you will be burnt, and you get *zero* sympathy from me. Worst of all: they might not even notice the error, because the repr of this string is: 'abcd\tuwv' They might not notice that the backslash is single, because (unlike you) mortal fallible human beings don't always register tiny details like a backslash being single when it should be double. Help help, 123145 looks too similar to 1231145, and now I calculated my taxes wrong and will go to jail!!! Point is, this is a very bad inconsistency. It makes the behavior of \ impossible to learn by analogy, now you have to memorize a list of situations where it behaves one way or another. No, you don't have to memorize anything, you can go right ahead and escape every backslash, as I did for years. Your code will still work fine. 
You already have to memorize what escape codes return special characters. The only difference is whether you learn ...and everything else raises an exception or ...and everything else is returned unchanged. There is at least one good reason for preferring an error, namely that it allows Python to introduce new escape codes without going through a long, slow process. But the rest of these complaints are terribly unconvincing. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
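The "look it up in a mapping" description, and the two rules being compared ("everything else is returned unchanged" versus "everything else raises an exception"), can be sketched with a toy resolver. This is not CPython's implementation. The function resolve, the tables, and the \y escape added in EXTENDED are all hypothetical, and multi-character escapes such as \xHH are left out to keep it short.

```python
def resolve(source, table, strict=False):
    """Resolve single-character backslash escapes in `source` via `table`.

    Toy model for illustration only. Unknown escapes are passed through
    unchanged, or raise ValueError when `strict` is true.
    """
    out = []
    i = 0
    while i < len(source):
        if source[i] == "\\" and i + 1 < len(source):
            key = source[i + 1]
            if key in table:
                out.append(table[key])
            elif strict:
                raise ValueError("unknown escape: \\" + key)
            else:
                out.append("\\" + key)  # pass-through rule
            i += 2
        else:
            out.append(source[i])
            i += 1
    return "".join(out)


TABLE = {"n": "\n", "t": "\t", "\\": "\\"}
EXTENDED = dict(TABLE, y="\x19")  # hypothetical escape added in a later release

print([ord(c) for c in resolve(r"\n", TABLE)])     # [10]
print([ord(c) for c in resolve(r"\y", TABLE)])     # [92, 121]
print([ord(c) for c in resolve(r"\y", EXTENDED)])  # [25]
```

The last two lines illustrate the one argument conceded above: under the pass-through rule, adding an entry to the table silently changes the meaning of source text that used to be accepted, whereas under strict=True that text would never have been accepted in the first place.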
Re: Unrecognized escape sequences in string literals
On Mon, 10 Aug 2009 00:57:18 -0700, Douglas Alan wrote: On Aug 10, 2:10 am, Steven D'Aprano I've never had any errors caused by this. But you've seen an error caused by this, in this very discussion. I.e., foo\xbar. Your complaint is that invalid escapes like \y resolve to a literal backslash-y instead of raising an error. But \xbar doesn't contain an invalid escape, it contains a valid hex escape. Your ignorance that \xHH is a valid hex escape (for suitable hex digits) isn't an example of an error caused by invalid escapes like \y. \xba isn't an escape sequence in any other language that I've used, which is one reason I made this error... Oh, wait a minute -- it *is* an escape sequence in JavaScript. But in JavaScript, while \xba is a special character, \xb is synonymous with xb. The fact that every language seems to treat these things similarly but differently, is yet another reason why they should just be treated utterly consistently by all of the languages: I.e., escape sequences that don't have a special meaning should be an error! Perhaps all the other languages should follow Python's lead instead? Or perhaps they should follow bash's lead, and map \C to C for every character. If there were no special escapes at all, Windows programmers wouldn't keep getting burnt when they write C:\\Documents\today\foo and end up with something completely unexpected. Oh wait, no, that still wouldn't work, because they'd end up with C:\Documentstodayfoo. So copying bash doesn't work. But copying C will upset the bash coders, because they'll write some\ file\ with\ spaces and suddenly their code won't even compile!!! Seems like no matter what you do, you're going to upset *somebody*. I've never seen anyone write to this newsgroup confused over escape behaviour, My friend objects strongly the claim that he is confused by it, so I guess you are right that no one is confused. 
He just thinks that it violates the beautiful sense of aesthetics that he was assured over and over again Python has. Fair enough. -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: Unrecognized escape sequences in string literals
Douglas Alan darkwate...@gmail.com wrote: \xba isn't an escape sequence in any other language that I've used, which is one reason I made this error... Oh, wait a minute -- it *is* an escape sequence in JavaScript. But in JavaScript, while \xba is a special character, \xb is synonymous with xb. \xba is an escape sequence in c, c++, c#, python, javascript, perl and probably many others. \xb is an escape sequence in c, c++, c# but not in Python, Javascript, or Perl. Python will throw ValueError if you try to use \xb in a string, Javascript simply ignores the backslash. The fact that every language seems to treat these things similarly but differently, is yet another reason why they should just be treated utterly consistently by all of the languages: I.e., escape sequences that don't have a special meaning should be an error! It would be nice if these things were treated consistently, but they aren't and it seems unlikely to change. -- Duncan Booth http://kupuguy.blogspot.com -- http://mail.python.org/mailman/listinfo/python-list
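The difference between \xba and the truncated \xb can be checked directly. This is a sketch assuming a current CPython 3, where the failure is reported as a SyntaxError when the literal is compiled, rather than the ValueError that Python 2 raised at the time of this thread. The helper escape_error is a made-up name.

```python
import ast


def escape_error(source):
    """Return the name of the exception a string literal raises, or None.

    Hypothetical helper for illustration only.
    """
    try:
        ast.literal_eval(source)
    except (SyntaxError, ValueError) as exc:
        return type(exc).__name__
    return None


print(escape_error(r'"\xba"'))  # None: two hex digits form a complete escape
print(escape_error(r'"\xb"'))   # an exception name; SyntaxError on CPython 3
```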
Re: Unrecognized escape sequences in string literals
Steven D'Aprano ste...@remove.this.cybersource.com.au wrote: Or perhaps they should follow bash's lead, and map \C to C for every character. If there were no special escapes at all, Windows programmers wouldn't keep getting burnt when they write C:\\Documents\today\foo and end up with something completely unexpected. Oh wait, no, that still wouldn't work, because they'd end up with C:\Documentstodayfoo. So copying bash doesn't work.
There is of course no problem at all so long as you stick to writing your paths as MS intended them to be written: 8.3 and UPPERCASE
>>> "C:\DOCUME~1\TODAY\FOO"
'C:\\DOCUME~1\\TODAY\\FOO'
:^) -- Duncan Booth http://kupuguy.blogspot.com -- http://mail.python.org/mailman/listinfo/python-list
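For lowercase Windows paths, the usual ways around the problem are doubled backslashes, raw strings, or forward slashes. This is a minimal sketch assuming Python 3 with the standard pathlib module; the path itself is the one from the quoted message.

```python
from pathlib import PureWindowsPath

# Only the first backslash is doubled, as in the quoted example:
# \t becomes a tab and \f becomes a form feed.
broken = "C:\\Documents\today\foo"

escaped = "C:\\Documents\\today\\foo"  # every backslash doubled
raw = r"C:\Documents\today\foo"        # raw string: backslashes are literal

print(repr(broken))    # 'C:\\Documents\today\x0coo'
print(escaped == raw)  # True

# pathlib accepts forward slashes for Windows paths as well.
print(PureWindowsPath("C:/Documents/today/foo") == PureWindowsPath(raw))  # True
```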
Re: Unrecognized escape sequences in string literals
On Mon, 10 Aug 2009 00:32:30 -0700, Douglas Alan wrote: In C++, if I know that the code I'm looking at compiles, then I never need worry that I've misinterpreted what a string literal means. If you don't know what your string literals are, you don't know what your program does. You can't expect the compiler to save you from semantic errors. Adding escape codes into the string literal doesn't change this basic truth. Semantics matters, and unlike syntax, the compiler can't check it. There's a difference between a program that does the equivalent of: os.system(cp myfile myfile~) and one which does this os.system(rm myfile myfile~) The compiler can't save you from typing 1234 instead of 11234, or 31.45 instead of 3.145, or My darling Ho instead of My darling Jo, so why do you expect it to save you from typing abc\d instead of abc\\d? Perhaps it can catch *some* errors of that type, but only at the cost of extra effort required to defeat the compiler (forcing the programmer to type \\d to prevent the compiler complaining about \d). I don't think the benefit is worth the cost. You and your friend do. Who is to say you're right? At least not if it doesn't have any escape characters in it that I'm not familiar with. But in Python, if I see, \f\o\o\b\a\z, I'm not really sure what I'm seeing, as I surely don't have committed to memory some of the more obscure escape sequences. If I saw this in C++, and I knew that it was in code that compiled, then I'd at least know that there are some strange escape codes that I have to look up. And if you saw that in Python, you'd also know that there are some strange escape codes that you have to look up. Fortunately, in Python, that's really simple: \f\o\o\b\a\z '\x0c\\o\\o\x08\x07\\z' Immediately you can see that the \o and \z sequences resolve to themselves, and the \f \b and \a don't. 
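The \f\o\o\b\a\z transcript above can be reproduced with a short script. This is a sketch assuming a current Python 3 interpreter; the literal is evaluated from raw source text with warnings silenced, because newer interpreters warn about \o and \z.

```python
import ast
import warnings

with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    value = ast.literal_eval(r'"\f\o\o\b\a\z"')

print(repr(value))  # '\x0c\\o\\o\x08\x07\\z'
print(len(value))   # 9: three single characters plus three backslash pairs
```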
Unlike with Python, it would never be the case in C++ code that the programmer who wrote the code was just too lazy to type in \\f\\o\\o\\b\\a\\z instead. But if you see abc\n, you can't be sure whether the lazy programmer intended abc+newline, or abc+backslash+n. Either way, the compiler won't complain. You just have to memorize it. If you don't know what a backslash escape is going to do, why would you use it? (1) You're looking at code that someone else wrote, or (2) you forget to type \\ instead of \ in your code (or get lazy sometimes), as that is okay most of the time, and you inadvertently get a subtle bug. The same error can occur in C++, if you intend \\n but type \n by mistake. Or vice versa. The compiler won't save you from that. This is especially important when reading (as opposed to writing) code. You read somebody else's code, and see foo\xbar\n. Let's say you know it compiles without warning. Big deal -- you don't know what the escape codes do unless you've memorized them. What does \n resolve to? chr(13) or chr(97) or chr(0)? Who knows? It *is* a big deal. Or at least a non-trivial deal. It means that you can tell just by looking at the code that there are funny characters in the string, and not just backslashes. I'm not entirely sure why you think that's a big deal. Strictly speaking, there are no funny characters, not even \0, in Python. They're all just characters. Perhaps the closest is newline (which is pretty obvious). You don't have to go running for the manual every time you see code with backslashes, where the upshot might be that the programmer was merely saving themselves some typing. Why do you care if there are funny characters? In C++, if you see an escape you don't recognize, do you care? Do you go running for the manual? If the answer is No, then why do it in Python? And if the answer is Yes, then how is Python worse than C++? [...] Also, it seems that Python is being inconsistent here. 
Python knows that the string \x doesn't contain a full escape sequence, so why doesn't it treat the string \x the same way that it treats the string \z? [...] I.e., \z is not a legal escape sequence, so it gets left as \\z. No. \z *is* a legal escape sequence, it just happens to map to \z. If you stop thinking of \z as an illegal escape sequence that Python refuses to raise an error for, the problem goes away. It's a legal escape sequence that maps to backslash + z. \x is not a legal escape sequence. Shouldn't it also get left as \\x? No, because it actually is an illegal escape sequence. He's particularly annoyed too, that if he types foo\xbar at the REPL, it echoes back as foo\\xbar. He finds that to be some sort of annoying DWIM feature, and if Python is going to have DWIM features, then it should, for example, figure out what he means by \ and not bother him with a syntax error in that case. Now your friend is confused. This is a good thing. Any backslash you see in Python's default string output is *always* an escape: Well, I think he's more
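The \z versus \x distinction discussed in this message can be checked directly. A minimal sketch, assuming a Python 3 interpreter that still accepts unrecognized escapes; `try_literal` is a hypothetical helper, and the exact exception type for the truncated \x may differ between Python versions, so the sketch catches both likely candidates.

```python
import ast
import warnings


def try_literal(source):
    """Return the parsed string, or the exception the parser raised."""
    try:
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")
            return ast.literal_eval(source)
    except (SyntaxError, ValueError) as exc:
        return exc


z_result = try_literal(r'"\z"')      # unrecognized: backslash is kept
x_result = try_literal(r'"\x"')      # truncated hex escape: rejected
x41_result = try_literal(r'"\x41"')  # complete hex escape: the letter A

print(repr(z_result))
print(type(x_result).__name__)
print(repr(x41_result))
```

So \x is a recognized escape that insists on two hex digits, while \z is not recognized at all and passes through with its backslash intact.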
Re: Unrecognized escape sequences in string literals
Steven D'Aprano wrote: On Sun, 09 Aug 2009 17:56:55 -0700, Douglas Alan wrote: [snip] My point of view is that every language has *some* warts; Python just has a bit fewer than most. It would have been nice, I should think, if this wart had been fixed in Python 3, as I do consider it to be a minor wart. And if anyone had cared enough to raise it a couple of years back, it possibly might have been. My preference would've been that a backslash followed by A-Z, a-z, or 0-9 is special, but a backslash followed by any other character is just the character, except for backslash followed by a newline, which suppresses the newline. I would also have preferred a backslash in a raw string to always be a literal. Ah well, something for Python 4.x. :-) -- http://mail.python.org/mailman/listinfo/python-list
Re: Unrecognized escape sequences in string literals
On Aug 10, 4:37 am, Steven D'Aprano There is at least one good reason for preferring an error, namely that it allows Python to introduce new escape codes without going through a long, slow process. But the rest of these complaints are terribly unconvincing. What about: o Beautiful is better than ugly o Explicit is better than implicit o Simple is better than complex o Readability counts o Special cases aren't special enough to break the rules o Errors should never pass silently ? And most importantly: o In the face of ambiguity, refuse the temptation to guess. o There should be one -- and preferably only one -- obvious way to do it. ? So, what's the one obvious right way to express foo\zbar? Is it foo\zbar or foo\\zbar And if it's the latter, what possible benefit is there in allowing the former? And if it's the former, why does Python echo the latter? |ouglas -- http://mail.python.org/mailman/listinfo/python-list
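On the question of why the doubled backslash is echoed: the REPL prints the repr() of a value, and repr() always writes a backslash in its escaped form. A small sketch of that, using only the explicit spelling so nothing here depends on how unrecognized escapes are handled:

```python
import ast

explicit = "foo\\zbar"          # escaped backslash: the unambiguous spelling

print(len(explicit))            # 8 characters, one of them a backslash
print(explicit)                 # str() form, as print shows it
print(repr(explicit))           # repr() form, as the REPL echoes it

# Feeding the echoed text back to the parser reproduces the same string.
round_trip = ast.literal_eval(repr(explicit))
print(round_trip == explicit)
```

The echo is a valid literal for the same string, not a different string with two backslashes in it.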
Re: Unrecognized escape sequences in string literals
Douglas Alan wrote: So, what's the one obvious right way to express foo\zbar? Is it foo\zbar or foo\\zbar And if it's the latter, what possible benefit is there in allowing the former? And if it's the former, why does Python echo the latter? Actually, if we were designing from fresh (with no C behind us), I might advocate for \s to be the escape sequence for a backslash. I don't particularly like that it is hard to see if the following string contains a tab: abc\table. The string rules reflect C's rules, and I see little excuse for trying to change them now. --Scott David Daniels scott.dani...@acm.org -- http://mail.python.org/mailman/listinfo/python-list
Re: Unrecognized escape sequences in string literals
On Aug 10, 10:58 am, Scott David Daniels scott.dani...@acm.org wrote: The string rules reflect C's rules, and I see little excuse for trying to change them now. No they don't. Or at least not C++'s rules. C++ behaves exactly as I should like. (Or at least g++ does. Or rather *almost* as I would like, as by default it generates a warning for foo\zbar, while I think that an error would be somewhat preferable.) But you're right, it's too late to change this now. |ouglas -- http://mail.python.org/mailman/listinfo/python-list
Re: Unrecognized escape sequences in string literals
On Aug 10, 4:41 am, MRAB pyt...@mrabarnett.plus.com wrote: Steven D'Aprano wrote: On Sun, 09 Aug 2009 17:56:55 -0700, Douglas Alan wrote: [snip] My point of view is that every language has *some* warts; Python just has a bit fewer than most. It would have been nice, I should think, if this wart had been fixed in Python 3, as I do consider it to be a minor wart. And if anyone had cared enough to raise it a couple of years back, it possibly might have been. My preference would've been that a backslash followed by A-Z, a-z, or 0-9 is special, but a backslash followed by any other character is just the character, except for backslash followed by a newline, which suppresses the newline. That would be reasonable; it'd match the behavior of regexps. Carl Banks -- http://mail.python.org/mailman/listinfo/python-list
Re: Unrecognized escape sequences in string literals
On Aug 10, 1:37 am, Steven D'Aprano ste...@remove.this.cybersource.com.au wrote: On Mon, 10 Aug 2009 00:37:33 -0700, Carl Banks wrote: On Aug 9, 11:10 pm, Steven D'Aprano ste...@remove.this.cybersource.com.au wrote: On Sun, 09 Aug 2009 18:34:14 -0700, Carl Banks wrote: Why should a backslash in a string literal be an error? Because the behavior of \ in a string is context-dependent, which means a reader can't know if \ is a literal character or escape character without knowing the context, and it means an innocuous change in context can cause a rather significant change in \. *Any* change in context is significant with escapes. this \nhas two lines If you change the \n to a \t you get a significant difference. If you change the \n to a \y you get a significant difference. Why is the first one acceptable but the second not? Because when you change \n to \t, you haven't changed the meaning of the \ character; I assume you mean the \ character in the literal, not the (non-existent) \ character in the string. but when you change \n to \y, you have, and you did so without even touching the backslash. Not at all. '\n' maps to the string chr(10). '\y' maps to the string chr(92) + chr(121). In both cases the backslash in the literal has the same meaning: grab the next token (usually a single character, but not always), look it up in a mapping somewhere, and insert the result in the string object being built. That is a ridiculous rationalization. Nobody sees \y in a string and thinks it's an escape sequence that returns the bytes '\y'. [snip rest, because an argument in favor of inconsistent, context-dependent behavior doesn't need any further refutation than to point out that it is an argument in favor of inconsistent, context-dependent behavior] Carl Banks
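The "look it up in a mapping" model that Steven describes can be written down as a toy decoder. This is a sketch of that mental model only, not how CPython actually tokenizes strings: the names `SIMPLE_ESCAPES` and `decode_simple` are invented here, and only single-character escapes are covered.

```python
# Hypothetical table covering only single-character escapes; the real
# parser also handles octal, \x, \u, \U and \N forms.
SIMPLE_ESCAPES = {
    "n": "\n", "t": "\t", "r": "\r", "a": "\a", "b": "\b",
    "f": "\f", "v": "\v", "\\": "\\", "'": "'", '"': '"',
}


def decode_simple(body):
    """Decode the text between the quotes of a literal (toy model)."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            # Unknown escapes fall back to backslash + character.
            out.append(SIMPLE_ESCAPES.get(nxt, "\\" + nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(decode_simple(r"\n") == chr(10))
print(decode_simple(r"\y") == chr(92) + chr(121))
```

Whether the fallback branch counts as "the same meaning" for the backslash or as a second, context-dependent meaning is exactly what the two posters disagree about; the sketch just shows where that branch sits.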
Re: Unrecognized escape sequences in string literals
On Mon, 10 Aug 2009 08:21:03 -0700, Douglas Alan wrote: But you're right, it's too late to change this now. Not really. There is a procedure for making non-backwards compatible changes. If you care deeply enough about this, you could agitate for Python 3.2 to raise a PendingDeprecationWarning for unexpected escape sequences like \z, Python 3.3 to raise a DeprecationWarning, and Python 3.4 to treat it as an error. It may even be possible to skip the PendingDeprecationWarning and go straight to a DeprecationWarning in 3.2. -- Steven
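As far as I know, later Python releases did take roughly this route, on a different schedule: 3.6 began emitting a DeprecationWarning for unrecognized escapes and 3.12 upgraded it to a SyntaxWarning. A minimal sketch for checking what a given interpreter does, assuming it still accepts the literal at all:

```python
import ast
import warnings

source = r'"foo\zbar"'   # literal containing an unrecognized escape

with warnings.catch_warnings(record=True) as caught:
    # Record every warning instead of applying the default filters.
    warnings.simplefilter("always")
    value = ast.literal_eval(source)

# Names of the warning categories seen, if any.
categories = sorted({w.category.__name__ for w in caught})
print(repr(value), categories)
```

Replacing the "always" filter with "error" is a common way to make such a warning fatal during testing.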
Re: Unrecognized escape sequences in string literals
On Sun, 09 Aug 2009 12:26:54 -0700, Douglas Alan wrote: A friend of mine is just learning Python, and he's a bit tweaked about how unrecognized escape sequences are treated in Python. ... In any case, I think my friend should mellow out a bit, but we both consider this something of a wart. He's just more wart-phobic than I am. Is there any way that this behavior can be considered anything other than a wart? Other than the unconvincing claim that you can use this feature to save you a bit of typing sometimes when you actually want a backslash to be in your string? I'd put it this way: a backslash is just an ordinary character, except when it needs to be special. So Python's behaviour is treat backslash as a normal character, except for these exceptions while the behaviour your friend wants is treat a backslash as an error, except for these exceptions. Why should a backslash in a string literal be an error? -- Steven -- http://mail.python.org/mailman/listinfo/python-list
Re: Unrecognized escape sequences in string literals
Steven D'Aprano wrote: Why should a backslash in a string literal be an error? Because in Python, if my friend sees the string foo\xbar\n, he has no idea whether the \x is an escape sequence, or if it is just the characters \x, unless he looks it up in the manual, or tries it out in the REPL, or what have you. My friend is adamant that it would be better if he could just look at the string literal and know. He doesn't want to be bothered to have to store stuff like that in his head. He wants to be able to figure out programs just by looking at them, to the maximum degree that that is feasible. In comparison to Python, in C++, he can just look at foo\xbar\n and know that \x is a special character. (As long as it compiles without warnings under g++.) He's particularly annoyed too, that if he types foo\xbar at the REPL, it echoes back as foo\\xbar. He finds that to be some sort of annoying DWIM feature, and if Python is going to have DWIM features, then it should, for example, figure out what he means by \ and not bother him with a syntax error in that case. Another reason that Python should not behave the way that it does, is that it pegs Python into a corner where it can't add new escape sequences in the future, as doing so will break existing code. Generating a syntax error instead for unknown escape sequences would allow for future extensions. Now not to pick on Python unfairly, most other languages have similar issues with escape sequences. (Except for the Bourne Shell and bash, where \x always just means x, no matter what character x happens to be.) But I've been telling my friend for years to switch to Python because of how wonderful and consistent Python is in comparison to most other languages, and now he seems disappointed and seems to think that Python is just more of the same. Of course I think that he's overreacting a bit. My point of view is that every language has *some* warts; Python just has a bit fewer than most. 
It would have been nice, I should think, if this wart had been fixed in Python 3, as I do consider it to be a minor wart. |ouglas -- http://mail.python.org/mailman/listinfo/python-list
Re: Unrecognized escape sequences in string literals
On Aug 9, 5:06 pm, Steven D'Aprano st...@remove-this-cybersource.com.au wrote: On Sun, 09 Aug 2009 12:26:54 -0700, Douglas Alan wrote: A friend of mine is just learning Python, and he's a bit tweaked about how unrecognized escape sequences are treated in Python. ... In any case, I think my friend should mellow out a bit, but we both consider this something of a wart. He's just more wart-phobic than I am. Is there any way that this behavior can be considered anything other than a wart? Other than the unconvincing claim that you can use this feature to save you a bit of typing sometimes when you actually want a backslash to be in your string? I'd put it this way: a backslash is just an ordinary character, except when it needs to be special. So Python's behaviour is treat backslash as a normal character, except for these exceptions while the behaviour your friend wants is treat a backslash as an error, except for these exceptions. Why should a backslash in a string literal be an error? Because the behavior of \ in a string is context-dependent, which means a reader can't know if \ is a literal character or escape character without knowing the context, and it means an innocuous change in context can cause a rather significant change in \. IOW it's an error-prone mess. It would be better if Python (like C) treated \ consistently as an escape character. (And in raw strings, consistently as a literal.) It's kind of a minor issue in terms of overall real-world importance, but in terms of raw unPythonicness this might be the worst offense the language makes. Carl Banks
Re: Unrecognized escape sequences in string literals
On Aug 9, 8:06 pm, Steven D'Aprano wrote: while the behaviour your friend wants is treat a backslash as an error, except for these exceptions. Besides, can't all error situations be described as, treat the error situation as an error, except for the exception of when the situation isn't an error??? The behavior my friend wants isn't any more exceptional than that! |ouglas -- http://mail.python.org/mailman/listinfo/python-list