On Tue, 11 Aug 2009 14:29:43 -0700, Douglas Alan wrote:

> I need to preface this entire post with the fact that I've already used
> ALL of the arguments that you've provided on my friend before I ever
> even came here with the topic, and my own arguments on why Python can be
> considered to be doing the right thing on this issue didn't even
> convince ME, much less him. When I can't even convince myself with an
> argument I'm making, then you know there's a problem with it!


I hear all your arguments, and to play Devil's Advocate I repeat them, 
and they don't convince me either. So by your logic, there's obviously a 
problem with your arguments as well!

That problem basically boils down to a deep-seated disagreement over 
which philosophy a language should follow in regard to backslash 
escapes:

"Anything not explicitly permitted is forbidden"

versus  

"Anything not explicitly forbidden is permitted"

Python explicitly permits all escape sequences, each with well-defined 
behaviour; the only forbidden sequences are the ones explicitly listed:

* hex escapes with invalid hex digits;

* octal escapes with invalid octal digits;

* Unicode named escapes with unknown names;

* 16- and 32-bit Unicode escapes with invalid hex digits.
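
For example, under Python 2.x (the named-escape and \u cases are 
likewise rejected, with error messages that vary from version to 
version):

>>> "\xGG"    # hex escape with invalid hex digits: rejected
ValueError: invalid \x escape
>>> "\y"      # no rule forbids \y: permitted, with defined behaviour
'\\y'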

C++ apparently forbids all escape sequences except for a handful of 
explicitly permitted ones, with unspecified behaviour if you use a 
forbidden sequence.

That's not better, it's merely different.

Actually, that's not true: forbidding a thing but leaving the 
consequences of doing that thing unspecified, as the C++ standard does, 
is clearly a Bad Thing.



[...]

>> Apart from the lack of warning, what actually is the difference between
>> Python's behavior and C++'s behavior?
> 
> That question makes just about as much sense as, "Apart from the lack of
> a fatal error, what actually is the difference between Python's behavior
> and C++'s?"

This is what I get:

[steve ~]$ cat test.cc
#include <iostream>
int main(int argc, char* argv[])
{
    std::cout << "x\yz" << std::endl;
    return 0;
}
[steve ~]$ g++ test.cc -o test
test.cc:4:14: warning: unknown escape sequence '\y'
[steve ~]$ ./test
xyz


So on at least one machine in the world, C++ simply strips out 
backslashes that it doesn't recognise, leaving the suffix behind. 
Unfortunately, we can't rely on that, because the behaviour is 
underspecified. Fortunately this is not a problem with Python, which 
completely specifies the behaviour of escape sequences, so there are no 
surprises.
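
For contrast, the same string in Python. Python has always kept the 
backslash (recent 3.x releases also warn about such escapes, but the 
resulting string is the same):

s = "x\yz"
assert len(s) == 4    # x, backslash, y, z -- nothing stripped
assert s == "x\\yz"   # identical to the explicitly-escaped form
print(s)              # prints x\yz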



[...]

>> I disagree with your sense of aesthetics. I think that having to write
>> \\y when I want \y just to satisfy a bondage-and-discipline compiler is
>> ugly. That's not to deny that B&D is useful on occasion, but in this
>> case I believe the benefit is negligible, and so even a tiny cost is
>> not worth the pain.
> 
> EXPLICIT IS BETTER THAN IMPLICIT.

Quoting the Zen without understanding (especially shouting) doesn't 
impress anyone. There's nothing implicit about escape sequences. \y is 
perfectly explicit. Look Ma, there's a backslash, and a y, it gives a 
backslash and a y!

Implicit has an actual meaning. You shouldn't use it as a mere term of 
opprobrium for anything you don't like.



>> > (2) That argument disagrees with the Python reference manual, which
>> > explicitly states that "unrecognized escape sequences are left in the
>> > string unchanged", and that the purpose for doing so is because it
>> > "is useful when debugging".
>>
>> How does it disagree? \y in the source code mapping to \y in the string
>> object is the sequence being left unchanged. And the usefulness of
>> doing so is hardly a disagreement over the fact that it does so.
> 
> Because you've stated that "\y" is a legal escape sequence, while the
> Python Reference Manual explicitly states that it is an "unrecognized
> escape sequence", and that such "unrecognized escape sequences" are
> sources of bugs.

There's that reading comprehension problem again.

Unrecognised != illegal.

"Useful for debugging" != "source of bugs". If they were equal, we could 
fix an awful lot of bugs by throwing away our debugging tools.

Here's the URL to the relevant page:
http://www.python.org/doc/2.5.2/ref/strings.html

It seems to me that the behaviour the Python designers were looking to 
avoid was the case where the coder accidentally inserted a backslash in 
the wrong place, and the language stripped the backslash out, e.g.:

Wanted "a\bcd" but accidentally typed "ab\cd" instead, and got "abcd".

(This is what Bash does by design, and at least some C/C++ compilers do, 
perhaps by accident, perhaps by design.)

In that case, with no obvious backslash, the user may not even be aware 
that there was a problem:

s = "ab\cd"  # assume the backslash is silently discarded
assert len(s) == 4
assert s[3] == 'c'
assert '\\' not in s

All of these tests would wrongly pass, but with Python's behaviour of 
leaving the backslash in, they would all fail, and the string is visually 
distinctive (it has an obvious backslash in it).
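
And that's exactly what happens; checking in the interactive 
interpreter:

>>> s = "ab\cd"   # Python keeps the backslash
>>> len(s)
5
>>> s[2]
'\\'
>>> '\\' in s
True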

Now, if you consider that \c should be an error, then obviously it would 
be even better if "ab\cd" would raise a SyntaxError. But why consider \c 
to be an error?



[invalid hex escape sequences]

>> > What makes it "illegal"? As far as I can tell, it's just another
>> > "unrecognized escape sequence".
>>
>> No, it's recognized, because \x is the prefix for an hexadecimal escape
>> code. And it's illegal, because it's missing the actual hexadecimal
>> digits.
> 
> So? Why does that make it "illegal" rather than merely "unrecognized?"

Because the empty string is not a legal pair of hex digits.

In '\y', the suffix y is a legal character, but it isn't recognized as a 
"special" character.

In '\x', the suffix '' is not a pair of hex digits. Since hex-escapes are 
documented as requiring a pair of hex digits, this is an error.
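
The interactive interpreter makes the distinction concrete (shown under 
Python 2.x; Python 3 reports the same mistake as a SyntaxError):

>>> '\y'    # legal suffix, just not a special one: kept unchanged
'\\y'
>>> '\x'    # recognised prefix missing its pair of hex digits
ValueError: invalid \x escape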


[...]
> Because anyone with common sense will agree that "\y" is an illegal
> escape sequence. 

"No True Scotsman would design a language that behaves like that!!!!"

Why should it be illegal? It seems like a perfectly valid escape sequence 
to me, so long as the semantics are specified explicitly.



[...]

>> > It may not be a complex form of DWIMing, but it's still DWIMing a
>> > bit.  Python is figuring that if I typed "\z", then either I must
>> > have really meant to type "\\z",
>>
>> Nope, not in the least. Python NEVER EVER EVER tries to guess what you
>> mean.
> 
> Neither does Perl. That doesn't mean that Perl isn't often DWIMy.

Fine, but we're not discussing Perl, we're discussing Python. Perl's 
DWIMiness is irrelevant.



>> This is *exactly* like C++, except that in Python the semantics of \y
>> and \\y are identical. Python doesn't guess what you mean, it *imposes*
>> a meaning on the escape sequence. You just don't like that meaning.
> 
> That's because I don't like things that are ill-conceived.

And yet you like C++... go figure *wink*


-- 
Steven