On 2012-10-11 06:34, Greg Ewing wrote:
Steven D'Aprano wrote:
If you escape a character, you should get
something. If it's a special character, you get the special meaning.
If it's not, escaping should be transparent: escaping something that
doesn't need escaping is a null op.

I think that calling "\n", "\t" etc. "escape sequences" is a misnomer
that is causing confusion in this discussion.

The term "escape" in this context means to prevent something from
having a special meaning that it would otherwise have. But the
backslash in these is being used to *give* a special meaning to
the following character.

In Python string literals, the only true escape sequences associated
with the backslash are '\\', "\'" and '\"'.
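
A small sketch of the distinction being drawn here, using only ordinary string literals. The list names are just labels chosen for this illustration:

```python
# "True" escapes: the backslash removes a special meaning, so the result
# is simply the character that follows the backslash.
true_escapes = ['\\', "\'", '\"']

# Prefix use: the backslash *gives* the following letter a special meaning,
# so the result is a different character from the letter itself.
prefixed = ["\n", "\t"]

for s in true_escapes:
    print(len(s), repr(s))   # each one is a single character

for s in prefixed:
    print(len(s), ord(s))    # single characters, but not 'n' or 't'
```

In the first group the backslash disappears and the escaped character survives; in the second the letter disappears and a control character takes its place.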

So the backslash is a bit schizophrenic -- sometimes it's an escape
character, sometimes it's a prefix that imparts a special meaning.

This means that "\c" where c is not special in any way is somewhat
ambiguous. Are you redundantly escaping something that doesn't
need it, are you asking for a special meaning that doesn't exist
(which is probably a mistake), or do you just want a literal
backslash?

Python guesses that you want a literal backslash. This seems to be
motivated by the desire to minimise the need for backslash doubling.
That sounds fine in theory, but I don't think it helps much in
practice. I for one don't trust myself to keep the entire set of
special characters in my head, including all the rarely-used ones,
so I end up doubling every backslash anyway.
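
To see the guess in action, here is a minimal sketch. The helper `evaluate_literal` is a hypothetical name introduced for this example; it evaluates the source text of a literal with warnings silenced, because more recent Python versions warn about unrecognised backslash sequences (and this assumes an interpreter that still accepts them).

```python
import warnings

def evaluate_literal(source: str) -> str:
    """Evaluate the source text of a string literal, ignoring any warning
    the compiler emits for an unrecognised backslash sequence."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return eval(source)

# Raw strings hold the literal's source text, quotes included.
recognised = evaluate_literal(r'"\n"')     # backslash gives 'n' a special meaning
unrecognised = evaluate_literal(r'"\c"')   # 'c' has no special meaning
doubled = evaluate_literal(r'"\\c"')       # explicitly doubled backslash

print(len(recognised))          # one character: a newline
print(len(unrecognised))        # two characters: the backslash is kept
print(unrecognised == doubled)  # the guess matches the doubled form
```

So "\c" and "\\c" end up as the same two-character string, which is why doubling every backslash is always safe.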

Given that, I wouldn't have minded at all if Python had refused
to guess in this case, and raised a compile-time error. That would
have left the way open for extending the set of special chars in
the future.

Adding a new escape sequence is almost as big a step as adding a new
built-in or new syntax. I see that as a good thing: it discourages too
many requests for new escape sequences.

I don't see it makes much difference. We get plenty of requests for
new syntax of all kinds, and we seem to have enough sense to reject
them unless they're backed by extremely good arguments. There's no
reason requests for new special chars should be treated any differently.

My own preference is that a backslash followed by an ASCII letter or
digit either has a special meaning currently (with a compile-time error
if it's not correctly formed) or is reserved for future use (with a
compile-time error currently), and that a backslash followed by any other
character (codepoint) is a literal (although there may be some exceptions
to that, such as a backslash followed by a newline being ignored).
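
A rough sketch of that rule, to make it concrete. This is not how Python behaves; `decode_strict` and `SIMPLE_ESCAPES` are hypothetical names, and for brevity the sketch only knows the single-letter escapes, so multi-character forms such as \x41, \u0041 or octal escapes are not handled and would be reported as reserved.

```python
# Hypothetical decoder for the proposed rule (illustration only).
SIMPLE_ESCAPES = {
    "n": "\n", "t": "\t", "r": "\r",
    "a": "\a", "b": "\b", "f": "\f", "v": "\v",
}

def decode_strict(body: str) -> str:
    """Decode the body of a string literal under the proposed rule."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("backslash at end of literal")
        nxt = body[i + 1]
        if nxt in SIMPLE_ESCAPES:
            # Letter with a currently defined special meaning.
            out.append(SIMPLE_ESCAPES[nxt])
        elif nxt.isascii() and nxt.isalnum():
            # Any other ASCII letter or digit is reserved: refuse to guess.
            raise ValueError(f"reserved escape sequence: \\{nxt}")
        elif nxt == "\n":
            # Exception mentioned above: backslash-newline is ignored.
            pass
        else:
            # Any other character is taken literally.
            out.append(nxt)
        i += 2
    return "".join(out)

print(repr(decode_strict(r"a\tb")))   # tab between 'a' and 'b'
print(repr(decode_strict(r"\'\\")))   # a quote followed by one backslash
try:
    decode_strict(r"\c")
except ValueError as exc:
    print("rejected:", exc)
```

Under this scheme "\c" is an error today, so giving it a meaning later could not silently change any existing program.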
--
http://mail.python.org/mailman/listinfo/python-list