Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]

2009-08-16 Thread Hendrik van Rooyen

Steven D'Aprano st...@remove-this-c...e.com.au wrote:

Now that I understand what the semantics of cout << "Hello world" are, I
don't have any problem with it either. It is a bit weird, "Hello world"
>> cout would probably be better, but it's hardly the strangest design in
any programming language, and it's probably influenced by input
redirection using < in various shells.

I find it strange that you would prefer:

"Hello world" >> cout
over:
cout << "Hello world"

The latter seems to me to be more in line with normal assignment: -
Take what is on the right and make the left the same.
I suppose it is because we read from left to right that the first one seems 
better to you.
Another instance of how different we all are.

It goes down to the assembler - there are two schools:

mov a,b  - for Intel-like languages, this means move b to a
mov a,b  - for Motorola-like languages, this means move a to b

Gets confusing sometimes.

- Hendrik



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]

2009-08-16 Thread Steven D'Aprano
On Sun, 16 Aug 2009 09:24:36 +0200, Hendrik van Rooyen wrote:

Steven D'Aprano st...@remove-this-c...e.com.au wrote:
 
Now that I understand what the semantics of cout << "Hello world" are, I
don't have any problem with it either. It is a bit weird, "Hello world"
>> cout would probably be better, but it's hardly the strangest design in
any programming language, and it's probably influenced by input
redirection using < in various shells.
 
 I find it strange that you would prefer:
 
 "Hello world" >> cout
 over:
 cout << "Hello world"
 
 The latter seems to me to be more in line with normal assignment: - Take
 what is on the right and make the left the same. 

I don't like normal assignment. After nearly four decades of mathematics 
and programming, I'm used to it, but I don't think it is especially good. 
It confuses beginners to programming: they get one set of behaviour 
drilled into them in maths class, and then in programming class we use 
the same notation for something which is almost, but not quite, the same. 
Consider the difference between:

y = 3 + x
x = z

as a pair of mathematics expressions versus as a pair of assignments. 
What conclusion can you draw about y and z?
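
The difference can be made concrete with a short Python sketch. The starting values for x and z are assumptions chosen purely for illustration; the point is only that the two statements run in order, so y is computed from the old x and never sees z.

```python
# Illustrative starting values (assumptions, not from the original post).
x = 10
z = 4

y = 3 + x   # uses the value x has right now, so y becomes 13
x = z       # rebinds x to 4; y is not recomputed

print(y)            # 13
print(x)            # 4
print(y == 3 + z)   # False
```

Read as a pair of equations, the two lines would imply y = 3 + z. Read as assignments, they imply nothing of the kind, which is the trap for beginners described above.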

Even though it looks funny due to unfamiliarity, I'd love to see the 
results of a teaching language that used notation like:

3 + x -> y
len(alist) -> n
Widget(1, 2, 3).magic -> obj
etc.

for assignment. My prediction is that it would be easier to learn, and 
just as good for experienced coders. The only downside (apart from 
unfamiliarity) is that it would be a little bit harder to find the 
definition of a variable by visually skimming lines of code: your eyes 
have to zig-zag back and forth to find the end of the line, instead of 
running straight down the left margin looking for "myvar = ". But it
should be easy enough to search for "-> myvar".
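
As a rough idea of what such a teaching notation could look like in practice, here is a minimal Python sketch of a toy interpreter for lines of the form `expression -> name`. The function name `run` and the use of Python's own `eval()` for the expression part are assumptions made for illustration; no existing teaching language is being described.

```python
def run(program, env=None):
    """Execute lines of the form `expression -> name` and return the bindings.

    Hypothetical toy notation. Each expression is handed to Python's eval(),
    so this is only suitable for trusted input.
    """
    env = {} if env is None else env
    for line in program.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last arrow: everything before it is the expression,
        # everything after it is the target name.
        expression, arrow, name = line.rpartition("->")
        if not arrow:
            raise SyntaxError("expected 'expression -> name': " + line)
        env[name.strip()] = eval(expression, {}, env)
    return env


bindings = run("""
    [1, 2, 3] -> alist
    len(alist) -> n
    3 + n -> y
""")
print(bindings["n"], bindings["y"])   # 3 6
```

Finding where a name is defined then amounts to searching for `-> n`, as suggested above.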


 I suppose it is because
 we read from left to right that the first one seems better to you.

Probably.


-- 
Steven


Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]

2009-08-16 Thread Douglas Alan
On Aug 16, 4:22 am, Steven D'Aprano st...@remove-this-
cybersource.com.au wrote:

 I don't like normal assignment. After nearly four decades of mathematics
 and programming, I'm used to it, but I don't think it is especially good.
 It confuses beginners to programming: they get one set of behaviour
 drilled into them in maths class, and then in programming class we use
 the same notation for something which is almost, but not quite, the same.
 Consider the difference between:

 y = 3 + x
 x = z

 as a pair of mathematics expressions versus as a pair of assignments.
 What conclusion can you draw about y and z?

Yeah, the syntax most commonly used for assignment today sucks. In the
past, it was common to see languages with syntaxes like

   y <- y + 1

or

   y := y + 1

or

   let y = y + 1

But these languages have mostly fallen out of favor. The popular
statistical programming language R still uses the

   y <- y + 1

syntax, though.

Personally, my favorite is Lisp, which looks like

   (set! y (+ y 1))

or

   (let ((x 3)
         (y 4))
     (foo x y))

I like to be able to read everything from left to right, and Lisp does
that more than any other programming language.

I would definitely not like a language that obscures assignment by
moving it over to the right side of lines.

|ouglas



Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]

2009-08-16 Thread Erik Max Francis

Steven D'Aprano wrote:
I don't like normal assignment. After nearly four decades of mathematics 
and programming, I'm used to it, but I don't think it is especially good. 
It confuses beginners to programming: they get one set of behaviour 
drilled into them in maths class, and then in programming class we use 
the same notation for something which is almost, but not quite, the same. 
Consider the difference between:


y = 3 + x
x = z

as a pair of mathematics expressions versus as a pair of assignments. 
What conclusion can you draw about y and z?


What you're saying is true, but it's still a matter of terminology.  The
symbol = means different things in different contexts, and mathematics
and programming are very different ones indeed.  The problem is
compounded by early languages which lazily conflated the two, such as
(but not exclusive to) BASIC using = for both assignment and equality
testing in what are in essence totally unrelated contexts.


Even though it looks funny due to unfamiliarity, I'd love to see the 
results of a teaching language that used notation like:


3 + x -> y
len(alist) -> n
Widget(1, 2, 3).magic -> obj
etc.

for assignment. My prediction is that it would be easier to learn, and 
just as good for experienced coders.


This really isn't new at all.  Reverse the arrow and the relationship to
get:


y <- x + 3

(and use a real arrow rather than ASCII) and that's assignment in APL 
and a common representation in pseudocode ever since.  Change it to := 
and that's what Pascal used, as well as quite a few mathematical papers 
dealing with iterative computations, I might add.


Once you get past the point of realizing that you really need to make a 
distinction between assignment and equality testing, then it's just a 
matter of choosing two different operators for the job.  Whether it's 
<-/= or :=/= or =/== or ->/= (with reversed behavior for assignment) is
really academic and a matter of taste at that point.


Given the history of programming languages, it doesn't really look like 
the to-be-assigned variable being at the end of expression is going to 
get much play, since not a single major one I'm familiar with does it 
that way, and a lot of them have come up with the same convention 
independently and haven't seen a need to change.


--
Erik Max Francis  m...@alcyone.com  http://www.alcyone.com/max/
 San Jose, CA, USA  37 18 N 121 57 W  AIM/Y!M/Skype erikmaxfrancis
  Get there first with the most men.
   -- Gen. Nathan Bedford Forrest, 1821-1877


Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]

2009-08-16 Thread Erik Max Francis

Douglas Alan wrote:

Personally, my favorite is Lisp, which looks like

   (set! y (+ y 1))


For varying values of Lisp.  `set!` is Scheme.

--
Erik Max Francis  m...@alcyone.com  http://www.alcyone.com/max/
 San Jose, CA, USA  37 18 N 121 57 W  AIM/Y!M/Skype erikmaxfrancis
  Get there first with the most men.
   -- Gen. Nathan Bedford Forrest, 1821-1877


Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]

2009-08-16 Thread Douglas Alan
On Aug 16, 4:48 am, Erik Max Francis m...@alcyone.com wrote:
 Douglas Alan wrote:
  Personally, my favorite is Lisp, which looks like

     (set! y (+ y 1))

 For varying values of Lisp.  `set!` is Scheme.

Yes, I'm well aware!

There are probably as many different dialects of Lisp as all other
programming languages put together.

|ouglas


Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]

2009-08-16 Thread Steven D'Aprano
On Sun, 16 Aug 2009 01:41:41 -0700, Douglas Alan wrote:

 I like to be able to read everything from left to right, and Lisp does
 that more than any other programming language.
 
 I would definitely not like a language that obscures assignment by
 moving it over to the right side of lines.

One could argue that left-assigned-from-right assignment obscures the 
most important part of the assignment, namely *what* you're assigning, in 
favour of what you're assigning *to*.

In any case, after half a century of left-from-right assignment, I think 
it's worth the experiment in a teaching language or three to try it the 
other way. The closest to this I know of is the family of languages 
derived from Apple's Hypertalk, where you do assignment with:

put somevalue into name

(Doesn't COBOL do something similar?)

Beginners found that *very* easy to understand, and it didn't seem to 
make coding harder for experienced Hypercard developers.



-- 
Steven


Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]

2009-08-16 Thread Hendrik van Rooyen
On Sunday 16 August 2009 12:18:11 Steven D'Aprano wrote:

 In any case, after half a century of left-from-right assignment, I think
 it's worth the experiment in a teaching language or three to try it the
 other way. The closest to this I know of is the family of languages
 derived from Apple's Hypertalk, where you do assignment with:

 put somevalue into name

 (Doesn't COBOL do something similar?)

Yup.

move banana to pineapple.

move accountnum in inrec to accountnum in outrec.

move corresponding inrec to outrec.

It should all be upper case of course...

I cannot quite recall, but I have the feeling that in the second form, "of"
was also allowed instead of "in", but it has been a while now so I am
probably wrong.

The move was powerful - it would do conversions for you based on the types of 
the operands - it all just worked.

- Hendrik


Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]

2009-08-16 Thread MRAB

Douglas Alan wrote:
[snip]

C++ also allows for reading from stdin like so:

   cin >> myVar;

I think the direction of the arrows probably derives from languages
like APL, which had notation something like so:

 myVar <- 3
 [] <- myVar

<- was really a little arrow symbol (APL didn't use ascii), and the
first line above would assign the value 3 to myVar. In the second
line, the [] was really a little box symbol and represented the
terminal.  Assigning to the box would cause the output to be printed
on the terminal, so the above would output 3.  If you did this:

 [] -> myVar

It would read a value into myVar from the terminal.

APL predates Unix by quite a few years.


No, APL is strictly right-to-left.

-> x

means goto x.

Writing to the console is:

[] <- myVar

Reading from the console is:

myVar <- []


Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]

2009-08-16 Thread Douglas Alan
On Aug 16, 8:45 am, MRAB pyt...@mrabarnett.plus.com wrote:

 No, APL is strictly right-to-left.

      -> x

 means goto x.

 Writing to the console is:

      [] <- myVar

 Reading from the console is:

      myVar <- []

Ah, thanks for the correction. It's been 5,000 years since I used APL!

|ouglas


Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]

2009-08-16 Thread Douglas Alan
On Aug 16, 6:18 am, Steven D'Aprano st...@remove-this-
cybersource.com.au wrote:

 On Sun, 16 Aug 2009 01:41:41 -0700, Douglas Alan wrote:

  I would definitely not like a language that obscures assignment by
  moving it over to the right side of lines.

 One could argue that left-assigned-from-right assignment obscures the
 most important part of the assignment, namely *what* you're assigning, in
 favour of what you're assigning *to*.

The most important things are always the side-effects and the name-
bindings.

In a large program, it can be difficult to figure out where a name is
defined, or which version of a name a particular line of code is
seeing. Consequently languages should always go out of their way to
make tracking this as easy as possible.

Side effects are also a huge issue, and a source of many bugs. This is
one of the reasons that there are many functional languages that
prohibit or discourage side-effects. Side effects should be made as
obvious as is feasible.

This is why, for instance, in Scheme, variable assignment has an
exclamation mark in it. E.g.,

   (set! x (+ x 1))

The exclamation mark is to make the fact that a side effect is
happening there stand out and be immediately apparent. And C++
provides the const declaration for similar reasons.

 In any case, after half a century of left-from-right assignment, I think
 it's worth the experiment in a teaching language or three to try it the
 other way. The closest to this I know of is the family of languages
 derived from Apple's Hypertalk, where you do assignment with:

 put somevalue into name

That's okay with me, but only because the statement begins with "put",
which lets you know at the very beginning of the line that something
very important is happening. You don't have to scan all the way to the
right before you notice.

Still, I would prefer

   let name = somevalue

as the "let" gives me the heads up right away, and then immediately
after the "let" is the name that I might want to be able to scan for
quickly.

|ouglas


Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]

2009-08-16 Thread Nobody
On Sun, 16 Aug 2009 05:05:01 +, Steven D'Aprano wrote:

 Now that I understand what the semantics of cout << "Hello world" are, I
 don't have any problem with it either. It is a bit weird, "Hello world"
 >> cout would probably be better,

Placing the stream on the LHS allows the main forms of << to be
implemented as methods of the ostream class. C++ only considers the LHS
operand when attempting to resolve an infix operator as a method.

Also, << and >> are left-associative, and that cannot be changed by
overloading. Having the ostream on the LHS allows the operators to be
chained:

cout << "Hello" << ", " << "world" << endl

equivalent to:

(((cout << "Hello") << ", ") << "world") << endl

[operator<< returns the ostream as its result.]

Even if you could make >> right-associative, the values would have to be
written right-to-left:

endl >> "world" >> ", " >> "Hello" >> cout
i.e.:
endl >> ("world" >> (", " >> ("Hello" >> cout)))
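
The chaining argument carries over to Python, where << is also left-associative. The sketch below is a Python analogue rather than C++ iostream code: the `Stream` class is a hypothetical stand-in for an ostream, and its `__lshift__` method returns the stream itself so each << in the chain again has a stream on its left.

```python
import io


class Stream:
    """Hypothetical stand-in for a C++ ostream; not part of any real library."""

    def __init__(self):
        self.buffer = io.StringIO()

    def __lshift__(self, value):
        # Write the right-hand operand, then return the stream itself so
        # the next << in the chain has a stream on its left again.
        self.buffer.write(str(value))
        return self


out = Stream()
out << "Hello" << ", " << "world" << "\n"

# Left-associativity means the chain above groups like this:
grouped = Stream()
(((grouped << "Hello") << ", ") << "world") << "\n"

print(out.buffer.getvalue() == grouped.buffer.getvalue())   # True
```

Unlike the C++ method lookup described above, Python can also dispatch on the right-hand operand through `__rlshift__`, so this sketch mirrors only the associativity point, not the overload-resolution one.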



Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]

2009-08-15 Thread Hendrik van Rooyen
On Friday 14 August 2009 18:11:52 Steven D'Aprano wrote:
 On Fri, 14 Aug 2009 07:07:31 -0700, Aahz wrote:
  I saw `cout' being shifted "Hello world" times to the left and stopped
  right there.  --Steve Gonedes

 Assuming that's something real, and not invented for humour, I presume
 that's describing something possible in C++. Am I correct? What the hell
 would it actually do???

It would shift cout left "Hello World" times.
It is unclear if the shift wraps around or not.

It is similar to a banana *holding his hands apart about a foot* this colour.

- Hendrik


Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]

2009-08-15 Thread Chris Rebert
On Sat, Aug 15, 2009 at 4:47 AM, Hendrik van
Rooyenhend...@microcorp.co.za wrote:
 On Friday 14 August 2009 18:11:52 Steven D'Aprano wrote:
 On Fri, 14 Aug 2009 07:07:31 -0700, Aahz wrote:
  I saw `cout' being shifted "Hello world" times to the left and stopped
  right there.  --Steve Gonedes

 Assuming that's something real, and not invented for humour, I presume
 that's describing something possible in C++. Am I correct? What the hell
 would it actually do???

 It would shift cout left "Hello World" times.
 It is unclear if the shift wraps around or not.

 It is similar to a banana *holding his hands apart about a foot* this colour.

 - Hendrik

I think you managed to successfully dereference the null pointer there...

Cheers,
Chris


Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]

2009-08-15 Thread Douglas Alan
On Aug 14, 10:25 pm, Dave Angel da...@ieee.org wrote:

 Benjamin Kaplan wrote:

  On Fri, Aug 14, 2009 at 12:42 PM, Douglas Alan darkwate...@gmail.comwrote:

  P.S. Overloading left shift to mean output does indeed seem a bit
  sketchy, but in 15 years of C++ programming, I've never seen it cause
  any confusion or bugs.

  The only reason it hasn't is because people use it in Hello World. I bet
  some newbie C++ programmers get confused the first time they see << used to
  shift.

People typically get confused by a *lot* of things when they learn a
new language. I think the better metric is how people fare with a
language feature once they've grown accustomed to the language, and
how long it takes them to acquire this familiarity.

 Actually, I've seen it cause confusion, because of operator precedence.  
 The logical shift operators have a fairly high level priority, so
 sometimes you need parentheses that aren't obvious.  Fortunately, most
 of those cases make compile errors.

I've been programming in C++ so long that for me, if there's any
confusion, it's the other way around. I see << or >> and I think I/O.
I don't immediately think shifting. Fortunately, shifting is a
pretty rare operation to actually use, which is perhaps why C++
reclaimed it for I/O.

On the other hand, you are right that the precedence of << is messed
up for I/O. I've never seen a real-world case where this causes a bug
in C++ code, because the static type-checker always seems to catch the
error. In a dynamically typed language, this would be a much more
serious problem.
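
To illustrate the dynamically typed case, here is a minimal Python sketch. The `Stream` class is hypothetical and exists only for this example. Because << binds tighter than &, the unparenthesized line writes the wrong value first and fails only when it actually runs.

```python
import io


class Stream:
    """Hypothetical stream-like class, for illustration only."""

    def __init__(self):
        self.buffer = io.StringIO()

    def __lshift__(self, value):
        self.buffer.write(str(value))
        return self


flags = 6
out = Stream()

try:
    # Intended: write (flags & 1). Because << binds tighter than &,
    # Python parses this as (out << flags) & 1.
    out << flags & 1
except TypeError as error:
    failure = error
else:
    failure = None

print(out.buffer.getvalue())    # 6: the unmasked value was written before the failure
print(failure is not None)      # True: the mistake only surfaces at run time

fixed = Stream()
fixed << (flags & 1)            # parentheses give the intended grouping
print(fixed.buffer.getvalue())  # 0
```

A static type-checker would reject the first form before the program ran; here the bad write has already happened by the time the error appears.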

|ouglas

P.S. I find it strange, however, that anyone who is not okay with
abusing operator overloading in this manner, wouldn't also take
umbrage at Python's overloading of + to work with strings and lists,
etc. Numerical addition and sequence concatenation have entirely
different semantics.
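
One concrete way to see the difference in semantics is commutativity, shown in this small Python sketch (the variable names are illustrative only): numeric + does not care about operand order, while sequence + does.

```python
# Numeric addition is commutative ...
numbers_commute = (1 + 2 == 2 + 1)

# ... but concatenation of strings or lists depends on operand order.
strings_commute = ("ab" + "cd" == "cd" + "ab")
lists_commute = ([1, 2] + [3] == [3] + [1, 2])

print(numbers_commute)   # True
print(strings_commute)   # False
print(lists_commute)     # False
```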


Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]

2009-08-15 Thread John Haggerty
I guess the problem is---does it actually matter?

On Fri, Aug 14, 2009 at 10:11 AM, Steven D'Aprano 
st...@remove-this-cybersource.com.au wrote:

 On Fri, 14 Aug 2009 07:07:31 -0700, Aahz wrote:

  I saw `cout' being shifted "Hello world" times to the left and stopped
  right there.  --Steve Gonedes

 Assuming that's something real, and not invented for humour, I presume
 that's describing something possible in C++. Am I correct? What the hell
 would it actually do???


 --
 Steven
 --
 http://mail.python.org/mailman/listinfo/python-list



Re: Unrecognized escape sequences in string literals

2009-08-15 Thread Douglas Alan
On Aug 14, 1:55 pm, Steven D'Aprano st...@remove-this-
cybersource.com.au wrote:

 Douglas, you and I clearly have a difference of opinion on
 this. Neither of us have provided even the tiniest amount
 of objective, replicable, reliable data on the
 error-proneness of the C++ approach versus that of
 Python. The supposed superiority of the C++ approach is
 entirely subjective and based on personal opinion instead
 of quantitative facts.

Alas, this is true for nearly any engineering methodology or
philosophy, which is why, I suppose, Perl, for instance,
still has its proponents. It's virtually impossible to prove
any thesis, and these things only get decided by endless
debate that rages across decades.

 I prefer languages that permit anything that isn't
 explicitly forbidden, so I'm happy that Python treats
 non-special escape sequences as valid,

I don't really understand what you mean by this. If Python
were to declare that unrecognized escape sequences were
forbidden, then they would be explicitly forbidden. Would
you then be happy?

If not, why are you not upset that Python won't let me do

   [3, 4, 5] + 2

Some other programming languages I've used certainly do.
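
For reference, this is what Python does with that expression, as a minimal sketch. The elementwise reading in the second half is an assumption about what "certainly do" means, since the post does not say which languages it has in mind.

```python
values = [3, 4, 5]

try:
    result = values + 2          # list + int is not defined in Python
except TypeError:
    result = None

print(result)                    # None

# One way to spell an elementwise reading explicitly:
shifted = [item + 2 for item in values]
print(shifted)                   # [5, 6, 7]
```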

 and your attempts to convince me that this goes against
 the Zen have entirely failed to convince me. As I've done
 before, I will admit that one consequence of this design
 is that it makes it hard to introduce new escape sequences
 to Python. Given that it's vanishingly rare to want to do
 so,

I'm not so convinced of that in the days of Unicode. If I
see, backslash, and then some Kanji character, what am I
supposed to make of that? For all I know, that Kanji
character might mean newline, and I'm seeing code for a
version of Python that was tweaked to be friendly to the
Japanese. And in the days where smart hand-held devices are
proliferating like crazy, there might be ever-more demand
for easy-to-use i/o that lets you control various aspects of
those devices.

|ouglas


Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]

2009-08-15 Thread Steven D'Aprano
On Sat, 15 Aug 2009 20:00:23 -0700, Douglas Alan wrote:

 So, as far as I can tell, Python has no real authority to throw stones
 at C++ on this little tiny particular issue.

I think you're being a tad over-defensive. I asked a genuine question 
about a quote in somebody's signature. That's a quote which can be found 
all over the Internet, and the poster using it has (as far as I know) no 
official capacity to speak for Python -- while Aahz is a high-profile, 
well-respected Pythonista, he's not Guido.

Now that I understand what the semantics of cout << "Hello world" are, I
don't have any problem with it either. It is a bit weird, "Hello world"
>> cout would probably be better, but it's hardly the strangest design in
any programming language, and it's probably influenced by input
redirection using < in various shells.


-- 
Steven


Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]

2009-08-15 Thread Douglas Alan
On Aug 16, 1:05 am, Steven D'Aprano st...@remove-this-
cybersource.com.au wrote:
 On Sat, 15 Aug 2009 20:00:23 -0700, Douglas Alan wrote:
  So, as far as I can tell, Python has no real authority to throw stones
  at C++ on this little tiny particular issue.

 I think you're being a tad over-defensive.

Defensive? Personally, I prefer Python over C++ by about a factor of
100X. I just find it a bit amusing when someone claims that some
programming language has a particular fatal flaw, when their own
apparently favorite language has the very same issue in an only
slightly different form.

 the poster using it has (as far as I know) no official capacity to speak
 for Python

I never thought he did. I wasn't speaking literally, as I'm not of
the opinion that any programming language has any literal authority or
any literal ability to throw stones.

 Now that I understand what the semantics of cout << "Hello world" are, I
 don't have any problem with it either. It is a bit weird, "Hello world" >> cout
 would probably be better, but it's hardly the strangest design in
 any programming language, and it's probably influenced by input
 redirection using < in various shells.

C++ also allows for reading from stdin like so:

   cin >> myVar;

I think the direction of the arrows probably derives from languages
like APL, which had notation something like so:

 myVar <- 3
 [] <- myVar

<- was really a little arrow symbol (APL didn't use ascii), and the
first line above would assign the value 3 to myVar. In the second
line, the [] was really a little box symbol and represented the
terminal.  Assigning to the box would cause the output to be printed
on the terminal, so the above would output 3.  If you did this:

 [] -> myVar

It would read a value into myVar from the terminal.

APL predates Unix by quite a few years.

|ouglas


Re: Unrecognized escape sequences in string literals

2009-08-14 Thread Aahz
In article 6e13754c-1fa6-4d1b-8861-146bffec8...@h30g2000vbr.googlegroups.com,
Douglas Alan  darkwate...@gmail.com wrote:

My friend begs to differ with the above. It would be much better for
debugging if Python generated a parsing error for unrecognized escape
sequences, rather than leaving them unchanged. g++ outputs a warning
for such escape sequences, for instance. This is what I would consider
to be the correct behavior. (Actually, I think it should just generate
a fatal parsing error, but a warning is okay too.)

Well, then, the usual response applies: create a patch, discuss it on
python-ideas, and see what happens.

(That is, nobody has previously complained so vociferously IIRC, and
adding a warning is certainly within the bounds of what's theoretically
acceptable.)
-- 
Aahz (a...@pythoncraft.com)   * http://www.pythoncraft.com/

I saw `cout' being shifted "Hello world" times to the left and stopped
right there.  --Steve Gonedes


OT Signature quote [was Re: Unrecognized escape sequences in string literals]

2009-08-14 Thread Steven D'Aprano
On Fri, 14 Aug 2009 07:07:31 -0700, Aahz wrote:

 I saw `cout' being shifted "Hello world" times to the left and stopped
 right there.  --Steve Gonedes

Assuming that's something real, and not invented for humour, I presume 
that's describing something possible in C++. Am I correct? What the hell 
would it actually do???


-- 
Steven


Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]

2009-08-14 Thread Grant Edwards
On 2009-08-14, Steven D'Aprano st...@remove-this-cybersource.com.au wrote:
 On Fri, 14 Aug 2009 07:07:31 -0700, Aahz wrote:

 I saw `cout' being shifted "Hello world" times to the left and stopped
 right there.  --Steve Gonedes

 Assuming that's something real, and not invented for humour, I presume 
 that's describing something possible in C++. Am I correct?

Yes.  In C++, the << operator is overloaded.  Judging by the
context in which I've seen it used, it does something like
write strings to a stream.

 What the hell
 would it actually do???

IIRC in C++, 

   cout << "Hello world";

is equivalent to this in C:

   printf("Hellow world");

or this in Python:

   print "hellow world"

-- 
Grant Edwards   grante Yow! Bo Derek ruined
  at   my life!
   visi.com


Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]

2009-08-14 Thread MRAB

Grant Edwards wrote:

On 2009-08-14, Steven D'Aprano st...@remove-this-cybersource.com.au wrote:

On Fri, 14 Aug 2009 07:07:31 -0700, Aahz wrote:


I saw `cout' being shifted "Hello world" times to the left and stopped
right there.  --Steve Gonedes
Assuming that's something real, and not invented for humour, I presume 
that's describing something possible in C++. Am I correct?


Yes.  In C++, the << operator is overloaded.  Judging by the
context in which I've seen it used, it does something like
write strings to a stream.


What the hell
would it actually do???


IIRC in C++, 


   cout << "Hello world";


It also returns cout, so you can chain them:

cout << "Hello, " << name << '\n';


is equivalent to this in C:

   printf("Hellow world");

or this in Python:

   print "hellow world"





Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]

2009-08-14 Thread Douglas Alan
On Aug 14, 12:17 pm, Grant Edwards inva...@invalid wrote:

 On 2009-08-14, Steven D'Aprano st...@remove-this-cybersource.com.au wrote:

  On Fri, 14 Aug 2009 07:07:31 -0700, Aahz wrote:

  I saw `cout' being shifted "Hello world" times to the left and stopped
  right there.  --Steve Gonedes

  Assuming that's something real, and not invented for humour, I presume
  that's describing something possible in C++. Am I correct?

 Yes.  In C++, the << operator is overloaded.  Judging by the
 context in which I've seen it used, it does something like
 write strings to a stream.

There's a persistent rumor that it is *this* very abuse of
overloading that caused Java to avoid operator overloading
altogether.

But then Java went and used + as the string concatenation
operator. Go figure!

|ouglas

P.S. Overloading left shift to mean output does indeed seem a bit
sketchy, but in 15 years of C++ programming, I've never seen it cause
any confusion or bugs.



Re: Unrecognized escape sequences in string literals

2009-08-14 Thread Steven D'Aprano
I think I've spent enough time on this discussion, so I won't be directly 
responding to any of your recent points -- it's clear that I'm not 
persuading you that there's any justification for any behaviour for 
escape sequences other than the way C++ deals with them. That's your 
prerogative, of course, but I've done enough tilting at windmills for 
this week, so I'll just make one final comment and then withdraw from an 
unproductive argument. (I will make an effort to read any final comments 
you wish to make, so feel free to reply. Just don't expect an answer to 
any questions.)

Douglas, you and I clearly have a difference of opinion on this. Neither 
of us have provided even the tiniest amount of objective, replicable, 
reliable data on the error-proneness of the C++ approach versus that of 
Python. The supposed superiority of the C++ approach is entirely 
subjective and based on personal opinion instead of quantitative facts.

I prefer languages that permit anything that isn't explicitly forbidden, 
so I'm happy that Python treats non-special escape sequences as valid, 
and your attempts to convince me that this goes against the Zen have 
entirely failed to convince me. As I've done before, I will admit that 
one consequence of this design is that it makes it hard to introduce new 
escape sequences to Python. Given that it's vanishingly rare to want to 
do so, and that wanting to add backslashes to strings is common, I think 
that's a reasonable tradeoff. Other languages may make different 
tradeoffs, and that's fine by me.
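
The behaviour under discussion can be checked with a short Python sketch. The helper `literal_value` is hypothetical, introduced only for this example. It evaluates a string literal from source text and silences warnings, because newer Python releases emit a warning for unrecognized escapes.

```python
import warnings


def literal_value(source):
    """Evaluate a string literal given as source text.

    Warnings are silenced because newer Python versions warn about
    unrecognized escape sequences.
    """
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return eval(source)


recognized = literal_value(r"'a\nb'")    # \n is a known escape: one newline character
unrecognized = literal_value(r"'a\db'")  # \d is not: backslash and 'd' both survive

print(len(recognized))     # 3
print(len(unrecognized))   # 4
```

The second result is the tradeoff described above: the backslash is kept as typed, so a later Python that gave \d a meaning would silently change what this literal means.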



-- 
Steven


Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]

2009-08-14 Thread Erik Max Francis

Grant Edwards wrote:

On 2009-08-14, Steven D'Aprano st...@remove-this-cybersource.com.au wrote:

What the hell
would it actually do???


IIRC in C++, 


   cout << "Hello world";

is equivalent to this in C:

   printf("Hellow world");

or this in Python:

   print "hellow world"


Well, plus or minus newlines.

--
Erik Max Francis  m...@alcyone.com  http://www.alcyone.com/max/
 San Jose, CA, USA  37 18 N 121 57 W  AIM/Y!M/Skype erikmaxfrancis
  It's hard to say what I want my legacy to be when I'm long gone.
   -- Aaliyah


Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]

2009-08-14 Thread Grant Edwards
On 2009-08-14, Erik Max Francis m...@alcyone.com wrote:
 Grant Edwards wrote:
 On 2009-08-14, Steven D'Aprano st...@remove-this-cybersource.com.au wrote:
 What the hell
 would it actually do???
 
 IIRC in C++, 
 
cout << "Hello world";
 
 is equivalent to this in C:
 
printf("Hellow world");
 
 or this in Python:
 
print "hellow world"

 Well, plus or minus newlines.

And a few miscellaneous typos...

-- 
Grant Edwards                   grante             Yow! I don't understand
                                  at               the HUMOUR of the THREE
                               visi.com            STOOGES!!


Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]

2009-08-14 Thread Erik Max Francis

Grant Edwards wrote:

On 2009-08-14, Erik Max Francis m...@alcyone.com wrote:

Grant Edwards wrote:

On 2009-08-14, Steven D'Aprano st...@remove-this-cybersource.com.au wrote:

What the hell
would it actually do???
IIRC in C++, 


   cout << "Hello world";

is equivalent to this in C:

   printf("Hellow world");

or this in Python:

   print "hellow world"

Well, plus or minus newlines.


And a few miscellaneous typos...


... and includes and namespaces :-).


--
Erik Max Francis  m...@alcyone.com  http://www.alcyone.com/max/
 San Jose, CA, USA  37 18 N 121 57 W  AIM/Y!M/Skype erikmaxfrancis
  It's hard to say what I want my legacy to be when I'm long gone.
   -- Aaliyah


Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]

2009-08-14 Thread Benjamin Kaplan
On Fri, Aug 14, 2009 at 12:42 PM, Douglas Alan darkwate...@gmail.com wrote:


 P.S. Overloading left shift to mean output does indeed seem a bit
 sketchy, but in 15 years of C++ programming, I've never seen it cause
 any confusion or bugs.



The only reason it hasn't is because people use it in Hello World. I bet
some newbie C++ programmers get confused the first time they see << used to
shift.


Re: Re: OT Signature quote [was Re: Unrecognized escape sequences in string literals]

2009-08-14 Thread Dave Angel

Benjamin Kaplan wrote:

On Fri, Aug 14, 2009 at 12:42 PM, Douglas Alan darkwate...@gmail.com wrote:

  

P.S. Overloading left shift to mean output does indeed seem a bit
sketchy, but in 15 years of C++ programming, I've never seen it cause
any confusion or bugs.





The only reason it hasn't is because people use it in Hello World. I bet
some newbie C++ programmers get confused the first time they see << used to
shift.

  
Actually, I've seen it cause confusion, because of operator precedence.  
The logical shift operators have a fairly high level priority, so 
sometimes you need parentheses that aren't obvious.  Fortunately, most 
of those cases make compile errors.



C++ has about 17 levels of precedence, plus some confusing associative 
rules.  And operator overloading does *NOT* change precedence.
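
For readers who would rather see the precedence point in Python terms: Python's shift operators also sit between arithmetic and the bitwise operators, and overloading does not move them. The sketch below is only an illustration. The `Out` class is hypothetical, invented here to mimic `cout`, and is not from anyone's post. It is written for Python 3.

```python
class Out:
    """Hypothetical stand-in for C++'s cout: '<<' appends text to a list."""

    def __init__(self):
        self.parts = []

    def __lshift__(self, value):
        self.parts.append(str(value))
        return self  # returning self allows chaining: out << a << b


out = Out()

# '+' binds tighter than '<<', so this appends "3".
out << 1 + 2

# '&' binds looser than '<<', so this parses as (out << 6) & 3.
# The shift runs first (appending "6"), then '&' fails because
# Out defines no '&' operator.
try:
    out << 6 & 3
    needs_parentheses = False
except TypeError:
    needs_parentheses = True

# With explicit parentheses the intended value (6 & 3 == 2) is appended.
out << (6 & 3)

print(out.parts, needs_parentheses)
```

Note that the failing line still appended "6" before raising, which is the kind of half-done result that makes these precedence slips confusing.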


DaveA



Re: Unrecognized escape sequences in string literals

2009-08-13 Thread Douglas Alan
On Aug 12, 7:19 pm, Steven D'Aprano st...@remove-this-
cybersource.com.au wrote:

 You are making an unjustified assumption: \y is not an error.

You are making an unjustified assumption that I ever made such an
assumption!

My claim is and has always been NOT that \y is innately an error, but
rather that treating unrecognized escape sequences as legal escape
sequences is error PRONE.

 While I'm amused that you've made my own point for me, I'm less
 amused that you seem to be totally incapable of seeing past your
 parochial language assumptions,

Where do you get the notion that my assumptions are in any sense
parochial? They come from (1) a great deal of experience programming
very reliable software, and (2) having learned at least two dozen
different programming languages in my life.

 I disagree with nearly everything you say in this post. I think
 that a few points you make have some validity, but the vast
 majority are based on a superficial and confused understanding
 of language design principles.

Whatever. I've taken two graduate level classes at MIT on programming
languages design, and got an A in both classes, and designed my own
programming language as a final project, and received an A+. But I
guess I don't really know anything about the topic at all.

 But it's not the only reasonable design choice, and Bash has
 made a different choice, and Python has made yet a third
 reasonable choice, and Pascal made yet a fourth reasonable choice.

And so did Perl and PHP, and whatever other programming language you
happen to mention. In fact, all programming languages are equally
good, so we might as well just freeze all language design as it is
now. Clearly we can do no better.

 One party insisting that red is the only logical colour for a
 car, and that anybody who prefers white or black or blue is
 illogical, is unacceptable.

If having all cars be red saved a lot of lives, or increased gas
mileage significantly, then it might very well be the best color for a
car. But of course, that is not the case. With programming languages,
there is much more likely to be an actual fact of the matter on which
sorts of language design decisions make programmers more productive on
average, and which ones result in more reliable software.

I will certainly admit that obtaining objective data on such things is
very difficult, but it's a completely different thing than one's color
preference for their car.

|>ouglas


Re: Unrecognized escape sequences in string literals

2009-08-12 Thread Steven D'Aprano
On Tue, 11 Aug 2009 14:48:24 -0700, Douglas Alan wrote:

 In any case, my argument has consistently been that Python should have
 treated undefined escape sequences consistently as fatal errors, 

A reasonable position to take. I disagree with it, but it is certainly 
reasonable.


 not as warnings.

I don't know what language you're talking about here, because non-special 
escape sequences in Python aren't either errors or warnings:

>>> print "ab\cd"
ab\cd

No warning is made, because it's not considered an error that requires a 
warning. This matches the behaviour of other languages, including C and 
bash.



-- 
Steven


Re: Unrecognized escape sequences in string literals

2009-08-12 Thread Steven D'Aprano
On Tue, 11 Aug 2009 13:20:52 -0700, Douglas Alan wrote:

 On Aug 11, 2:00 pm, Steven D'Aprano st...@remove-this-
 cybersource.com.au wrote:
 
  test.cpp:1:1: warning: unknown escape sequence '\y'

 Isn't that a warning, not a fatal error? So what does temp contain?
 
 My Annotated C++ Reference Manual is packed, and surprisingly in
 Stroustrup's Third Edition, there is no mention of the issue in the
 entire 1,000 pages. But Microsoft to the rescue:
 
  If you want a backslash character to appear within a string, you
  must type two backslashes (\\)
 
 (From http://msdn.microsoft.com/en-us/library/69ze775t.aspx)

Should I assume that Microsoft's C++ compiler treats it as an error, not 
a warning? Or is this *still* undefined behaviour, and MS C++ compiler 
will happily compile "ab\cd" to whatever it feels like?

 
 The question of what any specific C++ does if you ignore the warning is
 irrelevant, as such behavior in C++ is almost *always* undefined. Hence
 the warning.

So a C++ compiler which follows Python's behaviour would be behaving 
within the language specifications.

I note that the bash shell, which claims to follow C semantics, also does 
what Python does:

$ echo $'a s\trin\g with escapes'
a s rin\g with escapes


Explain to me again why we're treating underspecified C++ semantics, 
which may or may not do *exactly* what Python does, as if it were the One 
True Way of treating escape sequences?


-- 
Steven


Re: Unrecognized escape sequences in string literals

2009-08-12 Thread Steven D'Aprano
On Tue, 11 Aug 2009 14:29:43 -0700, Douglas Alan wrote:

 I need to preface this entire post with the fact that I've already used
 ALL of the arguments that you've provided on my friend before I ever
 even came here with the topic, and my own arguments on why Python can be
 considered to be doing the right thing on this issue didn't even
 convince ME, much less him. When I can't even convince myself with an
 argument I'm making, then you know there's a problem with it!


I hear all your arguments, and to play Devil's Advocate I repeat them, 
and they don't convince me either. So by your logic, there's obviously a 
problem with your arguments as well!

That problem basically boils down to a deep-seated philosophical 
disagreement over which philosophy a language should follow in regard to 
backslash escapes:

Anything not explicitly permitted is forbidden

versus  

Anything not explicitly forbidden is permitted

Python explicitly permits all escape sequences, with well-defined 
behaviour, with the only ones forbidden being those explicitly forbidden:

* hex escapes with invalid hex digits;

* oct escapes with invalid oct digits;

* Unicode named escapes with unknown names;

* 16- and 32-bit Unicode escapes with invalid hex digits.
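
A minimal sketch of the list above, assuming Python 3 (the thread itself uses Python 2 syntax). The helper name `literal_is_accepted` is made up for this illustration. The backslash is built at runtime so the example itself contains no questionable escapes. The octal case is left out on purpose: as far as I can tell, a backslash followed by a non-octal digit is simply treated as an unrecognized escape rather than rejected.

```python
import warnings

BS = chr(92)  # a single backslash character


def literal_is_accepted(body):
    """Return True if a double-quoted string literal with this body compiles."""
    source = '"' + body + '"'
    try:
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")  # hide any escape-sequence warnings
            compile(source, "<literal>", "eval")
    except SyntaxError:
        return False
    return True


# Malformed hex, 16-bit Unicode, and named Unicode escapes.
malformed = [BS + "xZZ", BS + "u12G4", BS + "N{NO SUCH NAME}"]
results = [literal_is_accepted(body) for body in malformed]

# A well-formed hex escape for comparison.
well_formed = literal_is_accepted(BS + "x41")

print(results, well_formed)
```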

C++ apparently forbids all escape sequences, with unspecified behaviour 
if you use a forbidden sequence, except for a handful of explicitly 
permitted sequences.

That's not better, it's merely different.

Actually, that's not true -- that the C++ standard forbids a thing, but 
leaves the consequences of doing that thing unspecified, is clearly a Bad 
Thing.



[...]

 Apart from the lack of warning, what actually is the difference between
 Python's behavior and C++'s behavior?
 
 That question makes just about as much sense as, Apart from the lack of
 a fatal error, what actually is the difference between Python's behavior
 and C++'s?

This is what I get:

[steve ~]$ cat test.cc
#include <iostream>
int main(int argc, char* argv[])
{
    std::cout << "x\yz" << std::endl;
    return 0;
}
[steve ~]$ g++ test.cc -o test
test.cc:4:14: warning: unknown escape sequence '\y'
[st...@soy ~]$ ./test
xyz


So on at least one machine in the world, C++ simply strips out 
backslashes that it doesn't recognise, leaving the suffix. Unfortunately, 
we can't rely on that, because C++ is underspecified. Fortunately this is 
not a problem with Python, which does completely specify the behaviour of 
escape sequences so there are no surprises. 



[...]

 I disagree with your sense of aesthetics. I think that having to write
 \\y when I want \y just to satisfy a bondage-and-discipline compiler is
 ugly. That's not to deny that B&D is useful on occasion, but in this
 case I believe the benefit is negligible, and so even a tiny cost is
 not worth the pain.
 
 EXPLICIT IS BETTER THAN IMPLICIT.

Quoting the Zen without understanding (especially shouting) doesn't 
impress anyone. There's nothing implicit about escape sequences. \y is 
perfectly explicit. Look Ma, there's a backslash, and a y, it gives a 
backslash and a y!

Implicit has an actual meaning. You shouldn't use it as a mere term of 
opprobrium for anything you don't like.



  (2) That argument disagrees with the Python reference manual, which
  explicitly states that unrecognized escape sequences are left in the
  string unchanged, and that the purpose for doing so is because it
  is useful when debugging.

 How does it disagree? \y in the source code mapping to \y in the string
 object is the sequence being left unchanged. And the usefulness of
 doing so is hardly a disagreement over the fact that it does so.
 
 Because you've stated that \y is a legal escape sequence, while the
 Python Reference Manual explicitly states that it is an unrecognized
 escape sequence, and that such unrecognized escape sequences are
 sources of bugs.

There's that reading comprehension problem again.

Unrecognised != illegal.

Useful for debugging != source of bugs. If they were equal, we could 
fix an awful lot of bugs by throwing away our debugging tools.

Here's the URL to the relevant page:
http://www.python.org/doc/2.5.2/ref/strings.html

It seems to me that the behaviour the Python designers were looking to 
avoid was the case where the coder accidentally inserted a backslash in 
the wrong place, and the language stripped the backslash out, e.g.:

Wanted a\bcd but accidentally typed ab\cd instead, and got abcd.

(This is what Bash does by design, and at least some C/C++ compilers do, 
perhaps by accident, perhaps by design.)

In that case, with no obvious backslash, the user may not even be aware 
that there was a problem:

s = "ab\cd"  # assume the backslash is silently discarded
assert len(s) == 4
assert s[2] == 'c'
assert '\\' not in s

All of these tests would wrongly pass, but with Python's behaviour of 
leaving the backslash in, they would all fail, and the string is visually 
distinctive (it has an obvious backslash in it).
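
To make the comparison concrete, here is a small sketch that decodes the same mistyped literal under both policies. It is a toy decoder written for this illustration, not how any real compiler or interpreter works. The `decode` function and its tiny `KNOWN` table are assumptions, and the code is written for Python 3.

```python
BS = chr(92)  # one backslash
KNOWN = {"n": "\n", "t": "\t", BS: BS}  # deliberately tiny table of known escapes


def decode(body, keep_unknown):
    """Decode backslash escapes in body under one of two policies."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == BS and i + 1 < len(body):
            nxt = body[i + 1]
            if nxt in KNOWN:
                out.append(KNOWN[nxt])
            elif keep_unknown:
                out.append(BS + nxt)  # Python-style: keep the backslash
            else:
                out.append(nxt)       # strip-style: drop the backslash
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


typed = "ab" + BS + "cd"  # the mistyped literal body: ab\cd
stripped = decode(typed, keep_unknown=False)
kept = decode(typed, keep_unknown=True)

print(repr(stripped), len(stripped), BS in stripped)
print(repr(kept), len(kept), BS in kept)
```

Under the strip policy the result is indistinguishable from a correctly typed abcd, while under the keep policy the stray backslash survives and shows up in the length and in any membership test.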

Now, if you consider that \c should be 

Re: Unrecognized escape sequences in string literals

2009-08-12 Thread Douglas Alan
On Aug 12, 3:08 am, Steven D'Aprano
ste...@remove.this.cybersource.com.au wrote:

 On Tue, 11 Aug 2009 14:48:24 -0700, Douglas Alan wrote:
  In any case, my argument has consistently been that Python should have
  treated undefined escape sequences consistently as fatal errors,

 A reasonable position to take. I disagree with it, but it is certainly
 reasonable.

  not as warnings.

 I don't know what language you're talking about here, because non-special
 escape sequences in Python aren't either errors or warnings:

  >>> print "ab\cd"

 ab\cd

I was talking about C++, whose compilers tend to generate warnings for
this usage. I think that the C++ compilers I've used take the right
approach, only ideally they should be *even* more emphatic, and
elevate the problem from a warning to an error.

I assume, however, that the warning is a middle ground between doing
the completely right thing, and, I assume, maintaining backward
compatibility with common C implementations. As Python never had to
worry about backward compatibility with C, Python didn't have to walk
such a middle ground.

On the other hand, *now* it has to worry about backward compatibility
with itself.

|>ouglas




Re: Unrecognized escape sequences in string literals

2009-08-12 Thread Douglas Alan
On Aug 12, 3:36 am, Steven D'Aprano
ste...@remove.this.cybersource.com.au wrote:

 On Tue, 11 Aug 2009 13:20:52 -0700, Douglas Alan wrote:

  My Annotated C++ Reference Manual is packed, and surprisingly in
  Stroustrup's Third Edition, there is no mention of the issue in the
  entire 1,000 pages. But Microsoft to the rescue:

       If you want a backslash character to appear within a string, you
       must type two backslashes (\\)

  (From http://msdn.microsoft.com/en-us/library/69ze775t.aspx)

 Should I assume that Microsoft's C++ compiler treats it as an error, not
 a warning?

In my experience, C++ compilers generally generate warnings for such
situations, where they can. (Clearly, they often can't generate
warnings for running off the end of an array, which is also undefined,
though a really smart C++ compiler might be able to generate a warning
in certain such circumstances.)

 Or is this *still* undefined behaviour, and MS C++ compiler
 will happily compile "ab\cd" to whatever it feels like?

If it's a decent compiler, it will generate a warning. Who can say
with Microsoft, however. It's clearly documented as illegal code,
however.

  The question of what any specific C++ does if you ignore the warning is
  irrelevant, as such behavior in C++ is almost *always* undefined. Hence
  the warning.

 So a C++ compiler which follows Python's behaviour would be behaving
 within the language specifications.

It might be, but there are also *recommendations* in the C++ standard
about what to do in such situations, and the recommendations say, I am
pretty sure, not to do that, unless the particular compiler in
question has to meet some very specific backward compatibility needs.

 I note that the bash shell, which claims to follow C semantics, also does
 what Python does:

 $ echo $'a s\trin\g with escapes'
 a s     rin\g with escapes

Really? Not on my computers. (One is a Mac, and the other is a Fedora
Core Linux box.) On my computers, bash doesn't seem to have *any*
escape sequences, other than \\, \", \$, and \`. It seems to treat
unknown escape sequences the same as Python does, but as there are
only four known escape sequences, and they are all meant merely to
guard against string interpolation, and the like, it's pretty darn
easy to keep straight.

 Explain to me again why we're treating underspecified C++ semantics,
 which may or may not do *exactly* what Python does, as if it were the One
 True Way of treating escape sequences?

I'm not saying that C++ does it right for Python. The right thing for
Python to do is to generate an error, as Python doesn't have to deal
with all the crazy complexities that C++ has to.

|>ouglas


Re: Unrecognized escape sequences in string literals

2009-08-12 Thread Douglas Alan
On Aug 12, 5:32 am, Steven D'Aprano
ste...@remove.this.cybersource.com.au wrote:

 That problem basically boils down to a deep-seated
 philosophical disagreement over which philosophy a
 language should follow in regard to backslash escapes:

 Anything not explicitly permitted is forbidden

 versus

 Anything not explicitly forbidden is permitted

No, it doesn't. It boils down to whether a language should:

(1) Try its best to detect errors as early as possible,
especially when the cost of doing so is low.

(2) Make code as readable as possible, in part by making
code as self-evident as possible by mere inspection and by
reducing the amount of stuff that you have to memorize. Perl
fails miserably in this regard, for instance.

(3) To quote Einstein, make everything as simple as
possible, and no simpler.

(4) Take innately ambiguous things and not force them to be
unambiguous by mere fiat.

Allowing a programmer to program using a completely
arbitrary resolution of unrecognized escape sequences
violates all of the above principles.

The fact that the meanings of unrecognized escape sequences
are ambiguous is proved by the fact that every language
seems to treat them somewhat differently, demonstrating that
there is no natural intuitive meaning for them.

Furthermore, allowing programmers to use unrecognized escape
sequences without raising an error violates:

(1) Explicit is better than implicit:

Python provides a way to explicitly specify that you want a
backslash. Every programmer should be encouraged to use
Python's explicit mechanism here.

(2) Simple is better than complex:

Python currently has two classes of ambiguously
interpretable escape sequences: unrecognized ones, and
illegal ones. Making a single class (i.e. just illegal
ones) is simpler.

Also, not having to memorize escape sequences that you
rarely have need to use is simpler.

(3) Readability counts:

See above comments on readability.

(4) Errors should never pass silently:

Even the Python Reference Manual indicates that unrecognized
escape sequences are a source of bugs. (See more comments on
this below.)

(5) In the face of ambiguity, refuse the temptation to
guess.

Every language, other than C++, is taking a guess at what
the programmer would find to be the most useful expansion for
unrecognized escape sequences, and each of the languages is
guessing differently. This temptation should be refused!

You can argue that once it is in the Reference Manual it is
no longer a guess, but that is patently specious, as Perl
proves. For instance, the fact that Perl will quietly convert
an array into a scalar for you, if you assign the array to a
scalar variable is certainly a guess of the sort that this
Python koan is referring to. Likewise for an arbitrary
interpretation of unrecognized escape sequences.

(6) There should be one-- and preferably only one --obvious
way to do it.

What is the one obvious way to express "\\y"? Is it "\\y" or
"\y"?

Python can easily make one of these ways the one obvious
way by making the other one raise an error.

(7) Namespaces are one honking great idea -- let's do more
of those!

Allowing \y to self-expand is intruding into the namespace
for special characters that require an escape sequence.

 C++ apparently forbids all escape sequences, with
 unspecified behaviour if you use a forbidden sequence,
 except for a handful of explicitly permitted sequences.

 That's not better, it's merely different.

It *is* better, as it catches errors early on at little
cost, and for all the other reasons listed above.

 Actually, that's not true -- that the C++ standard forbids
 a thing, but leaves the consequences of doing that thing
 unspecified, is clearly a Bad Thing.

Indeed. But C++ has backward compatibility issues that make
any that Python has to deal with pale in comparison. The
recommended behavior for a C++ compiler, however, is to flag
the problem as an error or as a warning.

 So on at least one machine in the world, C++ simply strips
 out backslashes that it doesn't recognize, leaving the
 suffix. Unfortunately, we can't rely on that, because C++
 is underspecified.

No, *fortunately* you can't rely on it, forcing you to go
fix your code.

 Fortunately this is not a problem with
 Python, which does completely specify the behaviour of
 escape sequences so there are no surprises.

It's not a surprise when the C++ compiler issues a warning to
you. If you ignore the warning, then you have no one to
blame but yourself.

 Implicit has an actual meaning. You shouldn't use it as a
 mere term of opprobrium for anything you don't like.

Pardon me, but I'm using implicit to mean implicit, and
nothing more.

Python's behavior here is implicit in the very same way
that Perl implicitly converts an array into a scalar for
you. (Though that particular Perl behavior is a far bigger
wart than Python's behavior is here!)

  Because you've stated that \y is a legal escape
  sequence, while the Python Reference Manual explicitly
  states 

Re: Unrecognized escape sequences in string literals

2009-08-12 Thread Steven D'Aprano
On Wed, 12 Aug 2009 14:21:34 -0700, Douglas Alan wrote:

 On Aug 12, 5:32 am, Steven D'Aprano
 ste...@remove.this.cybersource.com.au wrote:
 
 That problem basically boils down to a deep-seated philosophical
 disagreement over which philosophy a language should follow in regard
 to backslash escapes:

 Anything not explicitly permitted is forbidden

 versus

 Anything not explicitly forbidden is permitted
 
 No, it doesn't. It boils down to whether a language should:
 
 (1) Try its best to detect errors as early as possible, especially when
 the cost of doing so is low.

You are making an unjustified assumption: \y is not an error. It is only 
an error if you think that anything not explicitly permitted is forbidden.

While I'm amused that you've made my own point for me, I'm less amused 
that you seem to be totally incapable of seeing past your parochial 
language assumptions, even when those assumptions are explicitly pointed 
out to you. Am I wasting my time engaging you in discussion? 

There's a lot more I could say, but time is short, so let me just 
summarise:

I disagree with nearly everything you say in this post. I think that a 
few points you make have some validity, but the vast majority are based 
on a superficial and confused understanding of language design 
principles. (I won't justify that claim now, perhaps later, time 
permitting.) Nevertheless, I think that your ultimate wish -- for \y etc 
to be considered an error -- is a reasonable design choice, given your 
assumptions. But it's not the only reasonable design choice, and Bash has 
made a different choice, and Python has made yet a third reasonable 
choice, and Pascal made yet a fourth reasonable choice.

These are all reasonable choices, all have some good points and some bad 
points, but ultimately the differences between them are mostly arbitrary 
personal preference, like the colour of a car. Disagreements over 
preferences I can live with. One party insisting that red is the only 
logical colour for a car, and that anybody who prefers white or black or 
blue is illogical, is unacceptable.



-- 
Steven


Re: Unrecognized escape sequences in string literals

2009-08-12 Thread MRAB

Steven D'Aprano wrote:

On Wed, 12 Aug 2009 14:21:34 -0700, Douglas Alan wrote:


On Aug 12, 5:32 am, Steven D'Aprano
ste...@remove.this.cybersource.com.au wrote:


That problem basically boils down to a deep-seated philosophical
disagreement over which philosophy a language should follow in regard
to backslash escapes:

Anything not explicitly permitted is forbidden

versus

Anything not explicitly forbidden is permitted

No, it doesn't. It boils down to whether a language should:

(1) Try its best to detect errors as early as possible, especially when
the cost of doing so is low.


You are making an unjustified assumption: \y is not an error. It is only 
an error if you think that anything not explicitly permitted is forbidden.


While I'm amused that you've made my own point for me, I'm less amused 
that you seem to be totally incapable of seeing past your parochial 
language assumptions, even when those assumptions are explicitly pointed 
out to you. Am I wasting my time engaging you in discussion? 

There's a lot more I could say, but time is short, so let me just 
summarise:


I disagree with nearly everything you say in this post. I think that a 
few points you make have some validity, but the vast majority are based 
on a superficial and confused understanding of language design 
principles. (I won't justify that claim now, perhaps later, time 
permitting.) Nevertheless, I think that your ultimate wish -- for \y etc 
to be considered an error -- is a reasonable design choice, given your 
assumptions. But it's not the only reasonable design choice, and Bash has 
made a different choice, and Python has made yet a third reasonable 
choice, and Pascal made yet a fourth reasonable choice.



IMHO, it would've been simpler in the long run to say that backslash
followed by one of [0-9A-Za-z] is an escape sequence, backslash followed
by newline is ignored, and backslash followed by anything else is that
something. That way there would be a way to introduce additional escape
sequences without breaking existing code.
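
A rough sketch of the rule described above, as a toy decoder in Python 3. One point is my own assumption rather than something stated in the post: a backslash followed by a letter or digit that has no defined meaning is rejected as reserved, which is one way to leave room for new escapes later. The `DEFINED` table is an illustrative subset, and `decode_reserved` is a made-up name.

```python
import string

BS = chr(92)  # one backslash
ALNUM = string.ascii_letters + string.digits
DEFINED = {"n": "\n", "t": "\t", "0": "\0"}  # illustrative subset only


def decode_reserved(body):
    """Decode escapes: alphanumerics are reserved, newline vanishes, rest is literal."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != BS:
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("trailing backslash")
        nxt = body[i + 1]
        if nxt in ALNUM:
            if nxt not in DEFINED:
                # Assumption: undefined alphanumeric escapes are errors.
                raise ValueError("reserved escape: " + BS + nxt)
            out.append(DEFINED[nxt])
        elif nxt == "\n":
            pass  # backslash followed by newline is ignored
        else:
            out.append(nxt)  # backslash followed by anything else is that thing
        i += 2
    return "".join(out)


tab_case = decode_reserved("a" + BS + "tb")
dollar_case = decode_reserved(BS + "$")
continuation_case = decode_reserved("a" + BS + "\nb")
try:
    decode_reserved(BS + "y")
    y_rejected = False
except ValueError:
    y_rejected = True

print(repr(tab_case), repr(dollar_case), repr(continuation_case), y_rejected)
```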


Re: Unrecognized escape sequences in string literals

2009-08-11 Thread Steven D'Aprano
On Mon, 10 Aug 2009 15:17:24 -0700, Douglas Alan wrote:

 From: Steven D'Aprano ste...@remove.this.cybersource.com.au wrote:
 
 On Mon, 10 Aug 2009 00:32:30 -0700, Douglas Alan wrote:
 
  In C++, if I know that the code I'm looking at compiles, then I never
  need worry that I've misinterpreted what a string literal means.
 
 If you don't know what your string literals are, you don't know what
 your program does. You can't expect the compiler to save you from
 semantic errors. Adding escape codes into the string literal doesn't
 change this basic truth.
 
 I grow weary of these semantic debates. The bottom line is that C++'s
 strategy here catches bugs early on that Python's approach doesn't. It
 does so at no additional cost.

 From a purely practical point of view, why would any language not want
 to adopt a zero-cost approach to catching bugs, even if they are
 relatively rare, as early as possible?

Because the cost isn't zero. Needing to write \\ in a string literal when 
you want \ is a cost, and having to read \\ in source code and mentally 
translate that to \ is also a cost. By all means argue that it's a cost 
that is worth paying, but please stop pretending that it's not a cost.

Having to remember that \n is a special escape and \y isn't is also a 
cost, but that's a cost you pay in C++ too, if you want your code to 
compile.
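
For what it's worth, Python also offers raw string literals, which avoid the doubling cost entirely when a string has many backslashes. A small sketch in Python 3, with a made-up Windows-style path as the example value:

```python
# Two spellings of the same 12-character string.
escaped = "C:\\new\\table"  # every backslash doubled
raw = r"C:\new\table"       # raw literal: backslashes are taken as-is

print(escaped == raw)
print(len(raw))
print("\n" in raw, "\t" in raw)  # no newline or tab was produced
```

The raw form reads the way the path looks, and it does not depend on remembering that \n and \t are special while \y is not.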


By the way, you've stated repeatedly that \y will compile with a warning 
in g++. So what precisely do you get if you ignore the warning? What do 
other C++ compilers do? Apart from the lack of warning, what actually is 
the difference between Python's behaviour and C++'s behaviour?



 (Other than the reason that adopting it *now* is sadly too late.)
 
 Furthermore, Python's strategy here is SPECIFICALLY DESIGNED, according
 to the reference manual to catch bugs. I.e., from the original posting
 on this issue:
 
  Unlike Standard C, all unrecognized escape sequences are left in
  the string unchanged, i.e., the backslash is left in the string.
  (This behavior is useful when debugging: if an escape sequence is
  mistyped, the resulting output is more easily recognized as
  broken.)

You need to work on your reading comprehension. It doesn't say anything 
about the motivation for this behaviour, let alone that it was 
SPECIFICALLY DESIGNED to catch bugs. It says it is useful for 
debugging. My shoe is useful for squashing poisonous spiders, but it 
wasn't designed as a poisonous-spider squashing device.



 The compiler can't save you from typing 1234 instead of 11234, or 31.45
 instead of 3.145, or My darling Ho instead of My darling Jo, so why
 do you expect it to save you from typing abc\d instead of abc\\d?
 
 Because in the former cases it can't catch the bug, and in the
 latter case, it can.

I'm not convinced this is a bug that needs catching, but if you think it 
is, then that's a reasonable argument.



 Perhaps it can catch *some* errors of that type, but only at the cost
 of extra effort required to defeat the compiler (forcing the programmer
 to type \\d to prevent the compiler complaining about \d). I don't
 think the benefit is worth the cost. You and your friend do. Who is to
 say you're right?
 
 Well, Bjarne Stroustrup, for one.

Then let him design his own language *wink*


 All of these are value judgments, of course, but I truly doubt that
 anyone would have been bothered if Python from day one had behaved the
 way that C++ does. 

If I'm reading this page correctly, Python does behave as C++ does. Or at 
least as Larch/C++ does:

http://www.cs.ucf.edu/~leavens/larchc++manual/lcpp_47.html




 In C++, if you see an escape you don't recognize, do you care?
 
 Yes, of course I do. If I need to know what the program does.

Precisely the same as in Python.


 Do you go running for the manual? If the answer is No, then why do it
 in Python?
 
 The answer is that I do in both cases.

You deleted without answer my next question:

And if the answer is Yes, then how is Python worse than C++?

Seems to me that the answer is It's not worse than C++, it's the same 
-- in both cases, you have to memorize the special escape sequences, 
and in both cases, if you see an escape you don't recognize, you need to 
look it up.



 No. \z *is* a legal escape sequence, it just happens to map to \z.
 
 If you stop thinking of \z as an illegal escape sequence that Python
 refuses to raise an error for, the problem goes away. It's a legal
 escape sequence that maps to backslash + z.
 
 (1) I already used that argument on my friend, and he wasn't buying it.
 (Personally, I find the argument technically valid, but commonsensically
 invalid. It's a language-lawyer kind of argument, rather than one that
 appeals to any notion of real aesthetics.)

I disagree with your sense of aesthetics. I think that having to write 
\\y when I want \y just to satisfy a bondage-and-discipline compiler is 
ugly. That's not to deny that B&D is useful on occasion, but in 

Re: Unrecognized escape sequences in string literals

2009-08-11 Thread Piet van Oostrum
 Steven D'Aprano ste...@remove.this.cybersource.com.au (SD) wrote:

SD If I'm reading this page correctly, Python does behave as C++ does. Or at 
SD least as Larch/C++ does:

SD http://www.cs.ucf.edu/~leavens/larchc++manual/lcpp_47.html

They call them `non-standard escape sequences' for a reason: that they
are not in standard C++.

test.cpp:
char* temp = "abc\yz";

TEMP g++ -c test.cpp
test.cpp:1:1: warning: unknown escape sequence '\y'

-- 
Piet van Oostrum p...@cs.uu.nl
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org


Re: Unrecognized escape sequences in string literals

2009-08-11 Thread Ethan Furman

Steven D'Aprano wrote:

On Mon, 10 Aug 2009 08:21:03 -0700, Douglas Alan wrote:



But you're right, it's too late to change this now.



Not really. There is a procedure for making non-backwards compatible 
changes. If you care deeply enough about this, you could agitate for 
Python 3.2 to raise a PendingDepreciation warning for unexpected escape 
sequences like \z, Python 3.3 to raise a Depreciation warning, and Python 
3.4 to treat it as an error.


It may even be possible to skip the PendingDepreciation warning and go 
straight for Depreciation warning in 3.2.





And once it's fully depreciated you have to stop writing it off on your 
taxes.  *wink*
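
The staged procedure quoted above can be sketched in Python with the
standard warnings module. This is only an illustration: the helper
report_unknown_escape and the stage numbers are hypothetical, and nothing
here is tied to a real Python release. The built-in categories are spelled
PendingDeprecationWarning and DeprecationWarning.

```python
import warnings

# Hypothetical stages, numbered for illustration only; they do not
# correspond to any real Python release.
STAGE_CATEGORIES = {
    1: PendingDeprecationWarning,  # quiet heads-up
    2: DeprecationWarning,         # louder warning
}

def report_unknown_escape(sequence, stage):
    """Hypothetical helper: react to an unrecognized escape by stage."""
    category = STAGE_CATEGORIES.get(stage)
    if category is None:
        # Final stage: the sequence is rejected outright.
        raise SyntaxError("unrecognized escape sequence " + sequence)
    warnings.warn("unrecognized escape sequence " + sequence,
                  category, stacklevel=2)

# Record the warnings instead of printing them, so they can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    report_unknown_escape(r"\z", stage=1)
    report_unknown_escape(r"\z", stage=2)

categories = [w.category for w in caught]
print([c.__name__ for c in categories])
```

Any stage other than 1 or 2 raises SyntaxError in this sketch, which
stands in for the "treat it as an error" step.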


~Ethan~


Re: Unrecognized escape sequences in string literals

2009-08-11 Thread Steven D'Aprano
On Tue, 11 Aug 2009 15:50:01 +0200, Piet van Oostrum wrote:

 Steven D'Aprano ste...@remove.this.cybersource.com.au (SD) wrote:
 
SD If I'm reading this page correctly, Python does behave as C++ does. Or at
SD least as Larch/C++ does:
 
SD http://www.cs.ucf.edu/~leavens/larchc++manual/lcpp_47.html
 
 They call them `non-standard escape sequences' for a reason: that they
 are not in standard C++.
 
 test.cpp:
 char* temp = "abc\yz";
 
 TEMP g++ -c test.cpp
 test.cpp:1:1: warning: unknown escape sequence '\y'


Isn't that a warning, not a fatal error? So what does temp contain?



-- 
Steven


Re: Unrecognized escape sequences in string literals

2009-08-11 Thread Douglas Alan
On Aug 11, 2:00 pm, Steven D'Aprano st...@remove-this-
cybersource.com.au wrote:

  test.cpp:1:1: warning: unknown escape sequence '\y'

 Isn't that a warning, not a fatal error? So what does temp contain?

My Annotated C++ Reference Manual is packed, and surprisingly in
Stroustrup's Third Edition, there is no mention of the issue in the
entire 1,000 pages. But Microsoft to the rescue:

 If you want a backslash character to appear within a string,
 you must type two backslashes (\\)

(From http://msdn.microsoft.com/en-us/library/69ze775t.aspx)

The question of what any specific C++ does if you ignore the warning
is irrelevant, as such behavior in C++ is almost *always* undefined.
Hence the warning.

|ouglas


Re: Unrecognized escape sequences in string literals

2009-08-11 Thread Ethan Furman

Douglas Alan wrote:

On Aug 11, 2:00 pm, Steven D'Aprano st...@remove-this-
cybersource.com.au wrote:



test.cpp:1:1: warning: unknown escape sequence '\y'


Isn't that a warning, not a fatal error? So what does temp contain?



My Annotated C++ Reference Manual is packed, and surprisingly in
Stroustrup's Third Edition, there is no mention of the issue in the
entire 1,000 pages. But Microsoft to the rescue:

 If you want a backslash character to appear within a string,
 you must type two backslashes (\\)

(From http://msdn.microsoft.com/en-us/library/69ze775t.aspx)

The question of what any specific C++ does if you ignore the warning
is irrelevant, as such behavior in C++ is almost *always* undefined.
Hence the warning.

|ouglas


Almost always undefined?  Whereas with Python, and some memorization or 
a small table/list nearby, you can easily *know* what you will get.


Mind you, I'm not really vested in how Python *should* handle 
backslashes one way or the other, but I am glad it has rules that it 
follows for consistent results, and I don't have to break out a byte-code 
editor to find out what's in my string literal.


~Ethan~


Re: Unrecognized escape sequences in string literals

2009-08-11 Thread Douglas Alan
Steven D'Aprano wrote:

 Because the cost isn't zero. Needing to write \\ in a string
 literal when you want \ is a cost,

I need to preface this entire post with the fact that I've
already used ALL of the arguments that you've provided on my
friend before I ever even came here with the topic, and my
own arguments on why Python can be considered to be doing
the right thing on this issue didn't even convince ME, much
less him. When I can't even convince myself with an argument
I'm making, then you know there's a problem with it!

Now back to our regularly scheduled debate:

I think that the total cost of all of that extra typing for
all the Python programmers in the entire world is now
significantly less than the time it took to have this
debate. Which would have never happened if Python did things
the right way on this issue to begin with. Meaning that
we're now at LESS than zero cost for doing things right!

And we haven't even yet included all the useless heat that
is going to be generated during code reviews and in-house coding
standard debates.

That's why I stand by Python's motto:

   THERE SHOULD BE ONE-- AND PREFERABLY ONLY ONE --OBVIOUS
   WAY TO DO IT.

 and having to read \\ in source code and mentally
 translate that to \ is also a cost.

For me that has no mental cost. What does have a mental cost
is remembering whether \b is an unrecognized escape
sequence or not.

 By all means argue that it's a cost that is worth paying,
 but please stop pretending that it's not a cost.

I'm not pretending. I'm pwning you with logic and common
sense!

 Having to remember that \n is a special escape and \y
 isn't is also a cost, but that's a cost you pay in C++ too,
 if you want your code to compile.

Ummm, no I don't! I just always use \\ when I want a
backslash to appear, and I only think about the more obscure
escape sequences if I actually need them, or some code that
I am reading has used them.

 By the way, you've stated repeatedly that \y will compile
 with a warning in g++. So what precisely do you get if you
 ignore the warning?

A program with undefined behavior. That's typically what a
warning means from a C++ compiler. (Sometimes it means
use of a deprecated feature, though.)

 What do other C++ compilers do?

The Microsoft compilers also consider it to be incorrect
code, as I documented in a different post.

 Apart from the lack of warning, what actually is the
 difference between Python's behavior and C++'s behavior?

That question makes just about as much sense as, "Apart
from the lack of a fatal error, what actually is the
difference between Python's behavior and C++'s?"

Sure, warnings aren't fatal errors, but if you ignore them,
then you are almost always doing something very
wrong. (Unless you're building legacy code.)

  Furthermore, Python's strategy here is SPECIFICALLY
  DESIGNED, according to the reference manual to catch
  bugs. I.e., from the original posting on this issue:

   Unlike Standard C, all unrecognized escape sequences
   are left in the string unchanged, i.e., the backslash
   is left in the string.  (This behavior is useful when
   debugging: if an escape sequence is mistyped, the
   resulting output is more easily recognized as
   broken.)

 You need to work on your reading comprehension. It doesn't
 say anything about the motivation for this behaviour, let
 alone that it was SPECIFICALLY DESIGNED to catch bugs. It
 says it is useful for debugging. My shoe is useful for
 squashing poisonous spiders, but it wasn't designed as a
 poisonous-spider squashing device.

As I have a BS from MIT in BS-ology, I can readily set aside
your aspersions to my intellect, and point out the gross
errors of your ways: Natural language does not work the way
you claim. It is much more practical, implicit, and
elliptical.

More specifically, if your shoe came with a reference manual
claiming that it was useful for squashing poisonous spiders,
then you may now validly assume poisonous spider squashing
was a design requirement of the shoe. (Or at least it has
become one, even if ipso facto.) Furthermore, if it turns out
that the shoe is deficient at poisonous spider squashing,
and consequently causes you to get bitten by a poisonous
spider, then you now have grounds for a lawsuit.

  Because in the former cases it can't catch the bug,
  and in the latter case, it can.

 I'm not convinced this is a bug that needs catching, but if
 you think it is, then that's a reasonable argument.

All my arguments are reasonable.

  Perhaps it can catch *some* errors of that type, but
  only at the cost of extra effort required to defeat the
  compiler (forcing the programmer to type \\d to prevent
  the compiler complaining about \d). I don't think the
  benefit is worth the cost. You and your friend do. Who
  is to say you're right?

  Well, Bjarne Stroustrup, for one.

 Then let him design his own language *wink*

Oh, I'm not sure that's such a good idea. He might come up

Re: Unrecognized escape sequences in string literals

2009-08-11 Thread Douglas Alan
On Aug 10, 11:27 pm, Steven D'Aprano
ste...@remove.this.cybersource.com.au wrote:
 On Mon, 10 Aug 2009 08:21:03 -0700, Douglas Alan wrote:
  But you're right, it's too late to change this now.

 Not really. There is a procedure for making non-backwards compatible
 changes. If you care deeply enough about this, you could agitate for
 Python 3.2 to raise a PendingDepreciation warning for unexpected escape
 sequences like \z,

How does one do this?

Not that I necessarily think that it is important enough a nit to
break a lot of existing code.

Also, if I agitate for change, then in the future people might
actually accurately accuse me of agitating for change, when typically
I just come here for a good argument, and I provide a connected series
of statements intended to establish a proposition, but in return I
receive merely the automatic gainsaying of any statement I make.

|ouglas



Re: Unrecognized escape sequences in string literals

2009-08-11 Thread Douglas Alan
On Aug 11, 4:38 pm, Ethan Furman et...@stoneleaf.us wrote:

 Mind you, I'm not really vested in how Python *should* handle
 backslashes one way or the other, but I am glad it has rules that it
 follows for consistent results, and I don't have to break out a byte-code
 editor to find out what's in my string literal.

I don't understand your comment. C++ generates a warning if you use an
undefined escape sequence, which indicates that your program should be
fixed. If the escape sequence isn't undefined, then C++ does the same
thing as Python.

It would be *even* better if C++ generated a fatal error in this
situation. (g++ probably has an option to make warnings fatal, but I
don't happen to know what that option is.) g++ might not generate an
error so that you can compile legacy C code with it.

In any case, my argument has consistently been that Python should have
treated undefined escape sequences consistently as fatal errors, not
as warnings.

|ouglas



Re: Unrecognized escape sequences in string literals

2009-08-11 Thread Douglas Alan
I wrote:

 But you're right, it's too late to change this now.

P.S. But if it weren't too late, I think that your idea to have \s
be the escape sequence for a backslash instead of \\ might be a good
one.

|ouglas


Re: Unrecognized escape sequences in string literals

2009-08-10 Thread John Nagle

Carl Banks wrote:

IOW it's an error-prone mess.  It would be better if Python (like C)
treated \ consistently as an escape character.  (And in raw strings,
consistently as a literal.)


   Agreed.  For one thing, if another escape character ever has to be
added to the language, that may change the semantics of previously
correct strings.  If \ followed by a non-special character is treated
as an error, that doesn't happen.
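
The compatibility risk described above can be illustrated with a toy
decoder. This is a rough sketch, not how Python itself parses literals:
the decode function, both tables, and the \e escape are hypothetical and
exist only for the example.

```python
def decode(body, table):
    """Toy decoder: recognized escapes are mapped through the table,
    unrecognized ones are kept as backslash plus character."""
    out, i = [], 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            out.append(table.get(body[i + 1], body[i:i + 2]))
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)

old_table = {"n": "\n", "t": "\t", "\\": "\\"}
new_table = dict(old_table, e="\x1b")  # hypothetical new escape \e

body = r"C:\etc"                       # literal text as typed in source

print(repr(decode(body, old_table)))   # backslash survives
print(repr(decode(body, new_table)))   # meaning silently changes
```

Under the old table the text keeps its backslash; once \e is added, the
same source text decodes to a different string without any complaint. If
unrecognized sequences were rejected instead, no such string could have
existed beforehand.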

John Nagle


Re: Unrecognized escape sequences in string literals

2009-08-10 Thread Steven D'Aprano
On Sun, 09 Aug 2009 17:56:55 -0700, Douglas Alan wrote:

 Steven D'Aprano wrote:
 
 Why should a backslash in a string literal be an error?
 
 Because in Python, if my friend sees the string foo\xbar\n, he has no
 idea whether the \x is an escape sequence, or if it is just the
 characters \x, unless he looks it up in the manual, or tries it out in
 the REPL, or what have you. 

Fair enough, but isn't that just another way of saying that if you look 
at a piece of code and don't know what it does, you don't know what it 
does unless you look it up or try it out?


 My friend is adamant that it would be better
 if he could just look at the string literal and know. He doesn't want to
 be bothered to have to store stuff like that in his head. He wants to be
 able to figure out programs just by looking at them, to the maximum
 degree that that is feasible.

I actually sympathize strongly with that attitude. But, honestly, your 
friend is a programmer (or at least pretends to be one *wink*). You can't 
be a programmer without memorizing stuff: syntax, function calls, modules 
to import, quoting rules, blah blah blah. Take C as an example -- there's 
absolutely nothing about () that says "group expressions" or "call a 
function" and {} that says "group a code block". You just have to 
memorize it. If you don't know what a backslash escape is going to do, 
why would you use it? I'm sure your friend isn't in the habit of randomly 
adding backslashes to strings just to see whether it will still compile.

This is especially important when reading (as opposed to writing) code. 
You read somebody else's code, and see foo\xbar\n. Let's say you know 
it compiles without warning. Big deal -- you don't know what the escape 
codes do unless you've memorized them. What does \n resolve to? chr(13) 
or chr(97) or chr(0)? Who knows? 

Unless you know the rules, you have no idea what is in the string. 
Allowing \y to resolve to a literal backslash followed by y doesn't 
change that. All it means is that some \c combinations return a single 
character, and some return two.



 In comparison to Python, in C++, he can just look at "foo\xbar\n" and know
 that \x is a special character. (As long as it compiles without
 warnings under g++.)

So what you mean is, he can just look at foo\xbar\n AND COMPILE IT 
USING g++, and know whether or not \x is a special character.

[sarcasm] Gosh. That's an enormous difference from Python, where you have 
to print the string at the REPL to know what it does. [/sarcasm]

Aside:
\x isn't a special character:

>>> "\x"
ValueError: invalid \x escape

However, \xba is:

>>> "\xba"
'\xba'
>>> len("\xba")
1
>>> ord("\xba")
186
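
For anyone who wants to repeat the aside as a script rather than at the
REPL, here is a small sketch. The helper try_literal is hypothetical; it
compiles the literal from source text so that a bad escape can be caught
instead of stopping the program. The exception type for the truncated
escape differs between Python versions, so the sketch accepts either
ValueError or SyntaxError.

```python
def try_literal(source):
    """Evaluate a string literal given as source text.

    Returns the resulting string, or the name of the exception raised
    while compiling it.
    """
    try:
        return eval(compile(source, "<example>", "eval"))
    except (SyntaxError, ValueError) as exc:
        return type(exc).__name__

bad = try_literal(r'"\x"')      # truncated hex escape: rejected
good = try_literal(r'"\xba"')   # complete hex escape: one character

print(bad)
print(len(good), ord(good))     # 1 186
```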



 He's particularly annoyed too, that if he types foo\xbar at the REPL,
 it echoes back as foo\\xbar. He finds that to be some sort of annoying
 DWIM feature, and if Python is going to have DWIM features, then it
 should, for example, figure out what he means by \ and not bother him
 with a syntax error in that case.

Now your friend is confused. This is a good thing. Any backslash you see 
in Python's default string output is *always* an escape:

>>> "a string with a 'proper' escape \t (tab)"
"a string with a 'proper' escape \t (tab)"
>>> "a string with an 'improper' escape \y (backslash-y)"
"a string with an 'improper' escape \\y (backslash-y)"

The REPL is actually doing him a favour. It always escapes backslashes, 
so there is no ambiguity. A backslash is displayed as \\, any other \c is 
a special character.
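
The same point can be checked from a script with repr(), which is what
the REPL uses to echo values. A minimal sketch; the second string is
built by concatenation so that it holds one real backslash followed by y,
which is what the thread says Python stores for \y.

```python
proper = "proper escape \t (tab)"
improper = "improper escape " + "\\" + "y"   # one backslash, then y

# repr() shows the tab as \t and the lone backslash as \\.
print(repr(proper))
print(repr(improper))

# The stored strings themselves: no backslash in the first, one in the second.
print(proper.count("\\"), improper.count("\\"))   # 0 1
```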


 Of course I think that he's overreacting a bit. 

:)


 My point of view is that
 every language has *some* warts; Python just has a bit fewer than most.
 It would have been nice, I should think, if this wart had been fixed
 in Python 3, as I do consider it to be a minor wart.

And if anyone had cared enough to raise it a couple of years back, it 
possibly might have been.


-- 
Steven


Re: Unrecognized escape sequences in string literals

2009-08-10 Thread Steven D'Aprano
On Sun, 09 Aug 2009 18:34:14 -0700, Carl Banks wrote:

 Why should a backslash in a string literal be an error?
 
 Because the behavior of \ in a string is context-dependent, which means
 a reader can't know if \ is a literal character or escape character
 without knowing the context, and it means an innocuous change in context
 can cause a rather significant change in \.

*Any* change in context is significant with escapes.

"this \nhas two lines"

If you change the \n to a \t you get a significant difference. If you 
change the \n to a \y you get a significant difference. Why is the first 
one acceptable but the second not?


 IOW it's an error-prone mess.

I've never had any errors caused by this. I've never seen anyone write to 
this newsgroup confused over escape behaviour, or asking for help with an 
error caused by it, and until this thread, never seen anyone complain 
about it either.

Excuse my cynicism, but I believe that you are using "error-prone" to 
mean "I don't like this behaviour" rather than "it causes lots of errors".



-- 
Steven


Re: Unrecognized escape sequences in string literals

2009-08-10 Thread Steven D'Aprano
On Sun, 09 Aug 2009 23:03:14 -0700, John Nagle wrote:

 if another escape character ever has to be
 added to the language, that may change the semantics of previously
 correct strings.

And that's the only argument in favour of prohibiting non-special 
backslash sequences I've seen yet that is even close to convincing.


-- 
Steven


Re: Unrecognized escape sequences in string literals

2009-08-10 Thread Douglas Alan
On Aug 10, 2:03 am, Steven D'Aprano
ste...@remove.this.cybersource.com.au wrote:

 On Sun, 09 Aug 2009 17:56:55 -0700, Douglas Alan wrote:

  Because in Python, if my friend sees the string foo\xbar\n, he has no
  idea whether the \x is an escape sequence, or if it is just the
  characters \x, unless he looks it up in the manual, or tries it out in
  the REPL, or what have you.

 Fair enough, but isn't that just another way of saying that if you look
 at a piece of code and don't know what it does, you don't know what it
 does unless you look it up or try it out?

Not really. It's more like saying that easy things should be easy, and
hard things should be possible. But in this case, Python is making
something that should be really easy, a bit harder and more error
prone than it should be.

In C++, if I know that the code I'm looking at compiles, then I never
need worry that I've misinterpreted what a string literal means. At
least not if it doesn't have any escape characters in it that I'm not
familiar with. But in Python, if I see, \f\o\o\b\a\z, I'm not really
sure what I'm seeing, as I surely don't have committed to memory some
of the more obscure escape sequences. If I saw this in C++, and I knew
that it was in code that compiled, then I'd at least know that there
are some strange escape codes that I have to look up. Unlike with
Python, it would never be the case in C++ code that the programmer who
wrote the code was just too lazy to type in \\f\\o\\o\\b\\a\\z
instead.

  My friend is adamant that it would be better
  if he could just look at the string literal and know. He doesn't want to
  be bothered to have to store stuff like that in his head. He wants to be
  able to figure out programs just by looking at them, to the maximum
  degree that that is feasible.

 I actually sympathize strongly with that attitude. But, honestly, your
 friend is a programmer (or at least pretends to be one *wink*).

Actually, he's probably written more code than you, me, and ten other
random decent programmers put together. As he can slap out massive
amounts of code very quickly, he'd prefer not to have crap getting in
his way. In the time it takes him to look something up, he might have
written another page of code.

He's perfectly capable of dealing with crap, as years of writing large
programs in Perl and PHP quickly proves, but his whole reason for
learning Python, I take it, is so that he will be bothered with less
crap and therefore write code even faster.

 You can't be a programmer without memorizing stuff: syntax, function
 calls, modules to import, quoting rules, blah blah blah. Take C as
 an example -- there's absolutely nothing about () that says "group
 expressions" or "call a function" and {} that says "group a code
 block".

I don't really think that this is a good analogy. It's like the
difference between remembering rules of grammar and remembering
English spelling. As a kid, I was the best in my school at grammar,
and one of the worst at speling.

 You just have to memorize it. If you don't know what a backslash
 escape is going to do, why would you use it?

(1) You're looking at code that someone else wrote, or (2) you forget
to type \\ instead of \ in your code (or get lazy sometimes), as
that is okay most of the time, and you inadvertently get a subtle bug.

 This is especially important when reading (as opposed to writing) code.
 You read somebody else's code, and see foo\xbar\n. Let's say you know
 it compiles without warning. Big deal -- you don't know what the escape
 codes do unless you've memorized them. What does \n resolve to? chr(13)
 or chr(97) or chr(0)? Who knows?

It *is* a big deal. Or at least a non-trivial deal. It means that you
can tell just by looking at the code that there are funny characters
in the string, and not just backslashes. You don't have to go
running for the manual every time you see code with backslashes, where
the upshot might be that the programmer was merely saving themselves
some typing.

  In comparison to Python, in C++, he can just look at "foo\xbar\n" and know
  that \x is a special character. (As long as it compiles without
  warnings under g++.)

 So what you mean is, he can just look at foo\xbar\n AND COMPILE IT
 USING g++, and know whether or not \x is a special character.

I'm not sure that your comments are paying due diligence to full
life-cycle software development issues that involve multiple
programmers (or even just your own program that you wrote a year ago,
and you don't remember all the details of what you did) combined with
maintaining and modifying existing code, etc.

 Aside:
 \x isn't a special character:

  >>> "\x"

 ValueError: invalid \x escape

I think that this all just goes to prove my friend's point! Here I've
been programming in Python for more than a decade (not full time, mind
you, as I also program in other languages, like C++), and even I
didn't know that \xba was an escape sequence, and I inadvertently
introduced a subtle bug into my argument 

Re: Unrecognized escape sequences in string literals

2009-08-10 Thread Carl Banks
On Aug 9, 11:10 pm, Steven D'Aprano
ste...@remove.this.cybersource.com.au wrote:
 On Sun, 09 Aug 2009 18:34:14 -0700, Carl Banks wrote:
  Why should a backslash in a string literal be an error?

  Because the behavior of \ in a string is context-dependent, which means
  a reader can't know if \ is a literal character or escape character
  without knowing the context, and it means an innocuous change in context
  can cause a rather significant change in \.

 *Any* change in context is significant with escapes.

 this \nhas two lines

 If you change the \n to a \t you get a significant difference. If you
 change the \n to a \y you get a significant difference. Why is the first
 one acceptable but the second not?

Because when you change \n to \t, you haven't changed the meaning
of the \ character; but when you change \n to \y, you have, and you
did so without even touching the backslash.


  IOW it's an error-prone mess.

 I've never had any errors caused by this.

Thank you for your anecdotal evidence.  Here's mine: This has gotten
me at least twice, and a compiler complaint would have reduced my bug-
hunting time from tens of minutes to ones of seconds.  [Aside: it was
when I was using Python on Windows for the first time]


 I've never seen anyone write to
 this newsgroup confused over escape behaviour, or asking for help with an
 error caused by it, and until this thread, never seen anyone complain
 about it either.

More anecdotal evidence.  Here's mine: I have.


 Excuse my cynicism, but I believe that you are using "error-prone" to
 mean "I don't like this behaviour" rather than "it causes lots of errors".

No, I'm using "error-prone" to mean "error-prone".

Someone (obviously not you because you have perfect knowledge of
the language and 100% situation awareness at all times) might have a
string like abcd\stuv  and change it to abcd\tuvw without even
thinking about the fact that the s comes after the backslash.

Worst of all: they might not even notice the error, because the repr
of this string is:

'abcd\tuvw'

They might not notice that the backslash is single, because (unlike
you) mortal fallible human beings don't always register tiny details
like a backslash being single when it should be double.

Point is, this is a very bad inconsistency.  It makes the behavior of
\ impossible to learn by analogy, now you have to memorize a list of
situations where it behaves one way or another.
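
To make the example concrete, here is a small Python sketch of the two
strings. The first is built by concatenation so that it holds what the
thread says Python stores for the literal abcd\stuv (a real backslash
followed by s); the second is the edited literal, where \t is a tab.

```python
before = "abcd" + "\\" + "stuv"   # backslash kept: 9 characters
after = "abcd\tuvw"               # \t is a tab: 8 characters

print(len(before), repr(before))
print(len(after), repr(after))

# The edit removed the backslash from the data without anyone touching it.
print("\\" in before, "\\" in after)   # True False
```

The two repr outputs differ only in whether the backslash is doubled,
which is the tiny detail the post says is easy to miss.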


Carl Banks


Re: Unrecognized escape sequences in string literals

2009-08-10 Thread Douglas Alan
On Aug 10, 2:10 am, Steven D'Aprano

 I've never had any errors caused by this.

But you've seen an error caused by this, in this very discussion.
I.e., foo\xbar.

\xba isn't an escape sequence in any other language that I've used,
which is one reason I made this error... Oh, wait a minute -- it *is*
an escape sequence in JavaScript. But in JavaScript, while \xba is a
special character, \xb is synonymous with xb.

The fact that every language seems to treat these things similarly but
differently, is yet another reason why they should just be treated
utterly consistently by all of the languages: I.e., escape sequences
that don't have a special meaning should be an error!

 I've never seen anyone write to
 this newsgroup confused over escape behaviour,

My friend objects strongly to the claim that he is confused by it, so I
guess you are right that no one is confused. He just thinks that it
violates the beautiful sense of aesthetics that he was assured over and
over again that Python has.

But aesthetics is a non-negligible issue with practical ramifications.
(Not that anything can be done about this wart at this point,
however.)

 or asking for help with an error caused by it, and until
 this thread, never seen anyone complain about it either.

Oh, this bothered me too when I first learned Python, and I thought it
was stupid. It just didn't bother me enough to complain publicly.

Besides, the vast majority of Python noobs don't come here, despite
appearance sometimes, and by the time most people get here, they've
probably got bigger fish to fry.

|ouglas




Re: Unrecognized escape sequences in string literals

2009-08-10 Thread Steven D'Aprano
On Mon, 10 Aug 2009 00:37:33 -0700, Carl Banks wrote:

 On Aug 9, 11:10 pm, Steven D'Aprano
 ste...@remove.this.cybersource.com.au wrote:
 On Sun, 09 Aug 2009 18:34:14 -0700, Carl Banks wrote:
  Why should a backslash in a string literal be an error?

  Because the behavior of \ in a string is context-dependent, which
  means a reader can't know if \ is a literal character or escape
  character without knowing the context, and it means an innocuous
  change in context can cause a rather significant change in \.

 *Any* change in context is significant with escapes.

 this \nhas two lines

 If you change the \n to a \t you get a significant difference. If you
 change the \n to a \y you get a significant difference. Why is the
 first one acceptable but the second not?
 
 Because when you change \n to \t, you haven't changed the meaning of
 the \ character; 

I assume you mean the \ character in the literal, not the (non-existent) 
\ character in the string.


 but when you change \n to \y, you have, and you did so
 without even touching the backslash.

Not at all.

'\n' maps to the string chr(10).
'\y' maps to the string chr(92) + chr(121).

In both cases the backslash in the literal has the same meaning: grab 
the next token (usually a single character, but not always), look it up 
in a mapping somewhere, and insert the result in the string object being 
built.

(I don't know if the *implementation* is precisely as described, but 
that's irrelevant. It's still functionally a mapping.) 
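
The "functionally a mapping" description can be sketched as a toy
decoder. To be clear, this is an assumption-laden illustration and not
CPython's implementation: the table SIMPLE_ESCAPES covers only a handful
of single-character escapes, and decode_escapes is a hypothetical name.

```python
# Hypothetical table: only a few of the single-character escapes.
SIMPLE_ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", "'": "'", '"': '"'}

def decode_escapes(literal_body):
    """Turn the text between the quotes into the string it denotes."""
    out = []
    i = 0
    while i < len(literal_body):
        ch = literal_body[i]
        if ch == "\\" and i + 1 < len(literal_body):
            nxt = literal_body[i + 1]
            # Look the next character up; if it is not in the table,
            # the "mapping" returns backslash + character unchanged.
            out.append(SIMPLE_ESCAPES.get(nxt, "\\" + nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)

print([ord(c) for c in decode_escapes(r"\n")])   # [10]
print([ord(c) for c in decode_escapes(r"\y")])   # [92, 121]
```

In this sketch the backslash always triggers the same lookup; the only
difference between \n and \y is what the lookup returns.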



  IOW it's an error-prone mess.

 I've never had any errors caused by this.
 
 Thank you for your anecdotal evidence.  Here's mine: This has gotten me
 at least twice, and a compiler complaint would have reduced my bug-
 hunting time from tens of minutes to ones of seconds.  [Aside: it was
 when I was using Python on Windows for the first time]

Okay, that's twice in, how many years have you been programming?

I've mistyped xrange as xrnage two or three times. Does that make 
xrange() an error-prone mess too? Probably not. Why is my mistake my 
mistake, but your mistake the language's fault?


[...]

Oh, wait, no, I tell a lie -- I *have* seen people reporting bugs here 
caused by backslashes. They're invariably Windows programmers writing 
pathnames using backslashes, so I'll give you that one: if you don't know 
that Python treats backslashes as special in string literals, you will 
screw up your Windows pathnames.

Interestingly, the problem there is not that \y resolves to literal 
backslash followed by y, but that \t DOESN'T resolve to the expected 
backslash-t. So it seems to me that the problem for Windows coders is not 
that \y doesn't raise an error, but the mere existence of backslash 
escapes.
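
A short Python sketch of the Windows pathname trap. The path is made up
for the example; raw strings (mentioned earlier in the thread) and
doubled backslashes are the two usual ways around it.

```python
plain = "C:\temp\new"      # \t and \n are recognized escapes: tab, newline
raw = r"C:\temp\new"       # raw string: backslashes stay literal
doubled = "C:\\temp\\new"  # every backslash escaped by hand

print(repr(plain))
print(len(plain), len(raw))   # 9 11
print(raw == doubled)         # True
```

No unrecognized escape appears anywhere in the plain version, so a rule
that rejects sequences like \y would not have caught this mistake.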



 Someone (obviously not you because you have perfect knowledge of the
 language and 100% situation awareness at all times) might have a string
 like abcd\stuv  and change it to abcd\tuvw without even thinking
 about the fact that the s comes after the backslash.

Deary me. And they might type "4+15" instead of "4*51", and now 
arithmetic is an error-prone mess too. If you know of a programming 
language which can prevent you making semantic errors, please let us all 
know what it is.

If you edit code without thinking, you will be burnt, and you get *zero* 
sympathy from me.


 Worst of all: they might not even notice the error, because the repr of
 this string is:
 
 'abcd\tuvw'
 
 They might not notice that the backslash is single, because (unlike you)
 mortal fallible human beings don't always register tiny details like a
 backslash being single when it should be double.

Help help, 123145 looks too similar to 1231145, and now I calculated my 
taxes wrong and will go to jail!!!


 Point is, this is a very bad inconsistency.  It makes the behavior of \
 impossible to learn by analogy, now you have to memorize a list of
 situations where it behaves one way or another.

No, you don't have to memorize anything, you can go right ahead and 
escape every backslash, as I did for years. Your code will still work 
fine.

You already have to memorize what escape codes return special characters. 
The only difference is whether you learn "...and everything else raises 
an exception" or "...and everything else is returned unchanged". 

There is at least one good reason for preferring an error, namely that it 
allows Python to introduce new escape codes without going through a long, 
slow process. But the rest of these complaints are terribly unconvincing.



-- 
Steven


Re: Unrecognized escape sequences in string literals

2009-08-10 Thread Steven D'Aprano
On Mon, 10 Aug 2009 00:57:18 -0700, Douglas Alan wrote:

 On Aug 10, 2:10 am, Steven D'Aprano
 
 I've never had any errors caused by this.
 
 But you've seen an error caused by this, in this very discussion. I.e.,
 foo\xbar.


Your complaint is that invalid escapes like \y resolve to a literal 
backslash-y instead of raising an error. But \xbar doesn't contain an 
invalid escape, it contains a valid hex escape. Your ignorance that \xHH 
is a valid hex escape (for suitable hex digits) isn't an example of an 
error caused by invalid escapes like \y.
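
A quick check of that claim, as a small sketch: in "foo\xbar" the \xba is consumed as one hex escape and only the r is left over.

```python
s = "foo\xbar"

print(len(s))                         # five characters, not eight
print(s == "foo" + chr(0xBA) + "r")   # \xba became a single character
print(ascii(s))                       # the repr shows the escape intact
```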



 \xba isn't an escape sequence in any other language that I've used,
 which is one reason I made this error... Oh, wait a minute -- it *is* an
 escape sequence in JavaScript. But in JavaScript, while \xba is a
 special character, \xb is synonymous with xb.
 
 The fact that every language seems to treat these things similarly but
 differently, is yet another reason why they should just be treated
 utterly consistently by all of the languages: I.e., escape sequences
 that don't have a special meaning should be an error!

Perhaps all the other languages should follow Python's lead instead?

Or perhaps they should follow bash's lead, and map \C to C for every 
character. If there were no special escapes at all, Windows programmers 
wouldn't keep getting burnt when they write "C:\\Documents\today\foo" and 
end up with something completely unexpected.

Oh wait, no, that still wouldn't work, because they'd end up with 
"C:\Documentstodayfoo". So copying bash doesn't work.

But copying C will upset the bash coders, because they'll write 
"some\ file\ with\ spaces" and suddenly their code won't even compile!!!

Seems like no matter what you do, you're going to upset *somebody*.
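
To make the bash-style rule concrete, here is a rough Python sketch of "map \C to C for every character". The helper name bash_style_unescape is made up for this illustration, and it only imitates the rule described above, not the whole of bash quoting.

```python
import re


def bash_style_unescape(text):
    """Drop each backslash and keep the character that follows it."""
    return re.sub(r"\\(.)", r"\1", text, flags=re.DOTALL)


# Raw strings are used so that Python itself leaves the backslashes alone.
print(bash_style_unescape(r"C:\\Documents\today\foo"))
print(bash_style_unescape(r"some\ file\ with\ spaces"))
```

The first call loses the separators before "today" and "foo", which is the unexpected result described above; the second gives the file name with plain spaces.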



 I've never seen anyone write to
 this newsgroup confused over escape behaviour,
 
 My friend objects strongly to the claim that he is confused by it, so I
 guess you are right that no one is confused. He just thinks that it
 violates the beautiful sense of aesthetics that he was assured, over and
 over again, Python has.

Fair enough.



-- 
Steven


Re: Unrecognized escape sequences in string literals

2009-08-10 Thread Duncan Booth
Douglas Alan darkwate...@gmail.com wrote:

 \xba isn't an escape sequence in any other language that I've used,
 which is one reason I made this error... Oh, wait a minute -- it *is*
 an escape sequence in JavaScript. But in JavaScript, while \xba is a
 special character, \xb is synonymous with xb.
 

\xba is an escape sequence in C, C++, C#, Python, JavaScript, Perl and 
probably many others.

\xb is an escape sequence in C, C++ and C#, but not in Python, JavaScript or 
Perl. Python will throw ValueError if you try to use \xb in a string; 
JavaScript simply ignores the backslash.
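
A small sketch for trying this out on whatever interpreter is at hand. The helper try_literal is hypothetical, and it catches both SyntaxError and ValueError on the assumption that the exception type differs between Python versions (Python 3 interpreters report a truncated \x escape as a SyntaxError).

```python
import ast


def try_literal(source):
    """Return the literal's value, or the exception raised while parsing it."""
    try:
        return ast.literal_eval(source)
    except (SyntaxError, ValueError) as exc:
        return exc


complete = try_literal(r'"\xba"')    # two hex digits: a valid escape
truncated = try_literal(r'"\xb"')    # one hex digit: rejected

print(ascii(complete))
print(type(truncated).__name__)
```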

 The fact that every language seems to treat these things similarly but
 differently, is yet another reason why they should just be treated
 utterly consistently by all of the languages: I.e., escape sequences
 that don't have a special meaning should be an error!

It would be nice if these things were treated consistently, but they aren't 
and it seems unlikely to change.



-- 
Duncan Booth http://kupuguy.blogspot.com


Re: Unrecognized escape sequences in string literals

2009-08-10 Thread Duncan Booth
Steven D'Aprano ste...@remove.this.cybersource.com.au wrote:

 Or perhaps they should follow bash's lead, and map \C to C for every 
 character. If there were no special escapes at all, Windows
 programmers wouldn't keep getting burnt when they write
 C:\\Documents\today\foo and end up with something completely
 unexpected. 
 
 Oh wait, no, that still wouldn't work, because they'd end up with 
 C:\Documentstodayfoo. So copying bash doesn't work.
 

There is of course no problem at all so long as you stick to writing your 
paths as MS intended them to be written: 8.3 and UPPERCASE

>>> "C:\DOCUME~1\TODAY\FOO"
'C:\\DOCUME~1\\TODAY\\FOO'

:^)


-- 
Duncan Booth http://kupuguy.blogspot.com


Re: Unrecognized escape sequences in string literals

2009-08-10 Thread Steven D'Aprano
On Mon, 10 Aug 2009 00:32:30 -0700, Douglas Alan wrote:

 In C++, if I know that the code I'm looking at compiles, then I never
 need worry that I've misinterpreted what a string literal means.

If you don't know what your string literals are, you don't know what your 
program does. You can't expect the compiler to save you from semantic 
errors. Adding escape codes into the string literal doesn't change this 
basic truth.

Semantics matters, and unlike syntax, the compiler can't check it. 
There's a difference between a program that does the equivalent of:

os.system("cp myfile myfile~")

and one which does this:

os.system("rm myfile myfile~")


The compiler can't save you from typing 1234 instead of 11234, or 31.45 
instead of 3.145, or "My darling Ho" instead of "My darling Jo", so why 
do you expect it to save you from typing "abc\d" instead of "abc\\d"?

Perhaps it can catch *some* errors of that type, but only at the cost of 
extra effort required to defeat the compiler (forcing the programmer to 
type \\d to prevent the compiler complaining about \d). I don't think the 
benefit is worth the cost. You and your friend do. Who is to say you're 
right?



 At
 least not if it doesn't have any escape characters in it that I'm not
 familiar with. But in Python, if I see, \f\o\o\b\a\z, I'm not really
 sure what I'm seeing, as I surely don't have committed to memory some of
 the more obscure escape sequences. If I saw this in C++, and I knew that
 it was in code that compiled, then I'd at least know that there are some
 strange escape codes that I have to look up. 

And if you saw that in Python, you'd also know that there are some 
strange escape codes that you have to look up. Fortunately, in Python, 
that's really simple:

>>> "\f\o\o\b\a\z"
'\x0c\\o\\o\x08\x07\\z'

Immediately you can see that the \o and \z sequences resolve to 
themselves, and the \f \b and \a don't.
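
The same check can be done programmatically. This is only a sketch: the helper evaluate_literal is made up here, and it silences the warning that newer Python versions emit for unrecognized escapes, on the assumption that such literals are still accepted by the interpreter in use.

```python
import ast
import warnings


def evaluate_literal(source):
    """Evaluate string-literal source text, ignoring escape warnings."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return ast.literal_eval(source)


value = evaluate_literal(r'"\f\o\o\b\a\z"')

# Characters that still follow a backslash were not treated as escapes.
survivors = [value[i + 1] for i, ch in enumerate(value[:-1]) if ch == "\\"]

print(ascii(value))
print(survivors)
```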



 Unlike with Python, it
 would never be the case in C++ code that the programmer who wrote the
 code was just too lazy to type in \\f\\o\\o\\b\\a\\z instead.

But if you see "abc\n", you can't be sure whether the lazy programmer 
intended "abc"+newline, or "abc"+backslash+"n". Either way, the compiler 
won't complain.


 
 You just have to memorize it. If you don't know what a backslash escape
 is going to do, why would you use it?
 
 (1) You're looking at code that someone else wrote, or (2) you forget to
 type \\ instead of \ in your code (or get lazy sometimes), as that
 is okay most of the time, and you inadvertently get a subtle bug.

The same error can occur in C++, if you intend \\n but type \n by 
mistake. Or vice versa. The compiler won't save you from that.



 This is especially important when reading (as opposed to writing) code.
 You read somebody else's code, and see foo\xbar\n. Let's say you know
 it compiles without warning. Big deal -- you don't know what the escape
 codes do unless you've memorized them. What does \n resolve to? chr(13)
 or chr(97) or chr(0)? Who knows?
 
 It *is* a big deal. Or at least a non-trivial deal. It means that you
 can tell just by looking at the code that there are funny characters in
 the string, and not just backslashes. 

I'm not entirely sure why you think that's a big deal. Strictly speaking, 
there are no funny characters, not even \0, in Python. They're all just 
characters. Perhaps the closest is newline (which is pretty obvious).



 You don't have to go running for
 the manual every time you see code with backslashes, where the upshot
 might be that the programmer was merely saving themselves some typing.

Why do you care if there are funny characters?

In C++, if you see an escape you don't recognize, do you care? Do you go 
running for the manual? If the answer is No, then why do it in Python?

And if the answer is Yes, then how is Python worse than C++?


[...]
 Also, it seems that Python is being inconsistent here. Python knows that
 the string "\x" doesn't contain a full escape sequence, so why doesn't
 it treat the string "\x" the same way that it treats the string "\z"?
[...]
 I.e., \z is not a legal escape sequence, so it gets left as \\z.

No. \z *is* a legal escape sequence, it just happens to map to \z.

If you stop thinking of \z as an illegal escape sequence that Python 
refuses to raise an error for, the problem goes away. It's a legal escape 
sequence that maps to backslash + z.



 \x is not a legal escape sequence. Shouldn't it also get left as
 \\x?

No, because it actually is an illegal escape sequence.



  He's particularly annoyed too, that if he types foo\xbar at the
  REPL, it echoes back as foo\\xbar. He finds that to be some sort of
  annoying DWIM feature, and if Python is going to have DWIM features,
  then it should, for example, figure out what he means by "\" and not
  bother him with a syntax error in that case.

 Now your friend is confused. This is a good thing. Any backslash you
 see in Python's default string output is *always* an escape:
 
 Well, I think he's more 

Re: Unrecognized escape sequences in string literals

2009-08-10 Thread MRAB

Steven D'Aprano wrote:

On Sun, 09 Aug 2009 17:56:55 -0700, Douglas Alan wrote:


[snip]

My point of view is that
every language has *some* warts; Python just has a bit fewer than most.
It would have been nice, I should think, if this wart had been fixed
in Python 3, as I do consider it to be a minor wart.


And if anyone had cared enough to raise it a couple of years back, it 
possibly might have been.



My preference would've been that a backslash followed by A-Z, a-z, or
0-9 is special, but a backslash followed by any other character is just
the character, except for backslash followed by a newline, which
suppresses the newline.

I would also have preferred a backslash in a raw string to always be a
literal.
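
For context, a small sketch of how raw strings behave today, which is what that preference is reacting to: the backslash is kept in the string, but it still stops a following quote from ending the literal, so a raw string cannot end in a single backslash. The helper parses is hypothetical.

```python
import ast


def parses(source):
    """Report whether the given source text is a valid literal."""
    try:
        ast.literal_eval(source)
        return True
    except SyntaxError:
        return False


two_chars = r"\""                      # a backslash followed by a quote

print(len(two_chars))
print(parses('r"abc\\"'))              # source text r"abc\"  is rejected
print(parses('r"abc\\\\"'))            # source text r"abc\\" is accepted
```

The usual workaround is to append the trailing backslash separately, as in r"abc" "\\".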

Ah well, something for Python 4.x. :-)


Re: Unrecognized escape sequences in string literals

2009-08-10 Thread Douglas Alan
On Aug 10, 4:37 am, Steven D'Aprano

 There is at least one good reason for preferring an error, namely that it
 allows Python to introduce new escape codes without going through a long,
 slow process. But the rest of these complaints are terribly unconvincing.


What about:

   o Beautiful is better than ugly
   o Explicit is better than implicit
   o Simple is better than complex
   o Readability counts
   o Special cases aren't special enough to break the rules
   o Errors should never pass silently

?

And most importantly:

   o In the face of ambiguity, refuse the temptation to guess.
   o There should be one -- and preferably only one -- obvious way to
     do it.

?

So, what's the one obvious right way to express "foo\zbar"? Is it

   "foo\zbar"

or

   "foo\\zbar"

And if it's the latter, what possible benefit is there in allowing the
former?  And if it's the former, why does Python echo the latter?

|>ouglas


Re: Unrecognized escape sequences in string literals

2009-08-10 Thread Scott David Daniels

Douglas Alan wrote:

So, what's the one obvious right way to express "foo\zbar"? Is it
   "foo\zbar"
or
   "foo\\zbar"
And if it's the latter, what possible benefit is there in allowing the
former?  And if it's the former, why does Python echo the latter?


Actually, if we were designing from fresh (with no C behind us), I might
advocate for "\s" to be the escape sequence for a backslash.  I don't
particularly like that it is hard to see if the following string
contains a tab: "abc\table".  The string rules reflect C's
rules, and I see little excuse for trying to change them now.

--Scott David Daniels
scott.dani...@acm.org


Re: Unrecognized escape sequences in string literals

2009-08-10 Thread Douglas Alan
On Aug 10, 10:58 am, Scott David Daniels scott.dani...@acm.org
wrote:

 The string rules reflect C's rules, and I see little
 excuse for trying to change them now.

No they don't. Or at least not C++'s rules. C++ behaves exactly as I
should like.

(Or at least g++ does. Or rather *almost* as I would like, as by
default it generates a warning for foo\zbar, while I think that an
error would be somewhat preferable.)

But you're right, it's too late to change this now.

|>ouglas



Re: Unrecognized escape sequences in string literals

2009-08-10 Thread Carl Banks
On Aug 10, 4:41 am, MRAB pyt...@mrabarnett.plus.com wrote:
 Steven D'Aprano wrote:
  On Sun, 09 Aug 2009 17:56:55 -0700, Douglas Alan wrote:

 [snip]
  My point of view is that
  every language has *some* warts; Python just has a bit fewer than most.
  It would have been nice, I should think, if this wart had been fixed
  in Python 3, as I do consider it to be a minor wart.

  And if anyone had cared enough to raise it a couple of years back, it
  possibly might have been.

 My preference would've been that a backslash followed by A-Z, a-z, or
 0-9 is special, but a backslash followed by any other character is just
 the character, except for backslash followed by a newline, which
 suppresses the newline.

That would be reasonable; it'd match the behavior of regexps.
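
A rough sketch of the regexp behaviour being referred to, as it stands in recent Python 3 versions of the re module (an assumption about the interpreter in use; older versions were more lenient about letters): a backslash before punctuation just yields that character, while a backslash before an unassigned ASCII letter is rejected.

```python
import re

# Backslash + punctuation: simply the character itself.
punctuation_matches = re.fullmatch(r"\%", "%") is not None

# Backslash + a letter with no assigned meaning: a pattern error.
try:
    re.compile(r"\y")
    letter_accepted = True
except re.error:
    letter_accepted = False

print(punctuation_matches, letter_accepted)
```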


Carl Banks



Re: Unrecognized escape sequences in string literals

2009-08-10 Thread Carl Banks
On Aug 10, 1:37 am, Steven D'Aprano
ste...@remove.this.cybersource.com.au wrote:
 On Mon, 10 Aug 2009 00:37:33 -0700, Carl Banks wrote:
  On Aug 9, 11:10 pm, Steven D'Aprano
  ste...@remove.this.cybersource.com.au wrote:
  On Sun, 09 Aug 2009 18:34:14 -0700, Carl Banks wrote:
   Why should a backslash in a string literal be an error?

   Because the behavior of \ in a string is context-dependent, which
   means a reader can't know if \ is a literal character or escape
   character without knowing the context, and it means an innocuous
   change in context can cause a rather significant change in \.

  *Any* change in context is significant with escapes.

  "this \nhas two lines"

  If you change the \n to a \t you get a significant difference. If you
  change the \n to a \y you get a significant difference. Why is the
  first one acceptable but the second not?

  Because when you change \n to \t, you haven't changed the meaning of
  the \ character;

 I assume you mean the \ character in the literal, not the (non-existent)
 \ character in the string.

  but when you change \n to \y, you have, and you did so
  without even touching the backslash.

 Not at all.

 '\n' maps to the string chr(10).
 '\y' maps to the string chr(92) + chr(121).

 In both cases the backslash in the literal has the same meaning: grab
 the next token (usually a single character, but not always), look it up
 in a mapping somewhere, and insert the result in the string object being
 built.

That is a ridiculous rationalization.  Nobody sees \y in a string
and thinks it's an escape sequence that returns the bytes '\y'.


[snip rest, because an argument in favor of inconsistent,
context-dependent behavior doesn't need any further refutation than to
point out that it is an argument in favor of inconsistent,
context-dependent behavior]


Carl Banks


Re: Unrecognized escape sequences in string literals

2009-08-10 Thread Steven D'Aprano
On Mon, 10 Aug 2009 08:21:03 -0700, Douglas Alan wrote:

 But you're right, it's too late to change this now.

Not really. There is a procedure for making non-backwards compatible 
changes. If you care deeply enough about this, you could agitate for 
Python 3.2 to raise a PendingDeprecationWarning for unexpected escape 
sequences like \z, Python 3.3 to raise a DeprecationWarning, and Python 
3.4 to treat it as an error.

It may even be possible to skip the PendingDeprecationWarning and go 
straight to a DeprecationWarning in 3.2.
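
For anyone who wants to see what their own interpreter does with \z, here is a minimal sketch. It assumes a Python 3 version that still accepts the literal; which warning category is recorded (if any) depends on the version, so the code only reports what it finds.

```python
import warnings

# Compile the literal from source text so that any warning is raised
# while the recording filter is active.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = eval(compile(r'"\z"', "<example>", "eval"))

categories = sorted({w.category.__name__ for w in caught})

print(repr(value))    # backslash + z, unchanged
print(categories)     # warning categories reported for the literal
```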


-- 
Steven


Re: Unrecognized escape sequences in string literals

2009-08-09 Thread Steven D'Aprano
On Sun, 09 Aug 2009 12:26:54 -0700, Douglas Alan wrote:

 A friend of mine is just learning Python, and he's a bit tweaked about
 how unrecognized escape sequences are treated in Python.
...
 In any case, I think my friend should mellow out a bit, but we both
 consider this something of a wart. He's just more wart-phobic than I am.
 Is there any way that this behavior can be considered anything other
 than a wart? Other than the unconvincing claim that you can use this
 feature to save you a bit of typing sometimes when you actually want a
 backslash to be in your string?

I'd put it this way: a backslash is just an ordinary character, except 
when it needs to be special. So Python's behaviour is "treat backslash as 
a normal character, except for these exceptions" while the behaviour your 
friend wants is "treat a backslash as an error, except for these 
exceptions".

Why should a backslash in a string literal be an error?



-- 
Steven


Re: Unrecognized escape sequences in string literals

2009-08-09 Thread Douglas Alan
Steven D'Aprano wrote:

 Why should a backslash in a string literal be an error?

Because in Python, if my friend sees the string "foo\xbar\n", he has
no idea whether the "\x" is an escape sequence, or if it is just the
characters "\x", unless he looks it up in the manual, or tries it out
in the REPL, or what have you. My friend is adamant that it would be
better if he could just look at the string literal and know. He
doesn't want to be bothered to have to store stuff like that in his
head. He wants to be able to figure out programs just by looking at
them, to the maximum degree that that is feasible.

In comparison to Python, in C++, he can just look at "foo\xbar\n" and
know that "\x" is a special character. (As long as it compiles without
warnings under g++.)

He's particularly annoyed too, that if he types "foo\xbar" at the
REPL, it echoes back as "foo\\xbar". He finds that to be some sort of
annoying DWIM feature, and if Python is going to have DWIM features,
then it should, for example, figure out what he means by "\" and not
bother him with a syntax error in that case.

Another reason that Python should not behave the way that it does is
that it paints Python into a corner where it can't add new escape
sequences in the future, as doing so will break existing code.
Generating a syntax error instead for unknown escape sequences would
allow for future extensions.

Now not to pick on Python unfairly, most other languages have similar
issues with escape sequences. (Except for the Bourne Shell and bash,
where \x always just means x, no matter what character x happens
to be.) But I've been telling my friend for years to switch to Python
because of how wonderful and consistent Python is in comparison to
most other languages, and now he seems disappointed and seems to think
that Python is just more of the same.

Of course I think that he's overreacting a bit. My point of view is
that every language has *some* warts; Python just has a bit fewer than
most. It would have been nice, I should think, if this wart had been
fixed in Python 3, as I do consider it to be a minor wart.

|>ouglas



Re: Unrecognized escape sequences in string literals

2009-08-09 Thread Carl Banks
On Aug 9, 5:06 pm, Steven D'Aprano st...@remove-this-
cybersource.com.au wrote:
 On Sun, 09 Aug 2009 12:26:54 -0700, Douglas Alan wrote:
  A friend of mine is just learning Python, and he's a bit tweaked about
  how unrecognized escape sequences are treated in Python.
 ...
  In any case, I think my friend should mellow out a bit, but we both
  consider this something of a wart. He's just more wart-phobic than I am.
  Is there any way that this behavior can be considered anything other
  than a wart? Other than the unconvincing claim that you can use this
  feature to save you a bit of typing sometimes when you actually want a
  backslash to be in your string?

 I'd put it this way: a backslash is just an ordinary character, except
 when it needs to be special. So Python's behaviour is "treat backslash as
 a normal character, except for these exceptions" while the behaviour your
 friend wants is "treat a backslash as an error, except for these
 exceptions".

 Why should a backslash in a string literal be an error?

Because the behavior of \ in a string is context-dependent, which
means a reader can't know if \ is a literal character or escape
character without knowing the context, and it means an innocuous
change in context can cause a rather significant change in \.

IOW it's an error-prone mess.  It would be better if Python (like C)
treated \ consistently as an escape character.  (And in raw strings,
consistently as a literal.)

It's kind of a minor issue in terms of overall real-world importance,
but in terms of raw unPythonicness this might be the worst offense the
language makes.


Carl Banks


Re: Unrecognized escape sequences in string literals

2009-08-09 Thread Douglas Alan
On Aug 9, 8:06 pm, Steven D'Aprano wrote:

 while the behaviour your
 friend wants is "treat a backslash as an error, except for these
 exceptions".

Besides, can't all error situations be described as, "treat the error
situation as an error, except for the exception of when the situation
isn't an error"???

The behavior my friend wants isn't any more exceptional than that!

|>ouglas