Re: Literal concatenation, strings vs. numbers (was: Numeric literals in other than base 10 - was Annoying octal notation)

2009-08-24 Thread Carl Banks
On Aug 23, 7:45 pm, Ben Finney ben+pyt...@benfinney.id.au wrote:
 greg g...@cosc.canterbury.ac.nz writes:
  J. Cliff Dyer wrote:

   What happens if you use a literal like 0x10f 304?

  To me the obvious thing to do is concatenate them textually and then
  treat the whole thing as a single numeric literal. Anything else
  wouldn't be sane, IMO.

 Yet, as was pointed out, that behaviour would be inconsistent with the
 concatenation of string literals::

      >>> "abc" r'def' u"ghi" 'jkl'
      u'abcdefghijkl'

Well, my take on it is that this would not be the same as string
concatenation: the series of digits would be parsed as a single token
with the spaces automatically removed.  That does make a difference to
users (it's not just under the covers).

For instance, string concatenation works across lines (inside brackets,
or with a backslash continuation):

"abc"
"def"

but if the numbers were parsed as a single token it wouldn't
necessarily be allowed, and would be unwise, so this is out:

100
200

You might also want to enforce rules such as: only a single space can
separate digits, no tabs, no multiple spaces, so this

100  200

would also be right out.  You might even want to enforce that spaces
be at regular intervals.  I don't think it would matter too much that
digit separation can superficially resemble string concatenation when
you don't break the strings across lines.  It's not too difficult to
explain what the difference is, and there's really not much chance
anyone would be confused by their meanings.
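
The stricter rules described above could be sketched in Python roughly
as follows. This is only an illustration under stated assumptions: the
name parse_strict_spaced_int is hypothetical, only decimal integers are
handled, and "regular intervals" is taken to mean groups of three digits.

```python
import re

# Assumed rule (not from any real grammar): one to three leading digits,
# then zero or more groups of exactly three digits, each preceded by
# exactly one space character. Tabs, newlines and runs of spaces fail.
_SPACED_INT = re.compile(r"[0-9]{1,3}(?: [0-9]{3})*")


def parse_strict_spaced_int(text):
    """Return the int written as a spaced digit sequence, or raise SyntaxError."""
    if not _SPACED_INT.fullmatch(text):
        raise SyntaxError("invalid spaced integer literal: %r" % (text,))
    # The whole thing is one token, so the spaces are simply dropped.
    return int(text.replace(" ", ""))
```

With these assumptions "1 000 000" is accepted, while "100  200" (two
spaces), a tab-separated pair, and digits split across two lines are all
rejected.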

Having said all that, I would favor _ as a digit separator in Python
any day of the week, and I don't think it's all that important to have
one at all.
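
For reference, later Python versions (3.6 onward, via PEP 515) did adopt
_ as a digit separator. A short sketch of how it behaves there; the
helper name is_valid_expression is hypothetical and exists only for this
example:

```python
# Underscores between digits are ignored by the parser (Python 3.6+).
million = 1_000_000
mask = 0xff_123


def is_valid_expression(source):
    """Return True if source compiles as a Python expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# A single underscore between digits is fine; doubled or trailing
# underscores are rejected.
checks = {
    "1_000": is_valid_expression("1_000"),
    "1__000": is_valid_expression("1__000"),
    "1000_": is_valid_expression("1000_"),
}
```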

HOWEVER, I once proposed that if I were designing a new language I'd
consider allowing spaces in identifiers.  (That didn't stop people
from arguing why it would be confusing in Python, but never mind
that.)  If spaces were allowed in identifiers, then I'd also be in
favor of spaces in numeric literals.


 So, different representations of literals are parsed as separate
 literals, then concatenated. To have the behaviour you describe, the
 case needs to be made separately that digit concatenation should not be
 consistent with the established string literal parsing behaviour.

Well, one doesn't really *need* to make that case; they just might not
care about consistency.

But if they did, I think Erik's case is a good one: very little chance
of confusion because there's really only one reasonable
interpretation.  The point of consistency is to help understand things
by analogy, but if analogy doesn't help understanding--and it wouldn't
in this case--there's no point.


Carl Banks
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Literal concatenation, strings vs. numbers (was: Numeric literals in other than base 10 - was Annoying octal notation)

2009-08-24 Thread Steven D'Aprano
On Mon, 24 Aug 2009 12:45:25 +1000, Ben Finney wrote:

 greg g...@cosc.canterbury.ac.nz writes:
 
 J. Cliff Dyer wrote:

  What happens if you use a literal like 0x10f 304?

 To me the obvious thing to do is concatenate them textually and then
 treat the whole thing as a single numeric literal. Anything else
 wouldn't be sane, IMO.

Agreed. It's the only sane way to deal with concatenating numeric 
literals. It makes it simple and easy to understand: remove the 
whitespace from inside the literal, and parse as normal.

123 4567 => 1234567  # legal
0xff 123 => 0xff123  # legal
123 0xff => 1230xff  # illegal

The first two examples would be legal, the last would raise a syntax 
error, for obvious reasons. This would also work for floats:

1.23 4e5 => 1.234e5  # legal
1.23 4.5 => 1.234.5  # illegal
1e23 4e5 => 1e234e5  # illegal
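
The "remove the whitespace, then parse as normal" rule can be sketched in
Python like this. It is a rough illustration, not a proposal for the real
tokenizer: the name parse_spaced_literal is hypothetical, and it leans on
ast.literal_eval to stand in for "parse as normal".

```python
import ast


def parse_spaced_literal(text):
    """Join the space-separated pieces textually, then parse the result once."""
    joined = text.replace(" ", "")
    try:
        value = ast.literal_eval(joined)
    except (SyntaxError, ValueError):
        raise SyntaxError("invalid numeric literal: %r" % (joined,))
    # literal_eval also accepts strings, tuples, True, and so on; this
    # sketch only wants plain ints and floats.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise SyntaxError("not a numeric literal: %r" % (joined,))
    return value
```

Under that sketch "123 4567", "0xff 123" and "1.23 4e5" parse, while
"123 0xff", "1.23 4.5" and "1e23 4e5" raise SyntaxError, because the
joined text is not a single valid literal.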



 Yet, as was pointed out, that behaviour would be inconsistent with the
 concatenation of string literals::
 
  >>> "abc" r'def' u"ghi" 'jkl'
  u'abcdefghijkl'

Unicode/byte conversion is obviously a special case, and arguably should 
have been prohibited, although "practicality beats purity" suggests that 
a single unicode string in the sequence should make the lot unicode. 
(What else could it mean?)
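
For comparison, Python 3 takes the stricter route: adjacent text literals
still concatenate, but mixing a bytes literal with a text literal is a
syntax error. A small sketch that checks this by compiling source
snippets; the helper name compiles_ok is hypothetical:

```python
def compiles_ok(source):
    """Return True if source compiles as a Python 3 expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# Text literals with different prefixes may sit side by side...
text_mix = compiles_ok("'abc' r'def' u'ghi'")
# ...but bytes and text literals may not be mixed.
bytes_mix = compiles_ok("b'abc' 'def'")
```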

In any case, numeric concatenation and string concatenation are very 
different beasts. With strings, you have to interpret each piece as 
either bytes or characters, you have to treat escapes specially, you have 
to deal with matching delimiters. For numeric concatenation, none of 
those complications is relevant: there is no equivalent to the byte/
character dichotomy, there are no escape sequences, there are no 
delimiters.

Numeric literals are much simpler than string literals; consequently the 
concatenation rule can be correspondingly simpler too. There's no need to 
complicate it by *adding* complexity: you can't have mixed bases in a 
single numeric literal without spaces, why would you expect to have mixed 
bases in one with spaces?




-- 
Steven


Literal concatenation, strings vs. numbers (was: Numeric literals in other than base 10 - was Annoying octal notation)

2009-08-23 Thread Ben Finney
greg g...@cosc.canterbury.ac.nz writes:

 J. Cliff Dyer wrote:

  What happens if you use a literal like 0x10f 304?

 To me the obvious thing to do is concatenate them textually and then
 treat the whole thing as a single numeric literal. Anything else
 wouldn't be sane, IMO.

Yet, as was pointed out, that behaviour would be inconsistent with the
concatenation of string literals::

 >>> "abc" r'def' u"ghi" 'jkl'
 u'abcdefghijkl'

So, different representations of literals are parsed as separate
literals, then concatenated. To have the behaviour you describe, the
case needs to be made separately that digit concatenation should not be
consistent with the established string literal parsing behaviour.
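
A small Python 3 sketch of what "parsed as separate literals, then
concatenated" means in practice: each piece keeps its own escape rules,
so a raw piece and a normal piece treat the same backslash sequence
differently. The variable names are only for illustration.

```python
# The first piece is a normal literal, so \t becomes a tab character.
# The second piece is raw, so \t stays as a backslash followed by "t".
pieces = "a\tb" r"c\td"

# The example from the thread, with the u prefix as accepted by Python 3.3+.
combined = "abc" r'def' u"ghi" 'jkl'

# Joining the source text first would give a different result: one normal
# literal "a\tbc\td" contains two tabs and no backslash.
joined_textually = "a\tbc\td"
```

Here pieces equals "a\tb" + "c\\td", which differs from joined_textually,
and that is exactly where textual joining and per-literal parsing part
ways.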

-- 
 \“What if the Hokey Pokey IS what it's all about?” —anonymous |
  `\   |
_o__)  |
Ben Finney