Re: [Python-Dev] String formatting / unicode 2.5 bug?

2006-08-23 Thread Nick Coghlan
John J Lee wrote:
> On Mon, 21 Aug 2006, Nick Coghlan wrote:
>
>> John J Lee wrote:
>>>> And once the result has been promoted to unicode, __unicode__ is used
>>>> directly:
>>>>
>>>> >>> print repr("%s%s" % (a(), a()))
>>>> __str__
>>>> accessing <__main__.a object at 0x00AF66F0>.__unicode__
>>>> __str__
>>>> accessing <__main__.a object at 0x00AF6390>.__unicode__
>>>> __str__
>>>> u'hihi'
>>> I don't understand this part.  Why is __unicode__ called?  Your example
>>> doesn't appear to show this happening once [i.e., because?] the result has
>>> been promoted to unicode -- if that were true, it would stand to reason
>>> <wink> that the interpreter would then conclude it should call
>>> __unicode__ for all remaining %s, and not bother with __str__.
>> It does try to call unicode directly, but because the example object doesn't
>> supply __unicode__ it ends up falling back to __str__ instead. The behaviour
>> is clearer when the example object provides both methods:
>> [...]
>
> If the interpreter is falling back from __unicode__ to __str__ (rather
> than the other way around, kind-of), that makes much more sense.  I
> understood that __unicode__ was not provided, of course -- what wasn't
> clear to me was why the interpreter was calling/accessing those
> methods/attributes in the sequence it does.  Still not sure I understand
> what the third __str__ above comes from, but until I've thought it through
> again, that's my problem.

The sequence is effectively:
x, y = a(), a()
str(x) # calls x.__str__
unicode(x) # tries x.__unicode__, fails, falls back to x.__str__
unicode(y) # tries y.__unicode__, fails, falls back to y.__str__

The trick in 2.5 is that the '%s' format code, instead of actually calling
str(x), calls x.__str__() directly, and promotes the result to Unicode if
x.__str__() returns a Unicode result.
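
The sequence above can be modelled with a short, runnable sketch. This is a
Python 3 model, not 2.x code and not the CPython implementation: `bytes`
stands in for Python 2's `str`, `str` stands in for `unicode`, and the names
model_unicode, model_format, StrOnly and Both are hypothetical. How already
formatted byte-string pieces are handled on promotion is an assumption; the
thread only shows the case where the first argument triggers the promotion.

```python
# Python 3 model of the 2.5 behaviour described above.
# Assumption: `bytes` stands in for Python 2's `str`, `str` for `unicode`.

def model_unicode(obj, trace):
    """Model unicode(obj): try __unicode__, fall back to __str__."""
    method = getattr(obj, "__unicode__", None)
    if method is None:
        trace.append("no __unicode__")
        text = obj.__str__()
    else:
        text = method()
    return text if isinstance(text, str) else text.decode("ascii")


def model_format(args, trace):
    """Model "%s%s" % args for the cases shown in this thread."""
    pieces = []
    for index, obj in enumerate(args):
        text = obj.__str__()  # direct method call, not str(obj)
        if isinstance(text, str):
            # Promotion: this argument and every later one is converted
            # again through the unicode() path.
            head = b"".join(pieces).decode("ascii")  # assumed handling
            tail = [model_unicode(o, trace) for o in args[index:]]
            return head + "".join(tail)
        pieces.append(text)
    return b"".join(pieces)


class StrOnly(object):
    """Hypothetical counterpart of the thread's class `a` (no __unicode__)."""
    def __init__(self, trace):
        self.trace = trace

    def __str__(self):
        self.trace.append("__str__")
        return "hi"  # plays the role of u'hi'


class Both(StrOnly):
    """Hypothetical counterpart of the class that defines both methods."""
    def __unicode__(self):
        self.trace.append("__unicode__")
        return "hi"


trace_a = []
result_a = model_format((StrOnly(trace_a), StrOnly(trace_a)), trace_a)
print(trace_a, repr(result_a))

trace_b = []
result_b = model_format((Both(trace_b), Both(trace_b)), trace_b)
print(trace_b, repr(result_b))
```

The first trace has three __str__ entries, which is where the "third __str__"
comes from: one direct call on x, then one fallback call each for x and y.
The second trace has one __str__ followed by two __unicode__ entries.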

I'll try to do something to clear up that section of the documentation before
2.5 final.

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] String formatting / unicode 2.5 bug?

2006-08-22 Thread John J Lee
On Mon, 21 Aug 2006, Nick Coghlan wrote:

> John J Lee wrote:
>>> And once the result has been promoted to unicode, __unicode__ is used
>>> directly:
>>>
>>> >>> print repr("%s%s" % (a(), a()))
>>> __str__
>>> accessing <__main__.a object at 0x00AF66F0>.__unicode__
>>> __str__
>>> accessing <__main__.a object at 0x00AF6390>.__unicode__
>>> __str__
>>> u'hihi'
>>
>> I don't understand this part.  Why is __unicode__ called?  Your example
>> doesn't appear to show this happening once [i.e., because?] the result has
>> been promoted to unicode -- if that were true, it would stand to reason
>> <wink> that the interpreter would then conclude it should call
>> __unicode__ for all remaining %s, and not bother with __str__.
>
> It does try to call unicode directly, but because the example object doesn't
> supply __unicode__ it ends up falling back to __str__ instead. The behaviour
> is clearer when the example object provides both methods:
[...]

If the interpreter is falling back from __unicode__ to __str__ (rather 
than the other way around, kind-of), that makes much more sense.  I 
understood that __unicode__ was not provided, of course -- what wasn't 
clear to me was why the interpreter was calling/accessing those 
methods/attributes in the sequence it does.  Still not sure I understand 
what the third __str__ above comes from, but until I've thought it through 
again, that's my problem.


John


Re: [Python-Dev] String formatting / unicode 2.5 bug?

2006-08-21 Thread Nick Coghlan
John J Lee wrote:
>> And once the result has been promoted to unicode, __unicode__ is used
>> directly:
>>
>> >>> print repr("%s%s" % (a(), a()))
>> __str__
>> accessing <__main__.a object at 0x00AF66F0>.__unicode__
>> __str__
>> accessing <__main__.a object at 0x00AF6390>.__unicode__
>> __str__
>> u'hihi'
>
> I don't understand this part.  Why is __unicode__ called?  Your example
> doesn't appear to show this happening once [i.e., because?] the result
> has been promoted to unicode -- if that were true, it would stand to
> reason <wink> that the interpreter would then conclude it should call
> __unicode__ for all remaining %s, and not bother with __str__.

It does try to call unicode directly, but because the example object doesn't 
supply __unicode__ it ends up falling back to __str__ instead. The behaviour 
is clearer when the example object provides both methods:

>>> # Example (2.5b3)
... class a(object):
...     def __str__(self):
...         print "running __str__"
...         return u'hi'
...     def __unicode__(self):
...         print "running __unicode__"
...         return u'hi'
...
>>> print repr("%s%s" % (a(), a()))
running __str__
running __unicode__
running __unicode__
u'hihi'

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org


Re: [Python-Dev] String formatting / unicode 2.5 bug?

2006-08-20 Thread John J Lee
On Sun, 20 Aug 2006, Nick Coghlan wrote:

> John J Lee wrote:
>> Is this a bug?
>
> I don't believe so - the string formatting documentation states that the
> result will be unicode if either the format string is unicode or any of the
> objects passed to a %s format code is unicode.
>
> That latter part has just been extended to include any object that returns
> Unicode from __str__, instead of being restricted to actual Unicode
> instances.
>
> Note that the following behaves the same way regardless of whether you use
> 2.4 or 2.5:
> "%s" % 'hi'
> "%s" % u'hi'

Given that, the following wording should be changed:

http://docs.python.org/lib/typesseq-strings.html

Conversion  Meaning   Notes
...
s   String (converts any python object using str()).  (4)
...
(4) If the object or format provided is a unicode string, the resulting 
string will also be unicode.


The note (4) says that the result will be unicode, but it doesn't say how,
in this case, that comes about.  This case is confusing because the docs
claim string formatting with %s "converts ... using str()", and yet
str(a()) returns a bytestring.  Does it *really* use str, or just __str__?
Surely the latter? (given the observed behaviour, and not reading the C
source)


FWIW, this change broke epydoc (fails with an AssertionError -- so perhaps 
without the assert it would still work, dunno).


> And once the result has been promoted to unicode, __unicode__ is used
> directly:
>
> >>> print repr("%s%s" % (a(), a()))
> __str__
> accessing <__main__.a object at 0x00AF66F0>.__unicode__
> __str__
> accessing <__main__.a object at 0x00AF6390>.__unicode__
> __str__
> u'hihi'

I don't understand this part.  Why is __unicode__ called?  Your example 
doesn't appear to show this happening once [i.e., because?] the result 
has been promoted to unicode -- if that were true, it would stand to 
reason <wink> that the interpreter would then conclude it should call
__unicode__ for all remaining %s, and not bother with __str__.  If OTOH 
__unicode__ is called because __str__ returned a unicode object, it makes
(very slightly) more sense that it goes through the same 
__str__-then-__unicode__ rigmarole for each object on the RHS of the %.

But none of that seems to make a huge amount of sense.  I've now found the 
September 2004 discussion of this, and I'm none the wiser.


John



Re: [Python-Dev] String formatting / unicode 2.5 bug?

2006-08-20 Thread Neil Schemenauer
John J Lee [EMAIL PROTECTED] wrote:
> The note (4) says that the result will be unicode, but it doesn't say how,
> in this case, that comes about.  This case is confusing because the docs
> claim string formatting with %s "converts ... using str()", and yet
> str(a()) returns a bytestring.  Does it *really* use str, or just __str__?
> Surely the latter? (given the observed behaviour, and not reading the C
> source)

It uses __str__ and confirms that the returned object is a 'str' or
'unicode'.  The docs are not precise but they were not for 2.4
either.  Note the following case:

'%s' % u'Hello!'

The operand is not forced to be a str.
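
The difference between str() and the %s conversion described above can be
sketched as follows. This is a Python 3 model under stated assumptions, not
the C implementation: `bytes` stands in for Python 2's `str`, `str` stands in
for `unicode`, ASCII is assumed as the default encoding, and model_str,
model_percent_s and class A are hypothetical names.

```python
# Python 3 model. Assumption: `bytes` stands in for Python 2's `str`
# and `str` stands in for Python 2's `unicode`.

def model_str(obj):
    """Model str(obj): the result is always a byte string."""
    text = obj.__str__()
    if isinstance(text, str):
        # Python 2 would encode with the default encoding; ASCII is assumed.
        text = text.encode("ascii")
    if not isinstance(text, bytes):
        raise TypeError("__str__ returned non-string")
    return text


def model_percent_s(obj):
    """Model the %s conversion: call __str__ and only check the type."""
    if isinstance(obj, (bytes, str)):
        return obj  # a string operand is used as-is, unicode stays unicode
    text = obj.__str__()
    if not isinstance(text, (bytes, str)):
        raise TypeError("__str__ returned non-string")
    return text


class A(object):
    def __str__(self):
        return "hi"  # plays the role of u'hi'


print(repr(model_str(A())))              # byte string
print(repr(model_percent_s(A())))        # unicode result is kept
print(repr(model_percent_s("Hello!")))   # unicode operand is kept
```

In this model str() coerces a unicode result back to a byte string, while the
%s path keeps whatever string type __str__ returned, which is why the two
print statements in the original report can disagree.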

  Neil



Re: [Python-Dev] String formatting / unicode 2.5 bug?

2006-08-20 Thread John J Lee
On Sun, 20 Aug 2006, Neil Schemenauer wrote:

> John J Lee [EMAIL PROTECTED] wrote:
>> The note (4) says that the result will be unicode, but it doesn't say how,
>> in this case, that comes about.  This case is confusing because the docs
>> claim string formatting with %s "converts ... using str()", and yet
>> str(a()) returns a bytestring.  Does it *really* use str, or just __str__?
>> Surely the latter? (given the observed behaviour, and not reading the C
>> source)
>
> It uses __str__ and confirms that the returned object is a 'str' or
> 'unicode'.  The docs are not precise but they were not for 2.4
> either.  Note the following case:
[...]

OK, but I assume you're not saying that the fact that the docs were broken 
in 2.4 implies they shouldn't be fixed now?

I would suggest revised wording, but I'm clearly confused about what 
actually goes on under the hood...


John



[Python-Dev] String formatting / unicode 2.5 bug?

2006-08-19 Thread John J Lee
Is this a bug?

# run with 2.4 and then with 2.5 (I'm running release25-maint:51410)
class a(object):

    def __getattribute__(self, name):
        print "accessing %r.%s" % (self, name)
        return object.__getattribute__(self, name)

    def __str__(self):
        print "__str__"
        return u'hi'

print repr(str(a()))
print
print repr("%s" % a())


John
