[issue10521] str methods don't accept non-BMP fillchar on a narrow Unicode build

2012-01-05 Thread Benjamin Peterson
Benjamin Peterson benja...@python.org added the comment: I'm just going to close this and say use 3.3. -- nosy: +benjamin.peterson resolution: -> out of date status: open -> closed ___ Python tracker rep...@bugs.python.org

[issue10521] str methods don't accept non-BMP fillchar on a narrow Unicode build

2011-09-29 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: This issue has been fixed in Python 3.3 thanks to PEP 393. -- nosy: +haypo ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10521
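
As a quick check of that claim, here is a minimal sketch that assumes Python 3.3 or later, where PEP 393 strings store every code point as a single element. It reuses the fill character from the original report; the variable names are illustrative only.

```python
# Assumes Python 3.3 or later (PEP 393 string representation).
fill = '\U00100140'            # the non-BMP fill character from the report

centered = 'xyz'.center(20, fill)
left = 'xyz'.ljust(20, fill)
right = 'xyz'.rjust(20, fill)

# Each result is 20 code points long: 3 from 'xyz' plus 17 fill characters.
for padded in (centered, left, right):
    print(len(padded), padded.count(fill))
```

On such a build none of the three calls raises TypeError, because len(fill) is 1.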

[issue10521] str methods don't accept non-BMP fillchar on a narrow Unicode build

2011-09-29 Thread Ezio Melotti
Ezio Melotti ezio.melo...@gmail.com added the comment: It can still be fixed on 2.7/3.2 though. -- versions: +Python 2.7

[issue10521] str methods don't accept non-BMP fillchar on a narrow Unicode build

2010-11-27 Thread Ezio Melotti
Ezio Melotti ezio.melo...@gmail.com added the comment: I agree that s.center(n, char).encode('utf-8') should be the same on both builds -- even if their len() will be different -- for the following reasons: 1) the string will eventually be encoded, and if the result is the same on

[issue10521] str methods don't accept non-BMP fillchar on a narrow Unicode build

2010-11-27 Thread Terry J. Reedy
Terry J. Reedy tjre...@udel.edu added the comment: After reading the additional messages here and on a similar issue Alexander opened after this, I see the point of wanting to make the difference between the two types of builds as transparent as sensibly possible. From that viewpoint,

[issue10521] str methods don't accept non-BMP fillchar on a narrow Unicode build

2010-11-26 Thread Terry J. Reedy
Terry J. Reedy tjre...@udel.edu added the comment: As a practical matter, I think that for at least the next decade, people are at least as likely to want to fill with a composed, multi-BMP-codepoint 'char' (grapheme) as with a non-BMP char. So to me, failure with the latter is no worse than

[issue10521] str methods don't accept non-BMP fillchar on a narrow Unicode build

2010-11-26 Thread Alexander Belopolsky
Alexander Belopolsky belopol...@users.sourceforge.net added the comment: On Fri, Nov 26, 2010 at 6:37 PM, Terry J. Reedy rep...@bugs.python.org wrote: Terry J. Reedy tjre...@udel.edu added the comment: As a practical matter, I think that for at least the next decade, people are at least as

[issue10521] str methods don't accept non-BMP fillchar on a narrow Unicode build

2010-11-26 Thread Eric Smith
Eric Smith e...@trueblade.com added the comment: I think these macros would be a reasonable approach. I think str.center, etc. should support non-BMP chars, because to not do so can raise an exception. Supporting composed graphemes seems like another problem altogether. And while we could fix

[issue10521] str methods don't accept non-BMP fillchar on a narrow Unicode build

2010-11-24 Thread Alexander Belopolsky
New submission from Alexander Belopolsky belopol...@users.sourceforge.net:

>>> 'xyz'.center(20, '\U00100140')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: The fill character must be exactly one character long

str.ljust and str.rjust are similarly affected.
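
The error follows from how a narrow build stores the fill character: as two UTF-16 code units (a surrogate pair), so its length is 2 rather than 1. The sketch below shows the standard UTF-16 split for the code point in the report; to_surrogate_pair is a hypothetical helper name, not something from CPython.

```python
def to_surrogate_pair(code_point):
    """Split a non-BMP code point into its UTF-16 high and low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a non-BMP code point")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)     # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits of the offset
    return high, low

high, low = to_surrogate_pair(0x100140)
print(hex(high), hex(low))  # 0xdbc0 0xdd40
```

These are the same two units that '\U00100140'.encode('utf-16-be') produces, which is why the length check in str.center rejected the argument on a narrow build.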

[issue10521] str methods don't accept non-BMP fillchar on a narrow Unicode build

2010-11-24 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: The question is, what should it do with such an input? Pretend it's a single char (but other chars in the source string won't get the same treatment)? Treat it as a two-char string (but then center() and friends should logically be extended to

[issue10521] str methods don't accept non-BMP fillchar on a narrow Unicode build

2010-11-24 Thread Eric Smith
Eric Smith e...@trueblade.com added the comment: str.__format__ and friends (int, float, complex) also have this same problem. For example, when they're computing the fill character:

>>> format('', 'x^')
''
>>> format('', '\U00100140^')
Traceback (most recent call last):
  File "<stdin>", line 1, in
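
For comparison, a minimal sketch of the same format() calls, assuming Python 3.3 or later, where a non-BMP fill character is a single code point. The variable names are illustrative.

```python
# Assumes Python 3.3 or later.
fill = '\U00100140'

empty = format('', fill + '^')        # no width given, so nothing is padded
padded = format('xyz', fill + '^20')  # centre 'xyz' in a field of width 20
number = format(42, fill + '>8')      # int formatting accepts the same fill

print(len(padded), len(number))
```

The format specification treats the first character as the fill whenever the second is an alignment character, so the same length-one requirement applies here as in str.center.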

[issue10521] str methods don't accept non-BMP fillchar on a narrow Unicode build

2010-11-24 Thread Alexander Belopolsky
Alexander Belopolsky belopol...@users.sourceforge.net added the comment: On Wed, Nov 24, 2010 at 10:33 AM, Antoine Pitrou rep...@bugs.python.org wrote: .. The question is, what should it do with such an input? I think the rule for such functions should be that if input.encode('utf-8') is the

[issue10521] str methods don't accept non-BMP fillchar on a narrow Unicode build

2010-11-24 Thread Ezio Melotti
Changes by Ezio Melotti ezio.melo...@gmail.com: -- nosy: +ezio.melotti

[issue10521] str methods don't accept non-BMP fillchar on a narrow Unicode build

2010-11-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg m...@egenix.com added the comment: Alexander Belopolsky wrote: New submission from Alexander Belopolsky belopol...@users.sourceforge.net: 'xyz'.center(20, '\U00100140') Traceback (most recent call last): File stdin, line 1, in module TypeError: The fill character

[issue10521] str methods don't accept non-BMP fillchar on a narrow Unicode build

2010-11-24 Thread Alexander Belopolsky
Alexander Belopolsky belopol...@users.sourceforge.net added the comment: On Wed, Nov 24, 2010 at 3:37 PM, Marc-Andre Lemburg rep...@bugs.python.org wrote: .. I don't think we should change that for the formatting methods. That's a reasonable position. What about 'Lo' '\N{OLD ITALIC LETTER

[issue10521] str methods don't accept non-BMP fillchar on a narrow Unicode build

2010-11-24 Thread Alexander Belopolsky
Alexander Belopolsky belopol...@users.sourceforge.net added the comment: On Wed, Nov 24, 2010 at 3:37 PM, Marc-Andre Lemburg rep...@bugs.python.org wrote: .. I don't think we should change that for the formatting methods. That's a reasonable position. What about unicodedata.category('\N{OLD

[issue10521] str methods don't accept non-BMP fillchar on a narrow Unicode build

2010-11-24 Thread Alexander Belopolsky
Alexander Belopolsky belopol...@users.sourceforge.net added the comment: Here is another str method not ready for non-BMP chars:

>>> u = '\U00010140'
>>> u.translate({ord(u):ord('A')})
'ŀ' (expected 'A')
>>> u = 'B'
>>> u.translate({ord(u):ord('A')})
'A'

--
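
The expected behaviour can be checked with a short sketch, assuming Python 3.3 or later, where str.translate looks up whole code points. The name table is illustrative.

```python
# Assumes Python 3.3 or later.
u = '\U00010140'
table = {ord(u): ord('A')}   # map the non-BMP code point to 'A'

print(u.translate(table))    # A
print('B'.translate({ord('B'): ord('A')}))  # A
```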

[issue10521] str methods don't accept non-BMP fillchar on a narrow Unicode build

2010-11-24 Thread Ezio Melotti
Ezio Melotti ezio.melo...@gmail.com added the comment: I think that methods like str.isalpha can and should be fixed. Since _PyUnicode_IsAlpha now accepts a Py_UCS4, the body of unicode_isalpha can be changed to convert normal chars and surrogate pairs to a Py_UCS4 before calling

[issue10521] str methods don't accept non-BMP fillchar on a narrow Unicode build

2010-11-24 Thread Alexander Belopolsky
Alexander Belopolsky belopol...@users.sourceforge.net added the comment: Here is another proof of concept patch for the isalpha issue that introduces a higher level abstraction macro - Py_UNICODE_NEXT. It should be possible to reuse this macro in all isxyz methods and other places where
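
The idea behind such a macro can be sketched in Python: walk a sequence of UTF-16 code units and join each surrogate pair into one code point before classifying it. This is only an illustration of the technique, not the code from the patch; utf16_units, next_code_point, and isalpha_units are hypothetical names.

```python
import struct

def utf16_units(text):
    """Return the UTF-16 code units of text, as a narrow build would store them."""
    data = text.encode('utf-16-le')
    return list(struct.unpack('<%dH' % (len(data) // 2), data))

def next_code_point(units, i):
    """Hypothetical helper: return (code_point, next_index) starting at units[i]."""
    unit = units[i]
    if 0xD800 <= unit <= 0xDBFF and i + 1 < len(units):
        low = units[i + 1]
        if 0xDC00 <= low <= 0xDFFF:
            # Valid surrogate pair: combine into one code point, consume two units.
            return 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00), i + 2
    # BMP character or lone surrogate: pass the unit through unchanged.
    return unit, i + 1

def isalpha_units(units):
    """Sketch of an isalpha check that walks code points instead of code units."""
    if not units:
        return False
    i = 0
    while i < len(units):
        code_point, i = next_code_point(units, i)
        if not chr(code_point).isalpha():
            return False
    return True

units = utf16_units('a\U00010300')   # 'a' followed by OLD ITALIC LETTER A
print([hex(u) for u in units])
print(isalpha_units(units))
```

Because the pairing logic lives in one helper, the same loop shape would serve any other per-character predicate by swapping the isalpha call.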

[issue10521] str methods don't accept non-BMP fillchar on a narrow Unicode build

2010-11-24 Thread Ezio Melotti
Changes by Ezio Melotti ezio.melo...@gmail.com: -- nosy: +amaury.forgeotdarc

[issue10521] str methods don't accept non-BMP fillchar on a narrow Unicode build

2010-11-24 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc amaur...@gmail.com added the comment: issue9200 already proposes a similar change to str.is* methods.