[issue30717] str.center() is not unicode aware

2017-08-02 Thread Guillaume Sanchez
Guillaume Sanchez added the comment: Hi, are you still interested? I haven't heard from you in a while.

[issue30717] str.center() is not unicode aware

2017-07-23 Thread Socob
Changes by Socob <206a8...@opayq.com>: nosy: +Socob

[issue30717] str.center() is not unicode aware

2017-07-15 Thread Christian Heimes
Changes by Christian Heimes: assignee: christian.heimes -> ; components: +Interpreter Core -SSL, Tests, Tkinter; nosy: -christian.heimes

[issue30717] str.center() is not unicode aware

2017-07-13 Thread Terry J. Reedy
Terry J. Reedy added the comment: I think it is at least plausible that we should add implementations of some of the Unicode standard's algorithms. Victor and Serhiy, as two of the active core devs most involved with Unicode issues, what do you think? -- nosy: +haypo, serhiy.storchaka,

[issue30717] str.center() is not unicode aware

2017-07-13 Thread Guillaume Sanchez
Guillaume Sanchez added the comment: Hello Steven! Thanks for the quick response! unicodedata.grapheme_cluster_break() takes a Unicode code point as an argument and returns its GraphemeBreakProperty as a string. Possible values are listed here: http://www.unicode.org/reports/tr29/#CR

[issue30717] str.center() is not unicode aware

2017-07-13 Thread Steven D'Aprano
Steven D'Aprano added the comment: Thank you, but I cannot review your C code. Can you start by telling us what the two functions unicodedata.grapheme_cluster_break() and unicodedata.break_graphemes() take as arguments, and what they return? If we were to call help(function), what would we see?

[issue30717] str.center() is not unicode aware

2017-07-13 Thread Guillaume Sanchez
Guillaume Sanchez added the comment: Hello, I implemented unicodedata.break_graphemes(), which returns an iterator that yields consecutive graphemes. This is a "test" implementation meant to see what doesn't fit Python's style and design, and to discuss naming and implementation details.
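
To illustrate the shape of such an iterator, here is a minimal sketch. It is not the break_graphemes() from the patch, and it does not implement the UAX #29 rules. It only attaches code points with a nonzero canonical combining class to the preceding code point, so it ignores Hangul syllables, emoji ZWJ sequences, regional indicators and CRLF. The name iter_clusters_approx is hypothetical.

```python
import unicodedata


def iter_clusters_approx(text):
    """Yield approximate clusters: a base code point plus following combining marks.

    Hypothetical helper for illustration only; this is NOT the UAX #29
    grapheme cluster algorithm and not the API proposed in the patch.
    """
    cluster = ""
    for ch in text:
        # A code point with combining class 0 starts a new cluster.
        if cluster and unicodedata.combining(ch) == 0:
            yield cluster
            cluster = ch
        else:
            # Combining marks (class != 0) extend the current cluster.
            cluster += ch
    if cluster:
        yield cluster


# "e" followed by U+0301 COMBINING ACUTE ACCENT, then "x"
print(list(iter_clusters_approx("e\u0301x")))
```

A full implementation would consult the Grapheme_Cluster_Break property of each code point instead of the combining class.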

[issue30717] str.center() is not unicode aware

2017-07-11 Thread Guillaume Sanchez
Guillaume Sanchez added the comment: Hello to all of you, sorry for the delay. Been busy. I added the base code needed to build the grapheme cluster break algorithm. We now have the GraphemeBreakProperty available via unicodedata.grapheme_cluster_break(). Can you check that the implementation

[issue30717] str.center() is not unicode aware

2017-07-11 Thread Roundup Robot
Changes by Roundup Robot: pull_requests: +2741

[issue30717] str.center() is not unicode aware

2017-07-01 Thread R. David Murray
R. David Murray added the comment: See also issue 12568. -- nosy: +r.david.murray

[issue30717] str.center() is not unicode aware

2017-06-20 Thread Mariatta Wijaya
Changes by Mariatta Wijaya: stage: -> needs patch; type: -> enhancement

[issue30717] str.center() is not unicode aware

2017-06-20 Thread Steven D'Aprano
Steven D'Aprano added the comment: http://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries talks about *grapheme clusters*, not "graphemes" alone, and it seems clear to me that they are language dependent. For example, it says: The Unicode Standard provides default algorithms for

[issue30717] str.center() is not unicode aware

2017-06-20 Thread Guillaume Sanchez
Guillaume Sanchez added the comment: Thanks for all those interesting cases you brought here! I didn't think of that at all! I'm using the word "grapheme" as per the definition given in UAX TR29 which is *not* language/locale dependent [1]. This annex is very specific and precise about where

[issue30717] str.center() is not unicode aware

2017-06-20 Thread Steven D'Aprano
Steven D'Aprano added the comment: I don't think graphemes is the right term here. Graphemes are language dependent, for instance "dž" may be considered a grapheme in Croatian. https://en.wikipedia.org/wiki/D%C5%BE http://www.unicode.org/glossary/#grapheme I believe you are referring to

[issue30717] str.center() is not unicode aware

2017-06-20 Thread Guillaume Sanchez
Guillaume Sanchez added the comment: Obviously, I'm talking about str.center() but all functions needing a count of graphemes are then not totally correct. I can fix that and add the corresponding function, or an iterator over graphemes, or whatever seems right :)

[issue30717] str.center() is not unicode aware

2017-06-20 Thread Guillaume Sanchez
New submission from Guillaume Sanchez: "a⃑".center(5, ".") produces '..a⃑.' instead of '..a⃑..' The reason is that "a⃑" is composed of two code points (2 UCS4 chars), one 'a' and one combining code point "above arrow". str.center() counts the size of the string and fills it
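
The reported behaviour can be reproduced with a short script. The helper center_approx below is a hypothetical sketch, not a proposed fix: it assumes that every code point with a nonzero canonical combining class takes no column of its own, which is far cruder than UAX #29 grapheme clusters. The sample uses U+20D7 COMBINING RIGHT ARROW ABOVE as the combining mark; this is an assumption, since the exact code point in the report is not spelled out.

```python
import unicodedata


def center_approx(text, width, fillchar=" "):
    """Center text, counting only code points that are not combining marks.

    Hypothetical helper for illustration. When the padding is odd, the
    extra fill character goes on the right, which may differ from
    str.center()'s own rounding rule.
    """
    visible = sum(1 for ch in text if unicodedata.combining(ch) == 0)
    pad = max(width - visible, 0)
    left = pad // 2
    return fillchar * left + text + fillchar * (pad - left)


sample = "a\u20d7"  # 'a' plus a combining mark (assumed code point)

print(len(sample))                     # counts code points, so 2
print(sample.center(5, "."))           # pads as if the string were 2 columns wide
print(center_approx(sample, 5, "."))   # pads as if it were 1 column wide
```

Note that str.center() only accepts positional arguments, so the call is written as center(5, ".").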