[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2013-06-23 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: about the problems you mentioned in msg144836, can you report it in a new issue or, if there are already issues about them, add a message there? I believe that would be #4610. -- nosy: +belopolsky superseder: - Unicode case mappings are

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-21 Thread Martin v . Löwis
Martin v. Löwis mar...@v.loewis.de added the comment: LGTM -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12753 ___ ___ Python-bugs-list mailing

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-21 Thread Roundup Robot
Roundup Robot devn...@psf.upfronthosting.co.za added the comment: New changeset a985d733b3a3 by Ezio Melotti in branch 'default': #12753: Add support for Unicode name aliases and named sequences. http://hg.python.org/cpython/rev/a985d733b3a3 -- nosy: +python-dev

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-21 Thread Ezio Melotti
Ezio Melotti ezio.melo...@gmail.com added the comment: I committed the patch and the buildbots seem happy. Thanks for the report and the feedback! Tom, about the problems you mentioned in msg144836, can you report it in a new issue or, if there are already issues about them, add a message

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-21 Thread Roundup Robot
Roundup Robot devn...@psf.upfronthosting.co.za added the comment: New changeset 329b96fe4472 by Ezio Melotti in branch 'default': #12753: fix compilation on Windows. http://hg.python.org/cpython/rev/329b96fe4472

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-20 Thread Ezio Melotti
Ezio Melotti ezio.melo...@gmail.com added the comment: If the latest patch is fine I'll commit it shortly. -- stage: patch review - commit review

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-20 Thread Tom Christiansen
Tom Christiansen tchr...@perl.com added the comment: Yes, it looks good. Thank you very much. -tom

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-20 Thread Florent Xicluna
Changes by Florent Xicluna florent.xicl...@gmail.com: -- nosy: +flox

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-12 Thread Martin v . Löwis
Martin v. Löwis mar...@v.loewis.de added the comment: If you don't use git-style diffs, Rietveld will much better accommodate patches that don't apply to tip cleanly. Unfortunately, hg git-style diffs don't indicate the base revision, so Rietveld guesses that the base line is tip, and then

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-10 Thread Ezio Melotti
Changes by Ezio Melotti ezio.melo...@gmail.com: Removed file: http://bugs.python.org/file23355/issue12753-4.diff

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-10 Thread Ezio Melotti
Changes by Ezio Melotti ezio.melo...@gmail.com: Added file: http://bugs.python.org/file23365/issue12753-4.diff

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-10 Thread Ezio Melotti
Changes by Ezio Melotti ezio.melo...@gmail.com: Removed file: http://bugs.python.org/file23365/issue12753-4.diff

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-10 Thread Ezio Melotti
Changes by Ezio Melotti ezio.melo...@gmail.com: Added file: http://bugs.python.org/file23374/issue12753-4.diff

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-10 Thread Ezio Melotti
Ezio Melotti ezio.melo...@gmail.com added the comment: (I had to re-upload the patch a couple of times to get the review button to work. Apparently if there are some conflicts rietveld fails to apply the patch, whereas hg is able to merge files without problems here. Sorry for the noise.)

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-09 Thread Ezio Melotti
Ezio Melotti ezio.melo...@gmail.com added the comment: Here is a new patch that stores the names of aliases and named sequences in the Private Use Area. To summarize a bit, this is what we want:
          | 6.0.0 | 3.2.0 |
----------+-------+-------+
\N{...}   | A     | -     |
.name     | -     | -
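The Private Use Area idea mentioned in this message can be pictured roughly as follows. This is only an illustrative sketch, not the code from the patch: the constants `ALIASES_START` and `SEQUENCES_START`, the tiny tables, and the `resolve()` helper are all hypothetical names and values chosen for the example. The point is that an alias or sequence name is first mapped to a private-use code point, which is then translated to the real character or string.

```python
# Hypothetical ranges in Plane 15 (private use); the values actually used
# by the patch are not shown in the message, so these are assumptions.
ALIASES_START = 0xF0000
SEQUENCES_START = 0xF0200

# Index i in each table corresponds to the private-use code point START + i.
ALIAS_TARGETS = [0x01A2, 0x01A3]
SEQUENCES = ["r\u0303"]

# One name table for everything: ordinary names map to real code points,
# aliases and named sequences map to private-use placeholders.
NAME_TO_CODE = {
    "LATIN CAPITAL LETTER OI": 0x01A2,
    "LATIN CAPITAL LETTER GHA": ALIASES_START + 0,
    "LATIN SMALL LETTER GHA": ALIASES_START + 1,
    "LATIN SMALL LETTER R WITH TILDE": SEQUENCES_START + 0,
}


def resolve(name, allow_sequences=True):
    """Translate a name to a string, unwrapping private-use placeholders."""
    code = NAME_TO_CODE[name]  # raises KeyError for unknown names
    if ALIASES_START <= code < ALIASES_START + len(ALIAS_TARGETS):
        return chr(ALIAS_TARGETS[code - ALIASES_START])
    if SEQUENCES_START <= code < SEQUENCES_START + len(SEQUENCES):
        if not allow_sequences:
            # A caller that needs a single character rejects sequences.
            raise KeyError(name)
        return SEQUENCES[code - SEQUENCES_START]
    return chr(code)
```

With a layout like this, a caller that must produce exactly one character can pass `allow_sequences=False`, while a general lookup accepts all three kinds of names.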

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-09 Thread Tom Christiansen
Tom Christiansen tchr...@perl.com added the comment: Ezio Melotti rep...@bugs.python.org wrote on Sun, 09 Oct 2011 13:21:00 -: Here is a new patch that stores the names of aliases and named sequences in the Private Use Area. Looks good! Thanks! --tom -- title: \N{...}

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-03 Thread Martin v . Löwis
Martin v. Löwis mar...@v.loewis.de added the comment: There are no official English titling rules and as you noted, publishers vary. If there aren't any rules, then how come all book and movie titles always look the same? :) Can we please leave the English language out of this issue?

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-03 Thread Ezio Melotti
Ezio Melotti ezio.melo...@gmail.com added the comment: The patch is pretty much complete, it just needs a review (I left some comments on the review page). One thing that can be added is some compression for the names of the named sequences. I'm not sure I can reuse the same compression used

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-03 Thread Martin v . Löwis
Martin v. Löwis mar...@v.loewis.de added the comment: The patch needs to take versioning into account. It seems that NamedSequences were added in 4.1, and NameAliases in 5.0. So for the moment, when using 3.2 (i.e. when self is not NULL), it is fine to lookup neither. Please put an assertion
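The versioning point can be observed from Python itself. The sketch below assumes Python 3.3 or later, where support for aliases and named sequences is available, and compares the current database with the frozen Unicode 3.2.0 database exposed as `unicodedata.ucd_3_2_0`. The helper `resolves()` is a hypothetical name introduced for the example.

```python
import unicodedata


def resolves(db, name):
    """Return True if the given database object can look up the name."""
    try:
        db.lookup(name)
    except KeyError:
        return False
    return True


# One formal alias and one named sequence, both newer than Unicode 3.2.
for name in ("LATIN CAPITAL LETTER GHA", "LATIN SMALL LETTER R WITH TILDE"):
    print(name, resolves(unicodedata, name), resolves(unicodedata.ucd_3_2_0, name))
```

Both names should resolve in the current database and be rejected by the 3.2.0 one, which matches the "lookup neither" behaviour requested in this comment.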

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-03 Thread Tom Christiansen
Tom Christiansen tchr...@perl.com added the comment: Ezio Melotti rep...@bugs.python.org wrote on Mon, 03 Oct 2011 04:15:51 -: But it still has to happen at compile time, of course, so I don't know what you could do in Python. Is there any way to change how the compiler behaves even

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-03 Thread Martin v . Löwis
Martin v. Löwis mar...@v.loewis.de added the comment: The main underlying problem is that the internal macros are defined in a way that made sense a long time ago, but no longer do ever since (for example) the Unicode lowercase property stopped being synonymous with GC=Ll and started also

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-02 Thread Ezio Melotti
Ezio Melotti ezio.melo...@gmail.com added the comment: The problem with official names is that they have things in them that are not expected in names. Do you really and truly mean to tell me you think it is somehow **good** that people are forced to write \N{LINE FEED (LF)}

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-02 Thread Ezio Melotti
Ezio Melotti ezio.melo...@gmail.com added the comment: Attached a new patch with more tests and doc. -- Added file: http://bugs.python.org/file23291/issue12753-3.diff

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-02 Thread Tom Christiansen
Tom Christiansen tchr...@perl.com added the comment: Ezio Melotti rep...@bugs.python.org wrote on Sun, 02 Oct 2011 06:46:26 -: Actually Python doesn't seem to support \N{LINE FEED (LF)}, most likely because that's a Unicode 1 name, and nowadays these codepoints are simply marked

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-02 Thread Terry J. Reedy
Terry J. Reedy tjre...@udel.edu added the comment: Really? White space makes things harder to read? I thought Pythonistas believed the opposite of that. I was surprised at that too ;-). One person's opinion in a specific context. Don't generalize. English titling rules only capitalize

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-02 Thread Tom Christiansen
Tom Christiansen tchr...@perl.com added the comment: Really? White space makes things harder to read? I thought Pythonistas believed the opposite of that. I was surprised at that too ;-). One person's opinion in a specific context. Don't generalize. The example I initially showed

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-02 Thread Ezio Melotti
Ezio Melotti ezio.melo...@gmail.com added the comment: But it still has to happen at compile time, of course, so I don't know what you could do in Python. Is there any way to change how the compiler behaves even vaguely along these lines? I think things like from __future__ import ... do

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-01 Thread Martin v . Löwis
Martin v. Löwis mar...@v.loewis.de added the comment: Does that sound fine? Yes, that's fine as well. -- title: \N{...} neglects formal aliases and named sequences from Unicode charnames namespace - \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-01 Thread Martin v . Löwis
Martin v. Löwis mar...@v.loewis.de added the comment: You may wish unicode.name() to return the alias in preference, however. -1. .name() is documented (and users familiar with it expect it) as returning the name of the character from the UCD. It doesn't really matter much to me if it's
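The distinction argued for here is visible on Python 3.3 and later: an alias is accepted as input to `lookup()`, but `name()` still reports the name recorded in the UCD. A minimal sketch, using U+01A2 as the sample character:

```python
import unicodedata

# The alias is accepted as input...
ch = unicodedata.lookup("LATIN CAPITAL LETTER GHA")

# ...but name() reports the original UCD name of that code point.
canonical = unicodedata.name(ch)

print(hex(ord(ch)), canonical)
```

So the mapping is deliberately one-way: several names may lead to a character, while a character has exactly one name.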

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-10-01 Thread Tom Christiansen
Tom Christiansen tchr...@perl.com added the comment: Perl does not provide the old 1.0 names at all. We don't have a Unicode 1.0 legacy to support, which makes this cleaner. However, we do provide for the names of the C0 and C1 Control Codes, because apart from Unicode 1.0, they don't

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-09-30 Thread Ezio Melotti
Ezio Melotti ezio.melo...@gmail.com added the comment: The attached patch changes Tools/unicode/makeunicodedata.py to create a list of names and codepoints taken from http://www.unicode.org/Public/6.0.0/ucd/NameAliases.txt and adds it to Modules/unicodename_db.h. During the lookup the
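As a rough picture of the first step described in this message, the following sketch parses alias data into a list of names and code points. It is not the code from Tools/unicode/makeunicodedata.py: `parse_aliases()` is a hypothetical helper, the sample text is inlined rather than downloaded, and the line format (hexadecimal code point, semicolon, alias, with `#` comments) is an assumption about the 6.0.0 file.

```python
# Inline sample in the assumed NameAliases.txt layout: "CODE;ALIAS".
SAMPLE = """\
# Sample alias data (format assumed)
01A2;LATIN CAPITAL LETTER GHA
01A3;LATIN SMALL LETTER GHA

0CDE;KANNADA LETTER LLLA
"""


def parse_aliases(text):
    """Return a list of (alias, code point) pairs from alias data."""
    aliases = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split(";")
        # Only the first two fields are used; extra fields are ignored.
        aliases.append((fields[1].strip(), int(fields[0], 16)))
    return aliases


for alias, code in parse_aliases(SAMPLE):
    print("%04X %s" % (code, alias))
```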

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-09-30 Thread Ezio Melotti
Changes by Ezio Melotti ezio.melo...@gmail.com: -- assignee: - ezio.melotti stage: needs patch - patch review

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-09-30 Thread Martin v . Löwis
Martin v. Löwis mar...@v.loewis.de added the comment: I propose to use a better lookup algorithm using binary search, and then integrate the NamedSequences into this as well. The search result could be a record struct { char *name; int len; Py_UCS4 chars[3]; /* no sequence is more
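The proposal is for C, and the record layout is cut off in the snippet above, so the following is only a Python sketch of the same idea: one table sorted by name, searched with binary search, where each record carries the code points the name expands to. `RECORDS`, `NAMES`, and `find()` are hypothetical names, and the table holds just a few sample entries.

```python
import bisect

# Hypothetical table: each record pairs a name with its code points
# (one for an alias, several for a named sequence). Sorted by name.
RECORDS = sorted([
    ("KANNADA LETTER LLLA", (0x0CDE,)),
    ("LATIN CAPITAL LETTER GHA", (0x01A2,)),
    ("LATIN SMALL LETTER GHA", (0x01A3,)),
    ("LATIN SMALL LETTER R WITH TILDE", (0x0072, 0x0303)),
])
NAMES = [name for name, _ in RECORDS]


def find(name):
    """Binary-search the sorted table; return the string or None."""
    i = bisect.bisect_left(NAMES, name)
    if i < len(NAMES) and NAMES[i] == name:
        return "".join(chr(cp) for cp in RECORDS[i][1])
    return None


print(repr(find("LATIN SMALL LETTER R WITH TILDE")))
print(repr(find("NO SUCH NAME")))
```

Keeping aliases and sequences in one sorted table means a single search routine serves both, which is the integration suggested in this comment.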

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-09-30 Thread Ezio Melotti
Ezio Melotti ezio.melo...@gmail.com added the comment: Leaving named sequences for unicodedata.lookup() only (and not for \N{}) makes sense. The list of aliases is so small (11 entries) that I'm not sure using a binary search for it would bring any advantage. Having a single lookup algorithm

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-09-30 Thread Tom Christiansen
Tom Christiansen tchr...@perl.com added the comment: Ezio Melotti ezio.melo...@gmail.com added the comment: Leaving named sequences for unicodedata.lookup() only (and not for \N{}) makes sense. There are certainly advantages to that strategy: you don't have to deal with [\N{sequence}]
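The split discussed here, with named sequences available through `unicodedata.lookup()` but not through `\N{}`, can be checked on Python 3.3 and later. The helper `escape_is_accepted()` below is a hypothetical name; it simply evaluates a string literal containing the escape and reports whether it compiled.

```python
import unicodedata

SEQ = "LATIN SMALL LETTER R WITH TILDE"   # a named sequence
ALIAS = "LATIN CAPITAL LETTER GHA"        # a formal alias


def escape_is_accepted(name):
    # Build the source text of a string literal that uses \N{name}
    # and check whether the compiler accepts it.
    try:
        eval('"\\N{%s}"' % name)
    except SyntaxError:
        return False
    return True


print(len(unicodedata.lookup(SEQ)))   # the sequence expands to 2 code points
print(escape_is_accepted(ALIAS))      # aliases work inside string literals
print(escape_is_accepted(SEQ))        # named sequences do not
```

Restricting `\N{}` to single characters keeps the escape unambiguous in places where exactly one character is expected.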

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-09-30 Thread Ezio Melotti
Ezio Melotti ezio.melo...@gmail.com added the comment: Attached a new patch that adds support for named sequences (still needs some test and can probably be improved). There are certainly advantages to that strategy: you don't have to deal with [\N{sequence}] issues. I assume with [] you

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-08-26 Thread Guido van Rossum
Guido van Rossum gu...@python.org added the comment: +1 on the feature request. -- nosy: +gvanrossum

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-08-19 Thread Terry J. Reedy
Terry J. Reedy tjre...@udel.edu added the comment: I verified that the test file raises the quoted SyntaxError on 3.2 on Win7. This: \N{LATIN CAPITAL LETTER GHA} SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-27: unknown Unicode character name is most

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-08-19 Thread Tom Christiansen
Tom Christiansen tchr...@perl.com added the comment: Terry J. Reedy rep...@bugs.python.org wrote on Fri, 19 Aug 2011 22:50:58 -: My current opinion is that adding the aliases might be done in current releases. It certainly would serve the any user who does not know to misspell

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-08-19 Thread Matthew Barnett
Matthew Barnett pyt...@mrabarnett.plus.com added the comment: For the Line_Break property, one of the possible values is Inseparable, with 2 permitted aliases, the shorter IN (which is reasonable) and Inseperable (ouch!).

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-08-19 Thread Tom Christiansen
Tom Christiansen tchr...@perl.com added the comment: Matthew Barnett rep...@bugs.python.org wrote on Fri, 19 Aug 2011 23:36:45 -: For the Line_Break property, one of the possible values is Inseparable, with 2 permitted aliases, the shorter IN (which is reasonable) and Inseperable

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-08-15 Thread Tom Christiansen
New submission from Tom Christiansen tchr...@perl.com: Unicode character names share a common namespace with formal aliases and with named sequences, but Python recognizes only the original name. That means not everything in the namespace is accessible from Python. (If this is construed to
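For readers on a current interpreter, the three kinds of names in that shared namespace can be tried with `unicodedata.lookup()`. This is a minimal sketch that assumes Python 3.3 or later, where support for aliases and named sequences is available; the sample names are chosen for illustration and do not come from the truncated report above.

```python
import unicodedata

# An ordinary character name.
oi = unicodedata.lookup("LATIN CAPITAL LETTER OI")

# A formal alias: a corrected name for the same code point (U+01A2).
gha = unicodedata.lookup("LATIN CAPITAL LETTER GHA")

# A named sequence: one name that stands for several code points.
r_tilde = unicodedata.lookup("LATIN SMALL LETTER R WITH TILDE")

print(hex(ord(oi)), hex(ord(gha)), [hex(ord(c)) for c in r_tilde])
```

On interpreters from before this change, the second and third lookups raise `KeyError`, which is the gap the report describes.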

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-08-15 Thread Ezio Melotti
Changes by Ezio Melotti ezio.melo...@gmail.com: -- components: +Unicode nosy: +ezio.melotti stage: - test needed versions: -Python 2.7, Python 3.1, Python 3.2

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-08-15 Thread Tom Christiansen
Tom Christiansen tchr...@perl.com added the comment: Here’s the right test file for the right ticket. -- Added file: http://bugs.python.org/file22903/nametests.py