[issue10552] Tools/unicode/gencodec.py error

2021-11-24 Thread STINNER Victor


Change by STINNER Victor :


--
nosy:  -vstinner

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10552] Tools/unicode/gencodec.py error

2021-11-24 Thread Irit Katriel


Irit Katriel  added the comment:

I don't think Martin's patch has been applied. Is it needed?

--
nosy: +iritkatriel




[issue10552] Tools/unicode/gencodec.py error

2015-01-12 Thread Martin Panter

Martin Panter added the comment:

Here is a new version of Kuchling’s patch. I restored some mapping files which 
do not give any errors (including the mac_turkish codec, which is actually 
documented), and removed both readme files.

--
components: +Unicode
nosy: +haypo, vadmium
versions: +Python 3.4
Added file: http://bugs.python.org/file37687/10552-remove-apple-files-v2.txt

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10552
___



[issue10552] Tools/unicode/gencodec.py error

2014-12-31 Thread A.M. Kuchling

Changes by A.M. Kuchling a...@amk.ca:


--
nosy:  -akuchling




[issue10552] Tools/unicode/gencodec.py error

2014-06-29 Thread Alexander Belopolsky

Changes by Alexander Belopolsky alexander.belopol...@gmail.com:


--
assignee: belopolsky ->




[issue10552] Tools/unicode/gencodec.py error

2014-06-29 Thread Alexander Belopolsky

Changes by Alexander Belopolsky alexander.belopol...@gmail.com:


--
nosy: +hynek, ned.deily, ronaldoussoren




[issue10552] Tools/unicode/gencodec.py error

2013-11-10 Thread A.M. Kuchling

A.M. Kuchling added the comment:

For the Mac issue, we could just delete the mapping files before processing 
them.  I've attached a patch that modifies the Makefile.

--
nosy: +akuchling
Added file: http://bugs.python.org/file32565/10552-remove-apple-files.txt




[issue10552] Tools/unicode/gencodec.py error

2010-11-30 Thread Alexander Belopolsky

Alexander Belopolsky belopol...@users.sourceforge.net added the comment:

Committed in revision 86891.  Keeping open to address Mac issue.

--
assignee:  -> belopolsky
components: +Macintosh
priority: normal -> low
stage: commit review -> needs patch




[issue10552] Tools/unicode/gencodec.py error

2010-11-29 Thread Alexander Belopolsky

Alexander Belopolsky belopol...@users.sourceforge.net added the comment:

Martin,

I believe you were the last to update the Unicode database. (See r85371.)  Did
you use python2.x to generate it, or do you have your own private copy of these
tools?

I noticed that genwincodecs.bat refers to c:\python26\python in the 2.7 branch and
c:\python30\python in py3k.  Could this be an indication that these tools are
out of date?

What is the plan for maintaining these tools?  Should fixes be done in 2.7 and
3.x be generated by 2to3? Or should fixes go to py3k and be backported to 2.7 when
they don't add new features?

--




[issue10552] Tools/unicode/gencodec.py error

2010-11-29 Thread Marc-Andre Lemburg

Marc-Andre Lemburg m...@egenix.com added the comment:

gencodec.py is only rarely used, namely when adding new codecs based
on Unicode mapping files.

It is not run regularly on the files from ftp.unicode.org and only
updated on demand.

AFAIK, it was last used on Python2 and never on Python3, hence the
errors you find with it.

BTW: You appear to have a comma appended to the constant, that doesn't
belong there:

+# Placeholder for a missing codepoint
+MISSING_CODE = -1,
+

Perhaps that's causing the second error you are seeing.
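
For illustration only (this is not part of any attached patch), here is a minimal sketch of why the trailing comma matters. The name MISSING_CODE_WITH_COMMA and the sample list are made up for the example:

```python
# A trailing comma turns the right-hand side into a one-element tuple.
MISSING_CODE_WITH_COMMA = -1,
MISSING_CODE = -1

print(type(MISSING_CODE_WITH_COMMA).__name__)  # tuple
print(type(MISSING_CODE).__name__)             # int

# An int code never compares equal to the tuple, so a filter written as
# "x != MISSING_CODE" would silently stop dropping missing entries.
codes = [0x41, -1, 0x42]  # hypothetical sample data
kept_with_comma = [x for x in codes if x != MISSING_CODE_WITH_COMMA]
kept = [x for x in codes if x != MISSING_CODE]
print(kept_with_comma)  # [65, -1, 66]
print(kept)             # [65, 66]
```

So the stray comma would change behavior, whether or not it explains the particular traceback reported here.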

--




[issue10552] Tools/unicode/gencodec.py error

2010-11-29 Thread Alexander Belopolsky

Alexander Belopolsky belopol...@users.sourceforge.net added the comment:

On Mon, Nov 29, 2010 at 1:21 PM, Marc-Andre Lemburg
rep...@bugs.python.org wrote:
..
> BTW: You appear to have a comma appended to the constant, that doesn't
> belong there:
>
> +# Placeholder for a missing codepoint
> +MISSING_CODE = -1,
> +
>
> Perhaps that's causing the second error you are seeing.

No, that comma was a left-over from the attempt to fix the
mac_chinsimp error.  The trace that I reported was generated with
MISSING_CODE = -1.   I am replacing the patch.

Is it ok to commit a partial fix?  It may take longer to fix the mac error.

--
Added file: http://bugs.python.org/file19874/issue10552a.diff

Index: Tools/unicode/gencodec.py
===================================================================
--- Tools/unicode/gencodec.py   (revision 86837)
+++ Tools/unicode/gencodec.py   (working copy)
@@ -34,6 +34,9 @@
 # Standard undefined Unicode code point
 UNI_UNDEFINED = chr(0xFFFE)
 
+# Placeholder for a missing codepoint
+MISSING_CODE = -1
+
 mapRE = re.compile('((?:0x[0-9a-fA-F]+\+?)+)'
                    '\s+'
                    '((?:(?:0x[0-9a-fA-Z]+|<[A-Za-z]+>)\+?)*)'
@@ -52,7 +55,7 @@
 
     """
     if not codes:
-        return None
+        return MISSING_CODE
     l = codes.split('+')
     if len(l) == 1:
         return int(l[0],16)
@@ -60,8 +63,8 @@
         try:
             l[i] = int(l[i],16)
         except ValueError:
-            l[i] = None
-    l = [x for x in l if x is not None]
+            l[i] = MISSING_CODE
+    l = [x for x in l if x != MISSING_CODE]
     if len(l) == 1:
         return l[0]
     else:
@@ -113,7 +116,7 @@
     # mappings to None for the rest
     if len(identity) >= len(unmapped):
         for enc in unmapped:
-            enc2uni[enc] = (None, "")
+            enc2uni[enc] = (MISSING_CODE, "")
         enc2uni['IDENTITY'] = 256
 
     return enc2uni
@@ -211,7 +214,7 @@
             (mapkey, mapcomment) = mapkey
         if isinstance(mapvalue, tuple):
             (mapvalue, mapcomment) = mapvalue
-        if mapkey is None:
+        if mapkey == MISSING_CODE:
             continue
         table[mapkey] = (mapvalue, mapcomment)
         if mapkey > maxkey:
@@ -223,11 +226,11 @@
     # Create table code
     for key in range(maxkey + 1):
         if key not in table:
-            mapvalue = None
+            mapvalue = MISSING_CODE
             mapcomment = 'UNDEFINED'
         else:
             mapvalue, mapcomment = table[key]
-        if mapvalue is None:
+        if mapvalue == MISSING_CODE:
             mapchar = UNI_UNDEFINED
         else:
             if isinstance(mapvalue, tuple):



[issue10552] Tools/unicode/gencodec.py error

2010-11-29 Thread Alexander Belopolsky

Changes by Alexander Belopolsky belopol...@users.sourceforge.net:


Removed file: http://bugs.python.org/file19843/issue10552a.diff




[issue10552] Tools/unicode/gencodec.py error

2010-11-29 Thread Marc-Andre Lemburg

Marc-Andre Lemburg m...@egenix.com added the comment:

Alexander Belopolsky wrote:
>
> Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
>
> On Mon, Nov 29, 2010 at 1:21 PM, Marc-Andre Lemburg
> rep...@bugs.python.org wrote:
> ..
>> BTW: You appear to have a comma appended to the constant, that doesn't
>> belong there:
>>
>> +# Placeholder for a missing codepoint
>> +MISSING_CODE = -1,
>> +
>>
>> Perhaps that's causing the second error you are seeing.
>
> No, that comma was a left-over from the attempt to fix the
> mac_chinsimp error.  The trace that I reported was generated with
> MISSING_CODE = -1.   I am replacing the patch.
>
> Is it ok to commit a partial fix?  It may take longer to fix the mac error.

Sure, we won't need that script anytime soon and if we do, we
can just as well use the Python2 version.

--




[issue10552] Tools/unicode/gencodec.py error

2010-11-29 Thread Alexander Belopolsky

Alexander Belopolsky belopol...@users.sourceforge.net added the comment:

On Mon, Nov 29, 2010 at 1:38 PM, Marc-Andre Lemburg
rep...@bugs.python.org wrote:
..
> Sure, we won't need that script anytime soon and if we do, we
> can just as well use the Python2 version.

That may not be true.  I compared 2.7 and py3k versions and the latter
has some new features:

* unidata_version  changed from 5.2.0 to 6.0.0
* Unihan data is read from zip file
* added processing of DerivedCoreProperties

These changes don't affect gencodec.py, but it may be inconvenient to
run makeunicodedata.py and gencodec.py using different versions of
Python.

I'll check that all non-mac encodings are correctly generated before committing.

--




[issue10552] Tools/unicode/gencodec.py error

2010-11-29 Thread Martin v . Löwis

Martin v. Löwis mar...@v.loewis.de added the comment:

> These changes don't affect gencodec.py, but it may be inconvenient to
> run makeunicodedata.py and gencodec.py using different versions of
> Python.

As MAL explains: these are completely unrelated, independent tools,
and gencodec isn't run more than once per decade (or so). I only ever
run makeunicodedata, and I have been using Python 3 to run it.

The mappings are not supposed to ever change once produced. In
particular, new versions of Unicode cannot affect them, since the
existing characters all map fine to existing code points, which will
not change their meaning per Unicode stability criteria.

--




[issue10552] Tools/unicode/gencodec.py error

2010-11-27 Thread Alexander Belopolsky

New submission from Alexander Belopolsky belopol...@users.sourceforge.net:

$ ../../python.exe gencodec.py MAPPINGS/VENDORS/MISC/ build/
converting APL-ISO-IR-68.TXT to build/apl_iso_ir_68.py and 
build/apl_iso_ir_68.mapping
converting ATARIST.TXT to build/atarist.py and build/atarist.mapping
converting CP1006.TXT to build/cp1006.py and build/cp1006.mapping
converting CP424.TXT to build/cp424.py and build/cp424.mapping
Traceback (most recent call last):
  File "gencodec.py", line 421, in <module>
    convertdir(*sys.argv[1:])
  File "gencodec.py", line 391, in convertdir
    pymap(mappathname, map, dirprefix + codefile,name,comments)
  File "gencodec.py", line 355, in pymap
    code = codegen(name,map,encodingname,comments)
  File "gencodec.py", line 268, in codegen
    precisions=(4, 2))
  File "gencodec.py", line 152, in python_mapdef_code
    mappings = sorted(map.items())
TypeError: unorderable types: NoneType() < int()

It does appear to have been updated for 3.x:

$ python2.7 gencodec.py MAPPINGS/VENDORS/MISC/ build/
Traceback (most recent call last):
  File "gencodec.py", line 35, in <module>
    UNI_UNDEFINED = chr(0xFFFE)
ValueError: chr() arg not in range(256)
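
The first failure can be reproduced in isolation on Python 3. This is a minimal sketch, assuming a small made-up dictionary in place of the map that gencodec.py builds from a mapping file. Mixing None and int keys makes the items unorderable, and replacing None with an integer sentinel makes them sortable again:

```python
# Hypothetical stand-in for a parsed mapping: keys are encoding codes,
# with None marking a missing code.
mapping = {0x41: 0x0041, None: 0x0042, 0x43: 0x0043}

try:
    sorted(mapping.items())
    outcome = "sorted"
except TypeError:
    # Python 3 refuses to order None against an int.
    outcome = "TypeError"
print(outcome)  # TypeError

# With an integer sentinel every key is an int, so sorting works.
MISSING_CODE = -1
fixed = {MISSING_CODE if k is None else k: v for k, v in mapping.items()}
ordered = sorted(fixed.items())
print(ordered)  # [(-1, 66), (65, 65), (67, 67)]
```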

--
components: Demos and Tools
messages: 122549
nosy: belopolsky, lemburg
priority: normal
severity: normal
status: open
title: Tools/unicode/gencodec.py error
type: behavior
versions: Python 3.2




[issue10552] Tools/unicode/gencodec.py error

2010-11-27 Thread Alexander Belopolsky

Alexander Belopolsky belopol...@users.sourceforge.net added the comment:

Attached patch addresses the issue by using -1 instead of None for missing 
codes.  Comparison of generated encoding files to those in Lib/encodings shows 
only whitespace changes except one which appears to be a change on the 
unicode.org side:


diff -b build/koi8_u.py ../../Lib/encodings/koi8_u.py
1c1
< """ Python Character Mapping Codec koi8_u generated from 'MAPPINGS/VENDORS/MISC/KOI8-U.TXT' with gencodec.py.
---
> """ Python Character Mapping Codec koi8_u generated from 'python-mappings/KOI8-U.TXT' with gencodec.py.
221c221
<     '\u0491'   #  0xAD -> CYRILLIC SMALL LETTER GHE WITH UPTURN
---
>     '\u0491'   #  0xAD -> CYRILLIC SMALL LETTER UKRAINIAN GHE WITH UPTURN
237c237
<     '\u0490'   #  0xBD -> CYRILLIC CAPITAL LETTER GHE WITH UPTURN
---
>     '\u0490'   #  0xBD -> CYRILLIC CAPITAL LETTER UKRAINIAN GHE WITH UPTURN
308d307
<

--
keywords: +patch
nosy: +loewis
Added file: http://bugs.python.org/file19842/issue10552.diff




[issue10552] Tools/unicode/gencodec.py error

2010-11-27 Thread Marc-Andre Lemburg

Marc-Andre Lemburg m...@egenix.com added the comment:

Alexander Belopolsky wrote:
>
> Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
>
> Attached patch addresses the issue by using -1 instead of None for missing
> codes.  Comparison of generated encoding files to those in Lib/encodings
> shows only whitespace changes except one which appears to be a change on the
> unicode.org side:

Please use a global constant instead of the literal -1, e.g. MISSING_CODE.
Thanks.

> diff -b build/koi8_u.py ../../Lib/encodings/koi8_u.py
> 1c1
> < """ Python Character Mapping Codec koi8_u generated from 'MAPPINGS/VENDORS/MISC/KOI8-U.TXT' with gencodec.py.
> ---
> > """ Python Character Mapping Codec koi8_u generated from 'python-mappings/KOI8-U.TXT' with gencodec.py.
> 221c221
> <     '\u0491'   #  0xAD -> CYRILLIC SMALL LETTER GHE WITH UPTURN
> ---
> >     '\u0491'   #  0xAD -> CYRILLIC SMALL LETTER UKRAINIAN GHE WITH UPTURN
> 237c237
> <     '\u0490'   #  0xBD -> CYRILLIC CAPITAL LETTER GHE WITH UPTURN
> ---
> >     '\u0490'   #  0xBD -> CYRILLIC CAPITAL LETTER UKRAINIAN GHE WITH UPTURN
> 308d307
> <
>

That's just a comment and doesn't change the semantics of the codec.
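
As a quick sanity check (a sketch, not something that was run as part of this issue), the two bytes named in the diff can be decoded with the koi8_u codec that ships with Python. The expected code points are taken from the diff itself. Only the comment text differs between the two generated files, so the decoded characters are unaffected:

```python
import unicodedata

# Byte values and expected code points as they appear in the diff.
expected = {0xAD: "\u0491", 0xBD: "\u0490"}

decoded = {}
for byte, char in expected.items():
    decoded[byte] = bytes([byte]).decode("koi8_u")
    # Print the code point and the name known to this Python's Unicode data.
    print(hex(byte), "->", "U+%04X" % ord(decoded[byte]),
          unicodedata.name(decoded[byte]))
```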

--




[issue10552] Tools/unicode/gencodec.py error

2010-11-27 Thread Ezio Melotti

Changes by Ezio Melotti ezio.melo...@gmail.com:


--
nosy: +ezio.melotti




[issue10552] Tools/unicode/gencodec.py error

2010-11-27 Thread Alexander Belopolsky

Alexander Belopolsky belopol...@users.sourceforge.net added the comment:

Attached patch uses MISSING_CODE as Marc suggested.  There are still errors
apparently because parsecodes() may return either an int or a tuple.  I think 
only mac encodings are affected, so I would like to commit the current patch 
before tackling this issue. 

$ ../../python.exe  gencodec.py MAPPINGS/VENDORS/APPLE/ build/ mac_
converting ARABIC.TXT to build/mac_arabic.py and build/mac_arabic.mapping
converting CELTIC.TXT to build/mac_celtic.py and build/mac_celtic.mapping
converting CENTEURO.TXT to build/mac_centeuro.py and build/mac_centeuro.mapping
converting CHINSIMP.TXT to build/mac_chinsimp.py and build/mac_chinsimp.mapping
Traceback (most recent call last):
  File "gencodec.py", line 424, in <module>
    convertdir(*sys.argv[1:])
  File "gencodec.py", line 394, in convertdir
    pymap(mappathname, map, dirprefix + codefile,name,comments)
  File "gencodec.py", line 358, in pymap
    code = codegen(name,map,encodingname,comments)
  File "gencodec.py", line 271, in codegen
    precisions=(4, 2))
  File "gencodec.py", line 155, in python_mapdef_code
    mappings = sorted(map.items())
TypeError: unorderable types: tuple() < int()
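
To make the int-versus-tuple problem concrete, here is a simplified sketch of parsecodes(), reconstructed from the hunks in issue10552a.diff rather than copied from the real script. The sample code strings and the as_tuple() key function are hypothetical. The key function is only one possible way to order mixed results, not a fix that was agreed on in this issue:

```python
MISSING_CODE = -1

def parsecodes(codes):
    """Return an int for one code, a tuple for several, MISSING_CODE for none."""
    if not codes:
        return MISSING_CODE
    parts = codes.split('+')
    if len(parts) == 1:
        return int(parts[0], 16)
    values = []
    for part in parts:
        try:
            values.append(int(part, 16))
        except ValueError:
            # Unparseable pieces are marked and filtered out below.
            values.append(MISSING_CODE)
    values = [v for v in values if v != MISSING_CODE]
    if len(values) == 1:
        return values[0]
    return tuple(values)

single = parsecodes("0x0041")         # one code point -> int
multi = parsecodes("0x0041+0x0301")   # a sequence -> tuple

try:
    sorted([multi, single])
    mixed_sort = "sorted"
except TypeError:
    mixed_sort = "TypeError"

def as_tuple(code):
    # Hypothetical normalization: compare everything as a tuple.
    return code if isinstance(code, tuple) else (code,)

ordered = sorted([multi, single], key=as_tuple)
print(single, multi, mixed_sort, ordered)
```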

--
stage:  -> commit review
Added file: http://bugs.python.org/file19843/issue10552a.diff




[issue10552] Tools/unicode/gencodec.py error

2010-11-27 Thread Alexander Belopolsky

Alexander Belopolsky belopol...@users.sourceforge.net added the comment:

Please ignore Makefile changes in the patch.

--
