[issue30586] Encode to EBCDIC doesn't take into account conversion table irregularities
Vladimir Filippov added the comment:

According to ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/EBCDIC/CP037.TXT, the symbols [ and ] have different codes (instead of 0xAD and 0xBD):

0xBA 0x005B #LEFT SQUARE BRACKET
0xBB 0x005D #RIGHT SQUARE BRACKET

It looks like ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/EBCDIC/CP500.TXT was created based on https://www.ibm.com/support/knowledgecenter/SSZJPZ_11.3.0/com.ibm.swg.im.iis.ds.parjob.adref.doc/topics/r_deeadvrf_ASCII_to_EBCDIC.html, but the following note from that page was ignored:

"This translation is not bidirectional. Some EBCDIC characters cannot be translated to ASCII and some conversion irregularities exist in the table. For more information, see Conversion table irregularities."

Additionally, this line from CP500.TXT has no source in IBM's table:

0xBB 0x007C #VERTICAL LINE

Example from a z/OS mainframe:

---
bash-4.3$ iconv -f 819 -t 1047 -T ascii.txt > ebcdic.txt
bash-4.3$ ls -T *.txt
t ISO8859-1 T=on ascii.txt
t IBM-1047 T=on ebcdic.txt
bash-4.3$ cat ascii.txt
![]|bash-4.3$ od -h ascii.txt
0021 5B 5D 7C 04
bash-4.3$ cat ebcdic.txt
![]|bash-4.3$ od -h ebcdic.txt
005A AD BD 4F 04
---

--
status: pending -> open

___
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue30586>
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
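To make the difference between the two mapping files concrete, here is a minimal sketch that encodes the same four characters with Python's built-in cp500 and cp037 codecs. The helper name ebcdic_hex is hypothetical and exists only for this illustration. The sketch shows what the codecs currently produce and makes no claim about which table is right.

```python
# Compare how two built-in EBCDIC codecs encode the characters from the report.
SAMPLE = '![]|'


def ebcdic_hex(text, codec):
    """Return the hex string of `text` encoded with `codec` (hypothetical helper)."""
    return text.encode(codec).hex()


for codec in ('cp500', 'cp037'):
    print(codec, ebcdic_hex(SAMPLE, codec))

# cp500 -> 4f4a5abb (the output quoted in the original report)
# cp037 -> 5ababb4f ([ and ] at 0xBA and 0xBB, as listed in CP037.TXT)
```

Neither result matches the 5A AD BD 4F sequence that iconv produced for IBM-1047 on z/OS: cp037 agrees on ! and | but not on the brackets, and cp500 differs on all four characters.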
[issue30586] Encode to EBCDIC doesn't take into account conversion table irregularities
New submission from Vladimir Filippov:

These 4 symbols are encoded incorrectly to EBCDIC (codec cp500): "![]|". The correct conversion table for these symbols is described in https://www.ibm.com/support/knowledgecenter/SSZJPZ_11.3.0/com.ibm.swg.im.iis.ds.parjob.adref.doc/topics/r_deeadvrf_Conversion_Table_Irregularities.html

This code:

ascii = '![]|'
print("ASCII: " + bytes(ascii, 'ascii').hex())
res = ascii.encode('cp500')
print("EBCDIC: " + res.hex())

on Python 3.6.1 produces this output:

ASCII: 215b5d7c
EBCDIC: 4f4a5abb

Expected encoding (from IBM's table):

! - 5A
[ - AD
] - BD
| - 4F

Workaround: apply this translation table to the bytes after encoding:

bytes.maketrans(b'\x4F\x4A\x5A\xBB', b'\x5A\xAD\xBD\x4F')

--
components: Unicode
messages: 295329
nosy: Vladimir Filippov, ezio.melotti, haypo
priority: normal
severity: normal
status: open
title: Encode to EBCDIC doesn't take into account conversion table irregularities
type: behavior
versions: Python 3.6
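The workaround from this report could be wrapped up as follows. This is only a sketch: the names CP500_FIXUP and encode_ebcdic_fixed are hypothetical, and the translation table is the one given in the report rather than a complete code page definition.

```python
# Translation table from the report: maps the bytes cp500 produces for
# "![]|" onto the bytes expected according to IBM's table.
CP500_FIXUP = bytes.maketrans(b'\x4F\x4A\x5A\xBB', b'\x5A\xAD\xBD\x4F')


def encode_ebcdic_fixed(text):
    """Encode with cp500, then remap the four affected bytes (hypothetical helper)."""
    return text.encode('cp500').translate(CP500_FIXUP)


print(encode_ebcdic_fixed('![]|').hex())  # 5aadbd4f
```

The table is not a permutation: after translation, [ and ] share 0xAD and 0xBD with whatever characters cp500 already places at those positions. The result is therefore only unambiguous for text that does not contain those other characters.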