[issue24339] iso6937 encoding missing

2021-06-29 Thread Maarten Derickx
Maarten Derickx added the comment: Hi Marc-Andre Lemburg, Thanks for your responses and guidance. Your pointers to charmap_encode and charmap_decode helped, since they show the general idea of how to deal with these types of encodings. In the meantime I did produce

[issue24339] iso6937 encoding missing

2021-06-29 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Right, the charmap codec was built with the Unicode Consortium mappings in mind. However, you may have some luck decoding the two byte chars in ISO 6937 using combining code points in Unicode. With some extra post processing you could also normalize the
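A minimal sketch of the combining-code-point idea, not the codec attached to this issue. It assumes the ISO 6937 convention that a non-spacing diacritical byte precedes its base letter; the table is an illustrative subset, and the name `decode_with_combining` and the ASCII-only handling of all other bytes are hypothetical simplifications.

```python
import unicodedata

# Illustrative subset only; a real table would cover every mark in the standard.
ISO6937_COMBINING = {
    0xC1: "\u0300",  # COMBINING GRAVE ACCENT
    0xC2: "\u0301",  # COMBINING ACUTE ACCENT
    0xC3: "\u0302",  # COMBINING CIRCUMFLEX ACCENT
    0xC8: "\u0308",  # COMBINING DIAERESIS
    0xCF: "\u030C",  # COMBINING CARON
}


def decode_with_combining(data: bytes) -> str:
    """Decode mark+base byte pairs, then compose them with NFC."""
    out = []
    i = 0
    while i < len(data):
        byte = data[i]
        if byte in ISO6937_COMBINING and i + 1 < len(data):
            # Unicode wants the base letter first, then the combining mark.
            # Assumption: the base letter is plain ASCII.
            out.append(chr(data[i + 1]) + ISO6937_COMBINING[byte])
            i += 2
        else:
            # Placeholder: only correct for the ASCII range in this sketch.
            out.append(chr(byte))
            i += 1
    # The post-processing step: fold base + mark into a precomposed character.
    return unicodedata.normalize("NFC", "".join(out))
```

Calling `decode_with_combining(b"caf\xc2e")` yields the precomposed form of "café" after normalization.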

[issue24339] iso6937 encoding missing

2021-06-29 Thread Maarten Derickx
Maarten Derickx added the comment: The route via gencodec or more generally via charmap_encode and charmap_decode seems to be one that is not possible without some low level CPython code adjustments. The reason for this is that charmap_encode and charmap_decode only seem to support mappings
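To illustrate the kind of limitation being discussed (a sketch, not a statement of what the rest of this comment says): `codecs.charmap_decode` looks up each input byte on its own, so a mark byte and the letter that follows it cannot be treated as one unit. The two-entry table below is hypothetical.

```python
import codecs

# Hypothetical table: 0xC2 is taken to be the ISO 6937 acute accent byte.
table = {
    0x65: "e",
    0xC2: "\u0301",  # COMBINING ACUTE ACCENT
}

# Each byte is mapped independently, in input order.
text, consumed = codecs.charmap_decode(b"\xc2e", "strict", table)

# The mark comes out before the letter, which is the wrong order for Unicode,
# so some reordering step outside charmap_decode would still be needed.
print(ascii(text), consumed)
```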

[issue24339] iso6937 encoding missing

2021-06-24 Thread Maarten Derickx
Maarten Derickx added the comment: Hi Marc-Andre Lemburg, Thanks for your reply. I tried using gencodec.py as could be downloaded from https://github.com/python/cpython/blob/main/Tools/unicode/gencodec.py as you mentioned. However the code in gencodec.py seems to be in a much worse shape th

[issue24339] iso6937 encoding missing

2021-06-23 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Maarten, the code posted on bugs is copyrighted by the person who wrote it. We can only accept it for inclusion in Python after the CLA has been signed, since then we are allowed to relicense it. As a result you can only take John's code and post it else

[issue24339] iso6937 encoding missing

2021-06-22 Thread Maarten Derickx
Maarten Derickx added the comment: Is there any way to contact John Helour? I would still very much like to put this package on github and pypi. And would like to ask him permission for licensing. Or is there some standard open source license under which all code uploaded to https://bugs.pyt

[issue24339] iso6937 encoding missing

2019-05-06 Thread Julien Palard
Julien Palard added the comment: For the moment, I'm closing this issue: as there's no activity on it, I suspect it may not be that useful. I may be wrong, so if someone actually needs this, don't hesitate either to put it as a package on PyPI (it should probably go there anyway), or to re

[issue24339] iso6937 encoding missing

2017-05-15 Thread Xiang Zhang
Xiang Zhang added the comment: Would you mind converting this patch to a GitHub PR, John? -- stage: needs patch -> patch review

[issue24339] iso6937 encoding missing

2017-02-19 Thread Julien Palard
Julien Palard added the comment: John: You should probably package this as a pip-installable module alongside a git repository, at least to measure the number of interested people, and get some feedback / contributions.

[issue24339] iso6937 encoding missing

2016-12-04 Thread Julien Palard
Julien Palard added the comment: LGTM. For me it's time to release it as a package on PyPI to check the adoption rate, see if it's worth adding to Python, and maybe close this issue.

[issue24339] iso6937 encoding missing

2016-12-04 Thread John Helour
John Helour added the comment: Performance issue resolved, more info on errors added. I've checked encoding and decoding on two UTF-8 ~3MiB txt files. Except for the leading BOM (may I ignore it?), all seems to work OK.

[issue24339] iso6937 encoding missing

2016-12-04 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: -- assignee: -> serhiy.storchaka priority: normal -> low

[issue24339] iso6937 encoding missing

2016-12-04 Thread John Helour
Changes by John Helour: Added file: http://bugs.python.org/file45750/iso6937.py

[issue24339] iso6937 encoding missing

2016-12-04 Thread John Helour
Changes by John Helour: Removed file: http://bugs.python.org/file45740/iso6937.py

[issue24339] iso6937 encoding missing

2016-12-04 Thread John Helour
Changes by John Helour: Added file: http://bugs.python.org/file45749/check_iso6937.py

[issue24339] iso6937 encoding missing

2016-12-03 Thread John Helour
Changes by John Helour: Added file: http://bugs.python.org/file45740/iso6937.py

[issue24339] iso6937 encoding missing

2016-12-03 Thread John Helour
Changes by John Helour: Removed file: http://bugs.python.org/file45708/iso6937.py

[issue24339] iso6937 encoding missing

2016-11-30 Thread John Helour
John Helour added the comment: Please ignore my previous question about: tmp += bytearray(encoding_map[c], 'latin1', 'ignore') The latest version doesn't need such encoding ... -- Added file: http://bugs.python.org/file45708/iso6937.py

[issue24339] iso6937 encoding missing

2016-11-30 Thread John Helour
Changes by John Helour: Removed file: http://bugs.python.org/file45707/iso6937.py

[issue24339] iso6937 encoding missing

2016-11-30 Thread John Helour
Changes by John Helour: Added file: http://bugs.python.org/file45707/iso6937.py

[issue24339] iso6937 encoding missing

2016-11-30 Thread John Helour
Changes by John Helour: Added file: http://bugs.python.org/file45706/iso6937.py

[issue24339] iso6937 encoding missing

2016-11-30 Thread John Helour
Changes by John Helour: Removed file: http://bugs.python.org/file45706/iso6937.py

[issue24339] iso6937 encoding missing

2016-11-30 Thread John Helour
Changes by John Helour: Removed file: http://bugs.python.org/file45697/iso6937.py

[issue24339] iso6937 encoding missing

2016-11-29 Thread John Helour
John Helour added the comment: > Please also check whether it's not possible to reuse the charmap codec > functions we have I've found nothing useful, maybe you (as the author) can find something really useful which can improve code readability or increase the performance. Please look at the

[issue24339] iso6937 encoding missing

2016-11-28 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: The codec code has a few (performance) issues: * nonspacing_diacritical_marks should be a set for fast lookup * ord(c) in range(0x00, 0xA0) should be rewritten using < and >= * result += bytes([ord(c)]) has exponential timing (it copies the whole bytes
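The three suggestions above could look roughly like this. It is a sketch under assumptions, not the attached codec: the set contents (bytes 0xC1 to 0xCF), the helper names, and the pass-through behaviour are all hypothetical.

```python
# Assumed range for the non-spacing marks; a set gives constant-time lookup.
NONSPACING_DIACRITICAL_MARKS = frozenset(range(0xC1, 0xD0))


def is_nonspacing(byte: int) -> bool:
    """Membership test against a set instead of scanning a list."""
    return byte in NONSPACING_DIACRITICAL_MARKS


def encode_low_range(text: str) -> bytes:
    """Copy code points below 0xA0 unchanged (hypothetical helper)."""
    result = bytearray()  # mutable buffer: appending does not recopy the data
    for index, char in enumerate(text):
        code = ord(char)
        if 0x00 <= code < 0xA0:  # plain comparisons instead of range()
            result.append(code)
        else:
            raise UnicodeEncodeError(
                "iso6937-sketch", text, index, index + 1,
                "code point outside the sketched range",
            )
    return bytes(result)  # convert once at the end
```

The point of the `bytearray` is that `result += bytes([...])` on an immutable `bytes` object copies everything accumulated so far on each iteration, whereas `append` on a `bytearray` does not.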

[issue24339] iso6937 encoding missing

2016-11-26 Thread John Helour
John Helour added the comment: If I take the ISO_6937 file as a template for encoding table then increasing the range 0x20..0x7f to 0x00..0xA0 is the simplest solution. -- Added file: http://bugs.python.org/file45654/iso6937.py

[issue24339] iso6937 encoding missing

2016-11-26 Thread John Helour
Changes by John Helour: Removed file: http://bugs.python.org/file45647/iso6937.py

[issue24339] iso6937 encoding missing

2016-11-26 Thread Julien Palard
Julien Palard added the comment: According to https://webstore.iec.ch/preview/info_isoiec6937%7Bed3.0%7Den.pdf: > NOTE: The shaded positions 00/00 to 01/15 and 07/15 to 09/15 are outside the > scope of this International Standard. So it's clear to me that they are not undefined, they are just
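For readers unfamiliar with the column/row notation in the quoted note, a small sketch of how such positions map to byte values. The helper name `position_to_byte` is hypothetical; the arithmetic is the usual 16-rows-per-column layout of code tables.

```python
def position_to_byte(position: str) -> int:
    """Convert a 'column/row' code-table position such as '07/15' to a byte."""
    column, row = (int(part) for part in position.split("/"))
    return column * 16 + row


# The shaded ranges from the note, expressed as byte values.
for start, end in (("00/00", "01/15"), ("07/15", "09/15")):
    print(f"{start}..{end} -> "
          f"0x{position_to_byte(start):02X}..0x{position_to_byte(end):02X}")
```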

[issue24339] iso6937 encoding missing

2016-11-25 Thread John Helour
John Helour added the comment: @mdk Big thanks for the checker. > Looks like your implementation is missing some codepoints, like "\t": > >>> print("\t".encode(encoding='iso6937')) > [...] > UnicodeError: encoding with 'iso6937'

[issue24339] iso6937 encoding missing

2016-11-25 Thread John Helour
John Helour added the comment: PEP 8 compliant, added missing codepoints, UTF-8 literals rewritten as \u escapes -- Added file: http://bugs.python.org/file45647/iso6937.py

[issue24339] iso6937 encoding missing

2016-11-14 Thread STINNER Victor
STINNER Victor added the comment: Ok. I'm now waiting for a simpler patch reusing existing charmap functions to see the complexity of the codec ;-)

[issue24339] iso6937 encoding missing

2016-11-14 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: > My rule is more to only add encodings used (in practice) as locale > encodings. Just for reference: issue19459, issue21081, issue22679, issue20087. > @Serhiy: Do you think that the encoding is popular enough to pay the price of its maintenance? Yes, it

[issue24339] iso6937 encoding missing

2016-11-14 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: On 14.11.2016 13:03, STINNER Victor wrote: > > STINNER Victor added the comment: > > iso6937.py: > >> # from utf-8 to iso6937 >> def iso6937_encode(input,errors,encoding_map): > > Wait, is this code for Python 3? Decode from UTF-8 and encode to ISO-6937 i

[issue24339] iso6937 encoding missing

2016-11-14 Thread Julien Palard
Julien Palard added the comment: @Serhiy @haypo: Popular enough or not, it may start as a lib on PyPI; we'll see its usage from there.

[issue24339] iso6937 encoding missing

2016-11-14 Thread STINNER Victor
STINNER Victor added the comment: @Serhiy: Do you think that the encoding is popular enough to pay the price of its maintenance? It's already possible to manually register a new encoding in an application. It was even already possible in Python 2.7 (and older).
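As a sketch of what manual registration in an application can look like: the codec below is a stand-in that simply delegates to ASCII, and the name `demoascii` and the helper functions are hypothetical. Only `codecs.register` and `codecs.CodecInfo` are the real standard-library hooks.

```python
import codecs


def _encode(text, errors="strict"):
    # A codec's encode function returns (bytes, number of characters consumed).
    return text.encode("ascii", errors), len(text)


def _decode(data, errors="strict"):
    # A codec's decode function returns (str, number of bytes consumed).
    return bytes(data).decode("ascii", errors), len(data)


def _search(name):
    """Search function: return a CodecInfo for known names, else None."""
    if name == "demoascii":
        return codecs.CodecInfo(name="demoascii", encode=_encode, decode=_decode)
    return None


# After this call, str.encode / bytes.decode can find the codec by name.
codecs.register(_search)

print("abc".encode("demoascii"))
print(b"abc".decode("demoascii"))
```

A real ISO 6937 package would replace the two delegating functions with its own table-driven logic and keep the same registration step.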

[issue24339] iso6937 encoding missing

2016-11-14 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I think the encoder can just use codecs.charmap_encode(). It seems the decoder could be simpler too. It would be nice to generate the ISO 6937 encoding file from external data (e.g. from glibc localedata), as is done for other encodings. Take Tools/un
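A rough sketch of the encoder side of this suggestion, assuming precomposed input characters. `codecs.charmap_encode` accepts a mapping from code points to an int or a bytes object, so one character can expand to a mark byte plus a base letter. The four-entry map is hypothetical, and 0xC2 is assumed to be the acute accent byte.

```python
import codecs

# Hypothetical, tiny encoding map: code point -> byte value or byte sequence.
encoding_map = {
    0x61: 0x61,       # a
    0x63: 0x63,       # c
    0x66: 0x66,       # f
    0xE9: b"\xc2e",   # e with acute -> assumed acute byte followed by "e"
}

data, consumed = codecs.charmap_encode("caf\u00e9", "strict", encoding_map)
print(data, consumed)
```

This only covers text that is already in precomposed form; decomposed input (base letter followed by a combining mark) would need extra handling before the lookup.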

[issue24339] iso6937 encoding missing

2016-11-14 Thread STINNER Victor
STINNER Victor added the comment: iso6937.py: > # from utf-8 to iso6937 > def iso6937_encode(input,errors,encoding_map): Wait, is this code for Python 3? Decode from UTF-8 and encode to ISO-6937 in the same function seems strange to me. I expected that the codec only implements two functions:

[issue24339] iso6937 encoding missing

2016-11-14 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Just as reference, here's the Wikipedia page for the encoding: https://en.wikipedia.org/wiki/ISO/IEC_6937 and this is the ISO document (as preview): http://webstore.iec.ch/preview/info_isoiec6937%7Bed3.0%7Den.pdf (from the German Wikipedia page).

[issue24339] iso6937 encoding missing

2016-11-14 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Another comment about coding style: please use \u hex code representations for the decoding map. The stdlib source code is normally kept ASCII compatible and, for codecs, the Unicode code point numbers make it easier to check the mappings for correctne
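A short illustration of the requested style. The three entries are examples only and have not been checked against the standard here; the point is that the source stays ASCII and each code point number can be compared directly with a reference table.

```python
import unicodedata

# Illustrative entries written with \u escapes instead of literal characters.
decoding_map = {
    0xA1: "\u00A1",  # INVERTED EXCLAMATION MARK
    0xE1: "\u00C6",  # LATIN CAPITAL LETTER AE
    0xE9: "\u00D8",  # LATIN CAPITAL LETTER O WITH STROKE
}

# Printing the official character names makes the mappings easy to review.
for byte, char in sorted(decoding_map.items()):
    print(f"0x{byte:02X} -> U+{ord(char):04X} {unicodedata.name(char)}")
```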

[issue24339] iso6937 encoding missing

2016-11-13 Thread Xiang Zhang
Changes by Xiang Zhang: -- nosy: +xiang.zhang

[issue24339] iso6937 encoding missing

2016-11-13 Thread Julien
Julien added the comment: Hi John, thanks for your contribution. Looks like your implementation is missing some codepoints, like "\t": >>> print("\t".encode(encoding='iso6937')) [...] UnicodeError:

[issue24339] iso6937 encoding missing

2015-06-18 Thread John Helour
Changes by John Helour: Removed file: http://bugs.python.org/file39575/iso6937.py

[issue24339] iso6937 encoding missing

2015-06-05 Thread John Helour
Changes by John Helour: Added file: http://bugs.python.org/file39633/iso6937.py

[issue24339] iso6937 encoding missing

2015-06-05 Thread John Helour
Changes by John Helour: Removed file: http://bugs.python.org/file39632/iso6937.py

[issue24339] iso6937 encoding missing

2015-06-05 Thread John Helour
Changes by John Helour: Removed file: http://bugs.python.org/file39583/iso6937.py

[issue24339] iso6937 encoding missing

2015-06-05 Thread John Helour
Changes by John Helour: Removed file: http://bugs.python.org/file39631/iso6937.py

[issue24339] iso6937 encoding missing

2015-06-05 Thread John Helour
Changes by John Helour: Added file: http://bugs.python.org/file39632/iso6937.py

[issue24339] iso6937 encoding missing

2015-06-05 Thread John Helour
Changes by John Helour: Added file: http://bugs.python.org/file39631/iso6937.py

[issue24339] iso6937 encoding missing

2015-06-01 Thread John Helour
John Helour added the comment: I've rewritten the iso6937 codec for Python 3. Could someone check it please? -- Added file: http://bugs.python.org/file39583/iso6937.py

[issue24339] iso6937 encoding missing

2015-05-31 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: A new encoding can only be added in a new Python release (3.6). -- nosy: +lemburg, loewis, serhiy.storchaka versions: +Python 3.6 -Python 2.7

[issue24339] iso6937 encoding missing

2015-05-31 Thread John Helour
New submission from John Helour: Please add an encoding for the iso6937 charset. Many set-top boxes (DVB-T/S) and related devices use it for displaying EPG, videotext, etc. I wrote the encoding/decoding conversion codec some years ago (please look at the attached file). -- components: