Update of /cvsroot/tmda/tmda/TMDA/pythonlib/email
In directory usw-pr-cvs1:/tmp/cvs-serv28980/TMDA/pythonlib/email

Modified Files:
        Charset.py 
Log Message:
Integrate an important portion of Ben Gertzfield's patch until he gets
his act together and commits it to the email package core.

These changes add proper multibyte-aware support for Korean (EUC-KR,
CP949 aka ks_c_5601-1987 [our favorite spam charset!], ISO-2022-KR,
and Johab) email based on the Korean codecs from the koco project:

http://sf.net/projects/koco/

Before, Korean was treated like any other unknown 8-bit character set,
and wrapped header lines could be split in the middle of a multi-byte
character, corrupting the header.
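
The failure mode can be reproduced in a few lines. This is only a
sketch, not the code from the patch: it assumes Python 3, whose
standard library ships its own "euc_kr" codec (the 2002 code relied on
the koco codecs instead), and the helper names naive_split,
char_aware_split and decodes_cleanly are made up for illustration.

```python
# Assumption: Python 3 with the built-in "euc_kr" codec, not the koco package.
text = "한국어 메일 제목"        # sample Korean header text
raw = text.encode("euc_kr")     # each Hangul syllable becomes 2 bytes


def naive_split(data, width):
    """Split bytes every `width` bytes, ignoring character boundaries."""
    return [data[i:i + width] for i in range(0, len(data), width)]


def char_aware_split(s, codec, width):
    """Split so every chunk holds whole characters and at most `width` bytes."""
    chunks, current = [], b""
    for ch in s:
        encoded = ch.encode(codec)
        # Start a new chunk rather than cut a character in half.
        if current and len(current) + len(encoded) > width:
            chunks.append(current)
            current = b""
        current += encoded
    if current:
        chunks.append(current)
    return chunks


def decodes_cleanly(chunk, codec):
    """Return True if the chunk is valid on its own in the given codec."""
    try:
        chunk.decode(codec)
        return True
    except UnicodeDecodeError:
        return False


naive_ok = [decodes_cleanly(c, "euc_kr") for c in naive_split(raw, 5)]
aware_chunks = char_aware_split(text, "euc_kr", 5)
aware_ok = [decodes_cleanly(c, "euc_kr") for c in aware_chunks]
```

With an odd width such as 5, the naive split cuts through a two-byte
syllable, so at least one chunk no longer decodes; the character-aware
split keeps every chunk valid. The per-character approach shown here
suits stateless encodings like EUC-KR; a stateful one such as
ISO-2022-KR would need different handling.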

In addition, the patch explicitly lists all known ASCII-like ISO-8859
variants so that quoted-printable is suggested for headers and bodies,
which makes messages in languages written mostly in Roman characters,
such as Turkish, work.  The up-and-coming iso-8859-15 (Euro support)
character set will also be encoded with quoted-printable with this patch.
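
To see why quoted-printable suits these charsets, compare it with
base64 on a mostly-ASCII Turkish string. This is a rough illustration
using Python 3's standard quopri and base64 modules rather than the
email package itself; the sample sentence is an arbitrary choice.

```python
import base64
import quopri

# Turkish text: mostly ASCII letters plus a few iso-8859-9 specific ones.
text = "Merhaba dünya, nasılsın?"
raw = text.encode("iso-8859-9")

# Quoted-printable escapes only the non-ASCII bytes as =XX.
qp = quopri.encodestring(raw).decode("ascii")

# Base64 re-encodes every byte, so nothing stays readable.
b64 = base64.b64encode(raw).decode("ascii")

# Both encodings are lossless; only the readability differs.
qp_roundtrip = quopri.decodestring(qp.encode("ascii"))
b64_roundtrip = base64.b64decode(b64)
```

Printing qp and b64 side by side shows the difference: the ASCII
letters survive untouched in the quoted-printable form, while the
base64 form is opaque. For charsets such as Greek or Hebrew, where
nearly every byte is non-ASCII, quoted-printable loses this advantage,
which matches the comments in the table below.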


Index: Charset.py
===================================================================
RCS file: /cvsroot/tmda/tmda/TMDA/pythonlib/email/Charset.py,v
retrieving revision 1.4
retrieving revision 1.5
diff -u -r1.4 -r1.5
--- Charset.py  14 Oct 2002 22:58:11 -0000      1.4
+++ Charset.py  18 Oct 2002 23:24:32 -0000      1.5
@@ -35,6 +35,20 @@
     # input        header enc  body enc output conv
     'iso-8859-1':  (QP,        QP,      None),
     'iso-8859-2':  (QP,        QP,      None),
+    'iso-8859-3':  (QP,        QP,      None),
+    'iso-8859-4':  (QP,        QP,      None),
+    # iso-8859-5 is Cyrillic, and not especially used
+    # iso-8859-6 is Arabic, also not particularly used
+    # iso-8859-7 is Greek, QP will not make it readable
+    # iso-8859-8 is Hebrew, QP will not make it readable
+    'iso-8859-9':  (QP,        QP,      None),
+    'iso-8859-10': (QP,        QP,      None),
+    # iso-8859-11 is Thai, QP will not make it readable
+    'iso-8859-13': (QP,        QP,      None),
+    'iso-8859-14': (QP,        QP,      None),
+    'iso-8859-15': (QP,        QP,      None),
+    'windows-1252':(QP,        QP,      None),
+    'viscii':      (QP,        QP,      None),
     'us-ascii':    (None,      None,    None),
     'big5':        (BASE64,    BASE64,  None),
     'gb2312':      (BASE64,    BASE64,  None),
@@ -52,6 +66,25 @@
 ALIASES = {
     'latin_1': 'iso-8859-1',
     'latin-1': 'iso-8859-1',
+    'latin_2': 'iso-8859-2',
+    'latin-2': 'iso-8859-2',
+    'latin_3': 'iso-8859-3',
+    'latin-3': 'iso-8859-3',
+    'latin_4': 'iso-8859-4',
+    'latin-4': 'iso-8859-4',
+    'latin_5': 'iso-8859-9',
+    'latin-5': 'iso-8859-9',
+    'latin_6': 'iso-8859-10',
+    'latin-6': 'iso-8859-10',
+    'latin_7': 'iso-8859-13',
+    'latin-7': 'iso-8859-13',
+    'latin_8': 'iso-8859-14',
+    'latin-8': 'iso-8859-14',
+    'latin_9': 'iso-8859-15',
+    'latin-9': 'iso-8859-15',
+    'cp949':   'ks_c_5601-1987',
+    'euc_jp':  'euc-jp',
+    'euc_kr':  'euc-kr',
     'ascii':   'us-ascii',
     }
 
@@ -69,6 +102,10 @@
     'euc-jp':      'japanese.euc-jp',
     'iso-2022-jp': 'japanese.iso-2022-jp',
     'shift_jis':   'japanese.shift_jis',
+    'euc-kr':      'korean.euc-kr',
+    'ks_c_5601-1987': 'korean.cp949',
+    'iso-2022-kr': 'korean.iso-2022-kr',
+    'johab':       'korean.johab',
     'gb2132':      'eucgb2312_cn',
     'big5':        'big5_tw',
     'utf-8':       'utf-8',

_______________________________________
tmda-cvs mailing list
http://tmda.net/lists/listinfo/tmda-cvs
