[issue4487] Add utf8 alias for email charsets
Changes by Shashwat Anand anand.shash...@gmail.com: -- nosy: -l0nwlf ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4487 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue4487] Add utf8 alias for email charsets
Amaury Forgeot d'Arc amaur...@gmail.com added the comment: David, can this issue be closed? -- nosy: +amaury.forgeotdarc
[issue4487] Add utf8 alias for email charsets
R. David Murray rdmur...@bitdance.com added the comment: Yes. Benjamin merged this to py3k in r82292. If someone wants to explain to me how to cherry-pick the changeset into 3.1 I'd be happy to do it, otherwise I think I'm done with this one :) -- stage: commit review -> committed/rejected status: open -> closed
[issue4487] Add utf8 alias for email charsets
Marc-Andre Lemburg m...@egenix.com added the comment:

R. David Murray wrote:
> R. David Murray rdmur...@bitdance.com added the comment:
>
> For various reasons the email module has a table of character sets. What might be most effective would be for the email module to look a character set name up in the codecs module and find out the canonical name of the character set, and then look that up in its table (i.e., remove the aliases table from email completely, and instead depend on codecs to resolve the canonical name). Unfortunately the codecs module does not recognize all of the aliases used by email, nor is there necessarily any guarantee that the two modules will agree on the proper canonical name.

I think that the encodings package should be the only source of valid aliases and encoding names. After all, you wouldn't be able to process email content using names or aliases that do not appear in the encodings package tables.

If there are aliases missing, then we can add them there. If the email package needs different canonical names, it can apply its own map on the canonical names returned by the encodings package.

-- nosy: +lemburg
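One way to read the suggestion above, sketched in Python 3: let codecs.lookup() resolve whatever alias the caller passes to the codec's canonical name, then apply a small email-side map on top of that result. The EMAIL_NAMES table and the email_charset_name helper are hypothetical names made up for this illustration; they are not part of the email package, and the table contents are only examples.

```python
import codecs

# Hypothetical email-side map (an assumption for this sketch):
# codec canonical name -> name preferred in MIME headers.
EMAIL_NAMES = {
    "ascii": "us-ascii",
    "iso8859-1": "iso-8859-1",
}


def email_charset_name(name):
    """Resolve an alias through codecs, then apply the email-specific map."""
    # codecs.lookup() raises LookupError for names it does not know.
    canonical = codecs.lookup(name).name
    return EMAIL_NAMES.get(canonical, canonical)


print(email_charset_name("utf8"))
print(email_charset_name("latin1"))
```

With this layering the alias list lives in one place (the encodings package), and the email side only has to correct the handful of canonical names that differ from what mail clients expect.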
[issue4487] Add utf8 alias for email charsets
R. David Murray rdmur...@bitdance.com added the comment: Mark, any objection to my putting this patch in now, and then we'll fix the aliases implementation in 3.2? -- versions: +Python 3.1, Python 3.2 -Python 2.5
[issue4487] Add utf8 alias for email charsets
Marc-Andre Lemburg m...@egenix.com added the comment:

R. David Murray wrote:
> R. David Murray rdmur...@bitdance.com added the comment:
>
> Mark, any objection to my putting this patch in now, and then we'll fix the aliases implementation in 3.2?

No. Please open a new issue targeting Python 3.2 for this.

Thanks,
--
Marc-Andre Lemburg
eGenix.com
[issue4487] Add utf8 alias for email charsets
R. David Murray rdmur...@bitdance.com added the comment: Patch committed to trunk in r81705. Leaving issue open pending porting to the other branches, but I've also opened issue 8898 to further change things so that codecs becomes the sole authority for aliases in 3.2. -- resolution: -> fixed stage: patch review -> commit review
[issue4487] Add utf8 alias for email charsets
R. David Murray rdmur...@bitdance.com added the comment:

For various reasons the email module has a table of character sets. What might be most effective would be for the email module to look a character set name up in the codecs module and find out the canonical name of the character set, and then look that up in its table (i.e., remove the aliases table from email completely, and instead depend on codecs to resolve the canonical name). Unfortunately the codecs module does not recognize all of the aliases used by email, nor is there necessarily any guarantee that the two modules will agree on the proper canonical name.

The attached patch instead uses the codecs module as a fallback if the charset name does not appear in the email package's ALIASES or CHARSETS tables. It therefore makes both utf8 and utf_8 work, as well as all the other variants the codecs module accepts. The unit test just tests 'utf8', since if that one works all the others should too.

I'm tentatively reclassifying this as a bug rather than a feature request, since I think it is a reasonable expectation that email would support at least the same set of encoding names that the rest of Python does.

-- nosy: +r.david.murray
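The fallback described above can be sketched roughly as follows, in Python 3. This is not the attached patch: the ALIASES and CHARSETS tables here are small illustrative stand-ins for the email package's real tables, and resolve_charset is a hypothetical helper name. The only library call relied on is codecs.lookup(), whose result carries the codec's canonical name.

```python
import codecs

# Illustrative stand-ins for the email package's tables (assumed contents,
# not the real ones).
ALIASES = {"latin_1": "iso-8859-1", "latin-1": "iso-8859-1", "ascii": "us-ascii"}
CHARSETS = {"iso-8859-1", "us-ascii", "utf-8"}


def resolve_charset(name):
    """Prefer the email tables; fall back to codecs for unknown names."""
    name = name.lower()
    if name not in ALIASES and name not in CHARSETS:
        try:
            name = codecs.lookup(name).name
        except LookupError:
            pass  # unknown to codecs as well: keep the name as given
    return ALIASES.get(name, name)


for spelling in ("utf8", "utf_8", "UTF-8", "latin-1", "x-unknown"):
    print(spelling, "->", resolve_charset(spelling))
```

Because the email tables are consulted first, names the email package already knows keep their existing behaviour, and codecs only decides the spellings the tables have never heard of.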
[issue4487] Add utf8 alias for email charsets
Changes by R. David Murray rdmur...@bitdance.com: Added file: http://bugs.python.org/file17532/email_accept_codec_aliases.patch
[issue4487] Add utf8 alias for email charsets
Changes by R. David Murray rdmur...@bitdance.com: -- assignee: -> r.david.murray stage: -> patch review type: feature request -> behavior
[issue4487] Add utf8 alias for email charsets
Éric Araujo mer...@netwok.org added the comment: Idea: Import the aliases mapping from codecs and extend it with email-specific aliases. Alternate idea: Add email’s names to codecs. Side note: “charset” stands for “character encoding”, not “character set”. See http://www.w3.org/International/questions/qa-what-is-encoding#what -- nosy: +merwok
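The first idea could look something like the Python 3 sketch below. It assumes the mapping meant is encodings.aliases.aliases, which maps normalized alias names (lowercase, underscores) to codec module names such as utf_8, so the values are not MIME header names. The EMAIL_ONLY_ALIASES entry and the lookup_codec helper are hypothetical and exist only for illustration.

```python
from encodings.aliases import aliases as codec_aliases

# Hypothetical email-specific alias (an assumption for this sketch); keys use
# the same normalized form as encodings.aliases: lowercase with underscores.
EMAIL_ONLY_ALIASES = {"x_email_latin": "latin_1"}

# Copy the codecs table, then extend it; the original mapping is left untouched.
merged_aliases = dict(codec_aliases)
merged_aliases.update(EMAIL_ONLY_ALIASES)


def lookup_codec(name):
    """Return the codec module name for an alias, or the normalized name itself."""
    key = name.lower().replace("-", "_")
    return merged_aliases.get(key, key)


print(lookup_codec("utf8"))
print(lookup_codec("X-Email-Latin"))
```

A mapping from codec module names to the names used in mail headers would still be needed on top of this, which is the same extra step the other proposals in this thread call for.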
[issue4487] Add utf8 alias for email charsets
Shashwat Anand anand.shash...@gmail.com added the comment: MIMEText doesn't support unicode input. This is the reason the OP's test case failed. For reference: http://bugs.python.org/issue1368247 --
[issue4487] Add utf8 alias for email charsets
Shashwat Anand anand.shash...@gmail.com added the comment:

I tested it on Python 2.5, 2.6, 2.7 trunk and 3.2, varying msg.set_charset(x) with x = 'utf8' and 'utf-8'. Here are the results. Apparently Python 2.x had an issue with the test case and 3.2 passed, but I guess it is unrelated to the issue.

07:35:40 l0nwlf-MBP:~ $ python2.5
Python 2.5.4 (r254:67916, Jul 7 2009, 23:51:24)
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from email.MIMEText import MIMEText
>>> msg = MIMEText(u'\u043a\u0438\u0440\u0438\u043b\u0438\u0446\u0430')
>>> msg.set_charset('utf8')
>>> print msg.as_string()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/email/message.py", line 131, in as_string
    g.flatten(self, unixfrom=unixfrom)
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/email/generator.py", line 84, in flatten
    self._write(msg)
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/email/generator.py", line 109, in _write
    self._dispatch(msg)
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/email/generator.py", line 135, in _dispatch
    meth(msg)
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/email/generator.py", line 178, in _handle_text
    self._fp.write(payload)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-7: ordinal not in range(128)
>>> msg.set_charset('utf-8')
>>> print msg.as_string()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/email/message.py", line 131, in as_string
    g.flatten(self, unixfrom=unixfrom)
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/email/generator.py", line 84, in flatten
    self._write(msg)
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/email/generator.py", line 109, in _write
    self._dispatch(msg)
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/email/generator.py", line 135, in _dispatch
    meth(msg)
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/email/generator.py", line 178, in _handle_text
    self._fp.write(payload)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-7: ordinal not in range(128)

07:36:17 l0nwlf-MBP:~ $ python2.6
Python 2.6.5 (r265:79063, Apr 6 2010, 21:34:21)
[GCC 4.2.1 (Apple Inc. build 5646) (dot 1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from email.MIMEText import MIMEText
>>> msg = MIMEText(u'\u043a\u0438\u0440\u0438\u043b\u0438\u0446\u0430')
>>> msg.set_charset('utf8')
>>> print msg.as_string()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/email/message.py", line 135, in as_string
    g.flatten(self, unixfrom=unixfrom)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/email/generator.py", line 84, in flatten
    self._write(msg)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/email/generator.py", line 109, in _write
    self._dispatch(msg)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/email/generator.py", line 135, in _dispatch
    meth(msg)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/email/generator.py", line 178, in _handle_text
    self._fp.write(payload)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-7: ordinal not in range(128)
>>> msg.set_charset('utf-8')
>>> print msg.as_string()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/email/message.py", line 135, in as_string
    g.flatten(self, unixfrom=unixfrom)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/email/generator.py", line 84, in flatten
    self._write(msg)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/email/generator.py", line 109, in _write
    self._dispatch(msg)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/email/generator.py", line 135, in _dispatch
    meth(msg)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/email/generator.py", line 178, in _handle_text
    self._fp.write(payload)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-7: ordinal not in range(128)

07:36:37 l0nwlf-MBP:~ $ python2.7
Python 2.7a4+ (trunk:78750, Mar 7 2010, 08:09:00)
[GCC 4.2.1 (Apple Inc. build 5646) (dot 1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from email.MIMEText import MIMEText
>>> msg =
[issue4487] Add utf8 alias for email charsets
Ben Gamari bgam...@gmail.com added the comment: Has this patch been merged yet? -- nosy: +bgamari
[issue4487] Add utf8 alias for email charsets
Tony Nelson tony_nel...@users.sourceforge.net added the comment: This seems entirely reasonable, helpful, and in accord with the mapping of ascii to us-ascii. I recommend accepting this patch or a slightly fancier one that would also do utf_8. There are probably other encoding names with the same issue of being accepted by Python but not being understood by other email clients. This issue also affects 2.6.1 and 2.7 trunk. I haven't checked 3.x. -- nosy: +barry, tony_nelson
[issue4487] Add utf8 alias for email charsets
Changes by Tony Nelson tony_nel...@users.sourceforge.net: -- versions: +Python 2.6, Python 2.7
[issue4487] Add utf8 alias for email charsets
Changes by STINNER Victor victor.stin...@haypocalc.com: -- nosy: -haypo
[issue4487] Add utf8 alias for email charsets
Changes by STINNER Victor [EMAIL PROTECTED]: -- nosy: +haypo
[issue4487] Add utf8 alias for email charsets
New submission from maxua [EMAIL PROTECTED]:

When using the MIME email package you can specify utf8 as the encoding. It is accepted, but it is not rendered correctly in some MUAs: for example, Mac OS X Mail.app doesn't display it properly, while Google Gmail does. This is confusing, since Python itself happily understands both utf8 and utf-8. The patch adds utf8 as an alias for the utf-8 encoding, which means users won't need to think twice.

Test case:

from email.MIMEText import MIMEText
msg = MIMEText(u'\u043a\u0438\u0440\u0438\u043b\u0438\u0446\u0430')
msg.set_charset('utf8')
print msg.as_string()

-- components: Library (Lib) files: charset-utf8-alias.patch keywords: patch messages: 76738 nosy: maxua severity: normal status: open title: Add utf8 alias for email charsets versions: Python 2.5
Added file: http://bugs.python.org/file12191/charset-utf8-alias.patch
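The remark that Python itself understands both spellings can be checked with a few lines of Python 3. This is a small illustration using only the built-in codec machinery; it does not exercise the email package, and the variable names are arbitrary.

```python
import codecs

text = "\u043a\u0438\u0440\u0438\u043b\u0438\u0446\u0430"

# Both spellings select the same codec, so the encoded bytes are identical.
same_bytes = text.encode("utf8") == text.encode("utf-8")

# The codec registry also reports the same canonical name for both.
same_codec = codecs.lookup("utf8").name == codecs.lookup("utf-8").name

print(same_bytes, same_codec)
```

The difference therefore only matters once the name is written into a message header, where another mail client has to recognize the spelling on its own.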