[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2014-02-07 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 4daf3cec9419 by R David Murray in branch '3.3':
#19063: the unicode-in-set_payload problem isn't getting fixed in 3.4.
http://hg.python.org/cpython/rev/4daf3cec9419

New changeset f942f1eddfea by R David Murray in branch 'default':
#20531: Revert e20f98a8ed71, the 3.4 version of the #19063 fix.
http://hg.python.org/cpython/rev/f942f1eddfea

New changeset ef8aaace85ca by R David Murray in branch 'default':
#20531: Apply the 3.3 version of the #19063 fix.
http://hg.python.org/cpython/rev/ef8aaace85ca

New changeset aab7258a31d3 by R David Murray in branch 'default':
Merge: #19063: the unicode-in-set_payload problem isn't getting fixed in 3.4.
http://hg.python.org/cpython/rev/aab7258a31d3

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-12-11 Thread Roundup Robot

Roundup Robot added the comment:

New changeset d842bc07d30b by R David Murray in branch '3.3':
#19063: partially fix set_payload handling of non-ASCII string input.
http://hg.python.org/cpython/rev/d842bc07d30b

New changeset 02cb48459b58 by R David Murray in branch 'default':
Null merge for #19063 (3.4 fix is different).
http://hg.python.org/cpython/rev/02cb48459b58

New changeset e20f98a8ed71 by R David Murray in branch 'default':
#19063: fix set_payload handling of non-ASCII string input.
http://hg.python.org/cpython/rev/e20f98a8ed71

--
nosy: +python-dev

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-12-11 Thread R. David Murray

R. David Murray added the comment:

Well, it's a fair while after tomorrow, but now it is committed.

--
resolution:  - fixed
stage: patch review - committed/rejected
status: open - closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-11-29 Thread R. David Murray

R. David Murray added the comment:

3.4 patch updated to address Vajrasky's review comment.

I'll probably apply this tomorrow.

--
Added file: http://bugs.python.org/file32898/support_8bit_charset_cte.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-11-25 Thread R. David Murray

R. David Murray added the comment:

Updated patch for 3.3, and a new patch for 3.4.  In 3.4, set_payload raises an 
error if non-ascii-surrogateescape text is passed in as the argument (ie: there 
are non-ascii unicode characters in the string) and no charset is specified 
with which to encode them.  In an ideal world it would instead default to 
utf-8, but because there may exist code that passes in unicode and then encodes 
it later by calling set_charset, we can't go that route in 3.4.  In 3.5, after 
the few people who may be in that boat have fixed their code to include the 
charset on set_payload, we can in fact make it default to utf-8.

I had to fix a couple 3.4 test, one of which was no longer valid because I've 
now defined the previously undefined behavior of set_payload when passed 
unicode, and one of which was just wrong but I didn't notice because of this 
bug.

--
Added file: http://bugs.python.org/file32837/support_8bit_charset_cte.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-11-25 Thread R. David Murray

Changes by R. David Murray rdmur...@bitdance.com:


Added file: http://bugs.python.org/file32838/support_8bit_charset_cte.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-11-23 Thread Vajrasky Kok

Vajrasky Kok added the comment:

R. David Murray, your patch fails with this situation:

from email.mime.nonmultipart import *
from email.charset import *
from email.message import Message
from io import BytesIO
from email.generator import BytesGenerator
msg = Message()
cs = Charset('utf-8')
cs.body_encoding = None
msg.set_payload('АБВ', cs)
msg.as_string()
fp = BytesIO()
g = BytesGenerator(fp)
g.flatten(msg)
print(fp.getvalue())

=== b'MIME-Version: 1.0\nContent-Type: text/plain; 
charset=utf-8\nContent-Transfer-Encoding: base64\n\n0JDQkdCS\n'

Apparently, there is a funky bug. If you never call msg.as_string(), the 
fp.get_value() will output correctly:

b'MIME-Version: 1.0\nContent-Type: text/plain; 
charset=utf-8\nContent-Transfer-Encoding: 8bit\n\n\xd0\x90\xd0\x91\xd0\x92'

It turns out that msg.as_string() calls set_payload with different kind of 
charset!

Attached the patch to solve this problem. Not sure whether this is important or 
not to fix in 3.3. How many people call bytes generator after calling string 
generator?

--
Added file: http://bugs.python.org/file32800/support_8bit_charset_cte_v2.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-11-23 Thread Vajrasky Kok

Vajrasky Kok added the comment:

Simpler patch.

--
Added file: http://bugs.python.org/file32802/support_8bit_charset_cte_v3.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-11-23 Thread R. David Murray

R. David Murray added the comment:

Yes, I discovered this in testing, but I forgot to file a bug report for it.  
It should be dealt with in a separate issue.  And yes, it should be fixed, 
since except for the documented on-demand filling out of missing pieces such as 
MIME borders, the model should not be changed by serializing it.

There are also some other issues that I discovered while running the more 
complete 3.4 tests, but I haven't had time to fix those yet.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-11-21 Thread Vajrasky Kok

Vajrasky Kok added the comment:

No review link?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-11-21 Thread R. David Murray

R. David Murray added the comment:

Ah, I posted a git-diff 3.3 patch.  Let me repost it.

--
Added file: http://bugs.python.org/file32757/support_8bit_charset_cte.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-11-21 Thread R. David Murray

Changes by R. David Murray rdmur...@bitdance.com:


Removed file: http://bugs.python.org/file32732/support_8bit_charset_cte.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-11-20 Thread R. David Murray

R. David Murray added the comment:

Vajrasky: thanks for taking a crack at this, but, well, there are a lot of 
subtleties involved here, due to the way the organic growth of the email 
package over many years has led to some really bad design issues.

It took me a lot of time to boot back up my understanding of how all this stuff 
hangs together (answer: badly).  After wandering down many blind alleys, the 
problem turns out to be yet one more disconnect in the model.  We previously 
fixed the issue where if set_payload was passed binary data bad things would 
happen.  That made the model more consistent, in that _payload was now a 
surrogateescaped string when the payload was specified as binary data.

But what the model *really* needs is that _payload *always* be an 
ascii+surrogateescape string, and never a full unicode string.  (Yeah, this is 
a sucky model...it ought to always be binary instead, but we are dealing with 
legacy code here.)

Currently it can be a unicode string.  If it is, set_charset turns it into an 
ascii only string by encoding it with the qp or base64 CTE.  This is pretty 
much just by luck, though.

If you set body_encode to None what happens is that the encode_7or8bit encoder 
thinks the string is 7bit because it does get_payload(decode=True) which, 
because the model invariant was broken, turns into a raw-unicode-escape string, 
which is a 7bit representation.  That doesn't affect the payload, but it does 
result in wrong CTE being used.

The fix is to fix the model invariant by turning a unicode string passed in to 
set_payload into an ascii+surrogateescape string with the escaped bytes being 
the unicode encoded to the output charset.

Unfortunately it is also possible to call set_payload without a charset, and 
*then* call set_charset.  To keep from breaking the code of anyone currently 
doing that, I had to allow a full unicode _payload, and detect it in 
set_charset.

My plan is to fix that in 3.4, causing a backward compatibility break because 
it will no longer be possible to call set_payload with a unicode string 
containing non-ascii if you don't also provide a character set.  I believe this 
is an acceptable break, since otherwise you *must* leave the model in an 
ambiguous state, and you have the possibility leaking unicode characters out 
into your wire-format message, which would ultimately result in either an 
exception at serialization time or, worse, mojibake.

Patch attached.

--
stage:  - patch review
type:  - behavior
versions:  -Python 3.2
Added file: http://bugs.python.org/file32730/support_8bit_charset_cte.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-11-20 Thread R. David Murray

Changes by R. David Murray rdmur...@bitdance.com:


Removed file: http://bugs.python.org/file32730/support_8bit_charset_cte.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-11-20 Thread R. David Murray

Changes by R. David Murray rdmur...@bitdance.com:


Added file: http://bugs.python.org/file32732/support_8bit_charset_cte.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-11-01 Thread Vajrasky Kok

Vajrasky Kok added the comment:

Attached the *preliminary* patch to address R. David Murray's request.

It does not address the case where we send raw utf-8 bytes to payload. Maybe we 
should handle that in different ticket.

msg.set_payload(b'\xd0\x90\xd0\x91\xd0\x92') == chucks

--
Added file: 
http://bugs.python.org/file32447/fix_8bit_data_charset_none_set_payload_v2.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-10-31 Thread Vajrasky Kok

Vajrasky Kok added the comment:

So msg.as_string() =
   cte - base64
   message - 0JDQkdCS # base64 encoded string

What about msg.as_bytes()? Should it be:
   cte - 8bit
   message - \\u0410\\u0411\\u0412 (raw-unicode-escape) or 
\xd0\x90\xd0\x91\xd0\x92 (utf-8)?

or message - 0JDQkdCS (base64) as well?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-10-31 Thread Eric Hanchrow

Changes by Eric Hanchrow eric.hanch...@gmail.com:


--
nosy:  -Eric.Hanchrow

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-10-31 Thread R. David Murray

R. David Murray added the comment:

as_bytes should be producing the raw utf8 bytes with cte 8bit.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-10-30 Thread Vajrasky Kok

Vajrasky Kok added the comment:

Here is the preliminary patch to fix the problem. My patch produces 8bit for 
msg.as_string and msg.as_bytes for simplicity reason.

If msg.as_string should gives content-transfer-encoding 7bit with 8bit data but 
msg.as_bytes should gives content-transfer-encoding 8bit with 8bit data, I can 
modify the patch. But it doesn't feel right...

--
keywords: +patch
nosy: +vajrasky
Added file: 
http://bugs.python.org/file32426/fix_8bit_data_charset_none_set_payload.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-10-30 Thread R. David Murray

R. David Murray added the comment:

msg.as_string should not be producing a CTE of 8bit.  I haven't looked at your 
patch so I don't know what you mean by having as_string produce 8bit data, but 
it can't be right :)

To clarify: as_string must produce valid unicode data, and therefore *cannot* 
have any 8bit data in it.  Unicode is not an 8bit channel (in SMTP terms) it is 
a 7bit channel (ie: restricted to ASCII).  The *decoded* version of the message 
can have non-ASCII in it, but as_string is producing an *encoded* version.  The 
CTE applies only to an encoded version.  

It gets a little confusing because in Python we are used to 'decoding' to 
unicode and 'encoding' to bytes, but in the email package we also sometimes 
'encode' to ASCII but use unicode strings to store it.  If we didn't have to 
maintain backward compatibility it would probably be better to just drop 
as_string :)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-10-30 Thread Eric Hanchrow

Eric Hanchrow added the comment:

Put the following into a file named repro.py, then type python repro.py at 
your shell.  You'll see ``AttributeError: 'CustomAdapter' object has no 
attribute 'setLevel'``

import logging logging.basicConfig ()

class CustomAdapter(logging.LoggerAdapter):
def process(self, msg, kwargs):
return '[%s] %s' % (self.extra['connid'], msg), kwargs

logger = logging.getLogger(__name__) 
adapter = CustomAdapter(logger, {'connid': '1234'}) 
adapter.setLevel (logging.WARN) 
adapter.warning (Ahoy matey)

--
nosy: +Eric.Hanchrow

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-10-30 Thread Eric Hanchrow

Eric Hanchrow added the comment:

Gaah, please ignore that last message; I accidentally pasted it into the wrong 
page :-(

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-10-30 Thread Arfrever Frehtes Taifersar Arahesis

Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com:


--
Removed message: http://bugs.python.org/msg201781

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-10-30 Thread Arfrever Frehtes Taifersar Arahesis

Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com:


--
Removed message: http://bugs.python.org/msg201782

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-10-30 Thread Vajrasky Kok

Vajrasky Kok added the comment:

Okay, so for this case, what are the correct outputs for the cte and the 
message?

from email.charset import Charset
from email.message import Message

cs = Charset('utf-8')
cs.body_encoding = None # disable base64
msg = Message()
msg.set_payload('АБВ', cs)

msg.as_string() ==
   cte - 7bit
   message - АБВ or \\u0410\\u0411\\u0412 or \xd0\x90\xd0\x91\xd0\x92?

msg.as_bytes() ==
   cte - 8bit
   message - \\u0410\\u0411\\u0412 or \xd0\x90\xd0\x91\xd0\x92?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-10-30 Thread R. David Murray

R. David Murray added the comment:

cte base64 I think (see below).

Basically, set_payload should be putting the surrogateescape encoded utf-8 into 
the _payload (which it should now be doing), and probably calling set_charset.  
The cte will at that point be 8bit, but when as_string calls Generator, it will 
get converted to 7bit clean by doing (I think) a base64 encode and emitting 
that as the CTE.  I have to look through the code to remind myself how it all 
works, which I haven't had time for yet, which is why I haven't tried to make a 
fix myself yet.

(Yes, this is poor design, but we are dealing with a long line of legacy code 
and API here...)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-10-10 Thread R. David Murray

R. David Murray added the comment:

There is definitely a bug in set_payload here, and (obviously :) no test for 
that case (passing an 8bit charset to set_payload).

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-09-21 Thread Florian Apolloner

New submission from Florian Apolloner:

Take the following example:

 from email.mime.nonmultipart import *
 from email.charset import *
 msg = MIMENonMultipart('text', 'plain')
 cs = Charset('utf-8')
 cs.body_encoding = None
 msg.set_payload('А Б В Г Д Е Ж Ѕ З И І К Л М Н О П.', cs)
 msg.as_string()
'MIME-Version: 1.0\nContent-Type: text/plain; 
charset=utf-8\nContent-Transfer-Encoding: 7bit\n\nА Б В Г Д Е Ж Ѕ З И І К Л М 
Н О П.'

Till python 3.3.2 this created the email as 8bit, which is in line with what 
Thunderbird creates on my box. And if I understand the RFC correctly 
(https://tools.ietf.org/html/rfc2045#section-2.7) 7bit should only be used for 
ASCII. This issue was introduced by 
http://hg.python.org/cpython/rev/64e004737837 -- I changed orig = 
msg.get_payload(decode=True) back to it's original value (decode=False) in 
encode_7or8bit and get the correct behavior again. Running the tests didn't 
cause any errors (I hope I ran the correct one).

--
components: Library (Lib)
messages: 198217
nosy: apollo13
priority: normal
severity: normal
status: open
title: Python 3.3.3  encodes emails containing non-ascii data as 7bit
versions: Python 3.3

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-09-21 Thread R. David Murray

R. David Murray added the comment:

There is definitely a bug here, but 8bit would also be wrong, since you are 
calling as_string.  It *should* be producing a 7bit CTE with a base64 encoded 
part in that case.

--
components: +email
nosy: +barry, r.david.murray
versions: +Python 3.2, Python 3.4

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-09-21 Thread Florian Apolloner

Florian Apolloner added the comment:

Am I not explicitelly disabling base64 by setting body_encoding to None?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-09-21 Thread R. David Murray

R. David Murray added the comment:

You are, but you are also calling as_string.  Unicode can not handle 8bit data, 
therefore the email package must down-transform all data to 7bit when 
converting it to a string, just like a mail server trying to send to another 
mail server that can only handle 7bit would do.

If you used BytesGenerator to serialize the message, *then* you would expect to 
see 8bit utf-8 data in the body.

There are supposed to be tests for these cases, so I have to investigate why 
the tests are broken, as well :)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-09-21 Thread Florian Apolloner

Florian Apolloner added the comment:

Using BytesGenerator I get:
 fp = BytesIO()
 g = BytesGenerator(fp)
 msg = MIMENonMultipart('text', 'plain')
 msg.set_payload('А Б В Г Д Е Ж Ѕ З И І К Л М Н О П.', cs)
 g.flatten(msg)
Traceback (most recent call last):
  File stdin, line 1, in module
  File /usr/lib/python3.3/email/generator.py, line 112, in flatten
self._write(msg)
  File /usr/lib/python3.3/email/generator.py, line 177, in _write
self._dispatch(msg)
  File /usr/lib/python3.3/email/generator.py, line 203, in _dispatch
meth(msg)
  File /usr/lib/python3.3/email/generator.py, line 421, in _handle_text
super(BytesGenerator,self)._handle_text(msg)
  File /usr/lib/python3.3/email/generator.py, line 233, in _handle_text
self._write_lines(payload)
  File /usr/lib/python3.3/email/generator.py, line 158, in _write_lines
self.write(laststripped)
  File /usr/lib/python3.3/email/generator.py, line 395, in write
self._fp.write(s.encode('ascii', 'surrogateescape'))
UnicodeEncodeError: 'ascii' codec can't encode character '\u0410' in position 
0: ordinal not in range(128)

But I might be using it wrong :/

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19063] Python 3.3.3 encodes emails containing non-ascii data as 7bit

2013-09-21 Thread Arfrever Frehtes Taifersar Arahesis

Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com:


--
nosy: +Arfrever

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com