Roundup Robot added the comment:
New changeset 4daf3cec9419 by R David Murray in branch '3.3':
#19063: the unicode-in-set_payload problem isn't getting fixed in 3.4.
http://hg.python.org/cpython/rev/4daf3cec9419
New changeset f942f1eddfea by R David Murray in branch 'default':
#20531: Revert
Roundup Robot added the comment:
New changeset d842bc07d30b by R David Murray in branch '3.3':
#19063: partially fix set_payload handling of non-ASCII string input.
http://hg.python.org/cpython/rev/d842bc07d30b
New changeset 02cb48459b58 by R David Murray in branch 'default':
Null merge for
R. David Murray added the comment:
Well, it's a fair while after tomorrow, but now it is committed.
--
resolution: - fixed
stage: patch review - committed/rejected
status: open - closed
___
Python tracker rep...@bugs.python.org
R. David Murray added the comment:
3.4 patch updated to address Vajrasky's review comment.
I'll probably apply this tomorrow.
--
Added file: http://bugs.python.org/file32898/support_8bit_charset_cte.patch
R. David Murray added the comment:
Updated patch for 3.3, and a new patch for 3.4. In 3.4, set_payload raises an
error if non-ascii-surrogateescape text is passed in as the argument (i.e., there
are non-ascii unicode characters in the string) and no charset is specified
with which to encode
Changes by R. David Murray rdmur...@bitdance.com:
Added file: http://bugs.python.org/file32838/support_8bit_charset_cte.patch
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19063
___
Vajrasky Kok added the comment:
R. David Murray, your patch fails with this situation:
from email.mime.nonmultipart import *
from email.charset import *
from email.message import Message
from io import BytesIO
from email.generator import BytesGenerator
msg = Message()
cs = Charset('utf-8')
Vajrasky Kok added the comment:
Simpler patch.
--
Added file: http://bugs.python.org/file32802/support_8bit_charset_cte_v3.patch
R. David Murray added the comment:
Yes, I discovered this in testing, but I forgot to file a bug report for it.
It should be dealt with in a separate issue. And yes, it should be fixed,
since except for the documented on-demand filling out of missing pieces such as
MIME borders, the model
Vajrasky Kok added the comment:
No review link?
R. David Murray added the comment:
Ah, I posted a git-diff 3.3 patch. Let me repost it.
--
Added file: http://bugs.python.org/file32757/support_8bit_charset_cte.patch
Changes by R. David Murray rdmur...@bitdance.com:
Removed file: http://bugs.python.org/file32732/support_8bit_charset_cte.patch
R. David Murray added the comment:
Vajrasky: thanks for taking a crack at this, but, well, there are a lot of
subtleties involved here, due to the way the organic growth of the email
package over many years has led to some really bad design issues.
It took me a lot of time to boot back up my
Changes by R. David Murray rdmur...@bitdance.com:
Removed file: http://bugs.python.org/file32730/support_8bit_charset_cte.patch
Changes by R. David Murray rdmur...@bitdance.com:
Added file: http://bugs.python.org/file32732/support_8bit_charset_cte.patch
Vajrasky Kok added the comment:
Attached the *preliminary* patch to address R. David Murray's request.
It does not address the case where we send raw utf-8 bytes to payload. Maybe we
should handle that in different ticket.
msg.set_payload(b'\xd0\x90\xd0\x91\xd0\x92') == chucks
--
Vajrasky Kok added the comment:
So msg.as_string() =
cte - base64
message - 0JDQkdCS # base64 encoded string
What about msg.as_bytes()? Should it be:
cte - 8bit
message - \\u0410\\u0411\\u0412 (raw-unicode-escape) or
\xd0\x90\xd0\x91\xd0\x92 (utf-8)?
or message - 0JDQkdCS (base64)
Changes by Eric Hanchrow eric.hanch...@gmail.com:
--
nosy: -Eric.Hanchrow
R. David Murray added the comment:
as_bytes should be producing the raw utf8 bytes with cte 8bit.
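
A minimal sketch of the behaviour described in this comment, assuming a Python version in which the fix for this issue has landed (3.4 or later, where Message.as_bytes exists). The three-letter Cyrillic sample is the one quoted elsewhere in this thread; the variable names are illustrative and nothing here is taken from the attached patches.

```python
from email.charset import Charset
from email.message import Message

# Same setup as in the report: utf-8 charset with base64 body encoding disabled.
cs = Charset('utf-8')
cs.body_encoding = None

sample = '\u0410\u0411\u0412'  # three Cyrillic capital letters

msg = Message()
msg.set_payload(sample, cs)

# Flatten to bytes; the body is expected to carry the raw utf-8 encoding.
raw = msg.as_bytes()

print(msg['Content-Transfer-Encoding'])
print(sample.encode('utf-8') in raw)
```

With body_encoding set to None the message object itself records an 8bit transfer encoding, and the bytes output carries the payload unencoded.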
Vajrasky Kok added the comment:
Here is the preliminary patch to fix the problem. My patch produces 8bit for
msg.as_string and msg.as_bytes for simplicity.
If msg.as_string should give content-transfer-encoding 7bit with 8bit data but
msg.as_bytes should give
R. David Murray added the comment:
msg.as_string should not be producing a CTE of 8bit. I haven't looked at your
patch so I don't know what you mean by having as_string produce 8bit data, but
it can't be right :)
To clarify: as_string must produce valid unicode data, and therefore *cannot*
Eric Hanchrow added the comment:
Put the following into a file named repro.py, then type python repro.py at
your shell. You'll see ``AttributeError: 'CustomAdapter' object has no
attribute 'setLevel'``
import logging
logging.basicConfig()
class CustomAdapter(logging.LoggerAdapter):
def
Eric Hanchrow added the comment:
Gaah, please ignore that last message; I accidentally pasted it into the wrong
page :-(
Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com:
--
Removed message: http://bugs.python.org/msg201781
Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com:
--
Removed message: http://bugs.python.org/msg201782
Vajrasky Kok added the comment:
Okay, so for this case, what are the correct outputs for the cte and the
message?
from email.charset import Charset
from email.message import Message
cs = Charset('utf-8')
cs.body_encoding = None # disable base64
msg =
R. David Murray added the comment:
cte base64 I think (see below).
Basically, set_payload should be putting the surrogateescape encoded utf-8 into
the _payload (which it should now be doing), and probably calling set_charset.
The cte will at that point be 8bit, but when as_string calls
R. David Murray added the comment:
There is definitely a bug in set_payload here, and (obviously :) no test for
that case (passing an 8bit charset to set_payload).
New submission from Florian Apolloner:
Take the following example:
from email.mime.nonmultipart import *
from email.charset import *
msg = MIMENonMultipart('text', 'plain')
cs = Charset('utf-8')
cs.body_encoding = None
msg.set_payload('А Б В Г Д Е Ж Ѕ З И І К Л М Н О П.', cs)
R. David Murray added the comment:
There is definitely a bug here, but 8bit would also be wrong, since you are
calling as_string. It *should* be producing a 7bit CTE with a base64 encoded
part in that case.
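
A small sketch of what this comment asks for, assuming a Python version that already includes the fix for this issue. It uses a shortened three-letter payload rather than the full string from the report, and the variable names are illustrative only.

```python
from email.charset import Charset
from email.mime.nonmultipart import MIMENonMultipart

# Setup from the original report: utf-8 charset, base64 body encoding disabled.
cs = Charset('utf-8')
cs.body_encoding = None

msg = MIMENonMultipart('text', 'plain')
msg.set_payload('\u0410\u0411\u0412', cs)

# as_string must return pure-ASCII text, so the generator is expected to
# re-encode the 8bit body (base64 here) on a copy of the message.
text = msg.as_string()
text.encode('ascii')  # would raise UnicodeEncodeError if 8bit data leaked through
print(text)
```

The re-encoding happens only in the flattened output; the message object still reports the transfer encoding chosen at set_payload time.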
--
components: +email
nosy: +barry, r.david.murray
versions: +Python 3.2,
Florian Apolloner added the comment:
Am I not explicitly disabling base64 by setting body_encoding to None?
R. David Murray added the comment:
You are, but you are also calling as_string. Unicode cannot handle 8bit data,
therefore the email package must down-transform all data to 7bit when
converting it to a string, just like a mail server trying to send to another
mail server that can only
Florian Apolloner added the comment:
Using BytesGenerator I get:
fp = BytesIO()
g = BytesGenerator(fp)
msg = MIMENonMultipart('text', 'plain')
msg.set_payload('А Б В Г Д Е Ж Ѕ З И І К Л М Н О П.', cs)
g.flatten(msg)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
File
Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com:
--
nosy: +Arfrever