[issue41023] smtplib does not handle Unicode characters

2020-06-29 Thread Jay Patel


Change by Jay Patel :


Removed file: 
https://bugs.python.org/file49250/providing_only_ascii_characters.png

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41023] smtplib does not handle Unicode characters

2020-06-18 Thread R. David Murray


R. David Murray  added the comment:

If you use the 'sendmail' function for sending, then it is entirely your 
responsibility to turn the email into "wire format".  Unicode is not wire 
format, but if you give sendmail a string that only has ascii in it, it nicely 
converts it to binary for you.  But given that the email RFCs specify specific 
ways to indicate how non-ascii is encoded in the message, there is no way for 
the smtp library to know how to do that correctly when passed an arbitrary 
unicode string, so it doesn't try.  sendmail requires *you* to do the encoding 
to binary, indicating you at least think that you got the RFC parts right :)  
In python2, strings are binary by default, so in that case you are handing 
sendmail binary format data (with the same assumption that you got the RFC 
parts right)...if you passed the python2 function a unicode string it would 
probably complain as well, although not in the same way.

If your raw email is RFC compliant, then you can do: sendmail(from, to, 
mymsg.encode()).
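
A minimal sketch of that distinction, assuming a made-up raw message (the 
addresses and text are illustrative only) whose MIME headers declare how the 
body is encoded. The str form fails the implicit ASCII encoding, while the 
explicitly encoded bytes are what you would hand to sendmail:

```python
# Hypothetical raw message; the MIME headers declare the body encoding.
raw_email = (
    "From: sender@example.com\r\n"
    "To: rcpt@example.com\r\n"
    "Subject: Greeting\r\n"
    "MIME-Version: 1.0\r\n"
    'Content-Type: text/plain; charset="utf-8"\r\n'
    "Content-Transfer-Encoding: 8bit\r\n"
    "\r\n"
    "Caf\u00e9 au lait\r\n"
)

# sendmail encodes a str as ASCII, which fails on non-ASCII text.
try:
    raw_email.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding it yourself yields bytes, which sendmail does not re-encode.
wire_bytes = raw_email.encode()  # UTF-8 by default, matching the charset above

# With a reachable server (not exercised here), sending would look like:
# import smtplib
# with smtplib.SMTP("localhost") as smtp:
#     smtp.sendmail("sender@example.com", ["rcpt@example.com"], wire_bytes)
```

Whether the receiving server accepts an 8bit body is a separate question; the 
sketch only shows who is responsible for the str-to-bytes step.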

I see from your example that you are trying to use the email package to 
construct the email, which is good.  But, emails are *binary*, they are not 
unicode, so passing "message_from_string" a unicode string containing non-ascii 
isn't going to do what you are expecting, any more than passing unicode to the 
'sendmail' function did.  message_from_string is really only useful for doing 
certain sorts of debugging and ought to be deprecated, or at least produce a 
warning when handed a string containing non-ascii.  (There are historical 
reasons why it doesn't :( )

And then you should use smtplib's 'send_message' function, which understands 
email package messages and will Do the Right Thing with them (including the 
extraction of the to and from addresses your code is currently doing).

However, even if you encoded your raw message to binary and then passed it to 
message_from_bytes, your example message is *not* RFC compliant: without MIME 
headers, an email with non-ascii characters in the body is technically in 
violation of the RFC.  Most email programs will handle that particular message 
despite that, but not all.  You are better off using the email package to 
construct a properly RFC formatted email, using the new API (ex: msg = 
EmailMessage() (not Message), and then doing msg['from'] = address, etc, and 
msg.set_content(your unicode string body)). I can't really give you much advice 
here (nor should I, this being a bug tracker :) because I don't know exactly 
how the data is coming in to your program in your real use case.
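
As a rough sketch of that approach (addresses, subject and body are made up 
for illustration), building the message with EmailMessage lets the email 
package add the MIME headers and handle the non-ascii subject and body:

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Illustrative body text containing non-ASCII characters.
body = "Caf\u00e9 au lait, cr\u00e8me br\u00fbl\u00e9e\n"

msg = EmailMessage()
msg["From"] = "sender@example.com"
msg["To"] = "rcpt@example.com"
msg["Subject"] = "Caf\u00e9 menu"
# set_content adds MIME-Version, Content-Type and Content-Transfer-Encoding.
msg.set_content(body)

# Serialize to bytes; this is the binary form of the message.
wire_bytes = msg.as_bytes()

# Parsing the bytes back recovers the original unicode subject and body.
parsed = message_from_bytes(wire_bytes, policy=policy.default)
```

The round trip through message_from_bytes is only there to show that the 
binary form carries enough information to recover the text.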

Once you have a properly constructed EmailMessage object, you should use 
smtplib's 'send_message' function, which understands email package messages and 
will Do the Right Thing with them (including the extraction of the to and from 
addresses your code is currently doing, as well as properly handling BCC, which 
means deleting BCC headers from the message before sending it, which your code 
does not do and which 'sendmail' would not do.)
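
To see what send_message hands on without needing a server, here is a sketch 
using a hypothetical RecordingSMTP subclass (not part of smtplib) that skips 
the greeting and records the arguments that would otherwise go to sendmail. 
The addresses are made up:

```python
import smtplib
from email.message import EmailMessage


class RecordingSMTP(smtplib.SMTP):
    """Hypothetical stand-in that records instead of talking to a server."""

    def ehlo_or_helo_if_needed(self):
        pass  # there is no connection, so skip the greeting

    def sendmail(self, from_addr, to_addrs, msg, mail_options=(), rcpt_options=()):
        # Capture what send_message computed instead of sending it.
        self.recorded = (from_addr, list(to_addrs), msg)
        return {}


msg = EmailMessage()
msg["From"] = "Sender <sender@example.com>"
msg["To"] = "rcpt@example.com"
msg["Bcc"] = "hidden@example.com"
msg["Subject"] = "Caf\u00e9 menu"
msg.set_content("Caf\u00e9 au lait\n")

# No host is given, so the constructor does not open a connection.
smtp = RecordingSMTP(local_hostname="localhost")
smtp.send_message(msg)
from_addr, to_addrs, flat = smtp.recorded

# Real use (not exercised here) would be along the lines of:
# with smtplib.SMTP("localhost") as smtp:
#     smtp.send_message(msg)
```

Here from_addr and to_addrs are extracted from the headers, the Bcc recipient 
is included in to_addrs, and the flattened bytes in flat no longer carry the 
Bcc header, while the original msg object is left untouched.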

SMTPUTF8 is about non-ascii in the email *headers*, and most SMTP servers these 
days do not yet support it[*]. Some of the big ones do, though (I believe gmail 
does).

[*] although that doesn't explain why what you got was SMTPSenderRefused.  You 
should have gotten SMTPNotSupportedError.
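
One way to check for support before asking for it is sketched below. The 
helper name utf8_mail_options is made up for illustration; has_extn and 
SMTPNotSupportedError are from smtplib, and the option is spelled "SMTPUTF8" 
(no hyphen). The helper assumes EHLO has already been exchanged, since that is 
when smtplib learns which extensions the server advertises:

```python
import smtplib


def utf8_mail_options(smtp):
    """Hypothetical helper: mail_options for non-ASCII addresses, if supported."""
    # has_extn reflects the extensions the server advertised in its EHLO reply.
    if not smtp.has_extn("smtputf8"):
        raise smtplib.SMTPNotSupportedError(
            "server does not advertise SMTPUTF8"
        )
    return ["SMTPUTF8", "BODY=8BITMIME"]
```

The returned list is meant to be passed as mail_options to sendmail. 
send_message performs a similar check itself when the addresses contain 
non-ascii characters.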

--
resolution:  -> works for me
stage:  -> resolved
status: open -> closed




[issue41023] smtplib does not handle Unicode characters

2020-06-18 Thread Jay Patel


Change by Jay Patel :


Added file: 
https://bugs.python.org/file49252/providing_mail_options_in_sendmail.png




[issue41023] smtplib does not handle Unicode characters

2020-06-18 Thread Jay Patel


Change by Jay Patel :


Added file: 
https://bugs.python.org/file49251/providing_Unicode_characters_in_email_body.png




[issue41023] smtplib does not handle Unicode characters

2020-06-18 Thread Jay Patel


Jay Patel  added the comment:

Screenshot for the case where the 'raw_email' variable contains only 'ascii' 
characters.

--
Added file: 
https://bugs.python.org/file49250/providing_only_ascii_characters.png




[issue41023] smtplib does not handle Unicode characters

2020-06-18 Thread Jay Patel


New submission from Jay Patel :

According to the user requirements, I need to send an email that is provided 
as a raw email, i.e., the contents of the email are provided in the form of 
headers. To accomplish this I am using the methods provided in the 
"send_rawemail_demo.py" file (attached below).
The smtplib library works fine when providing only 'ascii' characters in the 
'raw_email' variable. But, when I provide any Unicode characters either in the 
Subject or Body of the email, then the sendmail method of the smtplib library 
fails with the following message:
UnicodeEncodeError: 'ascii' codec can't encode characters in position 123-124: 
ordinal not in range(128)
I tried providing mail_options=["SMTPUTF-8"] in the sendmail method (on 
line no. 72 in the send_rawemail_demo.py file), but then it fails (even for 
'ascii' characters) with an SMTPSenderRefused exception.
I have faced the same issue on Python 3.6. 
The sendmail method of the SMTP class encodes the message using 'ascii' as:
    if isinstance(msg, str):
        msg = _fix_eols(msg).encode('ascii')
The code works properly for Python 2 as the smtplib library for Python 2 does 
not have the above line and hence it allows Unicode characters in the Body and 
the Subject.

--
components: email
files: send_rawemail_demo.py
messages: 371801
nosy: barry, jpatel, r.david.murray
priority: normal
severity: normal
status: open
title: smtplib does not handle Unicode characters
type: enhancement
versions: Python 3.8
Added file: https://bugs.python.org/file49249/send_rawemail_demo.py
