[issue41553] encoded-word abused for header line folding causes RFC 2047 violation

2020-08-17 Thread R. David Murray


R. David Murray  added the comment:

Yes for the registry changes.  I thought we had fixed the bug that was causing 
message-id to get encoded, but maybe it still exists in 3.7?  I don't remember 
when we fixed it (and I may be remembering wrong!)

As for X- "unstructured headers" getting trashed, by *definition* in the rfc, 
if the header body is unstructured it must support RFC encoding.  If it does not, 
it is not an unstructured header field.  Which is why I said we need to think 
about what characteristics the default parser should have.  The RFC doesn't 
really speak to that, it expects every header to be one of the defined 
types...but while an X- header might be of a defined type, the email package 
can't know that unless it is told, so what should we use as the default parsing 
strategy?  "text without encoded words" isn't really RFC compliant, I think.  
(Though I'll admit it has been a while since I last reviewed the relevant RFCs.)

Note that I believe that we have an open issue (or at least an open discussion) 
that we should change the 'refold_source' default from 'long' to 'none', which 
means that X- headers would at least be passed through by default.  It would 
also mitigate this problem, and can be used as a local workaround for headers 
that are just getting passed through and not modified.
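
A minimal sketch of that local workaround, assuming the headers are only parsed 
and written back out unmodified.  The header name and value are made up for 
illustration; the point is just that with refold_source='none' the source form 
of the header is emitted as it came in.

```python
from email import policy
from email.parser import BytesParser

# Clone the default policy so headers parsed from the source are never refolded.
passthrough = policy.default.clone(refold_source="none")

# Hypothetical message: an X- header holding one long, unbreakable token.
long_token = "<" + "a" * 90 + "@example.com>"
raw = ("X-Tracking-Id: " + long_token + "\n\nbody\n").encode("ascii")

msg = BytesParser(policy=passthrough).parsebytes(raw)
out = msg.as_bytes()  # the X- header is written back in its source form
```

Headers that are set or replaced by the application are still folded by the 
library, so this only helps for pass-through headers.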

--

___
Python tracker 
<https://bugs.python.org/issue41553>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41553] encoded-word abused for header line folding causes RFC 2047 violation

2020-08-14 Thread R. David Murray


R. David Murray  added the comment:

It's not really an abuse.  It is, however, buggy.  It should be being applied 
*only* when the header contains unstructured text.  Unfortunately I made the 
choice to treat any header that doesn't have a specific parser as unstructured, 
and that was a wrong choice which should be fixed.  It is an interesting 
question what should be used as the default parser, though.  Suggestions and 
code are welcome :)

There should be specific header parsers for headers that contain message ids.  
That was on my todo list but did not get done before my circumstances changed 
and my free-time focus moved away from python development work :(

The message_id parser exists.  In-Reply-To just needs to be declared in the 
header registry as a MessageIDHeader (not sure how that got missed).  Writing a 
Header class for References should be trivial, it's just a list of message ids. 
 That will fix those headers, and I suggest we do that asap.
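
Until the registry is changed, an application could make the declaration 
itself.  This is only a sketch, assuming Python 3.8 or later (where 
MessageIDHeader is available in email.headerregistry); the message id is a 
placeholder.

```python
from email import policy
from email.headerregistry import HeaderRegistry, MessageIDHeader
from email.message import EmailMessage

# Start from the default header map and add one extra mapping.
registry = HeaderRegistry()
registry.map_to_type("in-reply-to", MessageIDHeader)

# A policy that uses the extended registry as its header factory.
msgid_policy = policy.default.clone(header_factory=registry)

msg = EmailMessage(policy=msgid_policy)
msg["In-Reply-To"] = "<abc123@example.com>"  # placeholder message id
header = msg["In-Reply-To"]
```

A References header would still need its own class, since it holds a list of 
message ids rather than a single one.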

Fixing the default-to-unstructured will take a bit more thought and should 
probably be split out into a separate issue.  I can review and give advice 
(though you may have to ping me directly) but I won't have time to write any 
code.

--

___
Python tracker 
<https://bugs.python.org/issue41553>
___



[issue41402] email: ContentManager.set_content calls nonexistent method encode() on bytes

2020-07-31 Thread R. David Murray


R. David Murray  added the comment:

The fix looks good to me.  Don't know how I made that mistake, and obviously I 
didn't write a test for it...

--

___
Python tracker 
<https://bugs.python.org/issue41402>
___



[issue41387] Escape needed in the email documentation example

2020-07-24 Thread R. David Murray


Change by R. David Murray :


--
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue41387>
___



[issue41145] EmailMessage.as_string is altering the message state and actually fix bugs

2020-07-10 Thread R. David Murray


R. David Murray  added the comment:

The as_string docs say:

"Flattening the message may trigger changes to the Message if defaults need to 
be filled in to complete the transformation to a string (for example, MIME 
boundaries may be generated or modified)."

So, while this is indeed an API design bug, it isn't an actual bug in the code 
but rather is expected behavior, currently.  The historical reason for this is 
that the generator code looks at the entire message to make sure the boundary 
string is unique.  My long term plan for email included plans to rewrite the 
generator, and I was going to fix this issue at that point.  My life got too 
busy to be able to continue with email development work, though, so that never 
happened.

It has been *years* since I've looked at the code.  Thinking about it now, I'm 
wondering if it would be possible to use a GUID technique to generate the 
boundary and thus do exactly as you say: have make_alternative (and anything 
else that causes a boundary to be needed) pre-create the boundary.  That, I 
think, would mean we wouldn't need to change the generator, even though it 
would still be doing its (inefficient) check that the boundary was unique.  I'm 
not sure if it would work, though; it's been too long since I've looked at the 
relevant code.
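
A rough sketch of the GUID idea from the application side, not a change to the 
library: the boundary is chosen up front and passed to make_alternative, so 
flattening has nothing left to fill in.  Using uuid4 here is an assumption 
about what "a GUID technique" could look like.

```python
import uuid
from email.message import EmailMessage

msg = EmailMessage()
msg.set_content("plain text body\n")

# Pre-create the boundary instead of leaving it to the generator.
boundary = uuid.uuid4().hex
msg.make_alternative(boundary=boundary)
msg.add_alternative("<p>html body</p>\n", subtype="html")

before = msg.get_boundary()
text = msg.as_string()      # generator finds an existing boundary and keeps it
after = msg.get_boundary()
```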

--
type: resource usage -> behavior

___
Python tracker 
<https://bugs.python.org/issue41145>
___



[issue41206] behaviour change with EmailMessage.set_content

2020-07-07 Thread R. David Murray


R. David Murray  added the comment:

I'm short of time; if someone could approve Mark's PR and merge it, it would be 
great. There wasn't supposed to be any behavior change other than the one 
documented in #40597.

--

___
Python tracker 
<https://bugs.python.org/issue41206>
___



[issue41023] smtplib does not handle Unicode characters

2020-06-18 Thread R. David Murray


R. David Murray  added the comment:

If you use the 'sendmail' function for sending, then it is entirely your 
responsibility to turn the email into "wire format".  Unicode is not wire 
format, but if you give sendmail a string that only has ascii in it it nicely 
converts it to binary for you.  But given that the email RFCs specify specific 
ways to indicate how non-ascii is encoded in the message, there is no way for 
the smtp library to know how to do that correctly when passed an arbitrary 
unicode string, so it doesn't try.  sendmail requires *you* to do the encoding 
to binary, indicating you at least think that you got the RFC parts right :)  
In python2, strings are binary by default, so in that case you are handing 
sendmail binary format data (with the same assumption that you got the RFC 
parts right)...if you passed the python2 function a unicode string it would 
probably complain as well, although not in the same way.

If your raw email is RFC compliant, then you can do: sendmail(from, to, 
mymsg.encode()).

I see from your example that you are trying to use the email package to 
construct the email, which is good.  But, emails are *binary*, they are not 
unicode, so passing "message_from_string" a unicode string containing non-ascii 
isn't going to do what you are expecting, any more than passing unicode to the 
'sendmail' function did.  message_from_string is really only useful for doing 
certain sorts of debug and ought to be deprecated.  Or produce a warning when 
handed a string containing non-ascii.  (There are historical reasons why it 
doesn't :(

And then you should use smtplib's 'send_message' function, which understands 
email package messages and will Do the Right Thing with them (including the 
extraction of the to and from addresses your code is currently doing).

However, even if you encoded your raw message to binary and then passed it to 
message_from_bytes, your example message is *not* RFC compliant: without MIME 
headers, an email with non-ascii characters in the body is technically in 
violation of the RFC.  Most email programs will handle that particular message 
despite that, but not all.  You are better off using the email package to 
construct a properly RFC formatted email,  using the new API (ex: msg = 
EmailMessage() (not Message), and then doing msg['from'] = address, etc, and 
msg.set_content(your unicode string body)). I can't really give you much advice 
here (nor should I, this being a bug tracker :) because I don't know exactly 
how the data is coming in to your program in your real use case.

Once you have a properly constructed EmailMessage object, you should use 
smtplib's 'send_message' function, which understands email package messages and 
will Do the Right Thing with them (including the extraction of the to and from 
addresses your code is currently doing, as well as properly handling BCC, which 
means deleting BCC headers from the message before sending it, which your code 
does not do and which 'sendmail' would not do.)
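
Put together, the approach described above might look like the following 
sketch.  The addresses, subject, body text and the "localhost" server are 
placeholders, and the send() helper is hypothetical and is not called here.

```python
import smtplib
from email import policy
from email.message import EmailMessage

# Build the message with the new API; the library adds the MIME headers.
msg = EmailMessage()
msg["From"] = "sender@example.com"
msg["To"] = "recipient@example.com"
msg["Subject"] = "Grüße"
msg.set_content("Schöne Grüße aus München\n")

# Wire format is bytes; policy.SMTP uses CRLF line endings.
wire = msg.as_bytes(policy=policy.SMTP)

def send(message, host="localhost"):
    """Hypothetical helper: hand the message object to send_message."""
    with smtplib.SMTP(host) as smtp:
        smtp.send_message(message)
```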

SMTPUTF8 is about non-ascii in the email *headers*, and most SMTP servers these 
days do not yet support it[*]. Some of the big ones do, though (I believe gmail 
does).

[*] although that doesn't explain why what you got was SMTPSenderRefused.  You 
should have gotten SMTPNotSupportedError.

--
resolution:  -> works for me
stage:  -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue41023>
___



[issue39040] Wrong attachement filename when mail mime header was too long

2020-05-28 Thread R. David Murray


R. David Murray  added the comment:


New changeset 21017ed904f734be9f195ae1274eb81426a9e776 by Abhilash Raj in 
branch 'master':
bpo-39040: Fix parsing of email mime headers with whitespace between 
encoded-words. (gh-17620)
https://github.com/python/cpython/commit/21017ed904f734be9f195ae1274eb81426a9e776


--

___
Python tracker 
<https://bugs.python.org/issue39040>
___



[issue40597] generated email message exceeds RFC-mandated limit of 998 characters

2020-05-17 Thread R. David Murray


Change by R. David Murray :


--
stage: backport needed -> resolved

___
Python tracker 
<https://bugs.python.org/issue40597>
___



[issue40597] generated email message exceeds RFC-mandated limit of 998 characters

2020-05-17 Thread R. David Murray


R. David Murray  added the comment:


New changeset c1f1ddf30a595c2bfa3c06e54fb03fa212cd28b5 by Miss Islington (bot) 
in branch '3.8':
bpo-40597: email: Use CTE if lines are longer than max_line_length consistently 
(gh-20038) (gh-20084)
https://github.com/python/cpython/commit/c1f1ddf30a595c2bfa3c06e54fb03fa212cd28b5


--

___
Python tracker 
<https://bugs.python.org/issue40597>
___



[issue40597] generated email message exceeds RFC-mandated limit of 998 characters

2020-05-13 Thread R. David Murray


R. David Murray  added the comment:

Thanks, Arkadiusz.

--
resolution:  -> fixed
stage: patch review -> backport needed
versions:  -Python 3.5, Python 3.6, Python 3.7

___
Python tracker 
<https://bugs.python.org/issue40597>
___



[issue40597] generated email message exceeds RFC-mandated limit of 998 characters

2020-05-13 Thread R. David Murray


R. David Murray  added the comment:


New changeset 6f2f475d5a2cd7675dce844f3af436ba919ef92b by Arkadiusz Hiler in 
branch 'master':
bpo-40597: email: Use CTE if lines are longer than max_line_length consistently 
(gh-20038)
https://github.com/python/cpython/commit/6f2f475d5a2cd7675dce844f3af436ba919ef92b


--

___
Python tracker 
<https://bugs.python.org/issue40597>
___



[issue40597] generated email message exceeds RFC-mandated limit of 998 characters

2020-05-11 Thread R. David Murray


R. David Murray  added the comment:

The PR looks good to me, but I'd describe the change differently.  I'm not sure 
how I missed this in the original implementation, since I obviously checked it 
for the 8bit case.  Too long ago to remember :)
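
For illustration only, assuming a Python version that includes this fix: a 
pure-ASCII body with one very long line should no longer be emitted as-is, 
because the line exceeds the policy's max_line_length (78 for the default 
policy).

```python
from email.message import EmailMessage

msg = EmailMessage()
msg.set_content("a" * 2000 + "\n")  # one 2000-character line, ASCII only

# With the fix, a content transfer encoding is chosen so lines stay short.
cte = msg["Content-Transfer-Encoding"]
longest = max(len(line) for line in msg.as_string().splitlines())
```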

--

___
Python tracker 
<https://bugs.python.org/issue40597>
___



[issue40359] email.parse part.get_filename() fails to unwrap long attachment file names (legacy API)

2020-04-28 Thread R. David Murray


R. David Murray  added the comment:

As far as I know you currently still have to specify the policy.  It was, yes, 
intended that 'default' become the actual default.  I could have sworn there 
was an open issue for doing this, but I can't find it.  I remember having a 
conversation with someone who said they were going to work on getting it done, 
but unfortunately I don't remember who :(

I'm not very active in the python community currently so I can't really drive 
it, but it should definitely happen.

--

___
Python tracker 
<https://bugs.python.org/issue40359>
___



[issue40359] email.parse part.get_filename() fails to unwrap long attachment file names (legacy API)

2020-04-23 Thread R. David Murray


R. David Murray  added the comment:

Yeah, that looks like a bug in the old API.  If you try the new API, it does 
the right thing.  To do that, import email.policy and make your 
message_from_string call:

  email.message_from_string(raw, policy=email.policy.default)

Note, however, that you really ought to be using message_from_bytes.  
Serialized email messages are bytes, not unicode, and using message_from_string 
will get you into other trouble.
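
A small sketch of the bytes-based variant.  The message below is invented and 
is not the reporter's input; it uses an RFC 2231 continuation to split a file 
name across folded header lines.

```python
import email
from email import policy

# Made-up message with the attachment file name split over two header lines.
raw = (
    b"From: a@example.com\r\n"
    b"Content-Type: application/octet-stream\r\n"
    b"Content-Disposition: attachment;\r\n"
    b' filename*0="a_very_long_";\r\n'
    b' filename*1="file_name.txt"\r\n'
    b"\r\n"
    b"data\r\n"
)

msg = email.message_from_bytes(raw, policy=policy.default)
filename = msg.get_filename()  # the pieces are joined back together
```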

I don't know if it is worth fixing the old API.

--
title: email.parse part.get_filename() fails to unwrap long attachment file 
names -> email.parse part.get_filename() fails to unwrap long attachment file 
names (legacy API)

___
Python tracker 
<https://bugs.python.org/issue40359>
___



[issue39073] email incorrect handling of crlf in Address objects.

2020-03-29 Thread R. David Murray


Change by R. David Murray :


--
stage: patch review -> backport needed

___
Python tracker 
<https://bugs.python.org/issue39073>
___



[issue39073] email incorrect handling of crlf in Address objects.

2020-03-29 Thread R. David Murray


R. David Murray  added the comment:

Thanks!

--

___
Python tracker 
<https://bugs.python.org/issue39073>
___



[issue39073] email incorrect handling of crlf in Address objects.

2020-03-29 Thread R. David Murray


R. David Murray  added the comment:


New changeset 614f17211c5fc0e5b828be1d3320661d1038fe8f by Ashwin Ramaswami in 
branch 'master':
bpo-39073: validate Address parts to disallow CRLF (#19007)
https://github.com/python/cpython/commit/614f17211c5fc0e5b828be1d3320661d1038fe8f


--

___
Python tracker 
<https://bugs.python.org/issue39073>
___



[issue39966] mock 3.9 bug: Wrapped objects without __bool__ raise exception

2020-03-28 Thread R. David Murray


R. David Murray  added the comment:

My guess is that it isn't so much that __bool__ is special, as that the 
evaluation of values in a boolean context is special.  What you have to do to 
make a mock behave "correctly" in the face of that I'm not sure (I haven't 
investigated).  And I might be wrong.

--

___
Python tracker 
<https://bugs.python.org/issue39966>
___



[issue39073] email incorrect handling of crlf in Address objects.

2020-03-15 Thread R. David Murray


R. David Murray  added the comment:

Thanks for the PR.  I've made some review comments.

--

___
Python tracker 
<https://bugs.python.org/issue39073>
___



[issue27793] Double underscore variables in module are mangled when used in class

2020-03-06 Thread R. David Murray


R. David Murray  added the comment:

You are welcome to open a doc-enhancement issue for the global docs.  For the 
other, as noted already if you want to advocate for a change to this behavior 
you need to start on python-ideas, but I don't think you will get any traction.

Another possible enhancement you could propose (in a new issue) is to have the 
global statement check for variables that start with '__' and do something 
appropriate such as issue a warning...although I don't really know how hard 
that would be to implement.
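
For readers who have not run into the behavior under discussion, here is a 
small illustration.  The names are invented; the lookup inside the class is 
compiled against the mangled name, so the module-level variable is not found.

```python
__secret = "module level"  # a module-level name starting with two underscores

class Reader:
    def read(self):
        try:
            # Inside a class body this name is mangled to _Reader__secret.
            return __secret
        except NameError as exc:
            return str(exc)

message = Reader().read()  # the NameError text mentions the mangled name
```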

--

___
Python tracker 
<https://bugs.python.org/issue27793>
___



[issue39771] EmailMessage may need to support RFC-non-compliant MIME parameter encoding (encoded words in quotes) for output.

2020-02-29 Thread R. David Murray


R. David Murray  added the comment:

I actually agree: if most (by market share) MUAs handle the RFC-incorrect 
parameter encoding style, and a significant portion does not handle the RFC 
correct style, then we should support the de-facto standard rather than the 
official standard as the default.  I just wish Microsoft would write better 
software :)  If on the other hand it is only microsoft out of the big market 
share players that is broken, I'm not sure I'd want it to be the default.  But 
we could still support it optionally.

So yeah, we could have a policy control that governs which one is actually used.

So this is a feature request, and ideally should be supported by an 
investigation of what MUAs support what, by market share.  And there's another 
question: does this only affect the filename parameter, or is it all MIME 
parameters?  I would expect it to be the latter, but someone should check at 
least a few examples of that to be sure.

--
stage:  -> needs patch
title: EmailMessage.add_header doesn't work -> EmailMessage may need to support 
RFC-non-compliant MIME parameter encoding (encoded words in quotes) for output.
type: behavior -> enhancement

___
Python tracker 
<https://bugs.python.org/issue39771>
___



[issue39793] make_msgid fail on FreeBSD 12.1-RELEASE-p1 with different domains

2020-02-29 Thread R. David Murray


R. David Murray  added the comment:

I don't object to this patch, but that sure looks like a broken system.

--

___
Python tracker 
<https://bugs.python.org/issue39793>
___



[issue39757] EmailMessage bad encoding for international domain

2020-02-28 Thread R. David Murray


R. David Murray  added the comment:

This is not actually a duplicate of 11783.  Rereading (parts of) that issue, we 
decided we currently have no good way to do automatic conversion between 
unicode and internationalized domains, so the user of the library has to do it 
themselves.  This means that the bug *here* is that the new email API is 
*wrongly* encoding the non-ascii in the domain by using an encoded word.  I'm 
surprised at that; I thought I'd guarded against it.

What should be happening here is that an error should be raised when that 
header is set (or possibly when it is accessed/serialized, but when set would 
be better I think) saying that there is non-ascii in the domain part.
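
Since the conversion is left to the user of the library, one way to do it is 
to IDNA-encode the domain before building the address.  This is a sketch that 
assumes the built-in "idna" codec is adequate for the domains involved; the 
address itself is made up.

```python
from email.headerregistry import Address
from email.message import EmailMessage

unicode_domain = "bücher.example"  # made-up internationalized domain

# Convert the domain to its ASCII (punycode) form ourselves.
ascii_domain = unicode_domain.encode("idna").decode("ascii")

addr = Address(username="user", domain=ascii_domain)

msg = EmailMessage()
msg["To"] = addr
to_header = str(msg["To"])
```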

--
resolution: duplicate -> 
stage: resolved -> needs patch
status: closed -> open
superseder: email parseaddr and formataddr should be IDNA aware -> 
title: EmailMessage wrong encoding for international domain -> EmailMessage bad 
encoding for international domain

___
Python tracker 
<https://bugs.python.org/issue39757>
___



[issue39771] EmailMessage.add_header doesn't work

2020-02-28 Thread R. David Murray


R. David Murray  added the comment:

Since Outlook is one of the mailers that generates the non-RFC-compliant 
headers, it doesn't surprise me all that much that it can't interpret the RFC 
compliant headers correctly.

I'm not sure there is anything we can do here.

I suppose someone could do a survey of mail clients and document which ones can 
handle which style of parameter encoding.  If it turns out more handle the 
"wrong" way than handle the "right" way, we could consider adopting the 
de-facto standard, although I won't like it much :)

(There is also a possibility there is a bug in our RFC compliance, but this is 
the first problem report I've seen.)

--

___
Python tracker 
<https://bugs.python.org/issue39771>
___



[issue39771] EmailMessage.add_header doesn't work

2020-02-27 Thread R. David Murray


R. David Murray  added the comment:

The legacy API appears to be using an RFC-incorrect (but common) encoded-word 
encoding, while the new API is using the RFC-compliant MIME-parameter encoding 
(% encoding).  Which email client are you using?
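
To see the style the new API produces, a quick check along these lines can 
help.  The file name and payload are invented for the example.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg.set_content(
    b"data",
    maintype="application",
    subtype="octet-stream",
    filename="résumé.txt",  # non-ASCII file name, made up for the example
)

text = msg.as_string()
# Pick out the Content-Disposition header line to inspect the encoding used.
disposition = [
    line for line in text.splitlines()
    if line.lower().startswith("content-disposition")
]
```

The Content-Disposition line carries the name as a %-encoded "filename*=" 
parameter rather than as an encoded word.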

--

___
Python tracker 
<https://bugs.python.org/issue39771>
___



[issue39771] EmailMessage.add_header doesn't work

2020-02-27 Thread R. David Murray


R. David Murray  added the comment:

Actually, given that the contentmanager does accept a charset parameter for 
text content, it does seem reasonable to treat this as a bug.  But as I said 
fixing it may not be trivial.

--

___
Python tracker 
<https://bugs.python.org/issue39771>
___



[issue39771] EmailMessage.add_header doesn't work

2020-02-27 Thread R. David Murray


R. David Murray  added the comment:

I think you are saying that you want the charset in the encoded filename to be 
GBK rather than utf-8?  utf-8 should certainly display correctly in your email 
client, though, so if it is not there is something else going wrong.  

As far as the 3 tuple not working to set the charset...I believe what is 
happening there is that a header created by the application gets "refolded" on 
serialization, and refolding doesn't keep the existing charset, it converts 
everything to utf-8.  This is an intentional part of the design: the library 
handles the gory details of MIME and uses utf-8 as the charset for application 
created content.  It is actually an accident of the implementation that the 
tuple form of the filename is even accepted; you will note that it is *not* 
documented in the contentmanager docs.

It wouldn't be crazy to ask for this as a feature, and it could even be treated 
as a bug that it doesn't work if we want to, but it may not be easy to "fix", 
because it goes against the design philosophy of the new API.

--

___
Python tracker 
<https://bugs.python.org/issue39771>
___



[issue39384] Email parser creates a message object that can't be flattened as bytes.

2020-02-04 Thread R. David Murray


R. David Murray  added the comment:

message_from_bytes

--

___
Python tracker 
<https://bugs.python.org/issue39384>
___



[issue39384] Email parser creates a message object that can't be flattened as bytes.

2020-02-04 Thread R. David Murray


R. David Murray  added the comment:

If we can get an actual reproducer using message_as_bytes I'd feel more 
comfortable with the fix.  I worry that there is some other bug this is 
exposing that should be fixed instead.

--

___
Python tracker 
<https://bugs.python.org/issue39384>
___



[issue10740] sqlite3 module breaks transactions and potentially corrupts data

2020-01-25 Thread R. David Murray


R. David Murray  added the comment:

Please open a new issue for this question.

--

___
Python tracker 
<https://bugs.python.org/issue10740>
___



[issue24337] Implement `http.client.HTTPMessage.__repr__` to make debugging easier

2020-01-22 Thread R. David Murray


R. David Murray  added the comment:

Thanks for the PR, but I've noted an issue on the review.  In any case we 
should agree on what goes in the repr here in this issue before actually 
implementing anything.

--

___
Python tracker 
<https://bugs.python.org/issue24337>
___



[issue39309] Please delete my account

2020-01-20 Thread R. David Murray


R. David Murray  added the comment:

AFAIR it can only be done using the roundup command line on the server.

--
nosy: +ezio.melotti

___
Python tracker 
<https://bugs.python.org/issue39309>
___



[issue39384] Email parser creates a message object that can't be flattened as bytes.

2020-01-20 Thread R. David Murray

R. David Murray  added the comment:

Since you parsed it as a string it is not really legitimate to serialize it as 
bytes.  (That will work if the input message only contains ascii, but not if it 
contains unicode).  You'll get the same error if you replace the garbage with 
the "’".  Using errors=replace is not crazy, but it hides the actual problem.  
Let's see what other people think :)

In theory you could "fix" this by encoding the unicode using the charset 
specified by the container.  I have no idea how complicated it will be to do that, 
and it would be a new feature: parsing strings is specified to only work with 
ASCII input, currently.

I put "fix" in quotes, because even if you make text parts like this example 
work, you still can't handle non-text 8bit mime parts.  Is it worth doing 
anyway?

Really, message_from_string and friends should just be avoided entirely, maybe 
even deprecated.
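
By contrast, the bytes route round-trips cleanly.  This is a sketch with an 
invented message; the body contains the same kind of non-ASCII character as 
the example under discussion.

```python
import email
from email import policy

# Made-up 8bit message; the body contains U+2019 encoded as utf-8.
raw = (
    "Subject: test\n"
    "MIME-Version: 1.0\n"
    'Content-Type: text/plain; charset="utf-8"\n'
    "Content-Transfer-Encoding: 8bit\n"
    "\n"
    "It\u2019s here\n"
).encode("utf-8")

msg = email.message_from_bytes(raw, policy=policy.default)
flattened = msg.as_bytes()     # parsed from bytes, so this is legitimate
body_text = msg.get_content()  # decoded using the declared charset
```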

--

___
Python tracker 
<https://bugs.python.org/issue39384>
___



[issue23434] support encoded filename in Content-Disposition for HTTP in cgi.FieldStorage

2020-01-07 Thread R. David Murray


R. David Murray  added the comment:

Are you saying there is no (http) RFC compliant way to fix this, or no way to 
fix it with the email library parsers?  If the latter, the library is pretty 
flexible and for internal stdlib use it would probably be permissible to 
directly call methods in the internal parsing module, if those would be useful.

I haven't re-read the issue to reload my brain, so this question may be off 
point (except for the first clause of the question).

--

___
Python tracker 
<https://bugs.python.org/issue23434>
___



[issue23147] Possible error in _header_value_parser.py

2020-01-07 Thread R. David Murray


R. David Murray  added the comment:

Thanks for the ping.  Whether or not Serhiy's patch fixed the original problem, 
the algorithm rewrite has happened so this issue is no longer relevant in any 
case.

--
stage: test needed -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue23147>
___



[issue39040] Wrong attachement filename when mail mime header was too long

2019-12-24 Thread R. David Murray


R. David Murray  added the comment:

I don't see the change to the test in the PR.  Did you miss a push or is github 
doing something wonky with the review?  (I haven't used github review in a 
while and I had forgotten how hard it is to use...)

--

___
Python tracker 
<https://bugs.python.org/issue39040>
___



[issue39131] signing needs two serialisation passes

2019-12-24 Thread R. David Murray


R. David Murray  added the comment:

Ideally this should be exposed by extending the content manager.  Instantiating 
MIME classes is part of the old API, not the new. The code in the PR may well 
be correct, but the class should be hidden from the normal user (of the new API).  
I'm not sure what the best way to specify the signing function will be, but I'm 
guessing a new keyword parameter in the content API.

Note that the current content management API is more of a framework than a 
fully worked out system, so figuring out the best way to add this may require 
some design discussion.
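
To show the extension mechanism only, and not a proposed design, here is a 
sketch of a content manager whose set handler accepts an extra keyword.  The 
"digest" keyword, the X-Body-Digest header and the helper names are all 
hypothetical, and a plain hash stands in for real signing.

```python
import hashlib
from email.contentmanager import ContentManager, raw_data_manager
from email.message import EmailMessage

def set_text_with_digest(msg, text, digest=None, **kw):
    # Let the stock manager build the text part, then add the extra header.
    raw_data_manager.set_content(msg, text, **kw)
    if digest is not None:
        msg["X-Body-Digest"] = digest(text.encode("utf-8"))

# A separate manager, so the library's default behavior is left untouched.
manager = ContentManager()
manager.add_set_handler(str, set_text_with_digest)

def sha256_hex(data):
    """Stand-in for a signing function: returns a hex digest of the body."""
    return hashlib.sha256(data).hexdigest()

msg = EmailMessage()
msg.set_content("hello\n", content_manager=manager, digest=sha256_hex)
```

The caller never instantiates a MIME class; everything goes through 
set_content, which is the shape the new API aims for.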

--

___
Python tracker 
<https://bugs.python.org/issue39131>
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39040] Wrong attachement filename when mail mime header was too long

2019-12-17 Thread R. David Murray


R. David Murray  added the comment:

One more tweak to the test and we'll be good to go.

--

___
Python tracker 
<https://bugs.python.org/issue39040>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39073] email incorrect handling of crlf in Address objects.

2019-12-17 Thread R. David Murray


R. David Murray  added the comment:

Hmm.  Yes, \r\n should be disallowed in the arguments to Address.  I thought it 
already was, so that's a bug.  That bug produces the other apparent bug as 
well: because the X: was treated as a separate line, the previous header did 
not need double quotes so they are no longer added.

So there's no 3.8 specific bug here, but there is a bug.

--
title: email regression in 3.8: folding -> email incorrect handling of crlf in 
Address objects.

___
Python tracker 
<https://bugs.python.org/issue39073>
___



[issue39071] email.parser.BytesParser - parse and parsebytes work not equivalent

2019-12-17 Thread R. David Murray


R. David Murray  added the comment:

All of which isn't to discount that you might have found a bug, by the way, 
if you want to investigate further :)

--

___
Python tracker 
<https://bugs.python.org/issue39071>
___



[issue39071] email.parser.BytesParser - parse and parsebytes work not equivalent

2019-12-17 Thread R. David Murray


R. David Murray  added the comment:

The problem is that you are starting with different inputs.  unicode strings 
and bytes are different things, and so parsing them can produce different 
results.  The fact of the matter is that email messages are defined to be 
bytes, so parsing a unicode string pretending it is an email message is just 
asking for errors anyway.  The string parsing methods are really only provided 
for backward compatibility and historical reasons.

I thought this was clear from the existing documentation, but clearly it isn't 
:)  I'll review a suggested doc change, but the thing to explain is not that 
parse and parsebytes might produce different results, but that parsing email 
from strings is not a good idea and will likely produce unexpected results for 
anything except the simplest non-mime messages.

Note: the reason you got different checksums might have had to do with line 
ends, depending on how you calculated the checksums.  You should also consider 
using get_content and not get_payload.  get_payload has a weird legacy API that 
doesn't always do what you think it will, and that might be another source of 
checksum issues.  But really, parsing a unicode representation of a mime 
message is just likely to be buggy.
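
A small sketch of the recommended path, assuming the message is available as 
bytes: parse with message_from_bytes and policy=default, then read the body 
with get_content.  The sample message is invented; get_payload(decode=True) is 
shown only for contrast.

```python
from email import message_from_bytes
from email.policy import default

# An invented message whose body contains raw utf-8 bytes.
raw = (b"Content-Type: text/plain; charset=utf-8\n"
       b"Content-Transfer-Encoding: 8bit\n"
       b"\n"
       b"caf\xc3\xa9\n")

msg = message_from_bytes(raw, policy=default)

text = msg.get_content()                # decoded str, charset applied
legacy = msg.get_payload(decode=True)   # bytes, only the CTE is undone
print(type(text).__name__, type(legacy).__name__)
```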

--

___
Python tracker 
<https://bugs.python.org/issue39071>
___



[issue39040] Wrong attachement filename when mail mime header was too long

2019-12-16 Thread R. David Murray


R. David Murray  added the comment:

In general your solution looks good, just a few naming comments and an 
additional test request.

--

___
Python tracker 
<https://bugs.python.org/issue39040>
___



[issue39040] Wrong attachement filename when mail mime header was too long

2019-12-15 Thread R. David Murray


R. David Murray  added the comment:

The example you want to look at is get_unstructured.  That shows both lookback 
and modification of the parse tree to handle the whitespace between encoded 
words.
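
The effect of that whitespace handling can be observed through the public API, 
without touching the private parser.  This is only an illustration with an 
invented Subject header, assuming policy=default so that the header is parsed 
as unstructured text.

```python
from email import message_from_string
from email.policy import default

# Two adjacent encoded words separated by a single space.
raw = "Subject: =?utf-8?q?foo?= =?utf-8?q?bar?=\n\n"
msg = message_from_string(raw, policy=default)

# Per RFC 2047 the whitespace between adjacent encoded words is not part
# of the decoded text.
subject = str(msg["Subject"])
print(repr(subject))
```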

--

___
Python tracker 
<https://bugs.python.org/issue39040>
___



[issue39040] Wrong attachement filename when mail mime header was too long

2019-12-14 Thread R. David Murray


R. David Murray  added the comment:

And you are right that this is a very common bug in email programs.  So common 
that I suspect the RFC folks will eventually have to accept it as a de-facto 
standard.  So we do need to support it in the python email library.

--

___
Python tracker 
<https://bugs.python.org/issue39040>
___



[issue39040] Wrong attachement filename when mail mime header was too long

2019-12-14 Thread R. David Murray


R. David Murray  added the comment:

Yes, google should fix their bug.  However, the python email package tries very 
hard to interpret even RFC-non-compliant emails when there is a way to do so.  
As I said, the package already tries to interpret headers such as google is 
generating, it's just that there is a bug in that interpretation: it is keeping 
the blank between the encoded words when it should not be.  That bug can be 
fixed, in get_raw_encoded_word and/or get_parameter, in 
email._header_value_parser.

--

___
Python tracker 
<https://bugs.python.org/issue39040>
___



[issue39040] Wrong attachement filename when mail mime header was too long

2019-12-13 Thread R. David Murray


R. David Murray  added the comment:

That header is *completely* non-RFC compliant.  If gmail generated that header 
there is something very wrong in google-land :(

The RFC compliant formatting for that header looks like this:

Content-Disposition: attachment;
 filename*=utf-8''Schulbesuchsbest%C3%A4ttigung.pdf

You will note that this is nothing like encoded word format.  Encoded words are 
not valid inside quoted strings, and quoted strings can't be used in mime 
header attributes if there are non-ascii characters involved.  Nor can encoded 
words.  

Now, all that said, there is an obvious rule that can be followed to understand 
what that header is trying to convey, and the current parser already implements 
most of it (you will find comments about it in the parser, as well as defects 
being registered).  So, a patch to _header_value_parser to fix the error 
recovery will be accepted.  I've looked at the code to remind myself, but not 
deeply enough to be *sure* where the changes need to be made.  There are two 
possibilities I see off the bat (and both may need fixing): 
get_bare_quoted_string and get_parameter.  Either one or both of those may be 
forgetting that whitespace between encoded words should be dropped.
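
For comparison, here is a sketch of how the new API produces and reads back a 
non-ascii filename.  It assumes EmailMessage with the default policy; the 
attachment bytes are a placeholder, not a real PDF.

```python
from email import message_from_bytes
from email.message import EmailMessage
from email.policy import default

msg = EmailMessage()
msg.set_content("see attachment")
msg.add_attachment(b"placeholder bytes",
                   maintype="application", subtype="pdf",
                   filename="Schulbesuchsbestättigung.pdf")

wire = bytes(msg)

# Parse the serialized form again and recover the filename.
parsed = message_from_bytes(wire, policy=default)
attachment = next(parsed.iter_attachments())
name = attachment.get_filename()
print(name)
```

The serialized Content-Disposition header uses the filename* parameter syntax 
rather than encoded words.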

--

___
Python tracker 
<https://bugs.python.org/issue39040>
___



[issue39040] Wrong attachement filename when mail mime header was too long

2019-12-13 Thread R. David Murray


R. David Murray  added the comment:

Thanks for the report.  Can you provide an example that reproduces the problem? 
 

Per the RFC, lines may be broken before whitespace in certain places in certain 
headers, but that does not make the whitespace go away.  Only the crlf sequence 
is removed when unfolding the header, per the RFC, so your proposed fix is 
incorrect. I suspect your example header is invalid, and the question will then 
become is there some sort of Postel-style error recovery we can and want to do 
in the function that parses the content-disposition header.

--

___
Python tracker 
<https://bugs.python.org/issue39040>
___



[issue38625] SpooledTemporaryFile does not seek correctly after being rolled over

2019-11-24 Thread R. David Murray


R. David Murray  added the comment:

The docs currently say "The returned object is a file-like object whose _file 
attribute is either an io.BytesIO or io.StringIO object (depending on whether 
binary or text mode was specified) or a true file object, depending on whether 
rollover() has been called."  The fact that taking an iterator gets you 
whatever the *current* _file object is is implied by that but not made 
explicit.  A doc update to make that explicit would probably be appropriate.
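
A short sketch of what the quoted documentation describes, assuming binary 
mode (the default) and an arbitrary max_size: the private _file attribute 
changes type once rollover() has been called.

```python
import io
import tempfile

with tempfile.SpooledTemporaryFile(max_size=1024) as f:
    f.write(b"small")
    before = type(f._file)   # in-memory buffer while under max_size
    f.rollover()             # force the switch to a real file
    after = type(f._file)
    f.seek(0)
    data = f.read()

print(before.__name__, after.__name__)
```

Anything that captured the old _file object before the rollover (an iterator, 
for instance) keeps referring to the in-memory buffer.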

--

___
Python tracker 
<https://bugs.python.org/issue38625>
___



[issue38698] While parsing email message id: UnboundLocalError

2019-11-24 Thread R. David Murray


R. David Murray  added the comment:

Actually, the success path there should also check that value is empty, and if 
it is not register a defect for that as well.

--

___
Python tracker 
<https://bugs.python.org/issue38698>
___



[issue38672] mimetypes.init() fails if no access to one of known files

2019-11-24 Thread R. David Murray


R. David Murray  added the comment:

I haven't looked at this in detail, but here are my general thoughts: I think 
it would be reasonable to expect that the module would function even if the 
file permissions are screwed up, similar to how unix commands that try to read 
.netrc will (try to) function even if its permissions are wrong.  I would, 
however, expect the module to emit a warning in that case.  I'm of two minds 
about the behavior when the caller specifies filenames explicitly.  I could see 
that going either way, but I lean slightly toward making the behavior 
consistent.  While the programmer might appreciate the traceback, the user of 
the program would probably appreciate the "try to keep going" behavior, since 
the filenames provided will often be in the same class of "standard defaults" 
as the existing well known files are, just in the context of that particular 
application.  But like I said, that is just a lean, and I could go the other 
way on this as well :)

I haven't looked at the isfile issue, but it seems reasonable that if the path 
exists we should make sure it is a file before reading it...but perhaps readfp 
will effectively do that?  Write a test and see what happens :)

I don't know whether to call this change a bug fix or a feature, so I guess 
we'd default to feature unless someone can tilt the balance with an argument :)

--

___
Python tracker 
<https://bugs.python.org/issue38672>
___



[issue38698] While parsing email message id: UnboundLocalError

2019-11-24 Thread R. David Murray


R. David Murray  added the comment:

More tests are always good :)

The "correct" solution here (as far as I remember, it has been a while since 
I've had time to even look at the _header_value_parser code) would be to add 
a new 'invalid-msg-id' token, and do this:

    message_id = MessageID()
    try:
        token, value = get_msg_id(value)
        message_id.append(token)
    except HeaderParseError as ex:
        message_id = InvalidMessageID(value)
        message_id.defects.append(InvalidHeaderDefect(
            f"Invalid msg_id: {ex}"))
    return message_id
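
From the caller's side, the intended result is that a malformed Message-ID is 
reported as a defect instead of an exception.  The following sketch assumes an 
interpreter where a fix along these lines has landed; the header value is 
invented.

```python
from email import message_from_string
from email.errors import InvalidHeaderDefect
from email.policy import default

# A Message-ID that does not start with "<", so it cannot be a msg-id.
msg = message_from_string("Message-ID: [bad id\n\n", policy=default)

header = msg["Message-ID"]     # must not raise
defects = list(header.defects)
print([type(d).__name__ for d in defects])
```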

--

___
Python tracker 
<https://bugs.python.org/issue38698>
___



[issue37532] email.header.make_header() doesn't work if any `ascii` code is out of range(128)

2019-08-01 Thread R. David Murray

R. David Murray  added the comment:

Right, and the python email package fully supports non ascii:

>>> msg = EmailMessage()
>>> msg['Subject'] = "Panamá- Casco Antiguo"
>>> bytes(msg)
b'Subject: =?utf-8?q?Panam=C3=A1-?= Casco Antiguo\n\n'
>>> str(msg)
'Subject: Panamá- Casco Antiguo\n\n'
>>> msg['subject']
'Panamá- Casco Antiguo'

make_header also supports non-ascii, you just have to tell it what charset you 
want to use.  Like I said, make_header is part of the *legacy* API, and it 
really is a pain to use.  That's why we wrote the new API.
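
For completeness, a sketch of the legacy route: make_header takes a list of 
(string, charset) pairs, so the charset has to be given explicitly.  The 
round trip through decode_header is only there to show that nothing is lost.

```python
from email.header import decode_header, make_header

# Legacy API: the charset must be spelled out for non-ascii text.
header = make_header([("Panamá- Casco Antiguo", "utf-8")])
encoded = header.encode()

# Decode the wire form again to confirm it round-trips.
round_trip = str(make_header(decode_header(encoded)))
print(encoded)
```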

--

___
Python tracker 
<https://bugs.python.org/issue37532>
___



[issue37532] email.header.make_header() doesn't work if any `ascii` code is out of range(128)

2019-08-01 Thread R. David Murray


R. David Murray  added the comment:

The input header is not valid (non-ascii is not allowed in headers), so you 
shouldn't expect make_header to do anything sensible.  Note that this is the 
legacy API, which is a toolkit and does not hold your hand when it comes to RFC 
compliance.  Aside from any other concerns, this is long standing behavior (it 
is the same in python2), and it doesn't make sense to change the behavior of a 
legacy API.

--
resolution:  -> not a bug
stage: patch review -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue37532>
___



[issue37491] IndexError in get_bare_quoted_string

2019-08-01 Thread R. David Murray


Change by R. David Murray :


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue37491>
___



[issue37492] should email.utils.parseaddr treat a@b. as invalid email ?

2019-07-13 Thread R. David Murray


R. David Murray  added the comment:

Right, those absolutely are valid addresses.  A resolver will normally look up 
a name with an internal dot first as if it were an FQDN, but if it does so and 
does not get an answer it will then look it up again as a "local" address 
(appending in turn the strings from the 'search' directive in resolv.conf or 
equivalent) *if* it does not end in a final dot.  If it does end in a final 
dot, no further lookup as local is done.

While it isn't *normal* to send email to a TLD using a trailing dot, it is 
*legal*.  In theory the address 'postmaster@com.' ought to be a valid email 
address (I doubt that it actually is, though). On the other hand, I will be 
very surprised if *all other* TLDs are without valid email addresses, 
especially the new ones.  It is also easy to imagine an environment using email 
with private single label domain names using trailing dots specifically to 
suppress appending of search domains for sandboxing reasons.  Thus the email 
library must support it as valid, both for RFC reasons and for practical 
reasons.

--

___
Python tracker 
<https://bugs.python.org/issue37492>
___



[issue37482] Email address display name fails with both encoded words and special chars

2019-07-10 Thread R. David Murray


R. David Murray  added the comment:

The display name is a phrase, and a phrase is a sequence of words, and a word 
is either a quoted string or an atom.  So it is legal to mix quoted strings and 
encoded words in a display name.  I'd vote to do whichever one is easier to 
implement :)  (I haven't looked at your PR yet and unfortunately my time is 
limited :(

--

___
Python tracker 
<https://bugs.python.org/issue37482>
___



[issue37482] Email address display name fails with both encoded words and special chars

2019-07-10 Thread R. David Murray

R. David Murray  added the comment:

FYI, it would have been most helpful if you had posted your example in the 
issue text instead of as an attached file, as it explains the problem better 
than your text does :)

Here is a minimal reproducer:

>>> m = EmailMessage(policy=strict)
>>> m['From'] = '"Foo Bar, España" '
>>> bytes(m)
b'From: Foo Bar, =?utf-8?q?Espa=C3=B1a?= \n\n'

This serialization of the header is, as you say, invalid.  Either the comma 
should be encoded, or the "Foo Bar," should be in quotes.

--

___
Python tracker 
<https://bugs.python.org/issue37482>
___



[issue37357] mbox From line wrongly detected

2019-07-09 Thread R. David Murray


R. David Murray  added the comment:

This problem is the whole reason "mangle_from" exists in the email library...
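
A sketch of what that option does, assuming the Generator class is used 
directly (the message body is invented): with mangle_from_ enabled, a body 
line starting with "From " is written as ">From ".

```python
import io
from email.generator import Generator
from email.message import EmailMessage
from email.policy import default

msg = EmailMessage()
msg.set_content("Hello,\nFrom here on the body looks like a new mbox entry.\n")


def flatten(message, mangle):
    """Serialize message to text with mangle_from_ switched on or off."""
    buf = io.StringIO()
    Generator(buf, mangle_from_=mangle, policy=default).flatten(message)
    return buf.getvalue()


plain = flatten(msg, False)
mangled = flatten(msg, True)
print(mangled)
```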

--

___
Python tracker 
<https://bugs.python.org/issue37357>
___



[issue31445] Index out of range in get of message.EmailMessage.get()

2019-07-09 Thread R. David Murray


R. David Murray  added the comment:

Note that the reporter indicated that the message was an instance of 
EmailMessage (the new API).  You'd need to use policy=default to get that using 
message_from_string.  But yes, this was fixed in another issue.
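
To illustrate the difference the policy argument makes, here is a small sketch 
with an invented message: without a policy the parser returns the legacy 
Message class, with policy=default it returns EmailMessage.

```python
from email import message_from_string
from email.message import EmailMessage, Message
from email.policy import default

raw = "Subject: hi\n\nbody\n"

legacy = message_from_string(raw)                  # compat32 policy
modern = message_from_string(raw, policy=default)  # new API

print(type(legacy).__name__, type(modern).__name__)
```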

--
stage: patch review -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue31445>
___



[issue32179] Empty email address in headers triggers an IndexError

2019-07-09 Thread R. David Murray


R. David Murray  added the comment:

BareQuotedString implies the new API is being used, though that was not made 
clear in the report.  However, unlike the other recently closed issue, this one 
was in fact fixed (and I have a vague memory of reviewing the PR):

>>> m = message_from_string('ReplyTo: ""', policy=default)
>>> m['ReplyTo']
'""'

--

___
Python tracker 
<https://bugs.python.org/issue32179>
___



[issue32178] Some invalid email address groups cause an IndexError instead of a HeaderParseError

2019-07-09 Thread R. David Murray


R. David Murray  added the comment:

The fact that the original report mentions HeaderParseError implies that the 
new API is being used, though the report didn't make that clear.  The problem 
still exists:

>>> m = message_from_string("To: :Foo  \n\n", policy=default)
>>> m['To']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/rdmurray/python/p38/Lib/email/message.py", line 391, in __getitem__
    return self.get(name)
  File "/home/rdmurray/python/p38/Lib/email/message.py", line 471, in get
    return self.policy.header_fetch_parse(k, v)
  File "/home/rdmurray/python/p38/Lib/email/policy.py", line 163, in header_fetch_parse
    return self.header_factory(name, value)
  File "/home/rdmurray/python/p38/Lib/email/headerregistry.py", line 602, in __call__
    return self[name](name, value)
  File "/home/rdmurray/python/p38/Lib/email/headerregistry.py", line 197, in __new__
    cls.parse(value, kwds)
  File "/home/rdmurray/python/p38/Lib/email/headerregistry.py", line 343, in parse
    groups.append(Group(addr.display_name,
  File "/home/rdmurray/python/p38/Lib/email/_header_value_parser.py", line 315, in display_name
    return self[0].display_name
  File "/home/rdmurray/python/p38/Lib/email/_header_value_parser.py", line 382, in display_name
    return self[0].display_name
  File "/home/rdmurray/python/p38/Lib/email/_header_value_parser.py", line 564, in display_name
    if res[0].token_type == 'cfws':
IndexError: list index out of range

--
resolution: out of date -> 
status: closed -> open

___
Python tracker 
<https://bugs.python.org/issue32178>
___



[issue19645] decouple unittest assertions from the TestCase class

2019-07-09 Thread R. David Murray


R. David Murray  added the comment:

"But - what are we solving for here?"  I'll tell you what my fairly common use 
case is.  Suppose I have some test infrastructure code, and I want to make some 
assertions in it.  What I invariably end up doing is passing 'self' into the 
infrastructure method/class just so I can call the assert methods from it.  I'd 
much rather be just calling the assertions, without carrying the whole test 
object around.  It *works* to do that, but it bothers me every time I do it or 
read it in code, and it makes the infrastructure code needlessly more 
complicated and slightly harder to understand/read.
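
One workaround that exists today (not something proposed in this issue, just 
an illustration) is to keep a bare TestCase instance around purely for its 
assert methods.  The helper name assert_sorted is made up.

```python
import unittest

# A TestCase instance used only as a holder for the assert* methods.
_check = unittest.TestCase()


def assert_sorted(seq):
    """Infrastructure helper that asserts without needing a test's self."""
    items = list(seq)
    _check.assertEqual(items, sorted(items))


assert_sorted([1, 2, 3])          # passes silently

try:
    assert_sorted([3, 1, 2])
except AssertionError:
    failed = True
else:
    failed = False
print(failed)
```

It works, but it is still an object being carried around just to reach the 
assertions, which is the awkwardness described above.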

--

___
Python tracker 
<https://bugs.python.org/issue19645>
___



[issue27737] email.header.Header.encode() crashes with IndexError on spaces only value

2019-05-22 Thread R. David Murray


R. David Murray  added the comment:


New changeset 0416d6f05a96e0f1b3751aa97abfffe6d3323976 by R. David Murray (Miss 
Islington (bot)) in branch '3.7':
bpo-27737: Allow whitespace only headers encoding (GH-13478) (#13517)
https://github.com/python/cpython/commit/0416d6f05a96e0f1b3751aa97abfffe6d3323976


--

___
Python tracker 
<https://bugs.python.org/issue27737>
___



[issue36520] Email header folded incorrectly

2019-05-22 Thread R. David Murray

R. David Murray  added the comment:

Never mind, I was testing with the wrong version of python.  This bug was 
introduced somewhere after 3.4 :(

>>> from email.message import EmailMessage
>>> m = EmailMessage()
>>> m['Subject'] = 'Hello Wörld! Hello Wörld! Hello Wörld! Hello Wörld!Hello Wörld!'
>>> bytes(m)
b'Subject: Hello =?utf-8?q?W=C3=B6rld!_Hello_W=C3=B6rld!_Hello_W=C3=B6rld!?=\n Hello =?utf-8?=?utf-8?q?q=3FW=3DC3=3DB6rld!Hello=3F=3D_W=C3=B6rld!?=\n\n'

--

___
Python tracker 
<https://bugs.python.org/issue36520>
___



[issue36520] Email header folded incorrectly

2019-05-22 Thread R. David Murray


R. David Murray  added the comment:

Can you demonstrate the problem with an actual email object?  
header_store_parse is not meant to be called directly.

--

___
Python tracker 
<https://bugs.python.org/issue36520>
___



[issue27737] email.header.Header.encode() crashes with IndexError on spaces only value

2019-05-22 Thread R. David Murray


R. David Murray  added the comment:

Thanks.

--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed
versions: +Python 3.7, Python 3.8 -Python 3.4, Python 3.5, Python 3.6

___
Python tracker 
<https://bugs.python.org/issue27737>
___



[issue27737] email.header.Header.encode() crashes with IndexError on spaces only value

2019-05-22 Thread R. David Murray

R. David Murray  added the comment:


New changeset ef5bb25e2d6147cd44be9c9b166525fb30485be0 by R. David Murray 
(Batuhan Taşkaya) in branch 'master':
bpo-27737: Allow whitespace only headers encoding (#13478)
https://github.com/python/cpython/commit/ef5bb25e2d6147cd44be9c9b166525fb30485be0


--

___
Python tracker 
<https://bugs.python.org/issue27737>
___



[issue33524] non-ascii characters in headers causes TypeError on email.policy.Policy.fold

2019-05-17 Thread R. David Murray


R. David Murray  added the comment:


New changeset feac6cd7753425fba006e97e2d9b74a0c0c75894 by R. David Murray 
(Abhilash Raj) in branch 'master':
bpo-33524: Fix the folding of email header when max_line_length is 0 or None 
(#13391)
https://github.com/python/cpython/commit/feac6cd7753425fba006e97e2d9b74a0c0c75894


--

___
Python tracker 
<https://bugs.python.org/issue33524>
___



[issue21315] email._header_value_parser does not recognise in-line encoding changes

2019-05-17 Thread R. David Murray


R. David Murray  added the comment:

I don't see that line of code in unstructured_ew_without_whitespace.diff.

Oh, you are referring to his monkey patch.  Yes, that is not a suitable 
solution for anyone but him, and I don't think he meant to imply otherwise :)

--

___
Python tracker 
<https://bugs.python.org/issue21315>
___



[issue21315] email._header_value_parser does not recognise in-line encoding changes

2019-05-17 Thread R. David Murray


R. David Murray  added the comment:

A cleaner/safer solution here would be:

  tok, *remainder = _wsp_splitter(value, 1)
  if _rfc2047_matcher(tok):
      tok, *remainder = value.partition('=?')
  
where _rfc2047_matcher would be a regex that matches a correctly formatted 
encoded word. There is a regex for that in the header.py module, though for this 
application we don't need the groups it has.
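
As a rough illustration of what such a matcher could look like, here is a 
standalone sketch.  ENCODED_WORD and split_leading_encoded_word are 
hypothetical names, and the pattern is a simplified stand-in, not the regex 
from header.py.

```python
import re

# Simplified encoded-word pattern: =?charset?encoding?text?=
# (an assumption for illustration, deliberately without capture groups).
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?=")


def split_leading_encoded_word(token):
    """Split token into (encoded_word, rest); encoded_word is None if absent."""
    match = ENCODED_WORD.match(token)
    if match is None:
        return None, token
    return match.group(), token[match.end():]


glued = split_leading_encoded_word("=?utf-8?q?Espa=C3=B1a?=rest")
plain = split_leading_encoded_word("plain")
print(glued, plain)
```

Nothing in the pattern is specific to utf-8 or to the 'q' encoding.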

Abhilash, I'm not sure why you say the proposed solution only works for utf-8 
and 'q'?

--

___
Python tracker 
<https://bugs.python.org/issue21315>
___



[issue36564] Infinite loop with short maximum line lengths in EmailPolicy

2019-05-17 Thread R. David Murray


R. David Murray  added the comment:

Right, one of the fundamental principles of the email library is that when 
parsing input we do not ever raise an error.  We may note defects, but whatever 
we get we *must* parse and turn into *something*.
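
A small sketch of that principle, using an invented broken message: a 
multipart content type without a boundary parameter still parses, and the 
problem shows up in the defects list.

```python
from email import message_from_string
from email.policy import default

# multipart/mixed with no boundary parameter: invalid, but must still parse.
raw = "Content-Type: multipart/mixed\n\nno boundary parameter, no parts\n"

msg = message_from_string(raw, policy=default)   # does not raise
defect_names = [type(d).__name__ for d in msg.defects]
print(defect_names)
```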

--

___
Python tracker 
<https://bugs.python.org/issue36564>
___



[issue36564] Infinite loop with short maximum line lengths in EmailPolicy

2019-05-17 Thread R. David Murray


R. David Murray  added the comment:

Good point about the backward compatibility.  Yes I agree, I think raising the 
error is probably better.  A deprecation warning seems like a good path 
forward...I will be very surprised if anyone encounters it, though :)

--

___
Python tracker 
<https://bugs.python.org/issue36564>
___



[issue36564] Infinite loop with short maximum line lengths in EmailPolicy

2019-05-17 Thread R. David Murray


R. David Murray  added the comment:

As for the other, I don't see the need for a custom error.  It's a ValueError 
in my view.  I wouldn't object to it strongly, but note that this error is 
content dependent.  If there's nothing to encode, you can "get away with" a 
shorter maxlen.  Though why you would want to is beyond me, and that's another 
reason I don't think this warrants a custom error class.

--

___
Python tracker 
<https://bugs.python.org/issue36564>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36564] Infinite loop with short maximum line lengths in EmailPolicy

2019-05-17 Thread R. David Murray


R. David Murray  added the comment:

Can you demonstrate the parsing error?  maxlen should have no effect during 
parsing.

--

___
Python tracker 
<https://bugs.python.org/issue36564>
___



[issue34424] Unicode names break email header

2019-05-14 Thread R. David Murray


R. David Murray  added the comment:

Thank you.  I don't believe this is a security issue.

--

___
Python tracker 
<https://bugs.python.org/issue34424>
___



[issue36910] Certain Malformed email causes email.parser to throw AttributeError

2019-05-14 Thread R. David Murray


R. David Murray  added the comment:

Not a security issue, no.  This isn't C where a stack overflow can give an 
attacker a vector for injecting arbitrary code.

Per the Parser contract ("raise no exceptions, only register defects"), this 
should, as you say, register a defect 
(email.errors.InvalidMultipartContentTransferEncodingDefect) and assume a CTE 
of 7bit for the rest of the parsing.  The problem here is that the feedparser 
is running into the "hack" I put in place in python3.2 for dealing with invalid 
binary data in headers (which is to turn it into a Header with charset 
unknown-8bit).  That works most of the time, but in cases like this it breaks 
down :(

Note that the new API (policy=default and friends) handles this without error.

--

___
Python tracker 
<https://bugs.python.org/issue36910>
___



[issue34424] Unicode names break email header

2019-05-13 Thread R. David Murray


R. David Murray  added the comment:

Approved and merged.  Cheryl, can you shepherd this through the backport 
process, please? I'm contributing infrequently enough that I'm not even sure 
which version we are on :)

--

___
Python tracker 
<https://bugs.python.org/issue34424>
___



[issue34424] Unicode names break email header

2019-05-13 Thread R. David Murray


R. David Murray  added the comment:


New changeset 45b2f8893c1b7ab3b3981a966f82e42beea82106 by R. David Murray (Jens 
Troeger) in branch 'master':
bpo-34424: Handle different policy.linesep lengths correctly. (#8803)
https://github.com/python/cpython/commit/45b2f8893c1b7ab3b3981a966f82e42beea82106


--

___
Python tracker 
<https://bugs.python.org/issue34424>
___



[issue36893] email.headerregistry.Address blocks Unicode local part addr_spec accepted elsewhere

2019-05-12 Thread R. David Murray


R. David Murray  added the comment:

In order to legitimately have a non-ascii localpart, you *must* be using 
RFC6532 and RFC6531.  In the email package you do this by using 
policy=SMTPUTF8, or setting utf8=True in your custom Policy.  In smtplib you do 
this by specifying smtputf8 in the mail_options list to sendmail, or passing a 
message with a policy that has utf8=True to send_message.
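
A sketch of the email-package half of that, with an invented address and 
subject.  It assumes the headers are set from plain strings rather than from 
Address objects, and only shows serialization; nothing is sent.

```python
from email.message import EmailMessage
from email.policy import SMTPUTF8, default

# With utf8=True (SMTPUTF8) non-ascii is written as raw utf-8.
utf8_msg = EmailMessage(policy=SMTPUTF8)
utf8_msg["To"] = "josé@example.com"
utf8_msg["Subject"] = "Panamá"
utf8_msg.set_content("hi")
utf8_wire = bytes(utf8_msg)

# With the default policy the Subject falls back to an RFC 2047 encoded word.
default_msg = EmailMessage(policy=default)
default_msg["Subject"] = "Panamá"
default_msg.set_content("hi")
default_wire = bytes(default_msg)

print(utf8_wire)
print(default_wire)
```

On the smtplib side, passing utf8_msg to send_message is what selects the 
SMTPUTF8 extension, as described above.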

I notice in answering this report that this is not really documented clearly.  
The information is there, but only if you already know how the RFCs work.  Some 
variation of the text above should be added to the smtplib documentation, and 
an example of using SMTPUTF8 should be added to the email examples chapter.

However, you are correct, there are a couple of bugs here.

The rendering done by as_string (and as_bytes) is the best that we can do 
without raising an error...but we should probably be raising an error if the 
rendering policy does not have utf8=True and we don't have an "original source 
line" from parsing a message (which is the case here), rather than using the 
incorrect RFC2047 encoding.

The second bug, the one you are reporting, is that we apparently missed the 
constructor of Address when we were adding RFC6532 support.  If you look at the 
comment above that code, it is purposefully trying to raise an error if the 
addr_spec is invalid and it was provided by the *application* (as opposed to 
email.Parser).  But with RFC6532 support, it should be valid to have a local 
part that has non-ascii in an Address, and the error, as I noted above, should 
be raised only at serialization time and when we don't have an original source 
string.  So that raise should be modified to explicitly ignore the 
NonASCIILocalPartDefect.  (Really, Address should take a policy argument.  
That's a bigger change, but it would be the "right way" to fix this.)

Raising the error on serialization could cause some breakage if existing 
programs are "getting away" with specifying non-ascii local parts but not doing 
it via addr_spec.  It is breakage that should happen, I think, but we may want 
to only do it in a feature release.

--

___
Python tracker 
<https://bugs.python.org/issue36893>
___



[issue25545] email parsing docs: clarify that only ASCII strings are supported

2019-04-26 Thread R. David Murray

R. David Murray  added the comment:

This is one of the infelicities of the translation of the old API to python3: 
'get_payload(decode=True)' actually means "give me the bytes version of this 
payload", which in this case is the utf-8, which is what you got.  
get_payload() means "give me the payload as a string without doing CTE 
decoding".  In a sort of accident-of-translation this turns out to mean "give 
me the unicode" in this particular case.  If the payload had been base64 
encoded, you'd have gotten a unicode string containing the base64 characters.

Which I grant you is all very confusing.

For a more consistent API, use the new one:

>>> import email.policy
>>> m = email.message_from_bytes(msg_bytes, policy=email.policy.default)
>>> bytes(m)
b'MIME-Version: 1.0\nContent-Type: text/plain;\n charset=utf-8\nContent-Transfer-Encoding: 8bit\nContent-Disposition: attachment;\n filename="camper_store.csv"\n\nBeyo\xc4\x9flu-\xc4\xb0st'

>>> m.get_content()
'Beyoğlu-İst'

Here we don't even pretend that you have any use for the encoded version, 
either CTE encoding or binary encoding: get_content gives you the "fully 
decoded" payload (decoded from CTE *and* decoded to unicode).
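
To make the base64 case mentioned above concrete, here is a small sketch comparing the three calls on one message.  The message bytes are constructed for the example (they are not the reporter's data), and the variable names are arbitrary.

```python
import base64
import email
from email import policy

text = "Beyoğlu-İst"
raw = (
    b"MIME-Version: 1.0\n"
    b"Content-Type: text/plain; charset=utf-8\n"
    b"Content-Transfer-Encoding: base64\n"
    b"\n" + base64.b64encode(text.encode("utf-8")) + b"\n"
)

# Old API: message_from_bytes uses the compat32 policy unless told otherwise.
legacy = email.message_from_bytes(raw)
# New API: same parser, but with policy.default.
modern = email.message_from_bytes(raw, policy=policy.default)

as_stored = legacy.get_payload()             # str holding the base64 characters
as_bytes = legacy.get_payload(decode=True)   # bytes: CTE decoded, still UTF-8
as_text = modern.get_content()               # str: CTE decoded and charset decoded

print(repr(as_stored))
print(repr(as_bytes))
print(repr(as_text))
```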

--

___
Python tracker 
<https://bugs.python.org/issue25545>
___



[issue19770] NNTP.post broken

2019-04-10 Thread R. David Murray


R. David Murray  added the comment:

I do, and sure.  I won't be able to review it, though :(

--

___
Python tracker 
<https://bugs.python.org/issue19770>
___



[issue36460] Add AMP MIME type support

2019-03-28 Thread R. David Murray


R. David Murray  added the comment:

Not sure what you mean by "depend on that structure".  A quick grep
shows the only stdlib modules that use mimetypes are urllib and
http.server.

Backward compatibility will of course be a significant issue here.

--

___
Python tracker 
<https://bugs.python.org/issue36460>
___



[issue36460] Add AMP MIME type support

2019-03-28 Thread R. David Murray


R. David Murray  added the comment:

That link should do for our purposes here.

The fact that it is an 'x-' mimetype means it has not been approved at
any level.  There might be an in progress application to the mimetype
registry, but if so the web site doesn't mention it anywhere obvious.

I'm not sure about the filetype problem, but I'm guessing amp isn't the
only mimetype that will have this issue going forward, so we probably need
to come up with a solution.  You don't need support from the mimetypes
module to create or manipulate emails using the content-type, though,
so it isn't a blocker on that side.
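
For instance, a sketch along these lines builds a message with an 'x-amp-html' part without touching the mimetypes module at all.  The subject and the markup are placeholders, and this says nothing about what any particular mail client requires from such a part.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "AMP example"                      # placeholder subject
msg.set_content("Plain-text fallback.\n")
# The subtype string is used as given; mimetypes is never consulted.
msg.add_alternative("<html>placeholder AMP markup</html>\n",
                    subtype="x-amp-html")

types = [part.get_content_type() for part in msg.iter_parts()]
print(types)
```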

That lightning thing is *seriously* hokey :(

--

___
Python tracker 
<https://bugs.python.org/issue36460>
___



[issue36460] Add AMP MIME type support

2019-03-28 Thread R. David Murray


R. David Murray  added the comment:

Can you provide some links to relevant RFCs or other official documents?

--

___
Python tracker 
<https://bugs.python.org/issue36460>
___



[issue36261] email examples should not gratuitously mess with preamble

2019-03-11 Thread R. David Murray


Change by R. David Murray :


--
stage:  -> needs patch

___
Python tracker 
<https://bugs.python.org/issue36261>
___



[issue36261] email examples should not gratuitously mess with preamble

2019-03-11 Thread R. David Murray


R. David Murray  added the comment:

We could also change both of them to be more correct and say something like "If 
you are reading this your browser probably does not support MIME, and you will 
have to find a MIME aware email client or decode the message by hand."  That 
demonstrates what the preamble is actually for.
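
A minimal sketch of where the preamble ends up, using the new API rather than the Doc/includes example itself (the subject and body text here are invented):

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Preamble example"
msg.set_content("Plain-text body.\n")
msg.add_alternative("<p>HTML body.</p>\n", subtype="html")

# The preamble sits between the headers and the first MIME boundary, so a
# MIME-aware reader skips it and only a non-MIME reader displays it.
msg.preamble = "You will not see this in a MIME-aware mail reader."

serialized = msg.as_string()
print(serialized)
```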

--

___
Python tracker 
<https://bugs.python.org/issue36261>
___



[issue36261] email examples should not gratuitously mess with preamble

2019-03-11 Thread R. David Murray


R. David Murray  added the comment:

I don't see "several", can you point to the other instances?  I only see that 
one case you mention (for reference, it is in Doc/includes/email-mime.py).  The 
other case of setting preamble is actually correct ("You will not see this in a 
MIME-aware mail reader").  We could change email-mime to say the same thing.

--

___
Python tracker 
<https://bugs.python.org/issue36261>
___



[issue29539] [smtplib] collect response data for all recipients

2019-02-28 Thread R. David Murray


Change by R. David Murray :


--
nosy: +sls

___
Python tracker 
<https://bugs.python.org/issue29539>
___



[issue36148] smtplib.SMTP.sendmail: mta status codes only accessible by local variables

2019-02-28 Thread R. David Murray


R. David Murray  added the comment:

Sorry, that should be #29539.

--
superseder: Deprecate string concatenation without plus -> [smtplib] collect 
response data for all recipients

___
Python tracker 
<https://bugs.python.org/issue36148>
___



[issue36148] smtplib.SMTP.sendmail: mta status codes only accessible by local variables

2019-02-28 Thread R. David Murray


R. David Murray  added the comment:

Thanks for the PR, but this is a duplicate of #29539, which I think has a 
better API proposal.  Since the original author never actually submitted a PR 
there, perhaps you could pick up his work (after pinging the issue).

--
resolution:  -> duplicate
stage: patch review -> resolved
status: open -> closed
superseder:  -> Deprecate string concatenation without plus
versions: +Python 3.8 -Python 3.7

___
Python tracker 
<https://bugs.python.org/issue36148>
___



[issue34464] There are inconsitencies in the treatment of True, False, None, and __debug__ keywords in the docs

2019-02-20 Thread R. David Murray


Change by R. David Murray :


--
versions:  -Python 3.6

___
Python tracker 
<https://bugs.python.org/issue34464>
___



[issue36041] email: folding of quoted string in display_name violates RFC

2019-02-20 Thread R. David Murray


R. David Murray  added the comment:

I'm afraid I don't have time to parse through the file you uploaded.  Can you 
produce a pull request or a diff showing your fix?  And ideally some added 
tests :)  But whatever you can do is great, if you don't have time maybe 
someone else will pick it up (I unfortunately don't have time, though I should 
be able to do a review of a PR).

--

___
Python tracker 
<https://bugs.python.org/issue36041>
___



[issue36041] email: folding of quoted string in display_name violates RFC

2019-02-19 Thread R. David Murray


R. David Murray  added the comment:

Since Address itself renders it correctly (str(address)), the problem is going 
to take a bit of digging to find.  I'm guessing the quoted_string atom is 
getting transformed incorrectly into something else at some point during the 
folding.

--

___
Python tracker 
<https://bugs.python.org/issue36041>
___



[issue35863] email.headers wraps headers badly

2019-02-01 Thread R. David Murray


R. David Murray  added the comment:

Well, "display" in the context of email includes looking at the raw email 
serialized as a text file.  This is something one can do in most mailers. I use 
nmh as my mailer, which only shows raw headers, so I myself would be personally 
affected if headers were not normally wrapped at 78 characters when possible :)

The >80 characters issue you mention is fixed by the folder used by the new 
API.  That folder will use encoded words to wrap overlong tokens in text 
portions of headers, which may or may not have been the best decision (jury is 
still out on that one), and for non-text headers it does not put in that \n if 
the word won't fit on the next line if wrapped.  (Or at least it's not supposed 
to, so if you find a case where it does, please submit a bug report.)

email.Header is a legacy module and no longer maintained.  And yes, I realize 
it is used by default.  There should be an open issue about going through a 
deprecation cycle to make the new API the default, but I've lost track and have 
no time to push for it myself.

--

___
Python tracker 
<https://bugs.python.org/issue35863>
___



[issue35863] email.headers wraps headers badly

2019-01-31 Thread R. David Murray


R. David Murray  added the comment:

The rules are: lines should be no more than 78 characters; and lines may be 
broken only at FWS (folding whitespace), not in the middle of words.  Putting 
these rules together, you get the result that the email library produces.  
"Conservative in what you send" means *following the RFC rules*, which is what 
the code does.  The failure here is on the Outlook side, which is supposed to 
be being "liberal in what you accept".  Which it is clearly not doing.

In case you want to read the RFCs, which I just reviewed, Content-ID is defined 
to have the same syntax as Message-ID, and Message-Id is defined as 
"Message-ID:" msg-id CRLF, while 'msg-id' is defined as:

msg-id  =   [CFWS] "<" id-left "@" id-right ">" [CFWS]

Which means that a fold is permitted before the id itself.
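
As a sketch of what that means for a receiver, the parser in the email package treats the folded and unfolded forms as the same header value.  The Content-ID below is invented, and surrounding whitespace is stripped before comparing because the folded form may keep the leading fold whitespace.

```python
import email
from email import policy

# Same header, once folded right after the colon and once on a single line.
folded = b"Content-ID:\n <part1.0123456789@example.com>\n\nbody\n"
unfolded = b"Content-ID: <part1.0123456789@example.com>\n\nbody\n"

a = email.message_from_bytes(folded, policy=policy.default)
b = email.message_from_bytes(unfolded, policy=policy.default)

value_a = str(a["Content-ID"]).strip()
value_b = str(b["Content-ID"]).strip()
print(value_a == value_b)
```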

We could consider an "enhancement" request to cater to Outlook's deficiency, 
since email clients that are actually limited to 78 character lines are 
vanishingly rare these days.  The change would only be made in the new API 
folder, and I myself wouldn't have the time (or desire :) to work on it, but if 
you want to submit an issue and see what the other email team members think and 
produce a PR if the vote is positive, that's fine by me.

Do you know if it is all headers that Outlook has this problem with, or only 
some?  I will admit that it has been long enough since I implemented this that 
I can't confirm that I made sure it was legal to fold *every* header after the 
colon, but I'm pretty sure I did.

--

___
Python tracker 
<https://bugs.python.org/issue35863>
___



[issue35863] email.headers wraps headers badly

2019-01-31 Thread R. David Murray


R. David Murray  added the comment:

Also note that you might want to switch to the new API, the folder it uses is 
smarter, although in this case I think it will produce the same result because 
it is the "best" rendering of the header under the circumstances.

--

___
Python tracker 
<https://bugs.python.org/issue35863>
___



[issue35863] email.headers wraps headers badly

2019-01-31 Thread R. David Murray


R. David Murray  added the comment:

That is correct folding.  The word is too long to fit within the 78 character 
default if put on the same line as the label, but does fit on a line by itself.

If Outlook can't understand such a header it is even more broken than I thought 
it was :(  You can work around the Outlook bug by specifying a 
longer-than-standard (but still RFC compliant) line length when serializing the 
message.

--
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed
type:  -> behavior

___
Python tracker 
<https://bugs.python.org/issue35863>
___



[issue20767] Some python extensions can't be compiled with clang 3.4

2019-01-31 Thread R. David Murray


Change by R. David Murray :


--
nosy:  -r.david.murray

___
Python tracker 
<https://bugs.python.org/issue20767>
___



[issue35837] smtpd PureProxy breaks on mail_options keyword argument

2019-01-29 Thread R. David Murray


R. David Murray  added the comment:

I'm closing this in favor of #35799 because someone has to first make a 
remove-or-fix decision, which is mentioned there.

--
resolution:  -> duplicate
stage: patch review -> resolved
status: open -> closed
superseder:  -> fix or remove smtpd.PureProxy
type:  -> behavior

___
Python tracker 
<https://bugs.python.org/issue35837>
___


