[issue28973] [doc] The fact that multiprocess.Queue uses serialization should be documented.

2021-08-07 Thread R. David Murray


R. David Murray  added the comment:

Mentioning ids would be pretty much redundant with mentioning pickle.  If it is 
pickled its id is going to change.  I think Davin was suggesting that while the 
use of serialization is documented, it is not documented *consistently*.  
Everywhere serialization happens it should be mentioned in the docs.

Regardless, a proposed doc PR is the way forward here.
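
A minimal sketch of the point about ids.  It emulates the pickle round trip 
that multiprocessing.Queue performs between put() and get() instead of using a 
real Queue, and the payload dict is made up for illustration:

```python
import pickle

# Made-up payload standing in for an object put on a multiprocessing.Queue.
payload = {"job": 1, "tags": ["a", "b"]}

# Emulate the round trip: serialize on the sending side, rebuild on the other.
received = pickle.loads(pickle.dumps(payload))

same_value = received == payload    # the contents survive the trip
same_object = received is payload   # the identity (and so the id) does not
print(same_value, same_object)
```

The receiver gets an equal copy rather than the original object, which is why 
mentioning pickling already covers the id change.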

--

___
Python tracker 
<https://bugs.python.org/issue28973>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue44685] Email package issue with Outlook msg files

2021-07-23 Thread R. David Murray


R. David Murray  added the comment:

That file appears to be a binary file?  By itself it isn't enough to reproduce 
the problem.  Can you provide a complete script as well as the email message 
you are parsing that demonstrates the problem?

By "looks like any other eml file", are you including the MIME headers 
associated with the part?  Because it is the MIME headers that contain the 
information you say is missing.  Most likely, Outlook is not supplying that 
information for these transformed eml files.  If you can supply a copy of the 
actual email message you are parsing, we should be able to confirm that.

--

___
Python tracker 
<https://bugs.python.org/issue44685>
___



[issue44694] Message from BytesParser cannot be flattened immediately

2021-07-23 Thread R. David Murray


R. David Murray  added the comment:

I suspect maxheaderlen=0 works because it causes the original lines to be 
re-emitted without any folding or other processing.  Without that, lines longer 
than the default max_line_length get refolded.

Can you provide an example of an input message that triggers this problem?
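
A small sketch of the difference described above, assuming a parsed message 
under the default policy; the subject text and the flatten() helper are made up 
for illustration:

```python
from email import policy
from email.generator import BytesGenerator
from email.parser import BytesParser
from io import BytesIO

# Made-up message whose Subject line is far longer than 78 characters.
long_subject = " ".join(["word"] * 30)
raw = b"Subject: " + long_subject.encode("ascii") + b"\n\nbody\n"
msg = BytesParser(policy=policy.default).parsebytes(raw)


def flatten(message, **generator_kwargs):
    """Serialize message with BytesGenerator and return the resulting bytes."""
    buffer = BytesIO()
    BytesGenerator(buffer, **generator_kwargs).flatten(message)
    return buffer.getvalue()


unfolded = flatten(msg, maxheaderlen=0)  # original line is re-emitted untouched
refolded = flatten(msg)                  # over-long line goes through the folder
```

With maxheaderlen=0 the source line should come back byte for byte, while the 
default path hands the over-long line to the folding code.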

--

___
Python tracker 
<https://bugs.python.org/issue44694>
___



[issue44660] email.feedparser: support RFC 6532 section 3.5

2021-07-23 Thread R. David Murray


R. David Murray  added the comment:

Having looked at the cited part of the RFC (but not tried to analyze it in 
detail), I think you are correct.  I've also glanced at your PR, and I think 
your approach is correct in broad outline, but I haven't looked at the details. 
 For full message/global support, however, it will also be necessary to look at 
the output side: given a message/global part, a transfer encoding should be 
applied when serializing with cte_type=7bit.  Support for message/global should 
also be added to the contentmanager.

I won't have an objection if this is accepted with only the feedparser support, 
but I would recommend that the remaining pieces of support for message/global 
be added before the feature is released.

--

___
Python tracker 
<https://bugs.python.org/issue44660>
___



[issue43124] [security] smtplib multiple CRLF injection

2021-07-19 Thread R. David Murray


R. David Murray  added the comment:

My apologies, I did not think about the possibility of an English issue.  I was 
reacting to the "security report speak", which I find often makes a security 
issue sound worse than it is :)  Thank you for reporting this problem, and I do 
think we should fix it.

My posting was directed at the severity of the issue, since it was potentially 
holding up a release.  My point about the example is that without an example of 
code that could reasonably be expected to use user input in a call that could 
inject newlines, we can treat this as a low priority issue.  If we had a 
proposed example of such code, then the priority would be higher.  If it was an 
example of such code "in the wild", then it would be quite high :)

The reason I'm saying we should have an example in order to consider it higher 
priority is that I cannot see *any* likelihood that this would be a problem in 
practice.  Let me explain.

putcmd is an *internal* interface.  If we look at the commands that call putcmd 
or docmd, the only ones that pass extra data that aren't pretty obviously safe 
(ie: not clearly sanitized data) are rcpt and mail[*].  In both cases the item 
of concern is optionslist.  optionslist is a list of *SMTP server options*.  
This is not data that is reasonably taken from user input, it is data provided 
*by the programmer*.

[*] I did double check to make sure that email.utils.parseaddr sanitizes both 
\r and \n, just to be sure :)

Therefore this is *not* a significant security issue.  But as I said, we should 
take the "defense in depth" approach and apply the check in putcmd as you 
recommend.  I just don't think it needs to hold up a release.

--

___
Python tracker 
<https://bugs.python.org/issue43124>
___



[issue44637] Quoting issue on header Reply-To

2021-07-15 Thread R. David Murray


R. David Murray  added the comment:

Yes, compat32 uses a different parser and folder (the legacy ones), which have 
a lot of small bugs relative to the RFCs (which is why I rewrote them).

--

___
Python tracker 
<https://bugs.python.org/issue44637>
___



[issue44637] Quoting issue on header Reply-To

2021-07-15 Thread R. David Murray


R. David Murray  added the comment:

Forget what I said about my different error, I made a mistake running the test 
script.

Interesting.  If it is related to the length of the name, then the problem is 
most likely in the folding algorithm, specifically in what happens when the 
"display-name" token is wrapped across lines.  And indeed, if we clone the SMTP 
policy and set max_line_length to 1000 in your sample script, it renders the 
header correctly.
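
A sketch of that workaround.  The address and display name are made up; the 
only point is the cloned policy:

```python
from email.message import EmailMessage
from email.policy import SMTP

# Clone the SMTP policy so that headers are effectively never folded.
wide_policy = SMTP.clone(max_line_length=1000)

msg = EmailMessage(policy=wide_policy)
# Made-up display name; the comma means it has to stay quoted.
display_name = "Example, " + "Very " * 20 + "Long Name"
msg["Reply-To"] = f'"{display_name}" <someone@example.com>'

# The message has no body, so everything before the blank line is the header.
header_text = msg.as_string().split("\r\n\r\n")[0]
print(header_text)
```

Because nothing needs folding, the display name is emitted in one piece with 
its quotation marks.  This only sidesteps the folder bug; it does not fix it.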

The problem here is that the surrounding quotation marks are added by the 
'value' property of DisplayName, but that property isn't invoked when handling 
parts of the display name separately during multi-line folding.  I was always 
bothered by the handling of the quotation marks in the part of the parser and 
folder dealing with quoted strings, but I never hit on a better way to do it.  
This, unfortunately, is going to be a non-trivial problem to solve.  It is 
probably going to require an ugly hack in the folding code :(

Really, the handling of quoted strings throughout the _header_value_parser code 
is...a hack :(  There are probably other places where it breaks down during 
multi-line folding.  If we are lucky the hack can just add special handling for 
the quoted-string token type in the folder.  If we aren't it will get messier :(

Glancing at the folder code (it's been a long time since I worked on it), one 
possible approach (not necessarily the best one) would be to mark the first and 
last sub-tokens in a quoted-string so that the folder knows to put in a leading or 
trailing quote mark, respectively, during folding.

--

___
Python tracker 
<https://bugs.python.org/issue44637>
___



[issue44637] Quoting issue on header Reply-To

2021-07-14 Thread R. David Murray


R. David Murray  added the comment:

There is definitely a problem here, though I see a different problem when I run 
it (AttributeError: 'Group' object has no attribute 'local_part', presumably 
because of the ':' not getting escaped correctly).  I believe it applies to any 
address header, not just Reply-To.  Unfortunately I don't have time to 
investigate the cause, at least right now.  An interesting first step in 
diagnosing it might be to produce a minimal example: start deleting special 
characters from inside that quoted string until you find the one (or ones) that 
is triggering it.
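
One way to mechanize that search, as a rough sketch.  The helper name and the 
still_fails callback are hypothetical; in practice the callback would build the 
header from the candidate string and report whether the error still occurs:

```python
def minimize_specials(text, still_fails):
    """Greedily delete special characters while the failure still reproduces.

    still_fails is a caller-supplied function (hypothetical here) that returns
    True when the candidate string still triggers the problem.
    """
    i = 0
    while i < len(text):
        ch = text[i]
        if not ch.isalnum() and not ch.isspace():
            candidate = text[:i] + text[i + 1:]
            if still_fails(candidate):
                text = candidate
                continue  # the same index now holds the next character
        i += 1
    return text


# Stand-in predicate for demonstration: pretend ':' is what triggers the bug.
reduced = minimize_specials("a, b: c; d", lambda s: ":" in s)
print(reduced)
```

Whatever special characters remain in the result are the ones the failure 
depends on.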

--

___
Python tracker 
<https://bugs.python.org/issue44637>
___



[issue43124] [security] smtplib multiple CRLF injection

2021-07-13 Thread R. David Murray


R. David Murray  added the comment:

s/header injection/command injection/

--

___
Python tracker 
<https://bugs.python.org/issue43124>
___



[issue43124] [security] smtplib multiple CRLF injection

2021-07-13 Thread R. David Murray


R. David Murray  added the comment:

This bug report starts with "a malicious user with direct access to 
`smtplib.SMTP(..., local_hostname, ..)`", which is a senseless supposition.  
Anyone with "access to" the SMTP object could just as well be talking directly 
to the SMTP server and do anything they want that SMTP itself allows.

The concern here is that data a program might obtain *from unsanitized user 
input* could be used to do header injection.  The "proof of concept" does not 
address this at all.  We'd need to see a scenario under which data that could 
reasonably be derived from user input ends up being passed as arguments to an 
smtplib method that calls putcmd with arguments.

So, I would rate this as a *very* low impact issue, unless someone has an *actual 
example* of code using smtplib that passes user input through to smtplib 
commands in an exploitable way.

That said, it is perfectly reasonable to be proactive here and prevent 
scenarios we haven't yet thought of, by doing as recommended (and a bit more) 
by raising a ValueError if 'args' in the putcmd call contain either \n or \r 
characters.  I don't think we need to check 'cmd', because I can't see any 
scenario in which the SMTP command would be derived from user input.  If you 
want to be *really* paranoid you could check cmd too, and since it will always 
be a short string the additional performance impact will be minor.
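
A minimal sketch of the kind of check meant here.  build_command is a 
hypothetical stand-in for illustration, not the smtplib implementation:

```python
def build_command(cmd, args=""):
    """Return an SMTP command line, refusing CR or LF in args.

    Hypothetical stand-in for the check suggested for putcmd.
    """
    if "\r" in args or "\n" in args:
        raise ValueError("args must not contain CR or LF characters")
    line = f"{cmd} {args}" if args else cmd
    return line + "\r\n"


print(repr(build_command("rcpt", "TO:<a@example.com>")))
```

Checking cmd as well would be one more pair of "in" tests on a short string.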

--
type: performance -> security
versions: +Python 3.10, Python 3.11, Python 3.6, Python 3.7, Python 3.8, Python 
3.9

___
Python tracker 
<https://bugs.python.org/issue43124>
___



[issue43493] EmailMessage mis-folding headers of a certain length

2021-07-06 Thread R. David Murray


R. David Murray  added the comment:

Ah, yes, the problem is more subtle than I thought.

The design here is that we should be starting with the largest lexical unit, 
seeing if that fits on the current line, or a line by itself, and if so, using 
that, and if not, move down to the next smaller lexical unit and try again, 
until we are finally left with an unbreakable unit.  For unstructured headers 
such as Subject the lexical units should be encoded words followed by blank 
delimited words.  I'm guessing the code is treating the collection of words it 
has accumulated as a unit in the above algorithm, and since it fits on a line 
by itself, it goes with that.  So yeah, it's sort of intentional.

So the bug here is that in your step 2 we ideally want to be considering 
whether the last token on the current line is at the same lexical level as the 
token that precedes it...and if so, and if moving that token to the next line 
lets the remainder fit on the first line, we should do that.  Exactly how to 
implement that correctly is a good question...it's been too long since I wrote 
that code, and I may not have time to investigate it more deeply.
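
A toy model of the intended design, to make the description concrete.  This is 
not the _header_value_parser code: units are plain words or nested lists of 
smaller units, and the leading space on continuation lines is ignored:

```python
def words_of(unit):
    """Flatten a unit (a word, or a list of smaller units) into its words."""
    if isinstance(unit, str):
        return [unit]
    return [word for sub in unit for word in words_of(sub)]


def width(words):
    """Length of the words when joined by single spaces."""
    return len(" ".join(words))


def fold(unit, maxlen, lines):
    """Place unit on lines, descending to smaller units only when needed."""
    words = words_of(unit)
    if width(lines[-1] + words) <= maxlen:
        lines[-1].extend(words)      # fits on the current line
    elif isinstance(unit, str) or width(words) <= maxlen:
        if lines[-1]:
            lines.append(words)      # fits on a line by itself, or is unbreakable
        else:
            lines[-1].extend(words)  # unbreakable and nothing placed yet
    else:
        for sub in unit:             # move down to the next smaller unit
            fold(sub, maxlen, lines)
    return lines


header = ["Subject:", ["aaaa", "bbbb", "cccc"]]
folded = [" ".join(line) for line in fold(header, 16, [[]])]
print(folded)
```

In this toy run "Subject: aaaa" would fit on the first line, but the group of 
words fits on a line by itself and so moves down as a whole, which is the 
"sort of intentional" behavior described above.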

If you come up with something based on my description of the intent above, I 
should be able to review it (though you might need to ping me directly to get 
my attention).

--

___
Python tracker 
<https://bugs.python.org/issue43493>
___



[issue39100] email.policy.SMTP throws AttributeError on invalid header

2021-07-06 Thread R. David Murray


R. David Murray  added the comment:

How are you encountering this error?  The following program runs without 
exception for me on master:

from email import message_from_binary_file
from email.policy import SMTP

msg = message_from_binary_file(open('mail.eml', 'rb'), policy=SMTP)
print(msg)

--

___
Python tracker 
<https://bugs.python.org/issue39100>
___



[issue44560] Unrecognized charset "eucgb2312_cn" in email header for many MUA

2021-07-06 Thread R. David Murray

R. David Murray  added the comment:

I can't tell for sure if this behavior is intentional or not from a quick 
glance at the code (though like you I wouldn't think it would be).

That's part of the legacy api, at this point.  The new api will just use utf8:

from email.message import EmailMessage

m = EmailMessage()
m['Subject'] = '中文'

print(bytes(m))

results in

b'Subject: =?utf-8?b?5Lit5paH?=\n\n'

The fix, assuming it is correct, would be to add the line:

'eucgb2312_cn': 'gb2312',

to the CODEC_MAP in email/charset.py, and then specify the internal codec name 
in your Charset call.  I'm not sure that's right, though...once upon a time I 
think I understood the logic behind the charset module, but I no longer 
remember the details.

I'd recommend just using the new API and not the legacy API.

--

___
Python tracker 
<https://bugs.python.org/issue44560>
___



[issue42892] AttributeError in email.message.get_body()

2021-05-19 Thread R. David Murray


R. David Murray  added the comment:

Actually, I'm wrong.  The body of a part can be a string, and that's what's 
going to happen with a malformed body of something claiming to be a multipart. 
The problem is that there is code that doesn't guard against this possibility.  
The following patch against master fixes the bug listed here, as well as 
iter_parts().  But it causes one test suite failure so it isn't a correct patch 
as it stands:

diff --git a/Lib/email/message.py b/Lib/email/message.py
index 3701b30553..d5d4a2385a 100644
--- a/Lib/email/message.py
+++ b/Lib/email/message.py
@@ -982,7 +982,7 @@ def _find_body(self, part, preferencelist):
             if subtype in preferencelist:
                 yield (preferencelist.index(subtype), part)
             return
-        if maintype != 'multipart':
+        if maintype != 'multipart' or not self.is_multipart():
             return
         if subtype != 'related':
             for subpart in part.iter_parts():
@@ -1087,7 +1087,7 @@ def iter_parts(self):
 
         Return an empty iterator for a non-multipart.
         """
-        if self.get_content_maintype() == 'multipart':
+        if self.is_multipart():
             yield from self.get_payload()
 
     def get_content(self, *args, content_manager=None, **kw):

Maybe someone can take this and finish it (with tests)...I may or may not have 
time to get back to this.
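
A small sketch of the mismatch the patch is guarding against, using a made-up 
message that claims to be multipart but never uses its boundary:

```python
from email import message_from_bytes, policy

# Made-up malformed message: declares multipart/mixed, body has no boundary.
raw = (
    b'Content-Type: multipart/mixed; boundary="xyz"\n'
    b"\n"
    b"this body never mentions the boundary\n"
)
msg = message_from_bytes(raw, policy=policy.default)

claims_multipart = msg.get_content_maintype() == "multipart"
payload_is_list = msg.is_multipart()
print(claims_multipart, payload_is_list, type(msg.get_payload()).__name__)
```

The Content-Type header says multipart, but the payload is a string, so code 
that trusts the header alone will try to iterate over sub-parts that are not 
there.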

--

___
Python tracker 
<https://bugs.python.org/issue42892>
___



[issue42892] AttributeError in email.message.get_body()

2021-05-19 Thread R. David Murray


R. David Murray  added the comment:

Yes, that's the real question.  That's what needs to be fixed, otherwise we'll 
just keep finding new bugs.  For example, try calling iter_parts() on that 
message.  It isn't pretty :)

--

___
Python tracker 
<https://bugs.python.org/issue42892>
___



[issue43922] Double dots in quopri transported emails

2021-04-23 Thread R. David Murray


R. David Murray  added the comment:

As far as I know the only resources are the content manager docs and the source 
code.  The stdlib content manager can serve as a model.  I have to admit that 
it was long enough ago that I wrote that code that I'd have to re-read the docs 
and code myself to figure it out :)

I'm afraid I don't really have time to do a complete review, but at a quick 
glance your patch doesn't look too complicated to me.  Quick observation:  the 
comment should explain why the dot check is done, and that it isn't needed for 
rfc compliance.

--

___
Python tracker 
<https://bugs.python.org/issue43922>
___



[issue43922] Double dots in quopri transported emails

2021-04-23 Thread R. David Murray


R. David Murray  added the comment:

Since python is doing the right thing here, I don't see a particularly good 
reason to put a hack into the stdlib to fix the failure of third party software 
to adhere to standards.  (On the output side.  We do follow Postel's rule on 
input and try hard to handle broken but recoverable input.)  I don't actually 
*object* to it, though, as long as it follows the standard on output, and is a 
*simple* change.

Please note that you can fix this locally by implementing and using a custom 
content manager.
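
A rough sketch of what a custom content manager could look like.  Note the 
assumption: instead of the dot check under discussion, this swaps in a 
different technique and simply defaults text to base64, so that no encoded body 
line can begin with a dot.  The handler and manager names are made up:

```python
from email.contentmanager import ContentManager, raw_data_manager
from email.message import EmailMessage


def set_text_as_base64(msg, text, *args, **kw):
    """Delegate to the stdlib manager, defaulting the transfer encoding to base64."""
    kw.setdefault("cte", "base64")
    raw_data_manager.set_content(msg, text, *args, **kw)


# Hypothetical manager that only knows how to store str content.
dot_safe_manager = ContentManager()
dot_safe_manager.add_set_handler(str, set_text_as_base64)

body = "first line\n.second line starts with a dot\n"
msg = EmailMessage()
msg.set_content(body, content_manager=dot_safe_manager)
print(msg["Content-Transfer-Encoding"])
```

The manager can also be installed once via policy.clone(content_manager=...) 
instead of being passed on every set_content call.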

--

___
Python tracker 
<https://bugs.python.org/issue43922>
___



[issue43493] EmailMessage mis-folding headers of a certain length

2021-03-18 Thread R. David Murray


R. David Murray  added the comment:

Parsing and newlines have nothing to do with this bug, actually.  I don't think 
your foldfix post-processing is going to do what you want in the general case.

The source of the bug here is in the folding algorithm in _header_value_parser. 
 It has checks to see if the "text so far" will fit within the header width, 
and it starts a new line under various conditions.  For example, if there is a 
single word after Subject: whose length is, say, 70, it would produce the 
effect you show, because the single word would fit without folding or encoding 
on a new line.  I don't think this violates the RFC.  What your example shows 
makes it look like the folder is treating all of the text as if it were a 
single word, which is obviously wrong.  It is supposed to break at spaces.  You 
will note that if you increase the repeat count in your example to 16 it folds 
the line correctly.  So the bug has something to do with the total text so far 
accumulated for the line being right in that window where it won't fit on the 
first line but does fit on a line by itself.  This is obviously a bug in the 
folder, since it should be splitting that text if it isn't a single word, not 
moving it to a new line as a whole.

Note that this bug is still present on master.

--

___
Python tracker 
<https://bugs.python.org/issue43493>
___



[issue43090] parseaddr (from email.utils) returns invalid input string instead of ('', '')

2021-02-01 Thread R. David Murray


R. David Murray  added the comment:

The return value is correct.  Interpreted as an email address, 'randomstring' 
is a local mailbox.
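
For reference, a short sketch of the behavior in question:

```python
from email.utils import parseaddr

# A bare word parses as an address with no display name and no domain.
realname, address = parseaddr("randomstring")
print((realname, address))
```

The empty first element is the (absent) display name; the second is the bare 
local mailbox.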

--
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue43090>
___



[issue43061] subprocess: feature request: Get only the stdout of the last shell command

2021-01-29 Thread R. David Murray


R. David Murray  added the comment:

This has nothing to do with python other than the fact that you are using it to 
capture stdout.  You have to figure out how to get the output you want to be 
what shows up on stdout, python has no knowledge of what commands you put in 
your shell script, and it *cannot* have any knowledge of that.  I think you 
need to learn more about basic shell scripting and unix pipelines and how 
stdout works.

Also note that making people nosy on an issue is not a good idea if you are not 
part of the triage team.  You should leave that for the bug triage people to 
do, as they know whose attention on the issue will be most useful.  In the 
future when you open an issue please simply wait a while for a response.

--
resolution:  -> rejected
stage:  -> resolved
status: open -> closed
versions:  -Python 3.10

___
Python tracker 
<https://bugs.python.org/issue43061>
___



[issue42433] mailbox.mbox fails on non ASCII characters

2020-11-30 Thread R. David Murray


R. David Murray  added the comment:

After thinking about it some more: given that, when there is no non-ascii, mbox 
will happily treat *anything* as valid on the "From " line, I think we should 
consider blowing up on non-ascii to be a bug.

--

___
Python tracker 
<https://bugs.python.org/issue42433>
___



[issue42484] get_obs_local_part fails to handle empty local part

2020-11-30 Thread R. David Murray


R. David Murray  added the comment:

Yep, you've found another in a category of bugs that have shown up in the 
parser: places where there is a missing check for there being any value at all 
before checking character [0].

In this case, the fix should be to add

    if not obs_local_part:
        return obs_local_part, value

just before the if that is blowing up.

--
title: parse_message_id, get_msg_id, get_obs_local_part is poorly written -> 
get_obs_local_part fails to handle empty local part

___
Python tracker 
<https://bugs.python.org/issue42484>
___



[issue42433] mailbox.mbox fails on non ASCII characters

2020-11-22 Thread R. David Murray

R. David Murray  added the comment:

The problem with that archive is that it is not in proper mbox format.  It 
contains the following line (5689):

From here I was hoping to run something like “dbus-send –system 
–dest=Test.Me –print-reply /Japan Japan.Reset.Test string:”Hello””

You will note that there is no leading '>' on that line to escape that 'From '. 
 So mbox tries to build a 'From ' line from it, and fails because 'From ' lines 
should not contain any non-ascii characters.  It can be argued that that 
failure is sub-optimal...it should probably be calling decode('ascii', 
errors='replace') so that the parse doesn't fail, just like it would not fail 
if there were no non-ascii in the unescaped 'From ' line.
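
A small sketch of the difference between the two decode calls, using a made-up 
line in the same spirit as the one in the archive:

```python
# Bytes of a made-up unescaped "From " line containing non-ASCII quote marks.
line = "From here I was hoping to run \u201cdbus-send\u201d".encode("utf-8")

try:
    line.decode("ascii")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False  # strict decoding blows up on the first non-ASCII byte

# errors='replace' substitutes U+FFFD for each undecodable byte instead.
lenient = line.decode("ascii", errors="replace")
print(strict_ok, lenient)
```

The lenient form keeps the parse going; the resulting 'From ' line is still 
junk, but no more so than an unescaped ASCII-only one would be.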

--

___
Python tracker 
<https://bugs.python.org/issue42433>
___



[issue41553] encoded-word abused for header line folding causes RFC 2047 violation

2020-08-17 Thread R. David Murray


R. David Murray  added the comment:

Yes for the registry changes.  I thought we had fixed the bug that was causing 
message-id to get encoded, but maybe it still exists in 3.7?  I don't remember 
when we fixed it (and I may be remembering wrong!)

As for X- "unstructured headers" getting trashed, by *definition* in the rfc, 
if the header body is unstructured it must support RFC 2047 encoding.  If it does not, 
it is not an unstructured header field.  Which is why I said we need to think 
about what characteristics the default parser should have.  The RFC doesn't 
really speak to that, it expects every header to be one of the defined 
types...but while an X- header might be of a defined type, the email package 
can't know that unless it is told, so what should we use as the default parsing 
strategy?  "text without encoded words" isn't really RFC compliant, I think.  
(Though I'll admit it has been a while since I last reviewed the relevant RFCs.)

Note that I believe that we have an open issue (or at least an open discussion) 
that we should change the 'refold_source' default from 'long' to 'none', which 
means that X- headers would at least be passed through by default.  It would 
also mitigate this problem, and can be used as a local workaround for headers 
that are just getting passed through and not modified.
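
A sketch of that local workaround.  The X-Custom header and its value are made 
up for illustration:

```python
from email import message_from_bytes, policy

# Policy that never refolds headers that came from the parsed source.
passthrough = policy.default.clone(refold_source="none")

# Made-up message with an unbreakable 120-character X- header value.
long_value = "x" * 120
raw = b"X-Custom: " + long_value.encode("ascii") + b"\nSubject: hi\n\nbody\n"

msg = message_from_bytes(raw, policy=passthrough)
output = msg.as_bytes()
print(output.decode("ascii"))
```

This only helps for headers that are parsed and then re-emitted untouched; a 
header that is set or modified in the program still goes through the folder.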

--

___
Python tracker 
<https://bugs.python.org/issue41553>
___



[issue41553] encoded-word abused for header line folding causes RFC 2047 violation

2020-08-14 Thread R. David Murray


R. David Murray  added the comment:

It's not really an abuse.  It is, however, buggy.  It should be being applied 
*only* when the header contains unstructured text.  Unfortunately I made the 
choice to treat any header that doesn't have a specific parser as unstructured, 
and that was a wrong choice which should be fixed.  It is an interesting 
question what should be used as the default parser, though.  Suggestions and 
code are welcome :)

There should be specific header parsers for headers that contain message ids.  
That was on my todo list but did not get done before my circumstances changed 
and my free-time focus moved away from python development work :(

The message_id parser exists.  In-Reply-To just needs to be declared in the 
header registry as a MessageIDHeader (not sure how that got missed).  Writing a 
Header class for References should be trivial, it's just a list of message ids. 
 That will fix those headers, and I suggest we do that asap.
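
Until the registry itself is updated, the same declaration can be sketched 
locally with a custom header registry.  The policy name and the message id 
below are made up:

```python
from email.headerregistry import HeaderRegistry, MessageIDHeader
from email.message import EmailMessage
from email.policy import default

# Local registry that treats In-Reply-To as a message-id header.
registry = HeaderRegistry()
registry.map_to_type("in-reply-to", MessageIDHeader)
msgid_policy = default.clone(header_factory=registry)

msg = EmailMessage(policy=msgid_policy)
msg["In-Reply-To"] = "<some-id@example.com>"  # made-up message id
print(type(msg["In-Reply-To"]).__mro__[1].__name__)
```

References would still need its own header class, since it holds a list of 
message ids rather than a single one.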

Fixing the default-to-unstructured will take a bit more thought and should 
probably be split out into a separate issue.  I can review and give advice 
(though you may have to ping me directly) but I won't have time to write any 
code.

--

___
Python tracker 
<https://bugs.python.org/issue41553>
___



[issue41402] email: ContentManager.set_content calls nonexistent method encode() on bytes

2020-07-31 Thread R. David Murray


R. David Murray  added the comment:

The fix looks good to me.  Don't know how I made that mistake, and obviously I 
didn't write a test for it...

--

___
Python tracker 
<https://bugs.python.org/issue41402>
___



[issue41387] Escape needed in the email documentation example

2020-07-24 Thread R. David Murray


Change by R. David Murray :


--
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue41387>
___



[issue41145] EmailMessage.as_string is altering the message state and actually fix bugs

2020-07-10 Thread R. David Murray


R. David Murray  added the comment:

The as_string docs say:

"Flattening the message may trigger changes to the Message if defaults need to 
be filled in to complete the transformation to a string (for example, MIME 
boundaries may be generated or modified)."

So, while this is indeed an API design bug, it isn't an actual bug in the code 
but rather is expected behavior, currently.  The historical reason for this is 
that the generator code looks at the entire message to make sure the boundary 
string is unique.  My long term plan for email included plans to rewrite the 
generator, and I was going to fix this issue at that point.  My life got too 
busy to be able to continue with email development work, though, so that never 
happened.

It has been *years* since I've looked at the code.  Thinking about it now, I'm 
wondering if it would be possible to use a GUID technique to generate the 
boundary and thus do exactly as you say: have make_alternative (and anything 
else that causes a boundary to be needed) pre-create the boundary.  That, I 
think, would mean we wouldn't need to change the generator, even though it 
would still be doing its (inefficient) check that the boundary was unique.  I'm 
not sure if it would work, though; it's been too long since I've looked at the 
relevant code.
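
A sketch of the idea from the calling side, assuming uuid4 as the GUID source 
(my choice for illustration; nothing in the email package does this today):

```python
import uuid
from email.message import EmailMessage

msg = EmailMessage()
msg.set_content("plain text body")

# Pre-create the boundary instead of leaving it to the generator.
boundary = uuid.uuid4().hex
msg.make_alternative(boundary=boundary)
msg.add_alternative("<p>html body</p>", subtype="html")

before = msg.get_boundary()
text = msg.as_string()   # the generator finds a boundary already in place
after = msg.get_boundary()
print(before == after)
```

With the boundary supplied up front, flattening has nothing to fill in for this 
message, so the boundary is the same before and after as_string.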

--
type: resource usage -> behavior

___
Python tracker 
<https://bugs.python.org/issue41145>
___



[issue41206] behaviour change with EmailMessage.set_content

2020-07-07 Thread R. David Murray


R. David Murray  added the comment:

I'm short of time; if someone could approve Mark's PR and merge it, that would 
be great. There wasn't supposed to be any behavior change other than the one 
documented in #40597.

--

___
Python tracker 
<https://bugs.python.org/issue41206>
___



[issue41023] smtplib does not handle Unicode characters

2020-06-18 Thread R. David Murray


R. David Murray  added the comment:

If you use the 'sendmail' function for sending, then it is entirely your 
responsibility to turn the email into "wire format".  Unicode is not wire 
format, but if you give sendmail a string that only has ascii in it it nicely 
converts it to binary for you.  But given that the email RFCs specify specific 
ways to indicate how non-ascii is encoded in the message, there is no way for 
the smtp library to know how to do that correctly when passed an arbitrary 
unicode string, so it doesn't try.  sendmail requires *you* to do the encoding 
to binary, indicating you at least think that you got the RFC parts right :)  
In python2, strings are binary by default, so in that case you are handing 
sendmail binary format data (with the same assumption that you got the RFC 
parts right)...if you passed the python2 function a unicode string it would 
probably complain as well, although not in the same way.

If your raw email is RFC compliant, then you can do: sendmail(from, to, 
mymsg.encode()).

I see from your example that you are trying to use the email package to 
construct the email, which is good.  But, emails are *binary*, they are not 
unicode, so passing "message_from_string" a unicode string containing non-ascii 
isn't going to do what you are expecting, any more than passing unicode to the 
'sendmail' function did.  message_from_string is really only useful for doing 
certain sorts of debug and ought to be deprecated.  Or produce a warning when 
handed a string containing non-ascii.  (There are historical reasons why it 
doesn't :( )

And then you should use smtplib's 'send_message' function, which understands 
email package messages and will Do the Right Thing with them (including the 
extraction of the to and from addresses your code is currently doing).

However, even if you encoded your raw message to binary and then passed it to 
message_from_bytes, your example message is *not* RFC compliant: without MIME 
headers, an email with non-ascii characters in the body is technically in 
violation of the RFC.  Most email programs will handle that particular message 
despite that, but not all.  You are better off using the email package to 
construct a properly RFC formatted email,  using the new API (ex: msg = 
EmailMessage() (not Message), and then doing msg['from'] = address, etc, and 
msg.set_content(your unicode string body)). I can't really give you much advice 
here (nor should I, this being a bug tracker :) because I don't know 
exactly how the data is coming in to your program in your real use case.

Once you have a properly constructed EmailMessage object, you should use 
smtplib's 'send_message' function, which understands email package messages and 
will Do the Right Thing with them (including the extraction of the to and from 
addresses your code is currently doing, as well as properly handling BCC, which 
means deleting BCC headers from the message before sending it, which your code 
does not do and which 'sendmail' would not do.)
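
A minimal sketch of that approach, assuming example.com addresses and a 
local SMTP server.  build_message and send are hypothetical helper names 
used only for illustration; they are not part of the email package or 
smtplib, and send is defined here but never called:

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Build a MIME message from unicode strings using the new API."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject   # non-ascii is encoded when serialized
    msg.set_content(body)      # adds the MIME headers, charset utf-8
    return msg


def send(msg, host="localhost", port=25):
    """Hypothetical delivery helper; host and port are assumptions."""
    with smtplib.SMTP(host, port) as smtp:
        # send_message takes the envelope addresses from the headers.
        smtp.send_message(msg)


msg = build_message("alice@example.com", "bob@example.com",
                    "Grüße", "Schöne Grüße aus München")
wire = msg.as_bytes()          # the binary "wire format"
```

Here the library, not the application, decides how the non-ascii subject 
and body are encoded.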

SMTPUTF8 is about non-ascii in the email *headers*, and most SMTP servers these 
days do not yet support it[*]. Some of the big ones do, though (I believe gmail 
does).

[*] although that doesn't explain why what you got was SMTPSenderRefused.  You 
should have gotten SMTPNotSupportedError.

--
resolution:  -> works for me
stage:  -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue41023>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39040] Wrong attachement filename when mail mime header was too long

2020-05-28 Thread R. David Murray


R. David Murray  added the comment:


New changeset 21017ed904f734be9f195ae1274eb81426a9e776 by Abhilash Raj in 
branch 'master':
bpo-39040: Fix parsing of email mime headers with whitespace between 
encoded-words. (gh-17620)
https://github.com/python/cpython/commit/21017ed904f734be9f195ae1274eb81426a9e776


--

___
Python tracker 
<https://bugs.python.org/issue39040>
___



[issue40597] generated email message exceeds RFC-mandated limit of 998 characters

2020-05-17 Thread R. David Murray


Change by R. David Murray :


--
stage: backport needed -> resolved

___
Python tracker 
<https://bugs.python.org/issue40597>
___



[issue40597] generated email message exceeds RFC-mandated limit of 998 characters

2020-05-17 Thread R. David Murray


R. David Murray  added the comment:


New changeset c1f1ddf30a595c2bfa3c06e54fb03fa212cd28b5 by Miss Islington (bot) 
in branch '3.8':
bpo-40597: email: Use CTE if lines are longer than max_line_length consistently 
(gh-20038) (gh-20084)
https://github.com/python/cpython/commit/c1f1ddf30a595c2bfa3c06e54fb03fa212cd28b5


--

___
Python tracker 
<https://bugs.python.org/issue40597>
___



[issue40597] generated email message exceeds RFC-mandated limit of 998 characters

2020-05-13 Thread R. David Murray


R. David Murray  added the comment:

Thanks, Arkadiusz.

--
resolution:  -> fixed
stage: patch review -> backport needed
versions:  -Python 3.5, Python 3.6, Python 3.7

___
Python tracker 
<https://bugs.python.org/issue40597>
___



[issue40597] generated email message exceeds RFC-mandated limit of 998 characters

2020-05-13 Thread R. David Murray


R. David Murray  added the comment:


New changeset 6f2f475d5a2cd7675dce844f3af436ba919ef92b by Arkadiusz Hiler in 
branch 'master':
bpo-40597: email: Use CTE if lines are longer than max_line_length consistently 
(gh-20038)
https://github.com/python/cpython/commit/6f2f475d5a2cd7675dce844f3af436ba919ef92b


--

___
Python tracker 
<https://bugs.python.org/issue40597>
___



[issue40597] generated email message exceeds RFC-mandated limit of 998 characters

2020-05-11 Thread R. David Murray


R. David Murray  added the comment:

The PR looks good to me, but I'd describe the change differently.  I'm not sure 
how I missed this in the original implementation, since I obviously checked it 
for the 8bit case.  Too long ago to remember :)

--

___
Python tracker 
<https://bugs.python.org/issue40597>
___



[issue40359] email.parse part.get_filename() fails to unwrap long attachment file names (legacy API)

2020-04-28 Thread R. David Murray


R. David Murray  added the comment:

As far as I know you currently still have to specify the policy.  It was, yes, 
intended that 'default' become the actual default.  I could have sworn there 
was an open issue for doing this, but I can't find it.  I remember having a 
conversation with someone who said they were going to work on getting it done, 
but unfortunately I don't remember who :(

I'm not very active in the python community currently so I can't really drive 
it, but it should definitely happen.

--

___
Python tracker 
<https://bugs.python.org/issue40359>
___



[issue40359] email.parse part.get_filename() fails to unwrap long attachment file names (legacy API)

2020-04-23 Thread R. David Murray


R. David Murray  added the comment:

Yeah, that looks like a bug in the old API.  If you try the new API, it does 
the right thing.  To do that, import email.policy and make your 
message_from_string call:

  email.message_from_string(raw, policy=email.policy.default)

Note, however, that you really ought to be using message_from_bytes.  
Serialized email messages are bytes, not unicode, and using message_from_string 
will get you into other trouble.
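
A minimal sketch of the bytes-based variant, using a made-up message (the 
addresses, filename, and payload are invented for illustration):

```python
import email
import email.policy
from email.message import EmailMessage

# A small serialized message, as bytes, with one PDF-typed body.
raw = (
    b"From: alice@example.com\r\n"
    b"To: bob@example.com\r\n"
    b"Subject: report\r\n"
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: application/pdf; name="report.pdf"\r\n'
    b'Content-Disposition: attachment; filename="report.pdf"\r\n'
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    b"aGVsbG8=\r\n"
)

# policy=email.policy.default selects the new API (EmailMessage objects).
msg = email.message_from_bytes(raw, policy=email.policy.default)
filename = msg.get_filename()
payload = msg.get_content()    # base64 is decoded for us
```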

I don't know if it is worth fixing the old API.

--
title: email.parse part.get_filename() fails to unwrap long attachment file 
names -> email.parse part.get_filename() fails to unwrap long attachment file 
names (legacy API)

___
Python tracker 
<https://bugs.python.org/issue40359>
___



[issue39073] email incorrect handling of crlf in Address objects.

2020-03-29 Thread R. David Murray


Change by R. David Murray :


--
stage: patch review -> backport needed

___
Python tracker 
<https://bugs.python.org/issue39073>
___



[issue39073] email incorrect handling of crlf in Address objects.

2020-03-29 Thread R. David Murray


R. David Murray  added the comment:

Thanks!

--

___
Python tracker 
<https://bugs.python.org/issue39073>
___



[issue39073] email incorrect handling of crlf in Address objects.

2020-03-29 Thread R. David Murray


R. David Murray  added the comment:


New changeset 614f17211c5fc0e5b828be1d3320661d1038fe8f by Ashwin Ramaswami in 
branch 'master':
bpo-39073: validate Address parts to disallow CRLF (#19007)
https://github.com/python/cpython/commit/614f17211c5fc0e5b828be1d3320661d1038fe8f


--

___
Python tracker 
<https://bugs.python.org/issue39073>
___



[issue39966] mock 3.9 bug: Wrapped objects without __bool__ raise exception

2020-03-28 Thread R. David Murray


R. David Murray  added the comment:

My guess is that it isn't so much that __bool__ is special, as that the 
evaluation of values in a boolean context is special.  What you have to do to 
make a mock behave "correctly" in the face of that I'm not sure (I haven't 
investigated).  And I might be wrong.

--

___
Python tracker 
<https://bugs.python.org/issue39966>
___



[issue39073] email incorrect handling of crlf in Address objects.

2020-03-15 Thread R. David Murray


R. David Murray  added the comment:

Thanks for the PR.  I've made some review comments.

--

___
Python tracker 
<https://bugs.python.org/issue39073>
___



[issue27793] Double underscore variables in module are mangled when used in class

2020-03-06 Thread R. David Murray


R. David Murray  added the comment:

You are welcome to open a doc-enhancement issue for the global docs.  For the 
other, as noted already if you want to advocate for a change to this behavior 
you need to start on python-ideas, but I don't think you will get any traction.

Another possible enhancement you could propose (in a new issue) is to have the 
global statement check for variables that start with '__' and do something 
appropriate such as issue a warning...although I don't really know how hard 
that would be to implement.

--

___
Python tracker 
<https://bugs.python.org/issue27793>
___



[issue39771] EmailMessage may need to support RFC-non-compliant MIME parameter encoding (encoded words in quotes) for output.

2020-02-29 Thread R. David Murray


R. David Murray  added the comment:

I actually agree: if most (by market share) MUAs handle the RFC-incorrect 
parameter encoding style, and a significant portion does not handle the RFC 
correct style, then we should support the de-facto standard rather than the 
official standard as the default.  I just wish Microsoft would write better 
software :)  If on the other hand it is only Microsoft out of the big market 
share players that is broken, I'm not sure I'd want it to be the default.  But 
we could still support it optionally.

So yeah, we could have a policy control that governs which one is actually used.

So this is a feature request, and ideally should be supported by an 
investigation of what MUAs support what, by market share.  And there's another 
question: does this only affect the filename parameter, or is it all MIME 
parameters?  I would expect it to be the latter, but someone should check at 
least a few examples of that to be sure.

--
stage:  -> needs patch
title: EmailMessage.add_header doesn't work -> EmailMessage may need to support 
RFC-non-compliant MIME parameter encoding (encoded words in quotes) for output.
type: behavior -> enhancement

___
Python tracker 
<https://bugs.python.org/issue39771>
___



[issue39793] make_msgid fail on FreeBSD 12.1-RELEASE-p1 with different domains

2020-02-29 Thread R. David Murray


R. David Murray  added the comment:

I don't object to this patch, but that sure looks like a broken system.

--

___
Python tracker 
<https://bugs.python.org/issue39793>
___



[issue39757] EmailMessage bad encoding for international domain

2020-02-28 Thread R. David Murray


R. David Murray  added the comment:

This is not actually a duplicate of 11783.  Rereading (parts of) that issue, we 
decided we currently have no good way to do automatic conversion between 
unicode and internationalized domains, so the user of the library has to do it 
themselves.  This means that the bug *here* is that the new email API is 
*wrongly* encoding the non-ascii in the domain by using an encoded word.  I'm 
surprised at that; I thought I'd guarded against it.

What should be happening here is that an error should be raised when that 
header is set (or possibly when it is accessed/serialized, but when set would 
be better I think) saying that there is non-ascii in the domain part.
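
Until then, an application can do the conversion itself before building 
the address.  A minimal sketch, assuming Python's built-in "idna" codec 
(which implements IDNA 2003) is good enough for the domains involved; 
idna_address is a hypothetical helper, not part of the email package:

```python
from email.headerregistry import Address


def idna_address(display_name, username, domain):
    """Return an Address whose domain is converted to its ascii form."""
    ascii_domain = domain.encode("idna").decode("ascii")
    return Address(display_name, username, ascii_domain)


addr = idna_address("Joerg", "joerg", "bücher.example")
```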

--
resolution: duplicate -> 
stage: resolved -> needs patch
status: closed -> open
superseder: email parseaddr and formataddr should be IDNA aware -> 
title: EmailMessage wrong encoding for international domain -> EmailMessage bad 
encoding for international domain

___
Python tracker 
<https://bugs.python.org/issue39757>
___



[issue39771] EmailMessage.add_header doesn't work

2020-02-28 Thread R. David Murray


R. David Murray  added the comment:

Since Outlook is one of the mailers that generates the non-RFC-compliant 
headers, it doesn't surprise me all that much that it can't interpret the RFC 
compliant headers correctly.

I'm not sure there is anything we can do here.

I suppose someone could do a survey of mail clients and document which ones can 
handle which style of parameter encoding.  If it turns out more handle the 
"wrong" way than handle the "right" way, we could consider adapting to the 
de-facto standard, although I won't like it much :)

(There is also a possibility there is a bug in our RFC compliance, but this is 
the first problem report I've seen.)

--

___
Python tracker 
<https://bugs.python.org/issue39771>
___



[issue39771] EmailMessage.add_header doesn't work

2020-02-27 Thread R. David Murray


R. David Murray  added the comment:

The legacy API appears to be using an RFC-incorrect (but common) encoded-word 
encoding, while the new API is using the RFC-compliant MIME-parameter encoding 
(% encoding).  Which email client are you using?
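
For reference, a minimal sketch of the new API handling a non-ascii 
attachment filename (the filename and payload are invented for 
illustration).  It only checks that the name survives a round trip through 
the library itself; how a particular mail client displays the serialized 
header is a separate question:

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "attachment demo"
msg.set_content("see attachment")
msg.add_attachment(b"hello", maintype="application",
                   subtype="octet-stream", filename="说明.txt")

# Serialize, then parse the bytes back with the same (new) API.
wire = msg.as_bytes()
parsed = message_from_bytes(wire, policy=policy.default)
attachment = next(parsed.iter_attachments())
filename = attachment.get_filename()
```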

--

___
Python tracker 
<https://bugs.python.org/issue39771>
___



[issue39771] EmailMessage.add_header doesn't work

2020-02-27 Thread R. David Murray


R. David Murray  added the comment:

Actually, given that the contentmanager does accept a charset parameter for 
text content, it does seem reasonable to treat this as a bug.  But as I said 
fixing it may not be trivial.

--

___
Python tracker 
<https://bugs.python.org/issue39771>
___



[issue39771] EmailMessage.add_header doesn't work

2020-02-27 Thread R. David Murray


R. David Murray  added the comment:

I think you are saying that you want the charset in the encoded filename to be 
GBK rather than utf-8?  utf-8 should certainly display correctly in your email 
client, though, so if it is not, there is something else going wrong.

As far as the 3 tuple not working to set the charset...I believe what is 
happening there is that a header created by the application gets "refolded" on 
serialization, and refolding doesn't keep the existing charset, it converts 
everything to utf-8.  This is an intentional part of the design: the library 
handles the gory details of MIME and uses utf-8 as the charset for application 
created content.  It is actually an accident of the implementation that the 
tuple form of the filename is even accepted; you will note that it is *not* 
documented in the contentmanager docs.

It wouldn't be crazy to ask for this as a feature, and it could even be treated 
as a bug that it doesn't work if we want to, but it may not be easy to "fix", 
because it goes against the design philosophy of the new API.

--

___
Python tracker 
<https://bugs.python.org/issue39771>
___



[issue39384] Email parser creates a message object that can't be flattened as bytes.

2020-02-04 Thread R. David Murray


R. David Murray  added the comment:

message_from_bytes

--

___
Python tracker 
<https://bugs.python.org/issue39384>
___



[issue39384] Email parser creates a message object that can't be flattened as bytes.

2020-02-04 Thread R. David Murray


R. David Murray  added the comment:

If we can get an actual reproducer using message_as_bytes I'd feel more 
comfortable with the fix.  I worry that there is some other bug this is 
exposing that should be fixed instead.

--

___
Python tracker 
<https://bugs.python.org/issue39384>
___



[issue10740] sqlite3 module breaks transactions and potentially corrupts data

2020-01-25 Thread R. David Murray


R. David Murray  added the comment:

Please open a new issue for this question.

--

___
Python tracker 
<https://bugs.python.org/issue10740>
___



[issue24337] Implement `http.client.HTTPMessage.__repr__` to make debugging easier

2020-01-22 Thread R. David Murray


R. David Murray  added the comment:

Thanks for the PR, but I've noted an issue on the review.  In any case we 
should agree on what goes in the repr here in this issue before actually 
implementing anything.

--

___
Python tracker 
<https://bugs.python.org/issue24337>
___



[issue39309] Please delete my account

2020-01-20 Thread R. David Murray


R. David Murray  added the comment:

AFAIR it can only be done using the roundup command line on the server.

--
nosy: +ezio.melotti

___
Python tracker 
<https://bugs.python.org/issue39309>
___



[issue39384] Email parser creates a message object that can't be flattened as bytes.

2020-01-20 Thread R. David Murray

R. David Murray  added the comment:

Since you parsed it as a string it is not really legitimate to serialize it as 
bytes.  (That will work if the input message only contains ascii, but not if it 
contains unicode).  You'll get the same error if you replace the garbage with 
the "’".  Using errors=replace is not crazy, but it hides the actual problem.  
Let's see what other people think :)

In theory you could "fix" this by encoding the unicode using the charset 
specified by the container.  I have no idea how complicated it will be to do 
that, and it would be a new feature: parsing strings is specified to only work 
with ASCII input, currently.

I put "fix" in quotes, because even if you make text parts like this example 
work, you still can't handle non-text 8bit mime parts.  Is it worth doing 
anyway?

Really, message_from_string and friends should just be avoided entirely, maybe 
even deprecated.

--

___
Python tracker 
<https://bugs.python.org/issue39384>
___



[issue23434] support encoded filename in Content-Disposition for HTTP in cgi.FieldStorage

2020-01-07 Thread R. David Murray


R. David Murray  added the comment:

Are you saying there is no (http) RFC compliant way to fix this, or no way to 
fix it with the email library parsers?  If the latter, the library is pretty 
flexible and for internal stdlib use it would probably be permissible to 
directly call methods in the internal parsing module, if those would be useful.

I haven't re-read the issue to reload my brain, so this question may be off 
point (except for the first clause of the question).

--

___
Python tracker 
<https://bugs.python.org/issue23434>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23147] Possible error in _header_value_parser.py

2020-01-07 Thread R. David Murray


R. David Murray  added the comment:

Thanks for the ping.  Whether or not Serhiy's patch fixed the original problem, 
the algorithm rewrite has happened so this issue is no longer relevant in any 
case.

--
stage: test needed -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue23147>
___



[issue39040] Wrong attachement filename when mail mime header was too long

2019-12-24 Thread R. David Murray


R. David Murray  added the comment:

I don't see the change to the test in the PR.  Did you miss a push or is github 
doing something wonky with the review?  (I haven't used github review in a 
while and I had forgotten how hard it is to use...)

--

___
Python tracker 
<https://bugs.python.org/issue39040>
___



[issue39131] signing needs two serialisation passes

2019-12-24 Thread R. David Murray


R. David Murray  added the comment:

Ideally this should be exposed by extending the content manager.  Instantiating 
MIME classes is part of the old API, not the new. The code in the PR may well 
be correct, but the class should be hidden from the normal user (of the new API).  
I'm not sure what the best way to specify the signing function will be, but I'm 
guessing a new keyword parameter in the content API.

Note that the current content management API is more of a framework than a 
fully worked out system, so figuring out the best way to add this may require 
some design discussion.

--

___
Python tracker 
<https://bugs.python.org/issue39131>
___



[issue39040] Wrong attachement filename when mail mime header was too long

2019-12-17 Thread R. David Murray


R. David Murray  added the comment:

One more tweak to the test and we'll be good to go.

--

___
Python tracker 
<https://bugs.python.org/issue39040>
___



[issue39073] email incorrect handling of crlf in Address objects.

2019-12-17 Thread R. David Murray


R. David Murray  added the comment:

Hmm.  Yes, \r\n should be disallowed in the arguments to Address.  I thought it 
already was, so that's a bug.  That bug produces the other apparent bug as 
well: because the X: was treated as a separate line, the previous header did 
not need double quotes so they are no longer added.

So there's no 3.8 specific bug here, but there is a bug.
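
Independent of which Python version is in use, an application can guard 
against this itself.  A minimal sketch; checked_address is a hypothetical 
wrapper, not part of the email package:

```python
from email.headerregistry import Address


def checked_address(display_name="", username="", domain=""):
    """Build an Address, refusing parts that contain CR or LF."""
    for part in (display_name, username, domain):
        if "\r" in part or "\n" in part:
            raise ValueError("address parts cannot contain CR or LF")
    return Address(display_name, username, domain)


ok = checked_address("Alice", "alice", "example.com")

try:
    # An attempt to smuggle an extra header in via the display name.
    checked_address("Alice\r\nX: injected", "alice", "example.com")
    rejected = False
except ValueError:
    rejected = True
```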

--
title: email regression in 3.8: folding -> email incorrect handling of crlf in 
Address objects.

___
Python tracker 
<https://bugs.python.org/issue39073>
___



[issue39071] email.parser.BytesParser - parse and parsebytes work not equivalent

2019-12-17 Thread R. David Murray


R. David Murray  added the comment:

All of which isn't to discount that you might have found a bug, by the way, 
if you want to investigate further :)

--

___
Python tracker 
<https://bugs.python.org/issue39071>
___



[issue39071] email.parser.BytesParser - parse and parsebytes work not equivalent

2019-12-17 Thread R. David Murray


R. David Murray  added the comment:

The problem is that you are starting with different inputs.  unicode strings 
and bytes are different things, and so parsing them can produce different 
results.  The fact of the matter is that email messages are defined to be 
bytes, so parsing a unicode string pretending it is an email message is just 
asking for errors anyway.  The string parsing methods are really only provided 
for backward compatibility and historical reasons.

I thought this was clear from the existing documentation, but clearly it isn't 
:)  I'll review a suggested doc change, but the thing to explain is not that 
parse and parsebytes might produce different results, but that parsing email 
from strings is not a good idea and will likely produce unexpected results for 
anything except the simplest non-mime messages.

Note: the reason you got different checksums might have had to do with line 
ends, depending on how you calculated the checksums.  You should also consider 
using get_content and not get_payload.  get_payload has a weird legacy API that 
doesn't always do what you think it will, and that might be another source of 
checksum issues.  But really, parsing a unicode representation of a mime 
message is just likely to be buggy.
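
A minimal sketch of the difference, parsing bytes with the new API and an 
invented base64 text part:

```python
from email import policy
from email.parser import BytesParser

raw = (
    b"Subject: demo\r\n"
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: text/plain; charset="utf-8"\r\n'
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    b"Y2Fmw6k=\r\n"
)

msg = BytesParser(policy=policy.default).parsebytes(raw)

text = msg.get_content()         # transfer decoding and charset applied
raw_payload = msg.get_payload()  # still the base64 text from the wire
```

A checksum taken over raw_payload and one taken over text will naturally 
differ, which is one way to end up comparing the wrong things.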

--

___
Python tracker 
<https://bugs.python.org/issue39071>
___



[issue39040] Wrong attachement filename when mail mime header was too long

2019-12-16 Thread R. David Murray


R. David Murray  added the comment:

In general your solution looks good, just a few naming comments and an 
additional test request.

--

___
Python tracker 
<https://bugs.python.org/issue39040>
___



[issue39040] Wrong attachement filename when mail mime header was too long

2019-12-15 Thread R. David Murray


R. David Murray  added the comment:

The example you want to look at is get_unstructured.  That shows both lookback 
and modification of the parse tree to handle the whitespace between encoded 
words.
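
The behavior for unstructured headers can be seen from the outside with a 
minimal sketch like this (the Subject value is invented for illustration):

```python
from email import message_from_bytes, policy

# Two adjacent encoded words separated by a single blank.
raw = b"Subject: =?utf-8?q?Schul?= =?utf-8?q?besuch?=\r\n\r\nbody\r\n"

msg = message_from_bytes(raw, policy=policy.default)
subject = str(msg["Subject"])   # the blank between the words is dropped
```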

--

___
Python tracker 
<https://bugs.python.org/issue39040>
___



[issue39040] Wrong attachement filename when mail mime header was too long

2019-12-14 Thread R. David Murray


R. David Murray  added the comment:

And you are right that this is a very common bug in email programs.  So common 
that I suspect the RFC folks will eventually have to accept it as a de-facto 
standard.  So we do need to support it in the python email library.

--

___
Python tracker 
<https://bugs.python.org/issue39040>
___



[issue39040] Wrong attachement filename when mail mime header was too long

2019-12-14 Thread R. David Murray


R. David Murray  added the comment:

Yes, google should fix their bug.  However, the python email package tries very 
hard to interpret even RFC-non-compliant emails when there is a way to do so.  
As I said, the package already tries to interpret headers such as google is 
generating, it's just that there is a bug in that interpretation: it is keeping 
the blank between the encoded words when it should not be.  That bug can be 
fixed, in get_raw_encoded_word and/or get_parameter, in 
email._header_value_parser.

--

___
Python tracker 
<https://bugs.python.org/issue39040>
___



[issue39040] Wrong attachement filename when mail mime header was too long

2019-12-13 Thread R. David Murray


R. David Murray  added the comment:

That header is *completely* non-RFC compliant.  If gmail generated that header 
there is something very wrong in google-land :(

The RFC compliant formatting for that header looks like this:

Content-Disposition: attachment;
 filename*=utf-8''Schulbesuchsbest%C3%A4ttigung.pdf

You will note that this is nothing like encoded word format.  Encoded words are 
not valid inside quoted strings, and quoted strings can't be used in mime 
header attributes if there are non-ascii characters involved.  Nor can encoded 
words.  
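
A minimal sketch showing the RFC compliant form above being parsed; the 
rest of the message (content type and payload) is invented for 
illustration:

```python
from email import message_from_bytes, policy

raw = (
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: application/pdf\r\n"
    b"Content-Disposition: attachment;\r\n"
    b" filename*=utf-8''Schulbesuchsbest%C3%A4ttigung.pdf\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    b"aGVsbG8=\r\n"
)

msg = message_from_bytes(raw, policy=policy.default)
filename = msg.get_filename()   # the % encoded utf-8 bytes are decoded
```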

Now, all that said, there is an obvious rule that can be followed to understand 
what that header is trying to convey, and the current parser already implements 
most of it (you will find comments about it in the parser, as well as defects 
being registered).  So, a patch to _header_value_parser to fix the error 
recovery will be accepted.  I've looked at the code to remind myself, but not 
deeply enough to be *sure* where the changes need to be made.  There are two 
possibilities I see off the bat (and both may need fixing): 
get_bare_quoted_string and get_parameter.  Either one or both of those may be 
forgetting that whitespace between encoded words should be dropped.

--

___
Python tracker 
<https://bugs.python.org/issue39040>
___



[issue39040] Wrong attachement filename when mail mime header was too long

2019-12-13 Thread R. David Murray


R. David Murray  added the comment:

Thanks for the report.  Can you provide an example that reproduces the problem? 
 

Per the RFC, lines may be broken before whitespace in certain places in certain 
headers, but that does not make the whitespace go away.  Only the crlf sequence 
is removed when unfolding the header, per the RFC, so your proposed fix is 
incorrect. I suspect your example header is invalid, and the question will then 
become is there some sort of Postel-style error recovery we can and want to do 
in the function that parses the content-disposition header.

--

___
Python tracker 
<https://bugs.python.org/issue39040>
___



[issue38625] SpooledTemporaryFile does not seek correctly after being rolled over

2019-11-24 Thread R. David Murray


R. David Murray  added the comment:

The docs currently say "The returned object is a file-like object whose _file 
attribute is either an io.BytesIO or io.StringIO object (depending on whether 
binary or text mode was specified) or a true file object, depending on whether 
rollover() has been called."  The fact that taking an iterator gets you 
whatever the *current* _file object is is implied by that but not made 
explicit.  A doc update to make that explicit would probably be appropriate.
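
A minimal sketch of what that sentence implies, in binary mode with 
invented data: the object behind _file is replaced by rollover(), so 
anything obtained from the old one no longer refers to the live file.

```python
import io
import tempfile

with tempfile.SpooledTemporaryFile(max_size=1024, mode="w+b") as spooled:
    spooled.write(b"line one\nline two\n")
    in_memory = spooled._file    # an io.BytesIO before rollover
    spooled.rollover()
    on_disk = spooled._file      # a real temporary file afterwards
    spooled.seek(0)
    lines = list(spooled)        # iterates over the *current* _file
```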

--

___
Python tracker 
<https://bugs.python.org/issue38625>
___



[issue38698] While parsing email message id: UnboundLocalError

2019-11-24 Thread R. David Murray


R. David Murray  added the comment:

Actually, the success path there should also check that value is empty, and if 
it is not register a defect for that as well.

--

___
Python tracker 
<https://bugs.python.org/issue38698>
___



[issue38672] mimetypes.init() fails if no access to one of known files

2019-11-24 Thread R. David Murray


R. David Murray  added the comment:

I haven't looked at this in detail, but here are my general thoughts: I think 
it would be reasonable to expect that the module would function even if the 
file permissions are screwed up, similar to how unix commands that try to read 
.netrc will (try to) function even if its permissions are wrong.  I would, 
however, expect the module to emit a warning in that case.  I'm of two minds 
about the behavior when the caller specifies filenames explicitly.  I could see 
that going either way, but I lean slightly toward making the behavior 
consistent.  While the programmer might appreciate the traceback, the user of 
the program would probably appreciate the "try to keep going" behavior, since 
the filenames provided will often be in the same class of "standard defaults" 
as the existing well known files are, just in the context of that particular 
application.  But like I said, that is just a lean, and I could go the other 
way on this as well :)

I haven't looked at the isfile issue, but it seems reasonable that if the path 
exists we should make sure it is a file before reading it...but perhaps readfp 
will effectively do that?  Write a test and see what happens :)
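
As a rough sketch of the "try to keep going, but warn" behavior discussed above: the helper `load_mime_files` is hypothetical and is not how mimetypes.init() is implemented. It skips paths that are not regular files and turns read errors into warnings, using only the public MimeTypes.read() method.

```python
import mimetypes
import os
import tempfile
import warnings


def load_mime_files(paths):
    """Hypothetical helper: read each usable file, warn instead of failing."""
    db = mimetypes.MimeTypes()
    for path in paths:
        if not os.path.isfile(path):
            continue  # missing paths and directories are skipped silently
        try:
            db.read(path)
        except OSError as exc:
            warnings.warn(f"could not read {path}: {exc}")
    return db


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "mime.types")
    with open(good, "w", encoding="utf-8") as fh:
        fh.write("text/x-example exmpl\n")
    # A directory, a missing file, and one readable file.
    db = load_mime_files([tmp, os.path.join(tmp, "missing"), good])
    guessed = db.guess_type("notes.exmpl")

print(guessed)
```

Whether explicitly supplied filenames should get the same lenient treatment is the open question in the comment; the sketch treats all paths alike.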

I don't know whether to call this change a bug fix or a feature, so I guess 
we'd default to feature unless someone can tilt the balance with an argument :)

--

___
Python tracker 
<https://bugs.python.org/issue38672>
___



[issue38698] While parsing email message id: UnboundLocalError

2019-11-24 Thread R. David Murray


R. David Murray  added the comment:

More tests are always good :)

The "correct" solution here (as far as I remember, it has been a while since 
I've had time to even look at the _header_value_parser code) would be to add 
a new 'invalid-msg-id' token, and do this:

    message_id = MessageID()
    try:
        token, value = get_msg_id(value)
        message_id.append(token)
    except HeaderParseError as ex:
        message_id = InvalidMessageID(value)
        message_id.defects.append(InvalidHeaderDefect(
            f"Invalid msg_id: {ex}"))
    return message_id

--

___
Python tracker 
<https://bugs.python.org/issue38698>
___



[issue37532] email.header.make_header() doesn't work if any `ascii` code is out of range(128)

2019-08-01 Thread R. David Murray

R. David Murray  added the comment:

Right, and the python email package fully supports non ascii:

>>> msg = EmailMessage()
>>> msg['Subject'] = "Panamá- Casco Antiguo"
>>> bytes(msg)
b'Subject: =?utf-8?q?Panam=C3=A1-?= Casco Antiguo\n\n'
>>> str(msg)
'Subject: Panamá- Casco Antiguo\n\n'
>>> msg['subject']
'Panamá- Casco Antiguo'

make_header also supports non-ascii, you just have to tell it what charset you 
want to use.  Like I said, make_header is part of the *legacy* API, and it 
really is a pain to use.  That's why we wrote the new API.
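
For comparison, a minimal sketch of the legacy route, assuming utf-8 is the charset you want: make_header() takes (value, charset) pairs, so the non-ascii text has to be paired with its charset explicitly.

```python
from email.header import decode_header, make_header

subject = "Panamá- Casco Antiguo"

# Legacy API: the charset is given alongside the value.
header = make_header([(subject, "utf-8")])
encoded = header.encode()          # RFC 2047 encoded, ascii-only text

# Decoding the encoded form and rebuilding the header gives the text back.
round_tripped = str(make_header(decode_header(encoded)))

print(encoded)
print(round_tripped)
```

The exact encoded form (base64 or quoted-printable) is chosen by the charset machinery, so the sketch does not rely on it.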

--

___
Python tracker 
<https://bugs.python.org/issue37532>
___



[issue37532] email.header.make_header() doesn't work if any `ascii` code is out of range(128)

2019-08-01 Thread R. David Murray


R. David Murray  added the comment:

The input header is not valid (non-ascii is not allowed in headers), so you 
shouldn't expect make_header to do anything sensible.  Note that this is the 
legacy API, which is a toolkit and does not hold your hand when it comes to RFC 
compliance.  Aside from any other concerns, this is long standing behavior (it 
is the same in python2), and it doesn't make sense to change the behavior of a 
legacy API.

--
resolution:  -> not a bug
stage: patch review -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue37532>
___



[issue37491] IndexError in get_bare_quoted_string

2019-08-01 Thread R. David Murray


Change by R. David Murray :


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue37491>
___



[issue37492] should email.utils.parseaddr treat a@b. as invalid email ?

2019-07-13 Thread R. David Murray


R. David Murray  added the comment:

Right, those absolutely are valid addresses.  A resolver will normally look up 
a name with an internal dot first as if it were an FQDN, but if it does so and 
does not get an answer it will then look it up again as a "local" address 
(appending in turn the strings from the 'search' directive in resolv.conf or 
equivalent) *if* it does not end in a final dot.  If it does end in a final 
dot, no further lookup as local is done.

While it isn't *normal* to send email to a TLD using a trailing dot, it is 
*legal*.  In theory the address 'postmaster@com.' ought to be a valid email 
address (I doubt that it actually is, though). On the other hand, I will be 
very surprised if *all other* TLDs are without valid email addresses, 
especially the new ones.  It is also easy to imagine an environment using email 
with private single label domain names using trailing dots specifically to 
suppress appending of search domains for sandboxing reasons.  Thus the email 
library must support it as valid, both for RFC reasons and for practical 
reasons.

--

___
Python tracker 
<https://bugs.python.org/issue37492>
___



[issue37482] Email address display name fails with both encoded words and special chars

2019-07-10 Thread R. David Murray


R. David Murray  added the comment:

The display name is a phrase, and a phrase is a sequence of words, and a word 
is either a quoted string or an atom.  So it is legal to mix quoted strings and 
encoded words in a display name.  I'd vote to do whichever one is easier to 
implement :)  (I haven't looked at your PR yet and unfortunately my time is 
limited :(

--

___
Python tracker 
<https://bugs.python.org/issue37482>
___



[issue37482] Email address display name fails with both encoded words and special chars

2019-07-10 Thread R. David Murray

R. David Murray  added the comment:

FYI, it would have been most helpful if you had posted your example in the 
issue text instead of as an attached file, as it explains the problem better 
than your text does :)

Here is a minimal reproducer:

>>> m = EmailMessage(policy=strict)
>>> m['From'] = '"Foo Bar, España" '
>>> bytes(m)
b'From: Foo Bar, =?utf-8?q?Espa=C3=B1a?= \n\n'

This serialization of the header is, as you say, invalid.  Either the comma 
should be encoded, or the "Foo Bar," should be in quotes.
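
The two valid forms can be sketched with existing helpers. The address foo@example.com is a placeholder chosen for illustration and is not from the report. str() on an Address quotes a display name containing specials, while formataddr() encodes the whole non-ascii display name, comma included.

```python
from email.headerregistry import Address
from email.utils import formataddr

# Placeholder address, assumed only for this example.
addr = Address(display_name="Foo Bar, España",
               username="foo", domain="example.com")

# Option 1: the display name is put in quotes because it contains a comma.
quoted = str(addr)

# Option 2: the whole display name, comma included, becomes an encoded word.
encoded = formataddr(("Foo Bar, España", "foo@example.com"))

print(quoted)
print(encoded)
```

Neither call goes through the header folding code that produces the invalid serialization shown above; they only show what a valid result looks like.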

--

___
Python tracker 
<https://bugs.python.org/issue37482>
___



[issue37357] mbox From line wrongly detected

2019-07-09 Thread R. David Murray


R. David Murray  added the comment:

This problem is the whole reason "mangle_from" exists in the email library...
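
A short sketch of what that option does, using a made-up message whose body starts with "From ". With mangle_from_=True the generator escapes such lines so an mbox reader will not mistake them for a message separator.

```python
import io
from email import message_from_string
from email.generator import Generator

# Made-up message; the body line begins with "From ".
msg = message_from_string(
    "Subject: demo\n\nFrom the start, this line looks like a separator.\n")


def flatten(message, mangle):
    """Serialize the message with or without From-line mangling."""
    buf = io.StringIO()
    Generator(buf, mangle_from_=mangle).flatten(message)
    return buf.getvalue()


mangled = flatten(msg, True)    # body line becomes ">From the start, ..."
plain = flatten(msg, False)     # body line is left untouched

print(mangled)
```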

--

___
Python tracker 
<https://bugs.python.org/issue37357>
___



[issue31445] Index out of range in get of message.EmailMessage.get()

2019-07-09 Thread R. David Murray


R. David Murray  added the comment:

Note that the reporter indicated that the message was an instance of 
EmailMessage (the new API).  You'd need to use policy=default to get that using 
message_from_string.  But yes, this was fixed in another issue.
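
A minimal sketch of the difference, with a throwaway message text: without a policy argument message_from_string() returns a legacy Message, and with policy=default it returns an EmailMessage.

```python
from email import message_from_string
from email.message import EmailMessage, Message
from email.policy import default

raw = "Subject: demo\n\nbody\n"

legacy = message_from_string(raw)                  # compat32 policy
modern = message_from_string(raw, policy=default)  # new API

print(type(legacy).__name__, type(modern).__name__)
```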

--
stage: patch review -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue31445>
___



[issue32179] Empty email address in headers triggers an IndexError

2019-07-09 Thread R. David Murray


R. David Murray  added the comment:

BareQuotedString implies the new API is being used, though that was not made 
clear in the report.  However, unlike the other recently closed issue, this one 
was in fact fixed (and I have a vague memory of reviewing the PR):

>>> m = message_from_string('ReplyTo: ""', policy=default)
>>> m['ReplyTo']
'""'

--

___
Python tracker 
<https://bugs.python.org/issue32179>
___



[issue32178] Some invalid email address groups cause an IndexError instead of a HeaderParseError

2019-07-09 Thread R. David Murray


R. David Murray  added the comment:

The fact that the original report mentions HeaderParseError implies that the 
new API is being used, though the report didn't make that clear.  The problem 
still exists:

>>> m = message_from_string("To: :Foo  \n\n", policy=default)
>>> m['To']
Traceback (most recent call last):
  File "", line 1, in 
  File "/home/rdmurray/python/p38/Lib/email/message.py", line 391, in __getitem__
    return self.get(name)
  File "/home/rdmurray/python/p38/Lib/email/message.py", line 471, in get
    return self.policy.header_fetch_parse(k, v)
  File "/home/rdmurray/python/p38/Lib/email/policy.py", line 163, in header_fetch_parse
    return self.header_factory(name, value)
  File "/home/rdmurray/python/p38/Lib/email/headerregistry.py", line 602, in __call__
    return self[name](name, value)
  File "/home/rdmurray/python/p38/Lib/email/headerregistry.py", line 197, in __new__
    cls.parse(value, kwds)
  File "/home/rdmurray/python/p38/Lib/email/headerregistry.py", line 343, in parse
    groups.append(Group(addr.display_name,
  File "/home/rdmurray/python/p38/Lib/email/_header_value_parser.py", line 315, in display_name
    return self[0].display_name
  File "/home/rdmurray/python/p38/Lib/email/_header_value_parser.py", line 382, in display_name
    return self[0].display_name
  File "/home/rdmurray/python/p38/Lib/email/_header_value_parser.py", line 564, in display_name
    if res[0].token_type == 'cfws':
IndexError: list index out of range

--
resolution: out of date -> 
status: closed -> open

___
Python tracker 
<https://bugs.python.org/issue32178>
___



[issue19645] decouple unittest assertions from the TestCase class

2019-07-09 Thread R. David Murray


R. David Murray  added the comment:

"But - what are we solving for here?"  I'll tell you what my fairly common use 
case is.  Suppose I have some test infrastructure code, and I want to make some 
assertions in it.  What I invariably end up doing is passing 'self' into the 
infrastructure method/class just so I can call the assert methods from it.  I'd 
much rather be just calling the assertions, without carrying the whole test 
object around.  It *works* to do that, but it bothers me every time I do it or 
read it in code, and it makes the infrastructure code needlessly more 
complicated and slightly harder to understand/read.
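
One workaround that exists today, sketched here with a hypothetical helper `check_response` and a made-up response dict: instantiate a bare unittest.TestCase once and use it purely as an assertion provider, so infrastructure code does not need the running test passed in. This is not the decoupling the issue proposes, only a stopgap.

```python
import unittest

# A TestCase can be instantiated without a test method; here it serves
# only as a source of assert* methods.
_assertions = unittest.TestCase()


def check_response(response):
    """Hypothetical infrastructure helper that makes assertions itself."""
    _assertions.assertIn("status", response)
    _assertions.assertEqual(response["status"], 200)


check_response({"status": 200})   # passes silently
```

A failing check raises AssertionError as usual, so the calling test still fails.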

--

___
Python tracker 
<https://bugs.python.org/issue19645>
___



[issue27737] email.header.Header.encode() crashes with IndexError on spaces only value

2019-05-22 Thread R. David Murray


R. David Murray  added the comment:


New changeset 0416d6f05a96e0f1b3751aa97abfffe6d3323976 by R. David Murray (Miss 
Islington (bot)) in branch '3.7':
bpo-27737: Allow whitespace only headers encoding (GH-13478) (#13517)
https://github.com/python/cpython/commit/0416d6f05a96e0f1b3751aa97abfffe6d3323976


--

___
Python tracker 
<https://bugs.python.org/issue27737>
___



[issue36520] Email header folded incorrectly

2019-05-22 Thread R. David Murray

R. David Murray  added the comment:

Nevermind, I was testing with the wrong version of python.  This bug was 
introduced somewhere after 3.4 :(

>>> from email.message import EmailMessage
>>> m = EmailMessage()
>>> m['Subject'] = 'Hello Wörld! Hello Wörld! Hello Wörld! Hello Wörld!Hello Wörld!'
>>> bytes(m)
b'Subject: Hello =?utf-8?q?W=C3=B6rld!_Hello_W=C3=B6rld!_Hello_W=C3=B6rld!?=\n Hello =?utf-8?=?utf-8?q?q=3FW=3DC3=3DB6rld!Hello=3F=3D_W=C3=B6rld!?=\n\n'

--

___
Python tracker 
<https://bugs.python.org/issue36520>
___



[issue36520] Email header folded incorrectly

2019-05-22 Thread R. David Murray


R. David Murray  added the comment:

Can you demonstrate the problem with an actual email object?  
header_store_parse is not meant to be called directly.

--

___
Python tracker 
<https://bugs.python.org/issue36520>
___



[issue27737] email.header.Header.encode() crashes with IndexError on spaces only value

2019-05-22 Thread R. David Murray


R. David Murray  added the comment:

Thanks.

--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed
versions: +Python 3.7, Python 3.8 -Python 3.4, Python 3.5, Python 3.6

___
Python tracker 
<https://bugs.python.org/issue27737>
___



[issue27737] email.header.Header.encode() crashes with IndexError on spaces only value

2019-05-22 Thread R. David Murray

R. David Murray  added the comment:


New changeset ef5bb25e2d6147cd44be9c9b166525fb30485be0 by R. David Murray 
(Batuhan Taşkaya) in branch 'master':
bpo-27737: Allow whitespace only headers encoding (#13478)
https://github.com/python/cpython/commit/ef5bb25e2d6147cd44be9c9b166525fb30485be0


--

___
Python tracker 
<https://bugs.python.org/issue27737>
___



[issue33524] non-ascii characters in headers causes TypeError on email.policy.Policy.fold

2019-05-17 Thread R. David Murray


R. David Murray  added the comment:


New changeset feac6cd7753425fba006e97e2d9b74a0c0c75894 by R. David Murray 
(Abhilash Raj) in branch 'master':
bpo-33524: Fix the folding of email header when max_line_length is 0 or None (#13391)
https://github.com/python/cpython/commit/feac6cd7753425fba006e97e2d9b74a0c0c75894


--

___
Python tracker 
<https://bugs.python.org/issue33524>
___



[issue21315] email._header_value_parser does not recognise in-line encoding changes

2019-05-17 Thread R. David Murray


R. David Murray  added the comment:

I don't see that line of code in unstructured_ew_without_whitespace.diff.

Oh, you are referring to his monkey patch.  Yes, that is not a suitable 
solution for anyone but him, and I don't think he meant to imply otherwise :)

--

___
Python tracker 
<https://bugs.python.org/issue21315>
___



[issue21315] email._header_value_parser does not recognise in-line encoding changes

2019-05-17 Thread R. David Murray


R. David Murray  added the comment:

A cleaner/safer solution here would be:

  tok, *remainder = _wsp_splitter(value, 1)
  if _rfc2047_matcher(tok):
      tok, *remainder = value.partition('=?')
  
where _rfc2047_matcher would be a regex that matches a correctly formatted 
encoded word. There is a regex for that in the header.py module, though for this 
application we don't need the groups it has.

Abhilash, I'm not sure why you say the proposed solution only works for utf-8 
and 'q'?

--

___
Python tracker 
<https://bugs.python.org/issue21315>
___



[issue36564] Infinite loop with short maximum line lengths in EmailPolicy

2019-05-17 Thread R. David Murray


R. David Murray  added the comment:

Right, one of the fundamental principles of the email library is that when 
parsing input we do not ever raise an error.  We may note defects, but whatever 
we get we *must* parse and turn into *something*.
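
A small sketch of that principle, using a deliberately broken made-up message: a multipart content type with no boundary parameter still parses, and the problem is recorded in the defects list instead of being raised.

```python
from email import message_from_string
from email.errors import NoBoundaryInMultipartDefect

# Made-up input: claims to be multipart but supplies no boundary.
msg = message_from_string(
    "Content-Type: multipart/mixed\n\nnot really multipart\n")

# No exception was raised; the parser noted the problem instead.
defect_types = [type(d) for d in msg.defects]
print(defect_types)
```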

--

___
Python tracker 
<https://bugs.python.org/issue36564>
___



[issue36564] Infinite loop with short maximum line lengths in EmailPolicy

2019-05-17 Thread R. David Murray


R. David Murray  added the comment:

Good point about the backward compatibility.  Yes I agree, I think raising the 
error is probably better.  A deprecation warning seems like a good path 
forward...I will be very surprised if anyone encounters it, though :)

--

___
Python tracker 
<https://bugs.python.org/issue36564>
___



[issue36564] Infinite loop with short maximum line lengths in EmailPolicy

2019-05-17 Thread R. David Murray


R. David Murray  added the comment:

As for the other, I don't see the need for a custom error.  It's a ValueError 
in my view.  I wouldn't object to it strongly, but note that this error is 
content dependent.  If there's nothing to encode, you can "get away with" a 
shorter maxlen.  Though why you would want to is beyond me, and that's another 
reason I don't think this warrants a custom error class.

--

___
Python tracker 
<https://bugs.python.org/issue36564>
___



[issue36564] Infinite loop with short maximum line lengths in EmailPolicy

2019-05-17 Thread R. David Murray


R. David Murray  added the comment:

Can you demonstrate the parsing error?  maxlen should have no effect during 
parsing.

--

___
Python tracker 
<https://bugs.python.org/issue36564>
___



[issue34424] Unicode names break email header

2019-05-14 Thread R. David Murray


R. David Murray  added the comment:

Thank you.  I don't believe this is a security issue.

--

___
Python tracker 
<https://bugs.python.org/issue34424>
___



[issue36910] Certain Malformed email causes email.parser to throw AttributeError

2019-05-14 Thread R. David Murray


R. David Murray  added the comment:

Not a security issue, no.  This isn't C where a stack overflow can give an 
attacker a vector for injecting arbitrary code.

Per the Parser contract ("raise no exceptions, only register defects"), this 
should, as you say, register a defect 
(email.errors.InvalidMultipartContentTransferEncodingDefect) and assume a CTE 
of 7bit for the rest of the parsing.  The problem here is that the feedparser 
is running into the "hack" I put in place in python3.2 for dealing with invalid 
binary data in headers (which is to turn it into a Header with charset 
unknown-8bit).  That works most of the time, but in cases like this it breaks 
down :(

Note that the new API (policy=default and friends) handles this without error.

--

___
Python tracker 
<https://bugs.python.org/issue36910>
___


