[issue19662] smtpd.py should not decode utf-8

2015-05-19 Thread Arfrever Frehtes Taifersar Arahesis

Arfrever Frehtes Taifersar Arahesis added the comment:

 New changeset a7d3074fa888 by R David Murray in branch 'default':
 #19662: Make requirement to support arbitrary keywords explicit.
 https://hg.python.org/cpython/rev/a7d3074fa888

s/keword/keyword/

--
nosy: +Arfrever

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19662
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19662] smtpd.py should not decode utf-8

2015-05-19 Thread Roundup Robot

Roundup Robot added the comment:

New changeset a3f2b171b765 by R David Murray in branch 'default':
#19662: fix typo
https://hg.python.org/cpython/rev/a3f2b171b765

--




[issue19662] smtpd.py should not decode utf-8

2015-05-19 Thread R. David Murray

R. David Murray added the comment:

Thanks, Arfrever.

--




[issue19662] smtpd.py should not decode utf-8

2015-05-16 Thread Roundup Robot

Roundup Robot added the comment:

New changeset a7d3074fa888 by R David Murray in branch 'default':
#19662: Make requirement to support arbitrary keywords explicit.
https://hg.python.org/cpython/rev/a7d3074fa888

--




[issue19662] smtpd.py should not decode utf-8

2014-06-11 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 4e22213ca275 by R David Murray in branch 'default':
#19662: add decode_data to smtpd so you can get at DATA in bytes form.
http://hg.python.org/cpython/rev/4e22213ca275

--
nosy: +python-dev




[issue19662] smtpd.py should not decode utf-8

2014-06-11 Thread R. David Murray

R. David Murray added the comment:

Thanks, Maciej. 

I tweaked the patch a bit, you might want to take a look just for your own 
information.  Mostly I fixed the warning stuff, which I didn't explain very 
well.  The idea is that if the default is used (no value is specified), we want 
there to be a warning.  But if a value *is* specified, there should be no 
warning (the user knows what they want).  To accomplish that we make the actual 
default value None, and check for that.  I also had to modify the tests so that 
warnings aren't issued, as well as test that they actually get issued when the 
default is used.
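
The None-sentinel approach described above can be sketched roughly as follows. This is only an illustration, not the committed patch: the `Channel` class and the warning text are made up for the example, and the real constructor takes more arguments.

```python
import warnings


class Channel:
    """Hypothetical stand-in for the real channel class; only the
    decode_data argument handling is shown."""

    def __init__(self, decode_data=None):
        if decode_data is None:
            # The caller made no choice: warn, then fall back to the
            # old behaviour (decode the data).
            warnings.warn(
                "The decode_data default of True will change to False; "
                "pass decode_data explicitly",  # assumed wording
                DeprecationWarning,
                stacklevel=2,
            )
            decode_data = True
        # An explicit True or False is taken as-is, with no warning.
        self.decode_data = decode_data
```

Using None as the default is what lets the code tell "nothing was passed" apart from an explicit decode_data=True.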

I also added versionchanged directives and a whatsnew entry, and expanded the 
decode_data docs a bit.

--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed




[issue19662] smtpd.py should not decode utf-8

2014-06-11 Thread Roundup Robot

Roundup Robot added the comment:

New changeset a6c846ec5fd3 by R David Murray in branch 'default':
#19662: Eliminate warnings in other test modules that use smtpd.
http://hg.python.org/cpython/rev/a6c846ec5fd3

--




[issue19662] smtpd.py should not decode utf-8

2014-06-10 Thread Milan Oberkirch

Changes by Milan Oberkirch milan...@oberkirch.org:


--
nosy: +jesstess, zvyn




[issue19662] smtpd.py should not decode utf-8

2014-05-30 Thread Maciej Szulik

Maciej Szulik added the comment:

I've included Leslie's comments in the rst file. The 3rd version is attached as 
issue19662_v3.patch.

--
Added file: http://bugs.python.org/file35409/issue19662_v3.patch




[issue19662] smtpd.py should not decode utf-8

2014-05-29 Thread R. David Murray

R. David Murray added the comment:

Added review comments.

--




[issue19662] smtpd.py should not decode utf-8

2014-05-29 Thread Maciej Szulik

Maciej Szulik added the comment:

I've implemented all your proposed changes; I had been thinking along much the 
same lines for most of today, trying to make the code more elegant. The current 
state of work is attached as issue19662_v2.patch.

--
Added file: http://bugs.python.org/file35404/issue19662_v2.patch




[issue19662] smtpd.py should not decode utf-8

2014-05-28 Thread Maciej Szulik

Maciej Szulik added the comment:

I'm attaching the file issue19662_v1.patch. David, please have a look at it and 
let me know if this is what you had in mind; if not, I'm waiting for your 
suggestions.

--
Added file: http://bugs.python.org/file35390/issue19662_v1.patch




[issue19662] smtpd.py should not decode utf-8

2014-05-22 Thread R. David Murray

R. David Murray added the comment:

Yes, this will be fixed in 3.5 one way or another.

--




[issue19662] smtpd.py should not decode utf-8

2014-05-22 Thread Maciej Szulik

Maciej Szulik added the comment:

I'll try to take care of this issue in the following few days.

--




[issue19662] smtpd.py should not decode utf-8

2014-05-21 Thread Duke Dougal

Duke Dougal added the comment:

Is this one likely to be included in 3.5? It effectively breaks smtpd so it 
would be good to see it working again.

--




[issue19662] smtpd.py should not decode utf-8

2014-04-24 Thread Sreepriya Chalakkal

Sreepriya Chalakkal added the comment:

Hi Maciej,
I am travelling now, so there may be some delay before I can work on this. I 
learned that you are working on RFC 6532. Please feel free to take this up and 
fix it, since it is related to your work and I don't want to cause delays.

--




[issue19662] smtpd.py should not decode utf-8

2014-04-18 Thread Maciej Szulik

Maciej Szulik added the comment:

Sreepriya, are you still working on this issue? If not, I'll be happy to take 
it over; if yes, start by fixing the following things:
- start with tests - the most important thing is to have each feature tested
- decode_data, as David mentioned, needs to have a default value of True, 
meaning that __init__ should look like this:
def __init__(self, server, conn, addr, data_size_limit=DATA_SIZE_DEFAULT, 
map=None, decode_data=True)
Assigning True inside __init__ will make this value always True, and that's not 
the point.
- add a deprecation warning about this parameter using the warnings module:
warnings.warn('decode_data=True is deprecated, data will not be decoded by 
default', DeprecationWarning, 2)
- as for the found_terminator method, what David means is to decode the data in 
the first if, where commands are checked, to simplify processing of this part 
(David, please correct me if I'm wrong), and not what you did
- and finally you need to update the docs to include the decode_data parameter, 
with information about how it works and its deprecation
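
As a rough illustration of the "start with tests" point, the warning behaviour requested in this list could be pinned down with a test like the one below. `make_channel` is a hypothetical stand-in for the real constructor, and it follows the plan as worded in this comment (default True, warn whenever decoding is on), which may differ from what was finally committed.

```python
import unittest
import warnings


def make_channel(decode_data=True):
    """Hypothetical stand-in for the constructor described above."""
    if decode_data:
        warnings.warn('decode_data=True is deprecated, data will not be '
                      'decoded by default', DeprecationWarning, 2)
    return {"decode_data": decode_data}


class DecodeDataWarningTest(unittest.TestCase):
    def test_default_warns(self):
        # Relying on the default must emit a DeprecationWarning.
        with self.assertWarns(DeprecationWarning):
            channel = make_channel()
        self.assertTrue(channel["decode_data"])

    def test_bytes_mode_is_silent(self):
        with warnings.catch_warnings():
            # Turn any warning into an exception so the test would fail.
            warnings.simplefilter("error")
            channel = make_channel(decode_data=False)
        self.assertFalse(channel["decode_data"])
```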

--
nosy: +maciej.szulik




[issue19662] smtpd.py should not decode utf-8

2014-04-02 Thread Sreepriya Chalakkal

Sreepriya Chalakkal added the comment:

Hi David,
The variable decode_data is included to control decoding. But I am not sure 
what needs to be done when calling process_message inside found_terminator 
when the data is binary. How should binary data be handled there? Can you tell 
me which data types are involved when the data is binary?

--
Added file: http://bugs.python.org/file34700/switch_while_decode1.patch




[issue19662] smtpd.py should not decode utf-8

2014-04-02 Thread Sreepriya Chalakkal

Changes by Sreepriya Chalakkal sreepriya1...@gmail.com:


Added file: http://bugs.python.org/file34704/switch_while_decode2.patch




[issue19662] smtpd.py should not decode utf-8

2014-03-18 Thread R. David Murray

R. David Murray added the comment:

I propose that we add a new keyword argument to SMTP's __init__, 'decode_data'. 
 This would be set to True by default, and would preserve the current behavior 
of passing utf-8 decoded data to process_message.

Setting it to False would mean that process_message would get passed binary 
(undecoded) data.

In 3.5 we add this keyword, but we immediately deprecate 'decode_data=True'.  
In 3.6 we change the default to decode_data=False, and we deprecate the 
decode_data keyword.  Then in 3.7 we drop the decode_data keyword.

Now, as for implementation: what 'push' currently does (encode to ascii) is 
just fine for now.  What we need to change is collect_incoming_data (where the 
decode happens) and found_terminator (where the data is passed to other parts 
of the class or its subclasses).

When decode_data is False, collect_incoming_data should not decode.  
received_lines should be binary.  Then, in found_terminator the else branch of 
the if can pass the binary received_lines into process_message (care will be 
needed to use the correct data types for the various operations).  In the first 
branch of the if, though, when decode_data is False the data will now need to 
be decoded (still, I think, using utf-8) so that text can still be used to 
manipulate this part of the API, since unlike the message data it *is* 
conceptually text, just encoded as ASCII.  (I suggest still decoding using 
utf-8 rather than ASCII because this will be useful when we implement RFC6531.) 
 This will provide for the smallest number of needed changes to subclasses when 
converting to decode_data=False mode.
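
A toy model of the flow proposed above might look like this. It is a sketch only: `LineCollector` is not the real smtpd channel, the state handling is reduced to two states, and `commands`/`messages` are invented attributes that stand in for command dispatch and the process_message call.

```python
class LineCollector:
    """Toy model of the proposed data flow; not the real smtpd class."""

    COMMAND, DATA = "COMMAND", "DATA"

    def __init__(self, decode_data=True):
        self.decode_data = decode_data
        self.state = self.COMMAND
        self.received_lines = []
        self.commands = []  # always text
        self.messages = []  # str or bytes, depending on decode_data

    def collect_incoming_data(self, data):
        # Legacy mode decodes here; bytes mode buffers the raw data.
        if self.decode_data:
            self.received_lines.append(str(data, "utf-8"))
        else:
            self.received_lines.append(data)

    def found_terminator(self):
        empty = "" if self.decode_data else b""
        line = empty.join(self.received_lines)
        self.received_lines = []
        if self.state == self.COMMAND:
            # Commands are conceptually text, so decode them even in
            # bytes mode (utf-8, as suggested in the comment above).
            if not self.decode_data:
                line = str(line, "utf-8")
            self.commands.append(line)
            if line.upper() == "DATA":
                self.state = self.DATA
        else:
            # The message payload is passed on untouched.
            self.messages.append(line)
            self.state = self.COMMAND
```

With decode_data=False, a payload such as b"caf\xe9" reaches `messages` unchanged, while the command line is still handled as text.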

--




[issue19662] smtpd.py should not decode utf-8

2014-03-17 Thread Sreepriya Chalakkal

Sreepriya Chalakkal added the comment:

Hi David, 

I would like to work on this bug. Can you give some more insight into the 
main issue? As far as I understand, the SMTP server currently decodes the 
incoming bytes as UTF-8. Why do you say that this is not the right way? Can you 
give some idea of the right convention? Also, you mention a solution with a 
switch statement whose default case is utf8. What are the other cases? 
You also mention that smtpd should be emitting binary and that unicode should 
be handled by the email package. 
But is it possible to make that change now, given that other functions 
depending on this might be affected?

--
nosy: +sreepriya




[issue19662] smtpd.py should not decode utf-8

2014-02-06 Thread Duke Dougal

Duke Dougal added the comment:

Is there a workaround for this? I'd like to just receive binary data from 
smtpd. I'm new to this system: is this scheduled for fixing in Python 3.4?

--
nosy: +Duke.Dougal




[issue19662] smtpd.py should not decode utf-8

2014-02-06 Thread R. David Murray

R. David Murray added the comment:

Unfortunately I did not get to this before the 3.4 beta release, so no, it 
won't be fixed in 3.4.

You can work around it by overriding collect_incoming_data in your subclass and 
doing data.decode('ascii', 'surrogateescape') instead of str(data, 'utf-8'), 
and then doing mydata.encode('ascii', 'surrogateescape') at the point where you 
want to turn the data back into binary.
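
The round trip behind this workaround can be checked in isolation. The snippet below is a sketch of the two conversions only; the sample bytes are invented, and how they would be wired into a collect_incoming_data override is left out.

```python
# Sample input containing bytes that are not valid UTF-8 (made up).
raw = b"Subject: test\r\n\r\ncaf\xe9 \xff\xfe"

# The current behaviour, str(data, 'utf-8'), fails on such input.
try:
    str(raw, "utf-8")
except UnicodeDecodeError:
    utf8_failed = True
else:
    utf8_failed = False

# surrogateescape maps each non-ASCII byte to a lone surrogate code
# point, so decoding never fails ...
text = raw.decode("ascii", "surrogateescape")

# ... and encoding with the same handler restores the original bytes.
restored = text.encode("ascii", "surrogateescape")
```

The intermediate `text` can be joined and split like any other str, which is why it can pass through code that expects text.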

--




[issue19662] smtpd.py should not decode utf-8

2013-11-26 Thread Illirgway

Illirgway added the comment:

Here is another patch for fixing this issue:

https://github.com/Illirgway/cpython/commit/12d7c59e0564c408a65dd782339f585ab6b14b34

Sorry for my bad English

--
nosy: +Illirgway
versions: +Python 3.3 -Python 3.5
Added file: http://bugs.python.org/file32861/python3.3-lib-smtpd-patch.diff




[issue19662] smtpd.py should not decode utf-8

2013-11-26 Thread Antoine Pitrou

Changes by Antoine Pitrou pit...@free.fr:


--
stage:  -> patch review
versions: +Python 3.4 -Python 3.3




[issue19662] smtpd.py should not decode utf-8

2013-11-26 Thread R. David Murray

R. David Murray added the comment:

As I said, the decoding needs to be controlled by a switch (presumably a 
keyword argument to SMTPServer) that defaults to the present (incorrect) 
behavior.

--




[issue19662] smtpd.py should not decode utf-8

2013-11-26 Thread R. David Murray

Changes by R. David Murray rdmur...@bitdance.com:


--
versions: +Python 3.5 -Python 3.4




[issue19662] smtpd.py should not decode utf-8

2013-11-20 Thread Leslie P. Polzer

New submission from Leslie P. Polzer:

http://hg.python.org/cpython/file/3.3/Lib/smtpd.py#l289

as of now decodes incoming bytes as UTF-8.

An SMTP server must not attempt to interpret characters beyond ASCII, however. 
Originally mail servers were not 8-bit clean, meaning they would only guarantee 
the lower 7 bits of each octet to be preserved.
However even then they were not expected to choke on any input because of 
attempts to decode it into a specific extended charset. Whenever a mail server 
does not need to interpret data (like base64-encoded auth information) it is 
simply left alone and passed through.

I am not aware of the reasons that caused the current state, but to correct 
this behavior and make it possible to support the 8BITMIME feature I suggest 
decoding received bytes as latin1, leaving it to the user to reinterpret it as 
UTF-8 or whatever charset they need. Any other simple extended encoding could 
be used for this, but latin1 is the default in asynchat.
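
To illustrate why latin1 is a safe choice here: it maps every byte value to exactly one code point, so decoding cannot fail and the user can always get the original bytes back. A small sketch with invented sample data:

```python
# Every possible byte value decodes under latin-1 and round-trips.
every_byte = bytes(range(256))
as_text = every_byte.decode("latin-1")
round_tripped = as_text.encode("latin-1")

# A user who knows the payload was really UTF-8 can reinterpret it.
utf8_payload = "na\u00efve".encode("utf-8")
via_latin1 = utf8_payload.decode("latin-1")  # looks garbled, but lossless
recovered = via_latin1.encode("latin-1").decode("utf-8")
```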

The documentation should also mention charset handling. I'll be happy to submit 
a patch for both code and docs.

--
components: Library (Lib)
messages: 203467
nosy: skypher
priority: normal
severity: normal
status: open
title: smtpd.py should not decode utf-8
type: enhancement
versions: Python 2.6, Python 2.7, Python 3.1, Python 3.2, Python 3.3




[issue19662] smtpd.py should not decode utf-8

2013-11-20 Thread STINNER Victor

Changes by STINNER Victor victor.stin...@gmail.com:


--
nosy: +haypo




[issue19662] smtpd.py should not decode utf-8

2013-11-20 Thread Leslie P. Polzer

Leslie P. Polzer added the comment:

Patch attached. This also adds some more charset clarification to the docs and 
corrects a minor spelling issue.

It is also conceivable that we add a charset attribute to the class. This 
should have the safe default of latin1, and some notes in the docs that setting 
this to utf-8 (and probably other utf-* encodings) is not really 
standards-compliant.

--
keywords: +patch
Added file: http://bugs.python.org/file32719/smtpd_charset_latin1.diff




[issue19662] smtpd.py should not decode utf-8

2013-11-20 Thread R. David Murray

R. David Murray added the comment:

This bug was apparently introduced as part of the work from issue 4184 in 
python 3.2.  My guess, looking at the code, is that the module simply didn't 
work before that patch, since it would have been attempting to join binary data 
using a string join (''.join(...)).  Richard says in the issue that he wrote 
tests, so he probably figured out it wasn't working and fixed it.  It looks 
like there was no final review of his patch (at least not via the tracker...the 
patch uploaded to the tracker did not include the decode).  Not that a final 
review would necessarily have caught the bug...
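
The failure mode guessed at above is easy to reproduce in Python 3: a str separator cannot join bytes chunks. The chunk contents below are made up for the example.

```python
chunks = [b"MAIL FROM:", b"<a@example.com>"]

# Joining bytes with a str separator raises TypeError in Python 3.
try:
    "".join(chunks)
except TypeError as exc:
    error = exc
else:
    error = None

# A bytes separator works and keeps the data binary.
joined = b"".join(chunks)
```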

The problem here is backward compatibility.

In terms of the API, it really ought to be producing binary data, and not 
decoding at all.  But, at the time he wrote the patch the email package 
couldn't handle binary data (Richard's patch landed in July 2010, binary 
support in the email package landed in October), so presumably nobody was 
thinking about binary emails.

I'm really not sure what to do here, I'll have to give it some thought.

--
components: +email
nosy: +barry, r.david.murray, richard
versions: +Python 3.4 -Python 2.6, Python 2.7, Python 3.1, Python 3.2




[issue19662] smtpd.py should not decode utf-8

2013-11-20 Thread Leslie P. Polzer

Leslie P. Polzer added the comment:

Since this is my first contribution I'm not entirely sure about the fine 
details of backwards compatibility in Python, so please forgive me if I'm 
totally missing the mark here.

There are facilities in smtpd's parent class asynchat that perform the 
necessary conversions automatically if the user sets an encoding, so smtpd 
should be adjusted to rely on that and thus give the user the opportunity to 
choose for themselves.

Then it boils down to breaking backwards compatibility by setting a default 
encoding, which could be none as you suggest or latin1 as I suggest; either 
will probably be painful for current users.

My take here is that whoever is using this code for their SMTP server and 
hasn't given the encoding issues any thought will need to take a look at their 
code in that respect anyway, so IMHO a break with compatibility might be a bit 
painful but necessary.

If you agree then I will gladly rework the patch to have smtpd work with an 
underlying byte stream by default, rejecting anything non-ASCII where necessary.

Later patches could bring 8BITMIME support to smtpd, with charset conversion as 
specified by the MIME metadata.

--




[issue19662] smtpd.py should not decode utf-8

2013-11-20 Thread R. David Murray

R. David Murray added the comment:

I think the only backward compatible solution is to add a switch of *some* sort 
(exact API TBD), whose default is to continue to decode using utf-8, and 
document it as wrong.

Conversion of an email to unicode should be handled by the email package, not 
by smtpd, which is why I say smtpd should be emitting binary.
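
As a sketch of that division of labour: if the server hands over raw bytes, the email package can parse them and apply the charset declared in the message itself. The message below is invented for the example.

```python
import email

# A raw 8-bit message, as a server emitting binary might deliver it.
raw = (
    b"From: sender@example.com\n"
    b"Subject: 8-bit test\n"
    b"MIME-Version: 1.0\n"
    b'Content-Type: text/plain; charset="latin-1"\n'
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
    b"caf\xe9\n"
)

msg = email.message_from_bytes(raw)

# decode=True returns the body as bytes; the declared charset then
# says how to turn those bytes into text.
payload = msg.get_payload(decode=True)
charset = msg.get_content_charset("ascii")
text = payload.decode(charset)
```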

As I say, I need to find time to look at the current API in more detail before 
I'll be comfortable discussing the new API.  I've put it on my list, but likely 
I won't get to it until the weekend.

--
versions: +Python 3.5 -Python 3.3, Python 3.4




[issue19662] smtpd.py should not decode utf-8

2013-11-20 Thread R. David Murray

R. David Murray added the comment:

Oh, and to clarify: the backward compatibility is that if code works with 
X.Y.Z, it should work with X.Y.Z+1.  So even though correctly handling binary 
mail would indeed require someone to reexamine their code, if things happen to 
be working OK for them (eg: their program only needs to handle utf-8 email), we 
don't want to break their working program.

--
