R. David Murray added the comment:

As Milan said, the problem doesn't arise in 3.5 with decode_data=False, since 
there's no decoding.  His patch doesn't actually fix the bug for the 
decode_data=True case, though, since the bug is a *valid* utf-8 sequence 
getting split across tcp buffers.
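
A minimal sketch of the failure, for illustration only (this is not smtpd
code; the sample string and the split point are assumptions chosen to cut a
two-byte character in half):

```python
# "é" encodes to the two bytes 0xC3 0xA9 in UTF-8.
data = "café\r\n".encode("utf-8")

# Simulate two TCP reads whose boundary falls between those two bytes.
chunks = [data[:4], data[4:]]


def decode_per_chunk(chunks):
    """Decode each buffer as it arrives, as a per-read decode would."""
    return "".join(chunk.decode("utf-8") for chunk in chunks)


try:
    decode_per_chunk(chunks)
    per_chunk_error = None
except UnicodeDecodeError as exc:
    # The first chunk ends with a lone lead byte, so decoding fails
    # even though the complete byte stream is valid UTF-8.
    per_chunk_error = exc

# Decoding the reassembled bytes succeeds.
whole = b"".join(chunks).decode("utf-8")
```

The data is valid; only the point at which decoding happens makes it fail.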

To fix it, we would need to change the implementation of decode_data.  Instead
of conditionally decoding in collect_incoming_data, we'd need to postpone
decoding to found_terminator.  This would have the undesirable effect of
changing what is in the received_lines attribute, which is why we didn't do it
in the decode_data patch.  Using an incremental decoder won't solve that
problem, since it too would change what gets stored in received_lines.
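
A rough sketch of the two alternatives, outside of smtpd (the function names
are hypothetical and the split input is an assumption for illustration):

```python
import codecs


def decode_at_terminator(chunks):
    """Keep raw bytes while collecting; decode once the line is complete."""
    return b"".join(chunks).decode("utf-8")


def decode_incrementally(chunks):
    """Decode per chunk, letting the decoder hold back partial sequences."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    pieces = [decoder.decode(chunk) for chunk in chunks]
    # Flush; raises UnicodeDecodeError if a sequence is left unfinished.
    pieces.append(decoder.decode(b"", final=True))
    return pieces


data = "café\r\n".encode("utf-8")
chunks = [data[:4], data[4:]]  # boundary falls inside the encoding of "é"

line = decode_at_terminator(chunks)
pieces = decode_incrementally(chunks)
```

Both produce the right text, but neither stores the same per-chunk values that
a plain per-read decode would: the first keeps bytes until the end, and the
second shifts the held-back byte into the following piece.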

Since decode_data=True is really not a legitimate mode for smtpd (it is an 
historical accident/bug) and we are planning on removing it eventually, I think 
we should go ahead and apply Milan's patch as is, since it does improve the 
error reporting.  The message would need to be adjusted though, since it can 
trigger on valid utf-8 data.  It should say that smtpd should be run with 
decode_data=False in order to fix the decode problem.

That would leave the bug as-is in 3.4, but a similar patch with an error
message suggesting an upgrade to 3.5/decode_data=False could be applied.  That
feels a little weird, though :).

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue19806>
_______________________________________