Re: [Email-SIG] BytesFeedParser.close().get_boundary() returns string; I want bytes

R. David Murray Fri, 15 Mar 2013 15:35:11 -0700

On Fri, 15 Mar 2013 13:16:32 -0700, Forest <web11.for...@tibit.com> wrote:
> ascii codec, but I thought I'd raise the issue here in case the current
> behavior is a bug.  Considering the restrictions that rfc 2046 places on
> boundary characters and its requirement to respect ancestor boundary markers
> when parsing nested messages, I'm struggling to think of a situation where
> the current behavior is useful.  Shouldn't get_boundary() return something
> that can be found within the input data?


Well, you have to understand that the email package was written
when Python didn't make any distinction between bytes and strings.
What email in Python 3 is doing is transforming the input into
a string (unicode) right away, and carrying any non-ASCII
bytes along until it has parsed enough information from the message
to recover them and convert them into real unicode.
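
To illustrate the "carrying along" idea, here is a minimal sketch using
the "surrogateescape" error handler, which as far as I know is the
mechanism the package relies on for this. The sample header and the
choice of latin-1 as the eventual charset are assumptions made up for
the example, not anything taken from the package itself.

```python
# Hypothetical input: a header line containing one non-ASCII byte
# (0xE9, which is e-acute in latin-1).
raw = b"Subject: caf\xe9\r\n"

# Decode as ASCII; each non-ASCII byte is carried along as a lone
# surrogate code point instead of raising UnicodeDecodeError.
text = raw.decode("ascii", "surrogateescape")

# Byte 0xE9 is represented as U+DCE9 in the resulting str.
carried = "\udce9" in text

# Encoding with the same handler gives back the original bytes exactly.
round_tripped = text.encode("ascii", "surrogateescape")

# Once the real charset is known (assumed latin-1 here), the carried
# bytes can be converted into real unicode.
recovered = round_tripped.decode("latin-1")
```

The point is only that nothing is lost by going to a string early: the
original bytes remain recoverable until the charset is known.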

BytesParser is parsing bytes input and *turning it into
unicode*.  The model is the same regardless of whether
the input is bytes or already a string.  get_boundary()
is a method on the model (the Message) and is thus
retrieving a string from the model and returning it.
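
A short sketch of what that looks like in practice, and of the
encode-with-ascii workaround mentioned in the original question.  The
sample message and the "XYZ" boundary are made up for illustration;
encoding with ASCII assumes the boundary follows the RFC 2046
character restrictions.

```python
from email.parser import BytesFeedParser

# Hypothetical multipart message supplied as bytes.
raw = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/mixed; boundary="XYZ"\r\n'
    b"\r\n"
    b"--XYZ\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"hello\r\n"
    b"--XYZ--\r\n"
)

parser = BytesFeedParser()
parser.feed(raw)
msg = parser.close()

# The model holds strings, so the boundary comes back as str
# even though the parser was fed bytes.
boundary = msg.get_boundary()

# To search the original input, encode the boundary back to bytes.
# ASCII is assumed to be sufficient for RFC 2046 boundary characters.
boundary_bytes = boundary.encode("ascii")
delimiter = b"--" + boundary_bytes
found_in_input = delimiter in raw
```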

That said, we have discussed adding methods for accessing the
binary form in various contexts.  We have also discussed
providing a streaming version of message parsing and generation,
and at a minimum a way to store message bodies externally
(e.g. in a file).  I've got these as development goals and
would welcome help with them.

--David
