If you've read my blog (e.g. on Planet Python), you will be aware that
I dedicated August to full-time email package development.  At the
beginning of the month I worked out a design proposal for the remaining
API additions to the email package, dealing with handling message bodies
in a more natural way.  I posted this to the email-sig, and got...well,
no objections.  Barry Warsaw did review it, and told me he had no issues
with the overall design, but also had no time for a detailed review.

Since one way to see if a design holds together is to document and code
it, I decided to go ahead and do so.  This resulted in a number of small
tweaks, but no major changes.

I have at this point completed the coding.  You can view the whole
patch at:

    http://bugs.python.org/issue18891

which also links to three layered patches that I posted as I went along,
if you prefer somewhat smaller patches.

I think it would be great if I could check this in for alpha2.  Since it
is going in as an addition to the existing provisional code, the level
of review required is not as high as for non-provisional code, I think.
But I would certainly appreciate review from anyone so moved, since I
haven't gotten any yet.

Of course, if there is serious bikeshedding about the API, I won't
make alpha2, but that's fine.

The longer term goal, by the way, is to move all of this out of
provisional status for 3.5.

This code finishes the planned API additions for the email package
to bring it fully into the world of Python 3 and Unicode.  It does not
"fix" the deep internals, which could be a future development direction
(but probably only after the "old" API has been retired, which will take
a while).  But it does make it so that you can use the email package
without having to be a MIME expert.  (You can't get away with *no* MIME
knowledge, but you no longer have to fuss with the details of the syntax.)

To give you the flavor of how the entire new provisional API plays
together, here's how you can build a complete message in your application:

    from email.message import MIMEMessage
    from email.headerregistry import Address
    fullmsg = MIMEMessage()
    fullmsg['To'] = Address('Foö Bar', 'f...@example.com')
    fullmsg['From'] = "mè <m...@example.com>"
    fullmsg['Subject'] = "j'ai un problème de python."
    fullmsg.set_content("et la il est monté sur moi et il commence"
                       " a m'étouffer.")
    htmlmsg = MIMEMessage()
    htmlmsg.set_content("<p>et la il est monté sur moi et il commence"
                        " a m'étouffer.</p><img src='image1' />",
                        subtype='html')
    with open('python.jpg', 'rb') as python:
        htmlmsg.add_related(python.read(), 'image', 'jpg', cid='image1',
                            disposition='inline')
    fullmsg.make_alternative()
    fullmsg.attach(htmlmsg)
    with open('police-report.txt') as report:
        fullmsg.add_attachment(report.read(), filename='pölice-report.txt',
                               params=dict(wrap='flow'), headers=(
                                    'X-Secret-Level: top',
                                    'X-Authorization: Monty'))

Which results in:

    >>> for line in bytes(fullmsg).splitlines():
    ...     print(line)
    b'To: =?utf-8?q?Fo=C3=B6?= Bar <f...@example.com>'
    b'From: =?utf-8?q?m=C3=A8?= <m...@example.com>'
    b"Subject: j'ai un =?utf-8?q?probl=C3=A8me?= de python."
    b'MIME-Version: 1.0'
    b'Content-Type: multipart/mixed; boundary="===============1710006838=="'
    b''
    b'--===============1710006838=='
    b'Content-Type: multipart/alternative; boundary="===============1811969196=="'
    b''
    b'--===============1811969196=='
    b'Content-Type: text/plain; charset="utf-8"'
    b'Content-Transfer-Encoding: 8bit'
    b''
    b"et la il est mont\xc3\xa9 sur moi et il commence a m'\xc3\xa9touffer."
    b''
    b'--===============1811969196=='
    b'MIME-Version: 1.0'
    b'Content-Type: multipart/related; boundary="===============1469657937=="'
    b''
    b'--===============1469657937=='
    b'Content-Type: text/html; charset="utf-8"'
    b'Content-Transfer-Encoding: quoted-printable'
    b''
    b"<p>et la il est mont=C3=A9 sur moi et il commence a m'=C3=A9touffer.</p><img ="
    b"src=3D'image1' />"
    b''
    b'--===============1469657937=='
    b'MIME-Version: 1.0'
    b'Content-Type: image/jpg'
    b'Content-Transfer-Encoding: base64'
    b'Content-Disposition: inline'
    b'Content-ID: image1'
    b''
    b'ZmFrZSBpbWFnZSBkYXRhCg=='
    b''
    b'--===============1469657937==--'
    b'--===============1811969196==--'
    b'--===============1710006838=='
    b'MIME-Version: 1.0'
    b'X-Secret-Level: top'
    b'X-Authorization: Monty'
    b'Content-Transfer-Encoding: 7bit'
    b"Content-Disposition: attachment; filename*=utf-8''p%C3%B6lice-report.txt"
    b'Content-Type: text/plain; charset="utf-8"; wrap="flow"'
    b''
    b'il est sorti de son vivarium.'
    b''
    b'--===============1710006838==--'

If you've used the email package enough to be annoyed by it, you may
notice that there are some nice things going on there, such as using
CTE 8bit for the text part by default, and quoted-printable instead of
base64 for utf8 when the lines are long enough to need wrapping.

(Hmm.  Looking at that, I see I didn't fully fix a bug I had meant to fix:
some of the parts have a MIME-Version header that they don't need.)
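
To make the CTE choice described above concrete, here is a minimal
sketch.  It assumes the message class under the name it has in released
Python versions, email.message.EmailMessage, rather than the MIMEMessage
name used in the patch; chosen_cte is a hypothetical helper for this
example only, not part of the API.  I'd expect a short non-ASCII body
to come out as 8bit and an over-long line to come out as quoted-printable:

```python
from email.message import EmailMessage


def chosen_cte(text):
    """Return the Content-Transfer-Encoding picked for a text body.

    Hypothetical helper for illustration; not part of the email package.
    """
    msg = EmailMessage()
    # No cte argument is given, so the library picks one itself.
    msg.set_content(text)
    return str(msg['Content-Transfer-Encoding'])


# One short line containing non-ASCII characters.
short_body = "il commence a m'étouffer."
# One line that is much longer than the default maximum line length.
long_body = "et la il est monté sur moi et il commence a m'étouffer. " * 3

short_cte = chosen_cte(short_body)
long_cte = chosen_cte(long_body)
print(short_cte, long_cte)
```

You can still force a particular encoding by passing cte= to
set_content if the default choice doesn't suit your application.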

All input strings are unicode, and the library takes care of doing
whatever encoding is required.  When you pull data out of a parsed
message, you get unicode, without having to worry about how to decode
it yourself.
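
As a rough illustration of that round trip, the sketch below builds a
small message from unicode strings, serializes it to bytes, and parses
it back.  It again assumes the released class name
email.message.EmailMessage (not the MIMEMessage of the patch) and passes
policy.default explicitly when parsing so that the new API is used:

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Build the message from plain unicode strings.
msg = EmailMessage()
msg['Subject'] = "j'ai un problème de python."
msg.set_content("il commence a m'étouffer.")

# Serialize: the library decides how headers and body are encoded.
wire = bytes(msg)

# Parse the bytes back with the same (new API) policy.
parsed = message_from_bytes(wire, policy=policy.default)

# Both come back as unicode, with no manual decoding.
subject = str(parsed['subject'])
body = parsed.get_content()
print(subject)
print(body)
```

Note that the body comes back with a trailing newline, since the
serialized text part always ends with a line separator.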

On the parsing side, after the above message has been parsed into a
message object, we can do:

    >>> print(fullmsg['to'], fullmsg['from'])
    Foö Bar <"f...@example.com"> mè <m...@example.com>
    >>> print(fullmsg['subject'])
    j'ai un problème de python.
    >>> print(fullmsg['to'].addresses[0].display_name)
    Foö Bar

    >>> print(fullmsg.get_body(('plain',)).get_content())
    et la il est monté sur moi et il commence a m'étouffer.

    >>> for part in fullmsg.get_body().iter_parts():
    ...     print(part.get_content())
    <p>et la il est monté sur moi et il commence a m'étouffer.</p><img src='image1' />

    b'fake image data\n'

    >>> for attachment in fullmsg.iter_attachments():
    ...     print(attachment.get_content())
    ...     print(attachment['Content-Type'].params)
    il est sorti de son vivarium.

    {'charset': 'utf-8', 'wrap': 'flow'}

Of course, in a real program you'd actually be checking the mime types
via get_content_type() and friends before getting the content and doing
anything with it.
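
A minimal sketch of what that checking might look like, under the same
assumption as before that the class is spelled email.message.EmailMessage
as in released Python versions.  describe_parts is a hypothetical helper
written for this example, not something the patch provides:

```python
from email.message import EmailMessage


def describe_parts(msg):
    """Return a (content_type, summary) pair for each non-multipart part.

    Hypothetical helper for illustration; not part of the email package.
    """
    summaries = []
    for part in msg.walk():
        if part.is_multipart():
            # Containers have no content of their own to fetch.
            continue
        ctype = part.get_content_type()
        if part.get_content_maintype() == 'text':
            # Text parts come back as str.
            summaries.append((ctype, part.get_content()))
        else:
            # Anything else comes back as bytes; just report the size.
            summaries.append((ctype, '%d bytes' % len(part.get_content())))
    return summaries


msg = EmailMessage()
msg['Subject'] = "j'ai un problème de python."
msg.set_content("il est sorti de son vivarium.")
msg.add_attachment(b'fake image data\n', maintype='image',
                   subtype='jpeg', filename='python.jpg')

result = describe_parts(msg)
for ctype, summary in result:
    print(ctype, repr(summary))
```

Branching on the main type first keeps the code from assuming that
get_content() always returns a string.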

Please read the new contentmanager module docs in the patch for
full details of the content management part of the above API (and the
headerregistry docs if you want to review the (new in 3.3) header parsing
part of the above API).

Feedback welcome, here or on the issue.

--David

PS: python jokes courtesy of someone doing a drive-by on #python-dev the other
day.