Mail from ILUG-BOM list (Non-Digest Mode)
This is gonna be like a small info bulletin on how attachments are sent.
When you send mail using the SMTP protocol, you basically have only the
following commands:
  HELO              - Greet the mail server. Used once per session, at the
                      beginning of the session.
  MAIL FROM: <from> - Announce who the sender is. Used once per mail,
                      before specifying any recipients for that mail, or
                      after a RSET.
  RCPT TO: <rcpt>   - Announce who the mail is to. Multiple recipients are
                      allowed; each must have its own RCPT TO:, entered
                      after the MAIL FROM:.
  DATA              - Starts mail entry mode. Everything entered on the
                      lines following DATA is treated as the body of the
                      message and is sent to the recipients. The DATA
                      terminates with a . (period) on a line by itself.
                      A mail may be queued or sent immediately when the .
                      is entered. It cannot, however, be reset at this
                      stage.
  RSET              - Reset the state of the current transaction.
                      The MAIL FROM: and RCPT TO: for the current
                      transaction are cleared.
  QUIT              - End the session. No commits happen here.
I'll deal with return codes later (maybe a different mail).
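To make the ordering of the commands concrete, here is a small Python sketch that only builds the list of lines a client would type for one mail. It does not talk to any server. The function name smtp_commands and the example.org addresses are made up for illustration, and the body lines are assumed not to start with a . (period).

```python
def smtp_commands(helo_host, sender, recipients, body_lines):
    """Return, in order, the lines a client would send for one mail.

    Hypothetical helper for illustration only: no network I/O, no
    reply-code handling, and no escaping of body lines.
    """
    commands = [f"HELO {helo_host}", f"MAIL FROM:<{sender}>"]
    # One RCPT TO: per recipient, all after the MAIL FROM:
    commands += [f"RCPT TO:<{rcpt}>" for rcpt in recipients]
    commands.append("DATA")
    commands += list(body_lines)
    commands.append(".")      # a lone period ends the DATA
    commands.append("QUIT")
    return commands


session = smtp_commands(
    "client.example.org",
    "alice@example.org",
    ["bob@example.org", "carol@example.org"],
    ["Hello there."],
)
for line in session:
    print(line)
```

In a real session the client waits for the server's reply after each command before sending the next one.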
All other commands simply extend the usage of these. If you look at the
commands above, you will notice a few things.
1. Almost all commands need an argument. The exceptions are RSET and
QUIT.
2. DATA takes a multiline argument on the line after DATA has been
acknowledged.
3. If multiline arguments (like DATA) terminate with a . on a line by
itself, then how do you enter a . on a line by itself?
4. What about the other header fields? Subject:, Reply-To:, custom
headers etc?
5. There is no provision for attaching files.
Let's now look at these points.
1. HELO, MAIL FROM: and RCPT TO: tell the SMTP server all it needs to know
to get the mail through, and bounce it in case of error. Strictly
speaking, RFC 821 expects HELO as the first command of a session, but
many servers will accept mail without it. It ensures that the client
and server are speaking the same language. (Ref: EHLO)
2. DATA tells the SMTP server what the contents of the mail are. If you
read through /var/spool/mail/$USER or the equivalent on your system, you
will find it full of what was typed as DATA, with maybe a few lines
added by the SMTP server and mail client.
3. Escaping a . is the same as escaping a \ in C. Precede it with another
. In fact, the SMTP protocol states that any line that starts with a .
should be preceded by another .
So if I wanted to say this:
    . is a period
I would have to enter this:
    .. is a period
The receiving SMTP server is supposed to translate .. at the start of a
line back into . (most MUAs do the escaping for you when sending).
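The escaping rule is small enough to sketch in a few lines of Python. The names dot_stuff and dot_unstuff are made up for this example. Both work on a list of lines without line endings, and dot_unstuff assumes the terminating lone . has already been taken off.

```python
def dot_stuff(lines):
    """Sender side: prefix an extra '.' to every line that starts with '.'."""
    return ["." + line if line.startswith(".") else line for line in lines]


def dot_unstuff(lines):
    """Receiver side: drop the first '.' from every line that starts with '.'."""
    return [line[1:] if line.startswith(".") else line for line in lines]


body = ["Some text", ". is a period", ".", "more text"]
on_the_wire = dot_stuff(body)
print(on_the_wire)
print(dot_unstuff(on_the_wire) == body)
```

Note that a line containing just a . goes out as .. and so can never be mistaken for the end of the DATA.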
4. If you send a mail directly using the SMTP protocol (telnet to the smtp
server), and simply type the body of the message, then all you receive
is the body along with the Date: and From: fields. Try it. All other
fields have to be entered as part of the body. In fact, you could
override the default Date and From fields in the body of the mail.
These two fields serve a dual purpose, acting as what is known as the
UNIX_FROM_LINE in your mail file. At the start of every mail, you will
see a line like this:
From philip Mon May 22 17:35:13 2000
To enter other fields, just enter them as the first part of your DATA,
separated from the actual body by a completely blank line. Not even a
space is allowed on this line. eg:
    DATA
    354 Enter Data end with .
    From: [EMAIL PROTECTED]
    To: [EMAIL PROTECTED]
    Subject: SMTP and Mail Attachments - [informational]

    This is gonna be like a small info bulletin on how...
    ...
    .
    250 Mail accepted for delivery
When the mail is received, the MUA separates the header from the rest
of the body.
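Here is a rough Python sketch of that separation, splitting at the first completely blank line. The function name split_message and the sample addresses are invented for illustration. It assumes plain \n line endings, no folded (continued) header lines and no repeated header names, so a real MUA does rather more than this.

```python
def split_message(raw):
    """Split raw DATA text into a dict of header fields and the body text."""
    # Everything before the first blank line is header, the rest is body.
    head, _, body = raw.partition("\n\n")
    headers = {}
    for line in head.split("\n"):
        # Only the first ':' separates the field name from its value.
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers, body


raw = (
    "From: alice@example.org\n"
    "To: bob@example.org\n"
    "Subject: SMTP and Mail Attachments - [informational]\n"
    "\n"
    "This is gonna be like a small info bulletin on how...\n"
)
headers, body = split_message(raw)
print(headers["Subject"])
print(body)
```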
5. So, now we come to the final question. Attachments.
Attachments may be binary or text files. We do not know, nor should we
care. The main question is how do we get 8 bit data across all
networks. Note that the Internet is a collection of heterogeneous
networks. Most speak TCP/IP, some speak X.25, some speak even more
obscure and outdated protocols. Some of these protocols have a 6 bit
character set - 64 characters that may pass through their networks. We
must be able to get all our data through in these 64 characters.
Fortunately though, these are the most used characters in emails -
A-Z, a-z, 0-9, +/
Unfortunately though, we also have 192 other characters that are used,
though not as often.
Now, most of the networks on the Internet can actually handle 7 bit
ASCII data, so we don't really worry about it too much. With plain
text that is. When you're sending attachments, you probably want it to
get through correctly. We therefore need to encode our 8 bit
attachment into 7 or 6 bits. We figure, since we're gonna encode it
anyway, let's go with 6 bits and cover the whole net.
We use what is called Base64 encoding and I am not going to go into it
here. There are other encoding formats - quoted-printable being very
common - that only encode bytes that are not printable 7 bit ASCII
(plus the = sign, which quoted-printable uses as its escape character).
Base64 is the most used for attachments though.
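Without going into the full details, the core idea can be sketched in Python: three 8 bit bytes (24 bits) are regrouped into four 6 bit values, and each value picks one of the 64 characters. The helper encode_group is made up for this example and only handles exactly three bytes. Padding with = for other lengths is left to the standard base64 module, which the sketch uses as a cross-check.

```python
import base64
import string

# The 64 characters mentioned above, in Base64 order.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_group(three_bytes):
    """Regroup exactly 3 bytes (24 bits) into four 6-bit values."""
    if len(three_bytes) != 3:
        raise ValueError("this sketch only handles groups of exactly 3 bytes")
    n = int.from_bytes(three_bytes, "big")
    # Take the 24 bits six at a time, most significant first.
    return "".join(ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0))


sample = bytes([0xFF, 0x00, 0x80])   # not printable as plain text
encoded = encode_group(sample)
print(encoded)
print(encoded == base64.b64encode(sample).decode("ascii"))
```

The price is size: every 3 bytes of attachment become 4 characters in the mail.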
Now, how do we get the attachment into the mail? Assuming that we have
already encoded it, two things remain.
i) Add appropriate headers to the mail so that the MUA knows that
there are attachments and where they can be found.
ii) Add headers to the attachments that will allow the MUA to properly
decode and save the attached file.
The main mail /must/ contain the following headers:
    MIME-Version: 1.0
    Content-Type: multipart/mixed;
        boundary="<boundary>"
A note about mail headers... any line that starts with white space is
appended to the preceding line. Thus, boundary is actually part of
the Content-Type. The <boundary> is a random string of base64 allowed
characters, chosen so that it does not appear anywhere else in the mail.
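The continuation rule can be sketched in Python like this. The function name unfold is invented for the example, and it joins a continuation line to the previous one with a single space, which is a simplification (strictly, unfolding only removes the line break and keeps the white space as it was).

```python
def unfold(header_lines):
    """Join lines that start with white space onto the preceding header line."""
    unfolded = []
    for line in header_lines:
        if line[:1] in (" ", "\t") and unfolded:
            # Continuation: append to the previous header field.
            unfolded[-1] += " " + line.strip()
        else:
            unfolded.append(line.rstrip())
    return unfolded


folded = [
    "MIME-Version: 1.0",
    "Content-Type: multipart/mixed;",
    '    boundary="abc123"',
]
for field in unfold(folded):
    print(field)
```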
You may also want to add a Content-Disposition: mentioning that the
attachments are attached inline with their own headers:
Content-Disposition: inline
Now, each attachment /must/ start with this same boundary, preceded by
two hyphens --
    --<boundary>
    Attachment header
    encoded attachment
After the final attachment will be the boundary, preceded and
succeeded by two hyphens:
    --<boundary>--
Simple so far right?
Now, the attachment header.
Immediately after the boundary, you enter the attachment header. The
only required field is the Content-Transfer-Encoding: which tells the
MUA what was used to encode the data.
There is also the Content-Type and the Content-Disposition
that tell the MUA what the original mime type of the attachment was.
This would also mention the original file name.
    Content-Transfer-Encoding: base64
    Content-Type: text/plain;
        name="<filename>"
    Content-Disposition: attachment;
        filename="<filename>"

    <Encoded attachment>
As before, leave a blank line, and enter the attachment.
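Putting the pieces together, here is a hedged Python sketch that assembles such a mail by hand and then lets the standard email module parse it back as a check. The function build_message, the boundary string and the file name hello.bin are all made up for this example. The sketch also makes two choices of its own: the message text goes in as a first text/plain part, and the attachment is labelled application/octet-stream because the sample data is binary. In practice you would let a library build all of this for you.

```python
import base64
import email


def build_message(text, filename, data, boundary="XyZ0123456789boundary"):
    """Assemble a multipart/mixed mail with one Base64 attachment, by hand."""
    # encodebytes wraps the output into short lines, as mail expects.
    encoded = base64.encodebytes(data).decode("ascii").rstrip("\n")
    lines = [
        "MIME-Version: 1.0",
        "Content-Type: multipart/mixed;",
        f'    boundary="{boundary}"',
        "",                                   # blank line ends the main header
        f"--{boundary}",
        "Content-Type: text/plain",
        "",
        text,
        f"--{boundary}",
        "Content-Transfer-Encoding: base64",
        "Content-Type: application/octet-stream;",
        f'    name="{filename}"',
        "Content-Disposition: attachment;",
        f'    filename="{filename}"',
        "",                                   # blank line ends the part header
        encoded,
        f"--{boundary}--",                    # final boundary, hyphens both sides
        "",
    ]
    return "\n".join(lines)


payload = bytes(range(256))                   # every possible 8 bit value
raw = build_message("See attached.", "hello.bin", payload)
parsed = email.message_from_string(raw)
parts = parsed.get_payload()
print(parts[1].get_filename())
print(parts[1].get_payload(decode=True) == payload)
```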
That pretty much looks like the end, except for a small addition. Your
attachment itself could be a multipart attachment, in which case it
would have a multipart/mixed mime type, and a second boundary and
attachments under it. So think of it as being pretty recursive.
Looking at it now, we could probably consider our entire mailbox file
as a single mail with each mail being an attachment and the
UNIX_FROM_LINE being the boundary.
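To illustrate the analogy, a mailbox file can be cut into mails at each line starting with "From " in much the same way a multipart mail is cut at its boundaries. This is only a sketch: split_mbox is a made-up name, the second mail and its date are invented, and it assumes body lines starting with "From " have already been escaped (usually as ">From ") by whatever wrote the file.

```python
def split_mbox(text):
    """Split mailbox text into mails, each starting at its 'From ' line."""
    mails = []
    for line in text.split("\n"):
        if line.startswith("From "):
            mails.append([line])          # a new mail begins here
        elif mails:
            mails[-1].append(line)        # belongs to the current mail
    return ["\n".join(mail) for mail in mails]


mailbox = (
    "From philip Mon May 22 17:35:13 2000\n"
    "Subject: one\n"
    "\n"
    "first\n"
    "From philip Mon May 22 17:40:00 2000\n"
    "Subject: two\n"
    "\n"
    "second\n"
)
mails = split_mbox(mailbox)
print(len(mails))
```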
Notes: MUA  - Mail User Agent (Netscape, PINE, mutt etc).
       SMTP - Simple Mail Transfer Protocol.
       EHLO - Extended HELO - the successor to HELO.
       MIME - Multipurpose Internet Mail Extensions.
References:
RFC 821 - SMTP
RFC 822 - Message headers
RFC 2045-2049 - MIME
Hope y'all found this interesting. I'll put it up on my page soon.
Philip
--
Unfair animal names:
-- tsetse fly -- bullhead
-- booby -- duck-billed platypus
-- sapsucker -- Clarence
-- Gary Larson