To the Mozilla developers, the Link mailing list and other people who
might be interested.

I will soon add this to: http://www.firstpr.com.au/sys-admin/HTML-mail/
where I will keep it updated and add links to relevant resources.


I was asked on this newsgroup (netscape.public.mozilla.mail-news) last
year to write about why plain text email should be the default for email
clients, rather than HTML.  It's such an obvious thing to me that I
can't imagine why anyone needs to be convinced.

Anyway, here are some arguments.  I guess most or all of what follows
goes for Usenet as well as email.   I get frustrated at times that so
much effort goes into making things that look impressive at first, but
in fact lack depth and reliability - and are then presented to everyone
as the default way of doing things.  The result is a ***lot*** of
frustration, wasted time and effort and miscommunication.  

I understand that quite a few people on this newsgroup share my beliefs,
but find it hard to convince some management people about the benefits
of simplicity.  I hope this piece helps!


Plain text emails, when written and read in fixed width fonts, and with
good, WYSIWYG text wrapping, are superior for most people, and should be
the default arrangement for all email clients as opposed to HTML, for
reasons including the following:

1 -  Plain text is simple and understandable - nothing could be simpler.
     A character is a character.  A newline is a newline.  What
     you type is what you see and what the recipient receives.  In a
     proper system there is no way for things to change or be
     obscured, provided sensible right margins are observed in the
     original message.


2 -  You always see on screen *exactly* what is going to be sent.
     (Provided both ends use a fixed width font.  Proportional
     spacing fonts make it impossible to reliably lay out text
     for communicative purpose, such as this dot-point indenting
     or ASCII diagrams.)


3 -  Not counting emoticon graphics and these damn quote vertical lines 
     (which Mozilla has now) in the recipient's client, the recipient 
     will always see, on screen and on paper, exactly what you sent.

     (OK, I have said the same thing three ways - but email is the most
     important aspect of the Net as far as I am concerned, and it is
     vital that it be direct and pure - not subject to unseen changes,
     glitches, complications etc.)


4 -  There is no dependency on fonts - provided that the Internet
     standards are adhered to (and so Microsoft practices must be
     eradicated).


5 -  The messages can be displayed with minimal fuss on mobile devices 
     and text consoles with no graphics capabilities and small 
     screen sizes.


6 -  The system *always* works, whereas HTML may not.  For instance
     I received an HTML email which was a completely black page - 
     black text on a black background.  I was able to sus it out, but 
     most people would not.


7 -  Plain text is easy to read, and never plasters a printed page with 
     expansive, soggy ink just because the writer thought that pink
     text on a very dark background graphic looked cool.


8 -  Fewer bytes are sent and stored than with HTML.


9 -  There are no attachments, multi-parts to messages etc. unless 
     there is a real need to attach a file.  So the user is never 
     left wondering whether they have missed something - the message
     is just text, all in one piece.


10 - For the great majority of users, plain text is at least as
     communicative as HTML, and those who know how to use HTML can
     always decide to send HTML emails actively and wisely.
 
     (I am not saying HTML email should be banned - just that it should
     not be the default and should not be encouraged for general use.)


11 - HTML can have formatting which will not wrap to screen or
     page margins, while if plain text is used with sensible line
     limits of 72 characters (or longer for URLs and quoted text),
     the client will always be able to display and print it without
     any wrapping problems.
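
     As a rough illustration of the 72 column limit in this point,
     here is a minimal Python sketch using the standard textwrap
     module.  The name wrap_plain_text and the sample text are made
     up for the example; it is one way a client could wrap outgoing
     text, not a description of how Mozilla or any other client
     actually does it.

```python
import textwrap


def wrap_plain_text(paragraph, width=72):
    """Wrap one paragraph to at most `width` columns.

    Tokens longer than `width` (typically URLs) are not split; each
    one is placed on a line of its own and allowed to run over.
    """
    return textwrap.fill(
        paragraph,
        width=width,
        break_long_words=False,  # never cut a long URL in two
        break_on_hyphens=False,  # do not break at hyphens inside URLs
    )


# Hypothetical sample: ordinary prose around one over-long URL.
LONG_URL = (
    "http://www.microsoft.com/technet/treeview/default.asp"
    "?url=/technet/security/bulletin/MS01-020.asp"
)
SAMPLE = (
    "Badly written email clients can be prone to complete compromise, "
    "as described at " + LONG_URL + " and the fix is to keep things simple."
)

if __name__ == "__main__":
    print(wrap_plain_text(SAMPLE))
```

     Every line of ordinary prose stays within 72 columns, and the
     only line allowed to run over is the URL, which is kept whole
     so that it still works when the recipient copies it.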


12 - HTML emails can specify fonts which do not exist on the
     recipient's client, or font sizes which are too small to read on
     screen.


13 - Complex HTML emails with attached files for graphics etc.
     can be hard to forward.


14 - There is a danger of complex email formats such as HTML making
     email with attachments or multi-parts the norm.  Viruses and other
     security threats can easily lurk in all sorts of
     special file formats, and especially in HTML.  So making people
     get used to receiving emails with complex file structures they
     don't understand makes it harder for them and everyone else to
     be wary about genuinely dangerous emails.


15 - Likewise - you can't get a virus or a web-bug from a plain text 
     email!


16 - HTML emails with graphics files may be very large compared to what 
     the unsophisticated sender thinks they are.  I found a web page 
     today with two B/W images which appeared small on screen, and 
     should have been 50 k bytes each at really high quality.  
  
        http://www.retrospank.com/cartoons.html

    In fact the images were raw scans and were huge in terms of
    pixels and bytes - 1.4 megabytes in total!  The web site creator
    was obviously oblivious (though I have now mailed him cleaned up,
    re-sized images which look a lot better).

    So if a web designer can put up a clueless page like this, where
    it costs them money each time someone looks at it, what is to 
    stop them sending the same small page to someone as an email?

    With a cable modem, they wouldn't even notice that the email was
    2 megabytes.  But pity the poor recipient on a 28k modem or a 
    9.6kbps (if you are lucky) GSM mobile link.  


17 - HTML emails are impossible to deal with in many mailing list
     situations.  They make a mess of digests and archives, since the 
     HTML cannot be displayed properly as part of a larger page, even if 
     it was an HTML page, since the HTML itself involves text colours
     which rely on a background colour which is impossible to set
     in the larger page.    Obviously raw HTML in a digest makes it
     really hard for anyone to read, as well as greatly increasing 
     the digest size.

18 - HTML emails may be hard or impossible to quote from for a 
     client configured to send plain text.  Netscape 4.78 (in non-HTML 
     mode) regularly produces a blank Compose email when it is supposed 
     to begin with quotes from an HTML email.

     How does one quote text in tables, for instance?


19 - There are all sorts of security problems with HTML email, which
     can best be resolved by having the recipient client refuse to
     render HTML at all.  For instance the inclusion of a tiny image
     file from a remote server tells the server exactly when and 
     approximately where it was opened.  Likewise Javascript.

     There can be Javascript embedded in the HTML, which poses all sorts 
     of security problems for clients, and web-mail systems in 
     particular.  Badly written email clients can be prone to complete 
     compromise:

      
http://www.microsoft.com/technet/treeview/default.asp?url=/technet/security/bulletin/MS01-020.asp
   
     just by the email being rendered.  Even if this bug is fixed, the
     Microsoft client is still vulnerable to any (HTML) email starting
     a file download from a remote site!  

     Because HTML emails are a threat to the security of the user's
     computer and their sanity, the best policy is to never render
     HTML emails.  This in turn means that users who send HTML email
     by default (that is, they have made no choice, and probably have
     no idea what the terms "HTML" or "Computer Security" mean) don't
     know that the recipient will get only some kind of text subset
     of what they wrote.

     A guide to turning off HTML email in various clients is at:

        http://helpdesk.rootsweb.com/help/html-off.html
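
     To make the web-bug and Javascript concerns in this point a
     little more concrete, here is a minimal Python sketch that
     scans an HTML body for the two things described: references to
     remote servers and embedded script.  The class and function
     names, and the tracker.example address, are invented for the
     illustration.  It is a crude check built on the standard
     html.parser module, not a complete security filter.

```python
from html.parser import HTMLParser


class RemoteReferenceFinder(HTMLParser):
    """Collect signs that an HTML body would contact a server if rendered."""

    def __init__(self):
        super().__init__()
        self.remote_urls = []    # absolute http(s) URLs found in src/background
        self.has_script = False  # True once any <script> tag is seen

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.has_script = True
        for name, value in attrs:
            if name in ("src", "background") and value:
                if value.lower().startswith(("http://", "https://")):
                    self.remote_urls.append(value)


def inspect_html(body):
    """Return (list of remote URLs, whether a <script> tag is present)."""
    finder = RemoteReferenceFinder()
    finder.feed(body)
    finder.close()
    return finder.remote_urls, finder.has_script


# Hypothetical message body containing a one-pixel "web bug" and a script.
SAMPLE_HTML = (
    '<html><body><p>Hello!</p>'
    '<img src="http://tracker.example/pixel.gif?id=1234" width="1" height="1">'
    '<script>document.forms[0].submit();</script>'
    '</body></html>'
)

if __name__ == "__main__":
    urls, script = inspect_html(SAMPLE_HTML)
    print("remote references:", urls)
    print("contains script:", script)
```

     Note that this only looks at src and background attributes and
     at script tags.  Event-handler attributes, style sheets, frames
     and so on are not examined, which is itself a fair picture of
     how many places there are to hide things in HTML.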


20 - Turning these security problems around to the software vendor,
     let's say a company asks: "How can you assure me that this email
     client is not going to be a security threat to our computers?"

     It's a pretty hard question to answer if the client does anything
     with HTML.   If it only displays plain text, or converts HTML
     to something viewable without any Javascript or external
     references, and it uses very clearly designed and restricted ways
     to handle anything non-text (such as attachments), then you might
     be able to give the potential customers some solid assurances
     about security.

     These really are the bad old days of email - with so many ways
     malicious emails can wreak havoc.  It's the propellerhead
     featuritis phase, and this rant is intent on closing the lid on
     this dangerous chapter in the history of the Net ASAP.
  

21 - A web archive containing HTML emails makes searching for 
     text a lot more complex.  If there is a plain text part, then 
     this may be OK, but what of searching through HTML and missing
     a word because there are bold face tags half-way through it?

     The same argument goes for the perfectly day-to-day business of
     searching a user's own mailboxes.
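
     The bold-face problem in this point is easy to demonstrate.
     The following is a minimal Python sketch, with an invented
     message body and invented names (TextExtractor, visible_text),
     showing a plain substring search missing a word in raw HTML and
     finding it only after the markup has been stripped with the
     standard html.parser module.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Gather only the character data of an HTML document, dropping tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def visible_text(html):
    """Return the text of `html` with all tags removed."""
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    return "".join(extractor.parts)


# Hypothetical body: the sender made the first two letters of a word bold.
HTML_BODY = "<p>The meeting is <b>re</b>scheduled to Friday.</p>"

if __name__ == "__main__":
    print("found in raw HTML:     ", "rescheduled" in HTML_BODY)
    print("found in stripped text:", "rescheduled" in visible_text(HTML_BODY))
```

     So an archive or mailbox search has to parse every HTML message
     before it can match anything, whereas with plain text the bytes
     being searched are the message.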


22 - There is no single reliable way of rendering HTML.  There are
     various flavours of HTML - there are various W3C "standards" and
     various proprietary extensions.  There is certainly no consensus
     on what is kosher HTML and what is not.  Even if there was, then
     in the time it takes for recipient systems to be upgraded, that
     notion of kosher HTML would change, so clients would be getting 
     messages from later clients which they would not be equipped to 
     handle properly.  The technical facets of the plain text email
     I am advocating have been static for decades (though I don't
     know when the ISO-8859 character set was introduced).  So the
     spec now for "plain-text" is simple and unlikely to change a lot
     in the future.

     There are endless possibilities for faulty HTML being generated or
     copied from elsewhere, and for the sender's system to show it one
     way which seems to work, while a buggy renderer, or a totally
     standards compliant one, at the recipient's end fails to cope
     with the fault or the HTML variant at all.  For instance MS
     Front-Page Express generates a web page with a line-break in the
     middle of an address for a graphic.  MSIE renders it (with a
     complaint about errors) and any standards compliant browser or
     cache fails.


23 - An HTML message might look good on the sender's computer but
     fail completely on the recipient's because it depends on graphics
     files, fonts, CSS files etc. which are only resident on the 
     sender's system. 


24 - An HTML message might appear the way a sender wants it to due
     to the inclusion of Javascript or Java, but the recipient's
     client might not render these due to technical incapacity, might
     have them turned off for security reasons, or might render them
     quite differently from what the sender experienced.


25 - Likewise, the default addition of V-cards and other attachments
     to ordinary emails should be fought against.  The average user
     has no way of knowing what all these attachments may be, and
     may be confused, unsure as to whether it is a virus, may not
     know whether to open it or not, and so may not know whether they
     have fully read the message.  As with HTML emails, the recipient's
     email client may not know how to deal with the attachment anyway.
     Similarly, they clutter mailing list archives and digests.


26 - Plain text emails are almost certainly easier to read and
     respond to (with quoting) for people with disabilities, including
     those who use speech synthesisers.


27 - There is a security problem with Javascript in email - which
     apparently affects Netscape 6, and so presumably originates
     with Mozilla.  It also affects later versions of Outlook and
     Outlook Express:

         http://www.privacyfoundation.org/privacywatch/report.asp?id=54

     The Javascript is carried with the email as people forward it
     with comments to third parties, and each time it is opened, the
     Javascript runs, reads the email (including the added text)
     and sends it back to a server by encoding it into a form response.
           
         The exploit is made possible because JavaScript is able to 
         read text in an e-mail message. If a message is forwarded to 
         someone else, the hidden JavaScript code in the page can read 
         any text that has been added to the message when it is 
         forwarded. This JavaScript code executes when the forwarded 
         message is read. The JavaScript code then silently sends off
         this text using a Web bug, or a hidden form, to a Web server 
         belonging to the original sender of the message. The sender 
         can then retrieve the text and read it.
    
     This uses a perfectly normal Javascript function - a useful 
     sounding feature which no-one thought to consider the security 
     implications of before foisting it on the world in the default
     arrangement of the latest and greatest software.


28 - Security problems with cookies . . .

        http://users.rcn.com/rms2000/privacy/cookleak.htm


29 - . . . malicious code somehow related to frames . . .

        http://www.vnunet.com/News/1120450

            
30 - . . . web bugs . . . 

        http://www.mackraz.com/trickybit/readreceipt/


31 - How can you have a consistent signature in HTML without making
     assumptions about the background image or colour of whatever  
     messages you are going to send?


32 - A great deal of email reading and writing is done via web-mail
     systems.  It is a challenge to make a functional system for
     composing and reading plain text (with all the security issues
     of stopping the browser caching the pages etc.) - but making
     HTML reading work is a lot more complicated still.  Likewise for
     the complexity (for the user and for the communications and
     programming) in using a Web interface to compose HTML emails.

     
 
Other links regarding HTML email security are:

  http://www.strom.com/awards/192.html
  http://www.byte.com/documents/s=337/BYT20000412S0010/index.htm   

  http://networknews.vnunet.com/ReadersToTheRescue/16674

  http://www.icomm.ca/~dragon/posting.htm#frown


I am sure I could think of more - but I shouldn't have to.   What is
wrong with people to think that over-complex, flaky, hard to use and
hard to understand systems are better than simple ones?    Have people
lived such scrambled, TVed, Nintendoed, PlayStationed, caffeinated and
conflicted lives that they never developed a gut feeling for the value
of elegance, simplicity and reliability?   Or are the programmers hip
and the marketing people keen for the flaky, zoomy, featuritis bits?  I
hope Mozilla could avoid the negative influence of the latter.


In a nutshell:

  Plain text always works - and HTML doesn't.

  Plain text is easy for anyone to understand - HTML isn't.

  Plain text has a direct, one-to-one, correspondence between what
  characters and placement you see on screen when writing it (assuming
  the compose editor shows how text will be wrapped) and what is in
  fact sent.  There are no layers of nonsense, interpretation etc.

  Plain text always conveys to the recipient exactly what the sender
  wrote (assuming they used a standard character set, not some 
  Microsoft abomination which renders quote characters as something
  else on a normal email client).  HTML may convey a message (or no 
  message) very different from what the sender thought they were 
  sending due to limitations with fonts, bugs in the HTML renderer
  (and there are many flavours of HTML) etc.
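
  The quote character problem can be shown in a few lines.  This is
  a minimal Python sketch which assumes the non-standard character
  set in question is Windows-1252 (Python's "cp1252" codec); that
  identification is my assumption for the example.  It encodes
  curly quotes as such a client would and then decodes the bytes as
  ISO-8859-1, as a standards-following reader would.

```python
# What the sender typed: curly "smart" quotes around two words.
typed = "\u201cplain text\u201d"

# Assumption: the sending client encodes the text as Windows-1252.
wire_bytes = typed.encode("cp1252")

# The recipient's client treats the same bytes as ISO-8859-1.
seen_by_recipient = wire_bytes.decode("iso-8859-1")

# Ordinary ASCII quotes mean the same thing in both character sets.
ascii_bytes = '"plain text"'.encode("ascii")
ascii_seen = ascii_bytes.decode("iso-8859-1")

if __name__ == "__main__":
    print("bytes on the wire:  ", wire_bytes)
    print("recipient gets:     ", repr(seen_by_recipient))
    print("ASCII quotes arrive:", repr(ascii_seen))
```

  The two quote bytes (0x93 and 0x94) fall in a range which
  ISO-8859-1 reserves for control codes, so the recipient's client
  has nothing sensible to draw for them, while the plain ASCII
  quotes arrive unchanged.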


 *** So plain text should be the DEFAULT for all email clients! ***


Everyone is sick to death of computery things which look flashy but turn
out to be overcomplex, hard to use, hard to understand, and unreliable.  
It may be that people use computer systems like books in their bookshelf
they have never read  - as a means of conforming to social expectations
and impressing themselves and others.  But a more important function is
to have systems which are easy to use, clear and 100% reliable.  

Since email is arguably the most important function of the Internet, we
should fight to get email client programs right - which means defaulting
to sending plain text, with ISO-8859-1 character set, with fixed width
font, WYSIWYG wrapping to sensible margins (72, for instance).  I think
that, at least in the English-speaking world, these are both necessary
and sufficient to support the great majority of email communication
needs.  Those which are not supported can be achieved by the attachment
of files by users who choose to do so - and these can include HTML
files.
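
For anyone wondering how little is involved in the default I am
describing, here is a minimal Python sketch using the standard
email package.  The function name build_plain_message and the
example.org addresses are made up for the illustration; this shows
one way of producing a single-part text/plain message labelled as
ISO-8859-1, not how any particular email client does it.

```python
from email.message import EmailMessage


def build_plain_message(sender, recipient, subject, body):
    """Build a single-part text/plain message in ISO-8859-1."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # One text part only: no HTML alternative, no attachments.
    msg.set_content(body, subtype="plain", charset="iso-8859-1")
    return msg


# Hypothetical addresses and text, already wrapped by the sender.
BODY = "The café is open.\nSee you there at noon.\n"
MESSAGE = build_plain_message(
    "robin@example.org", "reader@example.org", "Lunch", BODY
)

if __name__ == "__main__":
    print(MESSAGE["Content-Type"])
    print("multipart:", MESSAGE.is_multipart())
```

The result is one part, one declared character set and nothing for
the recipient's client to interpret beyond the characters
themselves.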


How is it that we can design 30 million transistor CPUs which do
something really rigorous and elegant, but can't decide to use a simple,
clear, long-established technique over something half-baked and full of
problems?


Do you want your program to be:

     Flashy and flaky?
or 
     Elegant, easy to use and understand, and 100% reliable?

It's common sense to me.  The world is over-full of flashy, flaky things
- and bad computer software, monstrous web-sites, marketing bumpf,
telemarketing calls, spam etc.  All these contribute to the corrosive,
complicated, entangling and distracting burden we struggle with every
day.

Making all email clients default to plain text with a sensible editor
(as Mozilla seems to have, though it forces all longer than 72 char 
URLs to start on the left margin . . . ) is the way to go!

  - Robin

      http://www.firstpr.com.au
