> > You might as well say that C code is not plain text because it too is
> > subject to special canons of interpretation.
>
> C, C++ and Java source files are not plain text as well (they have their own

C, C++ and Java source files are plain text.

> "text/*" MIME type, which is NOT "text/plain" notably because of the rules

I've seen text/cpp and text/java, but really there are no such types. I've also seen text/x-source-code, which is at least legal, if of little value to interoperability. The correct MIME type for C and C++ source files is text/plain. I'd be prepared to give good odds that that is the case with Java source files as well.

> associated with end-of-lines, notably in presence of comments).

As source files (that is, at the stage in processing at which a human user can see the source and edit it) the only handling required for end-of-lines is conversion of new line function characters, the same as for any other use of plain text.

The treatment of end-of-lines as significant when processed (for example, following one-line // comments) is a matter of what an application chooses to do with a particular character. This is no different from an indexer deciding that a plain text file contains a particular word, or, for that matter, from my putting coffee filters into my basket if I see "coffee filters" written on my shopping list.

> > But both XML/HTML/SGML and the various programming languages are plain
> > text.
>
> See "text/xml", "text/html" and "text/sgml" MIME types. They also aren't
> "text/plain" so they have their own interpretation of Unicode characters
> which is not the one found in the Unicode standard.

They have their own interpretation of the Unicode characters which is *in addition to* the one found in the Unicode standard. As do all but the simplest applications that use Unicode (as interesting as many of them are, characters are of little use on their own).
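
To make the end-of-line point above concrete, here is a rough sketch in Python of the two layers I have in mind. The function names are my own invention for illustration, and the comment stripper is deliberately naive (it ignores string literals and /* */ comments), so treat it as a sketch under those assumptions and not as how any real compiler works. The first function is the only end-of-line handling that plain text itself calls for; the second is an application choosing to treat the line end as significant.

```python
import re

# Newline functions as I read the Unicode newline guidelines: CRLF, LF, CR
# and NEL, plus the LS and PS separators. Which ones to accept is an
# assumption made for this sketch.
_NEWLINE_FUNCTIONS = re.compile("\r\n|[\n\r\x85\u2028\u2029]")


def normalize_newlines(text: str) -> str:
    """Plain-text layer: map every newline function to a single LF."""
    return _NEWLINE_FUNCTIONS.sub("\n", text)


def strip_line_comments(source: str) -> str:
    """Application layer (hypothetical, simplified): drop // comments.

    The line end only matters here because this application decides that
    a // comment runs up to it.
    """
    lines = normalize_newlines(source).split("\n")
    return "\n".join(line.split("//", 1)[0].rstrip() for line in lines)
```

Calling strip_line_comments on the same source saved with CRLF, LF or LS line ends gives the same result, because the conversion happens at the plain-text layer before the application ever applies its own rule.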

