I couldn't agree more. "text" and "files" are just encoding and packaging. We routinely represent the same information in different ways during different stages of a program or system's lifecycle in order to obtain advantages relevant to the processing problems at hand. In the past, it has been convenient to encourage ubiquitous use of standard encoding (ASCII) and packaging (files) in exchange for the obvious benefits of simplicity, access to common tooling that understands those standards, and interchange between systems.
However, if we set simplicity aside for the moment, the goals of access and interchange can be accomplished by means of mapping. It is not essential to maintain ubiquitous lowest-common-denominator standards if suitable mapping functions exist. My personal feeling is that the design of practical next-generation languages and tools has been retarded for a very long time by an unexamined emotional need to cling to common historical standards that are insufficient to support the needs of forward-looking language concepts.
For example, if we look beyond system interchange, the most significant value of core ASCII is its relatively good impedance match to keys found on most computer keyboards. When "standard typewriter" keyboards were the ubiquitous form of data entry, this was an overwhelmingly important consideration. However, we long ago broke the chains of this relationship: data entry routinely encompasses entry from pointer devices such as mice and trackballs, tablets of various descriptions, incomplete keyboards such as numeric keypads, game controllers, etc. These axes of expression are not represented in the graphology of ASCII.
In this world, the impedance mismatch to ASCII (and Unicode, which could be seen as ASCII++, since it offers more glyphs but makes little attempt to increase the core semantics of graphology offered) invites examination. In this world, it seems to me that core expressiveness of a graphology trumps ubiquity. I'd like to see more languages being bold and looking beyond ASCII-derived symbology to find graphologies that allow for more powerful representation and manipulation of modern ontologies.
A concrete example: ASCII only allows "to the right of" as a first-class relationship in its representation ontology. (The word "at" is formed as the character "t" to the right of the character "a".) Even concepts such as "next line" or "backspace" are second-order concepts encoded by reserved symbols borrowed from the representable namespace.
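To make the last point concrete, here is a minimal Python sketch (not from the original discussion). The `render` function is hypothetical, and its reading of backspace as "erase the previous symbol" is an assumption chosen for illustration; a real terminal or typewriter may interpret backspace differently. It shows that a flat stream records only which symbol follows which, and that line structure appears only once two code points are treated as reserved control symbols.

```python
def render(stream):
    """Interpret a flat character stream as rows of text.

    The stream itself only records "this symbol follows that one".
    Rows exist only because two code points are treated as reserved
    control symbols (an illustrative assumption, see the lead-in).
    """
    rows = [[]]
    for ch in stream:
        if ch == "\n":        # reserved symbol: start a new row
            rows.append([])
        elif ch == "\b":      # reserved symbol: erase the previous symbol
            if rows[-1]:
                rows[-1].pop()
        else:                 # ordinary symbol: place it to the right
            rows[-1].append(ch)
    return ["".join(row) for row in rows]


# "at" is just the code for "a" followed by the code for "t".
assert [ord(c) for c in "at"] == [97, 116]

# "Next line" and "backspace" occupy slots in the same code space.
assert ord("\n") == 10 and ord("\b") == 8

print(render("at\nax\bt"))
```

Nothing in the stream itself says "below" or "replace"; those meanings live entirely in the interpreter, which is the sense in which they are second-order.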
Advanced but still fundamental concepts such as "subordinate to" (i.e., subscript) are only available in so-called RichText systems. Even more powerful concepts like "contains" (for example, a "word" which is composed of the symbol "O" containing inside it the symbol "c") are not representable at all in the commonly available graphologies. People who attempt to express mathematical formulae routinely grapple with these limitations. Even where a character set includes a root symbol, the underlying graphology does not implement rules by which characters can be arranged around it to represent the third root of x.
Many of the excruciating design exercises language designers go through these days are largely driven by limitations of the ASCII++ graphology we assume to be sacrosanct. (For example, see the parts of this discussion thread that analyze the use of various compound-character combinations, which intrude all the way to the parsing layer of a language because the core ASCII graphology doesn't feature enough bracket symbols.)
This barrier is artificial and historical in nature, and it need no longer constrain us. Modern high-powered computing systems give us the luxury of imposing abstraction in important ways that were historically infeasible, which lets us achieve new kinds of expressive power and simplicity.
-- Mack
On Mar 13, 2012, at 8:11 AM, David Barbour wrote:
>
>
> On Tue, Mar 13, 2012 at 5:42 AM, Josh Grams <j...@qualdan.com> wrote:
> On 2012-03-13 02:13PM, Julian Leviston wrote:
> >What is "text"? Do you store your "text" in ASCII, EBCDIC, SHIFT-JIS or
> >UTF-8? If it's UTF-8, how do you use an ASCII editor to edit the UTF-8
> >files?
> >
> >Just saying' ;-) Hopefully you understand my point.
> >
> >You probably won't initially, so hopefully you'll meditate a bit on my
> >response without giving a knee-jerk reaction.
>
> OK, I've thought about it and I still don't get it. I understand that
> there have been a number of different text encodings, but I thought that
> the whole point of Unicode was to provide a future-proof way out of that
> mess. And I could be totally wrong, but I have the impression that it
> has pretty good penetration. I gather that some people who use the
> Cyrillic alphabet often use some code page and China and Japan use
> SHIFT-JIS or whatever in order to have a more compact representation,
> but that even there UTF-8 tools are commonly available.
>
> So I would think that the sensible thing would be to use UTF-8 and
> figure that anyone (now or in the future) will have tools which support
> it, and that anyone dedicated enough to go digging into your data files
> will have no trouble at all figuring out what it is.
>
> If that's your point it seems like a pretty minor nitpick. What am I
> missing?
>
> Julian's point, AFAICT, is that text is just a class of storage that requires
> appropriate viewers and editors, doesn't even describe a specific standard.
> Thus, another class that requires appropriate viewers and editors can work
> just as well - spreadsheets, tables, drawings.
>
> You mention `data files`. What is a `file`? Is it not a service provided by a
> `file system`? Can we not just as easily hide a storage format behind a
> standard service more convenient for ad-hoc views and analysis (perhaps
> RDBMS). Why organize into files? Other than penetration, they don't seem to
> be especially convenient.
>
> Penetration matters, which is one reason that text and filesystems matter.
>
> But what else has penetrated? Browsers. Wikis. Web services. It wouldn't be
> difficult to support editing of tables, spreadsheets, drawings, etc. atop a
> web service platform. We probably have more freedom today than we've ever had
> for language design, if we're willing to stretch just a little bit beyond the
> traditional filesystem+text-editor framework.
>
> Regards,
>
> Dave
> _______________________________________________
> fonc mailing list
> fonc@vpri.org
> http://vpri.org/mailman/listinfo/fonc