I couldn't agree more.

"Text" and "files" are just encoding and packaging.  We routinely represent 
the same information in different ways during different stages of a program or 
system's lifecycle in order to obtain advantages relevant to the processing 
problems at hand.  In the past, it has been convenient to encourage ubiquitous 
use of standard encoding (ASCII) and packaging (files) in exchange for the 
obvious benefits of simplicity, access to common tooling that understands those 
standards, and interchange between systems.

However, if we set simplicity aside for the moment, the goals of access and 
interchange can be accomplished by means of mapping.  It is not essential to 
maintain ubiquitous lowest-common-denominator standards if suitable mapping 
functions exist.
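
As a minimal sketch of what such a mapping function might look like: the 
`remap` helper below is hypothetical, named only for illustration, and it 
assumes Python's standard `ascii` and `cp037` (an EBCDIC code page) codecs. 
The same information lives in two different byte representations, and the 
mapping carries it across without loss.

```python
# Hypothetical illustration: one piece of information, two encodings,
# and a mapping function that moves between them.

def remap(data: bytes, source: str, target: str) -> bytes:
    """Map bytes from one character encoding to another."""
    return data.decode(source).encode(target)

message = "at"
as_ascii = message.encode("ascii")

# Same information, different bytes (EBCDIC code page 037).
as_ebcdic = remap(as_ascii, "ascii", "cp037")

# Mapping back recovers the original representation.
round_trip = remap(as_ebcdic, "cp037", "ascii")
```

Neither side has to adopt the other's encoding; each tool only needs access 
to a mapping it trusts.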

My personal feeling is that the design of practical next-generation languages 
and tools has been held back for a very long time by an unexamined emotional 
need to cling to common historical standards that are insufficient to support 
the needs of forward-looking language concepts.

For example, if we look beyond system interchange, the most significant value 
of core ASCII is its relatively good impedance match to keys found on most 
computer keyboards.  When "standard typewriter" keyboards were the ubiquitous 
form of data entry, this was an overwhelmingly important consideration.  
However, we long ago broke the chains of this relationship:  Data entry 
routinely encompasses entry from pointer devices such as mice and trackballs, 
tablets of various descriptions, incomplete keyboards such as numeric keypads, 
game controllers, etc.  These axes of expression are not represented in the 
graphology of ASCII.

In this world, the impedance mismatch to ASCII (and Unicode, which could be 
seen as ASCII++, since it offers more glyphs but makes little attempt to 
extend the core semantics of the graphology on offer) invites examination.  
It seems to me that the core expressiveness of a graphology now trumps 
ubiquity.  I'd like to see more languages being bold and looking beyond 
ASCII-derived symbology to find graphologies that allow for more powerful 
representation and manipulation of modern ontologies.

A concrete example:  ASCII only allows "to the right of" as a first-class 
relationship in its representation ontology.  (The word "at" is formed as the 
character "t" to the right of the character "a".)  Even concepts such as "next 
line" or "backspace" are second-order concepts encoded by reserved symbols 
borrowed from the representable namespace.  Advanced but still fundamental 
concepts such as "subordinate to" (e.g., subscript) are only available in 
so-called RichText systems.  Even more powerful concepts like "contains" (for 
example, a "word" composed of the symbol "O" with the symbol "c" inside it) 
are not representable at all in the commonly available graphologies.  People 
who attempt to express mathematical formulae routinely grapple with these 
limitations.  Even where a character set includes a root symbol, the 
underlying graphology does not implement rules by which characters can be 
arranged around it to represent the third root of x.
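
To make the distinction concrete, here is a rough sketch of a graphology in 
which "right of", "subscript", "contains" and "root" are all first-class 
relationships.  Every name in it (`Glyph`, `RightOf`, `Subscript`, 
`Contains`, `Root`, `to_ascii`) is hypothetical and invented for this 
illustration; it is not an existing library or standard.  The `to_ascii` 
function shows what happens when the structure is flattened: only `RightOf` 
survives natively, and everything else has to be spelled out in ad hoc markup.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Glyph:
    symbol: str


@dataclass(frozen=True)
class RightOf:          # the one relationship ASCII has natively
    left: "Node"
    right: "Node"


@dataclass(frozen=True)
class Subscript:        # "subordinate to"
    base: "Node"
    sub: "Node"


@dataclass(frozen=True)
class Contains:         # one symbol drawn inside another
    outer: "Node"
    inner: "Node"


@dataclass(frozen=True)
class Root:             # root sign with an index and a radicand
    index: "Node"
    radicand: "Node"


Node = Union[Glyph, RightOf, Subscript, Contains, Root]


def to_ascii(node: Node) -> str:
    """Flatten a node to ASCII, inventing markup for non-native relations."""
    if isinstance(node, Glyph):
        return node.symbol
    if isinstance(node, RightOf):
        return to_ascii(node.left) + to_ascii(node.right)
    if isinstance(node, Subscript):
        return f"{to_ascii(node.base)}_({to_ascii(node.sub)})"
    if isinstance(node, Contains):
        return f"contains({to_ascii(node.outer)},{to_ascii(node.inner)})"
    if isinstance(node, Root):
        return f"root({to_ascii(node.index)},{to_ascii(node.radicand)})"
    raise TypeError(f"unknown node: {node!r}")


word_at = RightOf(Glyph("a"), Glyph("t"))
c_in_o = Contains(Glyph("O"), Glyph("c"))
third_root_of_x = Root(Glyph("3"), Glyph("x"))
```

Here `to_ascii(word_at)` yields plain "at", while the other two values can 
only be flattened by borrowing symbols from the representable namespace, 
which is exactly the second-order encoding described above.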

Many of the excruciating design exercises language designers go through these 
days are largely driven by limitations of the ASCII++ graphology we assume to 
be sacrosanct.  (For example, parts of this discussion thread analyze 
compound-character combinations that intrude all the way into the parsing 
layer of a language because the core ASCII graphology doesn't offer enough 
bracket symbols.)

This barrier is artificial and historical in nature.  It need no longer 
constrain us:  modern high-powered computing systems let us impose abstraction 
in important ways that were historically infeasible, and so achieve new kinds 
of expressive power and simplicity.

-- Mack


On Mar 13, 2012, at 8:11 AM, David Barbour wrote:

> 
> 
> On Tue, Mar 13, 2012 at 5:42 AM, Josh Grams <j...@qualdan.com> wrote:
> On 2012-03-13 02:13PM, Julian Leviston wrote:
> >What is "text"? Do you store your "text" in ASCII, EBCDIC, SHIFT-JIS or
> >UTF-8?  If it's UTF-8, how do you use an ASCII editor to edit the UTF-8
> >files?
> >
> >Just sayin' ;-) Hopefully you understand my point.
> >
> >You probably won't initially, so hopefully you'll meditate a bit on my
> >response without giving a knee-jerk reaction.
> 
> OK, I've thought about it and I still don't get it.  I understand that
> there have been a number of different text encodings, but I thought that
> the whole point of Unicode was to provide a future-proof way out of that
> mess.  And I could be totally wrong, but I have the impression that it
> has pretty good penetration.  I gather that some people who use the
> Cyrillic alphabet often use some code page and China and Japan use
> SHIFT-JIS or whatever in order to have a more compact representation,
> but that even there UTF-8 tools are commonly available.
> 
> So I would think that the sensible thing would be to use UTF-8 and
> figure that anyone (now or in the future) will have tools which support
> it, and that anyone dedicated enough to go digging into your data files
> will have no trouble at all figuring out what it is.
> 
> If that's your point it seems like a pretty minor nitpick.  What am I
> missing?
> 
> Julian's point, AFAICT, is that text is just a class of storage that requires 
> appropriate viewers and editors; it doesn't even describe a specific standard. 
> Thus, another class that requires appropriate viewers and editors can work 
> just as well - spreadsheets, tables, drawings. 
> 
> You mention `data files`. What is a `file`? Is it not a service provided by a 
> `file system`? Can we not just as easily hide a storage format behind a 
> standard service more convenient for ad-hoc views and analysis (perhaps an 
> RDBMS)? Why organize into files? Other than penetration, they don't seem to 
> be especially convenient.
> 
> Penetration matters, which is one reason that text and filesystems matter.  
> 
> But what else has penetrated? Browsers. Wikis. Web services. It wouldn't be 
> difficult to support editing of tables, spreadsheets, drawings, etc. atop a 
> web service platform. We probably have more freedom today than we've ever had 
> for language design, if we're willing to stretch just a little bit beyond the 
> traditional filesystem+text-editor framework. 
> 
> Regards,
> 
> Dave
> _______________________________________________
> fonc mailing list
> fonc@vpri.org
> http://vpri.org/mailman/listinfo/fonc
