On 3/13/2012 4:37 PM, Julian Leviston wrote:
On 14/03/2012, at 2:11 AM, David Barbour wrote:
On Tue, Mar 13, 2012 at 5:42 AM, Josh Grams <j...@qualdan.com> wrote:
On 2012-03-13 02:13PM, Julian Leviston wrote:
>What is "text"? Do you store your "text" in ASCII, EBCDIC, SHIFT-JIS
>or UTF-8? If it's UTF-8, how do you use an ASCII editor to edit the
>UTF-8 files?
>
>Just sayin' ;-) Hopefully you understand my point.
>
>You probably won't initially, so hopefully you'll meditate a bit on
>my response without giving a knee-jerk reaction.
OK, I've thought about it and I still don't get it. I understand that
there have been a number of different text encodings, but I thought
that the whole point of Unicode was to provide a future-proof way out
of that mess. And I could be totally wrong, but I have the impression
that it has pretty good penetration. I gather that some people who use
the Cyrillic alphabet often use some code page, and China and Japan use
SHIFT-JIS or whatever in order to have a more compact representation,
but that even there UTF-8 tools are commonly available.

So I would think that the sensible thing would be to use UTF-8 and
figure that anyone (now or in the future) will have tools which
support it, and that anyone dedicated enough to go digging into your
data files will have no trouble at all figuring out what it is.

If that's your point it seems like a pretty minor nitpick. What am I
missing?
Julian's point, AFAICT, is that text is just a class of storage that
requires appropriate viewers and editors; it doesn't even describe a
specific standard. Thus, another class that requires appropriate
viewers and editors can work just as well - spreadsheets, tables,
drawings.

You mention `data files`. What is a `file`? Is it not a service
provided by a `file system`? Can we not just as easily hide a storage
format behind a standard service more convenient for ad-hoc views and
analysis (perhaps an RDBMS)? Why organize into files? Other than
penetration, they don't seem to be especially convenient.
Penetration matters, which is one reason that text and filesystems
matter.
But what else has penetrated? Browsers. Wikis. Web services. It
wouldn't be difficult to support editing of tables, spreadsheets,
drawings, etc. atop a web service platform. We probably have more
freedom today than we've ever had for language design, if we're
willing to stretch just a little bit beyond the traditional
filesystem+text-editor framework.
Regards,
Dave
Perfectly the point, David. A "token/character" in ASCII is equivalent
to a byte. In SHIFT-JIS, it's two, but this doesn't mean you can't
express the equivalent meaning in them (i.e. by selecting the same
graphemes - this is called translation) ;-)
this is partly why there are "codepoints".
one can work in terms of codepoints, rather than bytes.
a text editor may internally work in UTF-16, but saves its output in
UTF-8 or similar.
ironically, this is basically what I am planning/doing at the moment.
now, if/how the user will go about typing UTF-16 codepoints, this is not
yet decided.
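the codepoints-versus-bytes distinction can be shown concretely. A
minimal Python sketch (the sample string is arbitrary; the point is
that the codepoint count stays fixed while the byte count depends
entirely on the encoding chosen for storage):

```python
# The same text, counted as codepoints vs. as bytes in several encodings.
s = "日本語abc"

print(len(s))                      # 6 codepoints, regardless of encoding
print(len(s.encode("utf-8")))      # 12 bytes: 3 per CJK char + 3 ASCII
print(len(s.encode("utf-16-le")))  # 12 bytes: 2 per codepoint (all in the BMP)
print(len(s.encode("shift-jis")))  # 9 bytes: 2 per CJK char + 3 ASCII
```

an editor working internally in UTF-16 while saving UTF-8, as described
above, is just converting between two of these byte-level forms of the
same codepoint sequence.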
One of the most profound things for me has been understanding the
ramifications of OMeta. It doesn't "just" parse streams of
"characters" (whatever they are); in fact it doesn't care what the
individual tokens of its parsing stream are. It's concerned merely
with the syntax of its elements (or tokens) - how they combine to form
certain rules (here I mean "valid patterns of grammar" by rules). If
one considers this well, it has amazing ramifications. OMeta invites
us to see the entire computing world in terms of sets of
problem-oriented languages, where "language" is a liberal word that
simply means a pattern of sequence of the constituent elements of a
"thing". To PEG, it basically adds proper translation and true
object-orientation of individual parsing elements. This takes a while
to understand, I think.
Formats here become "languages", protocols are "languages", and so are
any other kind of representation system you care to name (computer
programming languages, processor instruction sets, etc.).
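the token-agnostic idea can be sketched in a few lines. This is an
illustration of the principle, not OMeta's actual API: two hypothetical
combinators that match over any sequence of objects, so the same
grammar machinery works on characters, event tuples, or anything else:

```python
def tok(pred):
    """Match one token satisfying pred; return (value, rest) or None."""
    def parse(toks):
        if toks and pred(toks[0]):
            return toks[0], toks[1:]
        return None
    return parse

def seq(*parsers):
    """Match each parser in order, threading the remaining tokens."""
    def parse(toks):
        out = []
        for p in parsers:
            r = p(toks)
            if r is None:
                return None
            v, toks = r
            out.append(v)
        return out, toks
    return parse

# The same grammar works over characters...
digit = tok(str.isdigit)
print(seq(digit, digit)(list("42")))   # (['4', '2'], [])

# ...or over arbitrary non-character tokens, e.g. event tuples:
keydown = tok(lambda t: t[0] == "keydown")
print(seq(keydown, keydown)([("keydown", "a"), ("keydown", "b")]))
```

nothing in `tok` or `seq` knows what a "character" is; only the
predicates do, which is the sense in which formats, protocols, and
instruction sets all become "languages" to the same parser.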
possibly.
I was actually sort of aware of a lot of this already though, but didn't
consider it particularly relevant.
I'm postulating, BGB, that you're perhaps so ingrained in the current
modality and approach to thinking about computers that you maybe
can't break out of it to see what else might be possible. I think it
was Turing, wasn't it, who postulated that his Turing machines could
work off ANY symbols... so if that's the case, and your programming
language grammar has a set of symbols, why not use arbitrary (i.e. not
composed of English letters) ideograms for them? (I think these days
we call these things icons ;-))

You might say "but how will people name their variables" - well,
perhaps for those things you could use English letters, but maybe you
could enforce that no one use more than 30 variables in any one simple
chunk of code, in which case you could build them in with the other
ideograms.
I'm not attempting to build any kind of authoritative status here,
merely provoke some different thought in you.
the issue is not that I can't imagine anything different, but rather
that doing anything different would be a hassle with current keyboard
technology:
pretty much anyone can type ASCII characters;
many other people have keyboards (or key-mappings) that can handle
region-specific characters.
however, otherwise, typing unusual characters (those outside their
current keyboard mapping) tends to be a bit more painful, and/or
introduces editor dependencies, and possibly increases the learning
curve (now people have to figure out how these various unorthodox
characters map to the keyboard, ...).
more graphical representations, however, have a secondary drawback:
they can't be manipulated nearly as quickly or as easily as text.
one option would be "drag and drop", but the problem is that drag and
drop is still a fairly slow and painful process (vs hitting keys on
the keyboard).
yes, there are scenarios where keyboards aren't ideal:
such as on an XBox360 or an Android tablet/phone/... or similar, but
people probably aren't going to be using these for programming anyways,
so it is likely a fairly moot point.
however, even in these cases, it is not clear there are many "clearly
better" options either (on-screen keyboard, or on-screen tile selector,
either way it is likely to be painful...).
simplest answer:
just assume that current text-editor technology is "basically
sufficient" and call it "good enough".
I'll take Dave's point that penetration matters, and at the same time,
most "new ideas" have "old idea" constituents, so you can easily give
people stuck in the old methodologies and thinking something to relate
to when building your "new stuff" ;-)
well, it is like using alternate syntax designs (say, not a C-style
"curly brace" syntax).
one can do so, but is it worth it?
in such a case, the syntax is no longer what most programmers are
familiar or comfortable with, and it is more effort to convert code
to/from the language, ...
so, likely, the overall cheapest option is to use a fairly generic syntax.
most everything else then mostly amounts to various forms of
cost/benefit tradeoff and similar.
it isn't really about "authority" or things being "proper" or similar,
but more about cost-benefit tradeoffs, and trying for the route likely
to result in the most benefit, ...
and, if/when something else catches on, then a person can use that
instead, but if/when this happens is more of an issue for the future to
deal with.
most trends tend to be fairly unexciting and slow moving (for example,
the trends in programming language design and syntax tend to take place
mostly over a period of decades, and there seems to be little evidence
that either ASCII or C-style syntax are likely to go away any time soon).
much past then? well, who knows?...
otherwise:
did mostly go and write a generic in-console text editor, and used an
MS-Edit/QBasic style color scheme (white text on a blue background). may
try for a generally similar aesthetic.
made the observation that "tab" is a rather annoying character to deal
with (some fair amount of logic in the editor interface is spent mostly
working around the behavior of the tab character...). ended up mostly
representing tabs as a "real" tab character, followed by 0 or more
"tab-spacer" characters (these spacer characters aren't intended to be
saved to output, but are mostly for aligning stuff within the editor).
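a rough sketch of that tab-plus-spacer scheme (the choice of '\x01' as
the in-buffer spacer character and the tab-stop width are my own
assumptions here, not details given above):

```python
TAB_STOP = 8
SPACER = "\x01"   # hypothetical in-buffer filler; stripped before saving

def expand_tabs(line):
    """Pad each real tab with spacers so every buffer cell is one column."""
    out = []
    for ch in line:
        if ch == "\t":
            width = TAB_STOP - (len(out) % TAB_STOP)
            out.append("\t")                  # keep the "real" tab
            out.extend(SPACER * (width - 1))  # pad to the next tab stop
        else:
            out.append(ch)
    return "".join(out)

def collapse_tabs(buffer_line):
    """Drop the spacers when writing the buffer back out."""
    return buffer_line.replace(SPACER, "")

buf = expand_tabs("a\tb")
print(len(buf))              # 9: 'a', a 7-column tab cell, 'b'
print(collapse_tabs(buf))    # round-trips back to the original line
```

with this layout, cursor movement and column arithmetic can treat the
buffer as a flat grid, which is presumably the point of the workaround.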
the "eval" key already works (used F5 as the "eval" key).
next up, probably:
probably implementing things like selection and cut/copy/paste, and the
ability to load/save files.
all of this is being kind of long and annoying, but I sort of expected
this much (although I may have underestimated how much would go into
nit-picky stuff related to dealing with user input, which is probably
where the bulk of the effort has gone).
or such...
_______________________________________________
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc