Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)
BGB wrote: On 3/13/2012 4:37 PM, Julian Leviston wrote: I'll take Dave's point that penetration matters, and at the same time, most new ideas have old idea constituents, so you can easily find some matter for people stuck in the old methodologies and thinking to relate to when building your new stuff ;-)

well, it is like using alternate syntax designs (say, not a C-style curly brace syntax). one can do so, but is it worth it? in such a case, the syntax is no longer what most programmers are familiar or comfortable with, and it is more effort to convert code to/from the language, ...

Alternate syntaxes are not always as awkward as you seem to think they are, especially the specialized ones. The trick is to ask yourself how you would have written such and such piece of program if there were no pesky parser to satisfy. Or how you would have written a complete spec in the comments. Then you write the parser which accepts such input. My point is, new syntaxes don't always have to be unfamiliar. For instance:

+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+---+---+---+---+---+---+---+---+
|foo|bar|
+---+---+
|baz|
+---+

It should be obvious to anyone who has read an RFC (or a STEPS progress report) that it describes a bit field (16 bits large, with 3 fields). And those who didn't should have learned this syntax by now.

Now the only question left is, is it worth the trouble _implementing_ the syntax? Considering that code is more often read than written, I'd say it often is. Even if the code that parses the syntax isn't crystal clear, what the syntax should mean is.

You could also play the human compiler: use the better syntax in the comments, and implement a translation of it in code just below. But then you have to manually make sure they are synchronized. Comments are good. Needing them is bad.

Loup.

___ fonc mailing list fonc@vpri.org http://vpri.org/mailman/listinfo/fonc
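[Editor's illustration] Loup's "then you write the parser which accepts such input" can be made concrete. The following is a minimal sketch (in Python, not code from the thread) of a parser that accepts an RFC-style box diagram like the one above and recovers the field names, so the picture itself becomes machine-readable:

```python
# A minimal sketch: parse an RFC-style box diagram into its field names.
def parse_bitfield(diagram):
    """Return the field names from an RFC-style box diagram."""
    fields = []
    for line in diagram.strip().splitlines():
        line = line.strip()
        if not line.startswith('|'):
            continue  # skip the +---+ border rows
        cells = [c.strip() for c in line.strip('|').split('|')]
        if all(c.isdigit() for c in cells):
            continue  # skip the bit-number header row
        fields.extend(cells)
    return fields

diagram = """
+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+---+---+---+---+---+---+---+---+
|foo|bar|
+---+---+
|baz|
+---+
"""
print(parse_bitfield(diagram))  # ['foo', 'bar', 'baz']
```

A real implementation would also recover field widths and bit offsets from the column positions, but even this much shows the parsing burden is modest compared to the readability gain.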
Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)
Michael FIG wrote: Loup Vaillant l...@loup-vaillant.fr writes: You could also play the human compiler: use the better syntax in the comments, and implement a translation of it in code just below. But then you have to manually make sure they are synchronized. Comments are good. Needing them is bad.

Or use a preprocessor that substitutes the translation inline automatically. Which is a way of implementing the syntax…

How is this different from my "Then you write the parser"? Sure you can use a preprocessor, but you still have to write the macros for your new syntax.

Loup.
Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)
On 3/14/2012 8:57 AM, Loup Vaillant wrote: Michael FIG wrote: Loup Vaillant l...@loup-vaillant.fr writes: You could also play the human compiler: use the better syntax in the comments, and implement a translation of it in code just below. But then you have to manually make sure they are synchronized. Comments are good. Needing them is bad. Or use a preprocessor that substitutes the translation inline automatically. Which is a way of implementing the syntax… How is this different from my "Then you write the parser"? Sure you can use a preprocessor, but you still have to write the macros for your new syntax.

in my case, this can be theoretically done already (writing new customized parsers), and was part of why I added block-strings. the most likely route would be translating code into ASTs, and maybe using something like (defmacro) or similar at the AST level. another route could be, I guess, to make use of quote and unquote, both of which can be used as expression-building features (functionally, they are vaguely similar to quasiquote in Scheme, but they haven't enjoyed so much use thus far).

a more practical matter though would be getting things nailed down enough so that larger parts of the system can be written in a language other than C. yes, there is the FFI (generally seems to work fairly well), and one can shove script closures into C-side function pointers (provided arguments and return types are annotated and the types match exactly, but I don't entirely trust its reliability, ...). slightly nicer would be if code could be written in various places which accepts script objects (either via interfaces or ex-nihilo objects).

abstract example (ex-nihilo object):

var obj = {
    render: function() { ... }
    ...
};
lbxModelRegisterScriptObject("models/script/somemodel", obj);

so, if some code elsewhere creates an object using the given model-name, then the script code is invoked to go about rendering it.

alternatively, using an interface:

public interface IRender3D { ... }  // contents omitted for brevity
public class MyObject implements IRender3D { ... }
lbxModelRegisterScriptObject("models/script/somemodel", new MyObject());

granted, there are probably better (and less likely to kill performance) ways to make use of script objects (as-is, using script code to write objects for use in the 3D renderer is not likely to turn out well regarding the framerate and similar, at least until if/when there is a good solid JIT in place, and it can compete on more equal terms with C regarding performance). mostly the script language was intended for use in the game's server end, where typically raw performance is less critical, but as-is, there is still a bit of a language-border issue that would need to be worked on here (I originally intended to write the server end mostly in script, but at the time the VM was a little less solid (poorer performance, more prone to leak memory and trigger GC, ...), and so the server end was written more quick and dirty in plain C, using a design fairly similar to a mix of the Quake 1 and 2 server-ends). as-is, it is not entirely friendly to the script code, so a little more work is needed. another possible use case is related to world-construction tasks (procedural world-building and similar).

but, yes, all of this is a bit more of a mundane way of using a scripting language, but then again, everything tends to be built from the bottom up (and this just happens to be where I am currently at, at this point in time). (maybe at the point when I am stuck less worrying about which language is used where, and about cross-language interfacing issues, then allowing things like alternative syntax, ... could be more worth exploration. but, in many areas, both C and C++ have a bit of a gravity well...).

or such...
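[Editor's illustration] The registration pattern BGB describes can be sketched outside his engine. This is a hypothetical Python mock-up, purely for illustration: the real lbxModelRegisterScriptObject is a C-side API, and every name below is invented.

```python
# Hypothetical stand-in for a registry of script-side model objects.
_script_models = {}

def register_script_object(model_name, obj):
    """Script code registers an object under a model name."""
    _script_models[model_name] = obj

class MyModel:
    # An "ex-nihilo" object: anything with a render() method will do.
    def render(self):
        return "rendered somemodel"

register_script_object("models/script/somemodel", MyModel())

def draw(model_name):
    """Engine side: look up the model and invoke the script's render()."""
    obj = _script_models.get(model_name)
    return obj.render() if obj is not None else None

print(draw("models/script/somemodel"))  # rendered somemodel
```

The point of the pattern is that the engine only depends on the model name and the render() contract, not on which language implemented the object.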
Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)
On Mar 13, 2012, at 6:27 PM, BGB wrote: SNIP the issue is not that I can't imagine anything different, but rather that doing anything different would be a hassle with current keyboard technology: pretty much anyone can type ASCII characters; many other people have keyboards (or key-mappings) that can handle region-specific characters. however, otherwise, typing unusual characters (those outside their current keyboard mapping) tends to be a bit more painful, and/or introduces editor dependencies, and possibly increases the learning curve (now people have to figure out how these various unorthodox characters map to the keyboard, ...). more graphical representations, however, have a secondary drawback: they can't be manipulated nearly as quickly or as easily as text. one could be like drag and drop, but the problem is that drag and drop is still a fairly slow and painful process (vs. hitting keys on the keyboard). yes, there are scenarios where keyboards aren't ideal: such as on an XBox360 or an Android tablet/phone/... or similar, but people probably aren't going to be using these for programming anyways, so it is likely a fairly moot point. however, even in these cases, it is not clear there are many clearly better options either (on-screen keyboard, or on-screen tile selector, either way it is likely to be painful...). simplest answer: just assume that current text-editor technology is basically sufficient and call it good enough.

Stipulating that having the keys on the keyboard mean what the painted symbols show is the simplest path with the least impedance mismatch for the user, there are already alternatives in common use that bear thinking on. For example:

On existing keyboards, multi-stroke operations to produce new characters (holding down shift key to get CAPS, CTRL-ALT-TAB-whatever to get a special character or function, etc…) are customary and have entered average user experience. Users of IDEs like EMACS, IntelliJ or Eclipse are well-acquainted with special keystrokes to get access to code completions and intention templates. So it's not inconceivable to consider a similar strategy for typing non-character graphical elements. One could think of, say… CTRL-O, UP ARROW, UP ARROW, ESC to type a circle and size it, followed by CTRL-RIGHT ARROW, C to enter the circle and type a "c" inside it.

An argument against these strategies is the same one against command-line interfaces in the CLI vs. GUI discussion: namely, that without visual prompting, the possibilities that are available to be typed are not readily visible to the user. The user has to already know what combination gives him what symbol. One solution for mitigating this, presuming rich graphical typing was desirable, would be to take a page from the way touch-type cell phones and tablets work, showing symbol maps on the screen in response to user input, with the maps being progressively refined as the user types to guide the user through constructing their desired input.

…just a thought :) SNIP

On Mar 13, 2012, at 6:27 PM, BGB also wrote: I'll take Dave's point that penetration matters, and at the same time, most new ideas have old idea constituents, so you can easily find some matter for people stuck in the old methodologies and thinking to relate to when building your new stuff ;-) well, it is like using alternate syntax designs (say, not a C-style curly brace syntax). one can do so, but is it worth it? in such a case, the syntax is no longer what most programmers are familiar or comfortable with, and it is more effort to convert code to/from the language, …

The degenerate endpoint of this argument (which, sadly, I encounter on a daily basis in the larger business-technical community) is "if it isn't Java, it is by definition alien and too uncomfortable (and therefore too expensive) to use."

We can protest the myopia inherent in that objection, but the sad fact is that perception and emotional comfort are more important to the average person's decision-making process than coldly rational analysis. (I refer to this as the Discount Shirt problem. Despite the fact that a garment bought at a discount store doesn't fit well and falls apart after the first washing… not actually fulfilling our expectations of what a shirt should do, so ISN'T really a shirt from a usability perspective… because it LOOKS like a shirt and the store CALLS it a shirt, we still buy it, telling ourselves we've bought a shirt. Then we go home and complain that shirts are a failure.)

Given this hurdle of perception, I have come to the conclusion that the only reasonable way to make advances is to live in the world of use case-driven design and measure the success of a language by how well it fits the perceived shape of the problem to be solved, looking for familiarity on the part of the user by means of keeping semantic distance between the language
Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)
On 3/14/2012 11:31 AM, Mack wrote: On Mar 13, 2012, at 6:27 PM, BGB wrote: SNIP the issue is not that I can't imagine anything different, but rather that doing anything different would be a hassle with current keyboard technology: pretty much anyone can type ASCII characters; many other people have keyboards (or key-mappings) that can handle region-specific characters. however, otherwise, typing unusual characters (those outside their current keyboard mapping) tends to be a bit more painful, and/or introduces editor dependencies, and possibly increases the learning curve (now people have to figure out how these various unorthodox characters map to the keyboard, ...). more graphical representations, however, have a secondary drawback: they can't be manipulated nearly as quickly or as easily as text. one could be like drag and drop, but the problem is that drag and drop is still a fairly slow and painful process (vs, hitting keys on the keyboard). yes, there are scenarios where keyboards aren't ideal: such as on an XBox360 or an Android tablet/phone/... or similar, but people probably aren't going to be using these for programming anyways, so it is likely a fairly moot point. however, even in these cases, it is not clear there are many clearly better options either (on-screen keyboard, or on-screen tile selector, either way it is likely to be painful...). simplest answer: just assume that current text-editor technology is basically sufficient and call it good enough. Stipulating that having the keys on the keyboard mean what the painted symbols show is the simplest path with the least impedance mismatch for the user, there are already alternatives in common use that bear thinking on. For example: On existing keyboards, multi-stroke operations to produce new characters (holding down shift key to get CAPS, CTRL-ALT-TAB-whatever to get a special character or function, etc…) are customary and have entered average user experience. 
Users of IDEs like EMACS, IntelliJ or Eclipse are well-acquainted with special keystrokes to get access to code completions and intention templates. So it's not inconceivable to consider a similar strategy for typing non-character graphical elements. One could think of, say… CTRL-O, UP ARROW, UP ARROW, ESC to type a circle and size it, followed by CTRL-RIGHT ARROW, C to enter the circle and type a "c" inside it.

An argument against these strategies is the same one against command-line interfaces in the CLI vs. GUI discussion: namely, that without visual prompting, the possibilities that are available to be typed are not readily visible to the user. The user has to already know what combination gives him what symbol. One solution for mitigating this, presuming rich graphical typing was desirable, would be to take a page from the way touch-type cell phones and tablets work, showing symbol maps on the screen in response to user input, with the maps being progressively refined as the user types to guide the user through constructing their desired input.

…just a thought :)

typing, like on phones... I have seen 2 major ways of doing this: hit a key multiple times to indicate the desired letter, with a certain timeout before it moves to the next character; or, type out characters, the phone shows the first/most-likely possibility, and hit a key a bunch of times to cycle through the options.

another idle thought would be some sort of graphical/touch-screen keyboard, but it would be a matter of finding a way to make it not suck. using on-screen inputs in Android devices and similar kind of sucks: pressure and sensitivity issues, comfort issues, lack of tactile feedback, smudges on the screen if one uses their fingers, and potentially scratches if one is using a stylus, ...

so, say, a touch-screen with these properties:
- similar in size to (or larger than) a conventional keyboard;
- resistant to smudging, fairly long-lasting, and easy to clean;
- soft contact surface (I'm thinking sort of like those gel insoles for shoes), so that ideally typing isn't an experience of constantly hitting a piece of glass with one's fingers (ideally, both impact pressure and responsiveness should be similar to a conventional keyboard, or at least a laptop keyboard);
- ideally, some sort of tactile feedback (so one can feel whether or not they are actually hitting the keys);
- being dynamically reprogrammable (say, any app which knows about the keyboard can change its layout when it gains focus, or alternatively the user can supply per-app keyboard layouts);
- maybe, there could be tabs to change between layouts, such as a US-ASCII tab, ...

with something like the above being common, I can more easily imagine people using non-ASCII-based input methods. say, one is typing in US-ASCII, hits a math-symbol layout where, for example, the numeric keypad (or maybe the whole rest of the keyboard) is replaced by a grid of math symbols, or maybe also have a drawing tablet tab,
Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)
On 3/12/2012 9:01 PM, David Barbour wrote: On Mon, Mar 12, 2012 at 8:13 PM, Julian Leviston jul...@leviston.net wrote: On 13/03/2012, at 1:21 PM, BGB wrote: although theoretically possible, I wouldn't really trust not having the ability to use conventional text editors whenever need-be (or mandate use of a particular editor). for most things I am using text-based formats, including for things like world-maps and 3D models (both are based on arguably mutilated versions of other formats: Quake maps and AC3D models). the power of text is that, if by some chance someone does need to break out a text editor and edit something, the format won't hinder them from doing so.

What is text? Do you store your text in ASCII, EBCDIC, SHIFT-JIS or UTF-8? If it's UTF-8, how do you use an ASCII editor to edit the UTF-8 files? Just sayin' ;-) Hopefully you understand my point. You probably won't initially, so hopefully you'll meditate a bit on my response without giving a knee-jerk reaction.

I typically work with the ASCII subset of UTF-8 (where ASCII and UTF-8 happen to be equivalent). most of the code is written to assume UTF-8, but the languages are designed to not depend on any characters outside the ASCII range (leaving them purely for comments, and for those few people who consider using them for identifiers). EBCDIC and SHIFT-JIS are sufficiently obscure that one can generally pretend that they don't exist (FWIW, I don't generally support codepages either).

a lot of code also tends to assume Modified UTF-8 (basically, the same variant of UTF-8 used by the JVM). typically, code will ignore things like character normalization or alternative orderings. a lot of code doesn't particularly know or care what the exact character encoding is. some amount of code internally uses UTF-16 as well, but this is less common, as UTF-16 tends to eat a lot more memory (and some code just pretends to use UTF-16, when really it is using UTF-8).
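[Editor's illustration] The ASCII/UTF-8 equivalence and the Modified UTF-8 difference mentioned above can be demonstrated in a few lines (Python here, purely illustrative):

```python
# For pure ASCII text, the ASCII and UTF-8 encodings produce identical bytes.
s = "var str = some_text;"
assert s.encode("ascii") == s.encode("utf-8")

# Standard UTF-8 encodes U+0000 as a single NUL byte...
assert "\x00".encode("utf-8") == b"\x00"

# ...whereas Modified UTF-8 (the JVM variant) uses the overlong two-byte
# form 0xC0 0x80 for U+0000, so encoded strings never contain a raw NUL
# (convenient for C-style NUL-terminated string handling).
MODIFIED_UTF8_NUL = b"\xc0\x80"
assert MODIFIED_UTF8_NUL != "\x00".encode("utf-8")
print("ASCII subset and Modified UTF-8 checks passed")
```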
Text is more than an arbitrary arcane linear sequence of characters. Its use suggests TRANSPARENCY - that a human could understand the grammar and content from a relatively small sample, and effectively hand-modify the content to a particular end. If much of our text consisted of GUIDs: {21EC2020-3AEA-1069-A2DD-08002B30309D} This might as well be {BLAHBLAH-BLAH-BLAH-BLAH-BLAHBLAHBLAH} The structure is clear, but its meaning is quite opaque.

yep. this is also a goal, and many of my formats are designed to at least try to be human-editable. some number of them are still often hand-edited as well (such as texture information files).

That said, structured editors are not incompatible with an underlying text format. I think that's really the best option.

yes. for example, several editors/IDEs have expand/collapse, but still use plaintext for the source-code. Visual Studio and Notepad++ are examples of this, and a more advanced editor could do better (such as expand/collapse on arbitrary code blocks). there are also things like auto-completion, ..., which are also nifty and work fine with text.

Regarding multi-line quotes... well, if you aren't fixated on ASCII you could always use Unicode to find a whole bunch more brackets: http://www.fileformat.info/info/unicode/block/cjk_symbols_and_punctuation/images.htm http://www.fileformat.info/info/unicode/block/miscellaneous_technical/images.htm http://www.fileformat.info/info/unicode/block/miscellaneous_mathematical_symbols_a/images.htm Probably more than you know what to do with.

AFAIK, the common consensus in much of programmer-land is that using Unicode characters as part of the basic syntax of a programming language borders on evil... I ended up using: [[ ... ]] and: """ ... """ (basically, the same syntax as Python). these seem probably like good-enough choices. currently, the [[ and ]] braces are not real tokens, and so will only be parsed specially as such in the particular contexts where they are expected to appear.
so, if one types:

2<[[3, 4], [5, 6]]

the '<' will be parsed as a less-than operator. but, if one writes instead:

var str=[[
some text...
more text...
]];

it will parse as a multi-line string... both types of string are handled specially by the parser (rather than being handled by the tokenizer, as are normal strings).

or such...
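[Editor's illustration] The context-dependent treatment of [[ that BGB describes (a real token only where a string literal may begin) can be shown with a toy scanner. This Python sketch is not BGB's parser, just a minimal illustration of the idea:

```python
# Toy scanner: '[[' opens a block string only where a value may start
# (here, right after '='); otherwise the brackets stay ordinary tokens.
def scan(src):
    tokens, i, expect_value = [], 0, False
    while i < len(src):
        c = src[i]
        if src.startswith("[[", i) and expect_value:
            end = src.index("]]", i + 2)          # unterminated -> ValueError
            tokens.append(("blockstr", src[i + 2:end]))
            i, expect_value = end + 2, False
        elif c == "=":
            tokens.append(("op", c))
            i, expect_value = i + 1, True
        elif c.isspace():
            i += 1
        else:
            tokens.append(("ch", c))
            i, expect_value = i + 1, False
    return tokens

print(scan("x=[[ some text ]]"))
# [('ch', 'x'), ('op', '='), ('blockstr', ' some text ')]
```

After an identifier such as in a[[3]], expect_value is False, so the same two characters fall through as two ordinary brackets, which is exactly the parser-level (not tokenizer-level) disambiguation described above.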
Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)
On Tue, Mar 13, 2012 at 5:42 AM, Josh Grams j...@qualdan.com wrote: On 2012-03-13 02:13PM, Julian Leviston wrote: What is text? Do you store your text in ASCII, EBCDIC, SHIFT-JIS or UTF-8? If it's UTF-8, how do you use an ASCII editor to edit the UTF-8 files? Just sayin' ;-) Hopefully you understand my point. You probably won't initially, so hopefully you'll meditate a bit on my response without giving a knee-jerk reaction.

OK, I've thought about it and I still don't get it. I understand that there have been a number of different text encodings, but I thought that the whole point of Unicode was to provide a future-proof way out of that mess. And I could be totally wrong, but I have the impression that it has pretty good penetration. I gather that some people who use the Cyrillic alphabet often use some code page, and China and Japan use SHIFT-JIS or whatever in order to have a more compact representation, but that even there UTF-8 tools are commonly available. So I would think that the sensible thing would be to use UTF-8 and figure that anyone (now or in the future) will have tools which support it, and that anyone dedicated enough to go digging into your data files will have no trouble at all figuring out what it is. If that's your point it seems like a pretty minor nitpick. What am I missing?

Julian's point, AFAICT, is that text is just a class of storage that requires appropriate viewers and editors, and doesn't even describe a specific standard. Thus, another class that requires appropriate viewers and editors can work just as well - spreadsheets, tables, drawings. You mention `data files`. What is a `file`? Is it not a service provided by a `file system`? Can we not just as easily hide a storage format behind a standard service more convenient for ad-hoc views and analysis (perhaps an RDBMS)? Why organize into files? Other than penetration, they don't seem to be especially convenient.

Penetration matters, which is one reason that text and filesystems matter. But what else has penetrated? Browsers. Wikis. Web services. It wouldn't be difficult to support editing of tables, spreadsheets, drawings, etc. atop a web service platform. We probably have more freedom today than we've ever had for language design, if we're willing to stretch just a little bit beyond the traditional filesystem+text-editor framework.

Regards, Dave
Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)
I couldn't agree more. Text and files are just encoding and packaging. We routinely represent the same information in different ways during different stages of a program or system's lifecycle in order to obtain advantages relevant to the processing problems at hand.

In the past, it has been convenient to encourage ubiquitous use of standard encoding (ASCII) and packaging (files) in exchange for the obvious benefits of simplicity, access to common tooling that understands those standards, and interchange between systems. However, if we set simplicity aside for the moment, the goals of access and interchange can be accomplished by means of mapping. It is not essential to maintain ubiquitous lowest-common-denominator standards if suitable mapping functions exist.

My personal feeling is that the design of practical next-generation languages and tools has been retarded for a very long time by an unexamined emotional need to cling to common historical standards that are insufficient to support the needs of forward-looking language concepts. For example, if we look beyond system interchange, the most significant value of core ASCII is its relatively good impedance match to keys found on most computer keyboards. When standard typewriter keyboards were the ubiquitous form of data entry, this was an overwhelmingly important consideration. However, we long ago broke the chains of this relationship: data entry routinely encompasses entry from pointer devices such as mice and trackballs, tablets of various descriptions, incomplete keyboards such as numeric keypads, game controllers, etc. These axes of expression are not represented in the graphology of ASCII.

In this world, the impedance mismatch to ASCII (and UNICODE, which could be seen as ASCII++, since it offers more glyphs but makes little attempt to increase the core semantics of the graphology offered) invites examination. In this world, it seems to me that core expressiveness of a graphology trumps ubiquity.

I'd like to see more languages being bold and looking beyond ASCII-derived symbology to find graphologies that allow for more powerful representation and manipulation of modern ontologies.

A concrete example: ASCII only allows "to the right of" as a first-class relationship in its representation ontology. (The word "at" is formed as the character "t" to the right of the character "a".) Even concepts such as "next line" or "backspace" are second-order concepts encoded by reserved symbols borrowed from the representable namespace. Advanced but still fundamental concepts such as "subordinate to" (i.e., subscript) are only available in so-called "RichText" systems. Even more powerful concepts like "contains" (for example, a word which is composed of the symbol "O" containing inside it the symbol "c") are not representable at all in the commonly available graphologies.

The people who attempt to express mathematical formulae routinely grapple with these limitations. Even where a character set includes a root symbol, the underlying graphology does not implement rules by which characters can be arranged around it to represent "the third root of x".

Many of the excruciating design exercises language designers go through these days are largely driven by limitations of the ASCII++ graphology we assume to be sacrosanct. (For example, the parts of this discussion thread analyzing the use of various compound-character combinations which intrude all the way to the parsing layer of a language because the core ASCII graphology doesn't feature enough bracket symbols.) This barrier is artificial, historic in nature, and need no longer constrain us, because we have the luxury of modern high-powered computing systems which allow us to impose abstraction in important ways that were historically infeasible, letting us achieve new kinds of expressive power and simplicity.
-- Mack

On Mar 13, 2012, at 8:11 AM, David Barbour wrote: On Tue, Mar 13, 2012 at 5:42 AM, Josh Grams j...@qualdan.com wrote: On 2012-03-13 02:13PM, Julian Leviston wrote: What is text? Do you store your text in ASCII, EBCDIC, SHIFT-JIS or UTF-8? If it's UTF-8, how do you use an ASCII editor to edit the UTF-8 files? Just sayin' ;-) Hopefully you understand my point. You probably won't initially, so hopefully you'll meditate a bit on my response without giving a knee-jerk reaction.

OK, I've thought about it and I still don't get it. I understand that there have been a number of different text encodings, but I thought that the whole point of Unicode was to provide a future-proof way out of that mess. And I could be totally wrong, but I have the impression that it has pretty good penetration. I gather that some people who use the Cyrillic alphabet often use some code page and China and Japan use SHIFT-JIS or whatever in order to have a more compact representation, but that even there UTF-8 tools are commonly available. So I would think
Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)
On 14/03/2012, at 2:11 AM, David Barbour wrote: On Tue, Mar 13, 2012 at 5:42 AM, Josh Grams j...@qualdan.com wrote: On 2012-03-13 02:13PM, Julian Leviston wrote: What is text? Do you store your text in ASCII, EBCDIC, SHIFT-JIS or UTF-8? If it's UTF-8, how do you use an ASCII editor to edit the UTF-8 files? Just saying' ;-) Hopefully you understand my point. You probably won't initially, so hopefully you'll meditate a bit on my response without giving a knee-jerk reaction. OK, I've thought about it and I still don't get it. I understand that there have been a number of different text encodings, but I thought that the whole point of Unicode was to provide a future-proof way out of that mess. And I could be totally wrong, but I have the impression that it has pretty good penetration. I gather that some people who use the Cyrillic alphabet often use some code page and China and Japan use SHIFT-JIS or whatever in order to have a more compact representation, but that even there UTF-8 tools are commonly available. So I would think that the sensible thing would be to use UTF-8 and figure that anyone (now or in the future) will have tools which support it, and that anyone dedicated enough to go digging into your data files will have no trouble at all figuring out what it is. If that's your point it seems like a pretty minor nitpick. What am I missing? Julian's point, AFAICT, is that text is just a class of storage that requires appropriate viewers and editors, doesn't even describe a specific standard. Thus, another class that requires appropriate viewers and editors can work just as well - spreadsheets, tables, drawings. You mention `data files`. What is a `file`? Is it not a service provided by a `file system`? Can we not just as easily hide a storage format behind a standard service more convenient for ad-hoc views and analysis (perhaps RDBMS). Why organize into files? Other than penetration, they don't seem to be especially convenient. 
Penetration matters, which is one reason that text and filesystems matter. But what else has penetrated? Browsers. Wikis. Web services. It wouldn't be difficult to support editing of tables, spreadsheets, drawings, etc. atop a web service platform. We probably have more freedom today than we've ever had for language design, if we're willing to stretch just a little bit beyond the traditional filesystem+text-editor framework. Regards, Dave Perfectly the point, David. A token/character in ASCII is equivalent to a byte. In SHIFT-JIS, it's two, but this doesn't mean you can't express the equivalent meaning in them (ie by selecting the same graphemes) - this is called translation) ;-) One of the most profound things for me has been understanding the ramifications of OMeta. It doesn't just parse streams of characters (whatever they are) in fact it doesn't care what the individual tokens of its parsing stream is. It's concerned merely with the syntax of its elements (or tokens) - how they combine to form certain rules - (here I mean valid patterns of grammar by rules). If one considers this well, it has amazing ramifications. OMeta invites us to see the entire computing world in terms of sets of problem-oriented-languages, where language is a liberal word that simply means a pattern of sequence of the constituent elements of a thing. To PEG, it basically adds proper translation and true object-orientism of individual parsing elements. This takes a while to understand, I think. Formats here become languages, protocols are languages, and so are any other kind of representation system you care to name (computer programming languages, processor instruction sets, etc.). I'm postulating, BGB, that you're perhaps so ingrained in the current modality and approach to thinking about computers, that you maybe can't break out of it to see what else might be possible. I think it was turing, wasn't it, who postulated that his turing machines could work off ANY symbols... 
so if that's the case, and your programming language grammar has a set of symbols, why not use arbitrary (ie not composed of English letters) ideograms for them? (I think these days we call these things icons ;-)) You might say "but how will people name their variables" - well, perhaps for those things you could use English letters, but maybe you could enforce that no one use more than 30 variables in their code in any one simple chunk, in which case build them in with the other ideograms.

I'm not attempting to build any kind of authoritative status here, merely provoke some different thought in you. I'll take Dave's point that penetration matters, and at the same time, most new ideas have old idea constituents, so you can easily find some matter for people stuck in the old methodologies and thinking to relate to when building your new stuff ;-)

Regards, Julian

___ fonc mailing list fonc@vpri.org http://vpri.org/mailman/listinfo/fonc
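Julian's point about OMeta above can be made concrete: a PEG-style matcher only needs predicates over its tokens, so the very same machinery parses characters, bytes, or structured records. The sketch below is an illustrative toy under that assumption; the names are mine, not OMeta's actual API.

```python
# Toy PEG-style matcher that never assumes its tokens are characters.
# Illustrative only: this is not OMeta's real interface.

def match_seq(pattern, tokens):
    """True if each predicate in `pattern` accepts the token at its position."""
    if len(tokens) < len(pattern):
        return False
    return all(pred(tok) for pred, tok in zip(pattern, tokens))

# Over characters...
is_digit = lambda t: isinstance(t, str) and t.isdigit()
assert match_seq([is_digit, is_digit], list("42"))

# ...and, unchanged, over structured "tokens" such as instruction tuples,
# which is the sense in which formats and instruction sets become languages:
is_op = lambda t: isinstance(t, tuple) and t[0] == "op"
assert match_seq([is_op, is_op], [("op", "load"), ("op", "add")])
```

The grammar rules care only about how tokens combine, never about what a token physically is.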
Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)
On 3/13/2012 4:37 PM, Julian Leviston wrote: (earlier quoted discussion trimmed; it appears in full in the preceding message)
Other than penetration, they don't seem to be especially convenient. Penetration matters, which is one reason that text and filesystems matter. But what else has penetrated? Browsers. Wikis. Web services. It wouldn't be difficult to support editing of tables, spreadsheets, drawings, etc. atop a web service platform. We probably have more freedom today than we've ever had for language design, if we're willing to stretch just a little bit beyond the traditional filesystem+text-editor framework. Regards, Dave

Perfectly the point, David. A token/character in ASCII is equivalent to a byte. In SHIFT-JIS, it's two, but this doesn't mean you can't express the equivalent meaning in them (ie by selecting the same graphemes) - this is called translation ;-)

this is partly why there are codepoints. one can work in terms of codepoints, rather than bytes. a text editor may internally work in UTF-16, but saves its output in UTF-8 or similar. ironically, this is basically what I am planning/doing at the moment. now, if/how the user will go about typing UTF-16 codepoints, this is not yet decided.

One of the most profound things for me has been understanding the ramifications of OMeta. It doesn't just parse streams of characters (whatever they are); in fact it doesn't care what the individual tokens of its parsing stream are. It's concerned merely with the syntax of its elements (or tokens) - how they combine to form certain rules (here I mean valid patterns of grammar by rules). If one considers this well, it has amazing ramifications. OMeta invites us to see the entire computing world in terms of sets of problem-oriented languages, where language is a liberal word that simply means a pattern of sequence of the constituent elements of a thing. To PEG, it basically adds proper translation and true object-orientism of individual parsing elements. This takes a while to understand, I think.
Formats here become languages, protocols are languages, and so are any other kind of representation system you care to name (computer programming languages, processor instruction sets, etc.).

possibly. I was actually sort of aware of a lot of this already though, but didn't consider it particularly relevant.

I'm postulating, BGB, that you're perhaps so ingrained in the current modality and approach to thinking about computers, that you maybe can't break out of it to see what else might be possible. I think it was Turing, wasn't it, who postulated that his Turing machines could work off ANY symbols... so if that's the case, and your programming language grammar has a set of symbols, why not use arbitrary (ie not composed of English letters) ideograms for them? (I think these days we call these things icons ;-))
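BGB's codepoints-vs-bytes remark above is easy to demonstrate: the same three-codepoint string occupies different numbers of bytes under UTF-8 and UTF-16, yet round-trips losslessly between the two. A small Python sketch of the idea:

```python
# The "same text" at the codepoint level, under two byte-level encodings.
# This mirrors the editor scenario above: UTF-16 in memory, UTF-8 on disk.
s = "日本語"                    # three codepoints
assert len(s) == 3             # length counts codepoints, not bytes

utf8 = s.encode("utf-8")       # 3 bytes per codepoint here -> 9 bytes on disk
utf16 = s.encode("utf-16-le")  # 2 bytes per codepoint here -> 6 bytes in-buffer
assert len(utf8) == 9
assert len(utf16) == 6

# Translating between representations preserves the codepoints exactly:
assert utf16.decode("utf-16-le") == utf8.decode("utf-8") == s
```

Working "in terms of codepoints" means all of the editor's logic runs on `s`, and the byte encodings exist only at the I/O boundary.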
Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)
Since it's your own system end-to-end, why not just stop editing source as a stream of ascii characters? Some kind of simple structured editor would let you put whatever you please in strings without requiring any escaping at all. It'd also make the parsing simpler :)

-- Enjoy every sandwich. - WZ Josh 'G-Funk' McDonald - j...@joshmcdonald.info

On 11 March 2012 03:38, BGB cr88...@gmail.com wrote: (quoted message trimmed; it appears in full later in this digest)
Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)
On 3/12/2012 6:31 PM, Josh McDonald wrote: Since it's your own system end-to-end, why not just stop editing source as a stream of ascii characters? Some kind of simple structured editor would let you put whatever you please in strings without requiring any escaping at all. It'd also make the parsing simpler :)

although theoretically possible, I wouldn't really trust not having the ability to use conventional text editors whenever need-be (or mandate use of a particular editor). for most things I am using text-based formats, including for things like world-maps and 3D models (both are based on arguably mutilated versions of other formats: Quake maps and AC3D models). the power of text is that, if by some chance someone does need to break out a text editor and edit something, the format won't hinder them from doing so.

but, yes, that Inventing on Principle / Magic Ink video did rather get my interest up in terms of wanting to support a much more streamlined script-editing interface. I recently had a bit of fun writing small script fragments to blow up light sources and other things, and figure if I can get a more advanced text-editing interface thrown together, more interesting things might also be possible. "blow the lights": all nearby light sources explode (with fiery particle explosion effects and sounds), and the area goes dark.

current leaning is to try to throw something together vaguely QBasic-like (with a proper text editor, and probably F5 as the Run key, ...). as-is, I already have an ed / edlin-style text editor, and ALT + 1-9 as console-change keys (and now have multiple consoles, sort of like Linux or similar), ... was considering maybe the fancier text editor would use ALT-SHIFT + A-Z for switching between modules. will see what I can do here. or such...
Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)
On 13/03/2012, at 1:21 PM, BGB wrote: although theoretically possible, I wouldn't really trust not having the ability to use conventional text editors whenever need-be (or mandate use of a particular editor). for most things I am using text-based formats, including for things like world-maps and 3D models (both are based on arguably mutilated versions of other formats: Quake maps and AC3D models). the power of text is that, if by some chance someone does need to break out a text editor and edit something, the format won't hinder them from doing so.

What is text? Do you store your text in ASCII, EBCDIC, SHIFT-JIS or UTF-8? If it's UTF-8, how do you use an ASCII editor to edit the UTF-8 files? Just sayin' ;-) Hopefully you understand my point. You probably won't initially, so hopefully you'll meditate a bit on my response without giving a knee-jerk reaction.

Julian
Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)
On Mon, Mar 12, 2012 at 8:13 PM, Julian Leviston jul...@leviston.net wrote: (quoted exchange trimmed; it appears in full in the preceding messages)

Text is more than an arbitrary arcane linear sequence of characters. Its use suggests TRANSPARENCY - that a human could understand the grammar and content from a relatively small sample, and effectively hand-modify the content to a particular end. If much of our text consisted of GUIDs:

{21EC2020-3AEA-1069-A2DD-08002B30309D}

This might as well be

{BLAHBLAH-BLAH-BLAH-BLAH-BLAHBLAHBLAH}

The structure is clear, but its meaning is quite opaque. That said, structured editors are not incompatible with an underlying text format. I think that's really the best option.

Regarding multi-line quotes... well, if you aren't fixated on ASCII you could always use Unicode to find a whole bunch more brackets:

http://www.fileformat.info/info/unicode/block/cjk_symbols_and_punctuation/images.htm
http://www.fileformat.info/info/unicode/block/miscellaneous_technical/images.htm
http://www.fileformat.info/info/unicode/block/miscellaneous_mathematical_symbols_a/images.htm

Probably more than you know what to do with.
Regards, Dave
[fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)
On 3/10/2012 2:21 AM, Wesley Smith wrote:

most notable thing I did recently (besides some fiddling with getting a new JIT written), was adding a syntax for block-strings. I used [[ ... ]] rather than triple-quotes (like in Python), mostly as this syntax is more friendly to nesting, and is also fairly unlikely to appear by accident, and couldn't come up with much obviously better at the moment. {{ ... }} was another considered option (but is IIRC already used for something), as was the option of just using triple-quote (would work, but isn't readily nestable).

You should have a look at Lua's long string syntax if you haven't already: [[ my long string ]]

this was briefly considered, but would have a much higher risk of clashes. consider someone wants to type a nested array: [[1, 2, 3], [4, 5, 6], [7, 8, 9]] which is not so good if this array is (randomly) parsed as a string. preferable is to try to avoid syntax which is likely to appear by chance, as then programmers have to use extra caution to avoid any magic sigils which might have unintended behaviors, but can pop up randomly as a result of typing code using only more basic constructions (I try to avoid this much as I do ambiguities in general, and is partly also why, IMO, the common A<S, T> syntax for templates/generics is a bit nasty).

the syntax [[ ... ]] was chosen as it had little chance of clashing with other valid syntax (apart from, potentially, the CDATA end marker for XML, which at present would need to be escaped if using this syntax for globs of XML). it is possible, as the language does include unary and operators, which could, conceivably, be applied to a nested array. this is, however, rather unlikely, and could be fixed easily enough with a space.

as-is, they have an even-nesting rule. WRT uneven-nesting, they can be escaped via '\' (don't really like, as it leaves the character as magic...): [[ this string has an embedded \]]... but this is ok. ]]

OTOH (other remote possibilities): { ... } was already used for insert-here expressions in XML literals: <foo>{generateSomeNode()}</foo>

(...) or ((...)) just wouldn't work (high chance of collision).

#(...), #[...], and #{...} are already in use (tuple, float vector or matrix, and list). example: vector: #[0, 0, 0]; quaternion: #[0, 0, 0, 1]Q; matrix: #[[1, 0, 0] [0, 1, 0] [0, 0, 1]]; list: #{#foo, 2, 3; #v}. note: (...) parens, [...] array, {...} dictionary/object (example: {a: 3, y: 4}). @(...), @[...], and @{...} are still technically available.

also possible: /[...]/ , /[[...]]/ would be passable mostly only as /.../ is already used for regex syntax (inherited from JS...). hmm: ? ... ? ? ... ? (available, currently syntactically invalid). likewise: \ ... \, ... | ... | ...

so, the issue is mostly lacking sufficient numbers of available (good) brace types. in a few other cases, this lack has been addressed via the use of keywords and type-suffixes. but, a keyword would be lame for a string, and a suffix wouldn't work.

You can nest by matching the number of '=' between the brackets: [===[ a long string [=[ with a long string inside it ]=] xx ]===]

this would be possible, as otherwise this syntax would not be syntactically valid in the language. [=[...]=] would be at least possible. not that I particularly like this syntax though...

(inlined): On 3/10/2012 2:43 AM, Ondřej Bílka wrote:

On Sat, Mar 10, 2012 at 01:21:42AM -0800, Wesley Smith wrote: You should have a look at Lua's long string syntax if you haven't already:

Better to be consistent with rest of scripting languages (bash, ruby, perl, python) and use heredocs.

blarg... heredoc syntax is nasty IMO... I deliberately didn't use heredocs. if I did, I would probably use the syntax: #END; ... END or similar...

Python uses triple-quotes, which I had also considered (just, they couldn't nest): """ lots of stuff... over multiple lines... """

this would mean: [[ lots of stuff... over multiple lines... ]] possibly also with the Python syntax: """...""" or such...
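The even-nesting rule described above could look something like the following sketch: a scanner that tracks [[ / ]] depth so that balanced bracket pairs stay inside the string. This is a hypothetical reimplementation for illustration, not BGB's actual parser (and it omits the '\' escape for uneven nesting).

```python
# Sketch of an even-nesting [[ ... ]] block-string scanner (hypothetical,
# not BGB's real code). Balanced inner [[ ]] pairs are kept in the string.
def scan_block_string(src, start):
    """src[start:start+2] must be '[['; returns (contents, index past ']]')."""
    assert src[start:start+2] == "[["
    depth, i = 1, start + 2
    while i < len(src) - 1:
        pair = src[i:i+2]
        if pair == "[[":
            depth += 1
            i += 2
        elif pair == "]]":
            depth -= 1
            i += 2
            if depth == 0:                 # matching outer close found
                return src[start+2:i-2], i
        else:
            i += 1
    raise SyntaxError("unterminated block string")

text, end = scan_block_string("[[ a [[nested]] string ]] rest", 0)
assert text == " a [[nested]] string "
```

With this rule, a nested-array expression like [[1, 2, 3], [4, 5, 6]] would still mis-scan as a string opener, which is exactly why the surrounding discussion worries about collision-prone brace choices.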