Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)

2012-03-14 Thread Loup Vaillant

BGB wrote:

   On 3/13/2012 4:37 PM, Julian Leviston wrote:

I'll take Dave's point that penetration matters, and at the same time,
most new ideas have old idea constituents, so you can easily find
some matter for people stuck in the old methodologies and thinking to
relate to when building your new stuff ;-)



well, it is like using alternate syntax designs (say, not a C-style
curly brace syntax).

one can do so, but is it worth it?
in such a case, the syntax is no longer what most programmers are
familiar or comfortable with, and it is more effort to convert code
to/from the language, ...


Alternate syntaxes are not always as awkward as you seem to think they
are, especially the specialized ones.  The trick is to ask yourself how
you would have written such and such a piece of program if there were no
pesky parser to satisfy.  Or how you would have written a complete spec
in the comments.  Then you write the parser which accepts such input.

My point is, new syntaxes don't always have to be unfamiliar.
For instance:

+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+---+---+---+---+---+---+---+---+
|    foo    |        bar        |
+---+---+---+---+---+---+---+---+
|              baz              |
+---+---+---+---+---+---+---+---+

It should be obvious to anyone who has read an RFC (or a STEPS progress
report) that it describes a bit field (16 bits large, with 3 fields).
And those who didn't should have learned this syntax by now.

Now the only question left is, is it worth the trouble _implementing_
the syntax?  Considering that code is more often read than written,
I'd say it often is.  Even if the code that parses the syntax isn't
crystal clear, what the syntax should mean is.
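
For illustration, a rough sketch of such a parser in Python (not from the
original mail; the cell-width convention is simply read off the diagram
above, so it is an assumption):

# Sketch: parse an RFC-style bit-field diagram into (name, width) pairs.
# Assumes the layout shown above: rows delimited by '|', one header row
# of bit numbers, and 4 characters per bit, so a field spanning n bits
# occupies 4*n - 1 characters between its '|' separators.

def parse_bitfield(diagram):
    rows = [ln for ln in diagram.strip().splitlines() if ln.startswith('|')]
    header, *field_rows = rows   # header row (bit numbers) is skipped
    fields = []
    for row in field_rows:
        for cell in row.strip('|').split('|'):
            fields.append((cell.strip(), (len(cell) + 1) // 4))
    return fields

diagram = """
+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+---+---+---+---+---+---+---+---+
|    foo    |        bar        |
+---+---+---+---+---+---+---+---+
|              baz              |
+---+---+---+---+---+---+---+---+
"""

print(parse_bitfield(diagram))   # [('foo', 3), ('bar', 5), ('baz', 8)]

Twenty-odd lines, and the diagram in the source *is* the spec.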

You could also play the human compiler: use the better syntax in the
comments, and implement a translation of it in code just below.  But
then you have to manually make sure they are synchronized.  Comments
are good.  Needing them is bad.

Loup.
___
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc


Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)

2012-03-14 Thread Loup Vaillant

Michael FIG wrote:

Loup Vaillant <l...@loup-vaillant.fr> writes:


You could also play the human compiler: use the better syntax in the
comments, and implement a translation of it in code just below.  But
then you have to manually make sure they are synchronized.  Comments
are good.  Needing them is bad.


Or use a preprocessor that substitutes the translation inline
automatically.


Which is a way of implementing the syntax… How is this different than
my "Then you write the parser"?  Sure you can use a preprocessor, but
you still have to write the macros for your new syntax.
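
As a rough sketch of that preprocessor route (the delimiters and the
generator are hypothetical, purely for illustration), in Python:

# Sketch: scan a source file for spec blocks delimited by "/*spec" and
# "spec*/" comments, run a generator over the spec text (say, the
# bit-field parser from earlier), and splice the generated code in
# right after the comment, keeping spec and code synchronized.
import re

SPEC = re.compile(r'/\*spec(.*?)spec\*/', re.DOTALL)

def expand(source, generate):
    # generate() maps spec text to code; the spec comment is kept, so
    # the human-readable form stays next to its machine translation
    return SPEC.sub(lambda m: m.group(0) + '\n' + generate(m.group(1)),
                    source)

Either way, someone writes the translation once; the difference is only
whether it runs by hand or in the build.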

Loup.
___
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc


Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)

2012-03-14 Thread BGB

On 3/14/2012 8:57 AM, Loup Vaillant wrote:

Michael FIG wrote:

Loup Vaillant <l...@loup-vaillant.fr> writes:


You could also play the human compiler: use the better syntax in the
comments, and implement a translation of it in code just below.  But
then you have to manually make sure they are synchronized.  Comments
are good.  Needing them is bad.


Or use a preprocessor that substitutes the translation inline
automatically.


Which is a way of implementing the syntax… How is this different than
my "Then you write the parser"?  Sure you can use a preprocessor, but
you still have to write the macros for your new syntax.



in my case, this can theoretically be done already (writing new 
customized parsers), and was part of why I added block-strings.


the most likely route would be translating code into ASTs, and maybe using 
something like (defmacro) or similar at the AST level.


another route could be, I guess, to make use of "quote" and "unquote", 
both of which can be used as expression-building features (functionally, 
they are vaguely similar to quasiquote in Scheme, but they haven't 
enjoyed so much use thus far).
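
a rough sketch of the general idea (in Python, with a made-up tuple-based 
AST; my actual VM internals are not shown here):

# Sketch: AST-level expression building in the spirit of quote/unquote
# (quasiquote).  quote() builds a node without evaluating it, and a
# defmacro-style rewrite assembles new trees out of the pieces.

def quote(op, *args):
    # build an AST node without evaluating it
    return (op,) + args

def expand_unless(cond_ast, body_ast):
    # macro-style rewrite: (unless c b) => (if (not c) b)
    return quote('if', quote('not', cond_ast), body_ast)

tree = expand_unless(quote('var', 'done'), quote('call', 'retry'))
print(tree)   # ('if', ('not', ('var', 'done')), ('call', 'retry'))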



a more practical matter though would be getting things nailed down 
enough so that larger parts of the system can be written in a language 
other than C.


yes, there is the FFI (generally seems to work fairly well), and one can 
shove script closures into C-side function pointers (provided arguments 
and return types are annotated and the types match exactly, but I don't 
entirely trust its reliability, ...).


slightly nicer would be if code could be written in various places which 
accept script objects (either via interfaces or ex-nihilo objects).


abstract example (ex-nihilo object):
var obj={render: function() { ... } ... };
lbxModelRegisterScriptObject("models/script/somemodel", obj);

so, if some code elsewhere creates an object using the given model-name, 
then the script code is invoked to go about rendering it.


alternatively, using an interface:
public interface IRender3D { ... }  // contents omitted for brevity
public class MyObject implements IRender3D { ... }
lbxModelRegisterScriptObject("models/script/somemodel", new MyObject());
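
a sketch of roughly what the registry would look like on the native side 
(Python standing in for the C API; only the names from the example above 
are real, the rest is assumption):

# Sketch: a name -> script-object registry.  Rendering code looks up
# the model name and invokes the script-side render() method.

_script_models = {}

def lbxModelRegisterScriptObject(name, obj):
    _script_models[name] = obj

def render_model(name):
    obj = _script_models.get(name)
    if obj is not None:
        obj.render()            # call back into script code

class MyObject:
    def render(self):
        print("rendering somemodel")

lbxModelRegisterScriptObject("models/script/somemodel", MyObject())
render_model("models/script/somemodel")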

granted, there are probably better (and less likely to kill performance) 
ways to make use of script objects (as-is, using script code to write 
objects for use in the 3D renderer is not likely to turn out well 
regarding the framerate and similar, at least until if/when there is a 
good solid JIT in place, and it can compete more on equal terms with C 
regarding performance).



mostly the script language was intended for use in the game's server 
end, where typically raw performance is less critical, but as-is, there 
is still a bit of a language-border issue that would need to be worked 
on here (I originally intended to write the server end mostly in script, 
but at the time the VM was a little less solid (poorer performance, more 
prone to leak memory and trigger GC, ...), and so the server end was 
written more quick-and-dirty in plain C, using a design fairly similar 
to a mix of the Quake 1 and 2 server-ends). as-is, it is not entirely 
friendly to the script code, so a little more work is needed.


another possible use case is related to world-construction tasks 
(procedural world-building and similar).


but, yes, all of this is a bit more of a mundane way of using a 
scripting language, but then again, everything tends to be built from 
the bottom up (and this just happens to be where I am currently at, at 
this point in time).


(maybe at the point in time when I am stuck less worrying about which 
language is used where, and about cross-language interfacing issues, 
allowing things like alternative syntax, ... could be more worth 
exploration. but, in many areas, both C and C++ have a bit of a gravity 
well...).



or such...

___
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc


Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)

2012-03-14 Thread Mack

On Mar 13, 2012, at 6:27 PM, BGB wrote:

<SNIP>
 the issue is not that I can't imagine anything different, but rather that 
 doing anything different would be a hassle with current keyboard technology:
 pretty much anyone can type ASCII characters;
 many other people have keyboards (or key-mappings) that can handle 
 region-specific characters.
 
 however, otherwise, typing unusual characters (those outside their current 
 keyboard mapping) tends to be a bit more painful, and/or introduces editor 
 dependencies, and possibly increases the learning curve (now people have to 
 figure out how these various unorthodox characters map to the keyboard, ...).
 
 more graphical representations, however, have a secondary drawback:
 they can't be manipulated nearly as quickly or as easily as text.
 
 one could be like drag and drop, but the problem is that drag and drop is 
 still a fairly slow and painful process (vs, hitting keys on the keyboard).
 
 
 yes, there are scenarios where keyboards aren't ideal:
 such as on an XBox360 or an Android tablet/phone/... or similar, but people 
 probably aren't going to be using these for programming anyways, so it is 
 likely a fairly moot point. 
 
 however, even in these cases, it is not clear there are many clearly better 
 options either (on-screen keyboard, or on-screen tile selector, either way it 
 is likely to be painful...).
 
 
 simplest answer:
 just assume that current text-editor technology is basically sufficient and 
 call it good enough.

Stipulating that having the keys on the keyboard mean what the painted symbols 
show is the simplest path with the least impedance mismatch for the user, 
there are already alternatives in common use that bear thinking on.  For 
example:

On existing keyboards, multi-stroke operations to produce new characters 
(holding down shift key to get CAPS, CTRL-ALT-TAB-whatever to get a special 
character or function, etc…) are customary and have entered average user 
experience.

Users of IDEs like EMACS, IntelliJ or Eclipse are well-acquainted with special 
keystrokes to get access to code completions and intention templates.

So it's not inconceivable to consider a similar strategy for typing 
non-character graphical elements.  One could think of say… CTRL-O, UP ARROW, UP 
ARROW, ESC to type a circle and size it, followed by CTRL-RIGHT ARROW, C to 
enter the circle and type a c inside it.

An argument against these strategies is the same one against command-line 
interfaces in the CLI vs. GUI discussion: namely, that without visual 
prompting, the possibilities that are available to be typed are not readily 
visible to the user.  The user has to already know what combination gives him 
what symbol.

One solution for mitigating this, presuming rich graphical typing was 
desirable, would be to take a page from the way touch type cell phones and 
tablets work, showing symbol maps on the screen in response to user input, with 
the maps being progressively refined as the user types to guide the user 
through constructing their desired input.

…just a thought :)


<SNIP>
On Mar 13, 2012, at 6:27 PM, BGB also wrote:

 
 
 I'll take Dave's point that penetration matters, and at the same time, most 
 new ideas have old idea constituents, so you can easily find some matter 
 for people stuck in the old methodologies and thinking to relate to when 
 building your new stuff ;-)
 
 
 well, it is like using alternate syntax designs (say, not a C-style curly 
 brace syntax).
 
 one can do so, but is it worth it?
 in such a case, the syntax is no longer what most programmers are familiar or 
 comfortable with, and it is more effort to convert code to/from the language, 
 …

The degenerate endpoint of this argument (which, sadly, I encounter on a daily 
basis in the larger business-technical community) is "if it isn't Java, it is 
by definition alien and too uncomfortable (and therefore too expensive) to use."

We can protest the myopia inherent in that objection, but the sad fact is that 
perception and emotional comfort are more important to the average person's 
decision-making process than coldly rational analysis.  (I refer to this as the 
"Discount Shirt" problem.  Despite the fact that a garment bought at a discount 
store doesn't fit well and falls apart after the first washing… not actually 
fulfilling our expectations of what a shirt should do, so ISN'T really a shirt 
from a usability perspective… because it LOOKS like a shirt and the store CALLS 
it a shirt, we still buy it, telling ourselves we've bought a shirt.  Then we 
go home and complain that shirts are a failure.)

Given this hurdle of perception, I have come to the conclusion that the only 
reasonable way to make advances is to live in the world of use case-driven 
design and measure the success of a language by how well it fits the perceived 
shape of the problem to be solved, looking for familiarity on the part of the 
user by means of keeping semantic distance between the language 

Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)

2012-03-14 Thread BGB

On 3/14/2012 11:31 AM, Mack wrote:

On Mar 13, 2012, at 6:27 PM, BGB wrote:

<SNIP>

the issue is not that I can't imagine anything different, but rather that doing 
anything different would be a hassle with current keyboard technology:
pretty much anyone can type ASCII characters;
many other people have keyboards (or key-mappings) that can handle 
region-specific characters.

however, otherwise, typing unusual characters (those outside their current 
keyboard mapping) tends to be a bit more painful, and/or introduces editor 
dependencies, and possibly increases the learning curve (now people have to 
figure out how these various unorthodox characters map to the keyboard, ...).

more graphical representations, however, have a secondary drawback:
they can't be manipulated nearly as quickly or as easily as text.

one could be like drag and drop, but the problem is that drag and drop is 
still a fairly slow and painful process (vs, hitting keys on the keyboard).


yes, there are scenarios where keyboards aren't ideal:
such as on an XBox360 or an Android tablet/phone/... or similar, but people 
probably aren't going to be using these for programming anyways, so it is 
likely a fairly moot point.

however, even in these cases, it is not clear there are many clearly better 
options either (on-screen keyboard, or on-screen tile selector, either way it is likely 
to be painful...).


simplest answer:
just assume that current text-editor technology is basically sufficient and call it 
good enough.

Stipulating that having the keys on the keyboard mean what the painted symbols 
show is the simplest path with the least impedance mismatch for the user, there are 
already alternatives in common use that bear thinking on.  For example:

On existing keyboards, multi-stroke operations to produce new characters 
(holding down shift key to get CAPS, CTRL-ALT-TAB-whatever to get a special 
character or function, etc…) are customary and have entered average user 
experience.

Users of IDEs like EMACS, IntelliJ or Eclipse are well-acquainted with special 
keystrokes to get access to code completions and intention templates.

So it's not inconceivable to consider a similar strategy for typing non-character graphical elements.  One 
could think of say… CTRL-O, UP ARROW, UP ARROW, ESC to type a circle and size it, followed by CTRL-RIGHT 
ARROW, C to enter the circle and type a c inside it.

An argument against these strategies is the same one against command-line 
interfaces in the CLI vs. GUI discussion: namely, that without visual 
prompting, the possibilities that are available to be typed are not readily 
visible to the user.  The user has to already know what combination gives him 
what symbol.

One solution for mitigating this, presuming rich graphical typing was desirable, would 
be to take a page from the way touch type cell phones and tablets work, showing symbol 
maps on the screen in response to user input, with the maps being progressively refined as the user 
types to guide the user through constructing their desired input.

…just a thought :)


typing, like on phones...
I have seen 2 major ways of doing this:
hit a key multiple times to indicate the desired letter, with a certain 
timeout before it moves to the next character;
type out characters, phone shows the first/most-likely possibility, hit a 
key a bunch of times to cycle through the options.
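
the first scheme is simple enough to sketch (keypad layout assumed, and a 
space standing in for the timeout):

# Sketch: multi-tap decoding.  Repeated presses of a key cycle through
# that key's letters; a timeout (here, a space) separates letters that
# happen to live on the same key.

KEYPAD = {'2': 'abc', '3': 'def', '4': 'ghi', '5': 'jkl',
          '6': 'mno', '7': 'pqrs', '8': 'tuv', '9': 'wxyz'}

def multitap(taps):
    out = []
    for group in taps.split():             # '44 33' -> ['44', '33']
        letters = KEYPAD[group[0]]
        out.append(letters[(len(group) - 1) % len(letters)])
    return ''.join(out)

print(multitap('44 33 555 555 666'))       # hello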



another idle thought would be some sort of graphical/touch-screen 
keyboard, but it would be a matter of finding a way to make it not suck. 
using on-screen inputs in Android devices and similar kind of sucks:
pressure and sensitivity issues, comfort issues, lack of tactile 
feedback, smudges on the screen if one uses their fingers, and 
potentially scratches if one is using a stylus, ...


so, say, a touch-screen with these properties:
similar in size to (or larger than) a conventional keyboard;
resistant to smudging, fairly long lasting, and easy to clean;
soft contact surface (me thinking sort of like those gel insoles for 
shoes), so that ideally typing isn't an experience of constantly hitting 
a piece of glass with one's fingers (ideally, both impact pressure and 
responsiveness should be similar to a conventional keyboard, or at least 
a laptop keyboard);
ideally, some sort of tactile feedback (so, one can feel whether or not 
they are actually hitting the keys);
being dynamically reprogrammable (say, any app which knows about the 
keyboard can change its layout when it gains focus, or alternatively the 
user can supply per-app keyboard layouts);
maybe, there could be tabs to change between layouts, such as a US-ASCII 
tab, ...

...

with something like the above being common, I can more easily imagine 
people using non-ASCII based input methods.


say, one is typing in US-ASCII, hits a "math symbol" layout where, for 
example, the numeric keypad (or maybe the whole rest of the keyboard) is 
replaced by a grid of math symbols, or maybe also have a "drawing 
tablet" tab, 

Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)

2012-03-13 Thread BGB

On 3/12/2012 9:01 PM, David Barbour wrote:


On Mon, Mar 12, 2012 at 8:13 PM, Julian Leviston <jul...@leviston.net> wrote:



On 13/03/2012, at 1:21 PM, BGB wrote:


although theoretically possible, I wouldn't really trust not
having the ability to use conventional text editors whenever
need-be (or mandate use of a particular editor).

for most things I am using text-based formats, including for
things like world-maps and 3D models (both are based on arguably
mutilated versions of other formats: Quake maps and AC3D models).
the power of text is that, if by some chance someone does need to
break out a text editor and edit something, the format won't
hinder them from doing so.



What is text? Do you store your text in ASCII, EBCDIC,
SHIFT-JIS or UTF-8? If it's UTF-8, how do you use an ASCII editor
to edit the UTF-8 files?

Just sayin' ;-) Hopefully you understand my point.

You probably won't initially, so hopefully you'll meditate a bit
on my response without giving a knee-jerk reaction.




I typically work with the ASCII subset of UTF-8 (where ASCII and UTF-8 
happen to be equivalent).


most of the code is written to assume UTF-8, but languages are designed 
to not depend on any characters outside the ASCII range (leaving them 
purely for comments, and for those few people who consider using them 
for identifiers).


EBCDIC and SHIFT-JIS are sufficiently obscure that one can generally 
pretend that they don't exist (FWIW, I don't generally support codepages 
either).


a lot of code also tends to assume Modified UTF-8 (basically, the same 
variant of UTF-8 used by the JVM). typically, code will ignore things 
like character normalization or alternative orderings. a lot of code 
doesn't particularly know or care what the exact character encoding is.


some amount of code internally uses UTF-16 as well, but this is less 
common as UTF-16 tends to eat a lot more memory (and, some code just 
pretends to use UTF-16, when really it is using UTF-8).
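
for the curious, the two ways Modified UTF-8 differs from the standard 
encoding are small enough to sketch (illustrative Python, not code from 
my VM):

# Sketch: JVM-style Modified UTF-8.  U+0000 is encoded as the two bytes
# C0 80 (so strings stay free of embedded NUL bytes), and supplementary
# characters are encoded as a surrogate pair, each half encoded as a
# 3-byte sequence (CESU-8 style).

def modified_utf8(s):
    out = bytearray()
    for ch in s:
        cp = ord(ch)
        if cp == 0:
            out += b'\xc0\x80'
        elif cp < 0x10000:
            out += ch.encode('utf-8')
        else:                              # split into surrogate halves
            cp -= 0x10000
            for half in (0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF)):
                out += bytes([0xE0 | (half >> 12),
                              0x80 | ((half >> 6) & 0x3F),
                              0x80 | (half & 0x3F)])
    return bytes(out)

print(modified_utf8('A\x00\U0001F600').hex())  # 41 c0 80 ed a0 bd ed b8 80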




Text is more than an arbitrary arcane linear sequence of characters. 
Its use suggests TRANSPARENCY - that a human could understand the 
grammar and content, from a relatively small sample, and effectively 
hand-modify the content to a particular end.


If much of our text consisted of GUIDs:
  {21EC2020-3AEA-1069-A2DD-08002B30309D}
This might as well be
  {BLAHBLAH-BLAH-BLAH-BLAH-BLAHBLAHBLAH}

The structure is clear, but its meaning is quite opaque.



yep.

this is also a goal, and many of my formats are designed to at least try 
to be human editable.
some number of them are still often hand-edited as well (such as texture 
information files).



That said, structured editors are not incompatible with an underlying 
text format. I think that's really the best option.


yes.

for example, several editors/IDEs have expand/collapse, but still use 
plaintext for the source-code.


Visual Studio and Notepad++ are examples of this, and a more advanced 
editor could do better (such as expand/collapse on arbitrary code blocks).


there are also things like auto-completion, ... which are also nifty and 
work fine with text.



Regarding multi-line quotes... well, if you aren't fixated on ASCII 
you could always use unicode to find a whole bunch more brackets:

http://www.fileformat.info/info/unicode/block/cjk_symbols_and_punctuation/images.htm
http://www.fileformat.info/info/unicode/block/miscellaneous_technical/images.htm
http://www.fileformat.info/info/unicode/block/miscellaneous_mathematical_symbols_a/images.htm
Probably more than you know what to do with.



AFAIK, the common consensus in much of programmer-land, is that using 
Unicode characters as part of the basic syntax of a programming language 
borders on evil...



I ended up using:
<[[ ... ]]>
and:
""" ... """ (basically, same syntax as Python).

these seem like probably good enough choices.

currently, the "<[[" and "]]>" braces are not real tokens, and so will only 
be parsed specially as such in the particular contexts where they are 
expected to appear.


so, if one types:
2<[[3, 4], [5, 6]]
the '<' will be parsed as a less-than operator.

but, if one writes instead:
var str=<[[
some text...
more text...
]]>;

it will parse as a multi-line string...

both types of string are handled specially by the parser (rather than 
being handled by the tokenizer, as are normal strings).
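
a rough sketch of how that context-sensitivity works out in practice 
(toy recursive-descent code in Python; my actual parser differs):

# Sketch: "<[[" opens a block-string only where a value is expected;
# in operator position, '<' is just less-than.  Nesting and escapes
# are ignored here to keep the sketch short.

def parse_value(src, i):
    if src.startswith('<[[', i):           # value position: block-string
        end = src.index(']]>', i)
        return ('str', src[i + 3:end]), end + 3
    j = i
    while j < len(src) and src[j].isdigit():
        j += 1
    return ('num', int(src[i:j])), j

def parse_expr(src, i=0):
    left, i = parse_value(src, i)
    if i < len(src) and src[i] == '<':     # operator position: less-than
        right, i = parse_expr(src, i + 1)
        return ('lt', left, right), i
    return left, i

print(parse_expr('2<3'))              # (('lt', ('num', 2), ('num', 3)), 3)
print(parse_expr('<[[some text]]>'))  # (('str', 'some text'), 15)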



or such...

___
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc


Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)

2012-03-13 Thread David Barbour
On Tue, Mar 13, 2012 at 5:42 AM, Josh Grams <j...@qualdan.com> wrote:

 On 2012-03-13 02:13PM, Julian Leviston wrote:
 What is text? Do you store your text in ASCII, EBCDIC, SHIFT-JIS or
 UTF-8?  If it's UTF-8, how do you use an ASCII editor to edit the UTF-8
 files?
 
 Just sayin' ;-) Hopefully you understand my point.
 
 You probably won't initially, so hopefully you'll meditate a bit on my
 response without giving a knee-jerk reaction.

 OK, I've thought about it and I still don't get it.  I understand that
 there have been a number of different text encodings, but I thought that
 the whole point of Unicode was to provide a future-proof way out of that
 mess.  And I could be totally wrong, but I have the impression that it
 has pretty good penetration.  I gather that some people who use the
 Cyrillic alphabet often use some code page and China and Japan use
 SHIFT-JIS or whatever in order to have a more compact representation,
 but that even there UTF-8 tools are commonly available.

 So I would think that the sensible thing would be to use UTF-8 and
 figure that anyone (now or in the future) will have tools which support
 it, and that anyone dedicated enough to go digging into your data files
 will have no trouble at all figuring out what it is.

 If that's your point it seems like a pretty minor nitpick.  What am I
 missing?


Julian's point, AFAICT, is that text is just a class of storage that
requires appropriate viewers and editors, doesn't even describe a specific
standard. Thus, another class that requires appropriate viewers and editors
can work just as well - spreadsheets, tables, drawings.

You mention `data files`. What is a `file`? Is it not a service provided by
a `file system`? Can we not just as easily hide a storage format behind a
standard service more convenient for ad-hoc views and analysis (perhaps
RDBMS). Why organize into files? Other than penetration, they don't seem to
be especially convenient.

Penetration matters, which is one reason that text and filesystems matter.

But what else has penetrated? Browsers. Wikis. Web services. It wouldn't be
difficult to support editing of tables, spreadsheets, drawings, etc. atop a
web service platform. We probably have more freedom today than we've ever
had for language design, if we're willing to stretch just a little bit
beyond the traditional filesystem+text-editor framework.

Regards,

Dave
___
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc


Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)

2012-03-13 Thread Mack
I couldn't agree more.

"text" and "files" are just encoding and packaging.   We routinely represent 
the same information in different ways during different stages of a program or 
system's lifecycle in order to obtain advantages relevant to the processing 
problems at hand.  In the past, it has been convenient to encourage ubiquitous 
use of standard encoding (ASCII) and packaging (files) in exchange for the 
obvious benefits of simplicity, access to common tooling that understands those 
standards, and interchange between systems.

However, if we set simplicity aside for the moment, the goals of access and 
interchange can be accomplished by means of mapping.  It is not essential to 
maintain ubiquitous lowest-common-denominator standards if suitable mapping 
functions exist.

My personal feeling is that the design of practical next-generation languages 
and tools has been retarded for a very long time by an unexamined emotional 
need to cling to common historical standards that are insufficient to support 
the needs of forward-looking language concepts.

For example, if we look beyond system interchange, the most significant value 
of core ASCII is its relatively good impedance match to keys found on most 
computer keyboards.  When standard typewriter keyboards were the ubiquitous 
form of data entry, this was an overwhelmingly important consideration.  
However, we long ago broke the chains of this relationship:  Data entry 
routinely encompasses entry from pointer devices such as mice and trackballs, 
tablets of various descriptions, incomplete keyboards such as numeric keypads, 
game controllers, etc.  These axes of expression are not represented in the 
graphology of ASCII.

In this world, the impedance mismatch to ASCII (and UNICODE, which could be 
seen as ASCII++, since it offers more glyphs but makes little attempt to 
increase the core semantics of graphology offered) invites examination.  In 
this world, it seems to me that core expressiveness of a graphology trumps 
ubiquity.  I'd like to see more languages being bold and looking beyond 
ASCII-derived symbology to find graphologies that allow for more powerful 
representation and manipulation of modern ontologies.

A concrete example:  ASCII only allows "to the right of" as a first class 
relationship in its representation ontology.  (The word "at" is formed as the 
character "t" to the right of the character "a".)  Even concepts such as "next 
line" or "backspace" are second-order concepts encoded by reserved symbols 
borrowed from the representable namespace.  Advanced but still fundamental 
concepts such as "subordinate to" (i.e., subscript) are only available in 
so-called "RichText" systems.  Even more powerful concepts like "contains" (for 
example, a word which is composed of the symbol "O" containing inside it the 
symbol "c") are not representable at all in the commonly available 
graphologies.  The people who attempt to express mathematical formulae 
routinely grapple with these limitations.  Even where a character set includes 
a root symbol, the underlying graphology does not implement rules by which 
characters can be arranged around it to represent "the third root of x".

Many of the excruciating design exercises language designers go thru these days 
are largely driven by limitations of the ASCII++ graphology we assume to be 
sacrosanct.  (For example, the parts of this discussion thread analyzing the 
use of various compound-character combinations which intrude all the way to the 
parsing layer of a language because the core ASCII graphology doesn't feature 
enough bracket symbols.)

This barrier is artificial and historic in nature, and it need no longer 
constrain us: we have the luxury of modern high-powered computing systems, 
which let us impose abstraction in important ways that were historically 
infeasible, and so achieve new kinds of expressive power and simplicity.

-- Mack


On Mar 13, 2012, at 8:11 AM, David Barbour wrote:

 
 
 On Tue, Mar 13, 2012 at 5:42 AM, Josh Grams <j...@qualdan.com> wrote:
 On 2012-03-13 02:13PM, Julian Leviston wrote:
 What is text? Do you store your text in ASCII, EBCDIC, SHIFT-JIS or
 UTF-8?  If it's UTF-8, how do you use an ASCII editor to edit the UTF-8
 files?
 
 Just sayin' ;-) Hopefully you understand my point.
 
 You probably won't initially, so hopefully you'll meditate a bit on my
 response without giving a knee-jerk reaction.
 
 OK, I've thought about it and I still don't get it.  I understand that
 there have been a number of different text encodings, but I thought that
 the whole point of Unicode was to provide a future-proof way out of that
 mess.  And I could be totally wrong, but I have the impression that it
 has pretty good penetration.  I gather that some people who use the
 Cyrillic alphabet often use some code page and China and Japan use
 SHIFT-JIS or whatever in order to have a more compact representation,
 but that even there UTF-8 tools are commonly available.
 
 So I would think 

Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)

2012-03-13 Thread Julian Leviston

On 14/03/2012, at 2:11 AM, David Barbour wrote:

 
 
 On Tue, Mar 13, 2012 at 5:42 AM, Josh Grams <j...@qualdan.com> wrote:
 On 2012-03-13 02:13PM, Julian Leviston wrote:
 What is text? Do you store your text in ASCII, EBCDIC, SHIFT-JIS or
 UTF-8?  If it's UTF-8, how do you use an ASCII editor to edit the UTF-8
 files?
 
 Just sayin' ;-) Hopefully you understand my point.
 
 You probably won't initially, so hopefully you'll meditate a bit on my
 response without giving a knee-jerk reaction.
 
 OK, I've thought about it and I still don't get it.  I understand that
 there have been a number of different text encodings, but I thought that
 the whole point of Unicode was to provide a future-proof way out of that
 mess.  And I could be totally wrong, but I have the impression that it
 has pretty good penetration.  I gather that some people who use the
 Cyrillic alphabet often use some code page and China and Japan use
 SHIFT-JIS or whatever in order to have a more compact representation,
 but that even there UTF-8 tools are commonly available.
 
 So I would think that the sensible thing would be to use UTF-8 and
 figure that anyone (now or in the future) will have tools which support
 it, and that anyone dedicated enough to go digging into your data files
 will have no trouble at all figuring out what it is.
 
 If that's your point it seems like a pretty minor nitpick.  What am I
 missing?
 
 Julian's point, AFAICT, is that text is just a class of storage that requires 
 appropriate viewers and editors, doesn't even describe a specific standard. 
 Thus, another class that requires appropriate viewers and editors can work 
 just as well - spreadsheets, tables, drawings. 
 
 You mention `data files`. What is a `file`? Is it not a service provided by a 
 `file system`? Can we not just as easily hide a storage format behind a 
 standard service more convenient for ad-hoc views and analysis (perhaps 
 RDBMS). Why organize into files? Other than penetration, they don't seem to 
 be especially convenient.
 
 Penetration matters, which is one reason that text and filesystems matter.  
 
 But what else has penetrated? Browsers. Wikis. Web services. It wouldn't be 
 difficult to support editing of tables, spreadsheets, drawings, etc. atop a 
 web service platform. We probably have more freedom today than we've ever had 
 for language design, if we're willing to stretch just a little bit beyond the 
 traditional filesystem+text-editor framework. 
 
 Regards,
 
 Dave

Perfectly the point, David. A token/character in ASCII is equivalent to a 
byte. In SHIFT-JIS, it's two, but this doesn't mean you can't express the 
equivalent meaning in them (ie by selecting the same graphemes) - this is 
called "translation" ;-)

One of the most profound things for me has been understanding the 
ramifications of OMeta. It doesn't just parse streams of characters 
(whatever they are); in fact it doesn't care what the individual tokens of 
its parsing stream are. It's concerned merely with the syntax of its 
elements (or tokens) - how they combine to form certain rules (here I mean 
valid patterns of grammar by "rules"). If one considers this well, it has 
amazing ramifications. OMeta invites us to see the entire computing world 
in terms of sets of problem-oriented languages, where "language" is a 
liberal word that simply means a pattern of sequence of the constituent 
elements of a thing. To PEG, it basically adds proper translation and true 
object-orientism of individual parsing elements. This takes a while to 
understand, I think.
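
A tiny sketch of that idea (in Python; OMeta proper does far more, this 
just shows matcher rules that don't care what a token is):

# Sketch: PEG-style rules over an arbitrary sequence.  The same
# combinators match a character stream or a stream of whole objects.

def tok(pred):
    def rule(seq, i):
        # match exactly one token satisfying pred
        return (i + 1) if i < len(seq) and pred(seq[i]) else None
    return rule

def chain(*rules):
    def rule(seq, i):
        # match each rule in order, threading the position through
        for r in rules:
            i = r(seq, i)
            if i is None:
                return None
        return i
    return rule

digit = tok(str.isdigit)                      # character stream
number = tok(lambda t: isinstance(t, int))    # object stream

print(chain(digit, digit)('42x', 0))          # 2
print(chain(number, number)([4, 2, 'x'], 0))  # 2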

"Formats" here become languages, "protocols" are languages, and so is any 
other kind of representation system you care to name (computer programming 
languages, processor instruction sets, etc.).

I'm postulating, BGB, that you're perhaps so ingrained in the current modality 
and approach to thinking about computers, that you maybe can't break out of it 
to see what else might be possible. I think it was Turing, wasn't it, who 
postulated that his Turing machines could work off ANY symbols... so if that's 
the case, and your programming language grammar has a set of symbols, why not 
use arbitrary (ie not composed of English letters) ideograms for them? (I think 
these days we call these things icons ;-))

You might say "but how will people name their variables" - well perhaps for 
those things, you could use English letters, but maybe you could enforce that 
no one use more than 30 variables in their code in any one simple chunk, in 
which case build them in with the other ideograms.

I'm not attempting to build any kind of authoritative status here, merely 
provoke some different thought in you.

I'll take Dave's point that penetration matters, and at the same time, most 
new ideas have old idea constituents, so you can easily find some matter 
for people stuck in the old methodologies and thinking to relate to when 
building your new stuff ;-)

Regards,
Julian
___
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc

Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)

2012-03-13 Thread BGB

On 3/13/2012 4:37 PM, Julian Leviston wrote:


On 14/03/2012, at 2:11 AM, David Barbour wrote:




On Tue, Mar 13, 2012 at 5:42 AM, Josh Grams <j...@qualdan.com> wrote:


On 2012-03-13 02:13PM, Julian Leviston wrote:
What is text? Do you store your text in ASCII, EBCDIC,
SHIFT-JIS or
UTF-8?  If it's UTF-8, how do you use an ASCII editor to edit
the UTF-8
files?

Just sayin' ;-) Hopefully you understand my point.

You probably won't initially, so hopefully you'll meditate a bit
on my
response without giving a knee-jerk reaction.

OK, I've thought about it and I still don't get it.  I understand
that
there have been a number of different text encodings, but I
thought that
the whole point of Unicode was to provide a future-proof way out
of that
mess.  And I could be totally wrong, but I have the impression
that it
has pretty good penetration.  I gather that some people who use the
Cyrillic alphabet often use some code page and China and Japan use
SHIFT-JIS or whatever in order to have a more compact representation,
but that even there UTF-8 tools are commonly available.

So I would think that the sensible thing would be to use UTF-8 and
figure that anyone (now or in the future) will have tools which
support
it, and that anyone dedicated enough to go digging into your data
files
will have no trouble at all figuring out what it is.

If that's your point it seems like a pretty minor nitpick.  What am I
missing?


Julian's point, AFAICT, is that text is just a class of storage that 
requires appropriate viewers and editors, doesn't even describe a 
specific standard. Thus, another class that requires appropriate 
viewers and editors can work just as well - spreadsheets, tables, 
drawings.


You mention `data files`. What is a `file`? Is it not a service 
provided by a `file system`? Can we not just as easily hide a storage 
format behind a standard service more convenient for ad-hoc views and 
analysis (perhaps RDBMS). Why organize into files? Other than 
penetration, they don't seem to be especially convenient.


Penetration matters, which is one reason that text and filesystems 
matter.


But what else has penetrated? Browsers. Wikis. Web services. It 
wouldn't be difficult to support editing of tables, spreadsheets, 
drawings, etc. atop a web service platform. We probably have more 
freedom today than we've ever had for language design, if we're 
willing to stretch just a little bit beyond the traditional 
filesystem+text-editor framework.


Regards,

Dave


Perfectly the point, David. A token/character in ASCII is equivalent 
to a byte. In SHIFT-JIS, it's two, but this doesn't mean you can't 
express the equivalent meaning in them (ie by selecting the same 
graphemes) - this is called "translation" ;-)


this is partly why there are codepoints.
one can work in terms of codepoints, rather than bytes.

a text editor may internally work in UTF-16, but saves its output in 
UTF-8 or similar.

ironically, this is basically what I am planning/doing at the moment.

now, if/how the user will go about typing UTF-16 codepoints, this is not 
yet decided.



One of the most profound things for me has been understanding the 
ramifications of OMeta. It doesn't just parse streams of 
characters (whatever they are); in fact it doesn't care what the 
individual tokens of its parsing stream are. It's concerned merely with 
the syntax of its elements (or tokens) - how they combine to form 
certain rules - (here I mean valid patterns of grammar by rules). If 
one considers this well, it has amazing ramifications. OMeta invites 
us to see the entire computing world in terms of sets of 
problem-oriented-languages, where language is a liberal word that 
simply means a pattern of sequence of the constituent elements of a 
thing. To PEG, it basically adds proper translation and true 
object-orientism of individual parsing elements. This takes a while to 
understand, I think.


Formats here become languages, protocols are languages, and so are 
any other kind of representation system you care to name (computer 
programming languages, processor instruction sets, etc.).


possibly.

I was actually sort of aware of a lot of this already though, but didn't 
consider it particularly relevant.



I'm postulating, BGB, that you're perhaps so ingrained in the current 
modality and approach to thinking about computers, that you maybe 
can't break out of it to see what else might be possible. I think it 
was Turing, wasn't it, who postulated that his Turing machines could 
work off ANY symbols... so if that's the case, and your programming 
language grammar has a set of symbols, why not use arbitrary (ie not 
composed of English letters) ideograms for them? (I think these days 
we call these things icons ;-))


You might say "but how will people name their variables" - well 
perhaps for those 

Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)

2012-03-12 Thread Josh McDonald
Since it's your own system end-to-end, why not just stop editing source as
a stream of ascii characters? Some kind of simple structured editor would
let you put whatever you please in strings without requiring any escaping
at all. It'd also make the parsing simpler :)

--

"Enjoy every sandwich." - WZ

Josh 'G-Funk' McDonald
   -  j...@joshmcdonald.info



On 11 March 2012 03:38, BGB <cr88...@gmail.com> wrote:


 On 3/10/2012 2:21 AM, Wesley Smith wrote:

 most notable thing I did recently (besides some fiddling with getting a
 new
 JIT written), was adding a syntax for block-strings. I used "<[[ ... ]]>"
 rather than triple-quotes (like in Python), mostly as this syntax is more
 friendly to nesting, and is also fairly unlikely to appear by accident,
 and
 couldn't come up with much obviously better at the moment, "{{ ... }}"
 was another considered option (but is IIRC already used for something),
 as
 was the option of just using triple-quote (would work, but isn't readily
 nestable).


 You should have a look at Lua's long string syntax if you haven't already:

 [[ my
 long
 string]]


 this was briefly considered, but would have a much higher risk of clashes.

 consider someone wants to type a nested array:
 [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
 which is not so good if this array is (randomly) parsed as a string.

 preferable is to try to avoid syntax which is likely to appear by chance,
 as then programmers have to use extra caution to avoid any magic sigils
 which might have unintended behaviors, but can pop up randomly as a result
 of typing code using only more basic constructions (I try to avoid this
 much as I do ambiguities in general, and is partly also why, IMO, the
 common A<S, T> syntax for templates/generics is a bit nasty).


 the syntax:
 <[[ ... ]]>

 was chosen as it had little chance of clashing with other valid syntax
 (apart from, potentially, the CDATA end marker for XML, which at present
 would need to be escaped if using this syntax for globs of XML).

 it is possible, as the language does include unary < and > operators,
 which could, conceivably, be applied to a nested array. this is, however,
 rather unlikely, and could be fixed easily enough with a space.

 as-is, they have an even-nesting rule.
 WRT uneven-nesting, they can be escaped via '\' (don't really like, as it
 leaves the character as magic...).

 <[[
 this string has an embedded \]]>...
 but this is ok.
 ]]>


 OTOH (other remote possibilities):
 { ... }
 was already used for insert-here expressions in XML literals:
 <foo>{generateSomeNode()}</foo>

 (...) or ((...)) just wouldn't work (high chance of collision).

 #(...), #[...], and #{...} are already in use (tuple, float vector or
 matrix, and list).

 example:
 vector: #[0, 0, 0]
 quaternion: #[0, 0, 0, 1]Q
 matrix: #[[1, 0, 0] [0, 1, 0] [0, 0, 1]]
 list: #{#foo, 2, 3; #v}
 note: (...) parens, [...] array, {...} dictionary/object (example: {a: 3,
 y: 4}).

 @(...), @[...], and @{...} are still technically available.

 also possible:
 /[...]/ , /[[...]]/
 would be passable mostly only as /.../ is already used for regex syntax
 (inherited from JS...).

 hmm:
 <? ... ?>
 ?< ... >?
 (available, currently syntactically invalid).

 likewise:
 <\ ... \>, < ... >
 <| ... |>

 ...

 so, the issue is mostly lacking sufficient numbers of available (good)
 brace types.
 in a few other cases, this lack has been addressed via the use of keywords
 and type-suffixes.


 but, a keyword would be lame for a string, and a suffix wouldn't work.


  You can nest by matching the number of '=' between the brackets:

 [===[
 a
 long
 string [=[ with a long string inside it ]=]
 xx
 ]===]


 this would be possible, as otherwise this syntax would not be
 syntactically valid in the language.

 [=[...]=]
 would be at least possible.

 not that I particularly like this syntax though...


 (inlined):

 On 3/10/2012 2:43 AM, Ondřej Bílka wrote:

 On Sat, Mar 10, 2012 at 01:21:42AM -0800, Wesley Smith wrote:

 You should have a look at Lua's long string syntax if you haven't
 already:

 Better to be consistent with rest of scripting languages (bash, ruby,
 perl, python)
 and use heredocs.


 blarg...

 heredoc syntax is nasty IMO...

 I deliberately didn't use heredocs.

 if I did, I would probably use the syntax:
 #<<END; ... END
 or similar...


 Python uses triple-quotes, which I had also considered (just, they
 couldn't nest):
 """
 lots of stuff...
 over multiple lines...
 """


 this would mean:
 <[[
 lots of stuff...
 over multiple lines...
 ]]>
 possibly also with the Python syntax:
 """
 ...
 """


 or such...

 ___
 fonc mailing list
 fonc@vpri.org
 http://vpri.org/mailman/listinfo/fonc

___
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc


Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)

2012-03-12 Thread BGB

On 3/12/2012 6:31 PM, Josh McDonald wrote:
Since it's your own system end-to-end, why not just stop editing 
source as a stream of ascii characters? Some kind of simple structured 
editor would let you put whatever you please in strings without 
requiring any escaping at all. It'd also make the parsing simpler :)




although theoretically possible, I wouldn't really trust not having the 
ability to use conventional text editors whenever need-be (or mandate 
use of a particular editor).


for most things I am using text-based formats, including for things like 
world-maps and 3D models (both are based on arguably mutilated versions 
of other formats: Quake maps and AC3D models). the power of text is 
that, if by some chance someone does need to break out a text editor and 
edit something, the format won't hinder them from doing so.



but, yes, that "Inventing on Principle" / "Magic Ink" video did rather get 
my interest up in terms of wanting to support a much more streamlined 
script-editing interface.


I recently had a bit of fun writing small script fragments to blow up 
light sources and other things, and figure if I can get a more advanced 
text-editing interface thrown together, more interesting things might 
also be possible.


"blow the lights": all nearby light sources explode (with fiery particle 
explosion effects and sounds), and the area goes dark.


current leaning is to try to throw something together vaguely 
QBasic-like (with a proper text editor, and probably F5 as the "Run" 
key, ...).


as-is, I already have an ed / edlin-style text editor, and ALT + 1-9 as 
console-change keys (and now have multiple consoles, sort of like Linux 
or similar), ... was considering maybe the fancier text editor would use 
ALT-SHIFT + A-Z for switching between modules. will see what I can do here.



or such...



--

"Enjoy every sandwich." - WZ

Josh 'G-Funk' McDonald
   - j...@joshmcdonald.info




___
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc


Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)

2012-03-12 Thread Julian Leviston

On 13/03/2012, at 1:21 PM, BGB wrote:

 although theoretically possible, I wouldn't really trust not having the 
 ability to use conventional text editors whenever need-be (or mandate use of 
 a particular editor).
 
 for most things I am using text-based formats, including for things like 
 world-maps and 3D models (both are based on arguably mutilated versions of 
 other formats: Quake maps and AC3D models). the power of text is that, if by 
 some chance someone does need to break out a text editor and edit something, 
the format won't hinder them from doing so.


What is text? Do you store your text in ASCII, EBCDIC, SHIFT-JIS or UTF-8? 
If it's UTF-8, how do you use an ASCII editor to edit the UTF-8 files?

Just sayin' ;-) Hopefully you understand my point.

You probably won't initially, so hopefully you'll meditate a bit on my response 
without giving a knee-jerk reaction.

Julian
___
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc


Re: [fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)

2012-03-12 Thread David Barbour
On Mon, Mar 12, 2012 at 8:13 PM, Julian Leviston <jul...@leviston.net> wrote:


 On 13/03/2012, at 1:21 PM, BGB wrote:

 although theoretically possible, I wouldn't really trust not having the
 ability to use conventional text editors whenever need-be (or mandate use
 of a particular editor).

 for most things I am using text-based formats, including for things like
 world-maps and 3D models (both are based on arguably mutilated versions of
 other formats: Quake maps and AC3D models). the power of text is that, if
 by some chance someone does need to break out a text editor and edit
 something, the format won't hinder them from doing so.



 What is text? Do you store your text in ASCII, EBCDIC, SHIFT-JIS or
 UTF-8? If it's UTF-8, how do you use an ASCII editor to edit the UTF-8
 files?

 Just sayin' ;-) Hopefully you understand my point.

 You probably won't initially, so hopefully you'll meditate a bit on my
 response without giving a knee-jerk reaction.


Text is more than an arbitrary arcane linear sequence of characters. Its
use suggests TRANSPARENCY - that a human could understand the grammar and
content, from a relatively small sample, and effectively hand-modify the
content to a particular end.

If much of our text consisted of GUIDs:
  {21EC2020-3AEA-1069-A2DD-08002B30309D}
This might as well be
  {BLAHBLAH-BLAH-BLAH-BLAH-BLAHBLAHBLAH}

The structure is clear, but its meaning is quite opaque.

That said, structured editors are not incompatible with an underlying text
format. I think that's really the best option.

Regarding multi-line quotes... well, if you aren't fixated on ASCII you
could always use unicode to find a whole bunch more brackets:

http://www.fileformat.info/info/unicode/block/cjk_symbols_and_punctuation/images.htm

http://www.fileformat.info/info/unicode/block/miscellaneous_technical/images.htm

http://www.fileformat.info/info/unicode/block/miscellaneous_mathematical_symbols_a/images.htm
Probably more than you know what to do with.

Regards,

Dave
___
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc


[fonc] Block-Strings / Heredocs (Re: Magic Ink and Killing Math)

2012-03-10 Thread BGB


On 3/10/2012 2:21 AM, Wesley Smith wrote:

most notable thing I did recently (besides some fiddling with getting a new
JIT written), was adding a syntax for block-strings. I used "<[[ ... ]]>"
rather than triple-quotes (like in Python), mostly as this syntax is more
friendly to nesting, and is also fairly unlikely to appear by accident, and
couldn't come up with much obviously better at the moment, "{{ ... }}"
was another considered option (but is IIRC already used for something), as
was the option of just using triple-quote (would work, but isn't readily
nestable).


You should have a look at Lua's long string syntax if you haven't already:

[[ my
long
string]]


this was briefly considered, but would have a much higher risk of clashes.

consider someone wants to type a nested array:
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
which is not so good if this array is (randomly) parsed as a string.

preferable is to try to avoid syntax which is likely to appear by 
chance, as then programmers have to use extra caution to avoid any 
magic sigils which might have unintended behaviors, but can pop up 
randomly as a result of typing code using only more basic constructions 
(I try to avoid this much as I do ambiguities in general, and is partly 
also why, IMO, the common A<S, T> syntax for templates/generics is a 
bit nasty).



the syntax:
<[[ ... ]]>

was chosen as it had little chance of clashing with other valid syntax 
(apart from, potentially, the CDATA end marker for XML, which at present 
would need to be escaped if using this syntax for globs of XML).


it is possible, as the language does include unary < and > 
operators, which could, conceivably, be applied to a nested array. this 
is, however, rather unlikely, and could be fixed easily enough with a space.


as-is, they have an even-nesting rule.
WRT uneven-nesting, they can be escaped via '\' (don't really like, as 
it leaves the character as magic...).


<[[
this string has an embedded \]]>...
but this is ok.
]]>
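
a sketch of the scanning rule (Python for illustration; not my actual 
tokenizer code):

# Sketch: scan a "<[[ ... ]]>" block-string with the even-nesting rule.
# Nested "<[[" / "]]>" pairs must balance, and an unbalanced end marker
# is written escaped, as "\]]>".

def scan_block_string(src, i):
    assert src.startswith('<[[', i)
    depth, i, out = 1, i + 3, []
    while depth:
        if src.startswith('\\]]>', i):       # escaped end marker
            out.append(']]>'); i += 4
        elif src.startswith('<[[', i):
            depth += 1; out.append('<[['); i += 3
        elif src.startswith(']]>', i):
            depth -= 1
            if depth:
                out.append(']]>')
            i += 3
        else:
            out.append(src[i]); i += 1
    return ''.join(out), i

s = '<[[ has an embedded \\]]> ... but this is ok ]]>'
print(scan_block_string(s, 0))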


OTOH (other remote possibilities):
{ ... }
was already used for insert-here expressions in XML literals:
<foo>{generateSomeNode()}</foo>

(...) or ((...)) just wouldn't work (high chance of collision).

#(...), #[...], and #{...} are already in use (tuple, float vector or 
matrix, and list).


example:
vector: #[0, 0, 0]
quaternion: #[0, 0, 0, 1]Q
matrix: #[[1, 0, 0] [0, 1, 0] [0, 0, 1]]
list: #{#foo, 2, 3; #v}
note: (...) parens, [...] array, {...} dictionary/object (example: {a: 
3, y: 4}).


@(...), @[...], and @{...} are still technically available.

also possible:
/[...]/ , /[[...]]/
would be passable mostly only as /.../ is already used for regex 
syntax (inherited from JS...).


hmm:
<? ... ?>
?< ... >?
(available, currently syntactically invalid).

likewise:
<\ ... \>, < ... >
<| ... |>

...

so, the issue is mostly lacking sufficient numbers of available (good) 
brace types.
in a few other cases, this lack has been addressed via the use of 
keywords and type-suffixes.



but, a keyword would be lame for a string, and a suffix wouldn't work.



You can nest by matching the number of '=' between the brackets:

[===[
a
long
string [=[ with a long string inside it ]=]
xx
]===]


this would be possible, as otherwise this syntax would not be 
syntactically valid in the language.


[=[...]=]
would be at least possible.

not that I particularly like this syntax though...
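
for reference, the level-matching is simple enough to sketch (Python 
standing in here; Lua's own lexer is C):

# Sketch: recognize a Lua-style long string, where "[==[" only closes
# at "]==]" with the same number of '=' signs, so strings nest by
# picking a different level.  (Real Lua also skips a leading newline.)
import re

OPEN = re.compile(r'\[(=*)\[')

def scan_long_string(src, i=0):
    m = OPEN.match(src, i)
    if not m:
        return None
    close = ']' + m.group(1) + ']'
    end = src.index(close, m.end())
    return src[m.end():end], end + len(close)

s = '[==[ outer [=[ inner ]=] still outer ]==]'
print(scan_long_string(s))   # (' outer [=[ inner ]=] still outer ', 41)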


(inlined):

On 3/10/2012 2:43 AM, Ondřej Bílka wrote:

On Sat, Mar 10, 2012 at 01:21:42AM -0800, Wesley Smith wrote:
You should have a look at Lua's long string syntax if you haven't 
already: 

Better to be consistent with rest of scripting languages (bash, ruby, perl, python)
and use heredocs.


blarg...

heredoc syntax is nasty IMO...

I deliberately didn't use heredocs.

if I did, I would probably use the syntax:
#<<END; ... END
or similar...


Python uses triple-quotes, which I had also considered (just, they 
couldn't nest):

"""
lots of stuff...
over multiple lines...
"""


this would mean:
<[[
lots of stuff...
over multiple lines...
]]>
possibly also with the Python syntax:
"""
...
"""


or such...

___
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc