Re: [Readable-discuss] Proposed update and expansion of SRFI-49 (I-expressions) - indented s-expressions

David A. Wheeler Sun, 01 Jul 2012 17:07:57 -0700

> Well, currently the specced parser will actively skip empty lines to
> look for the continuation of the body. So no amount of ENTER ENTER
> will actually get the expression read in on the REPL. LOL.


It depends on what you mean by "specced" :-).  I take as the spec:
 http://www.dwheeler.com/readable/sweet-expressions.html
which specifically says "A blank line always terminates a datum".

> There are a few reasons why I skip over empty lines instead of
> completing and returning the expression:
> 
> 1. By doing so, I can treat either CR or LF as eol. As it happens, a
> DOS encoding means that the eol is actually encoded as CR LF. But by
> skipping over empty lines, the "extra" LF is simply treated as an
> empty line and skipped over.

There's no reason to do so.  Current systems will do that translation
for you, and if you want to do it yourself, you can act only on LF
and ignore CR.  Unix, Linux, and current MacOS use LF (\n);
Windows, MS-DOS, and CP/M use CR LF (\r\n).  Only old systems like
old MacOS systems, Apple ][s, and ancient QNX systems
use a raw CR (\r) for end of line.
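
For illustration, here is a minimal Python sketch of the "act only on LF
and ignore CR" approach. The name split_lines is hypothetical; it is not
part of any parser discussed in this thread.

```python
def split_lines(text):
    """Split text into lines: LF is the only terminator, CR is dropped."""
    lines = []
    current = []
    for ch in text:
        if ch == "\r":
            continue                     # ignore CR entirely
        if ch == "\n":
            lines.append("".join(current))
            current = []
        else:
            current.append(ch)
    if current:
        lines.append("".join(current))   # final line without a trailing LF
    return lines
```

With this approach a CR LF pair counts as exactly one end of line, so a
DOS-encoded file produces no spurious empty lines. The trade-off is that
a file using bare CR as its terminator would come out as one long line.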

> 2. Because in a REALLY long program, we want to separate code with
> empty lines sometimes. Even "inner", indented code. In particular,
> consider that the "module" syntax in R6RS requires all module contents
> to be sub-expressions of the upper module syntax form: so, every
> internal function must be indented within that form. If empty lines
> ended an expression, then the writer of the module can't separate
> functions of the module with empty lines, because the expression being
> read in is the module expression..

Yes, but not being able to use ENTER ENTER at the interactive line is
really really annoying.

I REALLY don't like the loss of the ENTER ENTER functionality.
I tried it earlier, and hated it. I think you and others would hate it too.

I do agree that there needs to be a way to separate portions of a
larger expression.
The solution I came up with is to completely ignore comment-only lines.
If you want to separate lines with content, just create a comment-only line.
By ensuring that the comment indentation is irrelevant, it's
not a hardship; that way, you can insert long comment lines without problems.
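
A rough Python sketch of that rule follows: a blank line ends the datum,
while a comment-only line is skipped at any indentation. The name
read_datum_lines is hypothetical, ";" is assumed to be the comment marker,
and skipping blank lines that come before any content is an assumption of
this sketch rather than something stated in the spec.

```python
def read_datum_lines(lines):
    """Collect the lines of one datum from an iterable of lines
    (line terminators already removed)."""
    collected = []
    for line in lines:
        stripped = line.strip(" \t")
        if stripped.startswith(";"):
            continue          # comment-only line: ignored, indentation irrelevant
        if stripped == "":
            if collected:
                break         # blank line terminates the datum
            continue          # assumption: blank lines before any content are skipped
        collected.append(line)
    return collected
```

So at the REPL, ENTER ENTER completes the expression, and a line such as
";; ----" can still be used to visually separate parts of a long datum.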

> One alternative is to simply make the following changes:
> 
> 1. rename eol-empty-lines to eol-comment-lines.
> 2. Modify eol-empty-lines to eol-comment-lines -> htspace* eol comment-line*
> 3. Modify eol to:
> 
> eol -> CR LF
> eol -> CR
> eol -> LF
> 
> 4. Add comment-line -> htspace* COMMENT-MARKER (not eol)* eol

Are we saying the same thing?  This looks similar.
Though I don't think you need to support CR for eol.


> The original purpose of the SPLICE rule was to support Arc and CL. In
> addition, Egil Moller mentioned that GROUP was intended to simply be
> an "invisible" symbol....
> i.e. it doesn't *actually* wrap an additional () layer: group is just
> a symbol that gets dropped "magically", even though indentation
> processing will see group.
> 
> However changing "group" to "\" and changing its meaning to "wrap
> another layer of ()" means:
> 
> \ foo bar
> 
> ===>
> 
> ((foo bar))
> 
> So this is definitely a change.


Yes, I'm sorry, I didn't make this clear in my reply.
Such a semantic definitely WOULD be a change from the current spec.

You're absolutely right about the original meaning of "\" and group.
Indeed, I say so in my current sweet-expression definition
as the proposed meaning for "\" at the beginning:
"Otherwise, if it's at the beginning of a line (after 0+ spaces/tabs),
it's ignored - but the first non-whitespace character's indentation level
is used."

But this discussion about using "\" as the group marker is making me wonder if
its meaning should change as well, so I started exploring that and didn't
say that it was a potential CHANGE (sorry about that confusion!).

I think one of the reasons to make "group" invisible is because Moller
used an ordinary symbol ("group") for a different purpose, and he needed
to be able to escape it.  Clearly, there needs to be a way to handle that.
But switching to "\" as the "group" marker actually makes possible another
change: We don't need a way to escape an ordinary symbol, because the
marker is no longer an ordinary symbol.  So if we're going to move away
from using "group", a different meaning for "\" might make sense.


> Note that my original proposal, for Arc, was this:
> 
> if
> cond1
> \ expr1
> cond2
> \ expr2
> \ expr3
> 
> Which was intended to be:
> 
> (if
> cond1
> expr1
> cond2
> expr2
> expr3)
> 
> But if we change \ to mean "definitely add another layer of ()"
> instead of "act as if we indented up to here, but skip this symbol"
> (the way GROUP currently acts), then the Arc example is parsed as:
> 
> (if
> cond1
> (expr1)
> cond2
> (expr2)
> (expr3))
> 
> So I'm wondering if we're going forward too fast and forgetting why
> the rule got there the first place.

That is *definitely* possible.  :-).
I've been meaning to get back to sweet-expressions
but just haven't for a little while. Your emails are giving me the excuse,
as well as some interesting ideas.

It's just that every time something changes, it's worth examining to see
if other things should change too.

So let's call these two semantic options:
1. "initial-\-ignored" which is the current proposed semantic
2. "initial-\-creates-new list"

Under option #1, initial-\-ignored, the Arc "if" could be expressed as:
if
..cond1
..\ expr1
..cond2
..\ expr2
..\ expr3

Under both option #1 and option #2, it could be expressed as:

if
..cond1 \ expr1
..cond2 \ expr2
..\ expr3


This means that:
 \ a b
   c d
   e f
Under option#1 => "(a b (c d) (e f))"  (the leading \ is ignored)
Under option#2 => "((a b (c d) (e f)))" (the leading \ adds a list level
 when there is other material on its line)
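
To make the difference concrete, here is a small Python sketch of an
indentation reader that can be switched between the two options. It is
only an illustration under several assumptions: the names parse_block and
to_sexpr are hypothetical, indentation uses spaces only, tokens are split
on whitespace, there are no blank or comment lines, a "\" in the middle of
a line is not handled, and the indentation of a line that starts with "\"
is taken to be the column of the "\" itself.

```python
def parse_block(lines, wrap_initial_backslash):
    """Parse indented lines into nested lists.

    wrap_initial_backslash=False models option #1 (leading marker ignored);
    wrap_initial_backslash=True models option #2 (leading marker adds a list).
    """
    def indent_of(line):
        return len(line) - len(line.lstrip(" "))

    def parse_at(i):
        # Parse the expression that starts on line i; return (expr, next index).
        indent = indent_of(lines[i])
        tokens = lines[i].split()
        leading = bool(tokens) and tokens[0] == "\\"
        if leading:
            tokens = tokens[1:]
        children = []
        j = i + 1
        while j < len(lines) and indent_of(lines[j]) > indent:
            child, j = parse_at(j)       # more-indented lines are children
            children.append(child)
        items = tokens + children
        if leading and not tokens:
            expr = items                 # lone "\": just group the children
        else:
            # A single item with no children stands for itself.
            expr = items[0] if len(items) == 1 else items
            if leading and wrap_initial_backslash:
                expr = [expr]            # option #2: one extra list level
        return expr, j

    exprs, i = [], 0
    while i < len(lines):
        expr, i = parse_at(i)
        exprs.append(expr)
    return exprs


def to_sexpr(expr):
    """Render nested lists as an s-expression string."""
    if isinstance(expr, list):
        return "(" + " ".join(to_sexpr(e) for e in expr) + ")"
    return expr


example = [r"\ a b", "  c d", "  e f"]
option1 = to_sexpr(parse_block(example, False)[0])
option2 = to_sexpr(parse_block(example, True)[0])
```

Under these assumptions option1 is "(a b (c d) (e f))", and option2 is the
same expression with one more list level around it. Feeding it the Arc "if"
with each "\ expr" on its own line gives the flat form under option #1 and
the "(expr1)" form under option #2.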


An advantage of option #1 is that if both cond1 and expr1 get long,
but fit well enough on their own lines, it's easier to show the relationship
of the condition and the expression.

An advantage of option #2 (my new crazy idea)
is that expressions like "let" expressions can
cuddle better than in the past, because an initial "\" actually
does something (instead of being a no-op).  For example,
(let
  ((x (+ 2 3)) (y (+ 5 6)))
  (+ x y))
Can become:
let
..\ x{2 + 3} y{5 + 6}
..{x + y}

Currently, under option#1, a "let" like this needs another line to
create the sublists:
let
..\
....x{2 + 3} y{5 + 6}
..{x + y}

or you use parens (and we're trying to reduce their number):
let
..(x{2 + 3} y{5 + 6})
..{x + y}



Basically, if we're going to change the group marker anyway
(a REALLY MAJOR CHANGE), we may as well explore if we want to tweak
its semantics as well.  If we make "\" with following whitespace be
the group marker *AND* the split marker, we should figure out the
best way of using it.


> The reason why I think GROUP and SPLICE can be the same is because of
> Egil Moller's explanation that GROUP is intended to be an invisible
> symbol when at the head of an indentation. So, SPLICE and Egil
> Moller's GROUP act the same when at the start of a line.

Sure, but we can also just say "initial \ is special", if it suits us.

> If you're changing the meaning of GROUP from the meaning as expressed
> by Egil, then GROUP = SPLICE breaks.

We can redefine.  We'd be changing the symbol for GROUP, so if we do that,
it wouldn't be surprising if the semantics changed too :-).


That said, I'm simply floating the idea at this point.  Both
option#1 ("initial-\-ignored") and option#2 ("initial-\-creates-new list")
seem plausible enough.

I'm really more concerned about making it pleasant to use interactively
(if not ENTER ENTER, then there needs to be some easy & obvious way to
execute something... and the alternatives I've seen seem awful to me).


>> ... I wonder if we should allow leading periods as an alternative
> to whitespace (?) when indentation is significant....
> LOL. MAYBE.

I just finished re-reading "Ender's Game"; at the end of chapter
"Veni Vidi Vici", Bean "thought of a half dozen ideas before he
went to sleep. Ender would be pleased - every one of them was stupid."


> curly-infix doesn't have a parser spec - anyone want to write one?

There's not much to spec: the syntax is exactly the same as (...),
except that you use {} instead.
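
Since reading {...} works like reading (...), what remains is the mapping
from infix to prefix order, as in {x + y} for (+ x y) in the "let" example
earlier. Here is a hedged Python sketch of that mapping for the simple
case only. The name infix_to_prefix is hypothetical, and lists that do not
have the simple shape are returned unchanged purely as a placeholder.

```python
def infix_to_prefix(items):
    """Map a list read from {...} to prefix form in the simple case:
    an odd number of items (at least three) where every item in an
    odd position (0-based) is the same operator."""
    if len(items) >= 3 and len(items) % 2 == 1:
        operators = items[1::2]
        if all(op == operators[0] for op in operators):
            return [operators[0]] + items[0::2]
    return items  # placeholder: other shapes are outside this sketch
```

For example, ["x", "+", "y"] becomes ["+", "x", "y"], and a chain such as
[2, "+", 3, "+", 4] becomes ["+", 2, 3, 4].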

> If you really want a split implementation we could just factor out the
> nothing sentinel into a separate file and import it. But currently
> I'm leaning towards a single file because of module handling worries.

I understand.  Could you at least put them in separate parts of the file,
so that they COULD be separated by others?


--- David A. Wheeler



