Re: [Readable-discuss] $ at end of line bug?
BTW the subject is misleading, shouldn't we be discussing this on Beni's proposal thread?

On Sun, Feb 24, 2013 at 4:55 AM, Alan Manuel Gloria wrote:
> [...]
Re: [Readable-discuss] $ at end of line bug?
On Sun, Feb 24, 2013 at 12:17 AM, David A. Wheeler wrote:
> [...]
> I want to make this final notation a good balance between "simple" and
> "capable"... I worry that even this subset may be a step too far.

I think that, conceptually, having a limitation is an additional complication when teaching the notation.

When explaining the Beni formulation we can say "A $ indicates a further indent, with the promise that either you will have a 'staggered dedent' like FIGURE X, or that you will close the sublist with a dedent to a 'real' indentation level on this line or a parent line of this line."

Granted we could just mandate these patterns, but I worry that we are now slipping into the "notation is tied to underlying semantic" bug. Or in this case, "notation is tied to underlying legacy syntax".

I'd rather have the full Beni formulation of SUBLIST or the classic 0.4 formulation, in that preference order.

I'll admit that I don't have a use for the full Beni formulation other than for let, though. I suspect there may be further use cases, but I haven't found any others yet.

--

Beni-full formulation, informally:

The SUBLIST or "$" marker indicates that the text following it on that line will be indented by one more "virtual" indentation level than the current line. The direct child lines of this line will then be considered child lines of only the text after the last SUBLIST marker, and the text after the SUBLIST marker will be considered a child of the text before the SUBLIST marker. You can also chain SUBLIST markers, like so:

probe $ call/cc $ lambda (exit)
! exit 42
==>
probe
! call/cc
! ! lambda (exit)
! ! ! exit 42

In addition, the SUBLIST marker allows a "staggered dedent", like so:

foo $ a b
! ! c
! d

In this case, the "a b" text has as its child the directly succeeding child line of its line, while the line with "staggered" dedent will be the next sibling of the "a b" text. So the above is equivalent to:

foo
! a b
! ! c
! d

In general, the "staggered dedent" capability of SUBLIST is not used; you are more likely to just close it directly:

foo $ a b
! c
d
===>
foo
! a b
! ! c
d

However, the staggered dedent is useful for LET:

let $
! ! var
! ! value
! ! var2
! ! value2
! body
! ...

___
Readable-discuss mailing list
Readable-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/readable-discuss
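The chained-SUBLIST rewrite described informally in the message above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `expand_sublist` is a hypothetical helper (it is not part of the readable project), it splits naively on ` $ `, it assumes every following line passed in is a direct child of the first line, and it does not handle the staggered dedent at all.

```python
def expand_sublist(lines, unit="! "):
    """Rewrite a line with chained `$` markers into plain indented form.

    `lines[0]` is the line containing the markers; the remaining lines are
    assumed (hypothetically) to all be direct children of that line.
    """
    head, *children = lines
    # Each `$` starts a segment that sits one "virtual" level deeper.
    segments = [seg.strip() for seg in head.split(" $ ")]
    out = [unit * depth + seg for depth, seg in enumerate(segments)]
    # Child lines belong to the text after the last `$`, so they move
    # deeper by one level per marker.
    extra = unit * (len(segments) - 1)
    out += [extra + child for child in children]
    return out


if __name__ == "__main__":
    for row in expand_sublist(["probe $ call/cc $ lambda (exit)", "! exit 42"]):
        print(row)
```

A real implementation would work on tokens rather than raw text, since a `$` inside a string or a symbol name must not be treated as a marker.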
Re: [Readable-discuss] $ at end of line bug?
As I posted earlier, I'm *really* uncomfortable with losing the ability to auto-check the grammar. But I do understand the notion that extending SUBLIST, especially to handle "let" and similar constructs, could be useful.

Beni Cherniavsky-Paskin:
> [I'm asking this because if it's 'fixed', my
> closing-SUBLIST-by-unmatched-dedent would allow:
> let $
> ! ! x $ compute 'x
> ! ! y $ compute 'y
> ! body...

I have a counter-proposal, maybe I can call it "Beni-Lite" :-) ??? And I even have a sample implementation that we can try out.

I've just posted to the "devel" branch a change to the ANTLR implementation that permits closing SUBLIST by an unmatched DEDENT, but *only* if the "$" is the last item on a line (and there's something before "$" other than indent chars). This limited semantic ("Beni-lite"?) covers the primary use cases I've seen, *AND* I've found a way to formulate it so that we can continue to use ANTLR's grammar checking and run-time input checking.

To do this, I've tweaked the indent processor. After you dedent, if the dedent doesn't match the parent indent, it then generates a RE_INDENT. This retains a whole lot of error-checking, both of the BNF and of the input during processing. This means that:

let $
! ! var1 value1
! body...

becomes:

let SUBLIST EOL
INDENT var1 value1 EOL
DEDENT RE_INDENT body...

It includes a few test cases, which show how it works:

let $
! ! var1 value1
! body...
; ==> (let ((var1 value1)) body...)

let $
! ! var1 value1
! ! var2 value2
! body...
; ==> (let ((var1 value1) (var2 value2)) body...)

let $
! ! var1 value1
! ! var2 value2
! ! var3 value3
! body1 param1
! body2 param2
; ==>
; (let ((var1 value1) (var2 value2) (var3 value3))
;   (body1 param1) (body2 param2))

Even this backed-off version is complicated, but it's not MUCH more complicated, and it does retain all the error-checking that I'm very loath to drop. It only works when "$" is at the end of the line... but that seems like a reasonable limitation.

Comments?

I *especially* want to hear from Beni Cherniavsky-Paskin and Alan Manuel Gloria, since both have expressed an interest in this kind of capability, but I certainly want to hear from all. I want to make this final notation a good balance between "simple" and "capable"... I worry that even this subset may be a step too far.

--- David A. Wheeler
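To make the RE_INDENT idea in the message above concrete, here is a minimal Python sketch of an indent processor that emits DEDENT followed by RE_INDENT when a dedent lands between two known levels. It is not the ANTLR implementation from the "devel" branch: the function names are hypothetical, indentation is compared by width only (a real processor would also check that indent strings are consistent), no closing DEDENTs are emitted at end of input, and the rule that RE_INDENT is only legal after a line ending in "$" is left to the grammar.

```python
INDENT_CHARS = " \t!"  # assumption: space, tab and "!" all count as indentation


def split_indent(line):
    """Return (indent width, remaining text) for one source line."""
    body = line.lstrip(INDENT_CHARS)
    return len(line) - len(body), body


def tokenize(lines):
    """Turn indented lines into a flat token list with INDENT/DEDENT/RE_INDENT."""
    stack = [0]  # indentation widths of the currently open levels
    tokens = []
    for line in lines:
        width, body = split_indent(line)
        if width > stack[-1]:
            stack.append(width)
            tokens.append("INDENT")
        elif width < stack[-1]:
            # Close every level deeper than the new line.
            while stack[-1] > width:
                stack.pop()
                tokens.append("DEDENT")
            # The dedent did not land on a known level: re-indent to it.
            if stack[-1] < width:
                stack.append(width)
                tokens.append("RE_INDENT")
        tokens += ["SUBLIST" if word == "$" else word for word in body.split()]
        tokens.append("EOL")
    return tokens


if __name__ == "__main__":
    print(" ".join(tokenize(["let $", "! ! var1 value1", "! body..."])))
```

Because the extra token appears only in this one situation, a dedent to an unexpected level elsewhere still reaches the parser as something the grammar can reject.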
Re: [Readable-discuss] $ at end of line bug?
Alan Manuel Gloria:
> Instead, I think this calls for a more complicated indentation preprocessor:
> 1. If you encounter a SUBLIST, emit an INDENT (or EOL-INDENT since
> that seems to be your preferred formulation) and push ? on the indent
> stack

This algorithm pushes indents for SUBLIST even if SUBLIST is *not* at the end. That's flexible, but rather complicated.

I think we can simplify this "weird indentation" processing greatly by only accepting these odd dedents when "$" is at the end, to close that ending "$". That's the only use case I've seen. Thoughts? Too limiting?

If we do that (see my BNF example), it not only makes things simpler... I believe it completely eliminates any ambiguity of matching the "$" to the correct "partial dedent" (as I'm calling it).

I'm trying to look at this idea from various angles. I want whatever grammar is finalized to be as correct as we can make it, and I see ANTLR's grammar-checking mechanisms as a key tool to help do that. I am very loath to give that up.

--- David A. Wheeler
Re: [Readable-discuss] $ at end of line bug?
The BNF tweak I posted earlier may make it not TOO hard to implement $-at-the-end with the varying indents, per Beni Cherniavsky-Paskin. That's not completely clear to me, but... maybe.

With the following change, ANTLR claims that the grammar is unambiguous, and if the indentation processor will generate partial dedents for an indented line between current and parent, I think it would work:

-| comment_eol indent sub_b=body {$v = append($head.v, list($sub_b.v));}
+| comment_eol indent sub_b=body
+    ( dedent_partial partial_out=body
+        {$v = append(append($head.v, list($sub_b.v)), $partial_out.v);}
+    | empty {$v = append($head.v, list($sub_b.v));} )
 )

Of course, the other part is, SHOULD we do something like this? In particular, can we do without?

The key use case for $-at-the-end that I've seen is let statements, e.g.:

let $
! ! x $ compute 'x
! ! y $ compute 'y
! body...
; ==> (let ((x (compute 'x)) (y (compute 'y))) body...)

and:

let $
! ! x $ compute 'x
! body...
; ===> (let ((x (compute 'x))) body...)

We can support these two use cases easily WITHOUT supporting $-at-the-end with varying child indentation, just using the current draft notation. Examples:

let
! \\
! ! x $ compute 'x
! ! y $ compute 'y
! body...
; ==> (let ((x (compute 'x)) (y (compute 'y))) body...)

For the one-variable case, we can do:

let $
! $ x $ compute 'x
! body...
; ===> (let ((x (compute 'x))) body...)

--- David A. Wheeler
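The semantic action in the proposed `dedent_partial` branch is easier to read once traced on the two-variable let example. The Python sketch below models s-expressions as nested lists and mirrors `append(append($head.v, list($sub_b.v)), $partial_out.v)`. The helper names are hypothetical, and treating `body...` as a single atom is an assumption made purely for illustration.

```python
def show(x):
    """Render a nested Python list as an s-expression string."""
    if isinstance(x, str):
        return x
    return "(" + " ".join(show(item) for item in x) + ")"


def partial_dedent_action(head, sub_b, partial_out):
    """Mirror append(append(head, list(sub_b)), partial_out) from the grammar."""
    # list(sub_b) wraps the "$" body in one extra list; the lines after the
    # partial dedent are then spliced in as further siblings.
    return head + [sub_b] + partial_out


if __name__ == "__main__":
    head = ["let"]
    sub_b = [["x", ["compute", "'x"]], ["y", ["compute", "'y"]]]  # lines under "$"
    partial_out = ["body..."]  # lines after the partial dedent
    print(show(partial_dedent_action(head, sub_b, partial_out)))
```

The extra `list(...)` around the bindings is what supplies the doubled parentheses that `let` needs, while the lines after the partial dedent stay at the same level as the binding list.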
Re: [Readable-discuss] $ at end of line bug?
I earlier said:
> > I plan to do some experimenting with the ANTLR BNF, and see if there's a way
> > to tweak what we have while keeping the automated analysis working.
> > Suggestions welcome.

Here's what I had in mind. Currently:

it_expr returns [Object v]
  : head
    ...
    | SUBLIST hspace* /* head SUBLIST ... case */
      (sub_i=it_expr {$v=append($head.v, list($sub_i.v));}
       | comment_eol indent sub_b=body {$v = append($head.v, list($sub_b.v));} )
    ...

Perhaps we could have the indent processor detect a partial dedent, and have it match to the most recent line ending with "$":

it_expr returns [Object v]
  : head
    ...
    | SUBLIST hspace* /* head SUBLIST ... case */
      (sub_i=it_expr {$v=append($head.v, list($sub_i.v));}
       | comment_eol indent sub_b=body
         ( dedent_partial partial_out=body {$v = i_have_no_idea();}
         | empty {$v = append($head.v, list($sub_b.v));} ) )
    ...

The "dedent_partial" is generated by the indent processor when it sees an indent that is strictly BETWEEN the current indent and the parent indent (and the partial dedent then becomes the new current indent).

*IF* we can do it this way, *AND* if it's even a good idea, this would preserve our ability to automatically analyze nearly all the grammar.

Of course, I'm still not so sure we *should* do this, but the first step is to evaluate the pros and cons.

--- David A. Wheeler
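The rule for when the indent processor would emit `dedent_partial` can be written as a small decision function. The Python below is a sketch under assumptions, not code from the project: `step` is a hypothetical name, the indent stack holds integer widths, a multi-level dedent is reported as a single "DEDENT" label for brevity, and anything that matches no rule is reported as "ERROR".

```python
def step(stack, new):
    """Classify one indentation change and return (token, updated stack).

    `stack` lists the open indentation widths, outermost first.
    """
    current = stack[-1]
    parent = stack[-2] if len(stack) > 1 else None
    if new > current:
        return "INDENT", stack + [new]
    if new == current:
        return "SAME", stack
    if parent is not None and parent < new:
        # Strictly between parent and current: the partial dedent replaces
        # the current level, so the stack height is unchanged.
        return "DEDENT_PARTIAL", stack[:-1] + [new]
    if new in stack:
        # Ordinary dedent back to a known level (possibly several levels up).
        return "DEDENT", stack[: stack.index(new) + 1]
    return "ERROR", stack


if __name__ == "__main__":
    print(step([0, 4], 2))  # the "let $" case: 2 lies between 0 and 4
    print(step([0, 4], 0))
```

Only the top of the stack changes on a partial dedent, which fits the earlier suggestion of matching it to the most recent line ending with "$".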
Re: [Readable-discuss] $ at end of line bug?
On Fri, Feb 22, 2013 at 7:51 AM, David A. Wheeler wrote:
> Alan Manuel Gloria:
>> I think a different approach is better.
>
> Definitely a possibility; I'm still trying to figure out if it's possible to
> do this in a simpler/cleaner way.
>
> BTW, I'm realizing that this creates yet another potential problem: Disabled
> error-checking.
> With this, a line with indentation that doesn't match its parents might be
> okay (!).
> We may be able to quickly detect that and deal with it; I'd like to make sure
> we can still quickly detect bad indents.
>
>> The problem is that DEDENT_PARTIAL cannot give information about
>> *how many* ? exist on the indent stack.
>
> But how much of that information do we really need?
>
>> Instead, I think this calls for a more complicated indentation preprocessor:
>
> (pft)... That's the sound of my head exploding :-).
>
> I've read that several times and I don't think I fully understand it.
> I understand each line separately, but not why you believe they
> work properly together. I'm imagining trying to create a math proof that
> this algorithm is correct... and failing completely.

Well, so far the only property I can prove is that the number of INDENTs emitted is equal to the number of DEDENTs emitted. This is due to the fact that every event that pushes an entry on the stack also emits exactly one INDENT, and any event that pops an entry off the stack also emits a DEDENT; the only exception is the part where the top ? item is replaced, and that does not emit either an INDENT or DEDENT, while stack height is preserved. Thus each stack entry represents a pending INDENT that is not yet matched by a DEDENT. As long as we empty the stack at EOF, every INDENT gets paired at some point with a DEDENT.

As for SUBLIST working properly, what exactly about SUBLIST should we prove? What needs to get proven in an indentation processor?

Basically, this indentation processor is just a more formal expression of what Beni said in his initial email about DEDENT and SUBLIST.

> Also, I can't begin to imagine *explaining* that algorithm to someone.
> While the BNF has many lines, many people have had lots of training in
> BNFs and can pick them up quickly. Indentation processing like this... not
> so much.
>
> Granted, you could argue that's a limitation on MY end, and that's probably
> true enough. But if I have trouble understanding it, I doubt I'm the only
> one.
>
>> Basically, the formulation would remove all mention of GROUP_SPLIT and
>> SUBLIST (and all branches where they occur) but complicate the
>> indentation preprocessor.
>
> That's a significant part of the definition of these expressions,
> rendering them basically invisible to automated checking and analysis.
> I want this notation to work "because it's clearly correct"; using ANTLR
> to check it rigorously is a valuable way to get there. That's a dangerous
> loss.

I agree.

> Is there a way to simplify this, perhaps by finding some half-way approach?
>
> I plan to do some experimenting with the ANTLR BNF, and see if there's a way
> to tweak what we have while keeping the automated analysis working.
> Suggestions welcome.
>
> --- David A. Wheeler
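The balance property argued above (every INDENT is eventually paired with a DEDENT, provided the stack is emptied at EOF) is at least easy to test on any token stream a preprocessor produces. The Python check below is a sketch: `indents_balanced` is a hypothetical helper, and it assumes the token names are exactly "INDENT" and "DEDENT", with every other token ignored.

```python
def indents_balanced(tokens):
    """Return True if every INDENT is closed by a later DEDENT.

    `depth` plays the role of the indent stack height: it counts the
    INDENTs that are still waiting for their matching DEDENT.
    """
    depth = 0
    for tok in tokens:
        if tok == "INDENT":
            depth += 1
        elif tok == "DEDENT":
            depth -= 1
            if depth < 0:  # a DEDENT with nothing left to close
                return False
    return depth == 0  # the stack must be empty at end of input


if __name__ == "__main__":
    print(indents_balanced(["a", "INDENT", "b", "INDENT", "c", "DEDENT", "DEDENT"]))
    print(indents_balanced(["a", "INDENT", "b"]))
```

A check like this only exercises the counting invariant; it says nothing about whether each DEDENT closes the level the author intended, which is the part that is hard to prove.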
Re: [Readable-discuss] $ at end of line bug?
Alan Manuel Gloria:
> I think a different approach is better.

Definitely a possibility; I'm still trying to figure out if it's possible to do this in a simpler/cleaner way.

BTW, I'm realizing that this creates yet another potential problem: Disabled error-checking. With this, a line with indentation that doesn't match its parents might be okay (!). We may be able to quickly detect that and deal with it; I'd like to make sure we can still quickly detect bad indents.

> The problem is that DEDENT_PARTIAL cannot give information about
> *how many* ? exist on the indent stack.

But how much of that information do we really need?

> Instead, I think this calls for a more complicated indentation preprocessor:

(pft)... That's the sound of my head exploding :-).

I've read that several times and I don't think I fully understand it. I understand each line separately, but not why you believe they work properly together. I'm imagining trying to create a math proof that this algorithm is correct... and failing completely.

Also, I can't begin to imagine *explaining* that algorithm to someone. While the BNF has many lines, many people have had lots of training in BNFs and can pick them up quickly. Indentation processing like this... not so much.

Granted, you could argue that's a limitation on MY end, and that's probably true enough. But if I have trouble understanding it, I doubt I'm the only one.

> Basically, the formulation would remove all mention of GROUP_SPLIT and
> SUBLIST (and all branches where they occur) but complicate the
> indentation preprocessor.

That's a significant part of the definition of these expressions, rendering them basically invisible to automated checking and analysis. I want this notation to work "because it's clearly correct"; using ANTLR to check it rigorously is a valuable way to get there. That's a dangerous loss.

Is there a way to simplify this, perhaps by finding some half-way approach?

I plan to do some experimenting with the ANTLR BNF, and see if there's a way to tweak what we have while keeping the automated analysis working. Suggestions welcome.

--- David A. Wheeler
Re: [Readable-discuss] $ at end of line bug?
Alan Manuel Gloria:
> So, the problems with accepting this are:
>
> 1. The new syntax is complicated to explain informally.
> 2. It's easier to misuse. You have to be a bit more careful of your
> indentation after the line that you use SUBLIST on.
> 3. It's not clear that the benefits are worth it - there seems little gain.
> 4. It removes a bunch of code from the parser and places it into the
> indentation preprocessor, whose code we cannot prove in ANTLR.

I'd add:

5. It appears (to dwheeler) to be more complicated to define and to implement.
6. This "partial dedenting" approach is backwards-compatible with the current spec, and thus could be added *later* if desired.

The more I look at this, the more complicated it gets. I'd rather document it as a potential future extension/direction, and not try to get this into sweet-expressions version 1.0.

--- David A. Wheeler
Re: [Readable-discuss] $ at end of line bug?
On 2/21/13, David A. Wheeler wrote:
> I said:
>> > I have a *lot* of concerns with that particular construct.
>
> Alan Manuel Gloria:
>> Why? Compare:
>
> It's not "must never happen", but I have a lot of concerns. Here are ones
> that come to mind:
>
> 1. It really complicates explanation and implementation of "$". Some people
> require a second explanation now, and "$" is really simple. Adding this
> capability to "$" makes it much more complicated to describe. Every time we
> add a complication, we risk losing some potential users and implementers.

Fair point. "$" in current semantics is already difficult to explain as-is.

> 2. I'm not sure that there's enough *value* to adding it. There *ARE* use
> cases, and these use cases are definitely common enough to discuss doing
> something special with them. But I worry that the contravening downsides
> will overwhelm it. Currently, in certain cases we have to add "\\"-only
> lines; that's not really a hardship, especially since the resulting
> constructs are pretty easy to understand.

OK

> 3. It can be viewed as complicating the reading of code that uses it. Up to
> this point, a dedent always ended the whole line above; now it can end only
> a part of it. Perhaps the reduction in line count is fair compensation;
> that's not clear to me.

I suppose the main reason is "it's too easy to abuse". Beni's formulation has a single use case so far, the aforementioned let, but excess misuse of the Beni SUBLIST can make users suspicious of using it.

> 4. There's already a body of material on how to handle indentation-based
> languages, which tend to follow Python approaches and specifically do NOT
> differentiate between "indent 3 spaces" and "indent 1 space", just INDENT.
> We leave better-understood parsing theory if we do this. I want to have it
> easily implemented, with many reasons to be *confident* it is
> well-designed... the more we leave established theory, the harder it is to
> do that.

Well, my formulation of Beni's formulation removes SUBLIST and SPLIT (\\-inline) handling from the hands of the indentation parser and puts it into the hands of the indentation preprocessor. It could even remove GROUP (\\-at-start) handling from the indentation parser and keep it in the preprocessor, as long as the indentation parser can handle two INDENTs in sequence.

> Let me speak to the last point. If we *did* go this way (and I'm dubious
> right now), we need to make sure that this construct is clearly and
> unambiguously defined as part of some well-checked BNF grammar. Turning
> every space into an INDENT, and reduced space into a DEDENT, seems to make
> this much worse. I don't know of anyone who handles indent/dedent
> processing this way; people normally tokenize indentation to make parsing
> easier. I want to stick to better-understood ground where we can, so we
> avoid any surprise disasters.
>
> So if we went this way, I suspect it would be better to model this by adding
> a new indentation token, DEDENT_PARTIAL, in addition to DEDENT. A DEDENT
> undents back to the previous parent level; a DEDENT_PARTIAL undents back to
> something consistent with the parent and the current indent, but is
> (strictly) between them. The indent parser would have to change to generate
> a DEDENT_PARTIAL, and the BNF would have to change to support
> DEDENT_PARTIAL. That way, we at least continue to tokenize indentation
> changes. I don't know if the BNF change would be easy or hard; if it's
> hard, I'm *really* disinclined.

I think a different approach is better.

The problem is that DEDENT_PARTIAL cannot give information about *how many* ? exist on the indent stack.

Instead, I think this calls for a more complicated indentation preprocessor:

1. If you encounter a SUBLIST, emit an INDENT (or EOL-INDENT since that seems to be your preferred formulation) and push ? on the indent stack.
2. If you encounter a GROUP/SPLIT that is inline (SPLIT meaning):
2.1. If there is at least one ? on the indent stack top, pop off all ? until you reach a non-? item; emit a DEDENT for each ? popped.
2.2. Otherwise, emit SAME (or just EOL, since that is how the current BNF works).
3. If you encounter an EOL, slurp the indentation, then:
3.1. If the topmost non-? stack item is less than the indentation, push the indentation on the stack and emit INDENT.
3.2. If the topmost non-? stack item is equal to the indentation:
; comment: 3.2.1 and 3.2.2 are copies of 2.1 and 2.2, respectively
3.2.1. If there is at least one ? on the indent stack top, pop off all ? until you reach a non-? item; emit a DEDENT for each ? popped.
3.2.2. Otherwise, emit SAME (or just EOL, since that is how the current BNF works).
3.3. Otherwise, the topmost non-? stack item is greater than the indentation, so:
3.3.1. Pop off stack items until the topmost non-? stack item is less than or equal to the indentation; emit a DEDENT for each.
3.3.2. If the topmost non-? stack item is e
Re: [Readable-discuss] $ at end of line bug?
Per recent discussion, I've changed the BNF so that the sweet-expression BNF supports "$" at the end of the line, whether "$" begins a line or happens later. With this change:

a $
! b c
; ==> (a ((b c)))

and also:

$
! ddd eee
; ==> (((ddd eee)))

Basically, "$" at the end of the line works the same way as "$ \\" at the end of the line.

If this looks right, I can do the same to the Scheme implementation. Does everyone agree that this is the expected mapping?

This does NOT provide partial-dedent support, but it certainly moves towards it somewhat. This change can be justified purely on the grounds of consistency, so I'm fine with this tweak. If we DO implement partial-dedents, this would be the first step.

--- David A. Wheeler

--- a/sweet.g
+++ b/sweet.g
@@ -1110,8 +1110,9 @@ it_expr returns [Object v]
 // comment_eol same more=it_expr {$v = append($head.v, $more.v);}
 comment_eol error
 | empty {$v = monify($head.v);} )
- | SUBLIST hspace* sub_i=it_expr /* head SUBLIST it_expr case */
- {$v=append($head.v, list($sub_i.v));}
+ | SUBLIST hspace* /* head SUBLIST ... case */
+ (sub_i=it_expr {$v=append($head.v, list($sub_i.v));}
+| comment_eol indent sub_b=body {$v = append($head.v, list($sub_b.v));}
 | comment_eol // Normal case, handle child lines if any:
 (indent children=body {$v = append($head.v, $children.v);}
 | empty {$v = monify($head.v);} /* No child lines */ )
@@ -1126,7 +1127,9 @@ it_expr returns [Object v]
 /* Handle #!sweet EOL EOL t_expr */
 | comment_eol restart=t_expr {$v = $restart.v;} )
 | dedent error ))
- | SUBLIST hspace* is_i=it_expr {$v=list($is_i.v);} /* "$" first on line */
+ | SUBLIST hspace* /* "$" first on line */
+(is_i=it_expr {$v=list($is_i.v);}
+ | comment_eol indent sub_body=body {$v = list($sub_body.v);} )
 | abbrevw hspace*
 (comment_eol indent ab=body
 {$v = append(list($abbrevw.v), $ab.v);}
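The two mappings stated above follow directly from the semantic actions in the patch. The Python sketch below reproduces them with nested lists standing in for s-expressions. The helper names are hypothetical, and `body` is assumed to be the list of child-line expressions, as the grammar's `body` rule appears to return.

```python
def show(x):
    """Render a nested Python list as an s-expression string."""
    if isinstance(x, str):
        return x
    return "(" + " ".join(show(item) for item in x) + ")"


def head_then_sublist(head, body):
    """Mirror append($head.v, list($sub_b.v)): "$" ends a line that has a head."""
    return head + [body]


def leading_sublist(body):
    """Mirror list($sub_body.v): "$" is the only thing on its line."""
    return [body]


if __name__ == "__main__":
    # a $        with child line "b c"
    print(show(head_then_sublist(["a"], [["b", "c"]])))
    # $          with child line "ddd eee"
    print(show(leading_sublist([["ddd", "eee"]])))
```

In both cases the body is wrapped in one extra list, the same effect "$ \\" has in the existing notation.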
Re: [Readable-discuss] $ at end of line bug?
I said:
> > I have a *lot* of concerns with that particular construct.

Alan Manuel Gloria:
> Why? Compare:

It's not "must never happen", but I have a lot of concerns. Here are the ones that come to mind:

1. It really complicates the explanation and implementation of "$". Some people already need a second explanation, even though "$" is currently really simple. Adding this capability makes "$" much more complicated to describe. Every time we add a complication, we risk losing some potential users and implementers.

2. I'm not sure that there's enough *value* in adding it. There *ARE* use cases, and they are definitely common enough to justify discussing special handling for them. But I worry that the countervailing downsides will outweigh the benefit. Currently, in certain cases we have to add "\\"-only lines; that's not really a hardship, especially since the resulting constructs are pretty easy to understand.

3. It can be viewed as complicating the reading of code that uses it. Up to this point, a dedent always ended the whole line above; now it can end just a part of it. Perhaps the reduction in line count is fair compensation; that's not clear to me.

4. There's already a body of material on how to handle indentation-based languages. These tend to follow Python's approach and specifically do NOT differentiate between "indent 3 spaces" and "indent 1 space"; there is just INDENT. We leave better-understood parsing theory if we do this. I want it to be easily implemented, with many reasons to be *confident* that it is well-designed... the more we leave established theory, the harder that is to do.

Let me speak to the last point. If we *did* go this way (and I'm dubious right now), we need to make sure that this construct is clearly and unambiguously defined as part of some well-checked BNF grammar. Turning every space into an INDENT, and every reduced space into a DEDENT, seems to make this much worse. I don't know of anyone who handles indent/dedent processing this way; people normally tokenize indentation to make parsing easier. I want to stick to better-understood ground where we can, so we avoid any surprise disasters.

So if we went this way, I suspect it would be better to model this by adding a new indentation token, DEDENT_PARTIAL, in addition to DEDENT. A DEDENT undents back to the previous parent level; a DEDENT_PARTIAL undents back to an indentation that is consistent with both the parent and the current indent, but lies (strictly) between them. The indent parser would have to change to generate a DEDENT_PARTIAL, and the BNF would have to change to support DEDENT_PARTIAL. That way, we at least continue to tokenize indentation changes.

I don't know if the BNF change would be easy or hard; if it's hard, I'm *really* disinclined.

Anyone want to try defining this meaning in BNF using DEDENT_PARTIAL? Seeing what it would mean, as a BNF, might make it much easier to understand its pros and cons.

--- David A. Wheeler
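To make the DEDENT_PARTIAL idea concrete, here is a minimal Python sketch of an indent tokenizer that emits it. This is only an illustration under stated assumptions, not the actual indent processor: the name indent_tokens is made up, indentation is modeled as plain integer widths (a real processor would compare indent strings for consistency), and any dedent that lands strictly between two known levels is reported as DEDENT_PARTIAL.

```python
def indent_tokens(widths):
    """Turn per-line indent widths into INDENT/DEDENT/DEDENT_PARTIAL tokens.

    Hypothetical helper: `widths` holds one non-negative integer per line.
    """
    stack = [0]   # known indent levels, outermost first
    tokens = []
    for width in widths:
        if width > stack[-1]:
            # Deeper than the current level: an ordinary INDENT.
            stack.append(width)
            tokens.append("INDENT")
            continue
        while stack[-1] > width:
            stack.pop()
            if stack[-1] < width:
                # We landed strictly between the parent level and the
                # level just closed: report a partial dedent and make
                # the new width a level of its own.
                tokens.append("DEDENT_PARTIAL")
                stack.append(width)
            else:
                tokens.append("DEDENT")
    # Close whatever is still open at end of input.
    tokens.extend(["DEDENT"] * (len(stack) - 1))
    return tokens


if __name__ == "__main__":
    # Widths 0, 4, 2 model a layout like "let $" / "! ! var1 value1" / "! body...".
    print(indent_tokens([0, 4, 2]))
```

For widths 0, 4, 2 this yields INDENT, DEDENT_PARTIAL, DEDENT, so the indentation changes stay tokenized and the BNF only has to say what DEDENT_PARTIAL means.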
Re: [Readable-discuss] $ at end of line bug?
Beni Cherniavsky-Paskin:
> [I'm asking this because if it's 'fixed, my
> closing-SUBLIST-by-unmatched-dedent would allow:
>
> let $
> ! ! x $ compute 'x
> ! ! y $ compute 'y
> ! body...
> ]

Understood, but that kind of construct can already be handled this way:

let
! \\
!   x $ compute 'x
!   y $ compute 'y
! body...
; ==> (let ((x (compute (quote x))) (y (compute (quote y)))) body...)

--- David A. Wheeler
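For readers who want to see how "$" folds within a single line, such as "x $ compute 'x" becoming (x (compute (quote x))), here is a small Python sketch. It is an illustration only: fold_sublist is a made-up name, tokens are split on whitespace, 'x is kept as one atom instead of being expanded to (quote x), and a "$" with nothing after it is simply rejected, which is an assumption of this sketch.

```python
def fold_sublist(tokens):
    """Fold one line's tokens at each "$", working right to left.

    Hypothetical helper: returns an atom (str) or a nested Python list.
    """
    if "$" not in tokens:
        # A lone term stands for itself; several terms form a list.
        return tokens[0] if len(tokens) == 1 else list(tokens)
    split = tokens.index("$")
    head, rest = tokens[:split], tokens[split + 1:]
    if not rest:
        raise ValueError('"$" must be followed by something on the line')
    # Everything after "$" becomes the final element of the list.
    return list(head) + [fold_sublist(rest)]


if __name__ == "__main__":
    print(fold_sublist("x $ compute 'x".split()))
```

With this reading, "x $ compute 'x" gives ['x', ['compute', "'x"]], and a line that starts with "$", such as "$ a b", gives [['a', 'b']].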
Re: [Readable-discuss] $ at end of line bug?
On 2/20/13, David A. Wheeler wrote:
>> > [I'm asking this because if it's 'fixed, my
>> > closing-SUBLIST-by-unmatched-dedent would allow:
>> >
>> > let $
>> > ! ! x $ compute 'x
>> > ! ! y $ compute 'y
>> > ! body...
>> > ]
>
> I have a *lot* of concerns with that particular construct.
>

Why? Compare:

let
! \\
! ! x $ compute 'x
! ! y $ compute 'y
! use x y

to:

let $
! ! x $ compute 'x
! ! y $ compute 'y
! use x y

Basically, Beni's formulation extends our "monotonically increasing indentation = SUBLIST" theorem by allowing any subsequence of monotonically increasing indentation to be compressed using SUBLIST. The above cannot be compressed further, since the x line is followed by a line at the same indent, so the indentation is no longer monotonically increasing.

So: foo bar quux quuux yod zod wod <==> foo $ bar quux quuux yod $ zod $ wod

Since the indent of "bar" is where the indentation stops monotonically increasing, that is the extent to which SUBLIST can be used to compress the indentation (hence "foo $ bar").

Sincerely,
AmkG
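The "monotonically increasing indentation = SUBLIST" idea can be sketched in Python as a rewrite on an indentation tree: whenever a line has exactly one child line, the child is folded onto the parent's line with "$". This is a rough illustration under the existing (pre-proposal) rules, not part of any implementation; the Line class and compress function are made-up names, and special markers such as "\\" are not treated specially.

```python
from dataclasses import dataclass, field


@dataclass
class Line:
    """One source line plus the lines indented beneath it (hypothetical)."""
    text: str
    children: list = field(default_factory=list)


def compress(node, depth=0, indent="! "):
    """Render a tree, folding every single-child chain with "$"."""
    line = node.text
    current = node
    # While indentation only ever goes deeper (exactly one child),
    # the child can ride on the same line after a "$".
    while len(current.children) == 1:
        current = current.children[0]
        line += " $ " + current.text
    rendered = [indent * depth + line]
    # The chain stopped at a line with zero or several children; those
    # children stay on their own lines, one level deeper.
    for child in current.children:
        rendered.extend(compress(child, depth + 1, indent))
    return rendered


if __name__ == "__main__":
    tree = Line("let", [
        Line("\\\\", [
            Line("x", [Line("compute 'x")]),
            Line("y", [Line("compute 'y")]),
        ]),
        Line("use x y"),
    ])
    print("\n".join(compress(tree)))
```

On this tree the sketch prints the first "let" form from the comparison: the x and y lines each absorb their single child, while "let" and "\\" keep their children on separate lines because they have more than one.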
Re: [Readable-discuss] $ at end of line bug?
On 2/20/13, David A. Wheeler wrote:
> Beni Cherniavsky-Paskin:
>> > This behaves surprisingly:
>> >
>> > $
>> > ! a b
>> > ! c d
>> > ==>
>> > ((a b (c d)))
>> >
>> > it seems $ consumes the following newline, resulting in same parsing as if I wrote
>> >
>> > $ a b
>> >   c d
>> >
>> > Is this deliberate?
>
> Alan Manuel Gloria
>> No (at least not by me; check David's answer, but I suspect he didn't
>> implement it deliberately that way).
>
> Alan's right, that's unintentional in the Scheme implementation.
>
> The BNF does not permit this construct at all, so the ANTLR implementation
> will give an error in this case.
>
> The relevant production is it_expr, which permits only:
>   | SUBLIST hspace* is_i=it_expr {$v=list($is_i.v);} /* "$" first on line */
> That is, "$", after any hspaces, MUST be followed with an it_expr, and CANNOT
> be followed currently by ";" or an end-of-line marker.
>
>> Every example we have has some
>> other datum after the "$", I never said anything about $-at-eol ever
>> since I first proposed SUBLIST on the mailinglist, and so on, so you
>> might legitimately say that this is "unspecified".
>>
>> > Since "a b" is on a child line, I'd expect it to parse in the same manner as "c d",
>> > resulting in ((a b) (c d)).
>>
>> That seems reasonable, given your rules. One might say that:
>>
>> $
>> ! a b
>> ! c d
>> ==>
>> $ \\
>> ! a b
>> ! c d
>
> I'm okay with that, especially if it makes using the construct "more natural"
> and avoids turning a plausible use into an error.
>
> It's a trivial 1-line addition to the BNF. If we *don't* add that, then I clearly
> need to add an error-check to the Scheme implementation.
>

Hmmm

$
! a b
! c d

INDENT          ; stack: (0 ?)
INDENT a b      ; stack: (0 ? 2)
SAME c d        ; stack: (0 ? 2)
DEDENT DEDENT

\\
!\\
!!a b
!!c d

(
 (
  (a b)
  (c d)))
==> (((a b) (c d))), not ((a b) (c d))

Note the extra () introduced by $ compared to \\.

However, despite that, Beni's let example is correct:

let $
    x $ compute 'x
    y $ compute 'y
  use x

let INDENT                   ; stack: (0 ?)
INDENT x INDENT compute 'x   ; (0 ? 4 ?)
DEDENT                       ; stack (0 ? 4), indentation 4
y INDENT compute 'y          ; (0 ? 4 ?)
DEDENT DEDENT                ; stack (0 ?), indentation 2
use x                        ; stack (0 2)
DEDENT

let
!\\
!!x
!!!compute 'x
!!y
!!!compute 'y
!use x

>> > [I'm asking this because if it's 'fixed, my
>> > closing-SUBLIST-by-unmatched-dedent would allow:
>> >
>> > let $
>> > ! ! x $ compute 'x
>> > ! ! y $ compute 'y
>> > ! body...
>> > ]
>
> I have a *lot* of concerns with that particular construct.
>
> But we could certainly allow $-at-end-of-line regardless,
> on the grounds of consistency.
>
> So let's add $-at-EOL, unless someone objects soon.
>
> --- David A. Wheeler
Re: [Readable-discuss] $ at end of line bug?
Beni Cherniavsky-Paskin:
> > This behaves surprisingly:
> >
> > $
> > ! a b
> > ! c d
> > ==>
> > ((a b (c d)))
> >
> > it seems $ consumes the following newline, resulting in same parsing as if I wrote
> >
> > $ a b
> >   c d
> >
> > Is this deliberate?

Alan Manuel Gloria:
> No (at least not by me; check David's answer, but I suspect he didn't
> implement it deliberately that way).

Alan's right, that's unintentional in the Scheme implementation.

The BNF does not permit this construct at all, so the ANTLR implementation will give an error in this case.

The relevant production is it_expr, which permits only:
  | SUBLIST hspace* is_i=it_expr {$v=list($is_i.v);} /* "$" first on line */
That is, "$", after any hspaces, MUST be followed by an it_expr, and currently CANNOT be followed by ";" or an end-of-line marker.

> Every example we have has some
> other datum after the "$", I never said anything about $-at-eol ever
> since I first proposed SUBLIST on the mailinglist, and so on, so you
> might legitimately say that this is "unspecified".
>
> > Since "a b" is on a child line, I'd expect it to parse in the same manner as "c d",
> > resulting in ((a b) (c d)).
>
> That seems reasonable, given your rules. One might say that:
>
> $
> ! a b
> ! c d
> ==>
> $ \\
> ! a b
> ! c d

I'm okay with that, especially if it makes using the construct "more natural" and avoids turning a plausible use into an error.

It's a trivial 1-line addition to the BNF. If we *don't* add that, then I clearly need to add an error-check to the Scheme implementation.

> > [I'm asking this because if it's 'fixed, my
> > closing-SUBLIST-by-unmatched-dedent would allow:
> >
> > let $
> > ! ! x $ compute 'x
> > ! ! y $ compute 'y
> > ! body...
> > ]

I have a *lot* of concerns with that particular construct.

But we could certainly allow $-at-end-of-line regardless, on the grounds of consistency.

So let's add $-at-EOL, unless someone objects soon.

--- David A. Wheeler
Re: [Readable-discuss] $ at end of line bug?
On Tue, Feb 19, 2013 at 9:04 PM, Beni Cherniavsky-Paskin wrote:
> This behaves surprisingly:
>
> $
> ! a b
> ! c d
> ==>
> ((a b (c d)))
>
> it seems $ consumes the following newline, resulting in same parsing as if I wrote
>
> $ a b
>   c d
>
> Is this deliberate?

No (at least not by me; check David's answer, but I suspect he didn't implement it deliberately that way). Every example we have has some other datum after the "$", I never said anything about $-at-eol ever since I first proposed SUBLIST on the mailinglist, and so on, so you might legitimately say that this is "unspecified".

> Since "a b" is on a child line, I'd expect it to parse in the same manner as "c d",
> resulting in ((a b) (c d)).

That seems reasonable, given your rules. One might say that:

$
! a b
! c d
==>
$ \\
! a b
! c d

>
> [I'm asking this because if it's 'fixed, my
> closing-SUBLIST-by-unmatched-dedent would allow:
>
> let $
> ! ! x $ compute 'x
> ! ! y $ compute 'y
> ! body...
> ]

Hmm...

let INDENT                   ; stack: (0 ?)
INDENT x INDENT compute 'x   ; stack: (0 ? 4 ?)
DEDENT                       ; stack: (0 ? 4), indentation = 4
y INDENT compute 'y          ; stack: (0 ? 4 ?)
DEDENT DEDENT                ; stack: (0 ?), indentation = 2
body ...                     ; stack: (0 2)
DEDENT
==>
let
!\\
!!x
!!!compute 'x
!!y
!!!compute 'y
!body

-- looks legit.

Hmm, let's try the SUBLIST and monotonic-indentation equivalence theorem... i.e.:

foo
! bar
<===>
foo $ bar

So, let's try it:

let $
! ! x
! ! ! compute 'x
! body ...
==>
let INDENT           ; stack: (0 ?)
INDENT x             ; stack: (0 ? 4)
INDENT compute 'x    ; stack: (0 ? 4 6)
DEDENT DEDENT        ; stack: (0 ?), indentation = 2
body ...             ; stack: (0 2)
DEDENT
==>
let
!\\
!!x
!!!compute 'x
!body ...

-- Looks good so far.

I think I prefer your formulation of SUBLIST if it's truly back-compatible (and even if it's shown to be non-back-compatible, if the loss of back compatibility is acceptable in the typical case). It seems to me that much of SUBLIST's power may be due to the fact that it has a hidden, surprisingly elegant formulation like yours, leading to its hidden, surprisingly elegant semantics.

Sincerely,
AmkG
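The stack traces above can be reproduced with a short Python sketch. This is just one reading of the hand-written traces, not code from either implementation: trace and concrete_below are made-up names, indentation is modeled as integer widths, each "$" pushes an unknown level "?" and emits INDENT, and an unmatched dedent that lands on a "?" binds that "?" to the new width instead of raising an error.

```python
UNKNOWN = "?"   # indent level opened by "$", width not yet known


def concrete_below(stack, index):
    """Nearest known width at or below stack[index] (stack[0] is always 0)."""
    while stack[index] == UNKNOWN:
        index -= 1
    return stack[index]


def trace(lines):
    """Flatten (width, text) lines into INDENT/SAME/DEDENT and word tokens."""
    stack = [0]
    out = []
    for number, (width, text) in enumerate(lines):
        if width > concrete_below(stack, len(stack) - 1):
            # Deeper than every known level: a plain INDENT.
            stack.append(width)
            out.append("INDENT")
        else:
            dedented = False
            while True:
                top = stack[-1]
                if top == UNKNOWN:
                    if width > concrete_below(stack, len(stack) - 2):
                        stack[-1] = width   # the unmatched dedent binds "?"
                        break
                elif top == width:
                    if number > 0 and not dedented:
                        out.append("SAME")
                    break
                elif top < width:
                    raise ValueError("inconsistent dedent")
                stack.pop()
                out.append("DEDENT")
                dedented = True
        for word in text.split():
            if word == "$":
                stack.append(UNKNOWN)
                out.append("INDENT")
            else:
                out.append(word)
    # Close every level still open at end of input.
    out.extend(["DEDENT"] * (len(stack) - 1))
    return out


if __name__ == "__main__":
    example = [(0, "let $"), (4, "x $ compute 'x"),
               (4, "y $ compute 'y"), (2, "body")]
    print(" ".join(trace(example)))
```

For the let example this prints the same token order as the first trace: let INDENT INDENT x INDENT compute 'x DEDENT y INDENT compute 'y DEDENT DEDENT body DEDENT. Without a pending "?" on the stack, the same unmatched dedent raises an error, which keeps the existing error checking for ordinary code.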