Oops.

----- Forwarded message from Johnicholas Hines <[email protected]> -----

Date: Fri, 13 Apr 2012 08:30:39 -0400
From: Johnicholas Hines <[email protected]>
To: Kragen Javier Sitaker <[email protected]>
Subject: Re: backtracking HTML templating

I tried to send this to kragen-discuss, but it was rejected.

On Fri, Apr 13, 2012 at 8:20 AM, Johnicholas Hines <[email protected]> wrote:

> The way you handle loops seems different from Prolog.
> If I understand correctly, there are two ways Prolog does loops: via
> recursion like in a functional programming language, decomposing a
> structured list or an integer, or using 'repeat', which plays with the
> four-port model of backtracking.
> I can't see what technique would be more advantageous, but looking at the
> alternatives might be interesting.
>
> A little googling reveals that there is a webby logic programming tool
> called Pillow.
> http://www.clip.dia.fi.upm.es/Software/pillow/
> Alternatively, you could do backtracking with MonadPlus in Haskell, and
> connect it to the web using something like HappStack:
> http://happstack.com/docs/crashcourse/index.html
>
> Why is it valuable (I'm not contesting that it is valuable) to do
> backtracking in formatting?
> Is it something about the pragmatics of speech?
> For example, if someone says "I have a bus to catch, how do I get to the
> bus station?",
> a reasonable pragmatic response would be both directions and,
> if it is a holiday and the bus is not running, the information that the
> bus is not running.
> There's a speaker/listener game, where a speaker does better if they have
> a model of how the listener will interpret what they say:
> http://www.aclweb.org/anthology/D/D10/D10-1040.pdf
>
> Can we make some sort of Grice implicature model of the web page reader,
> and use it to explain what went wrong with the non-backtracking formatting?
> For example, if you just leave off the author information, rather than
> explicitly marking "No author available", then the listener, leaning on a
> presumption that the web site is cooperating, might presume that the web
> site generally does not have author information.
> Saying "No author available" is actively cancelling that presumption, to
> create a more accurate schema of what the web site generally provides.
>
> Johnicholas
>
> On Fri, Apr 13, 2012 at 12:30 AM, Kragen Javier Sitaker <
> [email protected]> wrote:
>
>> This is an idea originally from 2010-04-08 that I just never got
>> around to publishing until now.
>>
>> Now that I’ve written the below, I’m tempted to try to hack this
>> together tonight, but I think I should probably sleep instead, and
>> maybe do this on the weekend.
>>
>> Motivation
>> ----------
>>
>> This week I’ve been working on a bibliographic website, which, among
>> other things, renders citations into HTML. And this is resulting in
>> me writing a lot of HAML templates that say things like:
>>
>>     - if @publication.booktitle?
>>       In #{publication.booktitle}.
>>     - if @publication.month?
>>       #{@publication.month} #{@publication.year}.
>>
>> The general pattern here is that there are one or more properties that
>> need to be present, and if they’re present, we format them together,
>> along with some other window dressing intended to format them
>> properly. But this involves a lot of very local duplication in the
>> code, and even though that duplication is local, it is still
>> error-prone; consider what happens in the above case if the year is
>> missing.
>>
>> Now, aside from the question of whether there are already existing
>> BibTeX HTML formatters for Rails, this is an interesting kind of
>> problem to solve.
>>
>> Introducing the solution: an example
>> ------------------------------------
>>
>> So I remembered this thing I’d written up in a notebook a couple of
>> years ago, under the name "Backtracking HTML Templates", which makes
>> that kind of thing very simple, and eliminates the duplication.
>> Here’s a presentation by example:
>>
>>     In $booktitle.<;> $month $year.<;>
>>
>> This template consists of three "transactions", separated by `<;>`.
>> The first one concatenates "In ", the value of the variable
>> `booktitle`, and ".". If it succeeds, that concatenation will be
>> emitted. If it fails, nothing is emitted for that transaction. It
>> will fail if any of the three things it’s concatenating fails. Two of
>> them are literal strings, which always succeed, but the middle one is
>> a variable reference, which will only succeed if the variable is set.
>> The second transaction is similar, but it interpolates two variables
>> instead of one, so it fails unless *both* variables are set. The
>> third transaction is the empty string after the second `<;>`, which is
>> boring but worth mentioning.
>>
>> A longer example
>> ----------------
>>
>> This shows off the full feature set of this templating language,
>> albeit in a fairly artificial setting:
>>
>>     <!DOCTYPE html>
>>     <html>
>>       <head><title><{>$title - rabujogopo.com
>>       <|>RABUJOGOPO<}></title></head>
>>       <body><;>
>>         <h1>$title</h1><;>
>>
>>         <p> Hello<{>, $fullname<|>, $firstname $lastname<;><}>. <;> You
>>         have arrived from a search engine seeking $searchterms. <;>
>>         This article is tagged as
>>           <{><@tags><a href="/tag/$tagname">$tagname</a><,> <}>.
>>         <|> This article is not tagged.<;>
>>         This article is important. <if $importance == high> <;>
>>         </p>
>>         <{>
>>           <@sections>
>>             <h2>$title</h2> <!-- the section title -->
>>             <@contents>
>>               <if $type == paragraph>
>>                 <p>$text</p>
>>               <|>
>>                 <@subsections>
>>                   <h3>$title</h3>
>>                   <@contents>
>>                     <p>$p</p>
>>         <}>
>>       </body></html>
>>
>> ### Informal explanation ###
>>
>> This contains literal text, `$var`, `<;>`, `<{><}>`, `<|>`,
>> `<if ...>`, `<@var>`, and `<,>`, which are the entire feature set of
>> the language. I’ve already explained literal text, `$var`, `<;>`, and
>> concatenation.
>>
>> #### Alternation: `<|>` and `<{><}>` ####
>>
>> The `<;>` is used to isolate a transaction, so that if it fails, you
>> just get an empty string instead of, say, an error-message page. It’s
>> just syntactic sugar for the more general alternation construct
>> written with `<|>`, though. This template:
>>
>>     $foo bar<;>
>>
>> could be written like this, and mean exactly the same thing:
>>
>>     <{>$foo bar<|><}>
>>
>> The `<|>` makes an *alternation* of two transactions. If the first
>> transaction succeeds, the alternation yields the result of the first
>> transaction; if it fails, the alternation runs the other transaction
>> and yields its result. In this case, the second transaction is the
>> empty string, which will always succeed. The `<{><}>` syntactically
>> delimit the reach of the `<|>` operator, so that it doesn’t suck up
>> your entire document as its operands.
>>
>> But your second transaction could be something more interesting than
>> an empty string. For example, it could explain that some piece of
>> data was missing.
>>
>>     <{>By $author.<|>Author unknown.<}>
>>
>> So, you see, in the longer example template above, the page title has
>> a fallback of “RABUJOGOPO”. And we greet the user by the variable
>> `fullname` if we have it, otherwise `firstname lastname`, or otherwise
>> just with “Hello.”
>>
>> #### `<if condition>` ####
>>
>> `<if condition>` succeeds and produces the empty string if `condition`
>> is true, and fails otherwise. Because of how failure works, you can
>> put it before, after, or in the middle of the text you want it to
>> control. So far I’ve only thought about string-equality conditions.
>>
>> So, in the example above, the sentence “This article is important.” is
>> only emitted if the variable `importance` has the value `high`,
>> because otherwise the `<if>` fails, failing the whole transaction
>> containing that sentence.
>>
>> #### Iteration: `<@var>` and `<,>` ####
>>
>> `<@var>` begins a loop. The variable named needs to name a list of
>> dicts. If the variable doesn’t exist or is an empty list, the
>> transaction fails. Otherwise, the text that follows `<@var>` is
>> evaluated once for each of the dicts in the list (with the names from
>> the dict added to the local namespace), and the resulting values are
>> concatenated. If any of the iterations of the loop fails, the whole
>> loop fails.
>>
>> (This namespace thing is the one thing I’m not sure about here;
>> merging the global namespace with the per-iteration namespace means
>> that both human readers and the compiler have to guess which scope a
>> given variable refers to.)
>>
>> Because the loop fails if it executes zero times, you can use `<|>` to
>> provide an alternative to display in the case where a collection came
>> up empty, as in the “This article is not tagged.” example above.
>>
>> `<,>` is an optional part of the loop construct. `<@var>a<,>b`
>> evaluates `ab` once for each item in the list *except the last*, and
>> for the last item, evaluates only `a`.
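
The evaluation rules quoted above (transactions, `<;>`, `<|>`, `<if>`, `<@var>`, and `<,>`) can be modelled with a handful of combinators. This is only a sketch in Python under my own assumptions, not the implementation the message has in mind: the names `Fail`, `lit`, `var`, `seq`, `alt`, `opt`, `when`, and `loop` are invented for illustration, failure is modelled as an exception, there is no parser for the template syntax, and no HTML escaping is done.

```python
class Fail(Exception):
    """Raised when a transaction cannot produce output (hypothetical name)."""


def lit(text):
    """Literal text: always succeeds."""
    return lambda env: text


def var(name):
    """`$name`: succeeds only if the variable is set to a string."""
    def run(env):
        value = env.get(name)
        if not isinstance(value, str):
            raise Fail(name)
        return value
    return run


def seq(*parts):
    """Concatenation: fails if any part fails."""
    return lambda env: "".join(part(env) for part in parts)


def alt(*choices):
    """`a<|>b`: yield the first alternative that succeeds."""
    def run(env):
        for choice in choices:
            try:
                return choice(env)
            except Fail:
                continue
        raise Fail("no alternative succeeded")
    return run


def opt(part):
    """`part<;>`: sugar for an alternation with the empty string."""
    return alt(part, lit(""))


def when(name, expected):
    """`<if $name == expected>`: emits nothing, fails unless equal."""
    def run(env):
        if env.get(name) != expected:
            raise Fail(name)
        return ""
    return run


def loop(name, body, sep=None):
    """`<@name>body<,>sep`: fails on a missing or empty list."""
    def run(env):
        items = env.get(name)
        if not isinstance(items, list) or not items:
            raise Fail(name)
        pieces = []
        for index, item in enumerate(items):
            scope = {**env, **item}  # per-iteration names shadow outer ones
            pieces.append(body(scope))
            if sep is not None and index < len(items) - 1:
                pieces.append(sep(scope))
        return "".join(pieces)
    return run


# In $booktitle.<;> $month $year.<;>
citation = seq(
    opt(seq(lit("In "), var("booktitle"), lit("."))),
    opt(seq(lit(" "), var("month"), lit(" "), var("year"), lit("."))),
)

# This article is tagged as <{><@tags>...<,> <}>.<|>This article is not tagged.
link = seq(lit('<a href="/tag/'), var("tagname"), lit('">'),
           var("tagname"), lit("</a>"))
tagline = alt(
    seq(lit("This article is tagged as "),
        loop("tags", link, sep=lit(" ")), lit(".")),
    lit("This article is not tagged."),
)

# This article is important. <if $importance == high> <;>
important = opt(seq(lit("This article is important."),
                    when("importance", "high")))
```

With these definitions, `citation({"booktitle": "Proceedings", "month": "April"})` gives `In Proceedings.`: the second transaction fails as a whole because `year` is missing, so the dangling month is never emitted. Likewise `tagline({"tags": []})` falls through to `This article is not tagged.`.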
>>
>> So, in the above, we actually have loops nested four deep, with
>> sections containing contents, which may contain either text or
>> subsections, which contain further contents. And in the loop over
>> `<@tags>`, the links to the individual tags are separated by spaces,
>> but the last tag is not followed by a space, because the space is
>> between the `<,>` and the outside of the `<{><}>`.
>>
>> Some notes on grammar
>> ---------------------
>>
>> For the most part, this is a relatively traditional infix grammar,
>> masquerading as a markup language. `<{><}>` provide nothing more than
>> syntactic grouping; `<|>`, `<;>`, bare juxtaposition, and `<@var>` are
>> infix operators; and `<,>` can be thought of as an infix operator that
>> only makes sense within the right operand of `<@var>`.
>>
>> The precedence order I’ve implicitly used above, from tightest binding
>> to loosest:
>>
>> * concatenation or juxtaposition;
>> * `<|>`
>> * `<;>`
>> * `<@var>` and `<,>`
>>
>> I don’t know if this is the best order, or even a sane one.
>>
>> As it happens, all of juxtaposition, `<|>`, and `<;>` are associative,
>> which provides some liberty; but the looping construct is not.
>> `<{> a <@b> c <}> <@d> e` is not the same as
>> `a <@b> <{> c <@d> e <}>`,
>> because in the first case, `<@b>` and `<@d>` are separate iterations
>> that get concatenated, and in the second case, `<@d>` is nested inside
>> `<@b>`.
>>
>> (It’s kind of nice to have a syntactically flat structure that’s
>> nevertheless capable of expressing complex things.)
>>
>> In the above, I treated `<@var>` as right-associative: `<@a><@b>x` is
>> `<@a><{><@b>x<}>`, not `<{><@a><}><@b>x`. That was just because it
>> made that particular example come out syntactically simpler, not
>> because of any deep analysis.
>>
>> Variations
>> ----------
>>
>> The current syntax completely fails to take advantage of the most
>> desirable ASCII punctuation characters, which are “.”, “:”, “'”, and
>> “"”. You could write `<.sections>` instead of `<@sections>`, and
>> `<: $importance == high >` instead of `<if $importance == high>`, thus
>> reducing the visual noise a bit. However, `<if ...>` is more
>> understandable. Maybe `<:if ...>` to avoid clashes with SGML.
>>
>> Similarly, `<<` and `>>` might be better than `<{>` and `<}>`, which
>> look a bit unbalanced. `>>` has the disadvantage that it could
>> legitimately occur in normal HTML, and both could occur in normal JS
>> or similar languages. Also, they’re ambiguous when they’re used with
>> HTML tags adjacent to them.
>>
>> With those two variations we would get:
>>
>>     <!DOCTYPE html>
>>     <html>
>>       <head><title><<$title - rabujogopo.com<|>RABUJOGOPO>></title></head>
>>       <body><;>
>>         <h1>$title</h1><;>
>>
>>         <p> Hello<<, $fullname<|>, $firstname $lastname<;>>>. <;> You
>>         have arrived from a search engine seeking $searchterms. <;>
>>         This article is tagged as
>>           <<<.tags><a href="/tag/$tagname">$tagname</a><,> >>.
>>         <|> This article is not tagged.<;>
>>         This article is important. <:$importance == high> <;>
>>         </p>
>>         <<
>>           <.sections>
>>             <h2>$title</h2> <!-- the section title -->
>>             <.contents>
>>               <:$type == paragraph>
>>                 <p>$text</p>
>>               <|>
>>                 <.subsections>
>>                   <h3>$title</h3>
>>                   <.contents>
>>                     <p>$p</p>
>>         >>
>>       </body></html>
>>
>> You need some kind of escaping mechanism in any case.
>>
>> The loop-namespace problem bothers me. One approach is to name the
>> loop variable, and extend `$var` to `$var.prop`:
>>
>>     <@sec in $sections>
>>       <h2>$sec.title</h2>
>>       <@c in $sec.contents>
>>         <if $c.type == paragraph>
>>           <p>$c.text</p>
>>         <|>
>>           <@s in $c.subsections>
>>             <h3>$s.title</h3>
>>             <@p in $s.contents>
>>               <p>$p.p</p>
>>
>> Another approach might be to use Perl/BASIC/Ruby/CoffeeScript sigils
>> to distinguish variables from different scopes. Ruby has
>> `@instance_variables` and `@@class_variables`. If we only supported
>> two levels of scope (globals, and variables inside the innermost
>> loop) we could do the same thing, using separate sigils, like `$.` and
>> `<@. >`, for loop-locals:
>>
>>     <@sections>
>>       <h2>$.title</h2> <!-- the section title -->
>>       <@.contents>
>>         <if $.type == paragraph>
>>           <p>$.text</p>
>>         <|>
>>           <@.subsections>
>>             <h3>$.title</h3>
>>             <@.contents>
>>               <p>$.p</p>
>>
>> I think that if you want to have more than two active scopes, you
>> can’t rely on sigils; you end up having to count nested scopes, and
>> basically writing de Bruijn indices by hand, which is not an
>> acceptable user interface. You’d need to use the `$c.text` approach I
>> suggested earlier.
>>
>> I had at one point thought about making the data model simpler:
>> instead of a namespace being a mapping from names to either strings or
>> lists of namespaces, a namespace would be a mapping from names to
>> values, where a value was either a string or a list of values, like in
>> Scheme’s macro system. So you’d iterate down lists in parallel,
>> hoping they were the same length and depth:
>>
>>     <dl>
>>       <{> <@ $n $v>
>>         <dt>$n</dt>
>>         <dd>$v</dd>
>>       <}>
>>     </dl>
>>     <|>
>>     Empty.
>>
>> Or:
>>
>>     <p> <@ $word $synonym>
>>       <b>$word</b>: <{><@ $synonym>$synonym<,>, <}>
>>     </p>
>>
>> In Scheme, you only have to mention the variable names once, instead
>> of twice; you stick a `...` after the structure containing them to
>> indicate at which syntactic level the repetition is supposed to occur.
>> If you did the same thing here, using `<[><]>` to indicate repetition,
>> you’d get this:
>>
>>     <[>
>>       <p><b>$word</b>: <[>$synonym<,>, <]></p>
>>     <]>
>>
>> That gives the same kind of DRY implicit-DWIM feeling to iteration
>> that `<|>` gives to conditionals, and with the sigil approach, you
>> could even mix global variables with loop variables. Still, I’m not
>> convinced that it’s better, particularly since it forces what I think
>> of as a fairly inflexible data model.
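
The parallel-iteration data model at the end (names bound to strings or to lists of values, walked in lockstep) might be sketched as below. This is a rough Python illustration under stated assumptions rather than anything the message specifies: `Fail`, `parallel_loop`, `entry`, and `thesaurus` are hypothetical names, the `<,>` separator is simplified to a literal string, and lists that are empty or of unequal length are treated as a failure, which the message leaves open ("hoping they were the same length and depth").

```python
class Fail(Exception):
    """Signals that a transaction failed and should emit nothing."""


def parallel_loop(names, body, sep=""):
    """Sketch of `<@ $a $b>body<,>sep`: walk several lists in lockstep.

    `body` is any callable taking a namespace dict and returning a string.
    """
    def run(env):
        columns = [env.get(name) for name in names]
        if not all(isinstance(column, list) for column in columns):
            raise Fail("not every loop variable is bound to a list")
        lengths = {len(column) for column in columns}
        if len(lengths) != 1 or 0 in lengths:
            # Assumption: empty or unequal-length lists fail the loop.
            raise Fail("lists are empty or differ in length")
        pieces = []
        for row in zip(*columns):
            # Rebind each name to its current element for this iteration.
            scope = {**env, **dict(zip(names, row))}
            pieces.append(body(scope))
        return sep.join(pieces)
    return run


def entry(scope):
    """<p><b>$word</b>: <{><@ $synonym>$synonym<,>, <}></p>"""
    synonyms = parallel_loop(["synonym"], lambda s: s["synonym"], sep=", ")
    return "<p><b>" + scope["word"] + "</b>: " + synonyms(scope) + "</p>"


# <@ $word $synonym> ... : `synonym` is a list of lists, one per word.
thesaurus = parallel_loop(["word", "synonym"], entry)

data = {
    "word": ["big", "small"],
    "synonym": [["large", "huge"], ["tiny"]],
}
print(thesaurus(data))
```

Inside the outer loop `synonym` is rebound from the list of lists to one word's list, which is what lets the inner `<@ $synonym>` reuse the same name; that rebinding is also where depth mismatches would surface as failures.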
>> --
>> To unsubscribe: http://lists.canonical.org/mailman/listinfo/kragen-tol
>
>
>

----- End forwarded message -----
--
To unsubscribe: http://lists.canonical.org/mailman/listinfo/kragen-discuss
