[[email protected]: Re: backtracking HTML templating]

Kragen Javier Sitaker Fri, 13 Apr 2012 07:34:58 -0700

Oops.

----- Forwarded message from Johnicholas Hines <[email protected]> -----


Date: Fri, 13 Apr 2012 08:30:39 -0400
From: Johnicholas Hines <[email protected]>
To: Kragen Javier Sitaker <[email protected]>
Subject: Re: backtracking HTML templating

I tried to send this to kragen-discuss, but it was rejected.

On Fri, Apr 13, 2012 at 8:20 AM, Johnicholas Hines <
[email protected]> wrote:

> The way you handle loops seems different from Prolog.
> If I understand correctly, there are two ways Prolog does loops: via
> recursion like in a functional programming language, decomposing a
> structured list or an integer, or using 'repeat', which plays with the
> four-port model of backtracking.
> I can't see what technique would be more advantageous, but looking at the
> alternatives might be interesting.
>
> A little googling reveals that there is a webby logic programming tool
> called Pillow.
> http://www.clip.dia.fi.upm.es/Software/pillow/
> Alternatively, you could do backtracking with MonadPlus in Haskell, and
> connect it to the web using something like Happstack:
> http://happstack.com/docs/crashcourse/index.html
>
> Why is it valuable (I'm not contesting that it is valuable) to do
> backtracking in formatting?
> Is it something about the pragmatics of speech?
> For example, if someone says "I have a bus to catch, how do I get to the
> bus station?",
> a reasonable pragmatic response would be both directions and,
> if it is a holiday and the bus is not running, the information that the
> bus is not running.
> There's a speaker/listener game, where a speaker does better if they have
> a model of how the listener will interpret what they say:
> http://www.aclweb.org/anthology/D/D10/D10-1040.pdf
>
> Can we make some sort of Grice implicature model of the web page reader,
> and use it to explain what went wrong with the non-backtracking formatting?
> For example, if you just leave off the author information, rather than
> explicitly marking "No author available", then the listener, leaning on a
> presumption that the web site is cooperating, might presume that the web
> site generally does not have author information.
> Saying "No author available" is actively cancelling that presumption, to
> create a more accurate schema of what the web site generally provides.
>
> Johnicholas
>
> On Fri, Apr 13, 2012 at 12:30 AM, Kragen Javier Sitaker <
> [email protected]> wrote:
>
>> This is an idea originally from 2010-04-08 that I just never got
>> around to publishing until now.
>>
>> Now that I’ve written the below, I’m tempted to try to hack this
>> together tonight, but I think I should probably sleep instead, and
>> maybe do this on the weekend.
>>
>> Motivation
>> ----------
>>
>> This week I’ve been working on a bibliographic website, which, among
>> other things, renders citations into HTML.  And this is resulting in
>> me writing a lot of HAML templates that say things like:
>>
>>    - if @publication.booktitle?
>>      In #{@publication.booktitle}.
>>    - if @publication.month?
>>      #{@publication.month} #{@publication.year}.
>>
>> The general pattern here is that there are one or more properties that
>> need to be present, and if they’re present, we format them together,
>> along with some other window dressing intended to format them
>> properly.  But this involves a lot of very local duplication in the
>> code, and even though that duplication is local, it is still
>> error-prone; consider what happens in the above case if the year is
>> missing: only the month is checked, so a dangling “April .” comes out.
>>
>> Now, aside from the question of whether there are already existing
>> BibTeX HTML formatters for Rails, this is an interesting kind of
>> problem to solve.
>>
>> Introducing the solution: an example
>> ------------------------------------
>>
>> So I remembered this thing I’d written up in a notebook a couple of
>> years ago, under the name "Backtracking HTML Templates", which makes
>> that kind of thing very simple, and eliminates the duplication.
>> Here’s a presentation by example:
>>
>>    In $booktitle.<;> $month $year.<;>
>>
>> This template consists of three "transactions", separated by `<;>`.
>> The first one concatenates "In ", the value of the variable
>> `booktitle`, and ".".  If it succeeds, that concatenation will be
>> emitted.  If it fails, nothing is emitted for that transaction.  It
>> will fail if any of the three things it’s concatenating fail.  Two of
>> them are literal strings, which always succeed, but the middle one is
>> a variable reference, which will only succeed if the variable is set.
>> The second transaction is similar, but it interpolates two variables
>> instead of one, so it fails unless *both* variables are set.  The
>> third transaction is the empty string after the second `<;>`, which is
>> boring but worth mentioning.
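
A minimal Python sketch of the transaction semantics described above, assuming a template is modelled as a function from a dict of variables to a string. The names `Fail`, `lit`, `var`, `seq`, and `transaction` are invented here for illustration and are not part of the proposal; treating a `None` value as "unset" is also an assumption.

```python
class Fail(Exception):
    """Raised when a piece of a template cannot be rendered."""


def lit(text):
    # A literal string always succeeds.
    return lambda env: text


def var(name):
    # A variable reference succeeds only if the variable is set.
    def render(env):
        if env.get(name) is None:
            raise Fail(name)
        return str(env[name])
    return render


def seq(*parts):
    # Concatenation fails if any of its parts fails.
    return lambda env: "".join(part(env) for part in parts)


def transaction(*parts):
    # `<;>`: emit the concatenation if it succeeds, otherwise emit nothing.
    body = seq(*parts)

    def render(env):
        try:
            return body(env)
        except Fail:
            return ""
    return render


# In $booktitle.<;> $month $year.<;>
citation = seq(
    transaction(lit("In "), var("booktitle"), lit(".")),
    transaction(lit(" "), var("month"), lit(" "), var("year"), lit(".")),
    transaction(),  # the empty third transaction
)

print(citation({"booktitle": "Proceedings", "month": "April", "year": "2012"}))
print(citation({"booktitle": "Proceedings", "month": "April"}))
```

With the year missing, the second transaction fails as a unit, so no dangling month is printed.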
>>
>> A longer example
>> ----------------
>>
>> This shows off the full feature set of this templating language,
>> albeit in a fairly artificial setting:
>>
>>    <!DOCTYPE html>
>>    <html>
>>      <head><title><{>$title - rabujogopo.com<|>RABUJOGOPO<}></title></head>
>>    <body><;>
>>      <h1>$title</h1><;>
>>
>>      <p> Hello<{>, $fullname<|>, $firstname $lastname<;><}>. <;> You
>>          have arrived from a search engine seeking $searchterms. <;>
>>          This article is tagged as
>>          <{><@tags><a href="/tag/$tagname">$tagname</a><,> <}>.
>>          <|> This article is not tagged.<;>
>>          This article is important. <if $importance == high> <;>
>>      </p>
>>      <{>
>>        <@sections>
>>        <h2>$title</h2> <!-- the section title -->
>>        <@contents>
>>        <if $type == paragraph>
>>          <p>$text</p>
>>        <|>
>>          <@subsections>
>>          <h3>$title</h3>
>>          <@contents>
>>          <p>$p</p>
>>      <}>
>>    </body></html>
>>
>> ### Informal explanation ###
>>
>> This contains literal text, `$var`, `<;>`, `<{><}>`, `<|>`, `<if
>> ...>`, `<@var>`, and `<,>`, which are the entire feature set of the
>> language.  I’ve already explained literal text, `$var`, `<;>`, and
>> concatenation.
>>
>> #### Alternation: `<|>` and `<{><}>` ####
>>
>> The `<;>` is used to isolate a transaction, so that if it fails, you
>> just get an empty string instead of, say, an error-message page.  It’s
>> just syntactic sugar for the more general alternation construct
>> written with `<|>`, though.  This template:
>>
>>    $foo bar<;>
>>
>> could be written like this, and mean exactly the same thing:
>>
>>    <{>$foo bar<|><}>
>>
>> The `<|>` makes an *alternation* of two transactions.  If the first
>> transaction succeeds, the alternation yields the result of the first
>> transaction; if it fails, the alternation runs the other transaction
>> and yields its result.  In this case, the second transaction is the
>> empty string, which will always succeed.  The `<{><}>` syntactically
>> delimit the reach of the `<|>` operator, so that it doesn’t suck up
>> your entire document as its operands.
>>
>> But your second transaction could be something more interesting than
>> an empty string.  For example, it could explain that some piece of
>> data was missing.
>>
>>    <{>By $author.<|>Author unknown.<}>
>>
>> So, you see, in the longer example template above, the page title has
>> a fallback of “RABUJOGOPO”.  And we greet the user by the variable
>> `fullname` if we have it, otherwise `firstname lastname`, or otherwise
>> just with “Hello.”
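
One way to read the alternation rule, sketched in Python under the assumption that a template is a function from a dict to a string and that failure is signalled with an exception. `Fail`, `lit`, `var`, `seq`, and `alt` are hypothetical helper names, not part of the proposal.

```python
class Fail(Exception):
    """Raised when a piece of a template cannot be rendered."""


def lit(text):
    return lambda env: text


def var(name):
    def render(env):
        if env.get(name) is None:
            raise Fail(name)
        return str(env[name])
    return render


def seq(*parts):
    return lambda env: "".join(part(env) for part in parts)


def alt(*choices):
    # `<|>`: yield the first alternative that succeeds; fail if none does.
    def render(env):
        for choice in choices:
            try:
                return choice(env)
            except Fail:
                continue
        raise Fail("no alternative succeeded")
    return render


# <{>By $author.<|>Author unknown.<}>
byline = alt(seq(lit("By "), var("author"), lit(".")), lit("Author unknown."))

# Hello<{>, $fullname<|>, $firstname $lastname<;><}>.
greeting = seq(
    lit("Hello"),
    alt(
        seq(lit(", "), var("fullname")),
        seq(lit(", "), var("firstname"), lit(" "), var("lastname")),
        lit(""),  # the trailing `<;>` is an empty last alternative
    ),
    lit("."),
)

print(byline({}))                              # Author unknown.
print(greeting({"fullname": "Ada Lovelace"}))  # Hello, Ada Lovelace.
print(greeting({"firstname": "Ada"}))          # Hello.
```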
>>
>> #### `<if condition>` ####
>>
>> `<if condition>` succeeds and produces the empty string if `condition`
>> is true, and fails otherwise.  Because of how failure works, you can
>> put it before, after, or in the middle of the text you want it to
>> control.  So far I’ve only thought about string-equality conditions.
>>
>> So, in the example above, the sentence “This article is important.” is
>> only emitted if the variable `importance` has the value `high`,
>> because otherwise the `<if>` fails, failing the whole transaction
>> containing that sentence.
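
A small Python sketch of why the position of `<if ...>` inside a transaction does not matter, assuming failure is modelled as an exception that aborts the whole enclosing transaction. The helper names (`Fail`, `lit`, `when_equal`, `transaction`) are made up for this illustration.

```python
class Fail(Exception):
    """Raised when a piece of a template cannot be rendered."""


def lit(text):
    return lambda env: text


def when_equal(name, expected):
    # `<if $name == expected>`: produce "" if the condition holds, else fail.
    def render(env):
        if env.get(name) != expected:
            raise Fail(f"{name} is not {expected}")
        return ""
    return render


def transaction(*parts):
    # `<;>`: all-or-nothing concatenation.
    def render(env):
        try:
            return "".join(part(env) for part in parts)
        except Fail:
            return ""
    return render


# This article is important. <if $importance == high> <;>
after = transaction(lit("This article is important."), when_equal("importance", "high"))
# <if $importance == high> This article is important. <;>
before = transaction(when_equal("importance", "high"), lit("This article is important."))

for env in ({"importance": "high"}, {"importance": "low"}, {}):
    print(repr(after(env)), repr(before(env)))
```

Both placements print the sentence for `high` and the empty string otherwise, because the failure discards everything the transaction had accumulated.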
>>
>> #### Iteration: `<@var>` and `<,>` ####
>>
>> `<@var>` begins a loop.  The variable named needs to name a list of
>> dicts.  If the variable doesn’t exist or is an empty list, the
>> transaction fails.  Otherwise, the text that follows `<@var>` is
>> evaluated once for each dict in the list, with the names from
>> that dict added to the local namespace, and the resulting values are
>> concatenated.  If any of the iterations of the loop fails, the whole
>> loop fails.
>>
>> (This namespace thing is the one thing I’m not sure about here;
>> merging the global namespace with the per-iteration namespace means
>> that both human readers and the compiler have to guess which scope a
>> given variable refers to.)
>>
>> Because the loop fails if it executes zero times, you can use `<|>` to
>> provide an alternative to display in the case where a collection came
>> up empty, as in the “This article is not tagged.” example above.
>>
>> `<,>` is an optional part of the loop construct.  `<@var>a<,>b`
>> evaluates `ab` once for each item in the list *except the last*, and
>> for the last item, evaluates only `a`.
>>
>> So, in the above, we actually have loops nested four deep, with
>> sections containing contents, which may contain either text or
>> subsections, which contain further contents.  And in the loop over
>> `<@tags>`, the links to the individual tags are separated by spaces,
>> but the last tag is not followed by a space, because the space sits
>> between the `<,>` and the closing `<}>`, so it acts only as a separator.
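
The loop rules can be sketched in Python as follows, assuming the merged-namespace model described above (each dict's names are layered over the enclosing variables). `loop` and its `sep` parameter are hypothetical names standing in for `<@var>` and `<,>`.

```python
class Fail(Exception):
    """Raised when a piece of a template cannot be rendered."""


def lit(text):
    return lambda env: text


def var(name):
    def render(env):
        if env.get(name) is None:
            raise Fail(name)
        return str(env[name])
    return render


def seq(*parts):
    return lambda env: "".join(part(env) for part in parts)


def alt(*choices):
    def render(env):
        for choice in choices:
            try:
                return choice(env)
            except Fail:
                continue
        raise Fail("no alternative succeeded")
    return render


def loop(name, body, sep=None):
    # `<@name>body<,>sep`: fail on a missing or empty list; otherwise render
    # body per item, adding sep after every item except the last.
    def render(env):
        items = env.get(name)
        if not isinstance(items, list) or not items:
            raise Fail(name)
        out = []
        for i, item in enumerate(items):
            inner = {**env, **item}  # item names shadow outer names
            out.append(body(inner))
            if sep is not None and i < len(items) - 1:
                out.append(sep(inner))
        return "".join(out)
    return render


link = seq(lit('<a href="/tag/'), var("tagname"), lit('">'), var("tagname"), lit("</a>"))
tag_line = alt(
    seq(lit("This article is tagged as "), loop("tags", link, sep=lit(" ")), lit(".")),
    lit("This article is not tagged."),
)

print(tag_line({"tags": [{"tagname": "prolog"}, {"tagname": "html"}]}))
print(tag_line({"tags": []}))
```

An item that lacks `tagname` makes that iteration fail, which fails the whole loop and falls through to the "not tagged" alternative.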
>>
>> Some notes on grammar
>> ---------------------
>>
>> For the most part, this is a relatively traditional infix grammar,
>> masquerading as a markup language.  `<{><}>` provide nothing more than
>> syntactic grouping; `<|>`, `<;>`, bare juxtaposition, and `<@var>` are
>> infix operators; and `<,>` can be thought of as an infix operator that
>> only makes sense within the right operand of `<@var>`.
>>
>> The precedence order I’ve implicitly used above, from tightest binding
>> to loosest:
>>
>> * concatenation or juxtaposition;
>> * `<|>`
>> * `<;>`
>> * `<@var>` and `<,>`
>>
>> I don’t know if this is the best order, or even a sane one.
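
Before any precedence rules apply, the markup has to be split into operators, variable references, and literal text. A rough Python tokenizer sketch follows; the token shapes are guesses from the examples in this message (only `$var == value` conditions, `\w+` variable names), and it has no escaping mechanism.

```python
import re

TOKEN = re.compile(
    r"""
      (?P<op><[;|{},]>)                                # <;> <|> <{> <}> <,>
    | <@(?P<loop>\w+)>                                 # <@var>
    | <if\s+\$(?P<lhs>\w+)\s*==\s*(?P<rhs>[^>]*?)\s*>  # <if $var == value>
    | \$(?P<var>\w+)                                   # $var
    """,
    re.VERBOSE,
)


def tokenize(template):
    """Split a template into ("text" | "op" | "loop" | "if" | "var", ...) tuples."""
    tokens, pos = [], 0
    for m in TOKEN.finditer(template):
        if m.start() > pos:
            # Everything between recognised tokens is literal text,
            # including ordinary HTML tags such as <p> or <h1>.
            tokens.append(("text", template[pos:m.start()]))
        if m.group("op"):
            tokens.append(("op", m.group("op")))
        elif m.group("loop"):
            tokens.append(("loop", m.group("loop")))
        elif m.group("lhs"):
            tokens.append(("if", m.group("lhs"), m.group("rhs")))
        else:
            tokens.append(("var", m.group("var")))
        pos = m.end()
    if pos < len(template):
        tokens.append(("text", template[pos:]))
    return tokens


for token in tokenize("<h1>$title</h1><;> Important. <if $importance == high> <;>"):
    print(token)
```

A precedence-climbing parser over this token stream could then implement whichever binding order turns out to be sane.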
>>
>> As it happens, all of juxtaposition, `<|>`, and `<;>` are associative,
>> which provides some liberty; but the looping construct is not.
>> `<{> a <@b> c <}> <@d> e` is not the same as
>> `a <@b> <{> c <@d> e <}>`,
>> because in the first case, `<@b>` and `<@d>` are separate iterations
>> that get concatenated, and in the second case, `<@d>` is nested inside
>> `<@b>`.
>>
>> (It’s kind of nice to have a syntactically flat structure that’s
>> nevertheless capable of expressing complex things.)
>>
>> In the above, I treated `<@var>` as right-associative: `<@a><@b>x` is
>> `<@a><{><@b>x<}>`, not `<{><@a><}><@b>x`.  That was just because it
>> made that particular example come out syntactically simpler, not
>> because of any deep analysis.
>>
>> Variations
>> ----------
>>
>> The current syntax completely fails to take advantage of the most
>> desirable ASCII punctuation characters, which are “.”, “:”, “'”, and
>> “"”.  You could write `<.sections>` instead of `<@sections>`, and `<:
>> $importance == high >` instead of `<if $importance == high>`, thus
>> reducing the visual noise a bit.  However, `<if ...>` is more
>> understandable.  Maybe `<:if ...>` to avoid clashes with SGML.
>>
>> Similarly, `<<` and `>>` might be better than `<{>` and `<}>`, which
>> look a bit unbalanced.  `>>` has the disadvantage that it could
>> legitimately occur in normal HTML, and both could occur in normal JS
>> or similar languages.  Also, they’re ambiguous when they’re used with
>> HTML tags adjacent to them.
>>
>> With those two variations we would get:
>>
>>    <!DOCTYPE html>
>>    <html>
>>      <head><title><<$title - rabujogopo.com<|>RABUJOGOPO>></title></head>
>>    <body><;>
>>      <h1>$title</h1><;>
>>
>>      <p> Hello<<, $fullname<|>, $firstname $lastname<;>>>. <;> You
>>          have arrived from a search engine seeking $searchterms. <;>
>>          This article is tagged as
>>          <<<.tags><a href="/tag/$tagname">$tagname</a><,> >>.
>>          <|> This article is not tagged.<;>
>>          This article is important. <:$importance == high> <;>
>>      </p>
>>      <<
>>        <.sections>
>>        <h2>$title</h2> <!-- the section title -->
>>        <.contents>
>>        <:$type == paragraph>
>>          <p>$text</p>
>>        <|>
>>          <.subsections>
>>          <h3>$title</h3>
>>          <.contents>
>>          <p>$p</p>
>>      >>
>>    </body></html>
>>
>> You need some kind of escaping mechanism in any case.
>>
>> The loop-namespace problem bothers me.  One approach is to name the
>> loop variable, and extend `$var` to `$var.prop`:
>>
>>        <@sec in $sections>
>>        <h2>$sec.title</h2>
>>        <@c in $sec.contents>
>>        <if $c.type == paragraph>
>>          <p>$c.text</p>
>>        <|>
>>          <@s in $c.subsections>
>>          <h3>$s.title</h3>
>>          <@p in $s.contents>
>>          <p>$p.p</p>
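
A Python sketch of the named-loop-variable approach, assuming `$a.b` means "look up `b` in the dict bound to `a`" and that a missing step along the path fails like an unset variable. `lookup`, `var`, and `each` are illustrative names only.

```python
class Fail(Exception):
    """Raised when a piece of a template cannot be rendered."""


def lookup(env, path):
    # Resolve a dotted path such as "sec.title" one dict at a time.
    value = env
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            raise Fail(path)
        value = value[key]
    return value


def lit(text):
    return lambda env: text


def var(path):
    return lambda env: str(lookup(env, path))


def seq(*parts):
    return lambda env: "".join(part(env) for part in parts)


def each(name, path, body):
    # `<@name in $path>body`: bind each item to `name` instead of merging
    # its keys into the namespace, so every reference states its scope.
    def render(env):
        items = lookup(env, path)
        if not isinstance(items, list) or not items:
            raise Fail(path)
        return "".join(body({**env, name: item}) for item in items)
    return render


page = each("sec", "sections", seq(
    lit("<h2>"), var("sec.title"), lit("</h2>"),
    each("c", "sec.contents", seq(lit("<p>"), var("c.text"), lit("</p>"))),
))

data = {"sections": [{"title": "Intro",
                     "contents": [{"text": "Hello."}, {"text": "Bye."}]}]}
print(page(data))  # <h2>Intro</h2><p>Hello.</p><p>Bye.</p>
```

Because nothing is merged, a section's `title` can no longer be confused with a page-level `title`.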
>>
>> Another approach might be to use Perl/BASIC/Ruby/CoffeeScript sigils
>> to distinguish variables from different scopes.  Ruby has
>> `@instance_variables` and `@@class_variables`.  If we only supported
>> two levels of scope — globals, and variables inside the innermost loop
>> — we could do the same thing, using separate sigils, like `$.` and
>> `<@. >`, for loop-locals:
>>
>>        <@sections>
>>        <h2>$.title</h2> <!-- the section title -->
>>        <@.contents>
>>        <if $.type == paragraph>
>>          <p>$.text</p>
>>        <|>
>>          <@.subsections>
>>          <h3>$.title</h3>
>>          <@.contents>
>>          <p>$.p</p>
>>
>> I think that if you want to have more than two active scopes, you
>> can’t rely on sigils; you end up having to count nested scopes, and
>> basically writing de Bruijn indices by hand, which is not an
>> acceptable user interface.  You’d need to use the `$c.text` approach I
>> suggested earlier.
>>
>> I had at one point thought about making the data model simpler:
>> instead of a namespace being a mapping from names to either strings or
>> lists of namespaces, a namespace would be a mapping from names to
>> values, where a value was either a string or a list of values, like in
>> Scheme’s macro system.  So you’d iterate down lists in parallel,
>> hoping they were the same length and depth:
>>
>>      <dl>
>>        <{> <@ $n $v>
>>          <dt>$n</dt>
>>          <dd>$v</dd>
>>        <}>
>>      </dl>
>>    <|>
>>      Empty.
>>
>> Or:
>>
>>    <p> <@ $word $synonym>
>>    <b>$word</b>: <{><@ $synonym>$synonym<,>, <}>
>>    </p>
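
The parallel-list data model might be sketched in Python like this. The message only says the lists are hoped to have the same length; failing when they differ is an assumption made here, and `parallel` is a hypothetical name for `<@ $a $b>`.

```python
class Fail(Exception):
    """Raised when a piece of a template cannot be rendered."""


def lit(text):
    return lambda env: text


def var(name):
    def render(env):
        if env.get(name) is None:
            raise Fail(name)
        return str(env[name])
    return render


def seq(*parts):
    return lambda env: "".join(part(env) for part in parts)


def parallel(names, body, sep=None):
    # Walk several lists in lockstep, rebinding each name to its current element.
    def render(env):
        columns = [env.get(name) for name in names]
        if not all(isinstance(col, list) and col for col in columns):
            raise Fail("missing or empty list")
        if len({len(col) for col in columns}) != 1:
            raise Fail("lists differ in length")  # assumed behaviour
        rows = list(zip(*columns))
        out = []
        for i, row in enumerate(rows):
            inner = {**env, **dict(zip(names, row))}
            out.append(body(inner))
            if sep is not None and i < len(rows) - 1:
                out.append(sep(inner))
        return "".join(out)
    return render


entry = seq(
    lit("<p><b>"), var("word"), lit("</b>: "),
    parallel(["synonym"], var("synonym"), sep=lit(", ")),
    lit("</p>"),
)
glossary = parallel(["word", "synonym"], entry)

data = {"word": ["big", "small"], "synonym": [["large", "huge"], ["tiny"]]}
print(glossary(data))
```

Here `synonym` is a list of lists, so the outer loop rebinds it to one inner list per word and the inner loop rebinds it again to a single string.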
>>
>> In Scheme, you only have to mention the variable names once, instead
>> of twice; you stick a `...` after the structure containing them to
>> indicate at which syntactic level the repetition is supposed to occur.
>> If you did the same thing here, using `<[><]>` to indicate repetition,
>> you’d get this:
>>
>>    <[>
>>    <p><b>$word</b>: <[>$synonym<,>, <]></p>
>>    <]>
>>
>> That gives the same kind of DRY implicit-DWIM feeling to iteration
>> that `<|>` gives to conditionals, and with the sigil approach, you
>> could even mix global variables with loop variables.  Still, I’m not
>> convinced that it’s better, particularly since it forces what I think
>> of as a fairly inflexible data model.
>> --
>> To unsubscribe: http://lists.canonical.org/mailman/listinfo/kragen-tol
>
>
>

----- End forwarded message -----
-- 
To unsubscribe: http://lists.canonical.org/mailman/listinfo/kragen-discuss
