On 22 Oct 2009, at 15:56, Robert Atkey wrote:
....
Previously parsed input /can/ determine what the parser will accept in
the future (as pointed out by Peter Ljunglöf in his licentiate thesis).
Consider the following grammar for the context-sensitive language
{aⁿbⁿcⁿ | n ∈ ℕ}:
Yes, sorry, I was sloppy in what I said there. Do you know of a
characterisation of what languages having a possibly infinite number of
nonterminals gives you? Is it all context-sensitive languages or a
subset?
The answer is: all context-sensitive languages. This is a very old
insight which has come back in various forms in computer science. The
earliest conception in CS terms is the affix grammar, in which the
infinite number of nonterminals is generated by parameterising
nonterminals by trees (a small Haskell sketch of the idea follows the
references below). Affix grammars were invented by Kees Koster and
Lambert Meertens (who applied them to generate music:
http://en.wikipedia.org/wiki/index.html?curid=5314967) in the early
sixties of the last century. There is a long line of follow-up work on
this idea, of which the two most well-known descendants are the
so-called two-level grammars, which were used in the Algol 68 report,
and the attribute grammar formalism first described by Knuth. The full
Algol 68 language is defined in terms of a two-level grammar. Key
publications/starting points if you want to learn more about these are:
- the Algol 68 report:
  http://burks.brighton.ac.uk/burks/language/other/a68rr/rrtoc.htm
- the Wikipedia article on affix grammars:
  http://en.wikipedia.org/wiki/Affix_grammar
- a nice book about the basics of two-level grammars is the Cleaveland
  & Uzgalis book, "Grammars for Programming Languages", which may be
  hard to get, but there is hope:
  http://www.amazon.com/Grammars-Programming-Languages-languages/dp/0444001875
- http://www.agfl.cs.ru.nl/papers/agpl.ps
- http://comjnl.oxfordjournals.org/cgi/content/abstract/32/1/36
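To make the idea concrete, here is a minimal sketch (not taken from any
of the formalisms above; it assumes nothing beyond ReadP from base) of
how parameterising nonterminals gives you infinitely many of them: the
family bs n / cs n has one nonterminal for every natural number n,
which is exactly what lets an otherwise context-free-looking set of
rules recognise { aⁿbⁿcⁿ | n ∈ ℕ }.

import Text.ParserCombinators.ReadP
import Control.Monad (replicateM_)

-- start symbol: count the a's, then hand the count on to the
-- parameterised nonterminals bs and cs
abc :: ReadP ()
abc = do as <- many (char 'a')
         let n = length as
         bs n
         cs n
         eof

-- one nonterminal for every n: bs n derives bⁿ, cs n derives cⁿ
bs, cs :: Int -> ReadP ()
bs n = replicateM_ n (char 'b')
cs n = replicateM_ n (char 'c')

accepts :: String -> Bool
accepts s = not (null (readP_to_S abc s))
-- accepts "aabbcc" == True; accepts "aabbc" == False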
Doaitse Swierstra
And a general definition for parsing single-digit numbers. This works
for any set of non-terminals, so it is a reusable component that works
for any grammar:

Things become more complicated if the reusable component is defined
using non-terminals which take rules (defined using an arbitrary
non-terminal type) as arguments. Exercise: Define a reusable variant of
the Kleene star, without using grammars of infinite depth.
I see that you have an answer in the paper you linked to above. Another
possible answer is to consider open sets of rules in a grammar:
data OpenRuleSet inp exp =
  forall hidden. OpenRuleSet (forall a. (exp :+: hidden) a ->
                                        Rule (exp :+: hidden :+: inp) a)

data (f :+: g) a = Left2 (f a) | Right2 (g a)
So OpenRuleSet inp exp exports definitions of the nonterminals in
'exp', imports definitions of the nonterminals in 'inp' (and has a
collection of hidden nonterminals).
It is then possible to combine them with a function of type:

combineG :: (inp1 :=> exp1 :+: inp) ->
            (inp2 :=> exp2 :+: inp) ->
            OpenRuleSet inp1 exp1 ->
            OpenRuleSet inp2 exp2 ->
            OpenRuleSet inp (exp1 :+: exp2)
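(':=>' is not defined in this message; presumably it stands for a
nonterminal renaming, something along the lines of

type f :=> g = forall a. f a -> g a

so that the first two arguments say how each rule set's imports are to
be satisfied in the combined grammar.)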
One can then give a reusable Kleene star by stating it as an open rule
set:

star :: forall a nt. Rule nt a -> OpenRuleSet nt (Equal [a])

where Equal is the usual equality GADT.
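(For reference, a minimal sketch of that GADT, with a single
constructor witnessing that its two type indices are equal:

data Equal a b where
  Refl :: Equal a a

so OpenRuleSet nt (Equal [a]) exports exactly one nonterminal, namely
the one whose semantic value has type [a].)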
Obviously, this would be a bit clunky to use in practice, but maybe
more specialised versions of combineG could be given.
Bob
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe