On 06/26/2012 02:04 PM, Lard Farnwell wrote:
> Hi guys,
> To understand and play around with perl6 grammars I was trying to do a simple
> NLP parts of speech parser in perl6 grammars. This is sort of what I did:
> ---
> grammar Sentence{
>     proto rule VP {*}
>     proto rule NP {*}
>     rule sentence {
>         <imperative>|<statement>
>     }
>     rule imperative {<VP>}
>     rule statement {<NP> <VP>}
> }
> grammar VerbPhrase is Sentence{
>     rule VP:sym<hit> {<sym> <NP>}
>     rule VP:sym<kill> {<sym> <NP>}
> }
> grammar NounPhrase is Sentence{
>     #define NP:sym etc
> }
> grammar English is NounPhrase is VerbPhrase {
>     rule TOP {
>         <Sentence>[\. <Sentence>]*
>     }
> }
> So in case you don't get it, a sentence is made up of phrases which in turn
> can be made up of other phrases. And English is made up of Sentences.
> This sort of thing works but doesn't make much sense.
> The obvious problem is that to get the correct definitions of the proto rules
> in Sentence I have to say VerbPhrase is Sentence and then English is
> NounPhrase is VerbPhrase etc. This makes me feel like I'm doing it wrong.

Indeed. The intended mechanism for code reuse in object-oriented Perl 6
code is role composition.
Grammar rules and regexes are just methods, so defining them in a role
and applying it to a class sounds like a good idea to me.

    role VerbPhrase {
        rule VP { <verb> <NP> }
        proto token verb {*}
        token verb:sym<hit> { <sym> }
        token verb:sym<kill> { <sym> }
    }

Define NounPhrase in a similar way, leave out the definition of NP and
VP from Sentence, and then write

    grammar English does NounPhrase does VerbPhrase is Sentence {
        token TOP { ... }
    }

Role composition has much more transparent error modes than inheritance,
and probably works better for you.
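
To make the layout concrete, here is a small self-contained sketch. I haven't run it, so treat it as an illustration only. The NounPhrase role, the det and noun tokens, the two nouns and the TOP rule are all made up for the example; only VerbPhrase and the shape of Sentence come from the discussion above.

```raku
# Verb phrases live in their own role; <NP> is resolved at parse time,
# so it may be supplied by a different role.
role VerbPhrase {
    rule VP { <verb> <NP> }
    proto token verb {*}
    token verb:sym<hit>  { <sym> }
    token verb:sym<kill> { <sym> }
}

# Hypothetical noun phrase role, invented for this example.
role NounPhrase {
    rule NP { <det>? <noun> }
    token det { the | a }
    proto token noun {*}
    token noun:sym<ball> { <sym> }
    token noun:sym<fly>  { <sym> }
}

# Sentence only describes how phrases combine; it defines neither NP nor VP.
grammar Sentence {
    rule sentence   { <statement> | <imperative> }
    rule imperative { <VP> }
    rule statement  { <NP> <VP> }
}

# The roles are composed into the final grammar.
grammar English does NounPhrase does VerbPhrase is Sentence {
    rule TOP { <sentence> [ '.' <sentence> ]* }
}

say English.parse('the fly hit a ball. kill the fly') ?? 'parsed' !! 'no match';
```

Since each role only refers to the other's rules by name, the roles can live in separate files and still use each other.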

> How do I build a flexible dynamic grammar in an OO sort of way? For example
> how could I do this so:
> 1) I define all my phrase structures (NP, VP, PP etc) in their own file while
> still being able to use each other. VPs can be made up of NPs, and NPs
> can be made up of VPs.

See above

> 2) Add to these definitions dynamically. For example, here I have defined
> hit and kill VPs. What if I wanted to add a dance VP definition at run time?

In theory you can write

    role VerbPhrases[@verbs] {
        token verb:sym<dynamic> { @verbs }
        # note that 'dynamic' has no special meaning here, but since
        # we don't use <sym> in the regex body, it doesn't matter what
        # we write
    }

And then instantiate your grammar as

    my $g = English.new does VerbPhrases[<dance listen juggle ...>];
    my $match = $g.parse($yourstring);

But Rakudo doesn't yet properly handle array variables in regexes, so
you have to write something like

    role AdditionalVerbPhrase[$verb] {
        token verb:sym<dynamic> { $verb };
    }
    my $g = English.new;
    $g does AdditionalVerbPhrase[$_] for <dance listen juggle ...>;
    my $match = $g.parse(...);

I haven't tested it though.
If you experiment with it, please report your findings here, I'm curious
about what works right now. If it doesn't work, we can surely find some
way to make it work by going through the meta object to add methods to
the grammar.
Cheers,
Moritz