In the Divvun grammar checker project, we have input
"<jierpmálaš>"
"jierpmálaš" A Sg Nom &syn-super-part2
and want to write a rule like
COPY (Superl &SUGGEST) TARGET (A &syn-super-part2) IF (NOT 0 (&SUGGEST));
where the expected output is
"<jierpmálaš>"
"jierpmálaš" A Superl &SUGGEST Sg Nom &syn-super-part2
What are the heuristics for placement of the new tags here? I can't seem
to make them go anywhere except at the end; e.g. the actual output of
the above rule is
"<jierpmálaš>"
"jierpmálaš" A Sg Nom &syn-super-part2
"jierpmálaš" A Sg Nom &syn-super-part2 Superl &SUGGEST COPY:4
Is there a simple way to control tag placement? I know it's possible to
do
COPY (Superl Sg Nom &SUGGEST) EXCEPT (Sg Nom) TARGET (A &syn-super-part2) IF (NOT 0 (&SUGGEST));
and get
"<jierpmálaš>"
"jierpmálaš" A Sg Nom &syn-super-part2
"jierpmálaš" A &syn-super-part2 Superl Sg Nom &SUGGEST COPY:4
which is pretty much what I want. I don't care about the &-tags, but the
other tags go into an FST where order matters. But with two numbers and
seven cases, we would need 14 COPY rules. Add possessive tags and so
on, and it quickly becomes unmaintainable.
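To make the blow-up concrete, here is a minimal Python sketch that generates one COPY rule per number/case combination, following the pattern of the rule above. The case tag names and the narrowed TARGET (adding number and case so each rule only fires on its own reading) are assumptions made for illustration, not taken from the actual Divvun grammar.

```python
from itertools import product

# Assumed tag inventories, for illustration only.
NUMBERS = ["Sg", "Pl"]
CASES = ["Nom", "Acc", "Gen", "Ill", "Loc", "Com", "Ess"]


def copy_rules(numbers=NUMBERS, cases=CASES):
    """Return one COPY rule string per number/case combination."""
    rules = []
    for num, case in product(numbers, cases):
        rules.append(
            f"COPY (Superl {num} {case} &SUGGEST) EXCEPT ({num} {case}) "
            # Narrowing TARGET by number and case is an assumption.
            f"TARGET (A {num} {case} &syn-super-part2) "
            f"IF (NOT 0 (&SUGGEST));"
        )
    return rules


if __name__ == "__main__":
    for rule in copy_rules():
        print(rule)
```

Each further tag dimension (possessives and so on) multiplies the number of generated rules again.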
I notice that the docs for SUBSTITUTE do specify insertion point ("at the
last removed tag"), so a better workaround might be
COPY (&SUGGEST) TARGET (A &syn-super-part2) IF (NOT 0 (&SUGGEST));
SUBSTITUTE (A) (A Superl) TARGET (A &syn-super-part2 &SUGGEST) IF (NOT 0 (Superl));
but then the rule writer has to juggle a lot of "marker tags" in order
to avoid adding the Superl to irrelevant readings or getting loops.
I know it's not trivial to make heuristics for tag placement here, but
it seems like it could be possible to have a heuristic similar to the
SUBSTITUTE one, like
place <extra tags> after the last removed tag from <extra tags>,
otherwise at the end of the reading
so that you could write
COPY (A Superl &SUGGEST) EXCEPT (A) TARGET (A &syn-super-part2) IF (NOT 0 (&SUGGEST));
(On the other hand, if people depend on the current behaviour, perhaps new
rule options BEFORE/AFTER <tag> might be a better solution …)
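The proposed heuristic can be sketched in Python as follows. This only models the suggestion in this message; it is not how CG-3 implements COPY. The function name `copy_reading` and the representation of a reading as a list of tags are assumptions made for illustration.

```python
def copy_reading(reading, extra, except_tags=()):
    """Model of the proposed COPY placement heuristic.

    Drop every tag listed in except_tags, then put the extra tags where
    the last removed tag was. If nothing was removed, append the extra
    tags at the end of the reading.
    """
    kept = []
    insert_at = None  # index in `kept` where the last removed tag stood
    for tag in reading:
        if tag in except_tags:
            insert_at = len(kept)
        else:
            kept.append(tag)
    if insert_at is None:
        return kept + list(extra)
    return kept[:insert_at] + list(extra) + kept[insert_at:]


if __name__ == "__main__":
    reading = ["A", "Sg", "Nom", "&syn-super-part2"]
    # COPY (A Superl &SUGGEST) EXCEPT (A) ...
    print(" ".join(copy_reading(reading, ["A", "Superl", "&SUGGEST"], ["A"])))
    # COPY (Superl &SUGGEST) ... with no EXCEPT: tags go to the end
    print(" ".join(copy_reading(reading, ["Superl", "&SUGGEST"])))
```

Under this model the first call gives the tag order of the expected output at the top of this message, and the second call gives the order that COPY produces today (minus the COPY:4 trace tag). A rule such as the EXCEPT (Sg Nom) one above would place its tags differently than it does now, which is the backwards-compatibility concern mentioned above.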
--
Kevin Brubeck Unhammer
