Hi all,
The Anaphora Resolution module is currently being tested on the
Spanish-English pair, and it would be helpful to test it on several more
pairs. The module detects certain patterns and scores each potential
antecedent based on them. For example:

1. A noun that is part of a prepositional phrase (PP) is given a -1 score, as it
is less likely to be the antecedent of a pronoun. So if the phrase is
"groups of parliament", then *groups* will be more likely to be the
antecedent than *parliament*.

2. The first noun phrase (NP) in a sentence is given a +1 score, as
statistically the first NP is more likely to be the antecedent than the
other nouns in the sentence.

There are several other antecedent indicators as well.
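
As a rough illustration of the two indicators above, here is a minimal Python sketch. It is not the module's actual code: the names Candidate, score and best_antecedent are made up for this example, only the two indicators listed above are modelled, and the tie-breaking rule (earliest candidate wins) is an arbitrary assumption.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A potential antecedent (hypothetical structure for this sketch)."""
    word: str
    in_prep_phrase: bool = False  # the noun is part of a prepositional phrase
    first_np: bool = False        # the noun belongs to the first NP of the sentence


def score(candidate: Candidate) -> int:
    """Sum the two indicator scores described above."""
    total = 0
    if candidate.in_prep_phrase:
        total -= 1  # indicator 1: nouns inside a PP are less likely antecedents
    if candidate.first_np:
        total += 1  # indicator 2: the first NP is more likely the antecedent
    return total


def best_antecedent(candidates: list[Candidate]) -> Candidate:
    """Return the highest-scoring candidate.

    max() keeps the first maximal element, so ties go to the earliest
    candidate. That tie-breaking rule is an assumption of this sketch.
    """
    return max(candidates, key=score)


# "groups of parliament": "parliament" sits inside the PP "of parliament".
candidates = [
    Candidate("groups"),
    Candidate("parliament", in_prep_phrase=True),
]

for c in candidates:
    print(c.word, score(c))
print("chosen:", best_antecedent(candidates).word)
```

With these inputs *groups* scores 0 and *parliament* scores -1, so *groups* is chosen, matching the example in point 1.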

All of these indicators are based on thorough research, which can be
found in this paper
<https://link.springer.com/content/pdf/10.1023%2FA%3A1011184828072.pdf>.

Since NPs, PPs, etc. have different patterns for different languages, these
patterns are defined in an external XML file which will be different for
each language pair. *You can find an example of this in the file attached:
apertium-eng-spa.spa-eng.arx (Spanish-English).*

I need help from speakers of these languages (or any others) to write this
XML file for their language so that I can test the system on that pair:
Catalan
Galician
Portuguese

If you are working on a language pair and think that Anaphora Resolution
can improve the translation then please reply to this and we can work on
adapting the tool for the language pair :)

Thanks and Regards,
Tanmai Khanna

-- 
*Khanna, Tanmai*

Attachment: apertium-eng-spa.spa-eng.arx
Description: Binary data

_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
