Hèctor, please correct me if I am wrong. In Catalan, for example, we have gender annotated for proper nouns because, as Hèctor explained, it's useful in some cases when translating to French. So the Catalan monolingual dictionary generates rich tags for np.
However, when translating to Spanish, that information (from Catalan) is not that useful, so we didn't bother adding genders there. We handled this by adding RL rules in both spa and cat that consume "genderless" nps, regardless of how they are generated. So I think that could be an approach: annotate only when the output is useful, but account for simpler input when generating.

--
Xavi Ivars
< http://xavi.ivars.me >

On Tue, 2 Feb 2021, 22:40, Kevin Brubeck Unhammer <unham...@fsfe.org> wrote:

> Hèctor Alòs i Font <hectoralos-re5jqeeqqe8avxtiumw...@public.gmane.org> wrote:
>
> > I am more sceptical about the need to distinguish between toponyms and
> > hydronyms. In some languages one will have an article and the other will
> > not, but these are rare cases. On the other hand, we do not distinguish
> > between countries (or regions) and cities, which in French is quite
> > important both for generating the article and the preposition preceding it,
> > if you translate from Catalan or Spanish: for instance, "New-York" is the
> > city, but "le New-York" is the state, so will have "à New-York" or "au
> > New-York" for "in New-York" (or "à Paris" but "en France"). The generation
> > of articles may also not be the same whether "Barcelona" stands for the
> > city or the (football or whatever) team, nor is the gender often the same.
> > So, are we then going to create more and more subtypes ad nauseam? Better
> > not!
> >
> > In short, we can find casuistries in certain pairs that may make us think
> > that some distinctions are appropriate, but adding them in monolingual
> > dictionaries and forcing them to be maintained for all languages seems
> > doubtful to me.
>
> So the city-vs-region distinction is only useful for target (structural)
> generation, not source analysis/disambiguation/anaphora. I think that
> can be a good guide to when something should be in monodixen or not.
>
> One solution here would be to add it in bidix (with a pardef so you
> don't need it when going the other way) and strip it in transfer, or
> even just use a def-list in the transfer files.
>
> _______________________________________________
> Apertium-stuff mailing list
> Apertium-stuff@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
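
To illustrate the generation-only (RL) entries described at the top of this message, here is a minimal Python sketch. It is a toy model, not Apertium code: the `Entry` class, the `analyse` and `generate` helpers, and the exact tag strings are all hypothetical, chosen only to show how a direction-restricted entry lets the generator accept a "genderless" np while the analyser still emits the rich one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    """One toy dictionary entry (hypothetical model, not the real dix format)."""
    surface: str                      # surface form, e.g. "Barcelona"
    lexical: str                      # lemma plus tags
    direction: Optional[str] = None   # None = both, "LR" = analysis only, "RL" = generation only


# Assumed example data: the tag names are illustrative only.
ENTRIES = [
    # Rich entry: used for analysis and generation.
    Entry("Barcelona", "Barcelona<np><loc><f>"),
    # Generation-only entry: accepts the same noun without gender.
    Entry("Barcelona", "Barcelona<np><loc>", direction="RL"),
]


def analyse(surface: str) -> List[str]:
    """Return every lexical form for a surface form, skipping RL-only entries."""
    return [e.lexical for e in ENTRIES
            if e.surface == surface and e.direction != "RL"]


def generate(lexical: str) -> Optional[str]:
    """Return the surface form for a lexical form, skipping LR-only entries."""
    for e in ENTRIES:
        if e.lexical == lexical and e.direction != "LR":
            return e.surface
    return None  # no entry can generate this lexical form
```

In this sketch the analyser only ever produces the gendered form, so pairs that need gender (French, in the example above) get it, while a pair that sends a genderless np still generates correctly.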