Hello,

I’m going to extend the variant functionality of lt-comp a bit.

1. We want to be able to mark an entry with more than one variant 
at once. In XML, the way to do this is to declare the attribute as NMTOKENS 
and write the variants as space-separated identifiers in the v attribute. So 
where you currently need to write:

<e v="valencian"><i>escombra</i><par n="casa"/></e>
<e v="balearic"><i>escombra</i><par n="casa"/></e>
<e><i>granera</i><par n="casa"/></e>

Once the modification is done, you will also be able to write:

<e v="valencian balearic"><i>escombra</i><par n="casa"/></e>
<e><i>granera</i><par n="casa"/></e>

So you can run lt-comp with -v (or -vr, -vl) for any of the variants and it 
should work.
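
To make the intended matching rule concrete, here is a rough sketch in Python, 
for illustration only. The function name is made up and this is not lt-comp's 
actual code; it assumes that an entry without a v attribute is always kept and 
that an entry with a v attribute is kept when the selected variant is one of 
its space-separated tokens.

```python
from typing import Optional


def variant_matches(v_attr: Optional[str], selected: str) -> bool:
    """Hypothetical helper: decide whether an entry is kept for `selected`.

    Assumption: entries with no v attribute belong to every variant.
    """
    if v_attr is None:
        return True
    # The v attribute is a whitespace-separated list of variant identifiers.
    return selected in v_attr.split()


if __name__ == "__main__":
    print(variant_matches("valencian balearic", "balearic"))
    print(variant_matches("valencian balearic", "central"))
    print(variant_matches(None, "central"))
```

With this rule the old single-variant entries keep working unchanged, since a 
one-token list behaves like the current exact comparison.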

2. Another minor change. As you know, we tend to accept (LR) everything from 
every variant, but generate (RL) just one variant, the one we specify. But some 
variants create problems (e.g. the Balearic variant generates a lot of additional 
homography in standard and Valencian Catalan, and this problem has historically 
caused its exclusion from Apertium), so I want to add an --exclude-variant option 
to lt-comp to explicitly exclude variants that we don’t even want to analyze. 
The exclusion will be nondestructive: if an entry belongs to more than one 
variant and at least one of them is not excluded, it will still be included, at 
least for analysis (LR).
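
The nondestructive exclusion rule could look roughly like this, again as a 
Python sketch for illustration rather than the real lt-comp change. The function 
name is hypothetical, and treating an entry with no variant (or an empty v 
attribute) as never excluded is my assumption.

```python
from typing import Iterable, Optional


def survives_exclusion(v_attr: Optional[str], excluded: Iterable[str]) -> bool:
    """Hypothetical helper: True if the entry is still compiled for analysis.

    An entry is dropped only when every variant it lists is excluded.
    """
    tokens = v_attr.split() if v_attr else []
    if not tokens:
        # Assumption: entries without variants are never affected.
        return True
    excluded_set = set(excluded)
    # Keep the entry if at least one of its variants is not excluded.
    return any(tok not in excluded_set for tok in tokens)


if __name__ == "__main__":
    print(survives_exclusion("balearic", ["balearic"]))
    print(survives_exclusion("valencian balearic", ["balearic"]))
```

So an entry marked only as balearic disappears when balearic is excluded, while 
one marked "valencian balearic" stays in the analyser.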

Any suggestions or advice?

Sergio
_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff