[cg] cg-proc -w makes single-letter elements all caps

marc . riera . irigoyen Fri, 21 Dec 2018 02:51:08 -0800

Hello,

While contributing to two Apertium language pairs (English-Catalan and 
Romanian-Catalan), which both use CG in all directions, I have noticed what 
seems to be a bug in how cg-proc -w normalises the case of single-letter 
elements.


Take these two examples:

^I/PRPERS<prn><subj><p1><mf><sg>$ (English)
^A/AVEA<vbavea><pri><p3><sg>$ (Romanian)

In both cases, the dictionary forms are output in all caps, which incorrectly 
makes the Catalan translations all caps as well, when the forms should be 
"Prpers" and "Avea", respectively. My guess is that CG does not distinguish 
between single-letter forms like these and multi-letter all-caps forms such 
as "HOUSE" or "TREE".

In comparison, if CG is disabled in these pairs and normalisation is done 
directly by lt-proc, the correct "Prpers" and "Avea" analyses are given as 
output.
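
To illustrate the behaviour I would expect, here is a minimal Python sketch 
of a case heuristic that treats a lone capital letter as "first letter 
capitalised" rather than "all caps". It is not taken from cg-proc or lt-proc; 
the function names surface_case and apply_case are hypothetical and only 
serve the example.

```python
def surface_case(surface: str) -> str:
    """Classify a surface form as 'lower', 'title' or 'upper' (hypothetical helper)."""
    letters = [ch for ch in surface if ch.isalpha()]
    if not letters or not letters[0].isupper():
        return "lower"
    # A single capital letter is ambiguous, so treat it as title case
    # instead of all caps (assumption: this is the desired behaviour).
    if len(letters) == 1:
        return "title"
    return "upper" if all(ch.isupper() for ch in letters) else "title"


def apply_case(lemma: str, case: str) -> str:
    """Re-case a lower-case dictionary form according to the detected case."""
    if case == "upper":
        return lemma.upper()
    if case == "title":
        return lemma[:1].upper() + lemma[1:]
    return lemma


if __name__ == "__main__":
    for surface, lemma in [("I", "prpers"), ("A", "avea"), ("HOUSE", "house")]:
        print(surface, "->", apply_case(lemma, surface_case(surface)))
```

With this rule, "I" and "A" yield "Prpers" and "Avea", while "HOUSE" still 
yields an all-caps form.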

Thanks!

Marc Riera

