Sorry, I incorrectly wrote "Prpers" and "Avea" instead of "prpers" and "avea" in my previous message. I meant to use all-lowercase.
After doing some further tests with lt-proc (without -w), this seems to be tricky. lt-proc decides the case based on dictionary information (which it has access to, as it analyses the surface forms). According to this, the previous examples should be all-lowercase, given that this is how they appear in the dictionary.

However, if we add an example proper noun (<np>) called "E" with the surface form "E" to the dictionary and analyse it using lt-proc without the -w flag, lt-proc correctly outputs "^E/E<np>$", as it is uppercase in the dictionary. If cg-proc has to make such decisions, it may be impossible to do so reliably without checking the dictionary (which it currently does not have access to).

Thanks!

Marc Riera

On Friday, 21 December 2018 11:50:53 UTC+1, [email protected] wrote:
>
> Hello,
>
> While contributing to two Apertium language pairs (English-Catalan and
> Romanian-Catalan), which both use CG in all directions, I have noticed what
> seems to be a bug in how cg-proc -w normalises the case of single-letter
> elements.
>
> Take these two examples:
>
> ^I/PRPERS<prn><subj><p1><mf><sg>$ (English)
> ^A/AVEA<vbavea><pri><p3><sg>$ (Romanian)
>
> In both cases, the dictionary forms are all caps, which incorrectly makes
> the translations to Catalan automatically all caps as well, when they
> should be "Prpers" and "Avea", respectively. My guess is that CG does not
> make a difference between the previous examples and multi-letter examples
> such as "HOUSE" or "TREE".
>
> In comparison, if CG is disabled in these pairs and normalisation is done
> directly by lt-proc, the correct "Prpers" and "Avea" analyses are given as
> output.
>
> Thanks!
>
> Marc Riera
>
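P.S. To illustrate why single-letter forms are hard to handle without the dictionary, here is a minimal Python sketch. It is not how lt-proc or cg-proc are implemented; the toy dictionary, the function names and the `single_letter_is_caps` switch are all assumptions made up for this example. It compares two surface-only rules against a dictionary lookup for "I", "A" and the example proper noun "E".

```python
# Hypothetical sketch only: NOT the actual lt-proc or cg-proc logic.

# Toy dictionary (assumed): (lowercased surface form, tag) -> lemma as stored.
TOY_DICTIONARY = {
    ("i", "prn"): "prpers",   # English pronoun "I"
    ("a", "vbavea"): "avea",  # Romanian auxiliary "a"
    ("e", "np"): "E",         # example proper noun "E"
}


def guess_from_surface(surface: str, lemma: str, single_letter_is_caps: bool) -> str:
    """Guess the lemma's case using only the surface form.

    An uppercase single letter cannot be told apart from an all-caps word,
    so the caller has to pick one interpretation via single_letter_is_caps.
    """
    if surface.isupper() and (len(surface) > 1 or single_letter_is_caps):
        return lemma.upper()
    return lemma.lower()


def lookup_in_dictionary(surface: str, tag: str) -> str:
    """Return the lemma with the case stored in the toy dictionary."""
    return TOY_DICTIONARY[(surface.lower(), tag)]


if __name__ == "__main__":
    for surface, tag in [("I", "prn"), ("A", "vbavea"), ("E", "np")]:
        stored = lookup_in_dictionary(surface, tag)
        as_caps = guess_from_surface(surface, stored, single_letter_is_caps=True)
        as_lower = guess_from_surface(surface, stored, single_letter_is_caps=False)
        print(f"{surface}: dictionary={stored} caps-rule={as_caps} lower-rule={as_lower}")
```

In this toy setup, treating single letters as all caps gets "I" and "A" wrong, while treating them as lowercase gets "E" wrong. Only the dictionary lookup matches the stored case in all three examples.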
