> Cool, I've been working with some language pairs I know, I have few
> questions on cli usage and stuff:
>
> Commonly when I test things I get like:
>
> > Corpus 1 of 5: deu-fin-pending
> > 11/27 (40.74%) tests pass (11/11 (100.0%) match gold)
>
> so I start up cli and see:
>
> deu-fin 1 of 1
> INPUT:
> Haus
> EXPECTED OUTPUT:
> KOTI
> ACTUAL OUTPUT:
> koti
> IDEAL OUTPUTS:
> talo
>
> if the case was that it's a bug in the (dix/t*x) code I'm not sure what
> command to use to skip and see next error?

There is now 'skip' or 'k' for that.

> Another question is that in lot of expected files there seems to be
> all-capsed words for fin-* pairs, I am not sure how this has happened?
> I am guessing my apertium is older and some ICU changes have affected
> the output, perhaps in some code I've copypasted ununderstandingly to
> all fin-* pairs.

For that example in particular (and likely others as well), the t1x
output code says

    <chunk name="NP" case="caseFirstWord">

which gives the chunk an all-caps lemma (caseFirstWord is a non-existent
variable, so it has no effect) and then the default behavior of
postchunk is to copy the chunk case to the words, so

    ^NP<N><FOOFOO>{^koti<n><sg><nom>$}$

becomes

    ^KOTI<n><sg><nom>$^.<punct>$

so I think this might be a case of inadvertently relying on a bug in the
old transfer case-handling functions and I'm not quite sure what the
appropriate solution is.

Daniel

_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
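
To make the case-copying step described above concrete, here is a toy Python model of it. This is only a sketch of the behavior as described in this message, not the actual postchunk code: the function names `case_pattern` and `copy_chunk_case` are made up for illustration, and the three-way split into all caps, initial capital, and lowercase is an assumption about how the chunk lemma's case gets classified.

```python
def case_pattern(text):
    """Classify text as 'AA' (all caps), 'Aa' (initial capital) or 'aa'.

    Simplified assumption: only alphabetic characters are considered, and
    a single capital letter counts as 'Aa' rather than 'AA'.
    """
    letters = [c for c in text if c.isalpha()]
    if len(letters) > 1 and all(c.isupper() for c in letters):
        return "AA"
    if letters and letters[0].isupper():
        return "Aa"
    return "aa"


def copy_chunk_case(chunk_lemma, words):
    """Apply the chunk lemma's case pattern to the word lemmas in the chunk."""
    pattern = case_pattern(chunk_lemma)
    if pattern == "AA":
        # An all-caps chunk lemma upper-cases every word inside the chunk.
        return [w.upper() for w in words]
    if pattern == "Aa" and words:
        # An initial capital only capitalizes the first word.
        return [words[0][:1].upper() + words[0][1:]] + list(words[1:])
    # A lowercase chunk lemma leaves the words as they are.
    return list(words)


print(copy_chunk_case("NP", ["koti"]))  # ['KOTI']
print(copy_chunk_case("np", ["koti"]))  # ['koti']
```

In this toy model the chunk lemma "NP" reads as all caps, which is why "koti" comes out as "KOTI"; a lowercase lemma such as the hypothetical "np" leaves it alone. Whether renaming the chunk is the right fix for the real pairs is a separate question that this sketch does not answer.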