> Cool, I've been working with some language pairs I know. I have a few
> questions on CLI usage and stuff:
>
> Commonly when I test things I get like:
>
> > Corpus 1 of 5: deu-fin-pending
> > 11/27 (40.74%) tests pass (11/11 (100.0%) match gold)
>
> so I start up cli and see:
>
> deu-fin 1 of 1
> INPUT:
>   Haus
> EXPECTED OUTPUT:
>   KOTI
> ACTUAL OUTPUT:
>   koti
> IDEAL OUTPUTS:
>   talo
>
> if the case was that it's a bug in the (dix/t*x) code I'm not sure what
> command to use to skip and see next error?

There is now 'skip' or 'k' for that.

>
> Another question is that in a lot of expected files there seem to be
> all-caps words for fin-* pairs, I am not sure how this has happened?
> I am guessing my apertium is older and some ICU changes have affected
> the output, perhaps in some code I've copypasted ununderstandingly to
> all fin-* pairs.
>

For that example in particular (and likely others as well), the t1x
output code says

<chunk name="NP" case="caseFirstWord">

which gives the chunk an all-caps lemma, "NP" (caseFirstWord is a
non-existent variable, so the case attribute has no effect). The
default behavior of postchunk is then to copy the chunk's case to the
words, so

^NP<N><FOOFOO>{^koti<n><sg><nom>$}$

becomes

^KOTI<n><sg><nom>$^.<punct>$

So I think this might be a case of inadvertently relying on a bug in
the old transfer case-handling functions, and I'm not quite sure what
the appropriate solution is.
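
To make the case-copying step concrete, here is a rough Python sketch of
the behavior described above. It is a simplified model, not Apertium's
actual postchunk code: the function names (classify_case,
postchunk_default), the regular expressions, and the three-way
"AA"/"Aa"/"aa" classification are all assumptions made for illustration.

```python
import re

# Hypothetical, simplified patterns for the stream format shown above.
CHUNK_RE = re.compile(r"\^([^<{]*)((?:<[^>]*>)*)\{(.*)\}\$")
WORD_RE = re.compile(r"\^([^<$]*)((?:<[^>]*>)*)\$")


def classify_case(lemma: str) -> str:
    """Return 'AA' (all caps), 'Aa' (first letter capital) or 'aa'."""
    letters = [c for c in lemma if c.isalpha()]
    if len(letters) > 1 and all(c.isupper() for c in letters):
        return "AA"
    if letters and letters[0].isupper():
        return "Aa"
    return "aa"


def postchunk_default(chunk: str) -> str:
    """Copy the case of the chunk lemma onto the words inside the chunk."""
    m = CHUNK_RE.fullmatch(chunk)
    if m is None:
        raise ValueError("not a chunk: " + chunk)
    chunk_lemma, _chunk_tags, body = m.groups()
    case = classify_case(chunk_lemma)
    seen = 0  # number of words rewritten so far

    def fix(word: re.Match) -> str:
        nonlocal seen
        lemma, tags = word.group(1), word.group(2)
        if case == "AA":
            lemma = lemma.upper()
        elif case == "Aa" and seen == 0:
            # Only the first word of the chunk is capitalised.
            lemma = lemma[:1].upper() + lemma[1:]
        seen += 1
        return "^" + lemma + tags + "$"

    return WORD_RE.sub(fix, body)


print(postchunk_default("^NP<N><FOOFOO>{^koti<n><sg><nom>$}$"))
```

Under these assumptions, a chunk named "NP" upper-cases every word it
contains, while a chunk named "np" would leave "koti" untouched, which is
why the literal chunk name matters once the case attribute has no effect.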

Daniel


_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
