Jaume Ortolà i Font
<jaumeort...@gmail.com> wrote:

> Message from egea piñeiro helena
> <helena.egea-tryufelddafe5aofshc...@public.gmane.org> on Wed, 1
> Apr 2020 at 10:48:
>
>> How can I show the translated text with the multiple options that arise from polysemy?
>> "The *season/station* more rainy is...."
>>
>
> This is a recurring request that could be useful in some applications, but
> there is currently no way to do it in Apertium.

You can make a new pipeline that splits into separate lexical units
instead of disambiguating. There's an example for eng-ita at
http://wiki.apertium.org/wiki/Translate_without_disambiguation

Basically, replace cg-proc+apertium-tagger with

#!/usr/bin/python3
import streamparser, sys
# Emit every analysis of a word as its own lexical unit, joined by "[/]".
for (b, lu) in streamparser.parse_file(sys.stdin, with_text=True):
    print(b + "[/]".join(["^" + streamparser.reading_to_string(r) + "$" for r in lu.readings]), end="")

and replace lrx-proc with

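#!/usr/bin/python3
import streamparser, sys
# Pair the source side (lu.wordform) with each translation in turn,
# emitting one source/target lexical unit per translation, joined by "[/]".
for (b, lu) in streamparser.parse_file(sys.stdin, with_text=True):
    print(b + "[/]".join(["^" + lu.wordform + "/" + streamparser.reading_to_string(r) + "$" for r in lu.readings]), end="")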

in your pipeline and you get slash-separated alternatives.
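
To make the effect of the two replacements concrete, here is a rough standalone sketch that approximates them with a plain regular expression instead of streamparser. It is only an illustration: the function names split_analyses and split_translations and the sample inputs are made up for this example, and it assumes the stream contains no backslash-escaped ^, $ or / characters, which real Apertium streams may.

```python
import re

# Matches one lexical unit such as ^saw/saw<n><sg>/see<vblex><past>$.
# Simplifying assumption: no backslash-escaped ^, $ or / inside a unit.
LU_RE = re.compile(r"\^([^$]*)\$")


def split_analyses(stream: str) -> str:
    """Approximate the first script: one unit per analysis, surface form dropped."""
    def repl(match):
        _surface, *readings = match.group(1).split("/")
        if not readings:
            # Choice made for this sketch: leave units without readings untouched.
            return match.group(0)
        return "[/]".join("^" + r + "$" for r in readings)
    return LU_RE.sub(repl, stream)


def split_translations(stream: str) -> str:
    """Approximate the second script: one source/target unit per translation."""
    def repl(match):
        source, *readings = match.group(1).split("/")
        if not readings:
            return match.group(0)
        return "[/]".join("^" + source + "/" + r + "$" for r in readings)
    return LU_RE.sub(repl, stream)


if __name__ == "__main__":
    # Hypothetical analyser output for an ambiguous word.
    print(split_analyses("^saw/saw<n><sg>/see<vblex><past>$"))
    # Hypothetical bilingual-lookup output with two translations.
    print(split_translations("^estación<n><f><sg>/season<n><sg>/station<n><sg>$"))
```

With these made-up inputs, the first call turns one ambiguous unit into two units separated by "[/]", and the second keeps the source side on every unit while giving each translation its own unit.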


Of course, this won't get handled correctly by transfer (transfer will
see e.g. several nouns in a row where there was one source noun), but if
all you want is to send all alternatives through, it may be Good Enough
for some purposes (e.g. testvoc, or MT for language learning).

_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
