I think the last solution mentioned sounds best.
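
To be sure we mean the same thing by it, here is a rough Python sketch of
that last option (source lemma + target morphology). It is only a toy under
stated assumptions: the names BIDIX, SUFFIX and transfer are made up for
illustration, and real Apertium dictionaries and generators are finite-state
transducers, not Python dicts.

```python
# Hypothetical toy bidix: source lemma -> target lemma.
# It deliberately has no entry for "izeki".
BIDIX = {"izara": "sheet"}

# Hypothetical toy target-side morphology: tag -> English suffix.
SUFFIX = {"past": "ed", "pl": "s"}


def transfer(lemma, tag, bidix=BIDIX):
    """Translate one analysed word, falling back to source lemma + target suffix."""
    suffix = SUFFIX.get(tag, "")
    if lemma in bidix:
        # Normal case: the bidix has a translation, so inflect it.
        return bidix[lemma] + suffix
    # No bidix entry: keep the source lemma, mark it with '*', and still
    # attach the target-language morphology.
    return "*" + lemma + ("-" + suffix if suffix else "")


print(transfer("izara", "pl"))
print(transfer("izeki", "past"))
```

The second call gives the "*izeki-ed" form from Mikel's example quoted below.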

On Sat, Mar 21, 2020, 07:38 Tanmai Khanna <khanna.tan...@gmail.com> wrote:

> Hey guys,
> Dictionary trimming is the process of removing from the monolingual
> language models (FSTs compiled from monodixes) those words and analyses
> that have no entry in the bidix. This avoids a lot of untranslated lemmas
> (marked with an @ when debugging) in the output, which cause problems for
> comprehension and post-editing.
>
> There is a GSoC project
> <http://wiki.apertium.org/wiki/Ideas_for_Google_Summer_of_Code/Eliminate_trimming>
> which aims to eliminate this trimming and propose a solution that keeps
> the benefits of dictionary trimming. In this email I will summarise the
> discussion that has taken place so far.
>
> By trimming the dictionary, you throw away valuable analyses of words in
> the source language, which, if preserved, can be used as context for
> lexical selection and analysis of the input. Also, several transfer rules
> fail to match because the trimmed word is treated as unknown.
>
> Several solutions are possible for avoiding trimming, some of which have
> been discussed by Unhammer here
> <http://wiki.apertium.org/wiki/Talk:Why_we_trim>. These involve keeping
> both the surface form of the source word and its lemma+analysis: use the
> analysis for as long as it is needed in the pipeline, and then propagate
> the source form as an unknown word (as would happen with trimming).
>
> Another interesting solution that was discussed was that instead of just
> propagating the source surface form, we can output [source-word lemma +
> target morphology], as is shown in this example by Mikel:
>
> Translating from Basque to English:
> "Andonik izarak izeki zuen" ('Andoni hung up the sheets') → 'Andoni
> *izeki-ed the sheets'.
>
> This might help the comprehensibility of the output, and to some extent
> even its post-editability.
>
> If you have any significant pros, cons, or suggestions to add for this
> project, please reply to this thread so that, if I work on it, I can do
> so fully informed.
>
> Thanks and Regards,
> Tanmai Khanna
>
> --
> *Khanna, Tanmai*
> _______________________________________________
> Apertium-stuff mailing list
> Apertium-stuff@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>
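
For reference, the trimming described in the quoted message can be sketched
roughly as follows. This is a toy model only: MONODIX, BIDIX, trim and
analyse are hypothetical names, the tags are invented, and the real process
operates on compiled FSTs rather than Python dicts.

```python
# Hypothetical toy monodix: surface form -> list of (lemma, tags) analyses.
MONODIX = {
    "izarak": [("izara", "n.pl")],
    "izeki": [("izeki", "vblex")],
}

# Hypothetical toy bidix: source lemma -> target lemma (no "izeki" entry).
BIDIX = {"izara": "sheet"}


def trim(monodix, bidix):
    """Keep only the analyses whose lemma has an entry in the bidix."""
    trimmed = {}
    for surface, analyses in monodix.items():
        kept = [(lemma, tags) for lemma, tags in analyses if lemma in bidix]
        if kept:
            trimmed[surface] = kept
    return trimmed


def analyse(surface, monodix):
    """Look a surface form up; unknown words come back marked with '*'."""
    return monodix.get(surface, [("*" + surface, "")])


trimmed = trim(MONODIX, BIDIX)
print(analyse("izeki", MONODIX))  # full analysis is still available
print(analyse("izeki", trimmed))  # after trimming the word is unknown
```

After trimming, "izeki" is reported as unknown and its analysis is no longer
available as context, which is the loss the project wants to avoid.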