Francis Tyers <fty...@prompsit.com> wrote:

> On 2019-03-01 12:41, Kevin Brubeck Unhammer wrote:

[...]

>> IIUC, those kinds of word boundary-crossing changes are exactly what
>> the postgenerator is supposed to handle, though it is annoying to have
>> to insert the mark. I've been manually inserting the <a/> on double
>> consonants at the ends of words that can compound (to avoid getting
>> triple consonants if the next word starts with the same one), but
>> doing it manually is error-prone, and it's noisy in the .dix file.
>>
>> Is there any reason postgen couldn't just run on *everything* LRLM and
>> only apply the changes where it matches (as if it were a version of sed
>> that respects deformatting)? Then you could just do
>> <l>inh<b/>t</l> <r>is</r>
>> in post.dix and have no changes to the hfst.
>>
>
> I think that would be a wonderful idea!

https://github.com/apertium/lttoolbox/issues/42

(GSoC C++ applicants might want to try their hand at that one.)
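
For anyone picking that up, here is a rough Python sketch of the behaviour
described in the quoted mail: scan the whole stream left to right, apply the
longest matching rule wherever one matches, and pass everything else through
unchanged, including formatting. It is only an illustration of the idea, not
how lttoolbox does or should implement it. The function name apply_rules and
the RULES table are made up for this example, and it assumes formatting
arrives as [...] superblanks with no escaped brackets.

```python
def apply_rules(text, rules):
    """Left-to-right, longest-match replacement that skips superblanks.

    `rules` maps a left-hand string to its replacement. This is a
    hypothetical stand-in for what post.dix would provide.
    """
    # Try longer left-hand sides first so the longest match wins.
    lefts = sorted((left for left in rules if left), key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        if text[i] == "[":
            # Copy a superblank (formatting) through untouched.
            # Assumption: no escaped brackets inside it.
            end = text.find("]", i)
            end = len(text) if end == -1 else end + 1
            out.append(text[i:end])
            i = end
            continue
        for left in lefts:
            if text.startswith(left, i):
                out.append(rules[left])
                i += len(left)
                break
        else:
            # No rule matches here: copy one character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


# Hypothetical rules: collapse a triple consonant at a compound boundary.
RULES = {"ttt": "tt", "lll": "ll"}

print(apply_rules("natttog", RULES))
print(apply_rules("et [<b>ttt</b>] natttog", RULES))
```

In this sketch a match cannot span a superblank, and nothing needs a mark
inserted upstream: text that matches no rule is simply copied as-is.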

