Francis Tyers <fty...@prompsit.com> writes:

> On 2019-03-01 12:41, Kevin Brubeck Unhammer wrote:

[...]

>> IIUC, those kinds of word boundary-crossing changes are exactly what
>> the postgenerator is supposed to handle, though it is annoying to have
>> to insert the mark. I've been manually inserting the <a/> on double
>> consonants at the ends of words that can compound (to avoid getting
>> triple consonants if the next word starts with the same one), but
>> manual is error prone, and it's noisy in the .dix file.
>>
>> Is there any reason postgen couldn't just run on *everything* LRLM and
>> only apply the changes where it matches (as if it were a version of sed
>> that respects deformatting)? Then you could just do
>>   <l>inh<b/>t</l> <r>is</r>
>> in post.dix and have no changes to the hfst.
>>
>
> I think that would be a wonderful idea!

https://github.com/apertium/lttoolbox/issues/42

(GSoC C++ applicants might want to try their hand at that one.)
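
For anyone wondering what "run on everything LRLM and only apply the changes where it matches" could look like, here is a rough Python sketch of the idea. It is not how lttoolbox implements the postgenerator: the function name `apply_lrlm` and the plain-dict rule table are made up for illustration, the single rule is the one from the quoted post.dix line (with <b/> read as a space), and it ignores deformatting entirely.

```python
def apply_lrlm(text, rules):
    """Rewrite text left to right, applying the longest matching rule at each position.

    `rules` maps a left-hand string to its replacement (hypothetical stand-in
    for the <l>/<r> pairs of a post.dix). Text that matches no rule is copied
    through unchanged, so no wake-up mark is needed.
    """
    # Longest left sides first, so the longest match wins at each position.
    # Empty left sides are skipped: they would match everywhere and never advance.
    lefts = sorted((left for left in rules if left), key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for left in lefts:
            if text.startswith(left, i):
                out.append(rules[left])
                i += len(left)  # resume scanning after the matched span
                break
        else:
            # No rule matched here: emit one character and move on.
            out.append(text[i])
            i += 1
    return "".join(out)


# The rule from the quoted message: <l>inh<b/>t</l> <r>is</r>
rules = {"inh t": "is"}
```

Calling `apply_lrlm(some_text, rules)` rewrites every occurrence of "inh t" and leaves the rest alone. A real implementation would also have to skip over superblanks, which is the "respects deformatting" part of the proposal.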
_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff