Hey Xavi,
Postgeneration now handles wordbound blanks in one-to-one, one-to-many,
many-to-one and many-to-many postgeneration rules (pull request:
<https://github.com/apertium/lttoolbox/pull/102>).
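
To give a rough picture of the idea, here is a small Python sketch. It is
not the lttoolbox code from the pull request, just an illustration under a
few assumptions: that a formatted word in the generated stream looks like
[[t:b:1]]word[[/]], that several blanks on one word are separated by "; ",
and that merged words carry all the blanks of their inputs while split
words each get a copy. The helper names (parse, wrap, merge_words,
split_word) are made up for this example.

```python
import re

# Assumed token shape: [[blanks]]word[[/]] for a formatted word, bare text otherwise.
WORD = re.compile(r"\[\[(?P<blank>.*?)\]\](?P<word>.*?)\[\[/\]\]")


def parse(token):
    """Return (blanks, word) for one token; blanks is a list of tag strings."""
    m = WORD.fullmatch(token)
    if not m:
        return [], token
    return m.group("blank").split("; "), m.group("word")


def wrap(blanks, word):
    """Re-attach blanks to a word, or return the bare word if there are none."""
    if not blanks:
        return word
    return "[[" + "; ".join(blanks) + "]]" + word + "[[/]]"


def merge_words(tokens, merged):
    """Many-to-one (hypothetical): the merged word carries every distinct blank, in order."""
    blanks = []
    for tok in tokens:
        for b in parse(tok)[0]:
            if b not in blanks:
                blanks.append(b)
    return wrap(blanks, merged)


def split_word(token, parts):
    """One-to-many (hypothetical): each output word gets a copy of the input word's blanks."""
    blanks, _ = parse(token)
    return [wrap(blanks, p) for p in parts]


print(merge_words(["[[t:b:1]]de[[/]]", "[[t:i:2]]el[[/]]"], "del"))
print(split_word("[[t:b:1]]del[[/]]", ["de", "el"]))
```

The sketch only covers the merge and split cases; a many-to-many rule would
need some policy for deciding which output word gets which blank, and I'm
not showing that here.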

Regards,
*तन्मय खन्ना *
*Tanmai Khanna*


On Tue, Jul 21, 2020 at 4:57 AM Xavi Ivars <xavi.iv...@gmail.com> wrote:

> Hello! I just found a (potential) issue, and wanted to double check with
> you (it's probably something you already looked into and is not a real
> issue): looking at transfuse's code, I saw tf-mangle-mode is doing tf-close
> on the generation step.
>
> How does it work when the postgen step merges some of the words?
>
>
> --
> Xavi Ivars
> < http://xavi.ivars.me >
>
> El dl., 20 de jul. 2020, 20:34, Tanmai Khanna <khanna.tan...@gmail.com>
> va escriure:
>
>> Hey guys,
>> Using wordbound blanks, we've modified the Apertium pipeline, modules and
>> stream such that inline markup tags now move around with words in transfer,
>> merge when LUs merge, split when LUs split, to preserve the formatting of
>> the input document. If you want to follow the further development of this
>> project, see here
>> <https://wiki.apertium.org/wiki/User:Khannatanmai/Wordbound_blanks>.
>>
>> We have a decent version, ready for testing, that handles markup in HTML
>> documents. It will undergo extensive testing as part of this project, but
>> I thought it would be a good idea to let the community test it on their
>> own language pairs, based on their needs, so that we can understand what
>> features need to be added and what needs to be fixed. Apertium users have
>> been asking for markup handling for quite some time now and have had no
>> option other than wrappers that try to guess alignments. I'm hoping this
>> project helps in that regard. Here's what you need to test this:
>> - Make sure you have the latest commits of apertium and lttoolbox
>> installed.
>> - The latest commits of apertium-separable, apertium-anaphora, etc., if
>> you're using those in your mode.
>> - Clone and install https://github.com/TinoDidriksen/transfuse .
>>
>> After this, all you need to do is pipe your HTML document to
>> tf-html-fragment and pass as its argument a full translation mode for your
>> language pair of choice.
>>
>> Example:
>>
>> $ echo 'Hello <b>big green</b> <i>world</i>!' | tf-html-fragment /Users/khannatanmai/Documents/GSoC/repo/main/apertium-eng-spa/modes/eng-spa.mode
>>
>>
>> Hola <i>Mundo</i> <b>verde grande</b> !
>>
>>
>> It only works for HTML right now, but we're in the process of supporting
>> all the usual document types.
>>
>>
>> *Known issues:*
>>
>> - If a transfer rule has multiple words in its pattern, and the output
>> contains an LU that wasn't clipped from any word in the input, that LU
>> won't get a wordbound blank.
>>
>> - If -separable detects a string of words, the formatting of each word is
>> combined and applied to the entire string of words.
>>
>> - apertium-recursive isn't supported as of now. It will be by the end of
>> the project though.
>>
>>
>> If you have any questions or suggestions, I'd be glad to respond to them
>> on this thread. If you need help testing this on your language pair, you
>> can contact us on IRC. The same goes if you find any bugs or have any
>> feature requests.
>>
>>
>> Enjoy!
>> *तन्मय खन्ना *
>> *Tanmai Khanna*
>> _______________________________________________
>> Apertium-stuff mailing list
>> Apertium-stuff@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>>