Hello! I just found a potential issue and wanted to double-check it with
you (you have probably already looked into it, and it may not be a real
issue): looking at transfuse's code, I saw that tf-mangle-mode applies
tf-close at the generation step.

How does that work when a postgen step merges some of the words?
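
To make the question concrete, here is a toy Python sketch of the case I
have in mind. It is not transfuse or Apertium code: the Word class, the
postgen_merge function and the rule table are all made up for illustration,
and taking the union of the markup is only one possible policy, not
necessarily what the pipeline does.

```python
from dataclasses import dataclass


@dataclass
class Word:
    surface: str
    markup: frozenset  # hypothetical ids of the inline tags wrapping this word


def postgen_merge(words, rules):
    """Merge adjacent words listed in `rules`, taking the union of their markup."""
    out = []
    i = 0
    while i < len(words):
        if i + 1 < len(words):
            pair = (words[i].surface, words[i + 1].surface)
            if pair in rules:
                combined = words[i].markup | words[i + 1].markup
                out.append(Word(rules[pair], combined))
                i += 2  # both source words were consumed by the merge
                continue
        out.append(words[i])
        i += 1
    return out


# "de" was bold and "el" was italic in this made-up input
words = [
    Word("de", frozenset({"b"})),
    Word("el", frozenset({"i"})),
    Word("mundo", frozenset({"i"})),
]
merged = postgen_merge(words, {("de", "el"): "del"})
for w in merged:
    print(w.surface, sorted(w.markup))
```

Here the merged word "del" ends up carrying both "b" and "i", so if the
markup of each word was already closed before the merge, it is not obvious
to me which tags should wrap the result.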


--
Xavi Ivars
< http://xavi.ivars.me >

On Mon, 20 Jul 2020, 20:34, Tanmai Khanna <khanna.tan...@gmail.com>
wrote:

> Hey guys,
> Using wordbound blanks, we've modified the Apertium pipeline, modules and
> stream such that inline markup tags now move around with words in transfer,
> merge when LUs merge, split when LUs split, to preserve the formatting of
> the input document. If you want to follow the further development of this
> project, see here
> <https://wiki.apertium.org/wiki/User:Khannatanmai/Wordbound_blanks>.
>
> We have a decent version that is ready to test and that handles markup
> for html documents. It will undergo extensive testing as part of this
> project, but I thought it would be a good idea to let the community test it
> on their own language pairs, based on their needs, so that we can
> understand what features need to be added and what needs to be fixed.
> Apertium users have been asking for markup handling for quite some time now
> and have had no option but to use wrappers that try to guess alignments.
> I'm hoping this project helps in that regard. Here's what you need to test
> this:
> - Make sure you have the latest commits of apertium and lttoolbox
> installed.
> - Latest commits of -separable, -anaphora, etc. if you're using those in
> your mode.
> - Clone and install https://github.com/TinoDidriksen/transfuse .
>
> After this all you need to do is pipe your html document to
> tf-html-fragment and give as argument a translation mode of your language
> pair of choice (full translation modes).
>
> Example:
>
> $ echo 'Hello <b>big green</b> <i>world</i>!' | tf-html-fragment \
>     /Users/khannatanmai/Documents/GSoC/repo/main/apertium-eng-spa/modes/eng-spa.mode
>
>
> Hola <i>Mundo</i> <b>verde grande</b> !
>
>
> It only works for html right now, but we're in the process of supporting
> all usual document types.
>
>
> *Known issues:*
>
> - If a transfer rule has multiple words in the pattern, and the output
> contains an LU that wasn't clipped from any word in the input, no
> wordbound blank is put on that LU.
>
> - If -separable detects a string of words, the formatting of each word is
> combined and applied to the entire string of words.
>
> - apertium-recursive isn't supported as of now. It will be by the end of
> the project though.
>
>
> If you have any questions or suggestions, I'd be glad to respond to them on
> this thread. If you need help testing this on your language pair, you can
> contact us on IRC. The same goes if you find any bugs or have any feature
> requests.
>
>
> Enjoy!
> *तन्मय खन्ना *
> *Tanmai Khanna*
> _______________________________________________
> Apertium-stuff mailing list
> Apertium-stuff@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>