Hello! I just found a potential issue and wanted to double-check with you (you have probably already looked into it, and it may not be a real issue): looking at transfuse's code, I saw that tf-mangle-mode does tf-close at the generation step.
How does it work when a postgen step merges some of the words?

--
Xavi Ivars
< http://xavi.ivars.me >

On Mon, 20 Jul 2020, 20:34, Tanmai Khanna <khanna.tan...@gmail.com> wrote:

> Hey guys,
> Using wordbound blanks, we've modified the Apertium pipeline, modules and
> stream such that inline markup tags now move around with words in transfer,
> merge when LUs merge, split when LUs split, to preserve the formatting of
> the input document. If you want to follow the further development of this
> project, see here
> <https://wiki.apertium.org/wiki/User:Khannatanmai/Wordbound_blanks>.
>
> We have a decent version that is ready to test that does markup handling
> for html documents. It will undergo extensive testing as part of this
> project, but I thought it'll be a good idea to let the community test it
> themselves on their language pairs based on their needs so that we can
> understand what features need to be added, and what needs to be fixed.
> Apertium users have been asking for markup handling for quite some time now
> and had no other option but to use wrappers that try to guess alignments.
> I'm hoping this project helps in that regard. Here's what you need to test
> this:
> - Make sure you have the latest commits of apertium and lttoolbox
>   installed.
> - Latest commits of -separable, -anaphora, etc. if you're using those in
>   your mode.
> - Clone and install https://github.com/TinoDidriksen/transfuse .
>
> After this all you need to do is pipe your html document to
> tf-html-fragment and give as argument a translation mode of your language
> pair of choice (full translation modes).
>
> Example:
>
> $ echo 'Hello <b>big green</b> <i>world</i>!' | tf-html-fragment /Users/khannatanmai/Documents/GSoC/repo/main/apertium-eng-spa/modes/eng-spa.mode
>
> Hola <i>Mundo</i> <b>verde grande</b> !
>
> It only works for html right now, but we're in the process of supporting
> all usual document types.
>
> *Known issues:*
>
> - If a transfer rule has multiple words in the pattern, and in the output
>   there is a LU that wasn't clipped from any word in the input, it won't put
>   a wordbound blank on that LU.
>
> - If -separable detects a string of words then the format of each will be
>   combined and added on the entire string of words.
>
> - apertium-recursive isn't supported as of now. It will be by the end of
>   the project though.
>
> If you have any questions, suggestions, I'd be glad to respond to them on
> this thread. If you need help testing this on your language pair you can
> contact us on the IRC. Same if you find any bugs, or have any feature
> requests.
>
> Enjoy!
> *तन्मय खन्ना*
> *Tanmai Khanna*
> _______________________________________________
> Apertium-stuff mailing list
> Apertium-stuff@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff