In case you need a library, I recently packaged the Europarl sentence splitter and sentence aligner tools into two Perl modules on CPAN:

http://search.cpan.org/~achimru/Lingua-Sentence-1.00/
http://search.cpan.org/~achimru/Text-GaleChurch-1.00/
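
For anyone curious what the length-based approach amounts to, here is a minimal Python sketch of Gale & Church style alignment. It is not the code or the API of the Perl modules above: the names bead_cost and align are made up for illustration, and the priors and variance are the values usually quoted from the Gale & Church paper, so treat them as assumptions to tune for your own corpus.

```python
import math

# Bead priors and variance as usually quoted from Gale & Church (1993).
# These are assumptions for this sketch, not values read from any tool.
PRIORS = {(1, 1): 0.89, (1, 0): 0.0099, (0, 1): 0.0099,
          (2, 1): 0.089, (1, 2): 0.089, (2, 2): 0.011}
MEAN_RATIO = 1.0  # expected target characters per source character
VARIANCE = 6.8    # variance of the length difference per character


def bead_cost(src_len, tgt_len, prior):
    """Negative log-probability of pairing spans with these character lengths."""
    if src_len == 0 and tgt_len == 0:
        return -math.log(prior)
    mean = (src_len + tgt_len / MEAN_RATIO) / 2.0
    delta = (tgt_len - src_len * MEAN_RATIO) / math.sqrt(mean * VARIANCE)
    # Two-tailed probability of a deviation at least this large.
    tail = math.erfc(abs(delta) / math.sqrt(2.0))
    return -math.log(prior) - math.log(max(tail, 1e-300))


def align(src_lens, tgt_lens):
    """Return beads as (source indices, target indices) using lengths only."""
    n, m = len(src_lens), len(tgt_lens)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    # Forward dynamic programme: extend every reachable cell by each bead type.
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for (di, dj), prior in PRIORS.items():
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                c = cost[i][j] + bead_cost(
                    sum(src_lens[i:ni]), sum(tgt_lens[j:nj]), prior)
                if c < cost[ni][nj]:
                    cost[ni][nj] = c
                    back[ni][nj] = (i, j)
    # Trace the cheapest path back from the end of both texts.
    beads = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        beads.append((list(range(pi, i)), list(range(pj, j))))
        i, j = pi, pj
    beads.reverse()
    return beads


# Usage: pass character lengths of the sentences on each side.
print(align([20, 80, 30], [21, 38, 43, 29]))
```

Since only lengths are compared, a sketch like this cannot tell two similar-length sentences apart by content, which is where the dictionary-based tools mentioned in the thread come in.
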
Achim

2010/3/22 Jörg Tiedemann <[email protected]>:
>
> Europarl comes with a sentence aligner:
> http://statmt.org/europarl/v5/tools.tgz
>
> You can also use hunalign:
> http://mokk.bme.hu/resources/hunalign
> (look at the "realign" feature for lexical matching)
> GMA:
> http://nlp.cs.nyu.edu/GMA/
>
> Uplug includes all three and also a tool for interactive
> (semi-automatic) sentence alignment:
> http://sourceforge.net/projects/uplug/
> http://www.let.rug.nl/~tiedeman/Uplug/php/
>
>
> Jörg
>
>
> Raphael Payen wrote:
>> Hi
>>
>> From what I've seen, Moses, even with all the tools that go with it,
>> requires a sentence-aligned bilingual corpus as its input. What if we
>> only have an unaligned parallel corpus? Do you know if there are
>> tools available to do this sentence-level alignment? There seems to
>> be something in python-nltk, based on Gale & Church, but it is recent
>> and not yet completely part of the package. Besides, the Gale & Church
>> algorithm uses only sentence lengths; there are probably more
>> powerful algorithms, using dictionaries of word alignment information?
>> (I mean "static" dictionaries provided beforehand; I guess
>> theoretically there could be ways to "dynamically" use a word aligner
>> like GIZA++ on an unaligned corpus, compute some word alignments, use
>> them to compute the sentence alignments, and feed this to itself, but
>> static dictionaries seem more practical).
>>
>> Also, since this step usually requires human supervision, do you know
>> if there are open-source / Unix GUI tools to assist in editing
>> the proposed alignments (comparable to Trados WinAlign)?
>>
>> Best regards,
>>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
