Hi

From what I've seen, Moses, even with all the tools that go with it,
requires a sentence-aligned bilingual corpus as its input. What if we
only have an unaligned parallel corpus? Do you know if there are
tools available to do this sentence-level alignment? There seems to
be something in python-nltk, based on Gale & Church, but it is recent
and not yet completely part of the package. Besides, the Gale & Church
algorithm uses only sentence lengths; presumably there are more
powerful algorithms that use dictionaries of word alignment
information? (I mean "static" dictionaries provided beforehand. I guess
theoretically there could be ways to "dynamically" use a word aligner
like GIZA on an unaligned corpus: compute some word alignments, use
them to compute the sentence alignments, and feed the result back into
the word aligner. But static dictionaries seem more practical.)
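
To make the length-based idea concrete, here is a minimal sketch of a
Gale & Church style aligner. It is not the python-nltk code, and the
names match_cost and align are hypothetical. The priors and the
variance are the values usually quoted from the 1993 paper, so treat
them as assumptions. Sentence length is measured in characters, and a
dynamic program picks the cheapest sequence of 1-1, 1-0, 0-1, 2-1, 1-2
and 2-2 groupings.

```python
import math

# Assumed priors for each (source count, target count) grouping,
# as usually quoted from Gale & Church (1993).
PRIORS = {(1, 1): 0.89, (1, 0): 0.0099, (0, 1): 0.0099,
          (2, 1): 0.089, (1, 2): 0.089, (2, 2): 0.011}
C = 1.0    # assumed expected target characters per source character
S2 = 6.8   # assumed variance of that ratio


def match_cost(len_src, len_tgt, prior):
    """Negative log-probability of grouping blocks of these lengths."""
    if len_src == 0 and len_tgt == 0:
        return -math.log(prior)
    mean = (len_src + len_tgt / C) / 2.0
    delta = (len_tgt - len_src * C) / math.sqrt(mean * S2)
    # Two-sided tail probability of a standard normal variable.
    tail = math.erfc(abs(delta) / math.sqrt(2.0))
    return -math.log(max(tail, 1e-300)) - math.log(prior)


def align(src, tgt):
    """Return a list of (source_indices, target_indices) groups."""
    n, m = len(src), len(tgt)
    src_len = [len(s) for s in src]
    tgt_len = [len(t) for t in tgt]
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            for (di, dj), prior in PRIORS.items():
                if i < di or j < dj:
                    continue
                step = match_cost(sum(src_len[i - di:i]),
                                  sum(tgt_len[j - dj:j]), prior)
                total = cost[i - di][j - dj] + step
                if total < cost[i][j]:
                    cost[i][j] = total
                    back[i][j] = (di, dj)
    # Walk the back-pointers from the end to recover the groups.
    groups = []
    i, j = n, m
    while i > 0 or j > 0:
        di, dj = back[i][j]
        groups.append((list(range(i - di, i)), list(range(j - dj, j))))
        i, j = i - di, j - dj
    groups.reverse()
    return groups
```

For example, given three English sentences and four French ones where
the second English sentence was translated as two, align(src, tgt)
should group source sentence 1 with target sentences 1 and 2. Because
only lengths are used, a dictionary-based method would be needed to
catch cases where lengths happen to match but the content does not.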

Also, since this step usually requires human supervision, do you know
if there are open-source / Unix GUI tools to assist in editing the
proposed alignments (comparable to Trados WinAlign)?

Best regards,

-- 
Raphael Payen