Can you clarify what you're asking for? There is no specific utility in Moses that removes word alignments or phrases, but I have found that using diverse features (such as the lexical weighting features) results in some de facto cleaning of bad phrases, because phrases that score well under all features are much more likely to be used. For Giza++, the symmetrization heuristic can discard links from the initial alignments; you might view this as a type of cleanup. You could also view the various supervised alignment methods as a type of cleanup for unsupervised methods like Giza++ (note: none of these are included with Moses). This is the topic of Fazil Ayan's thesis: http://www.lib.umd.edu/drum/handle/1903/3126
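To make the symmetrization point concrete, here is a minimal sketch of the simplest heuristic, intersection, which keeps only the links that both directional alignments agree on. This is not Moses or Giza++ code: the function names and the toy alignments are hypothetical, and both alignments are assumed to be already expressed as (source index, target index) pairs. Heuristics such as grow-diag-final start from this intersection and add back some links from the union, so they discard fewer links than shown here.

```python
# Toy illustration only; names and data are hypothetical, not part of Moses.
# Each alignment is a set of (source_index, target_index) links.

def symmetrize_intersection(src2tgt, tgt2src):
    """Keep only the links proposed by both directional alignments."""
    return set(src2tgt) & set(tgt2src)


def discarded_links(src2tgt, tgt2src):
    """Links proposed in at least one direction but dropped by intersection."""
    union = set(src2tgt) | set(tgt2src)
    return union - symmetrize_intersection(src2tgt, tgt2src)


# Two directional alignments for the same sentence pair.
# The directions disagree about where source word 2 should link.
src2tgt = {(0, 0), (1, 1), (2, 3)}
tgt2src = {(0, 0), (1, 1), (2, 2)}

kept = symmetrize_intersection(src2tgt, tgt2src)
dropped = discarded_links(src2tgt, tgt2src)

print("kept:   ", sorted(kept))
print("dropped:", sorted(dropped))
```

In this toy case the two links on which the directions disagree are removed, which is the sense in which symmetrization acts as a cleanup step.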
There are also various post hoc approaches to removing noise from phrase tables and alignments. Some recent examples:

http://aclweb.org/anthology-new/D/D07/D07-1103.pdf
http://aclweb.org/anthology-new/W/W08/W08-0306.pdf

Although there's nothing like this included in Moses, it would be easy to contribute one as a standalone script.

Cheers
Adam

On 23 Jul 2008, at 17:26, marco turchi wrote:

> Dear all,
> are there any checks either in Giza or during the phrase extraction
> for discarding bad word alignments or phrase associations?
>
> Thanks a lot
> Marco
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
