Re: [Moses-support] Giza + Phrase extraction

Adam Lopez Wed, 23 Jul 2008 10:32:06 -0700

Can you clarify what you're asking for?

While there is no specific utility in Moses that does any removal of  
word alignments or phrases, I have found that using diverse features  
(such as the lexical weighting features) results in some de facto  
cleaning of bad phrases (because phrases that score well under all  
features are much more likely to be used).  For Giza++, the  
symmetrization heuristic can discard links from the initial  
alignments; you might view this as a type of cleanup.  You could also  
view the various supervised alignment methods as a type of cleanup for  
unsupervised methods like Giza++ (note: none of these are included  
with Moses).  This is the topic of Fazil Ayan's thesis:
http://www.lib.umd.edu/drum/handle/1903/3126
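
To make the symmetrization point concrete, here is a minimal sketch of the
simplest heuristic, intersection, which keeps only the links that both
directional alignments agree on.  It is an illustration, not the Moses
implementation: the function names are hypothetical, and it assumes both
alignments are already written as "i-j" pairs using the same
(source index, target index) convention.

```python
def parse_alignment(line):
    """Parse 'i-j' pairs such as '0-0 1-2' into a set of (src, tgt) tuples."""
    return {tuple(int(x) for x in pair.split("-")) for pair in line.split()}


def intersect(src_to_tgt, tgt_to_src):
    """Keep only the links proposed by both directional alignments."""
    return src_to_tgt & tgt_to_src


# Toy directional alignments for one sentence pair (made-up data).
forward = parse_alignment("0-0 1-1 2-1 3-2")
backward = parse_alignment("0-0 1-1 3-2 3-3")

kept = intersect(forward, backward)
# Links seen in only one direction are the ones that get "cleaned up".
discarded = (forward | backward) - kept

print(sorted(kept))
print(sorted(discarded))
```

Heuristics such as grow-diag-final start from this intersection and add
back some of the discarded links, so they discard less than the plain
intersection does.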

There are also various post hoc approaches to removing noise from  
phrase tables and alignments.  Some recent examples:
http://aclweb.org/anthology-new/D/D07/D07-1103.pdf
http://aclweb.org/anthology-new/W/W08/W08-0306.pdf

Although there's nothing like this included in Moses, it would be easy  
to contribute one as a standalone script.
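
As a rough idea of what such a standalone script might look like, the
sketch below drops phrase pairs whose score falls under a fixed
threshold.  This is a deliberately naive filter, not the method of the
papers linked above.  The function name, the default threshold, and the
choice of score column are all hypothetical; it assumes a phrase table
with " ||| "-separated fields and space-separated scores in the third
field, so check the layout of your own table before relying on it.

```python
def filter_phrase_table(lines, score_index=2, min_score=0.001):
    """Yield phrase-table lines whose chosen score is at least min_score.

    Assumes each line looks like 'source ||| target ||| s0 s1 s2 ...'.
    Which probability sits at score_index depends on how the table was
    built; the default here is only a placeholder.
    """
    for line in lines:
        fields = line.rstrip("\n").split(" ||| ")
        if len(fields) < 3:
            continue  # not in the assumed format; skip rather than guess
        scores = [float(s) for s in fields[2].split()]
        if score_index < len(scores) and scores[score_index] >= min_score:
            yield line


# Made-up entries for illustration only.
table = [
    "das haus ||| the house ||| 0.8 0.7 0.9 0.6 2.718\n",
    "das haus ||| house of ||| 0.01 0.02 0.0001 0.03 2.718\n",
]
kept_lines = list(filter_phrase_table(table))
print("".join(kept_lines), end="")
```

Because the function takes any iterable of lines, it can be pointed at an
open file handle or at sys.stdin to stream a large table without loading
it into memory.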

Cheers
Adam

On 23 Jul 2008, at 17:26, marco turchi wrote:

> Dear all,
> are there any checks either in Giza or during the phrase extraction  
> for discarding bad word alignments or phrase associations?
>
> Thanks a lot
> Marco
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

