Hi Hieu,

Thanks a lot for the extra information. I still haven't tried the factored model, so I will give it a try first before investigating the options within the tree models any further.
Although I saw one of your related posts (link pasted below), written in reply to a question about factored models and tagging, I would still like to make sure: when tagging the source and target languages, do the taggers have to use the same tagset, or is it OK if they use different ones?

http://www.mail-archive.com/[email protected]/msg00736.html

Thank you very much for your support.

Regards,
Arda Tezcan

------------------------------

Message: 2
Date: Tue, 31 Aug 2010 13:01:16 +0100
From: Hieu Hoang <[email protected]>
Subject: [Moses-support] Fwd: Re: Tree based models - Eng > Ger general question
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset="iso-8859-1"

hi arda

I think an email by Chris Dyer sums up the issue: it's pretty hard to beat the phrase-based BLEU for many language pairs.
http://www.mail-archive.com/[email protected]/msg01995.html

Here's Edinburgh's attempt from this year's WMT10:
http://aclweb.org/anthology-new/W/W10/W10-1715.pdf

The straightforward way of adding syntax severely reduces BLEU; you have to add something extra to get any gains. Off the top of my head, the main ways that I've seen so far are:
1. Adding alternative parses, e.g. forest decoding
2. Mixing up the parse tree, e.g. SAMT
3. Soft constraints instead of hard constraints, e.g. http://www.isi.edu/~chiang/papers/acl2010-chiang.pdf
4. Occasionally ignoring syntax, e.g. http://aclweb.org/anthology-new/W/W10/W10-1761.pdf

There are loads of other ways and papers I haven't mentioned.

***************
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
