Dear mt-list users,
Especially to Simon Zwarts, Dekai, John Smart, Andy Way and Michael Carl: thanks for your kind messages and explanations.
I think I now understand the idea behind the question I asked. In fact, I am still reading the introductory bibliography in the area, so as I read more, my ideas should become clearer.
Thank you, Alberto
Simon Zwarts wrote:
On Thu, 3 Feb 2005 08:11, Alberto Manuel Brandao Simoes wrote:
Meanwhile, I found this article from Microsoft Research:
http://research.microsoft.com/research/pubs/view.aspx?type=Publication&id=1354
After reading the introduction, almost all examples of what they call Phrasal SMT seem (to me) to be examples of EBMT systems.
Hello,
First of all, I think they admit right from the start that they are working on ... bridged the gap between the domain-specific learning of Example-based and SMT systems and ... (although in this quote they are referring to a previous system, they indicate that they want to solve some problems there).
The reason this should still be classified as an SMT system rather than an Example-based system is that they still employ the typical SMT noisy channel model (see Chapter 3), where the problem splits into the two well-known parts of decoding and the language model, both of which are statistical models. The decoding part (Chapters 3 and 4) uses trees, and it is trained on examples to obtain these statistics.
As a result, the decoding part can produce lots of translations (because, according to the SMT paradigm, every sentence has a certain probability of being a translation of another), so a search algorithm is needed (4.1 and 4.2). So there is no "one way" to transfer a source sentence into a target sentence; lots of possible translations are "ranked" against the language model.
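To make the ranking idea concrete, here is a minimal sketch in Python. It is not the article's decoder: the candidate sentences and all probabilities are invented for illustration, and the name `rank_candidates` is hypothetical. It only shows how several candidate translations can each receive a score from a translation model and a language model, with the best-scoring one winning.

```python
import math

# Toy numbers, invented for illustration only (not taken from the article).
# translation_model[e] stands in for P(f | e) for one fixed source sentence f.
translation_model = {
    "the house is small": 0.20,
    "small is the house": 0.25,
    "the home is little": 0.10,
}

# language_model[e] stands in for P(e), the fluency of the target sentence.
language_model = {
    "the house is small": 0.05,
    "small is the house": 0.005,
    "the home is little": 0.01,
}


def rank_candidates(tm, lm):
    """Rank candidates e by log P(f|e) + log P(e), best first."""
    scored = [(math.log(tm[e]) + math.log(lm[e]), e) for e in tm]
    return [e for _, e in sorted(scored, reverse=True)]


ranking = rank_candidates(translation_model, language_model)
best = ranking[0]
print(best)
```

Note that "small is the house" has the highest translation-model score on its own, but the language model pushes it down, so no single stored example decides the output.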
The model is trained on a corpus with structure only in the source language. The structure of the target language is derived from the statistical alignment model and the dependency tree of the source (!) sentence (p. 11). There are no examples of structure in the target language, which would be more the EBMT way.
Furthermore, where in an EBMT system you try to store as many examples as possible for future use, most "examples" here "disappear" into statistics. It's not so important to have individual examples. When they show only one example, as at the end of Chapter 2, this is hard to see.
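A small sketch of what "disappearing into statistics" can look like, assuming plain relative-frequency estimation (a common choice in phrase-based SMT, not necessarily what the article uses). The phrase pairs and the name `estimate_phrase_table` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical aligned phrase pairs (source, target) from a training corpus.
phrase_pairs = [
    ("casa", "house"),
    ("casa", "house"),
    ("casa", "home"),
    ("pequena", "small"),
]


def estimate_phrase_table(pairs):
    """Relative-frequency estimate of P(target | source)."""
    pair_counts = Counter(pairs)
    source_counts = Counter(src for src, _ in pairs)
    table = defaultdict(dict)
    for (src, tgt), n in pair_counts.items():
        table[src][tgt] = n / source_counts[src]
    return table


table = estimate_phrase_table(phrase_pairs)
print(dict(table))
```

After training, only the probabilities remain; the individual sentences the pairs came from are no longer consulted, unlike in an example base.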
This is not the first article on phrase-based SMT; the articles they refer to might show more clearly why this phrase-based approach is still SMT.
Well, this is how I see it,
Simon Zwarts - Language Technology group - Macquarie University
_______________________________________________
Mt-list mailing list
--
Alberto Simões - Departamento de Informática - Universidade do Minho
Perl Monger - www.perl-hackers.net
