Hello Eduard Hovy,

Continuing the discussion:

> > 3. Has EBMT as a paradigm been 'muscled out' by the more dominant
> > SMT approach?
>
> I don't think so, because SMT is no longer a single unitary approach,
> and will continue to split into flavors, just as MT has done. And
> these flavors will increasingly correspond to the old approaches of
> the Vauquois pyramid. SMT just does much of the tedious work
> automatically, and in some cases much better, than humans can. EBMT
> should not try to compete with SMT on its strength (brute force
> learning of large sets of example patterns), but should co-opt SMT as
> a technique to do its gruntwork.

Does one conclude from this that SMT is more a tool than an MT engine in its own right, and that the underlying engine still remains EBMT? If so, where is the question of EBMT competing with SMT?

> The open questions of MT are still open: no-one can properly handle
> interpersonal/stylistic/pragmatic effects of communication. If one
> could use EBMT methods to capture style, for example, you'd be doing
> something that syntax-based transfer approaches would find very hard
> to do, and SMT approaches would struggle with given their need for
> large corpora of unitary style.

Is it not a myth that SMT as an MT engine needs only a large corpus from the application domain as its 'fuel'? Any failure in SMT could easily be attributed to inadequate corpora. However, is this not true even for EBMT, since examples are in any case drawn from corpora, implicitly or explicitly, formally or informally, automatically or manually? Where corpora are not available, as has been the case for the Indic languages in which I work, we have used an 'interactive, incrementally growing' example-base that takes its inputs from actual usage and starts exhibiting saturation after long use. This is a tedious way of growing an example-base, and SMT could be put to use here if parallel corpora were available; however, the idiosyncrasies of the unitary style referred to above can be more easily taken care of using my approach.

So, the open questions of MT that you mention have no particular preference for EBMT or SMT: both need extensive examples of actual usage in communication in the style concerned. This only suggests that the real answer lies in hybridization, maybe call it SEBMT (Statistical Example-based MT), wherein the basic MT engine is EBMT and SMT is used for learning replacement rules and reinforcing them. Again, a pure EBMT (with a raw example-base) is not 'practical', and so we use 'generalization' (what I have referred to as 'abstraction' in our works) using syntacto-semantic information (drawing on some aspects of KBMT). Further, using 'chunks' as examples calls for some aspects of RBMT. Automatic statistical detection of chunks may not be error-free, and even here using some aspects of RBMT is helpful. Thus, finally, it is hybridization that works. This is what I have been advocating in our works, call it HEBMT, SEBMT or HRSEBMT (hybridized rule-statistical-example based MT). In my opinion, no single paradigm can 'muscle out' the other paradigms.

Cheers,
RMK Sinha

-----------------------------------------------------------------------
Dr. R.M.K. Sinha
Indian Institute of Technology, Kanpur 208016 India
E-mail: rmk.iitk.ac.in
Home-page URL: http://www.cse.iitk.ac.in/users/rmk/
-----------------------------------------------------------------------

> Message: 2
> Date: Fri, 9 Jul 2004 12:10:22 -0700
> To: [EMAIL PROTECTED]
> From: Eduard Hovy <[EMAIL PROTECTED]>
> Subject: [Mt-list] Re: MT-List digest, Vol 1 #36 - 2 msgs
>
> Hello Andy,
>
> >Date: Thu, 08 Jul 2004 14:28:12 +0100
> >From: Andy Way <[EMAIL PROTECTED]>
>
> > 2. Can anyone envisage a situation where an SMT paper was asked to
> > compare its results against an MT model?
>
> More than most other approaches, SMT people tend to ignore previous
> work in the mistaken belief that it is not relevant, because SMT is
> such a new paradigm. That is simply wrong, of course, as people have
> argued (even in this mailing list).
> SMT is re-treading the path of
> older approaches, but now doing things automatically that used to be
> done by hand:
> - the initial IBM work recreated word-replacement MT, but learned the
>   replacement rules automatically
> - Och's and other current SMT is redoing EBMT, but learning the phrases
>   (i.e., examples) automatically
> - Yamada and Knight, Wu, and Melamed each are working on versions of
>   transfer, with the rules, again, being learned
>
> > 1. Can papers on EBMT succeed in getting published (especially in
> > non-expert, i.e. MT-specific, conferences) without making direct
> > comparisons to SMT?
>
> Given the above trend, I think an effective response is to explicitly
> say in an EBMT paper "yes I am doing EBMT but creating the example
> phrases and their translation by hand; some SMT is creating the
> phrases by machine; for me an open question is not only how to create
> lots of patterns automatically but how good the actual patterns are",
> which simultaneously shows familiarity with the relevant SMT work,
> brings it into the picture in the right way, and addresses a point on
> which SMT-style EBMT is vulnerable.
>
> the bigger point, though, is: why should one not make comparisons to
> SMT-style EBMT? A serious weakness of EBMT has always been the
> bottleneck of building the example patterns and their translations
> manually. SMT-style EBMT claims to overcome this bottleneck. Good
> science demands that old-style EBMT work address this. You can still
> then redirect the issue to the particular other, non-building, point
> you are investigating.
>
> > 3. Has EBMT as a paradigm been 'muscled out' by the more dominant
> > SMT approach?
>
> I don't think so, because SMT is no longer a single unitary approach,
> and will continue to split into flavors, just as MT has done. And
> these flavors will increasingly correspond to the old approaches of
> the Vauquois pyramid.
> SMT just does much of the tedious work
> automatically, and in some cases much better, than humans can. EBMT
> should not try to compete with SMT on its strength (brute force
> learning of large sets of example patterns), but should co-opt SMT as
> a technique to do its gruntwork.
>
> The open questions of MT are still open: no-one can properly handle
> interpersonal/stylistic/pragmatic effects of communication. If one
> could use EBMT methods to capture style, for example, you'd be doing
> something that syntax-based transfer approaches would find very hard
> to do, and SMT approaches would struggle with given their need for
> large corpora of unitary style.
>
> E
>
> --
> Eduard Hovy
> email: [EMAIL PROTECTED]   USC Information Sciences Institute
> tel: 310-448-8731           4676 Admiralty Way
> fax: 310-823-6714           Marina del Rey, CA 90292-6695
> http://www.isi.edu/natural-language/nlp-at-isi.html

_______________________________________________
MT-List mailing list
[EMAIL PROTECTED]
http://www.computing.dcu.ie/mailman/listinfo/mt-list
