[Mt-list] Re: SMT vs. EBMT: MT-List digest, Vol 1 #37 - 2 msgs

Prof. R.M.K. Sinha Sun, 11 Jul 2004 11:21:24 -0700

Hello Eduard Hovy,

Continuing the discussion:


> > 3. Has EBMT as a paradigm been 'muscled out' by the more
> > dominant SMT approach?
> 
> I don't think so, because SMT is no longer a single unitary approach,
> and will continue to split into flavors, just as MT has done.  And
> these flavors will increasingly correspond to the old approaches of
> the Vauquois pyramid.  SMT just does much of the tedious work
> automatically, and in some cases much better, than humans can.  EBMT
> should not try to compete with SMT on its strength (brute force
> learning of large sets of example patterns), but should co-opt SMT as
> a technique to do its gruntwork.
> 

Does one conclude from this that SMT is more a tool than an MT engine
in its own right, and that the underlying engine still remains EBMT?
If so, where is the question of EBMT competing with SMT?
 
> The open questions of MT are still open: no-one can properly handle  
> interpersonal/stylistic/pragmatic effects of communication.  If one
> could use EBMT methods to capture style, for example, you'd be doing
> something that syntax-based transfer approaches would find very hard
> to do, and SMT approaches would struggle with given their need for
> large corpora of unitary style.

Is it not a myth that SMT as an MT engine needs only a large corpus
from the application domain as its 'fuel'? Any failure in SMT can
easily be attributed to an inadequate corpus. However, is this not
equally true of EBMT, since its examples are in any case drawn from
corpora, implicitly or explicitly, formally or informally,
automatically or manually? Where corpora are not available, as has
been the case for the Indic languages I work on, we have used an
'interactive, incrementally growing' example-base that takes its
inputs from actual usage and starts to exhibit saturation after long
use. This is a tedious way of growing an example-base, and SMT could
be used for it if parallel corpora were available. However, the
idiosyncrasies of the unitary style you refer to can be handled more
easily with my approach.

So the open questions of MT that you mention show no particular
preference for EBMT or SMT: both need extensive examples of actual
usage of the style in communication. This only suggests that the real
answer lies in hybridization, maybe call it SEBMT (Statistical
Example-Based MT), in which the basic MT engine is EBMT and SMT is
used for learning replacement rules and reinforcing them. Again, a
pure EBMT (with a raw example-base) is not 'practical', so we use
'generalization' (what I have referred to as 'abstraction' in our
work) based on syntacto-semantic information (using some aspects of
KBMT). Further, using 'chunks' as examples calls for some aspects of
RBMT. Automatic statistical detection of chunks may not be error-free,
and here too some aspects of RBMT are helpful. Thus, finally, it is
hybridization that works. This is what I have been advocating in our
work, call it HEBMT, SEBMT or HRSEBMT (hybridized
rule-statistical-example based MT).

In my opinion, no single paradigm can 'muscle out' other paradigms.

Cheers,

RMK Sinha
-----------------------------------------------------------------------
Dr. R.M.K. Sinha
Indian Institute of Technology, Kanpur 208016 India            
E-mail: rmk.iitk.ac.in
Home-page URL: http://www.cse.iitk.ac.in/users/rmk/
-----------------------------------------------------------------------
> 
> Message: 2
> Date: Fri, 9 Jul 2004 12:10:22 -0700
> To: [EMAIL PROTECTED]
> From: Eduard Hovy <[EMAIL PROTECTED]>
> Subject: [Mt-list] Re: MT-List digest, Vol 1 #36 - 2 msgs
> 
> 
> Hello Andy,
> 
> >Date: Thu, 08 Jul 2004 14:28:12 +0100
> >From: Andy Way <[EMAIL PROTECTED]>
> >
> >     2. Can anyone envisage a situation where an SMT paper was asked to
> >        compare its results against an MT model?
> 
> More than most other approaches, SMT people tend to ignore previous 
> work in the mistaken belief that it is not relevant, because SMT is 
> such a new paradigm.  That is simply wrong, of course, as people have 
> argued (even in this mailing list).  SMT is re-treading the path of 
> older approaches, but now doing things automatically that used to be 
> done by hand:
> - the initial IBM work recreated word-replacement MT, but learned the
>    replacement rules automatically
> - Och's and other current SMT is redoing EBMT, but learning the phrases
>    (i.e., examples) automatically
> - Yamada and Knight, Wu, and Melamed each are working on versions of
>    transfer, with the rules, again, being learned
> 
> 
> >     1. Can papers on EBMT succeed in getting published (especially in
> >        non-expert, i.e. MT-specific, conferences) without making direct
> >        comparisons to SMT?
> 
> Given the above trend, I think an effective response is to explicitly 
> say in an EBMT paper "yes I am doing EBMT but creating the example 
> phrases and their translation by hand; some SMT is creating the 
> phrases by machine; for me an open question is not only how to create 
> lots of patterns automatically but how good the actual patterns are", 
> which simultaneously shows familiarity with the relevant SMT work, 
> brings it into the picture in the right way, and addresses a point on 
> which SMT-style EBMT is vulnerable.
> 
> The bigger point, though, is: why should one not make comparisons to 
> SMT-style EBMT?  A serious weakness of EBMT has always been the 
> bottleneck of building the example patterns and their translations 
> manually.  SMT-style EBMT claims to overcome this bottleneck.  Good 
> science demands that old-style EBMT work address this.  You can still 
> then redirect the issue to the particular other, non-building, point 
> you are investigating.
> 
> 
> >     3. Has EBMT as a paradigm been 'muscled out' by the more dominant
> >        SMT approach?
> 
> I don't think so, because SMT is no longer a single unitary approach, 
> and will continue to split into flavors, just as MT has done.  And 
> these flavors will increasingly correspond to the old approaches of 
> the Vauquois pyramid.  SMT just does much of the tedious work 
> automatically, and in some cases much better, than humans can.  EBMT 
> should not try to compete with SMT on its strength (brute force 
> learning of large sets of example patterns), but should co-opt SMT as 
> a technique to do its gruntwork.
> 
> The open questions of MT are still open: no-one can properly handle 
> interpersonal/stylistic/pragmatic effects of communication.  If one 
> could use EBMT methods to capture style, for example, you'd be doing 
> something that syntax-based transfer approaches would find very hard 
> to do, and SMT approaches would struggle with given their need for 
> large corpora of unitary style.
> 
> E
> 
> -- 
> Eduard Hovy
> email: [EMAIL PROTECTED]            USC Information Sciences Institute
> tel: 310-448-8731            4676 Admiralty Way
> fax: 310-823-6714            Marina del Rey, CA 90292-6695
> http://www.isi.edu/natural-language/nlp-at-isi.html
> 
> 
> 



_______________________________________________
MT-List mailing list
[EMAIL PROTECTED]
http://www.computing.dcu.ie/mailman/listinfo/mt-list
