Hello!
30/3/04
At 17:48 +0200 30/03/04, Nic Cottrell wrote:
Hi!
Has there been a lot of research into the possibility of using
user-interactivity to improve the quality of MT? I was thinking of
something like a spell checker but to assist in correct sense
disambiguation etc.
Any links or papers would be much appreciated.
Thanks,
Nic.
--
nicholas cottrell <[EMAIL PROTECTED]>
stockholm, sweden
phone +46 702 630 451
Yes, the idea is not new. Here is a very incomplete list.
Lab prototypes
- First attempts about 1965-67 by M.Kay and R.Kaplan at RAND Corp. to include ID (interactive disambiguation).
- ITS at BYU (Provo) 1973-81 or so (ref.: ask Alan Melby)
- N-Trans (Alvey project) in the 80's
- DLT (BSO research, 1982-88, ask Klaus Schubert)
- ITS-2 (LATL, Geneva, 198?--, ask E.Wehrli)
- LIDIA (GETA, 1989-1995, ask H.Blanchon)
Industrial systems or prototypes
- ALPS (Transactive), Weidner (CAT)...
- A system by VINITI (Moscow, I forgot its name)
- JETS (IBM-Japan, ?--1990 or so, ask Watanabe). Lexical and syntactic ID, using dependency analysis and constraint propagation.
- KANT-CATALYST (1992--, CMU for Caterpillar, with controlled input language and link to an ontology of the domain)
- Maybe LMT (IBM) has or had some ID possibility
- Taifun & Tsunami (EJ, JE): lexical ID.
- Also, a very new version of Systran includes some level of ID.
- Certainly many others (WordMagic, CIMOS...), check.
Contact H.Blanchon for more references.
A *main* point is that attempts to systematically ask questions
while the system is processing fail, because users become "slaves
of the machine" and don't like it! Another is that users should
not be assumed to be specialists in grammar, or to know the system
or the target language(s).
Questions should be asked
- at some intermediate points, at the discretion of the users (e.g., after all-path analysis)
- in the source language
- with straightforward questions (no trees, no notions of "PP-attachments"...)
AND they should be put to the right people. At IBM-Japan, the
technical writers agreed to answer lexical questions from JETS
(J->E), but not questions about dependencies between words. They
felt it was not their business -- and maybe the interface, although
very nice, was too "linguistic" for them.
Suppose you have "Paul's photograph".
It seems to be a bad idea to ask:
1. Paul -- owner_of --> photograph
2. Paul -- agent_of --> photograph
3. Paul -- object_of --> photograph
and to be better to ask:
1. Paul owns/owned the photograph?
2. Paul does/did the photograph?
3. Paul is/was on the photograph?
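To make the contrast concrete, here is a minimal sketch in Python of turning internal relation labels into questions of the second kind. It is not taken from any of the systems listed above: the PARAPHRASES table and the build_questions function are hypothetical names invented for this illustration, and a real system would need far more than three templates.

```python
# Hypothetical mapping from internal relation labels to plain-language
# question templates. The labels mirror the "Paul's photograph" example;
# none of this comes from an actual MT system.
PARAPHRASES = {
    "owner_of": "{mod} owns/owned the {noun}?",
    "agent_of": "{mod} does/did the {noun}?",
    "object_of": "{mod} is/was on the {noun}?",
}


def build_questions(modifier, noun, candidate_relations):
    """Return one numbered source-language question per candidate reading."""
    questions = []
    for number, relation in enumerate(candidate_relations, start=1):
        text = PARAPHRASES[relation].format(mod=modifier, noun=noun)
        questions.append(f"{number}. {text}")
    return questions


if __name__ == "__main__":
    readings = ["owner_of", "agent_of", "object_of"]
    for line in build_questions("Paul", "photograph", readings):
        print(line)
```

The user only ever sees the paraphrases; the relation labels stay inside the system and are recovered from the number of the answer.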
Or: "I know which hotel manages this office"
Compare:
1. hotel -- agent_of --> manages ? (y/n)
with the more understandable alternative:
1. hotel manages office?
2. office manages hotel?
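A rough sketch of the same idea for this case, again in Python and again purely illustrative (role_questions and chosen_reading are made-up names, and the bare "agent verb object" clause is an assumption about how one might phrase the choices): each possible assignment of roles becomes a short clause in the source language, and the number the user picks is mapped back to a dependency reading.

```python
from itertools import permutations


def role_questions(verb, participants):
    """List every agent/object assignment as a short numbered clause."""
    return [
        f"{number}. {agent} {verb} {obj}?"
        for number, (agent, obj) in enumerate(permutations(participants, 2), start=1)
    ]


def chosen_reading(verb, participants, answer):
    """Map the 1-based number picked by the user back to a reading."""
    agent, obj = list(permutations(participants, 2))[answer - 1]
    return {"verb": verb, "agent": agent, "object": obj}


if __name__ == "__main__":
    for line in role_questions("manages", ["hotel", "office"]):
        print(line)
    print(chosen_reading("manages", ["hotel", "office"], 1))
```

With two participants this gives exactly the two alternatives listed above; the user never has to answer a y/n question about an "agent_of" arc.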
Best,
CB
