Since 2002, NIST has been coordinating yearly evaluations of Machine
Translation technology. The evaluations have focused on text-to-text
systems in the domains of "newswire", "web data" and transcripts of
"broadcast news".
Plans for the next NIST Open MT evaluation (MT-07)
(http://www.nist.gov/speech/tests/mt/) are well underway and include
some exciting additions. We expect to post the MT-07 evaluation
specification document in early June; in the meantime, here is a
summary of our plans.
Planned Schedule:
* Evaluation period: Nov 5-9, 2007
* Evaluation workshop: Jan 17-18, 2008
Confirmed Tasks:
* Arabic-to-English
* (NEW) Mystery language-to-English (not a surprise-language test;
see "Mystery Language" below)
Hoped-for Tasks (pending funding):
* Chinese-to-English
* (NEW) English-to-Chinese
Metrics:
* We will again run a suite of automatic metrics
* (NEW) We will produce the infrastructure for web-based human assessments
(sites can volunteer to participate)
* (NEW) MT Comprehension tests for Arabic (and Chinese) to English
Evaluation data:
* The web-crawling cut-off date will likely be June 1st.
* Genres will include Newswire and Web data
* (NEW) Half of the evaluation test set for Arabic (and Chinese)
will be defined as a "progress test" set. Strict usage rules will apply.
We define a progress test set as data that remains completely blind for
a series of evaluations. Source data and intermediate derivative files
are deleted after processing. The translations are never viewed.
Reference data is never released.
Mystery Language:
NIST will use one of the languages in the LDC's "less commonly taught
languages" collection, which includes lexicons, monolingual and
bilingual corpora, and other resources for a number of languages. We
anticipate releasing a DVD of development data and tools containing
everything required to build or modify an MT system for this task. The
language will be announced once we settle on the data pack to use, and
the DVD will be distributed well before the evaluation. Further details
will be announced as they are finalized.
--
// Mark Przybocki - NIST, Phone: 301/975-3347 FAX: 301/670-0939 \\