-------------------------------------------------------------------------
Patent Machine Translation Task (PatentMT) at NTCIR-10
Call for Participation
June 18-23, 2013, Tokyo, Japan
http://ntcir.nii.ac.jp/PatentMT-2/
- Patent examination evaluation will be conducted
- Human evaluation will be conducted
- Parallel corpora provided:
1 million Chinese-English sentence pairs
3 million Japanese-English sentence pairs
-------------------------------------------------------------------------
Those interested are invited to participate in the Patent Machine
Translation task (PatentMT) at NTCIR-10.
Patents are one of the more challenging domains for machine
translation because patent sentences are often very long and
structurally complex. Moreover, there is a significant practical need
for machine translation of patent documents. Let us cultivate this
challenging research field!
PatentMT builds on the previous patent translation tasks.
The best systems for the previous PatentMT at NTCIR-9 were:
  Subtask                        System type   % of understandable sentences
- Chinese to English (CE)        SMT           80%
- Japanese to English (JE)       RBMT          63%
- English to Japanese (EJ)       SMT           60%
These evaluation results were based on the quality of randomly selected
sentences. In addition, NTCIR-10 plans to evaluate usefulness in
patent examination, measure progress over time, and compare CE and
JE translations.
Four types of evaluations are planned:
- Intrinsic Evaluation (IE)
Similar to the NTCIR-9 evaluation. The quality of translated
sentences will be evaluated using new test sets.
Human and automatic evaluations will be conducted.
- Patent Examination Evaluation (PEE)
New: The usefulness of machine translation for patent
examination will be evaluated.
Real reference patent documents that were used to reject patent
applications will be machine translated, and the translation
results will be evaluated to see if they would be useful for
examining patent applications.
This evaluation will be conducted for the CE and JE subtasks.
The Nippon Intellectual Property Translation Association (NIPTA)
will cooperate on PEE.
- Chronological Evaluation (ChE)
New: A comparison between NTCIR-10 and NTCIR-9 results to measure
progress over time, using the NTCIR-9 test sets.
- Multilingual Evaluation (ME)
New: A comparison of CE and JE translations against the same
English references to examine how translation quality depends on
the source language.
The "training resources" are as follows:
- Chinese to English subtask:
* 1 million Chinese-English parallel sentences
* 300 million English patent sentences
- Japanese to English subtask:
* 3 million Japanese-English parallel sentences
* 300 million English patent sentences
- English to Japanese subtask:
* 3 million Japanese-English parallel sentences
* 400 million Japanese patent sentences
Moreover, blind "test sets" of patent descriptions will be released.
The use of the data will be governed by the NTCIR-10 agreement.
Participants are requested to TRANSLATE the test sets, to SUBMIT a paper
describing their MT system, and to SHOW UP and PRESENT their work at the
NTCIR-10 workshop in Tokyo.
=== Important Dates
- Training data release                      June 29, 2012 (Started)
- Task registration deadline                 August 31, 2012 (Extended)
- Test data release                          October 15, 2012
- Translation results submission deadline    October 28, 2012
- Evaluation results release                 February 1, 2013
- MT system description deadline             March 1, 2013
- Camera-ready deadline                      May 1, 2013
- NTCIR-10 workshop                          June 18-23, 2013
=== Organizers
Chinese-English side:
- Benjamin K. Tsou (Hong Kong Institute of Education/
City University of Hong Kong)
- Kapo Chow (Hong Kong Institute of Education)
- Bin Lu (City University of Hong Kong/
Hong Kong Institute of Education)
Japanese-English side:
- Isao Goto (National Institute of Information and
Communications Technology, NICT)
- Eiichiro Sumita (National Institute of Information and
Communications Technology, NICT)
For more information, please visit the NTCIR-10 PatentMT website:
http://ntcir.nii.ac.jp/PatentMT-2/
-------------------------------------------------------------------------