Dear all EAMT members,
I'm Toshiaki Nakazawa from JST (Japan Science and Technology Agency),
Japan. This is an announcement of the 3rd Workshop on Asian
Translation (WAT2016), held as a workshop of Coling2016. If you work
on machine translation, please join us.
### What's New about WAT2016 ###
1. held as a workshop of Coling2016
2. new language pairs
- Hindi-English, Hindi-Japanese, Indonesian-English, Chinese-English
3. research papers are invited
################################
Best regards,
---------------------------------------------------------------------------
WAT 2016
(The 3rd Workshop on Asian Translation)
in conjunction with COLING 2016
http://lotus.kuee.kyoto-u.ac.jp/WAT/
December 12, 2016, Osaka, Japan
Following the success of the previous Workshops on Asian Translation
(WAT 2014 and WAT 2015), WAT 2016 will bring together machine
translation researchers and users to try, evaluate, share and discuss
new ideas in machine translation. We are working toward the
practical use of machine translation among all Asian countries.
For WAT 2016, we adopt the new translation subtasks
"Hindi-to-English/Japanese mixed domain translation" as well as
"Indonesian/Chinese/Japanese-to-English newswire translation" in
addition to the subtasks that were conducted in the previous two
workshops. The workshop will also feature research papers on topics
related to machine translation, especially for Asian languages.
WAT 2016 also invites researchers to submit their original work on
machine translation of Asian languages. The scope covers studies and
reports on theories, techniques, and resources to improve the machine
translation of Asian languages. All submitted research papers will
undergo double-blind peer review to decide whether they will appear
at the workshop.
Topics of interest include, but are not limited to:
- Word-/phrase-/syntax-/semantics-/rule-based, neural and hybrid machine
translation
- Asian language processing
- Incorporating linguistic information into machine translation
- Decoding algorithms
- System combination
- Error analysis
- Manual and automatic machine translation evaluation
- Machine translation applications
- Quality estimation
- Domain adaptation
- Machine translation for low-resource languages
- Language resources
************************* IMPORTANT NOTICE *************************
Participants in the previous workshops must also sign up for
WAT2016.
********************************************************************
IMPORTANT DATES
---------------
August 19      Crowdsourcing evaluation due
September 25   System description draft and research paper (new!) due
October 16     System description draft review feedback
October 16     Research paper acceptance notification
October 30     System description and research paper camera-ready due
December 12    WAT 2016
TASK
----
The task is to improve the text translation quality for scientific
papers and patent documents. Participants choose the subtasks in
which they would like to participate and translate the test data
using their machine translation systems. The WAT organizers will
evaluate the submitted results using both automatic and human
evaluation. We will also provide a baseline machine translation.
Subtasks:
* Scientific Paper Subtasks:
English/Chinese <--> Japanese
* Patent Subtasks:
English/Chinese/Korean <--> Japanese
* Newswire Subtasks:
Indonesian/Chinese/Japanese <--> English
* Mixed domain Subtasks:
Hindi <--> English/Japanese
Dataset:
* Scientific paper Subtasks:
WAT uses ASPEC for the dataset, including training, development,
development test and test data. Participants in the scientific paper
subtasks must obtain a copy of ASPEC themselves. ASPEC consists of
approximately 3 million Japanese-English parallel sentences from paper
abstracts (ASPEC-JE) and approximately 0.7 million Japanese-Chinese
paper excerpts (ASPEC-JC).
* Patent Subtasks:
WAT uses the JPO Patent Corpus, which is constructed by the Japan
Patent Office (JPO). This corpus consists of 1 million Chinese-Japanese
parallel sentences and 1 million Korean-Japanese parallel sentences
from patent descriptions in four categories. Participants in the patent
subtasks must obtain it from the JPO Patent Corpus page of the WAT2016
site.
* Newswire Subtasks (Indonesian <--> English):
WAT uses the BPPT Corpus, which is constructed by Badan Pengkajian dan
Penerapan Teknologi (BPPT). This corpus consists of 50,000
Indonesian-English parallel sentences from news articles in five
categories. Participants in this subtask must obtain it from the BPPT
Corpus page of the WAT2016 site.
* Newswire Subtasks (Chinese/Japanese <--> English):
TBA
* Mixed domain Subtask:
WAT uses HINDEN for the dataset, including training, development,
development test and test data. The training corpus is mixed domain
and contains around 1 million lines of sentences and phrases; its
composition is described in the readme of the corpus you download. To
access the corpus, participants should sign the agreement, scan it,
and send it to the address given in it. The development and test sets
are from the news domain and are exactly the same as the ones in
WMT 2014.
Automatic evaluation:
We provide an automatic evaluation server. It is free for everyone,
but you need to create an account to submit results for evaluation.
Viewing the list of evaluation results does not require an account.
Sign-up: http://lotus.kuee.kyoto-u.ac.jp/WAT/registration/index.html
Eval. result: http://lotus.kuee.kyoto-u.ac.jp/WAT/evaluation/index.html
Human evaluation:
Both crowdsourcing evaluation and JPO adequacy evaluation will be
carried out for selected subtasks and selected submitted systems (the
details will be announced later). Participants can submit one
translation result for each subtask.
INVITED TALK
------------
TBA
ORGANIZERS
----------
Toshiaki Nakazawa, Japan Science and Technology Agency (JST), Japan
Hideya Mino, National Institute of Information and Communications Technology
(NICT), Japan
Chenchen Ding, National Institute of Information and Communications Technology
(NICT), Japan
Isao Goto, Japan Broadcasting Corporation (NHK), Japan
Graham Neubig, Nara Institute of Science and Technology (NAIST), Japan
Sadao Kurohashi, Kyoto University, Japan
Ir. Hammam Riza, Agency for the Assessment and Application of Technology
(BPPT), Indonesia
Pushpak Bhattacharyya, Indian Institute of Technology Bombay (IIT), India
CONTACT
-------
[email protected]