[Moses-support] Call For Shared Task Participation: WAT2018 (The 5th Workshop on Asian Translation)

Toshiaki Nakazawa Fri, 13 Jul 2018 01:54:25 -0700

Dear all MT researchers/users,

I'm Toshiaki Nakazawa from The University of Tokyo. This is the call
for participation for the shared tasks of the 5th Workshop on Asian
Translation (WAT2018).
http://lotus.kuee.kyoto-u.ac.jp/WAT/


WAT2018 will be co-located with PACLIC32 (Dec 1-3 in Hong Kong).
http://www.cbs.polyu.edu.hk/2018paclic/index.php

If you are working on machine translation, please join us.

IMPORTANT DATES
---------------

August 31     Translation Task Submission Deadline
October 26    System Description Paper Submission Deadline
November 2    Review Feedback of System Description Paper
November 9    Camera-ready Deadline
December 1, 2 or 3  WAT2018

* All deadlines are at 11:59 PM UTC-7.
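
Because the deadlines are given at a fixed UTC-7 offset, it may help to
convert them to UTC (or to your local time) before planning a submission.
The following is only a minimal sketch: the helper name deadline_utc is
hypothetical and not part of any WAT tooling, and it assumes the offset
is exactly UTC-7 with no daylight-saving adjustment.

```python
from datetime import datetime, timedelta, timezone

# Assumption: deadlines fall at 11:59 PM at a fixed UTC-7 offset.
DEADLINE_TZ = timezone(timedelta(hours=-7))


def deadline_utc(year, month, day):
    """Return the UTC instant for 11:59 PM UTC-7 on the given date."""
    local = datetime(year, month, day, 23, 59, tzinfo=DEADLINE_TZ)
    return local.astimezone(timezone.utc)


# Translation Task Submission Deadline: August 31, 2018
print(deadline_utc(2018, 8, 31).isoformat())  # 2018-09-01T06:59:00+00:00
```

Calling .astimezone() with no argument on the returned value gives the
same instant in the local time zone of the machine running the script.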


Best regards,

---------------------------------------------------------------------------
                       WAT2018
       (The 5th Workshop on Asian Translation)
           in collaboration with PACLIC32
        http://lotus.kuee.kyoto-u.ac.jp/WAT/
        December 1, 2 or 3, 2018, Hong Kong

Following the success of the previous WAT workshops, WAT2018 will
bring together machine translation researchers and users to try,
evaluate, share and discuss brand-new ideas about machine
translation.

WAT2018 does NOT accept research papers. Instead you can submit them
to PACLIC32.
http://www.cbs.polyu.edu.hk/2018paclic/call-for-papers.php

What's NEW in WAT2018:

* baseline translations are updated to NMT (from PBSMT)
* additional test data for patent tasks
* Myanmar-English translation tasks
* multilingual translation subtask for 10 Indian languages

************************* IMPORTANT NOTICE *************************
Participants of the previous workshop are also required to sign up for
WAT2018
********************************************************************

TASK
----

The task is to improve the text translation quality for scientific
papers and patent documents. Participants choose any of the subtasks
in which they would like to participate and translate the test data
using their machine translation systems. The WAT organizers will
evaluate the results submitted using automatic evaluation and human
evaluation. We will also provide a baseline machine translation.

Tasks:
  Scientific Paper Tasks: [Asian Scientific Paper Excerpt Corpus (ASPEC)]
    English/Chinese <--> Japanese
  Patent Tasks: [Japan Patent Office Patent Corpus 2.0 (JPC2)]
    English/Chinese/Korean <--> Japanese
    Chinese -> Japanese Expression Pattern Task
  Newswire Tasks:
    English <--> Japanese [JIJI Corpus]
  Indian Language Tasks:
    Hindi <--> English [IIT Bombay (IITB) Corpus]
    10 Indian Languages [NEW!!]
  Mixed Domain Tasks: [UCSY and ALT Corpora]
    Myanmar (Burmese) <--> English [NEW!!]
  Recipe Tasks: [Cookpad Comparable Corpus]
    Japanese <--> English

Dataset:

* Scientific Paper Tasks:

WAT uses ASPEC as the dataset, including the training, development,
development test and test data. Participants of the scientific paper
tasks must obtain a copy of ASPEC themselves. ASPEC consists of
approximately 3 million Japanese-English parallel sentences from paper
abstracts (ASPEC-JE) and approximately 0.7 million Japanese-Chinese
paper excerpts (ASPEC-JC).

* Patent Tasks:

WAT uses JPO Patent Corpus 2.0 (JPC2), which is constructed by the Japan
Patent Office (JPO). This corpus consists of 1 million parallel
sentences from patent descriptions in four categories (Chemistry,
Electricity, Machine and Physics) for each language pair
(English-Japanese, Chinese-Japanese and Korean-Japanese). Participants
are required to obtain it from the WAT2018 JPC2 site.

- English/Chinese/Korean <--> Japanese:
  These tasks evaluate the performance of a translation model in the
  same way as the other translation tasks. Unlike the previous tasks at
  WAT2015, WAT2016 and WAT2017, the new test sets of these tasks consist
  of (a) patent documents published between 2011 and 2013, which were
  used in past years' WAT, and (b) ones published between 2016 and
  2017 for each language pair. We will also evaluate performance on
  section (a) separately so that results can be compared with systems
  submitted in past years' WAT.

- Chinese -> Japanese Expression Pattern Task:
  This task evaluates the performance of a translation model for each
  predefined category of expression patterns, which corresponds to the
  title of invention (TIT), abstract (ABS), scope of claim (CLM) or
  description (DES). The test set of this task consists of sentences,
  each of which is annotated with its corresponding category of
  expression patterns.

* Newswire Tasks (English <--> Japanese):

WAT uses JIJI Corpus, which is constructed by Jiji Press Ltd. in
collaboration with the National Institute of Information and
Communications Technology (NICT). This corpus consists of 200K
Japanese-English parallel sentences from Jiji Press news in various
categories. Participants of newswire tasks are required to obtain it
from the WAT2017 JIJI Corpus site.

* Indian Language Tasks:

TBA (please keep watching our website)

* Myanmar <--> English Tasks:

WAT uses UCSY Corpus and ALT Corpus.
The UCSY corpus and a portion of the ALT corpus are used as training data,
totaling around 220,000 lines of sentences and phrases.
The development and test data are from the ALT corpus.

* Recipe Tasks:

WAT uses Recipe Corpus, which is constructed by Cookpad Inc. This
corpus consists of 16,282 Japanese-English parallel sentences from
recipes. Participants of recipe tasks are required to obtain it from
the WAT2018 Recipe Corpus site.


EVALUATION
----------

Automatic evaluation:
We provide an automatic evaluation server. It is free for everyone,
but you need to create an account to have your translations evaluated.
Viewing the list of evaluation results does not require an account.

Sign-up: http://lotus.kuee.kyoto-u.ac.jp/WAT/registration/index.html
Eval. result: http://lotus.kuee.kyoto-u.ac.jp/WAT/evaluation/index.html

Human evaluation:
Both crowdsourcing evaluation and JPO adequacy evaluation will be
carried out for selected subtasks and selected submitted systems (the
details will be announced later). Participants can submit one
translation result for each subtask.


ORGANIZERS
----------

Pushpak Bhattacharyya, Indian Institute of Technology Bombay (IIT), India
Raj Dabre, National Institute of Information and Communications Technology (NICT), Japan
Chenchen Ding, National Institute of Information and Communications Technology (NICT), Japan
Isao Goto, Japan Broadcasting Corporation (NHK), Japan
Jun Harashima, Cookpad Inc., Japan
Shohei Higashiyama, National Institute of Information and Communications Technology (NICT), Japan
Hideo Kazawa, Google, Japan
Anoop Kunchukuttan, Microsoft Research India, India
Sadao Kurohashi, Kyoto University, Japan
Hideya Mino, Japan Broadcasting Corporation (NHK), Japan
Toshiaki Nakazawa, The University of Tokyo, Japan
Graham Neubig, Carnegie Mellon University (CMU), USA
Yusuke Oda, Google, Japan
Win Pa Pa, University of Computer Studies, Yangon (UCSY), Myanmar
Katsuhito Sudoh, Nara Institute of Science and Technology (NAIST), Japan

CONTACT
-------

[email protected]
