EMNLP 2015 Workshop on Discourse in Machine Translation (DiscoMT'15)
            (http://www.idiap.ch/workshop/DiscoMT)
            17 September 2015 -- Lisbon, Portugal


Final call for papers - Submission deadline: 28 June 2015


It is well known that texts have properties that go beyond those of their 
individual sentences and that reveal themselves in the frequency and 
distribution of words, word senses, referential forms and syntactic structures, 
including:
- document-wide properties, such as style, register, reading level and genre;
- patterns of topical or functional sub-structure;
- patterns of discourse coherence, as realized through explicit and/or implicit 
relations between sentences, clauses or referring forms;
- anaphoric and elliptic expressions, in which speakers exploit the previous 
discourse context to convey subsequent information very succinctly.

By the end of the 1990s, these properties had stimulated considerable research 
in Machine Translation, aimed at endowing machine-translated texts with 
document and discourse properties similar to those of their source texts.  A 
period of ten years then elapsed before interest resumed in these topics, now 
from the perspectives of Statistical and/or Hybrid Machine Translation.  This 
led to the first ACL Workshop on Discourse in Machine Translation (DiscoMT) in 
2013, held in Sofia, Bulgaria.

Since then, SMT has itself evolved in ways that allow more access to needed 
linguistic knowledge, through the availability of feature-rich statistical 
models.  We are therefore holding a second DiscoMT workshop (DiscoMT'15), 
this time with a complementary Shared Task (see below).

DiscoMT'15 solicits submissions on any of the following topics and for any 
language pair, but also welcomes submissions that link discourse studies with 
machine translation in some other way.

- discourse processing in support of MT, including:
 . textual coherence, including anaphora, coreference, tense, aspect and 
modality
 . textual cohesion, including lexical consistency
 . discourse structure, including use of connectives and information 
structuring devices
 . topic structure
 . consistency in style and register;
- MT techniques for obtaining document-level consistency and domain 
adaptability;
- MT techniques for structured documents;
- methods and algorithms to handle discourse-level phenomena in MT training and 
decoding;
- uses of MT in processing discourse-level phenomena;
- techniques for evaluating the effect of efforts targeting discourse-level 
phenomena in SMT;
- techniques for assessing the impact of discourse-level processing on MT 
quality;
- quantitative studies on the impact of discourse-level phenomena on current MT 
systems vs. discourse-aware ones.

SUBMISSION INSTRUCTIONS

We solicit previously unpublished work, presented either as long or short 
papers, following the ACL 2015 formatting guidelines at

   http://www.acl2015.org/call_for_papers.html

Long papers should have at most 8 pages of content, not including references.  
Short papers are limited to 4 pages of content, not including references.  
There is no constraint on the size of the reference list.  Submissions should 
be anonymous and not disclose in any way the identity of the author(s).  
Submissions should be made using the START system at

https://www.softconf.com/emnlp2015/DiscoMT15 


IMPORTANT DATES

Submission deadline: 28 June 2015
Notification of acceptance: 21 July 2015
Final versions due: 11 August 2015
Workshop: 17 September 2015

CO-CHAIRS

Bonnie Webber, University of Edinburgh
Andrei Popescu-Belis, Idiap Research Institute
Marine Carpuat, University of Maryland

ORGANIZING COMMITTEE

Ani Nenkova, University of Pennsylvania
Christian Hardmeier, Uppsala University
Jörg Tiedemann, Uppsala University
Lori Levin, Carnegie Mellon University
Lucia Specia, University of Sheffield
Mark Fishel, University of Zurich
Min Zhang, Soochow University
Preslav Nakov, Qatar Computing Research Institute

PROGRAM COMMITTEE

Liane Guillou, University of Edinburgh
Beata Beigman Klebanov, Educational Testing Service, New Jersey
Francisco Guzmán, Qatar Computing Research Institute, Doha, Qatar
Shafiq Joty, Qatar Computing Research Institute, Doha, Qatar
Thomas Meyer, Google, Zurich
Michal Novak, Charles University, Prague
Lucie Poláková, Charles University, Prague
Maja Popovic, DFKI, Berlin
Sara Stymne, Uppsala University
Yannick Versley, University of Heidelberg
Marion Weller, University of Stuttgart



SHARED TASK

The DiscoMT shared task will consist of two subtasks, designed to make it 
interesting to both the MT and discourse communities.  For the MT community, 
there is a practical MT task; for the discourse community, there is a 
classification task that requires no specific MT expertise.  Both subtasks will 
be run on transcripts from the TED conference series and use the language pair 
English-French, which has a sufficiently high baseline performance to produce 
basically intelligible output and whose two languages differ in interesting 
ways in their pronoun systems.

Subtask A: Pronoun-focused Translation

The first subtask is a regular end-to-end statistical machine translation (SMT) 
task, where participants are provided training data for an SMT system and are 
asked to generate a translation of an unseen test set for the evaluation.  
Unlike other MT shared tasks, our primary evaluation will focus not on general 
MT quality, but specifically on the correctness of pronoun translation.  Thanks 
to a grant from the European Association for Machine Translation, the 
evaluation of pronoun correctness will be carried out manually and is 
free of charge for the participants.

Subtask B: Cross-Lingual Pronoun Prediction

The second subtask requires participating systems to predict the correct 
translation of a source language pronoun from a small set of classes.  The 
input data will consist of the source language text and a complete manual 
reference translation from which the target pronouns have been removed.  The 
evaluation of this subtask will be fully automatic, matching the predictions 
against the pronouns found in the reference translation.
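
As an illustration only, matching predictions against reference pronouns could 
be sketched as below.  The class labels, the function name score_predictions 
and the choice of accuracy as the measure are all assumptions made for this 
example; the official label set and scoring are defined on the shared task 
page, not here.

```python
from collections import Counter


def score_predictions(predicted, reference):
    """Compare predicted pronoun classes with reference classes, slot by slot.

    Hypothetical helper: returns overall accuracy and per-class accuracy.
    This is not the official shared task scorer.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")

    correct = Counter()  # correctly predicted slots, keyed by reference class
    total = Counter()    # all slots, keyed by reference class
    for pred, ref in zip(predicted, reference):
        total[ref] += 1
        if pred == ref:
            correct[ref] += 1

    accuracy = sum(correct.values()) / len(reference) if reference else 0.0
    per_class = {label: correct[label] / total[label] for label in total}
    return accuracy, per_class


# Hypothetical class labels, one per removed target pronoun.
reference = ["il", "elle", "ce", "ils", "OTHER"]
predicted = ["il", "il", "ce", "ils", "OTHER"]

accuracy, per_class = score_predictions(predicted, reference)
print(f"accuracy: {accuracy:.2f}")
for label, value in sorted(per_class.items()):
    print(f"{label}: {value:.2f}")
```

Reporting a score per class alongside the overall figure is a design choice of 
this sketch: it keeps frequent classes from masking errors on rarer ones.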

Further details on the shared task can be found at

http://www.idiap.ch/workshop/DiscoMT/shared-task

Shared Task Coordinators

Christian Hardmeier, Uppsala University
Preslav Nakov, Qatar Computing Research Institute
Sara Stymne, Uppsala University
Yannick Versley, University of Heidelberg
Jörg Tiedemann, Uppsala University
