EMNLP 2015 Workshop on Discourse in Machine Translation (DiscoMT'15)
(http://www.idiap.ch/workshop/DiscoMT)
17 September 2015 -- Lisbon, Portugal
Second call for papers
It is well known that texts have properties that go beyond those of their
individual sentences and that reveal themselves in the frequency and
distribution of words, word senses, referential forms and syntactic structures,
including:
- document-wide properties, such as style, register, reading level and genre;
- patterns of topical or functional sub-structure;
- patterns of discourse coherence, as realized through explicit and/or implicit
relations between sentences, clauses or referring forms;
- anaphoric and elliptic expressions, in which speakers exploit the previous
discourse context to convey subsequent information very succinctly.
By the end of the 1990s, these properties had stimulated considerable research
in Machine Translation, aimed at endowing machine-translated texts with
document and discourse properties similar to those of their source texts. A
period of ten years then elapsed before interest resumed in these topics, now
from the perspectives of Statistical and/or Hybrid Machine Translation. This
led to the first ACL Workshop on Discourse in Machine Translation (DiscoMT),
held in Sofia, Bulgaria, in 2013.
Since then, Statistical Machine Translation (SMT) has itself evolved in ways
that allow more access to needed linguistic knowledge, through the
availability of feature-rich statistical models. We are therefore holding a
second DiscoMT workshop (DiscoMT'15), this time with a complementary Shared
Task (see below).
DiscoMT'15 solicits submissions on any of the following topics, for any
language pair, and also welcomes submissions that link discourse studies with
machine translation in some other way.
- discourse processing in support of MT, including:
. textual coherence, including anaphora, coreference, tense, aspect and
modality
. textual cohesion, including lexical consistency
. discourse structure, including use of connectives and information
structuring devices
. topic structure
. consistency in style and register;
- MT techniques for obtaining document-level consistency and domain
adaptability;
- MT techniques for structured documents;
- methods and algorithms to handle discourse-level phenomena in MT training and
decoding;
- uses of MT in processing discourse-level phenomena;
- techniques for evaluating the effect of efforts targeting discourse-level
phenomena in SMT;
- techniques for assessing the impact of discourse-level processing on MT
quality;
- quantitative studies on the impact of discourse-level phenomena on current MT
systems vs. discourse-aware ones.
SUBMISSION INSTRUCTIONS
We solicit previously unpublished work, presented either as long or short
papers, following the ACL 2015 formatting guidelines at
http://www.acl2015.org/call_for_papers.html
Long papers should have at most 8 pages of content, not including references.
Short papers are limited to 4 pages of content, not including references.
There is no constraint on the size of the reference list. Submissions should
be anonymous and not disclose in any way the identity of the author(s).
Submissions should be made using the START system at
https://www.softconf.com/emnlp2015/DiscoMT15
IMPORTANT DATES
Submission deadline: 28 June 2015
Notification of acceptance: 21 July 2015
Final versions due: 11 August 2015
Workshop: 17 or 18 September 2015
CO-CHAIRS
Bonnie Webber, University of Edinburgh
Andrei Popescu-Belis, Idiap Research Institute
Marine Carpuat, University of Maryland
ORGANIZING COMMITTEE
Ani Nenkova, University of Pennsylvania
Christian Hardmeier, Uppsala University
Jörg Tiedemann, Uppsala University
Lori Levin, Carnegie Mellon University
Lucia Specia, University of Sheffield
Mark Fishel, University of Zurich
Min Zhang, Soochow University
Preslav Nakov, Qatar Computing Research Institute
PROGRAM COMMITTEE
Liane Guillou, University of Edinburgh
Beata Beigman Klebanov, Educational Testing Service, New Jersey
Francisco Guzmán, Qatar Computing Research Institute, Doha, Qatar
Shafiq Joty, Qatar Computing Research Institute, Doha, Qatar
Thomas Meyer, Google, Zurich
Michal Novak, Charles University, Prague
Lucie Poláková, Charles University, Prague
Maja Popovic, DFKI, Berlin
Sara Stymne, Uppsala University
Yannick Versley, University of Heidelberg
Marion Weller, University of Stuttgart
SHARED TASK
The DiscoMT shared task will consist of two subtasks, designed to make it
interesting to both the MT and discourse communities. For the MT community,
there is a practical MT task; for the discourse community, a classification
task that requires no specific MT expertise. Both subtasks will be run on
transcripts from the TED conference series. Both subtasks use the language
pair English-French, for which baseline performance is high enough to
produce basically intelligible output, and whose two languages differ in
interesting ways in their pronoun systems.
Subtask A: Pronoun-focused Translation (submission deadline: May 10, 2015)
The first subtask is a regular end-to-end statistical machine translation (SMT)
task, where participants are provided training data for an SMT system and are
asked to generate a translation of an unseen test set for the evaluation.
Unlike other MT shared tasks, our primary evaluation will focus not on general
MT quality, but specifically on the correctness of pronoun translation. Thanks
to a grant from the European Association for Machine Translation, the
evaluation of pronoun correctness will be carried out manually and is
complimentary for the participants.
Subtask B: Cross-Lingual Pronoun Prediction (submission deadline: May 18, 2015)
The second subtask requires participating systems to predict the correct
translation of a source language pronoun from a small set of classes. The
input data will consist of the source language text and a complete manual
reference translation from which the target pronouns have been removed. The
evaluation of this subtask will be fully automatic, matching system
predictions against the pronouns found in the reference translation.
Further details on the shared task can be found at
http://www.idiap.ch/workshop/DiscoMT/shared-task
Shared Task Coordinators
Christian Hardmeier, Uppsala University
Preslav Nakov, Qatar Computing Research Institute
Sara Stymne, Uppsala University
Yannick Versley, University of Heidelberg
Jörg Tiedemann, Uppsala University