FIRST CALL FOR PAPERS
Workshop on Applying Machine Learning techniques to optimising the 
division of labour in Hybrid MT (http://www.dfki.de/ml4hmt/)
==================================================================================================
 

Conference: Machine Translation Summit XIII (MT Summit XIII)

Workshop Purpose and Theme
==========================
The workshop will explore alternatives in order to provide optimal 
support for Hybrid MT design, using sophisticated machine-learning 
techniques. One further important objective of the workshop is to build 
bridges from MT to the ML community to systematically and jointly 
explore the choice space for Hybrid MT.

Workshop Programme
==================
The workshop will open with an invited talk (speaker TBA), followed by 
two technical paper sessions and a challenge or shared task session, and 
will conclude with a discussion panel.

Topics of Interest of the Technical Papers
==========================================
Topics of interest include, but are not limited to:
* use of Machine Learning techniques in combination / hybridization of 
Machine Translation systems
* using richer linguistic information in phrase-based SMT (e.g. in 
factored models or hierarchical SMT)
* using phrases from different types of MT in e.g. phrase-based SMT
* system combination approaches, either parallel in multi-engine MT 
(MEMT) or sequential in statistical post-editing (SPMT)
* learning resources (e.g. transfer rules, transduction grammars) for 
probabilistic rule-based MT
All contributions will be published in the workshop proceedings.

Shared Task Description
=======================
The "Shared Task on Optimising the Division of Labour in Hybrid MT" is 
an effort to trigger systematic investigation into improving 
state-of-the-art Hybrid MT, using advanced machine-learning (ML) 
methodologies. Participants are requested to build Hybrid/System 
Combination systems by combining the output of several systems of 
different types, which is provided by the organizers.
The main focus of the shared task is trying to answer the following 
question:
Could Hybrid/System Combination MT techniques benefit from extra 
information (linguistically motivated, decoding and runtime) from the 
different systems involved?
* Data: The participants are given a development bilingual set, aligned 
at a sentence level. Each "bilingual sentence" contains:
         o the source sentence,
         o the target (reference) sentence and
         o the corresponding multiple output translations from 5 
different systems, based on different MT approaches (Apertium, 
Ramírez-Sánchez, 2006; Joshua, Zhifei Li et al., 2009; Lucy, Alonso and 
Thurmair, 2003; Matrex, Penkale et al., 2010; Metis, Vandeghinste et al., 
2006). The output has been annotated with system-internal information 
deriving from the translation process of each of the systems (see below).
* Baseline: As a baseline we consider state-of-the-art open-source 
system-combination systems, such as MANY (Barrault, 2010) and CMU-MEMT 
(Heafield & Lavie, 2010).
* Challenge: Participants are challenged to build an MT mechanism that 
improves over the baseline, by making effective use of the 
system-specific MT output. They can either provide solutions based on an 
open source system, or develop their own mechanisms. A suggested 
approach is given below.
        o Spanish-English will be the language direction
        o The development set can be used for tuning the systems during 
the development phase. Final submissions have to include translation 
output on a test set, which will be available one week before the 
submission deadline
        o If you need language/reordering models they can be built upon 
the WMT News Commentary (http://www.statmt.org/wmt11/).
        o Participants can also make use of additional linguistic 
analysis tools if their systems require them, but they have to declare 
this explicitly upon submission, so that they are judged as 
"unconstrained" systems.
* Evaluation: The system output will be judged via peer-based human 
evaluation. During the evaluation phase, participants will be requested 
to rank system outputs of other participants through a web-based 
interface (Appraise; Federmann 2010). Automatic metrics (BLEU, Papineni 
et al., 2002) will additionally be used.
* System description: shared task participants will be invited to submit 
short papers (4-6 pages) describing their systems or their evaluation 
metrics (see instructions in Submissions).
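
For orientation, the data layout described under "Data" above can be 
pictured as one record per sentence pair, holding the source, the 
reference and one entry per MT system. The Python sketch below is only 
an illustration under stated assumptions: the class and field names, the 
`annotations` dictionary and the placeholder system names are 
hypothetical, and the actual file format of the released data is not 
specified here. The selection function is a simple token-overlap 
consensus heuristic chosen for brevity; it is not the organizers' 
suggested approach and not one of the baseline systems.

```python
from dataclasses import dataclass, field


@dataclass
class SystemOutput:
    system: str        # name of the MT system that produced the output
    translation: str   # that system's translation of the source sentence
    # Hypothetical container for the system-internal annotations;
    # the real annotation format is not described in this call.
    annotations: dict = field(default_factory=dict)


@dataclass
class BilingualSentence:
    source: str        # source-language sentence
    reference: str     # target-language reference translation
    outputs: list      # one SystemOutput per participating MT system


def overlap(a: str, b: str) -> float:
    """Jaccard overlap between the lowercased token sets of two strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def select_by_consensus(sentence: BilingualSentence) -> SystemOutput:
    """Return the output that agrees most, on average, with the others."""
    def avg_agreement(candidate: SystemOutput) -> float:
        others = [o for o in sentence.outputs if o is not candidate]
        if not others:
            return 0.0
        total = sum(overlap(candidate.translation, o.translation) for o in others)
        return total / len(others)

    return max(sentence.outputs, key=avg_agreement)


# Toy example with invented sentences and placeholder system names.
example = BilingualSentence(
    source="El gato duerme en la casa .",
    reference="The cat sleeps in the house .",
    outputs=[
        SystemOutput("sys1", "The cat sleeps in the home ."),
        SystemOutput("sys2", "The cat sleeps in the house ."),
        SystemOutput("sys3", "The cat is sleeping in the house ."),
    ],
)
best = select_by_consensus(example)
print(best.system, "->", best.translation)
```

A heuristic of this kind ignores the system-internal annotations 
entirely; the question posed by the shared task is whether using that 
extra information can do better.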

Important Dates
===============
* May 20th - Release of data for the challenge
* July 20th - Paper Submissions due / Challenge results due
* August 10th - Author notification / Release of challenge evaluation 
results
* August 19th - Final version due

Submissions
===========
Technical papers and system description papers should follow the main 
conference formatting requirements 
(http://mt.xmu.edu.cn/mtsummit/SubmitPapers.html#). To submit 
contributions, please follow the instructions at the Workshop management 
system submission website: 
https://www.easychair.org/account/signin.cgi?conf=ml4hmt.
The contributions will undergo a double-blind review by members of the 
programme committee. Please address queries to [email protected].

Organization
============
Chair:   Toni Badia (Pompeu Fabra University, Spain)
Co-chairs: Christian Federmann (German Research Center for Artificial 
Intelligence, Germany), Josef van Genabith (Dublin City University, 
Ireland)
Committee members
=================
Maite Melero (Barcelona Media Innovation Center, Spain), Marta R. 
Costa-jussà (Barcelona Media Innovation Center, Spain), Pavel Pecina 
(Dublin City University, Ireland), Eleftherios Avramidis (German 
Research Center for Artificial Intelligence, Germany)
Program Committee
=================
Eleftherios Avramidis (German Research Center for Artificial 
Intelligence, Germany)
Rafael Banchs (Institute for Infocomm Research - I2R, Singapore)
Loïc Barrault (LIUM - University of Le Mans, France)
Chris Callison-Burch (Johns Hopkins University, MD, USA)
Jinhua Du (Faculty of Automation and Information Engineering, Xi'an 
University of Technology, Xi'an, China)
Andreas Eisele (Directorate-General for Translation (DGT), Luxembourg)
Cristina España-Bonet (Technical University of Catalonia, TALP, Barcelona)
Patrick Lambert (LIUM - University of Le Mans, France)
Maite Melero (Barcelona Media Innovation Center, Spain)
Pavel Pecina (Dublin City University, Ireland)
Marta R. Costa-jussà (Barcelona Media Innovation Center, Spain)
David Vilar (German Research Center for Artificial Intelligence, Germany)