[Mt-list] Call for participation: 1st Translation Memory Cleaning Shared Task

Carla Parra Mon, 08 Feb 2016 05:25:07 -0800

 

(apologies for cross-posting)


CALL FOR PARTICIPATION IN THE 1ST TRANSLATION MEMORY CLEANING SHARED
TASK
organised at the 2nd Workshop on Natural Language Processing for
Translation Memories (NLP4TM 2016)
to be held at LREC 2016 (Portorož, Slovenia), May 28, 2016 

http://rgcl.wlv.ac.uk/nlp4tm2016/shared-task/ [1] 

The NLP4TM 2016 workshop proposes a shared task on cleaning translation
memories. Participants in this task will be required to take pairs of
source and target segments from translation memories and decide whether
each pair is a correct translation. For this first edition of the task,
three language pairs have been prepared: EN-ES, EN-IT and EN-DE.

The data was annotated with information on whether the source and target
content of each TM segment represent a valid translation. In particular,
the following 3-point scale has been applied:
(1) The translation is correct.
(2) The translation is correct, but there are a few orthotypographic
mistakes, so some minor post-editing is required.
(3) The translation is not correct (content missing/added, wrong
meaning, etc.).

The annotation guidelines are available on the task's website.
For each language pair, 2/3 of the annotated segments are provided for
training and 1/3 will be provided for testing during the evaluation
phase.

1. TASKS PROPOSED
The participating teams can choose to participate in one or more of
the following three tasks:

        * Binary Classification (I)

In this task, it is only required to determine whether a segment is
right or wrong. For the first binary classification option, only tag (1)
is considered correct because the translators do not need to make any
modification, whilst tags (2) and (3) are considered wrong translations.


        * Binary Classification (II)

As in the first task, in this task it is only required to determine
whether the segment is right or wrong. However, in contrast to the first
task, a segment is considered correct if it was labelled by annotators
as (1) or (2). Segments labelled (3) are considered wrong because they
require major post-editing. 

        * Fine-grained Classification

In this task, the participating teams have to classify the segments
according to the annotation provided in the training data: correct
translations (1), correct translations with few orthotypographic errors
(2), and wrong (3). 
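
As an illustration of how the three tasks relate to each other, the
sketch below maps the fine-grained labels (1), (2) and (3) onto the two
binary settings described above. The function names and the sample
labels are hypothetical and are not part of the official task material;
the official data format will be communicated by the organisers.

```python
# Minimal sketch, assuming the annotation is available as integers 1-3.
# The helper names below are illustrative, not part of the task data.
VALID_LABELS = {1, 2, 3}


def to_binary_i(label: int) -> bool:
    """Binary Classification (I): only label 1 counts as correct."""
    if label not in VALID_LABELS:
        raise ValueError(f"unexpected label: {label}")
    return label == 1


def to_binary_ii(label: int) -> bool:
    """Binary Classification (II): labels 1 and 2 count as correct."""
    if label not in VALID_LABELS:
        raise ValueError(f"unexpected label: {label}")
    return label in (1, 2)


# Made-up fine-grained labels for five segments (not real task data).
fine_grained = [1, 2, 3, 1, 3]
binary_i = [to_binary_i(label) for label in fine_grained]
binary_ii = [to_binary_ii(label) for label in fine_grained]
```

The only difference between the two binary settings is the treatment of
label (2): it is wrong in setting (I) and correct in setting (II).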

2. SUBMISSION AND EVALUATION INFORMATION
Participants are required to register their intention to participate by
filling in the following form before 1st April 2016:
http://goo.gl/forms/ELStRtrw9J [2]

The organisers will provide the training and test sets to the
participating teams, who will be asked to submit the output of their
systems in a format similar to that of the training set. The exact
modality and formatting of submissions will be communicated to
participants at a later stage.

For evaluation, standard measures such as precision, recall and
F-measure will be used. In addition, the organisers may perform some
manual error analysis. The extent of this analysis will depend on the
number of systems submitted. For this reason, even though we do not
plan to limit the number of runs submitted by participants, they will
be required to indicate their primary (and secondary, if relevant)
runs.
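
For participants who want to score their own runs on held-out training
data, the measures named above can be computed per class as in the
sketch below. It assumes the balanced F-measure (F1) and a caller-chosen
positive class; the official scoring script, the averaging scheme and
the positive class are not specified in this call, so all names here
are illustrative.

```python
# Minimal sketch of per-class precision, recall and balanced F-measure.
# Assumption: F-measure means F1; the official scorer may differ.
from typing import Sequence, Tuple


def precision_recall_f1(
    gold: Sequence[int], predicted: Sequence[int], positive: int
) -> Tuple[float, float, float]:
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    pairs = list(zip(gold, predicted))
    # True positives, false positives and false negatives for one class.
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Made-up labels for six segments (not real task data).
gold_labels = [1, 1, 2, 3, 3, 1]
system_labels = [1, 2, 2, 3, 1, 1]
p, r, f = precision_recall_f1(gold_labels, system_labels, positive=1)
```

For the fine-grained task the function can be called once per label
(1, 2 and 3); how the per-class scores are combined is left open here.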

The participants are encouraged to release their systems and make them
publicly available for future use. They are also encouraged not to use
machine translation as one of the factors used to determine the class of
a segment. This is because we are trying to encourage development of
methods that can be run on large datasets without requiring a lot of
computational resources.

In addition to submitting the output of their system, the participants
will be asked to submit short contributions in the form of working notes
describing their systems. The working notes will be published on the
workshop's website. Submissions that are not accompanied by a
description will not be considered.

All systems will be presented in a demo session during the workshop. 

3. IMPORTANT DATES 

        * Release of training data: second week of February 2016
        * End of registration: 1st April 2016
        * Evaluation phase: 14th - 27th April 2016
        * Ranking of systems and release of the test set annotations: 4th May 2016
        * Submission of working notes: 16th May 2016
        * Workshop date: 28th May 2016

4. ORGANISING COMMITTEE

Eduard Barbu, Translated, Italy
Carla Parra, Hermes, Spain
Luca Mastrostefano, Translated, Italy
Matteo Negri, FBK, Italy
Marco Turchi, FBK, Italy
Luisa Bentivogli, FBK, Italy
Constantin Orasan, University of Wolverhampton, UK

The organisers can be contacted by sending an email to
[email protected]. 

-- 

DR. CARLA PARRA ESCARTÍN 

 Applied Technology Engineer - Marie Curie Experienced Researcher -
EXPERT ITN [3] 

www.hermestrans.com [4] 

(+34) 91 640 7640 (Madrid) 

(+34) 95 202 0525 (Málaga)

LEGAL NOTICE: This message is only intended for the addressee. It
contains CONFIDENTIAL information protected by professional secrecy.
Dissemination of such information is prohibited by law. If you have
received this message by mistake, please be aware that you are not
authorised to read, copy or use it. Please notify us immediately via
this means and destroy it. E-mail over the Internet does not make it
possible to ensure the confidentiality, integrity or correct reception
of the messages that are sent. Hermes Traducciones y Servicios
Lingüísticos, SL does not accept liability for these circumstances and
reserves the right to take the legal measures to which it is entitled
against any third party that unlawfully accesses the content of this
message and the files attached hereto. If the addressee of this message
does not consent to the use of e-mail via the Internet and to messages
being saved, please notify us immediately.

 

Links:
------
[1] http://rgcl.wlv.ac.uk/nlp4tm2016/shared-task/
[2] http://goo.gl/forms/ELStRtrw9J
[3] http://expert-itn.eu/
[4] http://www.hermestrans.com

_______________________________________________
Mt-list site list
[email protected]
http://lists.eamt.org/mailman/listinfo/mt-list
