[Apologies for multiple postings]

FIRST CALL FOR PAPERS


REPROLANG 2020
Shared Task on the Reproduction of Research Results in Science and Technology of Language
(part of LREC 2020 conference)
Marseille, France
May 13-15, 2020
http://wordpress.let.vupr.nl/lrec-reproduction


We are very pleased to announce REPROLANG 2020, the Shared Task on the Reproduction of Research Results in Science and Technology of Language, organized by ELRA - European Language Resources Association with the technical support of CLARIN - European Research Infrastructure for Language Resources and Technology, as part of the LREC 2020 conference.

BACKGROUND

Scientific knowledge is grounded on falsifiable predictions, and thus its credibility and raison d’être rely on the possibility of repeating experiments and obtaining results similar to those originally obtained and reported. In many young scientific areas, including ours, the acknowledgement and promotion of the reproduction of research results need to be greatly increased.

For this reason, a special track on reproducibility is included in the regular program of the LREC 2020 conference (side by side with sessions on other topics) for papers on the reproduction of research results, and the present community-wide shared task is launched to elicit and motivate the spread of scientific work on reproduction. This initiative builds on the previous pioneering LREC workshops on reproducibility, 4REAL 2016 and 4REAL 2018.


SHARED TASK

The shared task is of a new type. It is partly similar to the usual competitive shared tasks, in the sense that all participants share a common goal, but partly different from previous shared tasks, in the sense that its primary focus is on seeking support and confirmation of previous results rather than on surpassing them. Thus, instead of a competitive shared task, with each participant striving for an individual top system that scores as far as possible above a rough baseline, this will be a cooperative shared task, with participants striving for systems that reproduce an original complex research experiment as closely as possible, thereby reinforcing, where outcomes converge, the reliability of its results. At the same time, as with competitive shared tasks, participation in the cooperative shared task offers excellent ground for new ideas for improvement and new advances beyond the reproduced results.

We invite researchers to reproduce the results of a selected set of articles, which have been offered by their respective authors, with their consent, for use in this shared task. Papers submitted for this task are expected to report on reproduction findings, to document how the results of the original paper were reproduced, to discuss reproducibility challenges, to report on the time, space or data requirements encountered in training and testing, to reflect on lessons learned, to elaborate on recommendations for best practices, etc. Submissions that, in addition to the reproduction exercise, also report results of replicating the selected tasks with other languages, domains, data sets, models, methods, algorithms, downstream tasks, etc. are also encouraged. These should provide insight into the robustness of the replicated approaches, their learning curves and potential for incremental performance gains, their capacity for generalization, their transferability across experimental circumstances and into possible real-life usage scenarios, their suitability to support further progress, etc.


PUBLICATION

LREC conferences have one of the top h5-index scores of research impact among world-class venues for research on Human Language Technology.

Accepted papers for the shared task will be published in the Proceedings of the LREC 2020 main conference. LREC Proceedings are freely available from ELRA and ACL Anthology. They are indexed in Scopus (Elsevier) and in DBLP. LREC 2010, LREC 2012 and LREC 2014 Proceedings are included in the Thomson Reuters Conference Proceedings Citation Index (the other editions are being processed).

Substantially extended versions of papers selected by reviewers as the most appropriate will be considered for publication in special issues of the Language Resources and Evaluation Journal published by Springer (a SCI-indexed journal).


IMPORTANT DATES

November 25, 2019: deadline for paper submission (aligned with LREC 2020)
November 27, 2019: deadline for projects in gitlab.com to go public
February 14, 2020: notification of acceptance
May 11-16, 2020: LREC conference takes place


SELECTED TASKS

The Selection Committee has selected a broad range of papers and tasks.

Chapter A: Lexical processing

Task A.1: Cross-lingual word embeddings

Artetxe, Mikel, Gorka Labaka, and Eneko Agirre. 2018. “A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings”. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), pp. 789–798.
http://aclweb.org/anthology/P18-1073
Major reproduction comparables: Accuracy scores (tables 1 to 4).

Task A.2: Named entity embeddings

Newman-Griffis, Denis, Albert M Lai, and Eric Fosler-Lussier. 2018. “Jointly Embedding Entities and Text with Distant Supervision”. In Proceedings of The Third Workshop on Representation Learning for NLP, pp. 195–206.
http://aclweb.org/anthology/W18-3026
Major reproduction comparables: Spearman’s ρ scores for semantic similarity predictions (tables 3 and 4), and accuracy scores (table 6).

Chapter B: Sentence processing

Task B.1: POS tagging

Bohnet, Bernd, Ryan McDonald, Gonçalo Simões, Daniel Andor, Emily Pitler, and Joshua Maynez. 2018. “Morphosyntactic Tagging with a Meta-BiLSTM Model over Context Sensitive Token Encodings”. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), pp. 2642–2652.
http://aclweb.org/anthology/P18-1246
Major reproduction comparables: f-score values (tables 2 to 8).

Task B.2: Sentence semantic relatedness

Gupta, Amulya, and Zhu Zhang. 2018. “To Attend or not to Attend: A Case Study on Syntactic Structures for Semantic Relatedness”. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), pp. 2116–2125.
http://aclweb.org/anthology/P18-1197
Major reproduction comparables: Pearson’s r and Spearman’s ρ scores for semantic relatedness (table 1), and f-score values for paraphrase detection (table 2).

Chapter C: Text processing

Task C.1: Relation extraction and classification

Rotsztejn, Jonathan, Nora Hollenstein, and Ce Zhang. 2018. “ETH-DS3Lab at SemEval-2018 Task 7: Effectively Combining Recurrent and Convolutional Neural Networks for Relation Classification and Extraction”. In Proceedings of the 12th International Workshop on Semantic Evaluation (SemEval 2018), pp. 689–696.
http://aclweb.org/anthology/S18-1112
Major reproduction comparables: precision, recall and f-score values (tables 3 and 4).

Task C.2: Privacy preserving representation

Li, Yitong, Timothy Baldwin, and Trevor Cohn. 2018. “Towards Robust and Privacy-preserving Text Representations”. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), pp. 25–30.
http://aclweb.org/anthology/P18-2005
Major reproduction comparables: POS accuracy scores (tables 1 and 2), and sentiment analysis f-scores (table 3).

Task C.3: Language modelling

Howard, Jeremy, and Sebastian Ruder. 2018. “Universal Language Model Fine-tuning for Text Classification”. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), pp. 328–339.
http://aclweb.org/anthology/P18-1031
Major reproduction comparables: Error rate (%) scores in sentiment analysis and question classification tasks (tables 2 and 3).

Chapter D: Applications

Task D.1: Text simplification

Nisioi, Sergiu, Sanja Stajner, Simone Paolo Ponzetto, and Liviu P. Dinu. 2017. “Exploring Neural Text Simplification Models”. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017), pp. 85–91.
http://aclweb.org/anthology/P/P17/P17-2014.pdf
Major reproduction comparables: Averaged human evaluation scores, by 3 evaluators, in 1 to 5 and -2 to +2 scales (table 2).

Task D.2: Language proficiency scoring

Vajjala, Sowmya, and Taraka Rama. 2018. “Experiments with Universal CEFR classifications”. In Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications, pp. 147–153.
http://aclweb.org/anthology/W18-0515
Major reproduction comparables: f-score values (tables 2, 3 and 4).

Task D.3: Neural machine translation

Vanmassenhove, Eva, and Andy Way. 2018. “SuperNMT: Neural Machine Translation with Semantic Supersenses and Syntactic Supertags”. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), pp. 67–73.
http://aclweb.org/anthology/P18-3010
Major reproduction comparables: BLEU scores (tables 1 and 2; plots in figures 2, 3 and 4).

Chapter E: Language resources

Task E.1: Parallel corpus construction

Brunato, Dominique, Andrea Cimino, Felice Dell'Orletta, and Giulia Venturi. 2016. “PaCCSS-IT: A Parallel Corpus of Complex-Simple Sentences for Automatic Text Simplification”. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP 2016), pp. 351–361.
https://aclweb.org/anthology/D16-1034
Major reproduction comparables: data set.

Participants are expected to obtain the data and tools for the reproduction from the information provided in the paper. Using the description of the experiment is part of the reproduction exercise.


SUBMISSION
The START platform of LREC 2020 will be used for the submission of the following required elements: a paper describing the reproduction effort, and a link to the software and data used to obtain the results reported in the paper (more details below). The submitted materials and results will be checked by a CLARIN panel. Papers will be peer-reviewed.


PAPER PREPARATION
REPROLANG 2020 invites the submission of full papers of 4 to 8 pages (plus additional pages for references if needed). These submissions must strictly follow the LREC 2020 conference stylesheet, which will be available on the conference website.


MATERIALS PREPARATION
For the submission to be complete and to be checked by the CLARIN panel, the software used to obtain the results reported in the paper must be made available as a docker container through a project in gitlab.com. Detailed instructions are available at https://gitlab.com/CLARIN-ERIC/reprolang/. For technical support, the CLARIN team can be contacted at reprolang...@clarin.eu, or an issue can be created under https://gitlab.com/CLARIN-ERIC/reprolang/issues.

Submissions are done via the START conference management system used by LREC 2020 and include the following elements:
- url address of your gitlab.com project
- url of the tar.gz with the datasets
- the md5 checksum of the above tar.gz
- .pdf with the paper, which must include the above url of your gitlab.com project, and the commit hash and tag of the submitted version
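
As an illustration only, the md5 checksum requested above could be computed along the following lines. This is a minimal sketch in Python using the standard hashlib module; the function name md5_of_file and the chunk size are hypothetical choices and are not part of the official instructions, which remain those published at the CLARIN gitlab.com page.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of the file at `path`.

    The file is read in chunks so that large dataset archives
    do not have to fit into memory at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (end of file)
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, print(md5_of_file("datasets.tar.gz")) would print the checksum to report, where datasets.tar.gz is a placeholder file name. The result should match what the command-line md5sum tool reports for the same file.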

The project in gitlab.com should be made public within 2 days after the submission deadline.

PRESENTATION
Papers accepted for publication will be presented in a specific session of the LREC main conference. There is no difference in quality between oral and poster presentations. Only the appropriateness of the type of communication (more or less interactive) to the content of the paper will be considered. The format of the presentations will be decided by the Program Committee. The proceedings will include both oral and poster papers in the same format.

REGISTRATION
For a selected paper to be included in the programme and to be published in the proceedings, at least one of its authors must register for the LREC 2020 conference by the early bird registration deadline. A single registration only covers one paper, following the general LREC policy on registration. Registration service is to be found at the LREC 2020 website.


CONTACTS
About the shared task:
Piek Vossen
p.t.j.m.vos...@vu.nl

About the preparation and submission of materials:
reprolang...@clarin.eu

REPROLANG 2020 website: http://wordpress.let.vupr.nl/lrec-reproduction


ORGANIZATION

Steering Committee

António Branco, University of Lisbon (chair of Steering Committee)
Nicoletta Calzolari, ILC, Pisa (co-chair of Steering Committee)
Gertjan van Noord, University of Groningen (chair of Task Selection Committee)
Piek Vossen, VU University Amsterdam (chair of Program Committee)


Task Selection Committee

Gertjan van Noord, University of Groningen (chair)
Tim Baldwin, University of Melbourne
António Branco, University of Lisbon
Nicoletta Calzolari, ILC, Pisa
Çağrı Çöltekin, University of Tuebingen
Nancy Ide, Vassar College, New York
Malvina Nissim, University of Groningen
Stephan Oepen, University of Oslo
Barbara Plank, University of Copenhagen
Piek Vossen, VU University Amsterdam
Dan Zeman, Prague University

Program Committee

Several invitations are still awaiting an answer; these are marked with [!].

Piek Vossen, VU University Amsterdam (chair)
[!] Gilles Adda, LIMSI-CNRS, Paris
[!] Eneko Agirre, Basque University
Francis Bond, Nanyang Technical University, Singapore
António Branco, University of Lisbon
Nicoletta Calzolari, ILC, Pisa
Kevin Cohen, University of Colorado Boulder
[!] Thierry Declerck decle...@dfki.de, DFKI Saarbruecken
[!] John McCrae, Galway University
Nancy Ide, Vassar College, New York
[!] Antske Fokkens, VU University Amsterdam
Karën Fort, University of Paris-Sorbonne
[!] Cyril Grouin, LIMSI-CNRS, Paris
Mark Liberman, University of Pennsylvania
[!] Margo Mieskis
[!] Aurélie Névéol, LIMSI-CNRS, Paris
Gertjan van Noord, University of Groningen
Stephan Oepen, University of Oslo
[!] Ted Pedersen, University of Minnesota
Senja Pollak, Jozef Stefan Institute, Ljubljana
[!] Paul Rayson, Lancaster University
Martijn Wieling, University of Groningen



Technical Committee
reprolang...@clarin.eu
Dieter Van Uytvanck, CLARIN (chair)
André Moreira, CLARIN
Twan Goosen, CLARIN
João Ricardo Silva, CLARIN and University of Lisbon
Luís Gomes, CLARIN and University of Lisbon
Willem Elbers, CLARIN

