[Mt-list] 1st CfP: REPROLANG 2020 (Shared Task on the Reproduction of Research Results in Science and Technology of Language)

ELRA ELDA Information Tue, 12 Mar 2019 09:11:35 -0700

[Apologies for multiple postings]

FIRST CALL FOR PAPERS



REPROLANG 2020

Shared Task on the Reproduction of Research Results in Science and Technology of Language

(part of LREC 2020 conference)
Marseille, France
May 13-15, 2020
http://wordpress.let.vupr.nl/lrec-reproduction

We are very pleased to announce REPROLANG 2020, the Shared Task on the Reproduction of Research Results in Science and Technology of Language, organized by ELRA - European Language Resources Association with the technical support of CLARIN - European Research Infrastructure for Language Resources and Technology, as part of the LREC 2020 conference.


BACKGROUND

Scientific knowledge is grounded on falsifiable predictions, and thus its credibility and raison d’être rely on the possibility of repeating experiments and getting results similar to those originally obtained and reported. In many young scientific areas, including ours, acknowledgement and promotion of the reproduction of research results need very much to be increased.

For this reason, a special track on reproducibility is included in the regular program of the LREC 2020 conference (side by side with other sessions on other topics) for papers on the reproduction of research results, and the present community-wide shared task is launched to elicit and motivate the spread of scientific work on reproduction. This initiative builds on the previous pioneering LREC workshops on reproducibility, 4REAL 2016 and 4REAL 2018.



SHARED TASK

The shared task is of a new type. It is partly similar to the usual competitive shared tasks, in the sense that all participants share a common goal, but partly different from them, in the sense that its primary focus is on seeking support and confirmation of previous results rather than on surpassing those results. Thus, instead of a competitive shared task, with each participant striving for an individual top system that scores as far as possible above a rough baseline, this will be a cooperative shared task, with participants striving for systems that reproduce an original complex research experiment as closely as possible, thereby possibly reinforcing the reliability of its results through their convergent outcomes. At the same time, as with competitive shared tasks, participation in this cooperative shared task offers excellent ground for new ideas for improvement and new advances beyond the reproduced results.

We invite researchers to reproduce the results of a selected set of articles, which have been offered by the respective authors with their consent to be used for this shared task. Papers submitted for this task are expected to report on reproduction findings, to document how the results of the original paper were reproduced, to discuss reproducibility challenges, to report the time, space or data requirements found for training and testing, to reflect on lessons learned, to elaborate on recommendations for best practices, etc. Submissions that, in addition to the reproduction exercise, also report on results of the replication of the selected tasks with other languages, domains, data sets, models, methods, algorithms, downstream tasks, etc. are also encouraged. These should make it possible to gain insight into the robustness of the replicated approaches, their learning curves and potential for incremental performance, their capacity for generalization, their transferability across experimental circumstances and into possible real-life usage scenarios, their suitability to support further progress, etc.



PUBLICATION

LREC conferences have one of the top h5-index scores of research impact among the world-class venues for research on Human Language Technology.

Accepted papers for the shared task will be published in the Proceedings of the LREC 2020 main conference. LREC Proceedings are freely available from ELRA and ACL Anthology. They are indexed in Scopus (Elsevier) and in DBLP. LREC 2010, LREC 2012 and LREC 2014 Proceedings are included in the Thomson Reuters Conference Proceedings Citation Index (the other editions are being processed).

Substantially extended versions of papers selected by reviewers as the most appropriate will be considered for publication in special issues of the Language Resources and Evaluation Journal published by Springer (a SCI-indexed journal).



IMPORTANT DATES

November 25, 2019: deadline for paper submission (aligned with LREC 2020)
November 27: deadline for projects in gitlab.com to go public
February 14, 2020: notification of acceptance
May 11-16: LREC conference takes place


SELECTED TASKS

The Selection Committee has selected a broad range of papers and tasks.

Chapter A: Lexical processing

Task A.1: Cross-lingual word embeddings

Artetxe, Mikel, Gorka Labaka, and Eneko Agirre. 2018. “A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings”. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), pp. 789–798.

http://aclweb.org/anthology/P18-1073
Major reproduction comparables: Accuracy scores (tables 1 to 4).

Task A.2: Named entity embeddings

Newman-Griffis, Denis, Albert M Lai, and Eric Fosler-Lussier. 2018. “Jointly Embedding Entities and Text with Distant Supervision”. In Proceedings of The Third Workshop on Representation Learning for NLP, pp. 195–206.

http://aclweb.org/anthology/W18-3026

Major reproduction comparables: Spearman’s ρ scores for semantic similarity predictions (tables 3 and 4), and accuracy scores (table 6).

Chapter B: Sentence processing

Task B.1: POS tagging

Bohnet, Bernd, Ryan McDonald, Gonçalo Simões, Daniel Andor, Emily Pitler, and Joshua Maynez. 2018. “Morphosyntactic Tagging with a Meta-BiLSTM Model over Context Sensitive Token Encodings”. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), pp. 2642–2652.

http://aclweb.org/anthology/P18-1246
Major reproduction comparables: f-score values (tables 2 to 8).

Task B.2: Sentence semantic relatedness

Gupta, Amulya, and Zhu Zhang. 2018. “To Attend or not to Attend: A Case Study on Syntactic Structures for Semantic Relatedness”. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), pp. 2116–2125.

http://aclweb.org/anthology/P18-1197

Major reproduction comparables: Pearson’s r and Spearman’s ρ scores for the semantic relatedness (table 1), and f-score values for paraphrase detection (table 2).

Chapter C: Text processing

Task C.1: Relation extraction and classification

Rotsztejn, Jonathan, Nora Hollenstein, and Ce Zhang. 2018. “ETH-DS3Lab at SemEval-2018 Task 7: Effectively Combining Recurrent and Convolutional Neural Networks for Relation Classification and Extraction”. In Proceedings of the 12th International Workshop on Semantic Evaluation (SemEval 2018), pp. 689–696.

http://aclweb.org/anthology/S18-1112

Major reproduction comparables: precision, recall and f-score values (tables 3 and 4).


Task C.2: Privacy preserving representation

Li, Yitong, Timothy Baldwin, and Trevor Cohn. 2018. “Towards Robust and Privacy-preserving Text Representations”. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), pp. 25–30.

http://aclweb.org/anthology/P18-2005

Major reproduction comparables: POS accuracy scores (tables 1 and 2), and sentiment analysis f-scores (table 3).

Task C.3: Language modelling

Howard, Jeremy, and Sebastian Ruder. 2018. “Universal Language Model Fine-tuning for Text Classification”. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), pp. 328–339.

http://aclweb.org/anthology/P18-1031

Major reproduction comparables: Error rate (%) scores in sentiment analysis and question classification tasks (tables 2 and 3).


Chapter D: Applications

Task D.1: Text simplification

Nisioi, Sergiu, Sanja Stajner, Simone Paolo Ponzetto, and Liviu P. Dinu. 2017. “Exploring Neural Text Simplification Models”. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017), pp. 85–91.

http://aclweb.org/anthology/P/P17/P17-2014.pdf

Major reproduction comparables: Averaged human evaluation scores, by 3 evaluators, in 1 to 5 and -2 to +2 scales (table 2).

Task D.2: Language proficiency scoring

Vajjala, Sowmya, and Taraka Rama. 2018. “Experiments with Universal CEFR classifications”. In Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications, pp. 147–153.

http://aclweb.org/anthology/W18-0515
Major reproduction comparables: f-score values (tables 2, 3 and 4).

Task D.3: Neural machine translation

Vanmassenhove, Eva, and Andy Way. 2018. “SuperNMT: Neural Machine Translation with Semantic Supersenses and Syntactic Supertags”. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), pp. 67–73.

http://aclweb.org/anthology/P18-3010

Major reproduction comparables: BLEU scores (tables 1 and 2; plots in figures 2, 3 and 4).


Chapter E: Language resources

Task E.1: Parallel corpus construction

Brunato, Dominique, Andrea Cimino, Felice Dell'Orletta, and Giulia Venturi. 2016. “PaCCSS-IT: A Parallel Corpus of Complex-Simple Sentences for Automatic Text Simplification”. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP 2016), pp. 351–361.

https://aclweb.org/anthology/D16-1034
Major reproduction comparables: data set.

Participants are expected to obtain the data and tools for the reproduction from the information provided in the paper. Using the description of the experiment is part of the reproduction exercise.

SUBMISSION

The START platform of LREC 2020 will be used for the submission of the following required elements: a paper describing the reproduction effort, and a link to the software and data used to obtain the results reported in the paper (more details below). The submitted materials and results will be checked by a CLARIN panel. Papers will be peer-reviewed.



PAPER PREPARATION

REPROLANG 2020 invites the submission of full papers from 4 pages to 8 pages (plus more pages for references if needed). These submissions must strictly follow the LREC 2020 conference stylesheet, which will be available on the conference website.



MATERIALS PREPARATION

For the submission to be complete and to be checked by a CLARIN panel, the software used to obtain the results reported in the paper must be made available as a Docker container through a project on GitLab. Detailed instructions are available at https://gitlab.com/CLARIN-ERIC/reprolang/. For technical support, the CLARIN team can be contacted at reprolang...@clarin.eu, or an issue can be created under https://gitlab.com/CLARIN-ERIC/reprolang/issues.

Submissions are done via the START conference management system used by LREC 2020 and include the following elements:

- URL of your gitlab.com project
- URL of the tar.gz with the datasets
- the md5 checksum of the above tar.gz
- .pdf with the paper, which must include the above URL of your gitlab.com project, and the above commit hash and tag

The project in gitlab.com should be made public within 2 days after the submission deadline.

PRESENTATION

Papers accepted for publication will be presented in a specific session of the LREC main conference. There is no difference in quality between oral and poster presentations. Only the appropriateness of the type of communication (more or less interactive) to the content of the paper will be considered. The format of the presentations will be decided by the Program Committee. The proceedings will include both oral and poster papers in the same format.


REGISTRATION

For a selected paper to be included in the programme and to be published in the proceedings, at least one of its authors must register for the LREC 2020 conference by the early bird registration deadline. A single registration only covers one paper, following the general LREC policy on registration. The registration service is to be found at the LREC 2020 website.



CONTACTS
About the shared task:
Piek Vossen
p.t.j.m.vos...@vu.nl

About the preparation and submission of materials:
reprolang...@clarin.eu
REPROLANG 2020 website: http://wordpress.let.vupr.nl/lrec-reproduction


ORGANIZATION

Steering Committee

António Branco, University of Lisbon (chair of Steering Committee)
Nicoletta Calzolari, ILC, Pisa (co-chair of Steering Committee)

Gertjan van Noord, University of Groningen (chair of Task Selection Committee)

Piek Vossen, VU University Amsterdam (chair of Program Committee)


Task Selection Committee

Gertjan van Noord, University of Groningen (chair)
Tim Baldwin, University of Melbourne
António Branco, University of Lisbon
Nicoletta Calzolari, ILC, Pisa
Çağrı Çöltekin, University of Tuebingen
Nancy Ide, Vassar College, New York
Malvina Nissim, University of Groningen
Stephan Oepen, University of Oslo
Barbara Plank, University of Copenhagen
Piek Vossen, VU University Amsterdam
Dan Zeman, Prague University

Program Committee

Invitations still awaiting an answer are marked with [!].

Piek Vossen, VU University Amsterdam (chair)
[!] Gilles Adda, LIMSI-CNRS, Paris
[!] Eneko Agirre, Basque University
Francis Bond, Nanyang Technological University, Singapore
António Branco, University of Lisbon
Nicoletta Calzolari, ILC, Pisa
Kevin Cohen, University of Colorado Boulder
[!] Thierry Declerck (decle...@dfki.de), DFKI Saarbruecken
[!] John McCrae, Galway University
Nancy Ide, Vassar College, New York
[!] Antske Fokkens, VU University Amsterdam
Karën Fort, University of Paris-Sorbonne
[!] Cyril Grouin, LIMSI-CNRS, Paris
Mark Liberman, University of Pennsylvania
[!] Margo Mieskis
[!] Aurélie Névéol, LIMSI-CNRS, Paris
Gertjan van Noord, University of Groningen
Stephan Oepen, University of Oslo
[!] Ted Pedersen, University of Minnesota
Senja Pollak, Jozef Stefan Institute, Ljubljana
[!] Paul Rayson, Lancaster University
Martijn Wieling, University of Groningen



Technical Committee
reprolang...@clarin.eu
Dieter Van Uytvanck, CLARIN (chair)
André Moreira, CLARIN
Twan Goosen, CLARIN
João Ricardo Silva, CLARIN and University of Lisbon
Luís Gomes, CLARIN and University of Lisbon
Willem Elbers, CLARIN


_______________________________________________
Mt-list site list
Mt-list@eamt.org
http://lists.eamt.org/mailman/listinfo/mt-list
