[Corpora-List] CFP AuTexTification 🤖 vs 👩🏻: Automatic Text Identification shared task at IberLEF 2023

Paolo Rosso via Corpora Tue, 21 Mar 2023 14:19:50 -0700

Hi Luis,

How are you?


I just saw on Corpora-list that you are deep into chatbot topics.

You may already have received the info: we are organizing a task that may interest you.


Hope you will participate ;-)

Regards,
Paolo
-----

*Apologies for cross-posting*

Do you believe machine-generated text is becoming an issue? Are you interested in boosting research to automatically detect machine-generated text? 🤖👩🏻

We cordially invite researchers and practitioners from all fields to participate in the AuTexTification task. If you are interested, register for the shared task through this link: https://lnkd.in/dzBZsYiD

Once you have registered and the training phase has started, the datasets will be sent to your email address along with a password. More information on the task description, schedule, and submissions is available on the AuTexTification web page: https://sites.google.com/view/autextification


More information on the shared task

The new era of automatic content generation has surged through powerful causal language models like GPT, PaLM, or BLOOM that can be used to spread untruthful news, human-looking reviews, or opinions. Thus, it is imperative to develop technology to automatically detect generated text for content moderation and to attribute generated text to specific models to protect intellectual property or to assign responsibility. In this context, we propose the “Automatic Text Identification” (AuTexTification) shared task, to boost research and development of automatic systems to detect automatically generated text, obtained by state-of-the-art language models, in English and Spanish.

We propose two subtasks: (i) Human or Generated, where, given a text, participants will have to determine whether it has been automatically generated or not; and (ii) Model Attribution, where participants will have to determine which model generated a text. The models used to generate the text vary in size, ranging from 2 to 175 billion parameters, meaning that participants' systems should be versatile enough to detect a diverse set of text generation models and writing styles.

In the training phase, participants will be provided with two partitions for subtask 1, i.e., English and Spanish partitions, with binary labels 👩🏻 and 🤖. Similarly, a partition per language will be released for subtask 2. Each will include six labels (A, B, C, D, E, and F), each label representing a text generation model. Later, the unlabeled test data will be released.


Important Dates
March 22, 2023: Release of training data
April 21, 2023: Release of test data
May 10, 2023: Participant system results submission
May 17, 2023: Results notification
June 3, 2023: Paper submission
June 16, 2023: Paper peer-reviewed
July 4, 2023: Camera-ready paper version
September 26, 2023: Conference

Task organizers
José Ángel González (Symanto) Contact Email: [email protected]
Areg Sarvazyan (Symanto) Contact Email: [email protected]
Marc Franco-Salvador (Symanto)
Francisco Rangel (Symanto)
Berta Chulvi (Universitat Politècnica de València)
Paolo Rosso (Universitat Politècnica de València)

Please reach out to the organizers or join the Slack workspace to connect with the other participants and organizers: https://lnkd.in/di_zaMHf

