[Corpora-List] BioCreative VIII Challenge and Workshop 2nd Call for Participation

Islamaj, Rezarta (NIH/NLM/NCBI) [E] via Corpora Wed, 19 Jul 2023 12:12:53 -0700

BioCreative VIII Challenge and Workshop 2nd Call for Participation

Where, When:
The BioCreative VIII workshop will run with AMIA 2023, November 11-15, 2023, 
in New Orleans, LA.


BioCreative VIII:
The VIIIth BioCreative workshop seeks to attract researchers interested in 
automatic methods of extracting medically relevant information from clinical 
data and aims to bring together the medical NLP community and the health 
professionals community. The challenge tracks include:

  *   BioRED (Biomedical Relation Extraction Dataset) Track will continue to 
address information extraction from biomedical literature
  *   SYMPTEMIST (Symptom TExt Mining Shared Task) will focus on symptom 
extraction from clinical records in Spanish and from a multilingual corpus
  *   Phenotype extraction (genetic conditions in pediatric patients) Track 
will address phenotype extraction from clinical records
  *   Annotation Tool Track will focus on annotation tools that facilitate the 
job of domain experts by offering seamless integration with relevant ontologies 
and other features to improve efficiency (dataset provided).

Workshop Proceedings and Special Issue:
The BioCreative VIII Proceedings will host all the submissions from 
participating teams, and will be freely available by the time of the 
workshop.
In addition, we are happy to announce that the journal Database will host the 
BioCreative VIII special issue for work that has passed its peer-review 
process. Invitations to submit will be sent after the workshop.

Participation:
Teams can participate in one or more of these tracks. Team registration will 
continue until final commitment is requested by the individual tracks.
To register a team, go to the Registration form 
<https://forms.gle/cwEPevGPjrjm687z5>.
If you have restrictions accessing Google Forms, please send an e-mail to 
[email protected]


BioCreative VIII Tracks:

Track 1: BioRED (Biomedical Relation Extraction Dataset) Track. (Rezarta 
Islamaj and Zhiyong Lu)

This track aims to foster the development of systems that automatically extract 
biomedical relations from journal articles. The final resource, freely 
available to the community, will consist of 1000 MEDLINE articles fully 
annotated with biologically and medically relevant entities, biomedical 
relations between them, and the novelty of each relation (whether the relation 
is a key point of the article or background knowledge found elsewhere). 
The participants will use the training data (600 articles) to design and 
develop their NLP systems to extract asserted relationships from free text and 
are encouraged to classify relations that are novel findings. In the 
BioCreative setting we will enrich the BioRED training dataset with 400 
recently published, fully annotated MEDLINE articles, bringing this valuable 
resource to 1000 articles. This track serves as a continuation of previous 
BioCreative Workshops that addressed the individual extraction of bio entities 
and/or specific relations such as disease-gene, protein-protein, or 
chemical-chemical, in biomedical articles. In contrast to previous 
challenges, this track calls for the extraction of all semantic relations 
expressed in the article and their novelty factor.

Track 2: SYMPTEMIST (Symptom TExt Mining Shared Task) (Martin Krallinger)

A considerable effort has been made to automatically extract from clinical 
texts relevant variables and concepts using advanced entity recognition 
approaches. Despite the importance of clinical signs and symptoms for 
diagnosis, prognosis and healthcare data analytics strategies, this kind of 
clinical entity has received far less attention when compared to other entity 
classes such as medications or diseases. Understanding and characterizing 
relationships between different symptoms, their onset, or associations of 
symptoms to diseases is a central question for medical research. Due to the 
complexity underlying the annotation process and normalization or mapping of 
symptom mentions to controlled vocabularies, very few datasets or corpora have 
been generated to train and evaluate advanced clinical named entity recognition 
systems. To foster the development, research and evaluation of semantic 
annotation strategies that can be useful for systematically extracting and 
harmonizing symptoms from clinical documents, we propose the SYMPTEMIST track. 
We will invite researchers, health-tech professionals, NLP, and ontology 
experts to develop tools capable of automatically detecting mentions of 
clinical symptoms in clinical texts in Spanish and normalizing or mapping 
them to a widely used multilingual clinical vocabulary, namely SNOMED CT. For 
this task we will release a large collection of manually annotated symptom 
mentions, together with detailed annotation guidelines, consistency analysis 
and additional resources. For this track we also plan to release a multilingual 
version of the corpus (English, Italian, Romanian, Catalan, Portuguese, French, 
Dutch, Swedish and Czech). This is a new challenge.

Track 3: Phenotype extraction (genetic conditions in pediatric patients) 
(Graciela Gonzalez, Ian Campbell, Davy Weissenbacher)

The dysmorphology physical examination is a critical component of the 
diagnostic evaluation in clinical genetics. This process catalogs often minor 
morphological differences of the patient's facial structure or body, but it may 
also identify more general medical signs such as neurologic dysfunction. The 
findings enable the correlation of the patient with known rare genetic 
diseases. Although the medical findings are key information, they are nearly 
always captured within the electronic health record (EHR) as unstructured free 
text, making them unavailable for downstream computational analysis. Advanced 
Natural Language Processing methods are therefore required to retrieve the 
information from the records. This is a new challenge.

Track 4: Annotation Tool track (Rezarta Islamaj, Cecilia Arighi, Lynette 
Hirschman, Martin Krallinger, Graciela Gonzalez)

Recognizing the need for freely available, time-saving tools that help build 
quality gold-standard resources, the goal of the BioCreative 2023 Annotation 
Tool Track is to foster development of such biocuration annotation systems. 
This track calls for text mining developers to submit systems that are: 1) 
publicly available, with local setup options for data with privacy concerns, 
such as clinical records, 2) able to support team annotation and 
collaboration between annotators to ensure data annotation quality, 3) able 
to annotate documents for triage, entities, and/or relations, and 4) able to 
integrate the selected ontology and provide search and browsing capabilities, 
as well as suggestions to the curator for the selected ontology. A select 
number of systems will be showcased at the workshop.

Organizing Committee
  *   Dr. Rezarta Islamaj, National Library of Medicine
  *   Dr. Cecilia Arighi, University of Delaware
  *   Dr. Ian M. Campbell, Children's Hospital of Philadelphia
  *   Dr. Graciela Gonzalez-Hernandez, Cedars-Sinai Medical Center
  *   Dr. Lynette Hirschman, MITRE
  *   Dr. Martin Krallinger, Barcelona Supercomputing Center
  *   Dr. Davy Weissenbacher, Cedars-Sinai Medical Center
  *   Dr. Zhiyong Lu, National Library of Medicine

