In this newsletter:
LDC launches upgraded, mobile-friendly website
Connect with LDC on Bluesky

New publications:
DEFT Spanish Light and Rich ERE Annotation<https://catalog.ldc.upenn.edu/LDC2025T04>
MATERIAL Kazakh-English Language Pack<https://catalog.ldc.upenn.edu/LDC2025S03>

________________________________
LDC launches upgraded, mobile-friendly website
We are pleased to announce the launch of the newly upgraded LDC main website: 
https://www.ldc.upenn.edu/. Designed with a modern layout, the site now offers 
an improved experience across all devices. While the LDC Catalog, LDC user 
accounts, and LDC Submissions are not affected by this upgrade, they are now 
more accessible than ever from any page on the site. We invite you to explore 
the website and enjoy a smoother, more intuitive LDC web experience.

Connect with LDC on Bluesky
In addition to Facebook, X and LinkedIn, you can now connect with LDC on the 
microblogging platform Bluesky<https://bsky.app/profile/ldcupenn.bsky.social>. 
Follow us today to learn the latest news, announcements and corpora releases 
from the Consortium.
________________________________

New publications:
DEFT Spanish Light and Rich ERE 
Annotation<https://catalog.ldc.upenn.edu/LDC2025T04> was developed by LDC and 
consists of 158 Spanish discussion forum and newswire documents annotated for 
entities, relations, and events (ERE). Light ERE annotation labels entity 
mentions for a target set of entity types, together with the relations and 
events between and among those entities, including coreference. Rich ERE 
annotation expands types 
and tagging in the entities, relations, and events annotation tasks and 
replaces strict event coreference with a more loosely defined event hopper 
annotation. The source data consists of Spanish newswire text and Latin 
American discussion forum data from DEFT Spanish Treebank 
LDC2018T01<https://catalog.ldc.upenn.edu/LDC2018T01>. Light ERE annotation 
guidelines were applied to 128 documents. Rich ERE annotation was applied to 
154 files, 124 of which were also labeled with Light ERE annotation.

DARPA's Deep Exploration and Filtering of Text (DEFT) program aimed to address 
remaining capability gaps in state-of-the-art natural language processing 
technologies related to inference, causal relationships and anomaly detection. 
LDC supported the DEFT program by collecting, creating and annotating a variety 
of data sources.

2025 members can access this corpus through their LDC accounts. Non-members may 
license this data for a fee.

*

MATERIAL Kazakh-English Language Pack<https://catalog.ldc.upenn.edu/LDC2025S03> 
was developed by Appen<http://www.appen.com/> for the IARPA 
MATERIAL<https://www.iarpa.gov/index.php/research-programs/material> program 
and contains 57 hours of Kazakh conversational telephone speech, transcripts, 
English translations, annotations, and queries. Calls were made using different 
telephones (e.g., mobile, landline) from a variety of environments. Transcripts 
cover approximately 17% of the speech files, and all transcripts were 
translated into English. This release also includes English queries and their 
relevance 
annotations.

The MATERIAL program focused on underserved languages, with the ultimate goal 
of building cross-language information retrieval systems that find speech and 
text content using English search queries.

2025 members can access this corpus through their LDC accounts provided they 
have submitted a completed copy of the special license agreement. Non-members 
may license this data for a fee.

To unsubscribe from this newsletter, log in to your LDC 
account<https://catalog.ldc.upenn.edu/login> and uncheck the box next to 
"Receive Newsletter" under Account Options or contact LDC for assistance.

Membership Coordinator
Linguistic Data Consortium<ldc.upenn.edu>
University of Pennsylvania
T: +1-215-573-1275
E: [email protected]<mailto:[email protected]>
M: 3600 Market St. Suite 810
      Philadelphia, PA 19104





_______________________________________________
Corpora mailing list -- [email protected]
https://list.elra.info/mailman3/postorius/lists/corpora.list.elra.info/
To unsubscribe send an email to [email protected]
