[Apologies for multiple postings]
We are happy to announce that 1 new Written Corpus and 3 Evaluation
Packages are now available in our catalogue.
*ELRA-W0082 88milSMS. A corpus of authentic text messages in French*
<http://catalog.elra.info/product_info.php?products_id=1239>
ISLRN: 024-713-187-947-8 <http://islrn.org/resources/024-713-187-947-8/>
A multidisciplinary team of linguists and computer scientists collected
more than 88,000 authentic French text messages in Montpellier (2011)
as part of the sud4science LR project. The text messages were
semi-automatically anonymised before being partially transcoded (into
standardised French) and annotated.
*ELRA-E0043 CLEFeHealth 2014 Task 3 Evaluation Package*
<http://catalog.elra.info/product_info.php?products_id=1238>
ISLRN: 725-020-897-275-7 <http://islrn.org/resources/725-020-897-275-7/>
The CLEFeHealth 2014 Task 3 Evaluation Package contains data used for
the User-centred health information retrieval Shared task at the
CLEFeHealth Lab conducted in 2014. Task 3 aimed at evaluating
information retrieval to address questions patients may have when
reading clinical reports.
*ELRA-E0044 REPERE Evaluation Package*
<http://catalog.elra.info/product_info.php?products_id=1241>
ISLRN: 360-758-359-485-0 <http://islrn.org/resources/360-758-359-485-0/>
The REPERE Evaluation Package contains the visual annotation of 60 hours
of French news TV shows, for the purpose of person recognition within TV
programs. This annotation concerns both persons and written information
appearing on screen.
The data provided consist of:
- video files with indexes and with manual transcriptions in XGTF format
(Viper),
- audio files compressed in WAV format with transcriptions in TRS format
(Transcriber).
*ELRA-E0045 MAURDOR Evaluation Package*
<http://catalog.elra.info/product_info.php?products_id=1242>
ISLRN: 364-018-517-901-2 <http://islrn.org/resources/364-018-517-901-2/>
The MAURDOR project aims to evaluate systems for the automatic
processing of written documents. The collected documents are scanned
documents (printed, typewritten or handwritten). This package contains
8,129 documents, which were manually annotated after collection. It
includes the material provided to the evaluation campaign participants:
- Consistent development and test data corresponding to the
application concerned;
- Tools for the automatic measurement of system performance;
- A common assessment protocol applicable to each processing stage,
along with a complete automatic processing chain for written documents.
The documents are provided in TIFF format and the annotations are
provided in XML format.
For more information on the catalogue, please contact Valérie Mapelli
mailto:[email protected]
Visit our On-line Catalogue: http://catalog.elra.info
Visit the Universal Catalogue: http://universal.elra.info
Archives of ELRA Language Resources Catalogue Updates:
http://www.elra.info/en/catalogues/language-resources-announcements/
Follow us on Twitter @ELRANews <https://twitter.com/ELRAnews>