[Our apologies if you have received multiple copies of this announcement.]
We are happy to announce that a set of Pashto Language Resources (1
Broadcast Speech Resource and 6 Written Corpora) and 1 new Multimodal
Resource are now available in our catalogue.
*_Pashto Language Resources:_* This set of Pashto Language Resources was
produced by ELDA within the PEA TRAD project supported by the French
Ministry of Defence (DGA). It consists of 1 Broadcast Speech Resource
and 6 Written Corpora.
Available resources are listed below (click on the links for further
details):
*ELRA-S0381 TRAD Pashto Broadcast News Speech Corpus*
*ISLRN: 918-508-885-913-7* <http://islrn.org/resources/918-508-885-913-7/>
This corpus contains 108 hours of transcribed broadcast news recordings,
covering more than 1,000 speakers. Transcriptions are provided together
with the audio files and include about 46,000 segments and 1.1M words.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1265
*ELRA-W0092 TRAD Pashto Monolingual text Corpus*
*ISLRN: 394-903-293-388-0* <http://islrn.org/resources/394-903-293-388-0/>
This is a monolingual text corpus in Pashto. The corpus contains about
112,000,000 tokens collected from 46 different blogs and websites.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1266
*ELRA-W0093 TRAD Pashto-French Parallel corpus of transcribed Broadcast
News Speech - Training data*
*ISLRN: 802-643-297-429-4* <http://islrn.org/resources/802-643-297-429-4/>
This corpus consists of the transcription of 106 hours of recordings in
Pashto from the TRAD Pashto Broadcast News Speech Corpus (ELRA-S0381)
translated into French. It contains about 832,000 source words and
747,000 target words.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1267
*ELRA-W0094 TRAD Pashto-French Parallel corpus of transcribed Broadcast
News Speech - Test data*
*ISLRN: 547-897-479-723-3* <http://islrn.org/resources/547-897-479-723-3/>
This is a parallel corpus, which contains 10,000 Pashto words translated
into French. The source texts come from 3 broadcast news transcriptions
of the TRAD Pashto Broadcast News Speech Corpus (ELRA-S0381).
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1268
*ELRA-W0095 TRAD Pashto-English Parallel corpus of transcribed Broadcast
News Speech - Test data*
*ISLRN: 006-102-605-738-4* <http://islrn.org/resources/006-102-605-738-4/>
This is a parallel corpus, which contains 10,000 Pashto words translated
into English. The source texts come from 3 broadcast news transcriptions
of the TRAD Pashto Broadcast News Speech Corpus (ELRA-S0381).
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1269
*ELRA-W0096 TRAD Pashto-French News Articles Parallel corpus*
*ISLRN: 649-628-149-051-7* <http://islrn.org/resources/649-628-149-051-7/>
This is a parallel corpus, which contains 10,000 Pashto words
translated into French by two different translators. The source texts
have been collected from the following news websites: Azadiradio,
Mashaal and Voice of America Pashto.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1270
*ELRA-W0097 TRAD Pashto-English News Articles Parallel corpus*
*ISLRN: 612-936-517-010-2* <http://islrn.org/resources/612-936-517-010-2/>
This is a parallel corpus, which contains 10,000 Pashto words
translated into English by two different translators. The source texts
have been collected from the following news websites: Azadiradio,
Mashaal and Voice of America Pashto.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1271
*ELRA-S0374 FoxPersonTracks: a Benchmark for Person Re-Identification
from TV Broadcast Shows*
*ISLRN: 168-132-570-218-1* <http://islrn.org/resources/168-132-570-218-1/>
FoxPersonTracks is a person track dataset dedicated to person
re-identification. The dataset is built from a set of real-life TV shows
broadcast on the French TV channels BFMTV and LCP, provided during the
REPERE challenge. It contains a total of 4,604 persontracks (short video
sequences featuring an individual with no background) from 266 persons.
The dataset also provides re-identification results using space-time
histograms as a baseline, together with an evaluation tool to ease
comparison with other re-identification methods.
For more information, see:
http://catalog.elra.info/product_info.php?products_id=1264
For more information on the catalogue, please contact Valérie Mapelli
mailto:[email protected]
If you would like to enquire about having your resources distributed by
ELRA, please do not hesitate to contact us.
Visit our On-line Catalogue: http://catalog.elra.info
Visit the Universal Catalogue: http://universal.elra.info
Archives of ELRA Language Resources Catalogue Updates:
http://www.elra.info/en/catalogues/language-resources-announcements/