Hi Saeed,

There are a few summarization datasets in Italian.
We have published a dataset extracted from Wikipedia (https://ceur-ws.org/Vol-3033/paper65.pdf), available from Hugging Face (https://huggingface.co/datasets/Silvia/WITS). Another group has recently published some datasets in the news domain (https://www.mdpi.com/2078-2489/13/5/228), i.e., from the newspaper Il Post (https://huggingface.co/datasets/ARTeLab/ilpost) and Fanpage (https://huggingface.co/datasets/ARTeLab/fanpage). They also automatically translated MLSum into Italian. Previously, there were some Italian splits of multilingual datasets, e.g. WikiLingua.

Unfortunately, I do not know much about datasets in Spanish.

I hope this helps.

Regards,
Silvia

On Fri, 4 Aug 2023, 11:49 Saeed Farzi via Corpora <[email protected]> wrote:

> Hi guys,
> I am going to implement a summarization system in the medical domain in
> Italian and Spanish. So I am looking for free summarization datasets both
> in the public and medical domains in both languages.
> Any help would be appreciated.
> sincerely
> Ciao
> --
> *Dr. Saeed Farzi,*
> Faculty of Computer Engineering,
> K. N. Toosi University of Technology, Tehran, Iran.
> Phone: +98-21-8462450-401
> Fax: +98-21-88462066
> P.O. Box: 16315-1355,
> Web: http://wp.kntu.ac.ir/saeedfarzi/
> Lab: https://www.trlab.ir/

--
The information transmitted is intended only for the person or entity to which it is addressed and may contain confidential and/or privileged material. If you received this in error, please contact the sender and delete the material.
_______________________________________________ Corpora mailing list -- [email protected] https://list.elra.info/mailman3/postorius/lists/corpora.list.elra.info/ To unsubscribe send an email to [email protected]
