**** We apologize for the multiple copies of this email. If you are
already registered for the next webinar, you do not need to register
again. ****
------------------------------------------------------------------------
Dear colleague,
We are happy to announce the next webinar in the Language Technology
webinar series organized by the HiTZ Chair of AI (https://hitz.eus).
You can view the videos of previous webinars and the schedule for
upcoming webinars here: http://www.hitz.eus/webinars
Next webinar:
*Speaker: *André F. T. Martins (Universidade de Lisboa)
*Title: *xCOMET, Tower, EuroLLM: Open & Multilingual LLMs for Europe
*Date: *Thursday, May 8, 2025 - 15:00 CET
*Summary: *Today, LLMs are Swiss knives and MT one of their tools. Is
this the end of MT research? In this talk, I argue that the connection
between LLM and MT research is two-way. I present some of our recent
work advancing multilingual LLMs, tools to estimate their quality, and
how the two can be combined for test-time scaling. First, I present
xCOMET, an open-source learned metric which integrates sentence-level
evaluation and error span detection, exhibiting state-of-the-art
performance across all types of meta-evaluation (sentence-level,
system-level, and error span detection). Moreover, it does so while
highlighting and categorizing error spans, thus enriching the quality
assessment. Then, I present Tower, a suite of open multilingual LLMs for
translation-related tasks. Tower models are created through continued
pretraining on a carefully curated multilingual mixture of monolingual
and parallel data. The combination of Tower with COMET reranking
obtained the best results in 8 out of 11 language pairs in the WMT
General Translation shared task, according to human evaluation. Finally,
I describe EuroLLM, an ongoing EU-made project whose goal is to train an
open multilingual LLM from scratch using the European HPC infrastructure
(EuroHPC). The last release (EuroLLM-9B) supports 35 languages,
including all 24 official EU languages, and it achieves strong results
in various benchmarks, comparable to or better than the best existing
models of similar size.
xCOMET:
https://huggingface.co/collections/Unbabel/xcomet-659eca973b3be2ae4ac023bb
Tower:
https://huggingface.co/collections/Unbabel/tower-659eaedfe36e6dd29eb1805c
EuroLLM: https://huggingface.co/blog/eurollm-team/eurollm-9b
*Bio: *André F. T. Martins (PhD 2012, Carnegie Mellon University and
Instituto Superior Técnico; https://andre-martins.github.io/) is an
Associate Professor at Instituto Superior Técnico, University of Lisbon,
researcher at Instituto de Telecomunicações, and the VP of AI Research
at Unbabel. His research, funded by an ERC Starting Grant (DeepSPIN) and
Consolidator Grant (DECOLLAGE), among other grants, includes machine
translation, quality estimation, and structure and interpretability in deep
learning systems for NLP. His work has received several paper awards at
ACL conferences. He co-founded and co-organizes the Lisbon Machine
Learning School (LxMLS), and he is a Fellow of the ELLIS society and
co-director of the ELLIS Program in Natural Language Processing. He is a
member of the R&I advisory group of EuroHPC, the European infrastructure
for supercomputing.
*Upcoming webinars:*
· Mirella Lapata (Thursday, June 5, 2025)
If you are interested in participating, please complete this
registration form: http://www.hitz.eus/webinar_izenematea
If you cannot attend this webinar but want to be informed of
future HiTZ webinars, please complete this registration form instead:
http://www.hitz.eus/webinar_info
Best wishes,
HiTZ Zentroa
P.S. HiTZ will not grant any type of certificate for attendance at these
webinars.