The School of Computer Science and Digital Technologies, Aston University, UK, is
offering two PhD positions in language and speech processing on the following
two topics. The application deadline is 16th February 2024. Applications for
the positions can be submitted via Aston's PGR webpage
(https://www.aston.ac.uk/graduate-school/how-to-apply/studentships). Enquiries
about the positions can be made to Dr Tharindu Ranasinghe, School of Computer
Science and Digital Technologies, Aston University, UK -
[email protected].



Building Trustworthy Automatic Speech Recognition Systems

Dr Tharindu 
Ranasinghe<https://research.aston.ac.uk/en/persons/tharindu-ranasinghe> (School 
of Computer Science and Digital Technologies - Applied AI & Robotics Department)

Dr Phil
Weber<https://research.aston.ac.uk/en/persons/phil-weber> (Aston Centre for 
Artificial Intelligence Research and Application – ACAIRA, School of Computer 
Science and Digital Technologies - Applied AI & Robotics Department)

Prof Anikó Ekárt<https://research.aston.ac.uk/en/persons/aniko-ek%C3%A1rt>
(Aston Centre for Artificial Intelligence Research and Application – ACAIRA, 
School of Computer Science and Digital Technologies - Applied AI & Robotics 
Department)

Dr Muhidin Mohamed<https://research.aston.ac.uk/en/persons/muhidin-mohamed> 
(College of Business and Social Sciences - Operations & Information Management)



Project Summary, Aim and Objectives:

Automatic Speech Recognition (ASR) has gained popularity in the last decade 
thanks to advancements in speech and natural language processing, along with 
the availability of powerful hardware for processing extensive data streams. 
ASR is crucial in transcription services for various sectors, including legal, 
healthcare, and entertainment. It also plays a vital role in e-learning 
platforms, customer support systems, and enhancing accessibility for 
individuals with disabilities. Additionally, ASR significantly contributes to 
language translation, making it widely adopted across diverse sectors.

Although ASR has come a long way in recent years, it still has limitations, and
the output it produces is far from perfect. However, most commercial ASR systems
do not explicitly state this to the user, leaving the user to assume that the
output is accurate. Most large-scale ASR systems perform better for widely
spoken languages, while output quality for low-resource languages is markedly
lower. ASR systems also struggle to handle different accents and dialects,
especially those of non-native speakers. Furthermore, most ASR systems are
trained on general-domain data and do not perform optimally in specialised
domains such as healthcare. These limitations result in erroneous outputs, and
the lack of transparency and accountability can lead to severe consequences,
especially in critical domains such as healthcare or law. A quality indicator
for ASR systems has therefore become essential, as it can play a significant
role in informing the user about the quality of the output.

This PhD research aims to develop a comprehensive quality indicator system for
ASR. The specific goals are to (1) investigate what makes ASR trustworthy;
(2) evaluate ASR systems in challenging scenarios; (3) design quality indicator
metrics for ASR (e.g. sentence-level scores, word-level error spans, critical
errors); and (4) introduce public benchmarks and investigate novel approaches
for predicting quality in ASR. The output of the PhD will contribute towards
trustworthy ASR systems.
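
As a rough illustration of the kind of signal objective (3) refers to, the
Python sketch below aligns an ASR hypothesis against a reference transcript
and derives both a sentence-level score (WER) and word-level error spans from
the same alignment. It is a minimal sketch only; the example transcripts and
the function name are invented for illustration, not part of the project.

# Minimal sketch (illustrative only): sentence-level WER plus word-level
# error spans from a Levenshtein alignment of reference vs. ASR hypothesis.

def wer_with_error_spans(reference: str, hypothesis: str):
    ref, hyp = reference.split(), hypothesis.split()
    # DP table of word-level edit distances between prefixes.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    # Backtrace to recover the word-level edit operations (error spans).
    spans, i, j = [], len(ref), len(hyp)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] and ref[i - 1] == hyp[j - 1]:
            i, j = i - 1, j - 1  # correct word, no error span
        elif i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + 1:
            spans.append(("substitution", ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            spans.append(("deletion", ref[i - 1], None))
            i -= 1
        else:
            spans.append(("insertion", None, hyp[j - 1]))
            j -= 1
    wer = d[len(ref)][len(hyp)] / max(len(ref), 1)
    return wer, list(reversed(spans))

wer, spans = wer_with_error_spans(
    "the patient was prescribed insulin",
    "the patient was described in sulin",
)
print(f"WER: {wer:.2f}")   # sentence-level score
print(spans)               # word-level error spans

A genuine quality-estimation system would of course predict such scores and
spans without access to a reference transcript; this sketch only shows the
target quantities the metrics in objective (3) would aim to estimate.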

Knowledge and skills required in applicant:

Natural Language Processing, Speech Processing, Machine Learning and Deep
Learning. The applicant should be familiar with Python and a neural network
framework such as PyTorch or TensorFlow, and should have excellent
programming skills.


Evidence-based detection of misuse of large language models

Dr Phil
Weber<https://research.aston.ac.uk/en/persons/phil-weber> (Aston Centre for 
Artificial Intelligence Research and Application – ACAIRA, School of Computer 
Science and Digital Technologies - Applied AI & Robotics Department)

Dr Tharindu 
Ranasinghe<https://research.aston.ac.uk/en/persons/tharindu-ranasinghe> (School 
of Computer Science and Digital Technologies - Applied AI & Robotics Department)

Dr Muhidin Mohamed<https://research.aston.ac.uk/en/persons/muhidin-mohamed> 
(College of Business and Social Sciences - Operations & Information Management)

Dr Paul Grace<https://research.aston.ac.uk/en/persons/paul-grace> (Cyber
Security Innovation Research Centre – CSI, School of Computer Science and
Digital Technologies)



Project Summary, Aim and Objectives:

Large language models (LLMs) have become ubiquitous since the release of 
ChatGPT, bringing a paradigm shift in the processing and generation of text, 
images, speech and video. New methods for training very large neural models 
using massive unlabelled data created the opportunity for foundation models 
able to generate data with apparently human-like ability. Publicly available 
pre-trained models facilitate novel tools: Google Gemini, Microsoft Copilot,
DALL-E and many start-ups allow non-experts to conversationally instruct and
use AI systems in everyday life, seamlessly employing complex technologies 
including automatic speech recognition, natural language processing, machine 
translation and image captioning.



New dangers accompany this rapid and unstructured step-change in technology. 
Beyond unease over energy use, environmental impact, and digital divides, many 
are concerned by the ease with which fake media, increasingly difficult to
distinguish from real media, can be created. In education, plagiarism detection
becomes more nuanced with the need to identify AI-generated text. In the 
justice domain, forensic determination of the source of a voice or face is 
obfuscated by the potential that it was artificially generated. Politicians 
worry about the impact on democracy of undetectable deepfakes, and 
cybersecurity experts about identity theft. The problems are exacerbated by the 
potential for LLM-generated data to be reused for training downstream models.



Scientifically well-founded methods for detecting and quantifying the risk of 
LLM-generated media are therefore urgently needed.



This project builds on established methods in forensic data analysis to develop
rigorous methods for detecting AI-generated media. Specifically, it will:
1) review existing approaches to detecting AI-generated and spoofed media,
2) build on methods for forensic voice comparison to develop and validate new
approaches to forensic text comparison, 3) apply these to detecting plagiarism
and deepfakes, 4) extend them to image data, and 5) propose principles to
contribute to broader questions of the safe, fair and transparent use of LLMs.
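
To give a flavour of objective 2, forensic voice comparison typically reports
a likelihood ratio rather than a hard decision, and the same framing carries
over to text. The Python sketch below computes a log likelihood ratio for a
single stylometric score under two Gaussian calibration models; the feature,
the calibration data and all names are hypothetical, and a real system would
use richer features and calibrated, validated models.

# Minimal sketch (hypothetical data): a likelihood-ratio score in the style
# of forensic voice comparison, applied to one stylometric feature of a text
# (e.g. mean token log-probability under some reference language model).
import numpy as np
from scipy.stats import norm

# Hypothetical calibration scores from known human- and LLM-written texts.
human_scores = np.array([-3.1, -2.8, -3.4, -2.9, -3.2])
llm_scores = np.array([-1.9, -2.1, -1.7, -2.0, -1.8])

# Fit one Gaussian per hypothesis (kept deliberately simple).
h_mu, h_sigma = human_scores.mean(), human_scores.std(ddof=1)
l_mu, l_sigma = llm_scores.mean(), llm_scores.std(ddof=1)

def log_likelihood_ratio(score: float) -> float:
    """log LR = log p(score | LLM-generated) - log p(score | human-written)."""
    return norm.logpdf(score, l_mu, l_sigma) - norm.logpdf(score, h_mu, h_sigma)

# A positive log-LR supports the "LLM-generated" hypothesis; its magnitude
# expresses strength of evidence, which is what a forensic method must report.
questioned_score = -2.0
print(f"log LR = {log_likelihood_ratio(questioned_score):+.2f}")

Reporting graded evidence in this way, rather than a binary "AI or not"
verdict, is what distinguishes the forensic framing from off-the-shelf
AI-text detectors.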

Knowledge and skills required in applicant:

Strong programming skills, preferably in Python, including development of large 
language models. Knowledge of machine learning theory, applications, and 
related statistical and probability theory. Awareness of modern approaches to 
forensic data science.
