The Third Workshop on LLM4Eval continues the discussion begun in the
previous editions of the series, which investigated the potential and
challenges of using LLMs for search relevance evaluation, automated
judgments, and retrieval-augmented generation (RAG) assessment. As modern
IR systems integrate search, recommendations, conversational interfaces,
and personalization, new evaluation challenges arise beyond basic
relevance assessment. These applications generate personalized rankings
and explanations and adapt to user preferences over time, requiring new
evaluation methods.
Overview

Recent advancements in Large Language Models (LLMs) have significantly
impacted evaluation methodologies in Information Retrieval (IR), reshaping
the way relevance, quality, and user satisfaction are assessed. Having
first demonstrated their potential for query-document relevance judgments,
LLMs are now being applied to more complex tasks, including relevance
label generation, assessment of retrieval-augmented generation systems,
and evaluation of text-generation quality. As IR systems evolve
toward more sophisticated and personalized user experiences, integrating
search, recommendations, and conversational interfaces, new evaluation
methodologies become necessary.

Building upon the success of our previous workshops, this third iteration
of the LLM4Eval workshop at SIGIR 2025 seeks submissions exploring new
opportunities, limitations, and hybrid approaches involving LLM-based
evaluations.
Important dates

   - Paper submission deadline: April 23, 2025 (AoE)
   - Notification of acceptance: May 21, 2025 (AoE)
   - Workshop date: July 17, 2025

Topics of interest

We invite submissions on topics including, but not limited to:

   - LLMs for query-document relevance assessment
   - Evaluating conversational IR and recommendation systems with LLMs
   - Hybrid evaluation frameworks combining LLM and human annotations
   - Identifying failure modes and limitations of LLM annotations
   - Prompt engineering strategies for improving LLM annotation quality
   - Standardizing protocols for reliable LLM-based evaluations
   - Bias, fairness, and ethical considerations in LLM evaluations
   - LLM annotation robustness, reliability, and reproducibility
   - User-centric evaluations, personalization, and subjective assessments
   with LLMs
   - Case studies and lessons from industry applications of LLM-based
   evaluations

Submission guidelines

   - Papers must follow SIGIR format and should not exceed 9 pages,
   excluding references.
   - We accept full papers (published or unpublished), position papers, and
   demo papers.
   - All papers will be peer-reviewed (double-blind) by the program
   committee and judged on their relevance to the workshop themes and
   their potential to generate discussion.
   - Previously published studies can be submitted in their original format
   and will be reviewed solely for their relevance to this workshop.
   - All submissions must be in English (PDF format).
   - Submission through EasyChair:
   https://easychair.org/conferences/?conf=llm4evalsigir25.

Publication options

Authors can choose between archival and non-archival options for their
submissions:

   - Archival: Papers will be included in the workshop proceedings.
   - Non-archival: Papers will not be included in the workshop proceedings;
   they may be uploaded to arXiv.org and remain eligible for submission
   elsewhere. The workshop's website will maintain links to the arXiv
   versions of the papers.