Hi all,

How do you evaluate the success of your query autocomplete (i.e., suggester) component, if you use one?

We cannot use MRR for various reasons (I can go into them if you're interested), so we're thinking of using nDCG, since we already use it for relevance evaluation of our system as a whole.

I am also interested in the metric "success at top-k," but I can't find any research papers that explicitly define "success." I am assuming it means a suggestion (or suggestions) labeled "relevant," but maybe it could also simply be the suggestion that receives a click from the user?

Would love to hear from the hive mind!

Best,
Audrey
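
P.S. To make the question concrete, here is a rough Python sketch of the two readings of "success at top-k" I have in mind, next to a plain nDCG@k. None of this comes from a paper or a library: the function names, the sample suggestions, and the choice to normalize nDCG against only the gains of the returned list are all my own assumptions for illustration.

```python
import math


def success_at_k(suggestions, successful, k):
    """Return 1.0 if any of the top-k suggestions is in `successful`, else 0.0.

    `successful` is whatever we decide counts as success: the set of
    suggestions labeled "relevant", or the set of suggestions the user clicked.
    """
    return 1.0 if any(s in successful for s in suggestions[:k]) else 0.0


def dcg_at_k(gains, k):
    """Discounted cumulative gain with linear gain and a log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains[:k], start=1))


def ndcg_at_k(gains, k):
    """nDCG@k, normalized by the ideal ordering of the same list of gains.

    Assumption: the ideal ranking is built only from the returned suggestions,
    not from every judged suggestion for the prefix.
    """
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0


# Hypothetical ranked suggestions for one typed prefix.
suggestions = ["python list", "python lambda", "python logging"]
labeled_relevant = {"python lambda", "python logging"}  # judgment-based success
clicked = {"python logging"}                            # click-based success

label_success = success_at_k(suggestions, labeled_relevant, k=2)
click_success = success_at_k(suggestions, clicked, k=2)
ndcg = ndcg_at_k([1 if s in labeled_relevant else 0 for s in suggestions], k=3)
```

With this sample, the two definitions disagree at k=2: a labeled-relevant suggestion is in the top 2, but the clicked one is at rank 3. That difference is exactly what I am unsure how to handle.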