Re: [discovery] How to measure disagreement between human judges in discernatron?

Jonathan Morgan Wed, 26 Oct 2016 11:32:41 -0700

Disclaimer: I'm not a math nerd, and I don't know the history of
Discernatron very well.


...but re: your second specialized concern, have you considered running
some more sophisticated inter-rater reliability statistics to get a better
sense of the degree of disagreement (controlling for random chance)? See
for example: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3402032/
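
To make that concrete, here is a minimal sketch of one chance-corrected
statistic, a linear-weighted Cohen's kappa for two judges on the 0 to 3
scale. The function name weighted_kappa and the choice of linear weights
are assumptions made for illustration, not anything Discernatron
currently does. Linear weights make a 2-vs-0 disagreement count twice as
much as a 3-vs-2 one, which also touches on the first concern.

```python
from collections import Counter


def weighted_kappa(ratings_a, ratings_b, categories=(0, 1, 2, 3)):
    """Linear-weighted Cohen's kappa for two judges on an ordinal scale.

    Hypothetical helper: ratings_a and ratings_b are equal-length lists
    of scores, one pair per judged result.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equal-length, non-empty rating lists")

    n = len(ratings_a)
    max_dist = max(categories) - min(categories)

    # How often each judge used each score (the marginal distributions).
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)

    # Observed disagreement: mean distance between paired scores,
    # normalized so that the largest possible gap counts as 1.
    observed = sum(abs(a - b) for a, b in zip(ratings_a, ratings_b))
    observed /= n * max_dist

    # Disagreement expected by chance if the judges scored independently
    # while keeping the same marginal distributions.
    expected = sum(
        freq_a[i] * freq_b[j] * abs(i - j)
        for i in categories
        for j in categories
    )
    expected /= n * n * max_dist

    if expected == 0:
        # Both judges gave the same single score to every result.
        return 1.0
    return 1.0 - observed / expected
```

A value of 1 means perfect agreement, values near 0 mean agreement no
better than chance, and negative values mean systematic disagreement.
Where to put the threshold for requesting more scores would still have
to be decided empirically.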

- Jonathan

On Wed, Oct 26, 2016 at 11:21 AM, Erik Bernhardson <
[email protected]> wrote:

> For a little backstory, in discernatron multiple judges provide scores
> from 0 to 3 for results. Typically we only request a single query to be
> reviewed by two judges. We would like to measure the level of disagreement
> between these two judges, and if it crosses some threshold get two more
> scores, so we can then measure disagreement in the group of 4. Somehow
> though, we need to define how to measure that level of disagreement and
> what the threshold for needing more scores is.
>
> Some specialized concerns:
> * It is probably important to include not just that the users gave
> different values, but also how far apart they are. The difference between a
> 3 and a 2 is much smaller than between a 2 and a 0.
> * If the users agree that 80% of the results are all 0, but disagree on
> the last 20%, even though the average disagreement is low it's probably
> still important? Might be worthwhile to take all the agreements about
> irrelevant results and remove them before calculating disagreement? Not
> sure...
>
> I know we have a few math nerds here on the list, so hoping someone has a
> few ideas.
>
> _______________________________________________
> discovery mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/discovery
>
>


-- 
Jonathan T. Morgan
Senior Design Researcher
Wikimedia Foundation
User:Jmorgan (WMF) <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)>
