Hi Eric,

The paper by Bar-Ilan et al. (2006) could be very useful to you. It discusses various measures, including overlap, Spearman's footrule and the G-measure. We explored these in one study (Trigo et al., 2015), so you may want to look at what we did. There is also an interesting measure based on Spearman's coefficient that gives greater weight to elements positioned higher in the ranking (Pinto da Costa and Soares, 2005).

Greetings
Pavel Brazdil
http://www.liaad.up.pt/area/pbrazdil/

Bar-Ilan, J., Mat-Hassan, M., and Levene, M. (2006). Methods for comparing rankings of search engine results. Computer Networks, 50(10):1448–1463.

Trigo, L., Víta, M., Sarmento, R., and Brazdil, P. (2015). Retrieval, visualization and validation of affinities between documents. In 1st Int. Workshop on the design, development and use of Knowledge IT artifacts in professional communities and aggregations (KITA 2015). SciTePress.

Pinto da Costa, J. and Soares, C. (2005). A weighted rank measure of correlation. Aust. N. Z. J. Stat., 47(4):515–529.

On Fri, Oct 14, 2016 at 11:25 AM, Hugo Gonçalo Oliveira <hro...@dei.uc.pt> wrote:

Would it make sense to compute the Minimum Edit Distance to transform one list into the other?
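
As a rough sketch of what that could look like, the Python below computes a plain Levenshtein distance over whole words (unit cost for insert, delete and substitute), using the two example lists from Eric's message quoted further down. The function name edit_distance and the variable names god_a and god_b are made up for illustration, and the unit costs are an assumption; other cost schemes are possible.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences, compared item by item."""
    # prev[j] = cost of turning the first i-1 items of a into the first j items of b
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # turning i items of a into an empty list takes i deletions
        for j, y in enumerate(b, start=1):
            substitute = prev[j - 1] + (0 if x == y else 1)
            delete = prev[j] + 1
            insert = curr[j - 1] + 1
            curr.append(min(substitute, delete, insert))
        prev = curr
    return prev[-1]


# The two ranked lists from the example quoted below.
god_a = ["move", "bless", "and", "forbid", "have",
         "be", "create", "do", "know", "want"]
god_b = ["bless", "blesses", "delusion", "incarnate", "hates",
         "himself", "exists", "almighty", "forbid", "rest"]

print(edit_distance(god_a, god_b))
```

On these two lists the distance comes out as 10, which is the same value two completely disjoint 10-item lists would get, so unweighted edit distance may be too coarse here unless the operation costs are made rank-sensitive.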

Best,
----------
Hugo Gonçalo Oliveira
CISUC, Department of Informatics Engineering, University of Coimbra

On Fri, Oct 14, 2016 at 10:01 AM, Eric Atwell <e.s.atw...@leeds.ac.uk> wrote:

Thanks Olga, BUT I don't see how Spearman's (or other) rank correlation formula measures overlap between 2 ranked lists containing some different words.

For example, what is the similarity between 2 "distributional semantics" representations of God?:

(1 move, 2 bless, 3 and, 4 forbid, 5 have, 6 be, 7 create, 8 do, 9 know, 10 want) and

(1 bless, 2 blesses, 3 delusion, 4 incarnate, 5 hates, 6 himself, 7 exists, 8 almighty, 9 forbid, 10 rest)

A simple measure is the count of overlapping words (bless, forbid), i.e. this scores 2 (or 2/10, or 2/20).
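
A minimal Python sketch of these unranked overlap scores follows. The function name overlap_scores and the dictionary keys are invented for illustration; the Jaccard value (shared words over distinct words in either list) is an extra normalisation not mentioned above, included only for comparison.

```python
def overlap_scores(a, b):
    """Unranked overlap between two lists, under a few normalisations."""
    shared = set(a) & set(b)
    n = len(shared)
    return {
        "shared": sorted(shared),
        "count": n,                               # raw count, e.g. 2
        "per_list": n / max(len(a), len(b)),      # e.g. 2/10
        "per_total": n / (len(a) + len(b)),       # e.g. 2/20
        "jaccard": n / len(set(a) | set(b)),      # assumption: not in the text above
    }


god_a = ["move", "bless", "and", "forbid", "have",
         "be", "create", "do", "know", "want"]
god_b = ["bless", "blesses", "delusion", "incarnate", "hates",
         "himself", "exists", "almighty", "forbid", "rest"]

scores = overlap_scores(god_a, god_b)
print(scores["shared"], scores["count"], scores["per_list"], scores["per_total"])
```

None of these values changes if either list is reordered, which is exactly the limitation at issue.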

BUT I also want to take into account ranks: (bless 2,1), (forbid 4,9)
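
One possible way to do this, sketched in Python below, is a footrule-style score: sum the absolute rank differences over every word in either list, treating a word missing from a list as if it sat at rank k+1, then rescale so identical lists score 1 and disjoint lists score 0. The rank k+1 convention and the rescaling by k(k+1) are assumptions borrowed from the general top-k list literature, not necessarily the exact formula of any paper cited in this thread, and the name footrule_similarity is made up for illustration.

```python
def footrule_similarity(a, b):
    """Footrule-style similarity for two top-k lists with partial overlap.

    Assumes both lists are non-empty, have the same length k,
    and contain no repeated items.
    """
    k = max(len(a), len(b))
    missing = k + 1  # assumed rank for a word absent from a list
    rank_a = {w: i for i, w in enumerate(a, start=1)}
    rank_b = {w: i for i, w in enumerate(b, start=1)}
    total = sum(
        abs(rank_a.get(w, missing) - rank_b.get(w, missing))
        for w in set(rank_a) | set(rank_b)
    )
    worst = k * (k + 1)  # value of the sum when the lists share no words
    return 1.0 - total / worst


god_a = ["move", "bless", "and", "forbid", "have",
         "be", "create", "do", "know", "want"]
god_b = ["bless", "blesses", "delusion", "incarnate", "hates",
         "himself", "exists", "almighty", "forbid", "rest"]

print(round(footrule_similarity(god_a, god_b), 3))
```

Here (bless 2,1) contributes a difference of 1 and (forbid 4,9) a difference of 5, while each unshared word is penalised by its distance from rank 11, so shared words near the top of both lists raise the score most.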

Does God shed light on my problems?

Eric

Eric Atwell, Asst Prof, Language@Leeds and Artificial Intelligence groups, School of Computing, University of Leeds, Times University of the Year 2017

From: Ольга Ляшевская <ole...@yandex.ru>
Sent: 14 October 2016 08:09:36
To: Eric Atwell; CORPORA discussion forum
Cc: AbdulRahman AlOsaimy; Claire Brierley
Subject: Re: [Corpora-List] metric of overlap between two ranked lists?

Dear Eric,

Spearman's rank correlation coefficient will work in this case, I think.
https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient

Best,
Olga

14.10.2016, 09:54, "Eric Atwell" <e.s.atw...@leeds.ac.uk>:

Is there a standard metric of overlap between two ranked lists?
e.g. to measure/score the similarity between top 10 keywords extracted using 2 different formulae, such as LL v MI?
OR e.g. to measure/score the similarity between top 10 hits from Google v top 10 hits from Bing for a given search phrase?
OR e.g. to measure/score the similarity between ranked lists of PoS-tags predicted for a word by two rival PoS-taggers in an ensemble tagger?

If these were unranked sets of keywords, I could simply count the intersection. But I want to take rank into account in some sensible way.

thanks for expert pointers to proven metrics ...


Olga Lyashevskaya

School of Linguistics, Faculty of Humanities
Higher School of Economics, Moscow

