Re: [Corpora-List] metric of overlap between two ranked lists?

Hi Eric,

Probably not standard in that setting, but I think calculating a symmetrical
average precision, or (normalised) discounted cumulative gain, could work.
Basically take the AP (or nDCG) of ranking1 vs ranking2, then of ranking2 vs
ranking1, and take the average.
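
A minimal sketch of the symmetrical AP idea in Python, run on the two example
lists quoted further down. It assumes binary relevance (every word in the
other list counts as relevant) and normalises by the length of the reference
list; the function names are made up for illustration, not taken from any
library.

```python
def average_precision(ranking, reference):
    """AP of `ranking`, treating every item in `reference` as relevant."""
    relevant = set(reference)
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for position, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            total += hits / position  # precision at this hit
    return total / len(relevant)


def symmetric_ap(list_a, list_b):
    """Average of AP(a vs b) and AP(b vs a), so neither list is privileged."""
    return (average_precision(list_a, list_b)
            + average_precision(list_b, list_a)) / 2


list_a = ["move", "bless", "and", "forbid", "have",
          "be", "create", "do", "know", "want"]
list_b = ["bless", "blesses", "delusion", "incarnate", "hates",
          "himself", "exists", "almighty", "forbid", "rest"]

print(round(symmetric_ap(list_a, list_b), 3))  # 0.111
```

Two identical lists score 1.0 and two disjoint lists score 0.0. Shared words
near the top of either list contribute more than shared words near the bottom.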

Best regards,

--
Sérgio Matos
Research Associate

On 14 October 2016 at 10:08:37, Eric Atwell
(e.s.atw...@leeds.ac.uk) wrote:

Thanks Olga, BUT I don't see how Spearman's (or other) rank correlation formula
measures overlap between 2 ranked lists containing some different words.

For example, what is the similarity between 2 "distributional semantics"
representations of God?:

(1 move, 2 bless, 3 and, 4 forbid, 5 have, 6 be, 7 create, 8 do, 9 know, 10
want) and

(1 bless, 2 blesses, 3 delusion, 4 incarnate, 5 hates, 6 himself, 7 exists, 8
almighty, 9 forbid, 10 rest)

A simple measure involves a count of the overlapping words (bless, forbid),
i.e. this scores 2 (or 2/10, or 2/20).

BUT I also want to take into account ranks: (bless 2,1), (forbid 4,9)
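
The count and the rank pairs above can be reproduced with a short Python
sketch. The helper name `shared_rank_pairs` is hypothetical, and the sketch
assumes no word is repeated within a list.

```python
def shared_rank_pairs(list_a, list_b):
    """Map each word found in both lists to (rank in list_a, rank in list_b)."""
    rank_in_b = {word: rank for rank, word in enumerate(list_b, start=1)}
    return {word: (rank, rank_in_b[word])
            for rank, word in enumerate(list_a, start=1)
            if word in rank_in_b}


list_a = ["move", "bless", "and", "forbid", "have",
          "be", "create", "do", "know", "want"]
list_b = ["bless", "blesses", "delusion", "incarnate", "hates",
          "himself", "exists", "almighty", "forbid", "rest"]

pairs = shared_rank_pairs(list_a, list_b)
print(pairs)                      # {'bless': (2, 1), 'forbid': (4, 9)}
print(len(pairs))                 # 2, the raw overlap count
print(len(pairs) / len(list_a))   # 0.2, i.e. 2/10
```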

Does God shed light on my problems?

Eric

Eric Atwell, Asst Prof, Language@Leeds and Artificial Intelligence groups,
School of Computing, University of Leeds, Times University of the Year 2017
________________________________
From: Ольга Ляшевская <ole...@yandex.ru>
Sent: 14 October 2016 08:09:36
To: Eric Atwell; CORPORA discussion forum
Cc: AbdulRahman AlOsaimy; Claire Brierley
Subject: Re: [Corpora-List] metric of overlap between two ranked lists?

Dear Eric,

Spearman's rank correlation coefficient will work in this case, I think.
https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient
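
One way to apply this to lists that contain different words is sketched below
in Python. It rests on an assumption not stated in this thread: the
coefficient is computed only over the words both lists share, re-ranked
within that shared subset. The function name is made up for illustration, and
the no-ties formula rho = 1 - 6*sum(d^2) / (n*(n^2 - 1)) is used.

```python
def spearman_on_shared(list_a, list_b):
    """Spearman's rho over the items common to both lists (no ties assumed).

    Returns None when fewer than two items are shared, since the
    coefficient is undefined in that case.
    """
    in_b = set(list_b)
    shared = [word for word in list_a if word in in_b]  # in list_a order
    n = len(shared)
    if n < 2:
        return None
    position_in_b = {word: i for i, word in enumerate(list_b)}
    shared_in_b_order = sorted(shared, key=lambda w: position_in_b[w])
    rank_a = {word: r for r, word in enumerate(shared, start=1)}
    rank_b = {word: r for r, word in enumerate(shared_in_b_order, start=1)}
    d_squared = sum((rank_a[w] - rank_b[w]) ** 2 for w in shared)
    return 1 - 6 * d_squared / (n * (n * n - 1))


list_a = ["move", "bless", "and", "forbid", "have",
          "be", "create", "do", "know", "want"]
list_b = ["bless", "blesses", "delusion", "incarnate", "hates",
          "himself", "exists", "almighty", "forbid", "rest"]

print(spearman_on_shared(list_a, list_b))  # 1.0
```

Under this assumption the words that appear in only one list are ignored
entirely, so two lists sharing just two words in the same relative order
score a perfect 1.0.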

Best,
Olga

14.10.2016, 09:54, "Eric Atwell" <e.s.atw...@leeds.ac.uk>:
> Is there a standard metric of overlap between two ranked lists?
> e.g. to measure/score the similarity between top 10 keywords extracted
> using 2 different formulae, such as LL v MI?
> OR e.g. to measure/score the similarity between top 10 hits from Google
> v top 10 hits from Bing for a given search phrase?
> OR e.g. to measure/score the similarity between ranked lists of PoS-tags
> predicted for a word by two rival PoS-taggers in an ensemble tagger?
>
> If these were unranked sets of keywords, I could simply count the
> intersection. But I want to take rank into account in some sensible way.
>
> thanks for expert pointers to proven metrics ...
>
> _______________________________________________
> Corpora mailing list
> Corpora@uib.no
> http://mailman.uib.no/listinfo/corpora

Olga Lyashevskaya

School of Linguistics, Faculty of Humanities
Higher School of Economics, Moscow
_______________________________________________