Hello Eric,

When I ran into the problem of non-identical lists, I simply added the terms missing from each list to the other one with a very low rank (say, >100). To compare just the relative ranks, you could then probably also use Kendall's Tau or Goodman and Kruskal's Gamma. If you also want to account for semantics, incorporating the cosine similarities of the respective word vectors into one of these formulae should probably not be that difficult. In one of our projects, I think, we computed several different metrics at once, including the Jaccard similarity of the two extracted keyword sets and the reciprocal of the number of crossing edges between the two ranked lists.
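A minimal sketch of the padding idea, using the two "God" lists from Eric's message quoted below. The function names, the tie handling, and the choice of 101 as the rank for missing terms are illustrative assumptions, not the code from our project. Pairs tied in either list are skipped, which is how Goodman and Kruskal's Gamma treats them; Kendall's Tau-a instead divides by all pairs.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of the two term sets; ranks are ignored."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)


def padded_ranks(ranked, vocab, missing_rank=101):
    """Map every term in vocab to its 1-based rank in `ranked`.

    Terms absent from `ranked` all get `missing_rank` (an assumed value,
    following the "say, >100" suggestion above).
    """
    ranks = {term: i for i, term in enumerate(ranked, start=1)}
    return {term: ranks.get(term, missing_rank) for term in vocab}


def pair_counts(a, b, missing_rank=101):
    """Count concordant and discordant term pairs over the union of a and b."""
    vocab = sorted(set(a) | set(b))
    ra = padded_ranks(a, vocab, missing_rank)
    rb = padded_ranks(b, vocab, missing_rank)
    concordant = discordant = 0
    for x, y in combinations(vocab, 2):
        sign = (ra[x] - ra[y]) * (rb[x] - rb[y])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # sign == 0: the pair is tied in at least one list and is skipped
    return concordant, discordant, len(vocab)


def goodman_kruskal_gamma(a, b, missing_rank=101):
    """(C - D) / (C + D); tied pairs do not enter the denominator."""
    c, d, _ = pair_counts(a, b, missing_rank)
    return (c - d) / (c + d) if c + d else 0.0


def tau_a(a, b, missing_rank=101):
    """Kendall's Tau-a: (C - D) divided by the total number of pairs."""
    c, d, n = pair_counts(a, b, missing_rank)
    total = n * (n - 1) // 2
    return (c - d) / total if total else 0.0


list_a = ["move", "bless", "and", "forbid", "have",
          "be", "create", "do", "know", "want"]
list_b = ["bless", "blesses", "delusion", "incarnate", "hates",
          "himself", "exists", "almighty", "forbid", "rest"]

print("Jaccard:", jaccard(list_a, list_b))
print("Gamma:  ", goodman_kruskal_gamma(list_a, list_b))
print("Tau-a:  ", tau_a(list_a, list_b))
```

With only two shared words, both rank scores come out negative here: every pair made of one word found only in the first list and one word found only in the second counts as discordant, so padding penalises non-overlap heavily. Whether that is the behaviour you want depends on the application.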

Kind regards,
Wladimir

2016-10-14 12:25 GMT+02:00 Hugo Gonçalo Oliveira <hro...@dei.uc.pt>:

> Hi,
>
> Would it make sense to compute the Minimum Edit Distance to transform one
> list into the other?
>
> Best,
> ----------
> Hugo Gonçalo Oliveira
> CISUC, Department of Informatics Engineering, University of Coimbra
> http://eden.dei.uc.pt/~hroliv
>
> On Fri, Oct 14, 2016 at 10:01 AM, Eric Atwell <e.s.atw...@leeds.ac.uk>
> wrote:
>
>> Thanks Olga, BUT I don't see how Spearman's (or other) rank correlation
>> formula measures overlap between 2 ranked lists containing some different
>> words.
>>
>> For example, what is the similarity between 2 "distributional semantics"
>> representations of God?:
>>
>> (1 move, 2 bless, 3 and, 4 forbid, 5 have, 6 be, 7 create, 8 do, 9 know,
>> 10 want) and
>>
>> (1 bless, 2 blesses, 3 delusion, 4 incarnate, 5 hates, 6 himself,
>> 7 exists, 8 almighty, 9 forbid, 10 rest)
>>
>> A simple measure involves a count of overlapping words (bless, forbid),
>> i.e. this scores 2 (or 2/10, or 2/20)
>>
>> BUT I also want to take into account ranks: (bless 2,1), (forbid 4,9)
>>
>> Does God shed light on my problems?
>>
>> Eric
>>
>> Eric Atwell, Asst Prof, Language@Leeds and Artificial Intelligence groups,
>> School of Computing, University of Leeds, Times University of the Year 2017
>>
>> ------------------------------
>> *From:* Ольга Ляшевская <ole...@yandex.ru>
>> *Sent:* 14 October 2016 08:09:36
>> *To:* Eric Atwell; CORPORA discussion forum
>> *Cc:* AbdulRahman AlOsaimy; Claire Brierley
>> *Subject:* Re: [Corpora-List] metric of overlap between two ranked lists?
>>
>> Dear Eric,
>>
>> Spearman's rank correlation coefficient will work in this case, I think.
>> https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient
>>
>> Best,
>> Olga
>>
>> 14.10.2016, 09:54, "Eric Atwell" <e.s.atw...@leeds.ac.uk>:
>>
>> > Is there a standard metric of overlap between two ranked lists?
>> > e.g. to measure/score the similarity between top 10 keywords extracted
>> > using 2 different formulae, such as LL v MI?
>> > OR e.g. to measure/score the similarity between top 10 hits from Google
>> > v top 10 hits from Bing for a given search phrase?
>> > OR e.g. to measure/score the similarity between ranked lists of PoS-tags
>> > predicted for a word by two rival PoS-taggers in an ensemble tagger?
>> >
>> > If these were unranked sets of keywords, I could simply count the
>> > intersection. But I want to take rank into account in some sensible way.
>> >
>> > thanks for expert pointers to proven metrics ...
>> >
>> > Eric Atwell, Asst Prof, Language@Leeds and Artificial Intelligence groups,
>> > School of Computing, University of Leeds, Times University of the Year 2017
>> >
>> > _______________________________________________
>> > UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
>> > Corpora mailing list
>> > Corpora@uib.no
>> > http://mailman.uib.no/listinfo/corpora
>>
>> Olga Lyashevskaya
>>
>> School of Linguistics, Faculty of Humanities
>> Higher School of Economics, Moscow

