I agree that Kendall's Tau works comparatively well for this task.
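
A minimal sketch of the padding idea from Wladimir's message quoted below, applied to Eric's two example lists. It assumes plain Python and Kendall's tau-a, where tied pairs contribute zero. The function names, the missing_rank=101 default (from "say >100") and the choice of tau-a rather than tau-b are illustrative assumptions, not something fixed in the thread. Jaccard similarity of the two keyword sets, which Wladimir also mentions, is included for comparison.

```python
from itertools import combinations


def padded_ranks(ranked, vocabulary, missing_rank=101):
    """Return the 1-based rank of each vocabulary item, or missing_rank if absent."""
    position = {item: i + 1 for i, item in enumerate(ranked)}
    return [position.get(item, missing_rank) for item in vocabulary]


def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total pairs; ties count as 0."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two items to compare ranks")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def ranked_list_similarity(list_a, list_b, missing_rank=101):
    """Tau-a over the union of both lists, padding absent items with a low rank."""
    vocabulary = sorted(set(list_a) | set(list_b))
    return kendall_tau_a(
        padded_ranks(list_a, vocabulary, missing_rank),
        padded_ranks(list_b, vocabulary, missing_rank),
    )


def jaccard(list_a, list_b):
    """Rank-free overlap: size of the intersection over size of the union."""
    a, b = set(list_a), set(list_b)
    return len(a & b) / len(a | b)


# Eric's two example lists from the quoted message.
GOD_A = ["move", "bless", "and", "forbid", "have", "be", "create", "do", "know", "want"]
GOD_B = ["bless", "blesses", "delusion", "incarnate", "hates", "himself",
         "exists", "almighty", "forbid", "rest"]

print("tau-a  :", ranked_list_similarity(GOD_A, GOD_B))
print("jaccard:", jaccard(GOD_A, GOD_B))
```

Note that with padding, every word that is missing from one list is tied at the same low rank there, so the result depends on how ties are handled; a tie-corrected variant such as tau-b may behave differently on lists with little overlap.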

On 10/14/2016 01:38 PM, Wladimir Sidorenko wrote:
> Hello Eric,
> 
> When I ran into the problem of non-identical lists, I simply added the
> missing terms to the other list with a very low rank (say >100). For
> simply comparing their relative ranks then, you could probably also use
> Kendall's Tau or Goodman and Kruskal's Gamma.  If you also want to
> account for the semantics, then incorporating cosine similarities of the
> respective word vectors in one of these formulae should probably not be
> that difficult.  In one of our projects, I think, we were computing
> several different metrics at once, including Jaccard similarity of two
> extracted keyword sets and the reciprocal of the number of crossing
> edges in the two ranked lists.
> 
> Kind regards,
> Wladimir
> 
> 2016-10-14 12:25 GMT+02:00 Hugo Gonçalo Oliveira <hro...@dei.uc.pt
> <mailto:hro...@dei.uc.pt>>:
> 
>     Hi,
> 
>     Would it make sense to compute the Minimum Edit Distance to
>     transform one list into the other?
> 
>     Best,
>     ----------
>     Hugo Gonçalo Oliveira
>     CISUC, Department of Informatics Engineering, University of Coimbra
>     http://eden.dei.uc.pt/~hroliv <http://eden.dei.uc.pt/%7Ehroliv>
> 
>     On Fri, Oct 14, 2016 at 10:01 AM, Eric Atwell
>     <e.s.atw...@leeds.ac.uk <mailto:e.s.atw...@leeds.ac.uk>> wrote:
> 
>         Thanks Olga, BUT I don't see how Spearman's (or other) rank
>         correlation formula measures overlap between 2 ranked lists
>         containing some different words.
> 
> 
>         For example, what is the similarity between 2 "distributional
>         semantics" representations of God?:
> 
>         (1 move, 2 bless, 3 and, 4 forbid, 5 have, 6 be, 7 create, 8 do,
>         9 know, 10 want) and 
> 
>         (1 bless, 2 blesses, 3 delusion, 4 incarnate, 5 hates, 6
>         himself, 7 exists, 8 almighty, 9 forbid, 10 rest) 
> 
> 
>         A simple measure is a count of the overlapping words (bless,
>         forbid), i.e. this scores 2 (or 2/10, or 2/20)
> 
>         BUT I also want to take into account ranks: (bless 2,1), (forbid
>         4,9) 
> 
> 
>         Does God shed light on my problems?
> 
> 
>         Eric
> 
> 
>         Eric Atwell, Asst Prof, Language@Leeds and Artificial
>         Intelligence groups,
>         School of Computing, University of Leeds, Times University of
>         the Year 2017
> 
>         
> ------------------------------------------------------------------------
>         *From:* Ольга Ляшевская <ole...@yandex.ru <mailto:ole...@yandex.ru>>
>         *Sent:* 14 October 2016 08:09:36
>         *To:* Eric Atwell; CORPORA discussion forum
>         *Cc:* AbdulRahman AlOsaimy; Claire Brierley
>         *Subject:* Re: [Corpora-List] metric of overlap between two
>         ranked lists?
>          
>         Dear Eric,
> 
>         Spearman's rank correlation coefficient will work in this case,
>         I think.
>         https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient
> 
>         Best,
>         Olga
> 
>         14.10.2016, 09:54, "Eric Atwell" <e.s.atw...@leeds.ac.uk
>         <mailto:e.s.atw...@leeds.ac.uk>>:
> 
>         > Is there a standard metric of overlap between two ranked lists?
>         > e.g. to measure/score the similarity between top 10 keywords extracted
>         > using 2 different formulae, such as LL v MI?
>         > OR e.g. to measure/score the similarity between top 10 hits from Google
>         > v top 10 hits from Bing for a given search phrase?
>         > OR e.g. to measure/score the similarity between ranked lists of PoS-tags
>         > predicted for a word by two rival PoS-taggers in an ensemble tagger?
>         >
>         > If these were unranked sets of keywords, I could simply count the
>         > intersection. But I want to take rank into account in some sensible way.
>         >
>         > thanks for expert pointers to proven metrics ...
>         >
>         > Eric Atwell, Asst Prof, Language@Leeds and Artificial Intelligence groups,
>         > School of Computing, University of Leeds, Times University of the Year 2017
>         >
>         > _______________________________________________
>         > UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
>         <http://mailman.uib.no/options/corpora>
>         > Corpora mailing list
>         > Corpora@uib.no <mailto:Corpora@uib.no>
>         > http://mailman.uib.no/listinfo/corpora
>         <http://mailman.uib.no/listinfo/corpora>
> 
>         Olga Lyashevskaya
> 
>         School of Linguistics, Faculty of Humanities
>          Higher School of Economics, Moscow
> 
> 
> 
> 
> 
> 
> 
> 
> 

-- 
Solve et coagula!
Andrey

