I updated my code to use a dissimilarity matrix <http://nbviewer.ipython.org/github/JBPressac/MDS-of-DGG/blob/master/Davis%20Southern%20Women%20MDS%20of%20Jaccard%20coefficient.ipynb>, which gives a better result.
Thank you for your answers,

Jean-Baptiste Pressac

Database processing and analysis
Production and dissemination of digital corpora

Centre de Recherche Bretonne et Celtique
Unité mixte de service (UMS) 3554
20 rue Duquesne
CS 93837
29238 Brest cedex 3

tel : +33 (0)2 98 01 68 95
fax : +33 (0)2 98 01 63 93

On 03/03/2015 11:54, federico vaggi wrote:
Hi,

A dissimilarity is simply a measure of how different a vector A is from a vector B. A metric also has to satisfy four extra rules: http://en.wikipedia.org/wiki/Metric_(mathematics)

Wikipedia makes it more complicated than it has to be, but basically:

 1. d(x, y) ≥ 0 (non-negativity, or separation axiom)
 2. d(x, y) = 0 if and only if x = y (identity of indiscernibles, or
    coincidence axiom)
 3. d(x, y) = d(y, x) (symmetry)
 4. d(x, z) ≤ d(x, y) + d(y, z) (subadditivity / triangle inequality)


The main one is point 4: a lot of dissimilarities do not satisfy the triangle inequality. Point 3 (symmetry) can fail as well.

For example, take a simple relationship like A likes B:

Alice can like Bob, but Bob can dislike Alice

It would however be meaningless to say:

Alice is 2 meters from Bob, but Bob is 1 meter from Alice.

A lot of clustering and scaling algorithms only work if the distance you are working with is a metric. You can sometimes force the algorithm to work with something which isn't a metric, but you will get results which can be a bit absurd, especially if your dissimilarity is particularly badly behaved.
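
To make the four rules concrete, here is a minimal sketch in plain Python that reports which of them a square dissimilarity matrix breaks. The helper name `metric_violations` and the three toy matrices are hypothetical, introduced only for illustration, and the identity check assumes that each row describes a distinct item.

```python
from itertools import permutations


def metric_violations(d, tol=1e-12):
    """Return the sorted names of the metric axioms that matrix d breaks.

    d is a square list of lists; d[i][j] is the dissimilarity from item i
    to item j. Assumption: the rows describe distinct items.
    """
    n = len(d)
    broken = set()
    for i in range(n):
        for j in range(n):
            # Rule 1: no negative dissimilarities.
            if d[i][j] < -tol:
                broken.add("non-negativity")
            # Rule 2: zero on the diagonal, and only there.
            if (i == j) != (abs(d[i][j]) <= tol):
                broken.add("identity of indiscernibles")
            # Rule 3: same value in both directions.
            if abs(d[i][j] - d[j][i]) > tol:
                broken.add("symmetry")
    # Rule 4: going through a third item can never be shorter.
    for i, j, k in permutations(range(n), 3):
        if d[i][k] > d[i][j] + d[j][k] + tol:
            broken.add("triangle inequality")
    return sorted(broken)


# "A likes B" style scores: the two directions disagree.
likes = [[0, 1],
         [2, 0]]

# Squared distances between the points 0, 1, 2 on a line.
squared = [[0, 1, 4],
           [1, 0, 1],
           [4, 1, 0]]

# Plain distances between the same three points.
line = [[0, 1, 2],
        [1, 0, 1],
        [2, 1, 0]]

print(metric_violations(likes))
print(metric_violations(squared))
print(metric_violations(line))
```

The triangle check loops over every ordered triple, so it is cubic in the number of items; that is fine for a small matrix but would need a vectorised version for a large one.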

On Tue, Mar 3, 2015 at 11:42 AM, Jean-Baptiste Pressac <jean-baptiste.pres...@univ-brest.fr> wrote:

    But what could a dissimilarity matrix be in the case of events
    co-attended by people? What is more, in the publication I am trying
    to reproduce, the author ran an MDS on the Jaccard coefficients, and
    my attempt is inspired by code that analyses a matrix of distances
    between European cities
    <http://baoilleach.blogspot.fr/2014/01/convert-distance-matrix-to-2d.html>.

    By the way, I don't understand the meaning of the dissimilarity
    parameter of manifold.MDS. In which case is a "dissimilarity"
    Euclidean? Sorry, I am quite new to some of these concepts.

    Jean-Baptiste Pressac


    On 03/03/2015 11:09, Joel Nothman wrote:
    I think DSW_jaccard_matrix is a matrix of similarity (which is
    what Jaccard usually means) not of dissimilarity. Try negating it
    before MDS.
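
A minimal sketch of the similarity-versus-dissimilarity point, in plain Python. It uses 1 - J rather than plain negation, since -J would give negative values; 1 - J is the usual Jaccard distance. The attendance data and both function names are hypothetical toy examples, not taken from the notebook.

```python
def jaccard_similarity(a, b):
    """Jaccard coefficient |a & b| / |a | b|: 1.0 means identical sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Convention chosen here: two empty sets count as identical.
        return 1.0
    return len(a & b) / len(union)


def jaccard_dissimilarity_matrix(attendance):
    """Return (names, matrix) where matrix[i][j] = 1 - Jaccard similarity."""
    names = sorted(attendance)
    matrix = [
        [1.0 - jaccard_similarity(attendance[p], attendance[q]) for q in names]
        for p in names
    ]
    return names, matrix


# Hypothetical toy data: which events each person attended.
attendance = {
    "Ann": {"E1", "E2", "E3"},
    "Bea": {"E1", "E2"},
    "Cat": {"E4"},
}

names, dissim = jaccard_dissimilarity_matrix(attendance)
for name, row in zip(names, dissim):
    print(name, [round(value, 3) for value in row])
```

People who share most of their events end up with a small value and people with no event in common get 1.0, which is the orientation MDS expects. A matrix like this is the kind of input intended for manifold.MDS with dissimilarity="precomputed" in the scikit-learn releases of the time of this thread.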

    On 3 March 2015 at 20:07, Jean-Baptiste Pressac
    <jean-baptiste.pres...@univ-brest.fr> wrote:

        Hello,
        I tried to reproduce the analysis of events co-attended by
        women via manifold MDS (shared as an IPython Notebook)
        <http://nbviewer.ipython.org/github/JBPressac/MDS-of-DGG/blob/master/Davis%20Southern%20Women%20MDS%20of%20Jaccard%20coefficient.ipynb>,
        but the MDS does not reflect the data. I certainly did
        something wrong, but I couldn't figure out how to do a proper
        MDS. Any clue would be appreciated.
        Thanks,

-- Jean-Baptiste Pressac



        
------------------------------------------------------------------------------
        Dive into the World of Parallel Programming The Go Parallel
        Website, sponsored
        by Intel and developed in partnership with Slashdot Media, is
        your hub for all
        things parallel software development, from weekly thought
        leadership blogs to
        news, videos, case studies, tutorials and more. Take a look
        and join the
        conversation now. http://goparallel.sourceforge.net/
        _______________________________________________
        Scikit-learn-general mailing list
        Scikit-learn-general@lists.sourceforge.net
        <mailto:Scikit-learn-general@lists.sourceforge.net>
        https://lists.sourceforge.net/lists/listinfo/scikit-learn-general




    