Re: [R] isoMDS vs. other non-metric non-R routines
Hi Phil,

Are you using metaMDS in the vegan package? It lets you set the number of random starts and selects the best solution. It might help.

Hank Stevens

> Dear Phil,
>
> I don't have experience with Minissa, but I know that isoMDS performs badly in some situations. I have even seen situations with non-metric dissimilarities in which classical MDS was preferable.
>
> Some alternatives that you have:
>
> 1) Try to start isoMDS from other initial configurations (by default, it starts from the classical solution).
> 2) Try Sammon mapping (the command should be sammon).
> 3) Have a look at XGvis/GGvis (which may be part of XGobi/GGobi). These are not directly part of R but have R interfaces. They allow you to toy around quite a lot with different algorithms, stress functions (the isoMDS stress is not necessarily what you want) and initial configurations, so that you can find a better solution and understand your data better.
>
> Unfortunately I don't have the time to give you more detail, but google for it (or somebody else will tell you more).
>
> Best,
> Christian
>
> On Tue, 13 Feb 2007, Philip Leifeld wrote:
>
>> Dear useRs,
>>
>> last week I asked you about a problem related to isoMDS. It turned out that in my case isoMDS was trapped. Nonetheless, I still have some problems with other data sets. Therefore I would like to know if anyone here has experience with how well isoMDS performs in comparison to other non-metric MDS routines, like Minissa.
>>
>> I have the feeling that for large data sets with a high stress value (e.g. around 0.20), in cases where the intrinsic dimensionality of the data cannot be significantly reduced without considerably increasing stress, isoMDS performs worse (and yields a stress value of 0.31 in my example), while solutions tend to be similar for better fits and lower intrinsic dimensionality.
>>
>> I tried this on another data set where isoMDS yields a stress value of 0.19 and Minissa a stress value of 0.14. Now the latter would still be considered a fair solution by some people, while the former indicates a poor fit regardless of how strict your judgment is.
>>
>> I generally prefer using R over mixing different programs, so it would be nice if results were of comparable quality...
>>
>> Cheers
>> Phil
>
> *** --- ***
> Christian Hennig
> University College London, Department of Statistical Science
> Gower St., London WC1E 6BT, phone +44 207 679 1698
> [EMAIL PROTECTED], www.homepages.ucl.ac.uk/~ucakche

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
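The first two alternatives Christian lists can be sketched in a few lines. This is only an illustration under stated assumptions: the built-in swiss data stands in for Phil's dissimilarities, the seed, the 20 starts and the uniform starting coordinates are arbitrary choices, and nothing here reproduces the Minissa comparison.

```r
library(MASS)  # isoMDS() and sammon() both live in MASS

set.seed(1)                      # arbitrary seed, only for repeatability
d <- dist(scale(swiss))          # stand-in data, not the data from the thread
n <- attr(d, "Size")

# Default behaviour: start from the classical (cmdscale) solution.
fit_default <- isoMDS(d, k = 2, trace = FALSE)

# Alternative 1: several random starting configurations, keep the lowest stress.
random_fits <- lapply(1:20, function(i) {
  start <- matrix(runif(n * 2, -1, 1), ncol = 2)   # hypothetical random start
  isoMDS(d, y = start, k = 2, maxit = 200, trace = FALSE)
})
stresses <- vapply(random_fits, function(f) f$stress, numeric(1))
fit_best <- random_fits[[which.min(stresses)]]

# Alternative 2: Sammon mapping, which minimises a different stress function,
# so its stress is not comparable with the isoMDS values.
fit_sammon <- sammon(d, k = 2, trace = FALSE)

c(default = fit_default$stress, best_random = fit_best$stress)
```

Whether a random start beats the classical start depends on the data, so the two numbers are worth comparing rather than assuming either one wins.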
[R] isoMDS vs. other non-metric non-R routines
Sorry for not threading: I don't subscribe to this list, and the linking of web browser and email seems to be rudimentary.

I don't know what Minissa is. Sounds like a piece of software. What is the method it implements? That is, is it supposed to implement the same method as isoMDS or something else? isoMDS implements Kruskal's (and Young's and Shepard's and Torgerson's) NMDS, but there are other methods too. You are supposed to get similar results only with the same method. For instance, there are various definitions of stress, two of them amusingly called stress-1 and stress-2, but there are others.

You didn't give much detail about how you used isoMDS. We already discussed the danger of being trapped in the starting configuration, which you can avoid by trying (several) random starting configurations. Have you used the 'tol' (and 'maxit') arguments in isoMDS? The default 'tol' is rather slack and 'maxit' fairly low, since (speculation) the function was written a long time ago when computers were slow, but if you have something better than a 75MHz i486, you can try other values.

I have used isoMDS quite a lot, and I have had good experience.

Cheers,
Jari Oksanen
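To make the 'tol' and 'maxit' advice concrete, here is a small sketch. It assumes the built-in swiss data as a stand-in and uses arbitrary tighter settings; the defaults noted in the comment are those documented for isoMDS in MASS.

```r
library(MASS)

d <- dist(scale(swiss))   # stand-in data, not the data from the thread

# Documented defaults: maxit = 50, tol = 1e-3.
fit_loose <- isoMDS(d, k = 2, trace = FALSE)

# Tighter convergence tolerance and a much higher iteration limit
# (both values are arbitrary illustrations).
fit_tight <- isoMDS(d, k = 2, maxit = 1000, tol = 1e-7, trace = FALSE)

# Both runs start from the same classical solution, so the second one
# simply continues the same descent for longer.
c(loose = fit_loose$stress, tight = fit_tight$stress)
```

If the two stress values are practically equal, slack convergence settings are probably not the reason for a poor fit.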
Re: [R] isoMDS vs. other non-metric non-R routines
Thanks for your message.

> I don't know what Minissa is. Sounds like a piece of software. What is the method it implements? That is, is it supposed to implement the same method as isoMDS or something else? isoMDS implements Kruskal's (and Young's and Shepard's and Torgerson's) NMDS, but there are other methods too. You are supposed to get similar results only with the same method. For instance, there are various definitions of stress, two of them amusingly called stress-1 and stress-2, but there are others.

Yes, Minissa uses Kruskal's NMDS and stress-1, so results should be comparable.

> You didn't give much detail about how you used isoMDS. We already discussed the danger of being trapped in the starting configuration, which you can avoid by trying (several) random starting configurations. Have you used the 'tol' (and 'maxit') arguments in isoMDS? The default 'tol' is rather slack and 'maxit' fairly low, since (speculation) the function was written a long time ago when computers were slow, but if you have something better than a 75MHz i486, you can try other values.
>
> Cheers,
> Jari Oksanen

This was my initial call:

    mds <- isoMDS(dist, y = cmdscale(dist, k = 2), k = 2, tol = 1e-3, maxit = 500)

I played around a little bit with tol and maxit (adding some zeros...) and increased the number of dimensions, but it did not change the results significantly. Using initMDS did not improve the result either. Unfortunately, my data set is too large to be displayed here.

Any other ideas? My stress value is still 1.5 times as high as in other implementations of NMDS.

Cheers
Phil
[R] isoMDS vs. other non-metric non-R routines
philip.leifeld at uni-konstanz.de wrote:

> This was my initial call:
>
>     mds <- isoMDS(dist, y = cmdscale(dist, k = 2), k = 2, tol = 1e-3, maxit = 500)
>
> I played around a little bit with tol and maxit (adding some zeros...) and increased the number of dimensions, but it did not change the results significantly. Using initMDS did not improve the result either. Unfortunately, my data set is too large to be displayed here. Any other ideas? My stress value is still 1.5 times as high as in other implementations of NMDS.

It is really difficult to believe that isoMDS would work so completely differently from other implementations. I guess you already tried tol=1e-7? After this, a radical trick is to give the Minissa result as the starting configuration, and see if you stay there and get the same stress as Minissa reported. You should. In particular, if you iterate away from the starting configuration, then the starting configuration was not as good as you assumed. If this happens, it would be time to check the data. I assume you have read in dissimilarities from external files, and surprises do happen (it makes sense to check the data anyway).

Increasing the number of dimensions should not give you a solution similar to one that another implementation finds with a lower number of dimensions.

About the problems Christian Hennig mentioned: my interpretation of his message was that he was not concerned about isoMDS in particular but about NMDS in general (but he will correct me if my interpretation was wrong). I can imagine cases where a non-metric solution works badly, in particular with small data sets. However, that should concern all implementations similarly, and it should probably be visible in Shepard plots (see the isoMDS help).

Cheers,
Jari Oksanen
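The "radical trick" and the Shepard plot can be sketched as follows. No Minissa output is available here, so the `external` configuration is a hypothetical stand-in produced by isoMDS itself on the built-in swiss data; with real data it would be the coordinate matrix exported from the other program.

```r
library(MASS)

d <- dist(scale(swiss))   # stand-in data, not the data from the thread

# Hypothetical stand-in for coordinates exported from another NMDS program.
external <- isoMDS(d, k = 2, maxit = 500, tol = 1e-7, trace = FALSE)$points

# Restart isoMDS from that configuration and see whether it stays put.
refit <- isoMDS(d, y = external, k = 2, maxit = 500, tol = 1e-7, trace = FALSE)
moved <- max(abs(refit$points - external))   # largest coordinate shift
c(stress = refit$stress, largest_shift = moved)

# Shepard diagram: configuration distances against dissimilarities,
# with the fitted monotone regression drawn as a step function.
sh <- Shepard(d, refit$points)
if (interactive()) {
  plot(sh, pch = ".", xlab = "Dissimilarity", ylab = "Distance")
  lines(sh$x, sh$yf, type = "S")
}
```

A large shift or a clearly lower stress after the restart would suggest that the imported configuration was not a good optimum for the dissimilarities isoMDS actually received.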
[R] isoMDS - high stress value and strange configuration
Dear R users,

I have a specific question about isoMDS. Imagine the following (fake) distance table:

             hamburg  bremen  berlin  munich  cologne
    hamburg        0     911     982     677      424
    bremen       911       0     293     547      513
    berlin       982     293       0     785      875
    munich       677     547     785       0      375
    cologne      424     513     875     375        0

Now if I try a non-metric multidimensional scaling on these dissimilarities using isoMDS (or metaMDS), the stress value is 6.34. Nevertheless, other programs (e.g. the Minissa routine implemented in UCINet) yield a stress value of 0.00, and the configuration looks completely different.

I tried this with multiple distance matrices: one time UCINet computed a stress value of 0.21 while isoMDS produced a stress of 0.33, and again the configuration was completely different and apparently random (while the configuration in UCINet still made sense).

Here is what I tried:

    isoMDS(cities, y = cmdscale(cities, k = 2), k = 2, maxit = 50)

Please give me a hint on how to improve the results. I suppose the above command is not complete, or something is wrong with it, or maybe the input distances are not in the right format. Btw, the problem does not occur when I use the real distances between these cities, not some other numbers, so apparently three-digit numbers should be fine as input values?

Thanks!
Phil
[R] isoMDS - high stress value and strange configuration
> I have a specific question about isoMDS. Imagine the following (fake) distance table:
>
>              hamburg  bremen  berlin  munich  cologne
>     hamburg        0     911     982     677      424
>     bremen       911       0     293     547      513
>     berlin       982     293       0     785      875
>     munich       677     547     785       0      375
>     cologne      424     513     875     375        0
>
> Now if I try a non-metric multidimensional scaling on these dissimilarities using isoMDS (or metaMDS), the stress value is 6.34. Nevertheless, other programs (e.g. the Minissa routine implemented in UCINet) yield a stress value of 0.00, and the configuration looks completely different.

This indeed seems to be a case where NMDS is trapped in its starting configuration. Metric scaling (cmdscale) produces a cute horseshoe, but the best NMDS solution looks completely different. Any small change from the initial solution leads to a worse configuration, and you need a bigger change in the beginning. Using a random configuration seems to help:

    isoMDS(dis, initMDS(dis))
    initial  value 36.383132
    iter   5 value 28.671652
    iter  10 value 16.711327
    iter  15 value 6.392572
    iter  20 value 3.007208
    final  value 0.00
    converged
    $points
                  [,1]      [,2]
    hamburg  29.428121 -36.07858
    bremen    2.740499  32.38745
    berlin    1.984215  35.35429
    munich  -16.910941 -14.13750
    cologne -13.844187 -15.24468

    $stress
    [1] 1.56159e-14

In this case I generated the random configuration using the function initMDS of vegan, but you can do that equally well in any other way.

Another point (which does not matter so much here) is that isoMDS multiplies stress by 100, so that your stress of 6 would correspond to 0.06 in some other software (assuming they use the same stress).

cheers, jari oksanen
--
Jari Oksanen [EMAIL PROTECTED]
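The same experiment can be repeated without vegan. The sketch below rebuilds the fake distance table from the thread and draws the random starts with rnorm(); the seed, the number of starts and the object names are arbitrary assumptions, and no particular stress value is claimed.

```r
library(MASS)

cities <- c("hamburg", "bremen", "berlin", "munich", "cologne")
m <- matrix(c(  0, 911, 982, 677, 424,
              911,   0, 293, 547, 513,
              982, 293,   0, 785, 875,
              677, 547, 785,   0, 375,
              424, 513, 875, 375,   0),
            nrow = 5, byrow = TRUE, dimnames = list(cities, cities))
dis <- as.dist(m)

set.seed(42)   # arbitrary seed, only for repeatability
fits <- lapply(1:25, function(i) {
  start <- matrix(rnorm(5 * 2), ncol = 2)   # random starting configuration
  isoMDS(dis, y = start, k = 2, maxit = 200, trace = FALSE)
})
stress <- vapply(fits, function(f) f$stress, numeric(1))
best <- fits[[which.min(stress)]]

# isoMDS reports stress in percent; divide by 100 to compare with programs
# that report it as a proportion.
best$stress / 100
best$points
```

Only the rank order of the distances matters for the fit, so the absolute scale of the resulting coordinates is not meaningful.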
Re: [R] isoMDS and 0 distances
Short answer: you cannot compare distances including NAs, so there is no way to find a monotone mapping of distances.

If the data really are identical for two rows, you can easily drop one of them whilst doing MDS, and then assign the position found for one to the other.

On Tue, 18 Apr 2006, Tyler Smith wrote:

> Hi,
>
> I'm trying to do a non-metric multidimensional scaling using isoMDS. However, I have some '0' distances in my data, and I'm not sure how to deal with them. I'd rather not drop rows from the original data, as I am comparing several datasets (morphology and molecular data) for the same individuals, and it's interesting to see how much morphological variation can be associated with an identical genotype.
>
> I've tried replacing the 0's with NA, but isoMDS appears to stop on the first iteration and the stress does not improve:
>
>     distA # A dist object with 13695 elements, 4 of which == 0
>     cmdsA <- cmdscale(distA, k=2)
>     distB <- distA
>     distB[which(distB==0)] <- NA
>     isoA <- isoMDS(distB, cmdsA)
>     initial  value 21.835691
>     final  value 21.835691
>     converged
>
> The other approach I've tried is replacing the 0's with small numbers. In this case isoMDS does reduce the stress values.
>
>     min(distA[which(distA>0)])
>     [1] 0.02325581
>     distC <- distA
>     distC[which(distC==0)] <- 0.001
>     isoC <- isoMDS(distC)
>     initial  value 21.682854
>     iter   5 value 16.862093
>     iter  10 value 16.451800
>     final  value 16.339224
>     converged
>
> So my questions are: what am I doing wrong in the first example? Why does isoMDS converge without doing anything? Is replacing the 0's with small numbers an appropriate alternative?
>
> Thanks for your time,
> Tyler
>
> R 2.2.1

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
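The drop-and-reassign approach from the short answer can be sketched like this. The data matrix X, its two duplicated rows and the `key` helper are all hypothetical; Tyler's real data are not available.

```r
library(MASS)

set.seed(3)                              # arbitrary seed
X <- matrix(rnorm(20 * 4), ncol = 4)     # hypothetical data, 20 individuals
X <- rbind(X, X[c(2, 7), ])              # rows 21 and 22 duplicate rows 2 and 7

# One text key per row, so identical rows can be matched to each other.
key  <- apply(X, 1, paste, collapse = "\r")
keep <- !duplicated(key)

# Run the MDS on the unique rows only, so no zero distances remain.
fit <- isoMDS(dist(X[keep, ]), k = 2, trace = FALSE)

# Give every original row the position found for its first occurrence.
pts <- fit$points[match(key, key[keep]), , drop = FALSE]
dim(pts)
```

As noted elsewhere in this thread, dropping duplicates changes the implicit weighting of those points, and isoMDS has no weights argument to compensate.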
Re: [R] isoMDS and 0 distances
On Tue, 2006-04-18 at 22:06 -0400, Tyler Smith wrote:

> I'm trying to do a non-metric multidimensional scaling using isoMDS. However, I have some '0' distances in my data, and I'm not sure how to deal with them. [...]
>
> So my questions are: what am I doing wrong in the first example? Why does isoMDS converge without doing anything? Is replacing the 0's with small numbers an appropriate alternative?

Tyler,

My experience is that isoMDS *may* fail to move away from the starting configuration if there are identical values in the initial configuration, and this will happen if you use cmdscale() to get the initial configuration. You *may* get over this by shifting duplicates a bit:

    con <- cmdscale(dis)
    dups <- duplicated(con)
    sum(dups)
    [1] 2
    con[dups, ] <- con[dups, ] + runif(2*sum(dups), -0.01, 0.01)

Then isoMDS may go further.

Another issue is that, at a quick look, isoMDS() seems to do nothing sensible with missing values, although it accepts them. The only thing is that they are ordered last, or regarded as very long distances (in your case they should rather be regarded as very short distances). The key lines in isoMDS are:

    ord <- order(dis)
    nd <- sum(!is.na(ord))

Even when 'dis' has missing values, the result of order() ('ord') has no missing values, but with the default argument na.last=TRUE they are put last in the list. An obvious-looking change would be to replace the second line with:

    nd <- sum(!is.na(dis))

but this dumps the core of R, at least on my machine: probably you need the full length of the vectors in addition to the number of non-missing entries. (This quick look was based on the latest release version of MASS/VR: there may be a newer version already with the upcoming R release, but that's not released yet.)

You may check working with NA: are duplicate points identical in the results?

Then about replacing zero distances with a tiny number: this has been discussed before on this list, and Ripley said no, no! I do it all the time, but only in secrecy. A suggested solution was to drop duplicates, but then there still is a weighting issue, and isoMDS does not have a weights argument.

cheers, jari oksanen
--
Jari Oksanen -- Dept Biology, Univ Oulu, 90014 Oulu, Finland
email [EMAIL PROTECTED], homepage http://cc.oulu.fi/~jarioksa/
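The behaviour of order() described above is easy to check on a toy vector. The four values below are made up for illustration; only base R is used.

```r
# Toy dissimilarity vector with one missing value (hypothetical numbers).
dis <- c(0.8, NA, 0.3, 0.5)

ord <- order(dis)       # na.last = TRUE is the default
ord                     # index of the NA entry comes last
dis[ord]                # sorted values, NA at the end

# The two counts discussed in the message:
n_from_ord <- sum(!is.na(ord))   # counts every entry, because ord has no NA
n_from_dis <- sum(!is.na(dis))   # counts only the non-missing dissimilarities
c(n_from_ord = n_from_ord, n_from_dis = n_from_dis)
```

Because the index vector never contains NA, counting non-missing values in 'ord' cannot tell missing dissimilarities apart from real ones, which is consistent with them being treated as the longest distances.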
Re: [R] isoMDS and 0 distances
On Wed, 2006-04-19 at 07:46 +0100, Prof Brian Ripley wrote:

> Short answer: you cannot compare distances including NAs, so there is no way to find a monotone mapping of distances.

The original Kruskal-Young-Shepard-Torgerson programme KYST (version 1 from 1973) could handle missing values. Unfortunately I've lost the documents, but if I remember correctly, the argument was that you need only a subset (representative of the points) of the (dis)similarities to get a monotone regression. KYST -- and computers of that time (I used a Burroughs!) -- had limitations on data size, and removing some of the dissimilarities was a way of getting more than 64 data points into the analysis. However, better not go into details since:

    C     THIS INFORMATION IS PROPRIETARY AND IS THE
    C     PROPERTY OF BELL TELEPHONE LABORATORIES,
    C     INCORPORATED. ITS REPRODUCTION OR DISCLOSURE
    C     TO OTHERS, EITHER ORALLY OR IN WRITING, IS
    C     PROHIBITED WITHOUT WRITTEN PERMISSION OF
    C     BELL LABORATORIES.
    C     KYST-2A AUGUST, 1977

cheers, jari oksanen
--
Jari Oksanen -- Dept Biology, Univ Oulu, 90014 Oulu, Finland
email [EMAIL PROTECTED], homepage http://cc.oulu.fi/~jarioksa/
Re: [R] isoMDS and 0 distances
About replacing the zeroes with tiny numbers: isoMDS works with the rankings of the distances. Therefore replacing zeroes by tiny values gives them a rank above the real zeroes (distance to the same observation) and below all the non-zero distances. If this makes sense in your application (in my experience it usually does), you can do it.

Sometimes the classical MDS solution is a local optimum of the isoMDS criterion. In these cases isoMDS converges in one step (or rather, it gives you the classical MDS solution). This may happen with and without zero or NA distances.

Best,
Christian

On Tue, 18 Apr 2006, Tyler Smith wrote:

> Hi, I'm trying to do a non-metric multidimensional scaling using isoMDS. However, I have some '0' distances in my data, and I'm not sure how to deal with them. [...]

*** --- ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
[EMAIL PROTECTED], www.homepages.ucl.ac.uk/~ucakche
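The ranking argument can be verified on a toy vector. The sketch below reuses two numbers from the thread (the smallest positive distance 0.02325581 and the replacement value 0.001); the remaining values are made up.

```r
# Toy vector of dissimilarities between different observations
# (two of them are zero; values other than 0.02325581 are hypothetical).
d   <- c(0, 0.5, 0.02325581, 1.2, 0)
eps <- 0.001

# The replacement must stay below the smallest positive dissimilarity,
# otherwise the rank order of the real data would change.
smallest_positive <- min(d[d > 0])
eps < smallest_positive

d2 <- replace(d, d == 0, eps)

# Ranks before and after the replacement are the same.
rank(d)
rank(d2)
same_ranks <- identical(rank(d), rank(d2))
same_ranks
```

The former zeros remain tied with each other and below every other value, so a rank-based fit sees the same ordering; what changes is only that they are no longer exactly zero.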
Re: [R] isoMDS and 0 distances
Thanks all!

From Christian's explanation I think I will be all right adding small values to my zero distances. In my application my distances are limited by the number of primer pairs I use, and it is reasonable to expect that adding primer pairs would eventually reveal some small genetic difference among plants collected from locations many hundreds of miles apart.

I've also found that using Jari's metaMDSiter() function from the vegan package gets me out of the local minimum traps that troubled me earlier.

Cheers,
Tyler

Christian Hennig wrote:

> About replacing the zeroes with tiny numbers: isoMDS works with the rankings of the distances. Therefore replacing zeroes by tiny values gives them a rank above the real zeroes (distance to the same observation) and below all the non-zero distances. If this makes sense in your application (in my experience it usually does), you can do it. [...]
RE: [R] isoMDS
On Wed, 2004-09-08 at 21:31, Doran, Harold wrote:

> Thank you. Quick clarification. isoMDS only works with dissimilarities. Converting my similarity matrix into the dissimilarity matrix is done as (from an email I found in the archives)
>
>     d <- max(tt) - tt
>
> where tt is the similarity matrix. With this, I tried isoMDS as follows:
>
>     tt.mds <- isoMDS(d)
>
> and I get the following error message.
>
>     Error in isoMDS(d) : An initial configuration must be supplied with NA/Infs in d.
>
> I was a little confused on exactly how to specify this initial config. So, from here I ran cmdscale on d as

This error message is quite informative: you have either missing or non-finite entries in your data. The only surprising thing here is that cmdscale works: it should fail, too. Are you sure that you haven't done anything with your data matrix in between, like changing it from a matrix to a dist object? If the Inf/NaN/NA values are on the diagonal, they will magically disappear with as.dist. Anyway, if you're able to get a metric scaling result, you can manually feed that into isoMDS as the initial configuration, and avoid the check. See ?isoMDS.

>     d.mds <- cmdscale(d)
>
> which seemed to work fine and produce reasonable results. I was able to take the coordinates and run them through a k-means cluster and the results seemed to correctly match the grouping structure I created for this sample analysis. Cmdscale is for metric scaling, but it seemed to produce the results correctly.
>
> So, did I correctly convert the similarity matrix to the dissimilarity matrix? Second, should I have used cmdscale rather than isoMDS as I have done? Or, is there a way to specify the initial configuration that I have not done correctly?

If you don't know whether you should use isoMDS or cmdscale, you probably should use cmdscale. If you know, things are different. Probably isoMDS gives you `better'(TM) results, but it is more complicated to handle.

cheers, jari oksanen
--
Jari Oksanen -- Dept Biology, Univ Oulu, 90014 Oulu, Finland
Ph. +358 8 5531526, cell +358 40 5136529, fax +358 8 5531061
email [EMAIL PROTECTED], homepage http://cc.oulu.fi/~jarioksa/
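The workflow discussed in this exchange (convert similarities, check for non-finite values, pass a metric solution as the initial configuration) might look as follows. The similarity matrix here is a hypothetical correlation matrix generated from random data, since the original tt is not available.

```r
library(MASS)

set.seed(7)                                # arbitrary seed
X  <- matrix(rnorm(15 * 5), nrow = 15)     # hypothetical raw data
tt <- cor(t(X))                            # hypothetical 15 x 15 similarity matrix

# Conversion quoted in the thread; as.dist() keeps only the lower triangle,
# so the diagonal plays no role afterwards.
d <- as.dist(max(tt) - tt)

# Check for the problems the error message refers to before calling isoMDS.
all_finite   <- all(is.finite(d))
all_positive <- all(d > 0)
c(all_finite = all_finite, all_positive = all_positive)

# Metric scaling as the explicit initial configuration for isoMDS.
init   <- cmdscale(d, k = 2)
tt.mds <- isoMDS(d, y = init, k = 2, trace = FALSE)
tt.mds$stress
```

If `all_finite` is FALSE on real data, the offending entries are worth tracing back to the similarity matrix rather than working around them with a supplied configuration.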
Re: [R] isoMDS
On Thu, 2004-09-09 at 04:53, Kjetil Brinchmann Halvorsen wrote:

> Mardia, Kent & Bibby defines the standard transformation from a
> similarity matrix to a dissimilarity (distance) matrix by
>
> d_rs <- sqrt( c_rr -2*c_rs + c_ss)
>
> where c_rs are the similarities. This assures the diagonal of the
> dissimilarity matrix to be zero. You could try that.

In R notation, this would be

sim2dist <- function(x) as.dist(sqrt(outer(diag(x), diag(x), "+") - 2*x))

Mardia, Kent & Bibby indeed say in passing that this is a `standard transformation' (page 403). However, it is really a canonical way only if diagonal elements in the similarity matrix are sums of squares, and off-diagonal elements are cross products. In that case the `standard transformation' gives you Euclidean distances (or if you have variances/covariances or ones/correlations it gives you something similar). However, it is not standard if your similarities are something else, and cannot be transformed into Euclidean distances.

However, in isoMDS this *may* not matter, since NMDS uses only the rank order of dissimilarities, and any transformation giving dissimilarities in the same rank order *may* give similar results. The statement was conditional (may), since isoMDS uses cmdscale for the starting configuration, and cmdscale will give different results with different transformations. So isoMDS may stop in different (local) optima. Setting the `tol' parameter low enough in isoMDS (see ?isoMDS) helped in a couple of cases I tried, and the results were practically identical with different transformations.

So it doesn't matter too much how you change your similarities to dissimilarities, since isoMDS indeed treats them as dissimilarities (but cmdscale treats them as distances).

cheers, jari oksanen
--
J.Oksanen, Oulu, Finland.
Object-oriented programming is an exceptionally bad idea which could only have originated in California. E. Dijkstra
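As a rough check of the rank-order argument, the sketch below applies the transformation to the 8-item similarity matrix posted in this thread and compares it with the simpler max(s) - s conversion. The object names s, d1 and d2 are made up here, and the comparison only says something about this particular matrix, whose diagonal is constant.

```r
# Mardia, Kent & Bibby: d_rs = sqrt(c_rr - 2*c_rs + c_ss)
sim2dist <- function(x) {
  x <- as.matrix(x)
  as.dist(sqrt(outer(diag(x), diag(x), "+") - 2 * x))
}

# 8-item similarity matrix from the thread
s <- matrix(c(4,4,2,4,1,2,0,0,
              4,4,2,4,1,2,0,0,
              2,2,4,2,3,2,2,1,
              4,4,2,4,1,2,0,0,
              1,1,3,1,4,3,3,2,
              2,2,2,2,3,4,2,1,
              0,0,2,0,3,2,4,3,
              0,0,1,0,2,1,3,4),
            nrow = 8, byrow = TRUE,
            dimnames = list(letters[1:8], letters[1:8]))

d1 <- sim2dist(s)          # sqrt(8 - 2*s) off the diagonal
d2 <- as.dist(max(s) - s)  # 4 - s

# Both are decreasing functions of s, so the rank orders coincide.
same_ranks <- identical(rank(as.vector(d1)), rank(as.vector(d2)))
same_ranks
```

With a non-constant diagonal the two conversions need not agree in rank order, so this is an illustration rather than a general result.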
RE: [R] isoMDS
Thank you. I use the same matrix on cmdscale as I did with isoMDS. I have reproduced my steps below for clarification if this happens to shed any light.

Here is the original total matrix (see opening thread if you care how this is created)

  a b c d e f g h
a 4 4 2 4 1 2 0 0
b 4 4 2 4 1 2 0 0
c 2 2 4 2 3 2 2 1
d 4 4 2 4 1 2 0 0
e 1 1 3 1 4 3 3 2
f 2 2 2 2 3 4 2 1
g 0 0 2 0 3 2 4 3
h 0 0 1 0 2 1 3 4

So, there are 8 items. This matrix indicates that items 1, 2, and 4 were always grouped together (or viewed as being similar by individuals). I transformed this using

tt <- max(t)-t

which results in

  a b c d e f g h
a 0 0 2 0 3 2 4 4
b 0 0 2 0 3 2 4 4
c 2 2 0 2 1 2 2 3
d 0 0 2 0 3 2 4 4
e 3 3 1 3 0 1 1 2
f 2 2 2 2 1 0 2 3
g 4 4 2 4 1 2 0 1
h 4 4 3 4 2 3 1 0

When I run isoMDS on this new matrix, it tells me to specify the initial config because of the NA/Infs. But when I perform cmdscale on this same matrix I end up with the following results,

bt <- cmdscale(tt); bt
         [,1]       [,2]
a -1.79268634 -0.2662750
b -1.79268634 -0.2662750
c -0.02635497  0.5798934
d -1.79268634 -0.2662750
e  1.08978620  0.6265313
f -0.02635497  0.5798934
g  2.20852966  0.2828937
h  2.13245309 -1.2703869

The results suggest that items 1, 2, and 4 have similar locations as is expected. Also items 3 and 6 have similar locations as would also be expected. So, my results seem to have been replicated correctly using cmdscale.

I've tried to specify an initial config using isoMDS in a few ways without success, so I am surely doing something wrong. So far, I have tried the following:

ll <- isoMDS(tt, y=cmdscale(tt))

which tells me zero or negative distance between objects 1 and 2

ll <- isoMDS(tt, y=cmdscale(tt, k=2))

Again, thanks,
Harold

-----Original Message-----
From: Jari Oksanen [mailto:[EMAIL PROTECTED]
Sent: Thu 9/9/2004 4:26 AM
To: Doran, Harold
Cc: Prof Brian Ripley; R-News
Subject: RE: [R] isoMDS

--- snip ---
RE: [R] isoMDS
On Thu, 2004-09-09 at 14:25, Doran, Harold wrote:

> Thank you. I use the same matrix on cmdscale as I did with isoMDS. I
> have reproduced my steps below for clarification if this happens to
> shed any light.

--- snip ---

Doran,

Your data clarified things. It seems to me now that your data are not a matrix but a data.frame. A problem for an ordinary user is that data.frames and matrices look identical, but that's only on the surface: you shouldn't be shallow but look deep into their souls to see that they are completely different, and therefore isoMDS fails. At least isoMDS gives just that error for a data.frame, but cmdscale casts a data.frame to a matrix, and therefore it works. So the following should work (worked when I tried):

tt <- as.matrix(tt)
isoMDS(tt)

(and you could go down to a dist object with tt <- as.dist(tt), which seems to handle data.frames directly, too).

Then you will still need to avoid the complaint about zero distances among points. This means that you have some identical points in your data, and isoMDS does not like them. This issue was discussed here in April, 2004 (and many other times). Search archives for the subject question on isoMDS.

cheers, jari oksanen
--
Jari Oksanen [EMAIL PROTECTED]
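The two steps described above (coerce the data.frame, then deal with identical items) can be sketched roughly as follows, using the 8-item matrix from the thread. The names tt_df, zero_pairs, keep and d_u are hypothetical, and dropping duplicated rows is only one possible way to remove zero distances; it works here because the zero distances come from identical rows.

```r
# 8-item similarity matrix from the thread
s <- matrix(c(4,4,2,4,1,2,0,0,
              4,4,2,4,1,2,0,0,
              2,2,4,2,3,2,2,1,
              4,4,2,4,1,2,0,0,
              1,1,3,1,4,3,3,2,
              2,2,2,2,3,4,2,1,
              0,0,2,0,3,2,4,3,
              0,0,1,0,2,1,3,4),
            nrow = 8, byrow = TRUE,
            dimnames = list(letters[1:8], letters[1:8]))

# A data.frame prints like a matrix but is a different kind of object.
tt_df <- as.data.frame(max(s) - s)
is.matrix(tt_df)   # FALSE

tt <- as.matrix(tt_df)
is.matrix(tt)      # TRUE

# Which pairs of distinct items are at distance zero?
zero_pairs <- which(tt == 0 & upper.tri(tt), arr.ind = TRUE)
zero_pairs         # the pairs among items a, b and d

# Keep one representative of each set of identical rows.
keep <- !duplicated(tt)
d_u  <- as.dist(tt[keep, keep])
d_u                # strictly positive dissimilarities, ready for MASS::isoMDS()
```

The dropped items can be given the coordinates of their representative after scaling, since they are indistinguishable in the data.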
RE: [R] isoMDS
I get the following message:

Error in isoMDS(tt) : zero or negative distance between objects 1 and 2

This makes sense since a and b are identical in their relationship to c to h. Drop row 1 and col 1 and you get

isoMDS(tt[2:8,2:8])
initial  value 14.971992
iter   5 value 8.027815
iter  10 value 4.433377
iter  15 value 3.496364
iter  20 value 3.346726
final  value 3.233738
converged
$points
           [,1]       [,2]
[1,] -2.3143653 -0.1259226
[2,] -0.3205746 -1.1534662
[3,] -2.8641922 -0.1182906
[4,]  0.7753674  0.1497328
[5,] -0.5705552  1.2416843
[6,]  2.2305175 -0.6995917
[7,]  3.0638025  0.7058540

$stress
[1] 3.233738

Does this help?

-----Original Message-----
From: Doran, Harold [mailto:[EMAIL PROTECTED]
Sent: September 9, 2004 8:26 AM
To: Jari Oksanen
Cc: Doran, Harold; Prof Brian Ripley; R-News
Subject: RE: [R] isoMDS

--- snip ---
[R] isoMDS
Dear List:

I have a question regarding an MDS procedure that I am accustomed to using. I have searched around the archives a bit and the help doc and still need a little assistance. The package isoMDS is what I need to perform the non-metric scaling, but I am working with similarity matrices, not dissimilarities. The question may end up being resolved simply.

Here is a bit of substantive background. I am working on a technique where individuals organize items based on how similar they perceive the items to be. For example, assume there are 10 items. Person 1 might group items 1,2,3,4,5 in group 1 and the others in group 2. I then turn this grouping into a binomial similarity matrix. The following is a sample matrix for Person 1 based on this hypothetical grouping. The off diagonals are the similar items with the 1's representing similarities.

  a b c d e f g h i j
a 1 1 1 1 1 0 0 0 0 0
b 1 1 1 1 1 0 0 0 0 0
c 1 1 1 1 1 0 0 0 0 0
d 1 1 1 1 1 0 0 0 0 0
e 1 1 1 1 1 0 0 0 0 0
f 0 0 0 0 0 1 1 1 1 1
g 0 0 0 0 0 1 1 1 1 1
h 0 0 0 0 0 1 1 1 1 1
i 0 0 0 0 0 1 1 1 1 1
j 0 0 0 0 0 1 1 1 1 1

Each of these individual matrices is summed over individuals. So, in this summed matrix diagonal elements represent the total number of participants and the off-diagonals represent the number of times an item was viewed as being similar by members of the group (obviously the matrix is symmetric below the diagonal). So, a 4 in row 'a' column 'c' means that these items were viewed as being similar by 4 people.

A sample total matrix is at the bottom of this email describing the perceived similarities of 10 items across 4 individuals. It is this total matrix that I end up working with in the MDS. I have previously worked in systat where I run the MDS and specify the matrix as a similarity matrix. I then take the resulting data from the MDS and perform a k-means cluster analysis to identify which items belong to a particular cluster, centroids, etc.

So, here are my questions.

1) Can isoMDS work only with dissimilarities? Or, is there a way that it can perform the analysis on the similarity matrix as I have described it?

2) If I cannot perform the analysis on the similarity matrix, how can I turn this matrix into a dissimilarity matrix necessary? I am less familiar with this matrix and how it would be constructed?

Thanks for any help offered,
Harold

  a b c d e f g h i j
a 4 2 4 3 3 2 0 0 0 0
b 2 4 2 3 1 0 2 2 2 2
c 4 2 4 3 3 2 0 0 0 0
d 3 3 3 4 2 1 1 1 1 1
e 3 1 3 2 4 3 1 1 1 1
f 2 0 2 1 3 4 2 2 2 2
g 0 2 0 1 1 2 4 4 4 4
h 0 2 0 1 1 2 4 4 4 4
i 0 2 0 1 1 2 4 4 4 4
j 0 2 0 1 1 2 4 4 4 4
Re: [R] isoMDS
On Wed, 8 Sep 2004, Doran, Harold wrote:

> 1) Can isoMDS work only with dissimilarities? Or, is there a way that
> it can perform the analysis on the similarity matrix as I have
> described it?

Yes, only with dissimilarities. That holds for the method as well as for the function in package MASS. All other MDS packages are doing a conversion, probably without telling you how.

> 2) If I cannot perform the analysis on the similarity matrix, how can
> I turn this matrix into a dissimilarity matrix necessary? I am less
> familiar with this matrix and how it would be constructed?

Normally similarities are in the range [0,1], and people use D = 1 - S or sqrt(1-S). (Which does not matter for isoMDS since it only uses ranks of dissimilarities, apart from finding the starting configuration.) See the references on the help page for isoMDS.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
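A small sketch of the two conversions mentioned above, applied to the a-d corner of the summed similarity matrix from the original question. Dividing by the maximum count to bring the similarities into [0,1] is an assumption made for the illustration, and the names s, S, D1 and D2 are made up.

```r
# Top-left 4 x 4 corner (items a-d) of the summed similarity matrix
s <- matrix(c(4,2,4,3,
              2,4,2,3,
              4,2,4,3,
              3,3,3,4),
            nrow = 4, byrow = TRUE,
            dimnames = list(letters[1:4], letters[1:4]))

S <- s / max(s)             # counts rescaled to [0, 1]

D1 <- as.dist(1 - S)        # D = 1 - S
D2 <- as.dist(sqrt(1 - S))  # D = sqrt(1 - S)

# sqrt() is monotone, so both versions order the pairs identically;
# only the starting configuration from cmdscale could differ.
same_order <- identical(rank(as.vector(D1)), rank(as.vector(D2)))
same_order                  # TRUE for these data
```

Note that items a and c end up at dissimilarity zero under either conversion, which isoMDS is reported elsewhere in this thread to reject.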
RE: [R] isoMDS
Distances cannot always be constructed from similarities. This can be done only if the matrix of similarities is nonnegative definite. With the nonnegative definite condition, and with the maximum similarity scaled so that s_ii = 1,

d_ik = (2*(1 - s_ik))^.5

Check out the vegan package.

Alex

-----Original Message-----
From: Doran, Harold [mailto:[EMAIL PROTECTED]
Sent: September 8, 2004 10:00 AM
To: [EMAIL PROTECTED]
Cc: Doran, Harold
Subject: [R] isoMDS

--- snip ---
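One way to check the nonnegative definite condition and to see why d_ik = sqrt(2*(1 - s_ik)) then behaves like a Euclidean distance is sketched below for the a-d corner of the summed matrix. The tolerance of 1e-8 and the names S, ev, X and d are assumptions made for the illustration.

```r
# Top-left 4 x 4 corner (items a-d) of the summed similarity matrix
s <- matrix(c(4,2,4,3,
              2,4,2,3,
              4,2,4,3,
              3,3,3,4),
            nrow = 4, byrow = TRUE,
            dimnames = list(letters[1:4], letters[1:4]))

S <- s / max(diag(s))   # scale so that s_ii = 1

# Nonnegative definite means no eigenvalue is below zero
# (up to rounding error, hence the small tolerance).
ev <- eigen(S, symmetric = TRUE)
is_nnd <- all(ev$values > -1e-8)
is_nnd

# Distances implied by the similarities
d <- sqrt(2 * (1 - S))

# If S = X X', then |x_i - x_k|^2 = s_ii + s_kk - 2 s_ik = 2 (1 - s_ik),
# so the rows of X reproduce d as ordinary Euclidean distances.
X <- ev$vectors %*% diag(sqrt(pmax(ev$values, 0)))
euclidean_ok <- isTRUE(all.equal(as.vector(dist(X)),
                                 as.vector(as.dist(d)),
                                 tolerance = 1e-6))
euclidean_ok
```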
RE: [R] isoMDS
On Wed, 8 Sep 2004, Hanke, Alex wrote:

> Distances cannot always be constructed from similarities. This can be
> done only if the matrix of similarities is nonnegative definite. With
> the nonnegative definite condition, and with the maximum similarity
> scaled so that s_ii = 1, d_ik = (2*(1 - s_ik))^.5

But isoMDS works with dissimilarities not distances.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
RE: [R] isoMDS
I don't understand. If isoMDS does not work with distances, why does the help for isoMDS indicate that the "Data are assumed to be dissimilarities or relative distances"? Equally confusing is the loose use of the terms dissimilarities and distances in the literature. As you point out in your book, "Distances are often called dissimilarities".

-----Original Message-----
From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]
Sent: September 8, 2004 11:58 AM
To: Hanke, Alex
Cc: 'Doran, Harold'; '[EMAIL PROTECTED]'
Subject: RE: [R] isoMDS

--- snip ---
RE: [R] isoMDS
Thank you. Quick clarification. isoMDS only works with dissimilarities. Converting my similarity matrix into the dissimilarity matrix is done as (from an email I found on the archives)

d <- max(tt)-tt

Where tt is the similarity matrix. With this, I tried isoMDS as follows:

tt.mds <- isoMDS(d)

and I get the following error message.

Error in isoMDS(d) : An initial configuration must be supplied with NA/Infs in d.

I was a little confused on exactly how to specify this initial config. So, from here I ran cmdscale on d as

d.mds <- cmdscale(d)

which seemed to work fine and produce reasonable results. I was able to take the coordinates and run them through a k-means cluster and the results seemed to correctly match the grouping structure I created for this sample analysis.

Cmdscale is for metric scaling, but it seemed to produce the results correctly. So, did I correctly convert the similarity matrix to the dissimilarity matrix? Second, should I have used cmdscale rather than isoMDS as I have done? Or, is there a way to specify the initial configuration that I have not done correctly.

Again, many thanks.
Harold

-----Original Message-----
From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]
Sent: Wednesday, September 08, 2004 9:58 AM
To: Doran, Harold
Cc: [EMAIL PROTECTED]
Subject: Re: [R] isoMDS

--- snip ---
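The cmdscale-then-k-means workflow described in this message might look roughly like the sketch below. It uses the built-in eurodist data as a stand-in (an assumption; it is not the similarity data from this thread), and the choice of 3 clusters and 10 random starts is arbitrary.

```r
# Stand-in dissimilarities: road distances between 21 European cities
d <- eurodist

# Metric scaling to two dimensions
coords <- cmdscale(d, k = 2)

# k-means on the scaled coordinates; the seed only fixes the random starts
set.seed(1)
km <- kmeans(coords, centers = 3, nstart = 10)

# Cluster membership per item and the cluster centroids
split(rownames(coords), km$cluster)
km$centers
```

With data containing identical items, several points share the same coordinates, so the number of clusters has to stay below the number of distinct points.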
Re: [R] isoMDS
Doran, Harold wrote:

> Thank you. Quick clarification. isoMDS only works with dissimilarities.
> Converting my similarity matrix into the dissimilarity matrix is done as
> (from an email I found on the archives)
>
> d <- max(tt)-tt

Mardia, Kent & Bibby defines the standard transformation from a similarity matrix to a dissimilarity (distance) matrix by

d_rs <- sqrt( c_rr -2*c_rs + c_ss)

where c_rs are the similarities. This assures the diagonal of the dissimilarity matrix to be zero. You could try that.

Kjetil Halvorsen

--- snip ---

--
Kjetil Halvorsen.
Peace is the most effective weapon of mass construction. -- Mahdi Elmandjra
[R]: isoMDS using
Happy New Year!

I tried to use isoMDS to present graphically a matrix of coefficients of divergence, and I got the error "NAs/Infs not allowed in d". But there are no NAs or Infs in my matrix! The function `as.vector' (which is applied to test the input data with `!is.finite') returns the input matrix in one case and a sequence of the values of the input matrix in the other. When it returns a matrix I receive the error "NAs/Infs not allowed in d". When it returns a sequence of values I don't receive this error.

Is it possible to use coefficients of divergence instead of dissimilarities?

Thank you!!
---
Altshuler Eugenij P.
Moscow South-West High School
mailto:[EMAIL PROTECTED]
__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
[R] isoMDS results
Hi,

this is a second try to post this to the R-help mailing list. The first one has been rejected because of a too large attachment. Now I ask this without attaching the data. If you want to reproduce the results, please contact me directly to get the data.

(First mail, rejected:)

Attached there is a 149*149 dissimilarity matrix; it is a file obtained by save(dm, file="dissim.Rsav").

OK, here is my question: I worry about the reproducibility of the results of isoMDS. I try

set.seed(5678)
mdslinux <- isoMDS(dm,k=4)
initial value 31.071976
final value 31.071976
converged

R.version
         _
platform i686-pc-linux-gnu
arch     i686
os       linux-gnu
system   i686, linux-gnu
status
major    1
minor    6.2
year     2003
month    01
day      10
language R

My co-worker works also with the same dissimilarity matrix and did the same on a Windows machine (unfortunately I do not have the version data, but it should not be too old) and got

set.seed(5678)
mdswin <- isoMDS(dm,k=4)
initial value 31.071976
final value 24.16980
converged

As is to be expected, the resulting MDS configurations differ, too. Initially, the cmdscale version seems to be used, and this is identical. BTW, I often observed that the isoMDS iteration does not change anything (but not always) from the cmdscale initial configuration on my machine, and I have been somewhat sceptical more than once whether this is correct.

Can all this be explained with Windows/Linux differences or what else may happen here?

Best,
Christian
--
***
Christian Hennig
Seminar fuer Statistik, ETH-Zentrum (LEO), CH-8092 Zuerich (currently)
and Fachbereich Mathematik-SPST/ZMS, Universitaet Hamburg
[EMAIL PROTECTED], http://stat.ethz.ch/~hennig/
[EMAIL PROTECTED], http://www.math.uni-hamburg.de/home/hennig/
### I recommend www.boag.de
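A possible way to probe whether a fit is stuck at its starting configuration is to restart isoMDS from several perturbed copies of the cmdscale solution and compare the stress values. The sketch below assumes the built-in eurodist data in place of the 149*149 matrix (which was not posted), and the jitter size, number of starts, maxit and tol values are arbitrary choices. As far as I can tell isoMDS itself draws no random numbers, so the seed matters only for the perturbations.

```r
library(MASS)  # provides isoMDS()

d    <- eurodist             # stand-in for the unavailable 'dm'
base <- cmdscale(d, k = 2)   # deterministic default start

set.seed(5678)               # fixes the random perturbations below
fits <- lapply(1:5, function(i) {
  # perturb the classical solution by about 10% of its overall spread
  noise <- matrix(rnorm(length(base), sd = 0.1 * sd(as.vector(base))),
                  nrow = nrow(base))
  isoMDS(d, y = base + noise, k = 2,
         maxit = 200, tol = 1e-7, trace = FALSE)
})

stress <- sapply(fits, function(f) f$stress)
stress                       # one stress value (in percent) per start

best <- fits[[which.min(stress)]]
best$stress
```

If the stress values from different starts differ noticeably, the default run has probably stopped in a local optimum; if they agree, the default solution is at least stable under small perturbations.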