Re: [R] isoMDS vs. other non-metric non-R routines

2007-02-14 Thread stevenmh
Hi Phil,
Are you using metaMDS in the vegan package? This allows you to determine
the number of random starts, and selects the best. It might help.
Hank Stevens

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] isoMDS vs. other non-metric non-R routines

2007-02-13 Thread Philip Leifeld
Dear useRs,

last week I asked you about a problem related to isoMDS. It turned 
out that in my case isoMDS was trapped. Nonetheless, I still have 
some problems with other data sets. Therefore I would like to know if 
anyone here has experience with how well isoMDS performs in 
comparison to other non-metric MDS routines, like Minissa.

I have the feeling that for large data sets with a high stress value 
(e.g. around 0.20) in cases where the intrinsic dimensionality of the 
data cannot be significantly reduced without considerably increasing 
stress, isoMDS performs worse (and yields a stress value of 0.31 in 
my example), while solutions tend to be similar for better fits and 
lower intrinsic dimensionality. I tried this on another data set 
where isoMDS yields a stress value of 0.19 and Minissa a stress value 
of 0.14.

Now the latter would still be considered a fair solution by some 
people while the former indicates a poor fit regardless of how strict 
your judgment is. I generally prefer using R over mixing with 
different programs, so it would be nice if results were of comparable 
quality...

Cheers

Phil



Re: [R] isoMDS vs. other non-metric non-R routines

2007-02-13 Thread Christian Hennig
Dear Phil,

I don't have experience with Minissa, but I know that isoMDS performs badly in 
some situations. I have even seen situations with non-metric 
dissimilarities in which classical MDS was preferable.

Some alternatives that you have:
1) Try to start isoMDS from other initial configurations (by default, it 
starts from the classical solution).
2) Try Sammon mapping (the command should be sammon).
3) Have a look at XGvis/GGvis (which may be part of XGobi/GGobi). These 
are not directly part of R but have R interfaces. They allow you to toy
around quite a lot with different algorithms, stress functions (the 
isoMDS stress is not necessarily what you want) and initial 
configurations so that you can find a better solution and understand your 
data better. Unfortunately I don't have the time to give you more detail, 
but google for it (or somebody else will tell you more).
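
A minimal sketch of options 1 and 2 above, using only MASS and the built-in eurodist data as a stand-in for your own dissimilarities (both are assumptions for illustration, not Phil's data). Note that sammon() minimises its own stress criterion, so its stress value is not comparable with the isoMDS one; here its configuration is only used as an alternative start.

```r
library(MASS)

d <- eurodist                      # stand-in dissimilarities (assumption)

# Default: isoMDS starts from the classical (cmdscale) solution
fit_classical <- isoMDS(d, k = 2, trace = FALSE)

# Option 2: Sammon mapping; sam$stress uses Sammon's own criterion
sam <- sammon(d, k = 2, trace = FALSE)

# Option 1: start isoMDS from another configuration, here the Sammon one
fit_sammon <- isoMDS(d, y = sam$points, k = 2, trace = FALSE)

# Kruskal stress (in percent) reached from the two starts
c(classical_start = fit_classical$stress, sammon_start = fit_sammon$stress)
```

If both starts end at practically the same stress, the classical start was probably not the problem.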

Best,
Christian



*** --- ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
[EMAIL PROTECTED], www.homepages.ucl.ac.uk/~ucakche



[R] isoMDS vs. other non-metric non-R routines

2007-02-13 Thread Jari Oksanen
Sorry for not threading: I don't subscribe to this list, and the 
linking of web browser and email seems to be rudimentary.

I don't know what Minissa is. Sounds like a piece of software. What is 
the method it implements? That is, is it supposed to implement the same 
method as isoMDS or something else? isoMDS implements Kruskal's (and 
Young's and Shepard's and Torgerson's) NMDS, but there are other methods 
too. You are supposed to get similar results only with the same method. 
For instance, there are various definitions of stress, two of them 
amusingly called stress-1 and stress-2, but there are others.

You didn't give much detail about how you used isoMDS. We already 
discussed the danger of getting trapped in the starting configuration, 
which you can avoid by trying (several) random starting configurations. 
Have you used the 'tol' (and 'maxit') arguments in isoMDS? The default 
'tol' is rather slack, and 'maxit' fairly low, since (speculation) the 
function was written a long time ago when computers were slow, but if 
you have something better than a 75MHz i486, you can try other 
values.
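
A small sketch of what tightening 'tol' and raising 'maxit' looks like, assuming the built-in eurodist data as a stand-in for the real dissimilarities; the particular values 1e-7 and 500 are only examples.

```r
library(MASS)

d <- eurodist                                   # stand-in data (assumption)

# Defaults: maxit = 50, tol = 1e-3
loose <- isoMDS(d, k = 2, trace = FALSE)

# Stricter convergence tolerance and more iterations, same starting point
tight <- isoMDS(d, k = 2, maxit = 500, tol = 1e-7, trace = FALSE)

# Stress in percent; the stricter run should not end up higher
c(default = loose$stress, strict = tight$stress)
```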

I have used isoMDS quite a lot, and I have had good experience.

Cheers, Jari Oksanen



Re: [R] isoMDS vs. other non-metric non-R routines

2007-02-13 Thread Philip Leifeld
Thanks for your message.

 I don't know what Minissa is. Sounds like a piece of software. What
 is the method it implements? That is, is it supposed to implement 
 the same method as isoMDS or something else? isoMDS implements
 Kruskal's (and Young's and Shepard's and Torgerson's) NMDS, but
 there are other methods too. You are supposed to get similar
 results only with the same method. For instance, there are various
 definitions of stress, two of them amusingly called stress-1 and
 stress-2, but there are others.

Yes, Minissa uses Kruskal's NMDS and stress-1, so results should be 
comparable.

 You didn't give much detail about how you used isoMDS. We already
 discussed the danger of getting trapped in the starting configuration,
 which you can avoid by trying (several) random starting
 configurations. Have you used the 'tol' (and 'maxit') arguments in
 isoMDS? The default 'tol' is rather slack, and 'maxit' fairly low,
 since (speculation) the function was written a long time ago when
 computers were slow, but if you have something better than a 75MHz
 i486, you can try other values.
 Cheers, Jari Oksanen

This was my initial call:

mds <- isoMDS(dist, y = cmdscale(dist, k = 2), k = 2, tol = 1e-3, maxit = 500)

I played around a little bit with tol and maxit (adding some 
zeros...) and increased the number of dimensions, but it did not 
change the results significantly. Using initMDS did not improve the 
result either. Unfortunately, my data set is too large to be 
displayed here. Any other ideas? My stress value is still about 1.5 times 
as high as in other implementations of NMDS.

Cheers

Phil



[R] isoMDS vs. other non-metric non-R routines

2007-02-13 Thread Jari Oksanen
philip.leifeld at uni-konstanz.de wrote:
 This was my initial call:
 
 mds <- isoMDS(dist, y = cmdscale(dist, k = 2), k = 2, tol = 1e-3, maxit = 500)
 
 I played around a little bit with tol and maxit (adding some 
 zeros...) and increased the number of dimensions, but it did not 
 change the results significantly. Using initMDS did not improve the 
 result either. Unfortunately, my data set is too large to be 
 displayed here. Any other ideas? My stress value is still about 1.5 times 
 as high as in other implementations of NMDS.
 
It is really difficult to believe that isoMDS would work so completely
differently from other implementations. I guess you already tried
tol=1e-7? After this, a radical trick is to give the Minissa result as
the starting configuration, and see if you stay there and get the same
stress as Minissa reported. You should. In particular, if you iterate
away from the starting configuration, then the starting configuration
was not as good as you assumed. If this happens, it would be time to
check the data. I assume you have read in dissimilarities from external
files, and surprises do happen (it makes sense to check the data
anyway).
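
A sketch of that check, under stated assumptions: 'ext' stands for the coordinates exported from the other program (here faked with cmdscale on the built-in eurodist data, since the real file is not available), and kruskal_stress() is a hypothetical helper that recomputes stress-1 in percent from MASS::Shepard(), which is what I understand isoMDS to report.

```r
library(MASS)

# Hypothetical helper: Kruskal's stress-1 (in percent) of configuration 'conf'
kruskal_stress <- function(d, conf) {
  sh <- Shepard(d, conf)           # y = configuration distances, yf = monotone fit
  100 * sqrt(sum((sh$y - sh$yf)^2) / sum(sh$y^2))
}

d   <- eurodist                    # stand-in dissimilarities (assumption)
ext <- cmdscale(d, k = 2)          # stand-in for the externally computed coordinates

start_stress <- kruskal_stress(d, ext)

# Restart isoMDS from the external configuration with strict settings
refit <- isoMDS(d, y = ext, k = 2, maxit = 500, tol = 1e-7, trace = FALSE)

# If 'ext' really were an optimum, the two numbers would agree
c(at_start = start_stress, after_isoMDS = refit$stress)
```

If the stress at the start does not match what the other program reported (after allowing for the factor of 100), the two programs are probably not looking at the same data or the same stress definition.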

Increasing the number of dimensions should not give you a solution
similar to the one from some other implementation using a lower number of
dimensions.

About the problems Christian Hennig mentioned: My interpretation of his
message was that he was not concerned about isoMDS in particular but
about NMDS in general (but he will correct me if my interpretation was
wrong). I can imagine cases where a non-metric solution works badly, in
particular with small data sets. However, that should concern all
implementations similarly, and probably it should be visible in Shepard
plots (see isoMDS help). 
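
For reference, a Shepard plot along the lines of the isoMDS help page can be drawn as below; eurodist is again only a stand-in for the real dissimilarities.

```r
library(MASS)

fit <- isoMDS(eurodist, k = 2, trace = FALSE)
sh  <- Shepard(eurodist, fit$points)

# Points: configuration distance against observed dissimilarity
plot(sh$x, sh$y, pch = ".", xlab = "Dissimilarity", ylab = "Distance")
# Step line: the fitted monotone regression
lines(sh$x, sh$yf, type = "S")
```

A step line with only a few long steps, or points scattered far from it, would hint that the non-metric fit is degenerate or poor.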

Cheers, Jari Oksanen



[R] isoMDS - high stress value and strange configuration

2007-02-07 Thread Philip Leifeld
Dear R users,

I have a specific question about isoMDS. Imagine the following (fake) 
distance table:

        hamburg bremen berlin munich cologne
hamburg       0    911    982    677     424
bremen      911      0    293    547     513
berlin      982    293      0    785     875
munich      677    547    785      0     375
cologne     424    513    875    375       0

Now if I try a non-metric multidimensional scaling on these 
dissimilarities using isoMDS (or metaMDS), the stress value is 6.34. 
Nevertheless, other programs (e.g. the Minissa routine implemented in 
UCINet) yield a stress value of 0.00, and the configuration looks 
completely different. I tried this with multiple distance matrices: 
One time UCINet computed a stress value of 0.21 while isoMDS produced 
a stress of 0.33, and again the configuration was completely 
different and apparently random (while the configuration in UCINet 
still made sense). Here is what I tried:

isoMDS(cities, y = cmdscale(cities, k = 2), k = 2, maxit = 50)

Please give me a hint on how to improve the results. I suppose the 
above command is not complete, or something is wrong with it, or 
maybe the input distances are not in the right format.

Btw, the problem does not occur when I use the real distances between 
these cities, not some other numbers, so apparently three-digit 
numbers should be fine as input values?

Thanks!

Phil



[R] isoMDS - high stress value and strange configuration

2007-02-07 Thread Jari Oksanen

 I have a specific question about isoMDS. Imagine the following (fake) 
 distance table:
 
         hamburg bremen berlin munich cologne
 hamburg       0    911    982    677     424
 bremen      911      0    293    547     513
 berlin      982    293      0    785     875
 munich      677    547    785      0     375
 cologne     424    513    875    375       0
 
 Now if I try a non-metric multidimensional scaling on these 
 dissimilarities using isoMDS (or metaMDS), the stress value is 6.34. 
 Nevertheless, other programs (e.g. the Minissa routine implemented in 
 UCINet) yield a stress value of 0.00, and the configuration looks 
 completely different. 

This indeed seems to be a case where NMDS is trapped in its starting
configuration. Metric scaling (cmdscale) produces a cute horseshoe,
but the best NMDS solution looks completely different. Any small change
from the initial solution leads to a worse configuration, and you need
a bigger change in the beginning. Using a random configuration seems to
help:

> isoMDS(dis, initMDS(dis))
initial  value 36.383132 
iter   5 value 28.671652
iter  10 value 16.711327
iter  15 value 6.392572
iter  20 value 3.007208
final  value 0.00 
converged
$points
              [,1]      [,2]
hamburg  29.428121 -36.07858
bremen    2.740499  32.38745
berlin    1.984215  35.35429
munich  -16.910941 -14.13750
cologne -13.844187 -15.24468

$stress
[1] 1.56159e-14

In this case I generated the random configuration using function initMDS
of vegan, but you can do that quite well in any other way.
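
One such "other way", sketched with MASS only: the fake city table is entered as a symmetric matrix, converted with as.dist(), and isoMDS is restarted from several uniform random configurations, keeping the best. The number of restarts (25), the seed, and the scale of the random start are arbitrary assumptions.

```r
library(MASS)

cities <- c("hamburg", "bremen", "berlin", "munich", "cologne")
m <- matrix(c(  0, 911, 982, 677, 424,
              911,   0, 293, 547, 513,
              982, 293,   0, 785, 875,
              677, 547, 785,   0, 375,
              424, 513, 875, 375,   0),
            nrow = 5, byrow = TRUE, dimnames = list(cities, cities))
dis <- as.dist(m)

set.seed(1)
default_fit <- isoMDS(dis, k = 2, trace = FALSE)   # cmdscale start
best <- default_fit

for (i in seq_len(25)) {
  # Uniform random start, roughly on the scale of the dissimilarities
  start <- matrix(runif(5 * 2, 0, max(dis)), ncol = 2)
  fit <- isoMDS(dis, y = start, k = 2, maxit = 200, trace = FALSE)
  if (fit$stress < best$stress) best <- fit
}

c(default_start = default_fit$stress, best_of_restarts = best$stress)
```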

Another point (which does not matter here so much) is that isoMDS
multiplies stress by 100, so that your stress of 6 would correspond to
0.06 in some other software (assuming they use the same stress).

cheers, jari oksanen
-- 
Jari Oksanen [EMAIL PROTECTED]



Re: [R] isoMDS and 0 distances

2006-04-19 Thread Prof Brian Ripley
Short answer: you cannot compare distances including NAs, so there is no 
way to find a monotone mapping of distances.

If the data really are identical for two rows, you can easily drop one of 
them whilst doing MDS, and then assign the position found for one to the 
other.
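
A sketch of that drop-and-reassign approach on made-up data (the points, labels and seed are assumptions for illustration). It treats rows at zero distance from each other as identical, which is only safe when zero distance really means identical rows.

```r
library(MASS)

set.seed(1)
x <- matrix(rnorm(20), ncol = 2)             # 10 made-up points
x <- rbind(x, x[3, ], x[7, ])                # rows 11 and 12 duplicate rows 3 and 7
rownames(x) <- paste0("s", 1:12)
m <- as.matrix(dist(x))

# For each row, the first row at zero distance (itself if it has no twin)
rep_idx <- apply(m == 0, 1, function(z) which(z)[1])
keep <- unique(rep_idx)                      # one representative per group

# MDS on the representatives only
fit <- isoMDS(as.dist(m[keep, keep]), k = 2, trace = FALSE)

# Give every original row the position of its representative
pts <- fit$points[match(rep_idx, keep), , drop = FALSE]
rownames(pts) <- rownames(m)
pts[c("s3", "s11"), ]                        # identical coordinates
```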


-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] isoMDS and 0 distances

2006-04-19 Thread Jari Oksanen
Tyler,

My experience is that isoMDS *may* fail to move away from the starting
configuration if there are identical values in the initial configuration,
and this will happen if you use cmdscale() to get the initial
configuration. You *may* get over this by shifting duplicates a bit:

> con <- cmdscale(dis)
> dups <- duplicated(con)
> sum(dups)
[1] 2
> con[dups, ] <- con[dups, ] + runif(2*sum(dups), -0.01, 0.01)

Then isoMDS may go further.

Another issue is that at a quick look isoMDS() seems to do nothing
sensible with missing values, although it accepts them. The only thing
is that they are ordered last, or regarded as very long distances (in
your case they rather should be regarded as very short distances). The
key lines in isoMDS are:

ord <- order(dis)
nd <- sum(!is.na(ord))

Even when 'dis' has missing values, the result of order() ('ord') has
no missing values, but with the default argument na.last=TRUE they are put
last in the list. An obvious-looking change would be to replace the
second line with:

nd <- sum(!is.na(dis))

but this dumps the core of R at least on my machine: probably you need
the full length of the vectors in addition to the number of non-missing
entries. (This quick look was based on the latest release version of
MASS/VR: there may be a newer version already with the upcoming R
release, but that's not released yet.)
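
The behaviour of order() with missing values can be seen on a toy vector (the numbers are made up):

```r
dis <- c(0.4, NA, 0.1, 0.7, NA)

ord <- order(dis)        # na.last = TRUE by default: NAs are sorted to the end
ord                      # 3 1 4 2 5

n_ord <- sum(!is.na(ord))   # 5: 'ord' itself never contains NA
n_dis <- sum(!is.na(dis))   # 3: the number of non-missing distances

dis[ord]                 # 0.1 0.4 0.7 NA NA
```

So counting non-missing entries of 'ord' always gives the full length, and the NA distances end up being treated as the largest ones.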

You may check working with NA: are duplicate points identical in
results?

Then about replacing zero distances with a tiny number: this has been
discussed before on this list, and Ripley said "no, no!". I do it all
the time, but only in secrecy. A suggested solution was to drop
duplicates, but then there still is a weighting issue, and isoMDS does
not have a weights argument.


cheers, jari oksanen
-- 
Jari Oksanen -- Dept Biology, Univ Oulu, 90014 Oulu, Finland
email [EMAIL PROTECTED], homepage http://cc.oulu.fi/~jarioksa/



Re: [R] isoMDS and 0 distances

2006-04-19 Thread Jari Oksanen
On Wed, 2006-04-19 at 07:46 +0100, Prof Brian Ripley wrote:
 Short answer: you cannot compare distances including NAs, so there is no 
 way to find a monotone mapping of distances.
 
The original Kruskal-Young-Shepard-Torgerson programme KYST (version 1
from 1973) could handle missing values. Unfortunately I've lost the
documents, but if I remember correctly, the argument was that you need
only a subset (representative for points) of the (dis)similarities to
get a monotone regression. KYST -- and computers of that time (I used
Burroughs!) -- had limitations on data size, and removing some of the
dissimilarities was a way of getting more than 64 data points into
analysis. However, better not go into details since:

C THIS INFORMATION IS PROPRIETARY AND IS THE
C PROPERTY OF BELL TELEPHONE LABORATORIES,
C INCORPORATED.  ITS REPRODUCTION OR DISCLOSURE
C TO OTHERS, EITHER ORALLY OR IN WRITING, IS
C PROHIBITED WITHOUT WRITTEN PRERMISSION OF
C BELL LABORATORIES.
CKYST-2A AUGUST, 1977   

cheers, jari oksanen
-- 
Jari Oksanen -- Biologian laitos, Oulun yliopisto, 90014 Oulu
sposti [EMAIL PROTECTED], kotisivu http://cc.oulu.fi/~jarioksa/



Re: [R] isoMDS and 0 distances

2006-04-19 Thread Christian Hennig
About replacing the zeroes with tiny numbers:
isoMDS works with the rankings of the distances. Therefore replacing 
zeroes by tiny values gives them a rank above the real zeroes (distance 
to same observation) and below all the non-zero distances. If this makes 
sense in your application (in my experience it usually does), you can do 
it.
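
A sketch of the replacement on made-up data, choosing the tiny value relative to the smallest non-zero distance so that the replaced entries stay below every genuine non-zero distance. The divisor 10 is an arbitrary assumption; only the ordering matters for the non-metric fit.

```r
library(MASS)

set.seed(2)
x <- matrix(rnorm(16), ncol = 2)   # 8 made-up observations
x <- rbind(x, x[1, ])              # a 9th identical to the 1st: one zero distance
d <- dist(x)

eps <- min(d[d > 0]) / 10          # strictly below all non-zero distances
d2 <- d
d2[d2 == 0] <- eps                 # subassignment keeps the "dist" attributes

fit <- isoMDS(d2, k = 2, trace = FALSE)
fit$stress
```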

Sometimes the classical MDS solution is a local optimum of the isoMDS 
criterion. In these cases isoMDS converges in one step (rather it gives 
you the classical MDS solution). This may happen with and without zero 
or NA distances.

Best,
Christian



*** --- ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
[EMAIL PROTECTED], www.homepages.ucl.ac.uk/~ucakche



Re: [R] isoMDS and 0 distances

2006-04-19 Thread Tyler Smith
Thanks all!

From Christian's explanation I think I will be alright adding small 
values to my zero distances. In my application my distances are limited 
by the number of primer pairs I use, and it is reasonable to expect that 
adding primer pairs would eventually reveal some small genetic 
difference among plants collected from locations many hundreds of miles 
apart. I've also found that using Jari's metaMDSiter() function from the 
vegan package gets me out of the local minimum traps that troubled me 
earlier.

Cheers,

Tyler




[R] isoMDS and 0 distances

2006-04-18 Thread Tyler Smith
Hi,

I'm trying to do a non-metric multidimensional scaling using isoMDS. 
However, I have some '0' distances in my data, and I'm not sure how to 
deal with them. I'd rather not drop rows from the original data, as I am 
comparing several datasets (morphology and molecular data) for the same 
individuals, and it's interesting to see how much morphological 
variation can be associated with an identical genotype.

I've tried replacing the 0's with NA, but the isoMDS appears to stop on 
the first iteration and the stress does not improve:

distA # A dist object with 13695 elements, 4 of which == 0
cmdsA <- cmdscale(distA, k=2)

distB <- distA
distB[which(distB==0)] <- NA

isoA <- isoMDS(distB, cmdsA)
initial  value 21.835691
final  value 21.835691
converged

The other approach I've tried is replacing the 0's with small numbers. 
In this case isoMDS does reduce the stress values.

min(distA[which(distA>0)])
[1] 0.02325581

distC <- distA
distC[which(distC==0)] <- 0.001
isoC <- isoMDS(distC)
initial  value 21.682854
iter   5 value 16.862093
iter  10 value 16.451800
final  value 16.339224
converged

So my questions are: what am I doing wrong in the first example? Why 
does isoMDS converge without doing anything? Is replacing the 0's with 
small numbers an appropriate alternative?

Thanks for your time,

Tyler
R 2.2.1



RE: [R] isoMDS

2004-09-09 Thread Jari Oksanen
On Wed, 2004-09-08 at 21:31, Doran, Harold wrote:
 Thank you. Quick clarification. isoMDS only works with dissimilarities.
 Converting my similarity matrix into the dissimilarity matrix is done as
 (from an email I found on the archives)
 
  d <- max(tt) - tt
 
 Where tt is the similarity matrix. With this, I tried isoMDS as follows:
 
  tt.mds <- isoMDS(d)
 
 and I get the following error message. 
 
 Error in isoMDS(d) : An initial configuration must be supplied with
 NA/Infs in d. I was a little confused on exactly how to specify this
 initial config. So, from here I ran cmdscale on d as
 
This error message is quite informative: you have either missing or
non-finite entries in your data. The only surprising thing here is that
cmdscale works: it should fail, too. Are you sure that you haven't done
anything with your data matrix in between, like changed it from matrix
to a dist object? If the Inf/NaN/NA values are on the diagonal, they
will magically disappear with as.dist. Anyway, if you're able to get a
metric scaling result, you can manually feed that into isoMDS for the
initial configuration, and  avoid the check. See ?isoMDS.
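
A minimal sketch of that diagnosis, assuming a small made-up distance matrix (the object `d` below is illustrative, not the poster's data): count the non-finite entries first, then hand a cmdscale result to isoMDS through its `y` argument.

```r
library(MASS)  # isoMDS is a function in package MASS

# Illustrative data only: Euclidean distances among 10 random 3-D points
set.seed(1)
d <- as.matrix(dist(matrix(rnorm(30), ncol = 3)))

# Number of NA/NaN/Inf entries; a clean matrix gives 0
n_bad <- sum(!is.finite(d))

# Metric scaling result, supplied to isoMDS as the initial configuration
init <- cmdscale(d, k = 2)
fit <- isoMDS(d, y = init, k = 2, trace = FALSE)
```

If `n_bad` is not zero, `which(!is.finite(d), arr.ind = TRUE)` shows where the offending entries sit.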

> > d.mds <- cmdscale(d)
>
> which seemed to work fine and produce reasonable results. I was able to
> take the coordinates and run them through a k-means cluster and the
> results seemed to correctly match the grouping structure I created for
> this sample analysis.
>
> Cmdscale is for metric scaling, but it seemed to produce the results
> correctly.
>
> So, did I correctly convert the similarity matrix to the dissimilarity
> matrix? Second, should I have used cmdscale rather than isoMDS as I have
> done? Or, is there a way to specify the initial configuration that I
> have not done correctly.

If you don't know whether you should use isoMDS or cmdscale, you
probably should use cmdscale. If you know, things are different.
Probably isoMDS gives you `better'(TM) results, but it is more
complicated to handle.

cheers, jari oksanen
-- 
Jari Oksanen -- Dept Biology, Univ Oulu, 90014 Oulu, Finland
Ph. +358 8 5531526, cell +358 40 5136529, fax +358 8 5531061
email [EMAIL PROTECTED], homepage http://cc.oulu.fi/~jarioksa/



Re: [R] isoMDS

2004-09-09 Thread Jari Oksanen
On Thu, 2004-09-09 at 04:53, Kjetil Brinchmann Halvorsen wrote:

 
> Mardia, Kent & Bibby define the standard transformation from a
> similarity matrix to a dissimilarity
> (distance) matrix by
>
> d_rs <- sqrt(c_rr - 2*c_rs + c_ss)
>
> where c_rs are the similarities. This assures the diagonal of the
> dissimilarity matrix to be zero.
> You could try that.
 
In R notation, this would be

sim2dist <- function(x) 
 as.dist(sqrt(outer(diag(x), diag(x), "+") - 2*x))

Mardia, Kent & Bibby indeed passingly say that this is a `standard
transformation' (page 403). However, it is really a canonical way only
if diagonal elements in similarity matrix are sums of squares, and
off-diagonal elements are cross products. In that case the `standard
transformation' gives you Euclidean distances (or if you have
variances/covariances or ones/correlations it gives you something
similar). However, it is no standard if your similarities are something
else, and cannot be transformed into Euclidean distances.

However, in isoMDS this *may* not matter, since NMDS uses only rank
order of dissimilarities, and any transformation giving dissimilarities
in the same rank order *may* give similar results. The statement is
conditional (*may*), since isoMDS uses cmdscale for the starting
configuration, and cmdscale will give different results with different
transformations. So isoMDS may stop in different (local) optima. Setting
`tol' parameter low enough in isoMDS (see ?isoMDS) helped in a couple of
cases I tried, and the results were practically identical with different
transformations. So it doesn't matter too much how you change your
similarities to dissimilarities, since isoMDS indeed treats them as
dissimilarities (but cmdscale treats them as distances).
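
As a small check of the rank-order argument, here is a sketch that applies sim2dist and the simpler max(S) - S conversion to the 8 x 8 similarity matrix posted in this thread. The names `S`, `d_mkb` and `d_max` are only illustrative. Because the diagonal of this matrix is constant, both conversions are decreasing functions of the similarity, so they rank the pairs identically; that need not hold when the diagonal varies.

```r
# The 8 x 8 similarity matrix posted in this thread
S <- matrix(c(4, 4, 2, 4, 1, 2, 0, 0,
              4, 4, 2, 4, 1, 2, 0, 0,
              2, 2, 4, 2, 3, 2, 2, 1,
              4, 4, 2, 4, 1, 2, 0, 0,
              1, 1, 3, 1, 4, 3, 3, 2,
              2, 2, 2, 2, 3, 4, 2, 1,
              0, 0, 2, 0, 3, 2, 4, 3,
              0, 0, 1, 0, 2, 1, 3, 4),
            nrow = 8, byrow = TRUE,
            dimnames = list(letters[1:8], letters[1:8]))

sim2dist <- function(x)
  as.dist(sqrt(outer(diag(x), diag(x), "+") - 2 * x))

d_mkb <- sim2dist(S)          # Mardia, Kent & Bibby transformation
d_max <- as.dist(max(S) - S)  # simple "maximum minus similarity"

# Do the two conversions order the item pairs in the same way?
same_ranks <- identical(rank(as.vector(d_mkb)), rank(as.vector(d_max)))
```

The values themselves differ, which is what matters for cmdscale and hence for the starting configuration.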

cheers, jari oksanen
-- 
J.Oksanen, Oulu, Finland.
Object-oriented programming is an exceptionally bad idea which could
only have originated in California. E. Dijkstra



RE: [R] isoMDS

2004-09-09 Thread Doran, Harold
Thank you. I use the same matrix on cmdscale as I did with isoMDS. I have reproduced 
my steps below for clarification if this happens to shed any light.
 
Here is the original total matrix (see opening thread if you care how this is created)
 
  a b c d e f g h
a 4 4 2 4 1 2 0 0
b 4 4 2 4 1 2 0 0
c 2 2 4 2 3 2 2 1
d 4 4 2 4 1 2 0 0
e 1 1 3 1 4 3 3 2
f 2 2 2 2 3 4 2 1
g 0 0 2 0 3 2 4 3
h 0 0 1 0 2 1 3 4
 
So, there are 8 items. This matrix indicates that items 1,2, and 4 were always grouped 
together (or viewed as being similar by individuals). I transformed this using 
 
tt <- max(t) - t
 
which results in
  a b c d e f g h
a 0 0 2 0 3 2 4 4
b 0 0 2 0 3 2 4 4
c 2 2 0 2 1 2 2 3
d 0 0 2 0 3 2 4 4
e 3 3 1 3 0 1 1 2
f 2 2 2 2 1 0 2 3
g 4 4 2 4 1 2 0 1
h 4 4 3 4 2 3 1 0
 
When I run isoMDS on this new matrix, it tells me to specify the initial config 
because of the NA/Infs. But when I perform cmdscale on this same matrix I end up with 
the following results,
 
> bt <- cmdscale(tt); bt
 
         [,1]       [,2]
a -1.79268634 -0.2662750
b -1.79268634 -0.2662750
c -0.02635497  0.5798934
d -1.79268634 -0.2662750
e  1.08978620  0.6265313
f -0.02635497  0.5798934
g  2.20852966  0.2828937
h  2.13245309 -1.2703869
 
The results suggest that items 1,2, and 4 have similar locations as is expected. Also 
items 3 and 6 have similar locations as would also be expected. So, my results seem to 
have been replicated correctly using cmdscale. 
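
The cmdscale-then-k-means workflow described here can be sketched as follows. The matrix is the one posted above; the choice of three clusters, the seed, and the name `cl` are assumptions made purely for illustration.

```r
# Similarity matrix from this message, converted with max(S) - S
S <- matrix(c(4, 4, 2, 4, 1, 2, 0, 0,
              4, 4, 2, 4, 1, 2, 0, 0,
              2, 2, 4, 2, 3, 2, 2, 1,
              4, 4, 2, 4, 1, 2, 0, 0,
              1, 1, 3, 1, 4, 3, 3, 2,
              2, 2, 2, 2, 3, 4, 2, 1,
              0, 0, 2, 0, 3, 2, 4, 3,
              0, 0, 1, 0, 2, 1, 3, 4),
            nrow = 8, byrow = TRUE,
            dimnames = list(letters[1:8], letters[1:8]))
tt <- max(S) - S

# Metric scaling to two dimensions
bt <- cmdscale(tt, k = 2)

# k-means on the coordinates; 3 clusters is an arbitrary choice here
set.seed(42)
cl <- kmeans(bt, centers = 3, nstart = 10)
cl$cluster
```

Items a, b and d share the same coordinates, so they should land in the same cluster whatever number of clusters is chosen.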
 
I've tried to specify an initial config using isoMDS in a few ways without success, so 
I am surely doing something wrong. So far, I have tried the following:
 
> ll <- isoMDS(tt, y=cmdscale(tt))
 
which tells me "zero or negative distance between objects 1 and 2"
 
> ll <- isoMDS(tt, y=cmdscale(tt, k=2))
 
 
Again, thanks,
 
Harold
 
 



RE: [R] isoMDS

2004-09-09 Thread Jari Oksanen
On Thu, 2004-09-09 at 14:25, Doran, Harold wrote:
> Thank you. I use the same matrix on cmdscale as I did with isoMDS. I
> have reproduced my steps below for clarification if this happens to
> shed any light.
--- snip ---

Doran,

Your data clarified things. It seems to me now that your data are not a
matrix but a data.frame. A problem for an ordinary user is that
data.frames and matrices look identical, but that's only surface: you
shouldn't be shallow but look deep in their souls to see that they are
completely different, and therefore isoMDS fails. At least isoMDS gives
just that error for a data.frame, but cmdscale casts a data.frame to a
matrix, and therefore it works.

So the following should work (worked when I tried):

tt <- as.matrix(tt)
isoMDS(tt)

(and you could go down to a dist object with tt <- as.dist(tt), which seems
to handle data.frames directly, too).
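
To see the difference without relying on how a table prints, a short sketch with a made-up 3 x 3 data.frame (the object names are illustrative only):

```r
# A data.frame that prints just like a small dissimilarity matrix
df <- data.frame(a = c(0, 2, 3),
                 b = c(2, 0, 1),
                 c = c(3, 1, 0),
                 row.names = c("a", "b", "c"))

is.matrix(df)       # FALSE: it is a list of columns
m <- as.matrix(df)  # explicit conversion to a numeric matrix
is.matrix(m)        # TRUE

# Going one step further, down to a dist object
d <- as.dist(m)
```

`class(tt)` or `str(tt)` on the real data answers the same question in one line.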

Then you will still need to avoid the complaint about zero-distances
among points. This means that you have some identical points in your
data, and isoMDS does not like them. This issue was discussed here in
April, 2004 (and many other times). Search archives for the subject
question on isoMDS.
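
One way to locate the identical points, sketched on a made-up 4 x 4 dissimilarity matrix in which objects a and b coincide. Dropping duplicated rows is only one possible remedy and is an assumption here, not something the thread prescribes.

```r
# Illustrative dissimilarities: a and b are identical objects
tt <- matrix(c(0, 0, 2, 3,
               0, 0, 2, 3,
               2, 2, 0, 1,
               3, 3, 1, 0),
             nrow = 4, byrow = TRUE,
             dimnames = list(letters[1:4], letters[1:4]))

# Pairs of distinct objects with zero dissimilarity
zero_pairs <- which(tt == 0 & upper.tri(tt), arr.ind = TRUE)

# Keep one representative of each set of identical rows
keep <- !duplicated(tt)
tt_unique <- tt[keep, keep]
```

After scaling `tt_unique`, the dropped objects can be given the coordinates of the representative they duplicate.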

cheers, jari oksanen
-- 
Jari Oksanen [EMAIL PROTECTED]



RE: [R] isoMDS

2004-09-09 Thread Hanke, Alex
I get the following message:
Error in isoMDS(tt) : zero or negative distance between objects 1 and 2
This makes sense since a and b are identical in their relationship to c to
h. 
Drop row 1 and col 1 and you get
> isoMDS(tt[2:8,2:8])
initial  value 14.971992 
iter   5 value 8.027815
iter  10 value 4.433377
iter  15 value 3.496364
iter  20 value 3.346726
final  value 3.233738 
converged
$points
           [,1]       [,2]
[1,] -2.3143653 -0.1259226
[2,] -0.3205746 -1.1534662
[3,] -2.8641922 -0.1182906
[4,]  0.7753674  0.1497328
[5,] -0.5705552  1.2416843
[6,]  2.2305175 -0.6995917
[7,]  3.0638025  0.7058540

$stress
[1] 3.233738

Does this help?


[R] isoMDS

2004-09-08 Thread Doran, Harold
Dear List:

I have a question regarding an MDS procedure that I am accustomed to
using. I have searched around the archives a bit and the help doc and
still need a little assistance. The function isoMDS (package MASS) is what I need to
perform the non-metric scaling, but I am working with similarity
matrices, not dissimilarities. The question may end up being resolved
simply.

Here is a bit of substantive background. I am working on a technique
where individuals organize items based on how similar they perceive the
items to be. For example, assume there are 10 items. Person 1 might
group items 1,2,3,4,5 in group 1 and the others in group 2. I then turn
this grouping into a binary similarity matrix. The following is a
sample matrix for Person 1 based on this hypothetical grouping. The off
diagonals are the similar items with the 1's representing similarities. 
  a b c d e f g h i j
a 1 1 1 1 1 0 0 0 0 0
b 1 1 1 1 1 0 0 0 0 0
c 1 1 1 1 1 0 0 0 0 0
d 1 1 1 1 1 0 0 0 0 0
e 1 1 1 1 1 0 0 0 0 0
f 0 0 0 0 0 1 1 1 1 1
g 0 0 0 0 0 1 1 1 1 1
h 0 0 0 0 0 1 1 1 1 1
i 0 0 0 0 0 1 1 1 1 1
j 0 0 0 0 0 1 1 1 1 1


These individual matrices are then summed over individuals. So, in
this summed matrix diagonal elements represent the total number of
participants and the off-diagonals represent the number of times an item
was viewed as being similar by members of the group (obviously the
matrix is symmetric below the diagonal). So, a 4 in row 'a' column 'c'
means that these items were viewed as being similar by 4 people. A
sample total matrix is at the bottom of this email describing the
perceived similarities of 10 items across 4 individuals.
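
The construction just described can be sketched as below. The group labels for the two people are hypothetical (they are not the data behind the sample total matrix), and `to_binary` is a helper name invented for this illustration.

```r
# Hypothetical group labels for 10 items from two people
items <- letters[1:10]
groupings <- list(
  person1 = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
  person2 = c(1, 2, 1, 1, 2, 2, 3, 3, 3, 3)
)

# 0/1 matrix with a 1 wherever two items were put in the same group
to_binary <- function(g) {
  m <- outer(g, g, "==") * 1L
  dimnames(m) <- list(items, items)
  m
}

# Sum the per-person matrices into the total similarity matrix
total <- Reduce(`+`, lapply(groupings, to_binary))
```

The diagonal of `total` equals the number of people, and each off-diagonal entry counts how many people grouped that pair together.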

It is this total matrix that I end up working with in the MDS. I have
previously worked in systat where I run the MDS and specify the matrix
as a similarity matrix. I then take the resulting data from the MDS and
perform a k-means cluster analysis to identify which items belong to a
particular cluster, centroids, etc.

So, here are my questions. 

1)  Can isoMDS work only with dissimilarities? Or, is there a way
that it can perform the analysis on the similarity matrix as I have
described it?
2)  If I cannot perform the analysis on the similarity matrix, how
can I turn this matrix into the necessary dissimilarity matrix? I am less
familiar with this matrix and how it would be constructed.

Thanks for any help offered,

Harold 


  a b c d e f g h i j
a 4 2 4 3 3 2 0 0 0 0
b 2 4 2 3 1 0 2 2 2 2
c 4 2 4 3 3 2 0 0 0 0
d 3 3 3 4 2 1 1 1 1 1
e 3 1 3 2 4 3 1 1 1 1
f 2 0 2 1 3 4 2 2 2 2
g 0 2 0 1 1 2 4 4 4 4
h 0 2 0 1 1 2 4 4 4 4
i 0 2 0 1 1 2 4 4 4 4
j 0 2 0 1 1 2 4 4 4 4



Re: [R] isoMDS

2004-09-08 Thread Prof Brian Ripley
On Wed, 8 Sep 2004, Doran, Harold wrote:

> 1)  Can isoMDS work only with dissimilarities? Or, is there a way
> that it can perform the analysis on the similarity matrix as I have
> described it?

Yes.  The method, as well as the function in package MASS.  All other 
MDS packages are doing a conversion, probably without telling you how.

> 2)  If I cannot perform the analysis on the similarity matrix, how
> can I turn this matrix into a dissimilarity matrix necessary? I am less
> familiar with this matrix and how it would be constructed?

Normally similarities are in the range [0,1], and people use D = 1 - S or
sqrt(1-S). (Which does not matter for isoMDS since it only uses ranks of
dissimilarities, apart from finding the starting configuration.)  See the
references on the help page for isoMDS.
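
A small sketch of why the choice between the two conversions does not change the ranks. The count matrix is hypothetical, and dividing by the maximum is just one assumed way of bringing similarities into [0,1].

```r
# Hypothetical co-occurrence counts for 4 items judged by 4 people
counts <- matrix(c(4, 3, 1, 0,
                   3, 4, 2, 1,
                   1, 2, 4, 3,
                   0, 1, 3, 4),
                 nrow = 4, byrow = TRUE)

S <- counts / max(counts)   # similarities scaled to [0, 1]

D1 <- as.dist(1 - S)        # D = 1 - S
D2 <- as.dist(sqrt(1 - S))  # D = sqrt(1 - S)

# Both conversions put the item pairs in the same order
same_order <- identical(rank(as.vector(D1)), rank(as.vector(D2)))
```

Only the starting configuration, which comes from cmdscale, sees the actual values.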

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



RE: [R] isoMDS

2004-09-08 Thread Hanke, Alex
Distances cannot always be constructed from similarities. This can be done
only if the matrix of similarities is nonnegative definite. With the
nonnegative definite condition, and with the maximum similarity scaled so
that s_ii=1, d_ik=(2*(1-s_ik))^.5
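
A sketch of that condition and conversion on a made-up 3 x 3 similarity matrix with unit diagonal. The tolerance used for the eigenvalue check is an arbitrary choice, and the conversion is written here as the square root of 2*(1 - s_ik).

```r
# Illustrative similarities with s_ii = 1
S <- matrix(c(1.0, 0.8, 0.3,
              0.8, 1.0, 0.5,
              0.3, 0.5, 1.0),
            nrow = 3, byrow = TRUE)

# Nonnegative definite: no eigenvalue below zero (up to rounding)
ev <- eigen(S, symmetric = TRUE, only.values = TRUE)$values
is_nnd <- all(ev >= -1e-8)

# Distances from similarities: d_ik = sqrt(2 * (1 - s_ik))
D <- sqrt(2 * (1 - S))
```

When `is_nnd` is FALSE the resulting values are still usable as dissimilarities, just not as Euclidean distances.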

Check out the vegan package.
Alex



RE: [R] isoMDS

2004-09-08 Thread Prof Brian Ripley
On Wed, 8 Sep 2004, Hanke, Alex wrote:

> Distances cannot always be constructed from similarities. This can be done
> only if the matrix of similarities is nonnegative definite. With the
> nonnegative definite condition, and with the maximum similarity scaled so
> that s_ii=1, d_ik=(2*(1-s_ik))^.5

But isoMDS works with dissimilarities not distances.

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



RE: [R] isoMDS

2004-09-08 Thread Hanke, Alex
I don't understand.
If isoMDS does not work with distances, why does the help for isoMDS
indicate that the "Data are assumed to be dissimilarities or relative
distances"?
Equally confusing is the loose use of the terms dissimilarities and
distances in the literature. As you point out in your book, distances are
often called dissimilarities.
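
One concrete way to separate the two terms is the triangle inequality, which a distance must satisfy and a general dissimilarity need not. The helper below, `violates_triangle`, is a made-up name for this sketch and the small matrix is illustrative.

```r
# TRUE if some triple i, j, k has D[i, k] > D[i, j] + D[j, k]
violates_triangle <- function(D, tol = 1e-12) {
  D <- as.matrix(D)
  for (j in seq_len(nrow(D))) {
    via_j <- outer(D[, j], D[j, ], "+")  # detour through object j
    if (any(D > via_j + tol)) return(TRUE)
  }
  FALSE
}

# A dissimilarity that is not a distance: 5 > 1 + 1
D_bad <- matrix(c(0, 1, 5,
                  1, 0, 1,
                  5, 1, 0), nrow = 3, byrow = TRUE)

# Euclidean distances among three points in the plane
D_euc <- dist(matrix(c(0, 0,
                       1, 0,
                       0, 1), ncol = 2, byrow = TRUE))

bad_result <- violates_triangle(D_bad)
euc_result <- violates_triangle(D_euc)
```

Non-metric scaling only needs the ordering of the entries, so a matrix like `D_bad` is still acceptable input in principle.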



RE: [R] isoMDS

2004-09-08 Thread Doran, Harold
Thank you. Quick clarification. isoMDS only works with dissimilarities.
Converting my similarity matrix into the dissimilarity matrix is done as
(from an email I found on the archives)

> d <- max(tt) - tt

Where tt is the similarity matrix. With this, I tried isoMDS as follows:

> tt.mds <- isoMDS(d)

and I get the following error message. 

Error in isoMDS(d) : An initial configuration must be supplied with
NA/Infs in d. I was a little confused on exactly how to specify this
initial config. So, from here I ran cmdscale on d as

> d.mds <- cmdscale(d)

which seemed to work fine and produce reasonable results. I was able to
take the coordinates and run them through a k-means cluster and the
results seemed to correctly match the grouping structure I created for
this sample analysis.

Cmdscale is for metric scaling, but it seemed to produce the results
correctly. 

So, did I correctly convert the similarity matrix to the dissimilarity
matrix? Second, should I have used cmdscale rather than isoMDS as I have
done? Or, is there a way to specify the initial configuration that I
have not done correctly.

Again, many thanks.

Harold



Re: [R] isoMDS

2004-09-08 Thread Kjetil Brinchmann Halvorsen
Doran, Harold wrote:
> Thank you. Quick clarification. isoMDS only works with dissimilarities.
> Converting my similarity matrix into the dissimilarity matrix is done as
> (from an email I found on the archives)
>
> > d <- max(tt) - tt

Mardia, Kent & Bibby define the standard transformation from a 
similarity matrix to a dissimilarity
(distance) matrix by

d_rs <- sqrt(c_rr - 2*c_rs + c_ss)
where c_rs are the similarities. This assures the diagonal of the 
dissimilarity matrix to be zero.
You could try that.

Kjetil Halvorsen

--
Kjetil Halvorsen.
Peace is the most effective weapon of mass construction.
  --  Mahdi Elmandjra


[R]: isoMDS using

2004-01-03 Thread Eugenij P. Altshuler
Happy New Year!

I tried to use isoMDS to present graphically a matrix of coefficients of
divergence, and I got the error "NAs/Infs not allowed in d".

But there are no NAs or Infs in my matrix!
Function `as.vector' (which is applied to test input data with
`!is.finite') returns in one case the input matrix and in another case
the sequence of values of the input matrix. When it returns a matrix I
receive the error "NAs/Infs not allowed in d". When it returns the sequence
of values I don't receive this error.

Is it possible to use coefficients of divergence instead of dissimilarities?

Thank you!!

---
Altshuler Eugenij P.
Moscow South-West High School
mailto:[EMAIL PROTECTED]



[R] isoMDS results

2003-03-25 Thread Christian Hennig
Hi,

this is a second try to post this to the R-help mailing list. The first one
has been rejected because of a too large attachment.
Now I ask this without attaching the data. If you want to reproduce the
results, please contact me directly to get the data.

(First mail, rejected:)
> Attached there is a 149*149 dissimilarity matrix; it is a file obtained by
> save(dm,file="dissim.Rsav").

OK, here is my question:

I worry about the reproducibility of the results of isoMDS. 

I try
> set.seed(5678)
> mdslinux <- isoMDS(dm,k=4)
initial  value 31.071976 
final  value 31.071976 
converged

> R.version
         _
platform i686-pc-linux-gnu
arch     i686
os       linux-gnu
system   i686, linux-gnu
status
major    1
minor    6.2
year     2003
month    01
day      10
language R


My co-worker works also with the same dissimilarity matrix and did the same
on a Windows machine (unfortunately I do not have the version data, but
it should not be too old) and got
> set.seed(5678)
> mdswin <- isoMDS(dm,k=4)
initial  value 31.071976 
final  value 24.16980
converged

As is to be expected, the resulting MDS configurations differ as well. 
Initially, the cmdscale solution seems to be used, and this is
identical. BTW, I have often (but not always) observed that the isoMDS
iteration does not change anything from the cmdscale initial configuration
on my machine, and I have been somewhat sceptical more than once whether
this is correct. 

Can all this be explained with Windows/Linux differences or what else may
happen here?
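
One way to probe whether a run is stuck at its starting point is to restart isoMDS from several random configurations and compare the final stress values. The sketch below uses made-up data (20 random points, not the 149*149 matrix) and an arbitrary number of restarts; the object names are illustrative.

```r
library(MASS)  # provides isoMDS

# Illustrative data: distances among 20 random points in 3 dimensions
set.seed(5678)
d <- dist(matrix(rnorm(60), ncol = 3))
n <- attr(d, "Size")

# Five runs, each from a different random 2-D starting configuration
fits <- lapply(1:5, function(i) {
  init <- matrix(rnorm(n * 2), ncol = 2)
  isoMDS(d, y = init, k = 2, trace = FALSE)
})

# Keep the run with the lowest final stress
stress <- vapply(fits, function(f) f$stress, numeric(1))
best <- fits[[which.min(stress)]]
```

Comparing `stress` with the value from the default cmdscale start shows whether the default run ended in a poorer local optimum.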

Best,
Christian



-- 
***
Christian Hennig
Seminar fuer Statistik, ETH-Zentrum (LEO), CH-8092 Zuerich (currently)
and Fachbereich Mathematik-SPST/ZMS, Universitaet Hamburg
[EMAIL PROTECTED], http://stat.ethz.ch/~hennig/
[EMAIL PROTECTED], http://www.math.uni-hamburg.de/home/hennig/
###
ich empfehle www.boag.de
