If you want to see how Venables and Ripley get their lda distances, then this is a quick path to the (uncommented) source:

1) > methods(lda)
[1] lda.data.frame* lda.default*    lda.formula*    lda.matrix*
2) since you call involved a formula I looked first at:
> getAnywhere(lda.formula)
A single object matching ‘lda.formula’ was found
<snipped output because after setup the data was dispatched to lda.default
3) > getAnywhere(lda.default)
A single object matching ‘lda.default’ was found
<snipped output.
<scroll down to the section where they do their scaling and learn from the masters.

The source code in the original package may have advantages in that the comments would be visible. There was an R News, if memory serves, by Uwe Ligges, if memory serves, a few years ago on getting access to the source that is good reading.
--
David


2)On Sep 29, 2009, at 12:39 PM, Pete Shepard wrote:

Thanks David,

Yes, I am talking about the MASS package.Thank you for pointing out that
these scale the same. My question is, how do I get from the V1 data:

      V1
1 164.4283
2 166.2492
3 170.5232
4 156.5622
5 127.7540
6 136.7704
7 136.3436

to the other set of data:


+ 1 -2.3769280
+ 2 -2.7049437
+ 3 -3.4748309
+ 4 -0.9599825
+ 5  4.2293774
+ 6  2.6052193
+ 7  2.6820884

On Mon, Sep 28, 2009 at 3:41 PM, David Winsemius <dwinsem...@comcast.net >wrote:

Your results are the same (after scaling and sign reversal) out to the 4th decimal place as those from lda (which by the way is almost certainly from
the MASS package and not from an impossible to find "lda package".)

read.table(textConnection(txt))
      V1
1 164.4283
2 166.2492
3 170.5232
4 156.5622
5 127.7540
6 136.7704
7 136.3436
est <-read.table(textConnection(txt))
scale(est)
           V1
[1,]  0.7656185
[2,]  0.8712707
[3,]  1.1192567
[4,]  0.3092117
[5,] -1.3622976
[6,] -0.8391481
[7,] -0.8639119
attr(,"scaled:center")
   V1
151.233
attr(,"scaled:scale")
    V1
17.23484

LD1est <- read.table(textConnection(" LD1
+ 1 -2.3769280
+ 2 -2.7049437
+ 3 -3.4748309
+ 4 -0.9599825
+ 5  4.2293774
+ 6  2.6052193
+ 7  2.6820884"), header=T)


scale(LD1est)
       LD1
1 -0.7656170
2 -0.8712721
3 -1.1192555
4 -0.3092138
5  1.3622976
6  0.8391505
7  0.8639103
attr(,"scaled:center")
        LD1
-3.172066e-17
attr(,"scaled:scale")
   LD1
3.104591


On Sep 28, 2009, at 5:54 PM, Pete Shepard wrote:

I am having a problem understanding the lda package. I have a dataset
here:

 [,1] [,2] [,3]
[1,] 2.95 6.63    0
[2,] 2.53 7.79    0
[3,] 3.57 5.65    0
[4,] 3.16 5.47    0
[5,] 2.58 4.46    1
[6,] 2.16 6.22    1
[7,] 3.27 3.52    1

If I do the following;

"names(d)<-c("y","x1","x2")
d$x1 = d$x1 * 100
d$x2 = d$x2 * 100
g<-lda( y ~ x1 + x2, data=d)
v2 <- predict(g, d)",

I get;
     LD1
1 -2.3769280
2 -2.7049437
3 -3.4748309
4 -0.9599825
5  4.2293774
6  2.6052193
7  2.6820884

However, If I do it manually,

"rawdata<-matrix(scan("tab1_1.


dat"),ncol=3,byrow=T)
group <- rawdata[,1]
X <- 100 * rawdata[,2:3]
Apf <- X[group==1,]
Af <- X[group==0,]
xbar1 <- apply(Af, 2, mean)
S1 <- var(Af)
N1 <- dim(Af)[1]
xbar2 <- apply(Apf, 2, mean)
S2 <- var(Apf)
N2 <- dim(Apf)[1]
S<-((N1-1)*S1+(N2-1)*S2)/(N1+N2-2)
Sinv=solve(S)
d<-xbar1-xbar2
b <- Sinv %*% d
v <- X %*% b",

I get;

     [,1]
[1,] 164.4283
[2,] 166.2492
[3,] 170.5232
[4,] 156.5622
[5,] 127.7540
[6,] 136.7704
[7,] 136.3436








I am having a problem understanding the lda package. I have a dataset
here:

 [,1] [,2] [,3]
[1,] 2.95 6.63    0
[2,] 2.53 7.79    0
[3,] 3.57 5.65    0
[4,] 3.16 5.47    0
[5,] 2.58 4.46    1
[6,] 2.16 6.22    1
[7,] 3.27 3.52    1

If I do the following;

"names(d)<-c("y","x1","x2")
d$x1 = d$x1 * 100
d$x2 = d$x2 * 100
g<-lda( y ~ x1 + x2, data=d)
v2 <- predict(g, d)",

I get;
     LD1
1 -2.3769280
2 -2.7049437
3 -3.4748309
4 -0.9599825
5  4.2293774
6  2.6052193
7  2.6820884

However, If I do it manually,

"rawdata<-matrix(scan("tab1_1.dat"),ncol=3,byrow=T)
group <- rawdata[,1]
X <- 100 * rawdata[,2:3]
Apf <- X[group==1,]
Af <- X[group==0,]
xbar1 <- apply(Af, 2, mean)
S1 <- var(Af)
N1 <- dim(Af)[1]
xbar2 <- apply(Apf, 2, mean)
S2 <- var(Apf)
N2 <- dim(Apf)[1]
S<-((N1-1)*S1+(N2-1)*S2)/(N1+N2-2)
Sinv=solve(S)
d<-xbar1-xbar2
b <- Sinv %*% d
v <- X %*% b",

I get;

     [,1]
[1,] 164.4283
[2,] 166.2492
[3,] 170.5232
[4,] 156.5622
[5,] 127.7540
[6,] 136.7704
[7,] 136.3436


It seems there is an extra step that I am missing? The predict step that adds a constant to the second set of values? Can anyone clear this up for
me?





David Winsemius, MD
Heritage Laboratories
West Hartford, CT



        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to