Re: [R] Block comments in R?

2006-10-06 Thread Jose Claudio Faria
Hi all,

Indeed, block comment is more clean and elegant than line by line.

If the R interpreter will recognizes it (I'm not sure if already recognizes), 
we 
will spend no more than few hours to make the syntax of the main editors 
compatible, isn't it?

Regards,
-- 
Jose Claudio Faria
Brasil/Bahia/Ilheus/UESC/DCET
Estatística Experimental/Prof. Adjunto
mails: [EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]
tel: 73-3634.2779

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Small help with plot.mca

2006-10-05 Thread Jose Claudio Faria
Dear list,

Please, how can I get all the levels of the same factor to have the same color 
and different color for each factor?

Script:

library(MASS)
plot(mca(farms, abbrev = TRUE), rows=F)

Thanks in advance,
-- 
Jose Claudio Faria
Brasil/Bahia/Ilheus/UESC/DCET
Estatística Experimental/Prof. Adjunto
mails: [EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]
tel: 73-3634.2779

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Doubt about Student t distribution simulation

2006-08-04 Thread Jose Claudio Faria
Dear R list,

I would like to illustrate the origin of the Student t distribution using R.

So, if (sample.mean - pop.mean) / standard.error(sample.mean) has t 
distribution with (sample.size - 1) degree free, what is wrong with the 
simulation below? I think that the theoretical curve should agree with 
the relative frequencies of the t values calculated:

#== begin options=
# parameters
   mu= 10
   sigma =  5

# size of sample
   n = 3

# repetitions
   nsim = 1

# histogram parameter
   nchist = 150
#== end options===

t   = numeric()
pop = rnorm(1, mean = mu, sd = sigma)

for (i in 1:nsim) {
   amo.i = sample(pop, n, replace = TRUE)
   t[i]  = (mean(amo.i) - mu) / (sigma / sqrt(n))
}

win.graph(w = 5, h = 7)
split.screen(c(2,1))
screen(1)
hist(t,
  main = histogram,
  breaks   = nchist,
  col  = lightgray,
  xlab = '', ylab = Fi,
  font.lab = 2, font = 2)

screen(2)
hist(t,
  probability = T,
  main= 'f.d.p and histogram',
  breaks  = nchist,
  col = 'lightgray',
  xlab= 't', ylab = 'f(t)',
  font.lab= 2, font = 2)

x = t
curve(dt(x, df = n-1), add = T, col = red, lwd = 2)

Many thanks for any help,
___
Jose Claudio Faria
Brasil/Bahia/Ilheus/UESC/DCET
Estatística Experimental/Prof. Adjunto
mails: [EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Doubt about Student t distribution simulation

2006-08-04 Thread Jose Claudio Faria
Dears John, Peter and Sundar,

Many thanks for the quick answers!!!

.. and sorry for all..

[]s
___
Jose Claudio Faria
Brasil/Bahia/Ilheus/UESC/DCET
Estatística Experimental/Prof. Adjunto
mails: [EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]

John Fox escreveu:
 Dear Jose,
 
 The problem is that you're using the population standard deviation (sigma)
 rather than the sample SD of each sample [i.e., t[i]  = (mean(amo.i) - mu) /
 (sd(amo.i) / sqrt(n)) ], so your values should be normally distributed, as
 they appear to be.
 
 A couple of smaller points: (1) Even after this correction, you're sampling
 from a discrete population (albeit with replacement) and so the values won't
 be exactly t-distributed. You could draw the samples directly from N(mu,
 sigma) instead. (2) It would be preferable to make a quantile-comparison
 plot against the t-distribution, since you'd get a better picture of what's
 going on in the tails.
 
 I hope this helps,
  John 
 
 
 John Fox
 Department of Sociology
 McMaster University
 Hamilton, Ontario
 Canada L8S 4M4
 905-525-9140x23604
 http://socserv.mcmaster.ca/jfox 
  
 
 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Jose 
 Claudio Faria
 Sent: Friday, August 04, 2006 3:09 PM
 To: R-help@stat.math.ethz.ch
 Subject: [R] Doubt about Student t distribution simulation

 Dear R list,

 I would like to illustrate the origin of the Student t 
 distribution using R.

 So, if (sample.mean - pop.mean) / standard.error(sample.mean) 
 has t distribution with (sample.size - 1) degree free, what 
 is wrong with the simulation below? I think that the 
 theoretical curve should agree with the relative frequencies 
 of the t values calculated:

 #== begin options=
 # parameters
mu= 10
sigma =  5

 # size of sample
n = 3

 # repetitions
nsim = 1

 # histogram parameter
nchist = 150
 #== end options===

 t   = numeric()
 pop = rnorm(1, mean = mu, sd = sigma)

 for (i in 1:nsim) {
amo.i = sample(pop, n, replace = TRUE)
t[i]  = (mean(amo.i) - mu) / (sigma / sqrt(n)) }

 win.graph(w = 5, h = 7)
 split.screen(c(2,1))
 screen(1)
 hist(t,
   main = histogram,
   breaks   = nchist,
   col  = lightgray,
   xlab = '', ylab = Fi,
   font.lab = 2, font = 2)

 screen(2)
 hist(t,
   probability = T,
   main= 'f.d.p and histogram',
   breaks  = nchist,
   col = 'lightgray',
   xlab= 't', ylab = 'f(t)',
   font.lab= 2, font = 2)

 x = t
 curve(dt(x, df = n-1), add = T, col = red, lwd = 2)

 Many thanks for any help,
 ___
 Jose Claudio Faria
 Brasil/Bahia/Ilheus/UESC/DCET
 Estatística Experimental/Prof. Adjunto
 mails: [EMAIL PROTECTED]
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 
 Esta mensagem foi verificada pelo E-mail Protegido Terra.
 Scan engine: McAfee VirusScan / Atualizado em 04/08/2006 / Versão: 4.4.00/4822
 Proteja o seu e-mail Terra: http://mail.terra.com.br/
 
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Axis Title in persp() Overlaps with Axis Labels

2006-07-26 Thread Jose Claudio Faria
Dear Kilian,

Also give a looked at: 
http://wiki.r-project.org/rwiki/doku.php?id=graph_gallery:new-graphics

You will see a new and very flexible function to 3D plot.

Regards,
__
Jose Claudio Faria
Brasil/Bahia/Ilheus/UESC/DCET
Estatística Experimental/Prof. Adjunto
mails: [EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]

Paul Murrell p.murrell at auckland.ac.nz writes:

 
  Hi
 
  Kilian Plank wrote:
   Good morning,
  
   in a 3D plot based on persp() the axis title (of dimension z) 
overlaps with
   the axis labels.
   How can the distance (between axis labels and axis title) be increased?
 

  Paul

   Another way to do it: get the perspective matrix
back from persp() and use trans3d() to redo essentially
the same calculations that persp() does to decide where
to put the label:

x - seq(-10, 10, length= 30)
y - x
f - function(x,y) { r - sqrt(x^2+y^2); 10 * sin(r)/r }
z - outer(x, y, f)
z[is.na(z)] - 1

par(mfrow=c(2, 2))
persp(x, y, z, theta = 30, phi = 30, expand = 0.5,
   col = lightblue, ticktype=detailed)

persp(x, y, z, theta = 30, phi = 30, expand = 0.5,
   col = lightblue, ticktype=detailed,
   zlab=\n\n\n\nz)

p1 - persp(x, y, z, theta = 30, phi = 30, expand = 0.5,
   col = lightblue, ticktype=detailed,zlab=)

ranges - t(sapply(list(x,y,z),range))
means - rowMeans(ranges)

## label offset distance, as a fraction of the plot width
labelspace - 0.12  ## tweak this until you like the result

xpos - min(x)-(diff(range(x)))*labelspace
ypos - min(y)-(diff(range(y)))*labelspace
labelbot3d - c(xpos,ypos,min(z))
labeltop3d - c(xpos,ypos,max(z))
labelmid3d - c(xpos,ypos,mean(range(z)))

trans3dfun - function(v) { trans3d(v[1],v[2],v[3],p1) }
labelbot2d - trans3dfun(labelbot3d)
labelmid2d - trans3dfun(labelmid3d)
labeltop2d - trans3dfun(labeltop3d)
labelang - 
180/pi*atan2(labeltop2d$y-labelbot2d$y,labeltop2d$x-labelbot2d$x)
par(xpd=NA,srt=labelang)  ## disable clipping and set string rotation
text(labelmid2d[1]$x,labelmid2d$y,z label)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ANOVA: Help with SSQ decomposition and contrasts

2006-02-13 Thread Jose Claudio Faria
# Dear R list,
#
# A have a doubt about SSQ decomposition and contrasts with ANOVAs.
# So, I would like a tip from more advanced R users.

# Below my data, the basic script and my doubts:

# Data
r   = paste('r', gl(3, 8), sep='')
e   = paste('e', rep(gl(2, 4), 3), sep='')
tra = sort(paste('t', rep(1:6, 4), sep=''))
y   = c(26.2, 26.0, 25.0, 25.4, 24.8, 24.6, 26.7, 25.2,
 25.7, 26.3, 25.1, 26.4, 19.6, 21.1, 19.0, 18.6,
 22.8, 19.4, 18.8, 19.2, 19.8, 21.4, 22.8, 21.3)
dF  = data.frame(r, e, tra, y)

# Graphic
par(mfrow=c(2,1))
interaction.plot(dF$r, dF$e, dF$y,
  col = 'blue', ylab = 'Y', xlab = 'R')

interaction.plot(dF$e, dF$r, dF$y,
  col = 'blue', ylab = 'Y', xlab = 'R')

# ANOVAs
av0 = aov(y ~ tra, data=dF)
summary(av0)

av1 = aov(y ~ r*e, data=dF)
summary(av1)

av2 = aov(y ~ r/e, data=dF)
e_r = summary(av2, split = list('r:e' = list(
   'e1 vs e2/r1' = 1, 'e1 vs e2/r2' = 2, 'e1 vs e2/r3' = 3)))
e_r

av3 = aov(y ~ e/r, data=dF)
r_e = summary(av3, split = list('e:r' = list(
  'r/e1' = c(1,3), 'r/e2' = c(2,4
r_e

# My Doubts

# a) How to make SSQ decomposition to complete the ANOVA below?

#Df  Sum Sq Mean Sq F valuePr(F)
# e1  19.082  19.082  14.875  0.001155
# e:r  4 156.622  39.155  30.524 8.438e-08
#   e:r: r/e1  2  87.122  43.561  33.958 7.776e-07
# r1 vs (r2, r3)   1   ?   ?   ? ?
# r2 vs r3 1   ?   ?   ? ?
#   e:r: r/e2  2  69.500  34.750  27.090 3.730e-06
# r1 vs (r2, r3)   1   ?   ?   ? ?
# r2 vs r3 1   ?   ?   ? ?
# Residuals   18  23.090   1.283

# b) How to make SSQ decomposition to complete the ANOVA below?

#Df  Sum Sq Mean Sq F valuePr(F)
# e1  19.082  19.082  14.875  0.001155
# e:r  4 156.622  39.155  30.524 8.438e-08
#   e:r: r/e1  2  87.122  43.561  33.958 7.776e-07
# r2 vs (r1, r3)   1   ?   ?   ? ?
# r1 vs r3 1   ?   ?   ? ?
#   e:r: r/e2  2  69.500  34.750  27.090 3.730e-06
# r2 vs (r1, r3)   1   ?   ?   ? ?
# r1 vs r3 1   ?   ?   ? ?
# Residuals   18  23.090   1.283

# TIA,
-- 
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
tel: 73-3634.2779

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] linear contrasts with anova

2006-01-24 Thread Jose Claudio Faria
 Date: Mon, 23 Jan 2006 13:25:33 +0100
 From: Christoph Buser [EMAIL PROTECTED]
 Subject: Re: [R] linear contrasts with anova
 To: Posta Univ. Cagliari [EMAIL PROTECTED]
 Cc: r-help@stat.math.ethz.ch
 Message-ID: [EMAIL PROTECTED]
 Content-Type: text/plain; charset=us-ascii
 
 Dear Marco
 
 If you are interested in a comparison of a reference level
 against each other level (in your case level 1 against level 2
 and level 1 against level 3), you can use the summary.lm(),
 since this is the default contrast (see ?contr.treatment)
 
 ar - data.frame(GROUP = factor(rep(1:3, each = 8)),
  DIP = c(3.0, 3.0, 4.0, 5.0, 6.0, 7.0, 3.0, 2.0, 1.0, 6.0, 
 5.0,
7.0, 2.0, 3.0, 1.5, 1.7, 17.0, 12.0, 15.0, 16.0, 12.0, 
 23.0,
19.0, 21.0))
 
 
 r.aov10 - aov(DIP ~  GROUP, data = ar)
 anova(r.aov10)
 summary.lm(r.aov10)
 
 As result you will get the comparison GROUP 2 against GROUP 1,
 denoted by GROUP2 and the comparison GROUP 3 against GROUP 1,
 denoted by GROUP3.
 
 Be careful. In your example you include both GROUP and C1 or C2,
 respectively in your model. This results in a over parameterized
 model and you get a warning that not all coefficients have been
 estimated, due to singularities.
 
 It is possible to use other contrasts than contr.treatment,
 contr.sum, contr.helmert, contr.poly, but then you have to
 specify the correctly in the model.
 
 Regards,
 
 Christoph Buser
 
 --
 Christoph Buser [EMAIL PROTECTED]
 Seminar fuer Statistik, LEO C13
 ETH (Federal Inst. Technology)8092 Zurich  SWITZERLAND
 phone: x-41-44-632-4673   fax: 632-1228
 http://stat.ethz.ch/~buser/
 --

Dear Marco

Try also the the below:

# Loading packages
library(gplots)
library(gmodels)

# Testing the nature of dF
is.data.frame(dF)
is.factor(dF$GROUP)
is.numeric(dF$DIP)

#Plotting GROUPS
win.graph(w = 4, h = 5)
plotmeans(DIP ~ GROUP, data = dF, mean.labels = TRUE,
   digits = 3, col = 'blue', connect = FALSE,
   ylab = 'DIP', xlab = 'GROUP', pch='')

# Contrasts
attach(dF)
  #1   2   3 GROUP
cmat = rbind('1 vs. 3' = c( 1,  0, -1,),
 '1 vs. 2' = c( 1, -1,  0,))

   av  = aov(DIP ~ GROUP, data = dF, contrasts = list(GROUP = 
make.contrasts(cmat)))
   sav = summary(av1, split = list(GROUP=list('1 vs. 3'=1,
 '1 vs. 2'=2)))
   sav
detach(dF)

# Another option

attach(dF)
 #A   B   C
   fit.contrast(av, GROUP, c( 1,  0, -1)) # from gmodels
   fit.contrast(av, GROUP, c( 1, -1,  0))
detach(dF)

HTH, []s,
-- 
Jose Claudio Faria
Brasil/Bahia/Ilhéus/UESC/DCET
Estatística Experimental/Prof. Adjunto
mails:
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] R (2.2.0), R-DCOM and Delphi

2005-11-07 Thread Jose Claudio Faria
Dear Dieter,

First, thank you for your work!
Below the error message I got when trying to compile your RDCom.dpr

[Warning] STATCONNECTORCLNTLib_TLB.pas(319): Unsafe type 'EventDispIDs: Pointer'
[Warning] STATCONNECTORCLNTLib_TLB.pas(320): Unsafe type 'LicenseKey: Pointer'
[Warning] STATCONNECTORCLNTLib_TLB.pas(324): Unsafe code '@ operator'
[Warning] STATCONNECTORCLNTLib_TLB.pas(367): Unsafe type 'EventDispIDs: Pointer'
[Warning] STATCONNECTORCLNTLib_TLB.pas(368): Unsafe type 'LicenseKey: Pointer'
[Warning] STATCONNECTORCLNTLib_TLB.pas(372): Unsafe code '@ operator'
[Warning] STATCONNECTORCLNTLib_TLB.pas(374): Unsafe code '@ operator'
[Warning] STATCONNECTORSRVLib_TLB.pas(376): Unsafe type 'LicenseKey: Pointer'
[Warning] STATCONNECTORSRVLib_TLB.pas(379): Unsafe code '@ operator'
[Warning] RCom.pas(93): Unsafe code 'String index to var param'
[Error] RCom.pas(119): Undeclared identifier: 'VarType'
[Error] RCom.pas(124): Undeclared identifier: 'VarArrayDimCount'
[Error] RCom.pas(127): Undeclared identifier: 'VarArrayHighBound'
[Error] RCom.pas(140): Undeclared identifier: 'VarType'
[Error] RCom.pas(145): Undeclared identifier: 'VarArrayDimCount'
[Error] RCom.pas(148): Undeclared identifier: 'VarArrayHighBound'
[Error] RCom.pas(161): Undeclared identifier: 'VarType'
[Error] RCom.pas(166): Undeclared identifier: 'VarArrayDimCount'
[Error] RCom.pas(169): Undeclared identifier: 'VarArrayHighBound'
[Error] RCom.pas(181): Undeclared identifier: 'VarArrayCreate'
[Error] RCom.pas(196): Undeclared identifier: 'VarArrayCreate'
[Error] RCom.pas(211): Undeclared identifier: 'VarArrayCreate'
[Error] RCom.pas(226): Undeclared identifier: 'VarArrayCreate'
[Error] RCom.pas(253): Undeclared identifier: 'VarType'
[Error] RCom.pas(258): Undeclared identifier: 'VarArrayDimCount'
[Error] RCom.pas(261): Undeclared identifier: 'VarArrayHighBound'
[Fatal Error] RDComMain.pas(14): Could not compile used unit 'RCom.pas'

I'm using Delphi 7 under WinXP pro/SP2.
Could you give me a tip?

Regards,
-- 
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
tel: 73-3634.2779

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Help with lattice, regressions and respective lines

2005-10-14 Thread Jose Claudio Faria
# Dear R list,
#
# I'm needing help with lattice, regression and respective lines.
# My data is below:

bra  = gl(2, 24, label = c('c', 's'))
em   = rep(gl(3, 8,  label = c('po', 'pov', 'ce')), 2)
tem  = rep(c(0, 0, 30, 30, 60, 60, 90, 90), 6)
tem2 = tem^2
r= rep(1:2, 24)
y= c(40.58, 44.85, 32.55, 35.68, 64.86, 51.95, 42.52, 52.21,
  40.58, 44.85, 33.46, 46.09, 12.75, 18.01, 16.82, 13.69,
  40.58, 44.85, 34.45, 29.89, 34.91, 28.10, 27.52, 22.24,
  48.68, 47.25, 45.58, 45.33, 41.03, 51.20, 45.85, 54.45,
  48.68, 47.25, 19.88, 19.67, 16.20, 13.49, 13.75, 18.80,
  48.68, 47.25, 42.19, 39.91, 34.69, 34.11, 32.74, 34.24)

Df = data.frame(bra, em, tem, tem2, r, y)

# Regressions
attach(Df)
   Dfs1=subset(Df, (bra=='s'  em=='pov'), select=c(bra, em, tem, tem2, r, y))
   Dfs1
   rlin1=lm(y ~ tem + tem2, data=Dfs1)
   summary(rlin1)

   Dfs2=subset(Df, (bra=='s'  em=='po'), select=c(bra, em, tem, r, y))
   Dfs2
   rlin2=lm(y ~ tem, data=Dfs2)
   summary(rlin2)

   Dfs3=subset(Df, (bra=='s'  em=='ce'), select=c(bra, em, tem, tem2, r, y))
   Dfs3
   rlin3=lm(y ~ tem + tem2, data=Dfs3)
   summary(rlin3)
detach(Df)

# I would like to plot with lattice 'y ~ tem | em',
# with the panels ('po', 'pov' and 'ce'),
# and the its respective regressions lines:
# a) linear for panel 'po' or better, without line;
# b) quadratic for 'pov' and 'ce'

# Is it possible? Could somebody hel me?

# I'm trying:
library(lattice)
attach(Df)
   Dfs=subset(Df, bra=='s', select=c(bra, em, tem, y))
   Dfs
   xyplot(y ~ tem | em,
  data = Dfs, ylim=c(10, 60), xlim=c(-10, 110),
  ylab='y', xlab='Time, days',
  layout = c(3,1))
detach(Df)

TIA,
-- 
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
tel: 73-3634.2779

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Help with a more flexible funtion for multiple comparisio n of means

2005-09-12 Thread Jose Claudio Faria
Rogers, James A [PGRD Groton] wrote:
 Jose - 
 
 Before implementing SNK and Duncan's, you may want to be aware of some
 criticisms of these methods:
 
From Hsu (1996), 
 
 Newman-Keuls multiple range test is not a confident inequalities method and
 cannot be recommended.
 
 Duncan's multiple range test is not a confident inequalities method and
 cannot be recommended either. In the words of Tukey (1991), Duncan's
 multiple range test was a 'distraction' in the history of multiple
 comparisons, amounting to 'talking 5% while using more than 5%
 simultaneous'
 
 @Book{Hsu1996,
   author = {Jason C. Hsu},
   title =  {Multiple Comparisons: Theory and Methods},
   publisher =  {Chapman  Hall},
   year =   {1996}
 }
 
 
 -- Jim 

Ok Jim, I know these limitations, but these tests exists.
BTW, I am making this function for academic reasons and to learning how to make 
more advanced R functions.

Thanks for all.

Best,
-- 
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
tel: 73-3634.2779

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] help with one matrix

2005-09-11 Thread Jose Claudio Faria
Gabor Grothendieck wrote:

 On 9/10/05, Jose Claudio Faria [EMAIL PROTECTED] wrote:
 
Dear R-list,

Could anybody tell me how to make one matrix as the below:

 [,1] [,2] [,3] [,4] [,5] [,6]
[1,]-23456
[2,]2-2345
[3,]32-234
[4,]432-23
[5,]5432-2
[6,]65432-

 
 
 Assuming that - means NA 
 
 dd - diag(NA, 6)
 abs(col(dd) - row(dd)) + 1 + dd
 
 or
 
 abs(outer(1:6, 1:6, -)) + 1 + diag(NA,6)
 
 or
 
 f - function(x,y) ifelse(x==y, NA, abs(x-y)+1)
 outer(1:6, 1:6, f)

Hi,

You are always solving (and teaching) my R doubts: thanks Gabor, very much!
Because I need one, I've been trying to make a more flexible function for 
multiple comparison test of means (Tukey, SNK and Duncan). The matrix above is 
necessary for SNK and Duncan tests. So, when running I will to sent it for you 
for suggestions.

Best,
-- 
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
tel: 73-3634.2779

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Help with a more flexible funtion for multiple comparision of means

2005-09-11 Thread Jose Claudio Faria
Dear R-list,

Could anybody tell me (or give me a tip) of how to implement the Duncan 
distribution in R?

I've been trying to make a new and more flexible function for multiple 
comparison of means: Tukey, SNK and Duncan, from 'aov' objects, like TukeyHSD 
function.

For while, it is running nice (Tukey and SNK), for simple design (completely 
randomized, randomized block and Latin squares) and simple experimental schemes 
  (one factor).

I'm needing only two informations: 'qduncan' and 'pduncan',
similar to already available in R 'qtukey' and 'ptukey'. The basic algorithm 
implemented with SNK test will be used for Duncan test.

Below a sample:
a) Generating data and calling the function:

tra = gl(4, 5, label = c('A', 'B', 'C', 'D'))
blo   = rep(1:5, 4)
pro = c(NA, 26, 20, 23, 21, 31, 25, 28, 27, 24, 22, 26, NA, 25, 29, 33, 29, 31, 
34, NA)

x   = aov(pro ~ tra) #or x= aov(pro ~ blo + tra)
res = mctm(x, which='tra', test='SNK', conf.level=0.95)
print(res)

b) The R output:

$Table
Tables of means
Grand mean

26.70588

  tra
A  BC D
 22.5 27 25.5 31.75
rep  4.0  5  4.0  4.00

$Ordered means
tra
 D B C A
31.75 27.00 25.50 22.50

$Result
   D  B  C  A
D -  *  *  *
B *  - ns ns
C * ns  - ns
A * ns ns  -

$Test
[1] SNK

$Conf.level
[1] 0.95

$Mean differences
  DBCA
D 0.00 4.75 6.25 9.25
B 4.75 0.00 1.50 4.50
C 6.25 1.50 0.00 3.00
A 9.25 4.50 3.00 0.00

$Minimum Significative Differences - MSD
  DBCA
D 0.00 3.83 4.93 5.48
B 3.83 0.00 3.83 4.68
C 4.93 3.83 0.00 4.04
A 5.48 4.68 4.04 0.00

$Replicates number
 D   B   C   A
D   - 4:5 4:4 4:4
B 5:4   - 5:4 5:4
C 4:4 4:5   - 4:4
A 4:4 4:5 4:4   -

Thanks in advance,
-- 
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
tel: 73-3634.2779

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] help with one matrix

2005-09-11 Thread Jose Claudio Faria
Hi Jim,

Many thanks for the function!
-- 
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
tel: 73-3634.2779

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] help with one matrix

2005-09-10 Thread Jose Claudio Faria
Dear R-list,

Could anybody tell me how to make one matrix as the below:

  [,1] [,2] [,3] [,4] [,5] [,6]
[1,]-23456
[2,]2-2345
[3,]32-234
[4,]432-23
[5,]5432-2
[6,]65432-

Thanks in advance,
-- 
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
tel: 73-3634.2779

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Statistics with R

2005-08-30 Thread Jose Claudio Faria
Hi Vincent,

Thanks for your good work!

Best,
-- 
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
tel: 73-3634.2779

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] An small suggestion for the R team

2005-08-04 Thread Jose Claudio Faria
Hi all,

I would like to suggest that all R functions/etc like:
   codes-deprecated
   grid-internal
   ns-alt
   ns-dblcolon
   ns-hooks
   ns-internals
   ns-lowlev
   ns-reflect.Rd
   tools-internal
   ts-defunct
   utils-deprecated
   utils-internal
   ... and another

i.e, function/word separate for '-'

were all substituted by
   codes_deprecated
   grid_internal
   ns_alt
   ns_dblcolon
   ns_hooks
   ns_internals
   ns_lowlev
   ns_reflect.Rd
   tools_internal
   ts_defunct
   utils_deprecated
   utils_internal
   ... and another

i.e, by '_'

Because it is impossible to make a good highlighter with the first one.

How to differentiating myValue:

varOne = 100
varTwo = 50
myValue = varOne-varTwo

from codes-deprecated, or ns-alt, for example.

TIA,
-- 
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
tel: 73-3634.2779

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] An small suggestion for the R team

2005-08-04 Thread Jose Claudio Faria
Gabor Grothendieck wrote:
 On 8/4/05, Gabor Grothendieck [EMAIL PROTECTED] wrote:
 
On 8/4/05, Jose Claudio Faria [EMAIL PROTECTED] wrote:

Hi all,

I would like to suggest that all R functions/etc like:
  codes-deprecated
  grid-internal
  ns-alt
  ns-dblcolon
  ns-hooks
  ns-internals
  ns-lowlev
  ns-reflect.Rd
  tools-internal
  ts-defunct
  utils-deprecated
  utils-internal
  ... and another

i.e, function/word separate for '-'

were all substituted by
  codes_deprecated
  grid_internal
  ns_alt
  ns_dblcolon
  ns_hooks
  ns_internals
  ns_lowlev
  ns_reflect.Rd
  tools_internal
  ts_defunct
  utils_deprecated
  utils_internal
  ... and another

i.e, by '_'

Because it is impossible to make a good highlighter with the first one.

How to differentiating myValue:

varOne = 100
varTwo = 50
myValue = varOne-varTwo

from codes-deprecated, or ns-alt, for example.

One can create a variable with a minus in it by using
backquotes:


`a-b` - 3
a - 10
b - 4
`a-b` + a-b

[1] 9  # 3 + 10 - 4

 
 
 One other point.  The help system, i.e. ?, accommodates minus
 signs like this where the last two statements below give
 the same result:
 
 library(survival)
 ?survival-internal
 internal?survival
 
 Another example is,
 
 library(dyn)
 package?dyn

Indeed, the above are very good examples of how to contour the original problem 
Gabor!
But the central problem is to make Good R highlighter, as I'm trying to do
for the programs Tinn-R and jEdit with this functions.

Is not possible differentiate the signal minus as a simple character 
(word-word) from
the mathematical operator minus (object-object).

I think that is more easy to remove all this functions/reserved words.

TIA,
-- 
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
tel: 73-3634.2779

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] An small suggestion for the R team

2005-08-04 Thread Jose Claudio Faria
Hi THomas,

This is very a good information: thanks very much!
So, I will just now to remove all from the highlighters.
BTW, do you know if the signal minus is used in some function or reserved word?

TIA,

Jose Claudio Faria

Thomas Lumley wrote:
 On Thu, 4 Aug 2005, Jose Claudio Faria wrote:
 
 I would like to suggest that all R functions/etc like:
  codes-deprecated
  grid-internal
  ns-alt
  ns-dblcolon
  ns-hooks
  ns-internals
  ns-lowlev
  ns-reflect.Rd
  tools-internal
  ts-defunct
  utils-deprecated
  utils-internal
  ... and another

 
 These are not R functions or reserved works. They are just the names of 
 help pages.  They should not occur in code (except in double-quoted 
 strings).
 
 -thomas

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] An small suggestion for the R team

2005-08-04 Thread Jose Claudio Faria
Ok Duncan,

Thank you very much for the tip!
I will to remove all these identifiers.

Regards,
--
Jose Claudio Faria

Duncan Murdoch wrote:
 Jose Claudio Faria wrote:
 
 Hi all,

 I would like to suggest that all R functions/etc like:
codes-deprecated
grid-internal
ns-alt
ns-dblcolon
ns-hooks
ns-internals
ns-lowlev
ns-reflect.Rd
tools-internal
ts-defunct
utils-deprecated
utils-internal
... and another

 i.e, function/word separate for '-'

 were all substituted by
codes_deprecated
grid_internal
ns_alt
ns_dblcolon
ns_hooks
ns_internals
ns_lowlev
ns_reflect.Rd
tools_internal
ts_defunct
utils_deprecated
utils_internal
... and another

 i.e, by '_'

 Because it is impossible to make a good highlighter with the first one.
 
 
 Your suggested names are all valid identifiers.  The - gives strings 
 that are not.  That's intentional; the help system relies on it.  You 
 find utils-deprecated using deprecated?utils.
 
 If your highlighter has problems with the -, then don't bother 
 highlighting.  Those names aren't very common.  Or...
 

 How to differentiating myValue:

 varOne = 100
 varTwo = 50
 myValue = varOne-varTwo

 from codes-deprecated, or ns-alt, for example.
 
 
 ...you could do it by context.  codes-deprecated will only show up in 
 *.Rd files as a possible search term:  an alias, or a name.
 
 Duncan Murdoch


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Help with Mahalanobis

2005-07-10 Thread Jose Claudio Faria
Well, as I did not get a satisfactory reply to the original question I tried to 
make a basic function that, I find, solve the question.

I think it is not the better function, but it is working.

So, perhaps it can be useful to other people.

#
# Calculate the matrix of Mahalanobis Distances between groups
# from data.frames
#
# by: José Cláudio Faria
# date: 10/7/05 13:23:48
#

D2Mah = function(y, x) {

   stopifnot(is.data.frame(y), !missing(x))
   stopifnot(dim(y)[1] != dim(x)[1])
   y= as.matrix(y)
   x= as.factor(x)
   man  = manova(y ~ x)
   E= summary(man)$SS[2] #Matrix E
   S= as.matrix(E$Residuals)/man$df.residual
   InvS = solve(S)
   mds  = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y))

   colnames(mds) = names(y)
   Objects   = levels(x)
   rownames(mds) = Objects

   library(gtools)
   nObjects = nrow(mds)
   comb = combinations(nObjects, 2)

   tmpD2 = numeric()
   for (i in 1:dim(comb)[1]){
 a = comb[i,1]
 b = comb[i,2]
 tmpD2[i] = (mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,])
   }

   # Thanks Gabor for the below
   tmpMah = matrix(0, nObjects, nObjects, dimnames=list(Objects, Objects))
   tmpMah[lower.tri(tmpMah)] = tmpD2
   D2 = tmpMah + t(tmpMah)
   return(D2)
}

#
# To try
#
D2M = D2Mah(iris[,1:4], iris[,5])
print(D2M)

Thanks all for the complementary aid (specially to Gabor).

Regards,
-- 
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
tel: 73-3634.2779

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Help with Mahalanobis

2005-07-10 Thread Jose Claudio Faria
Indeed, it is very nice Gabor (as always)!

So, a doubt: how to preserve the 'rowname' and 'colname' of D2, like in the 
first function? I think it is useful to posterior analyzes (as cluster, for 
example).

Regards,

# A small correction (reference to gtools was eliminated)
D2Mah2 = function(y, x) {
   stopifnot(is.data.frame(y), !missing(x))
   stopifnot(dim(y)[1] != dim(x)[1])
   y= as.matrix(y)
   x= as.factor(x)
   man  = manova(y ~ x)
   E= summary(man)$SS[2] #Matrix E
   S= as.matrix(E$Residuals)/man$df.residual
   InvS = solve(S)
   mds  = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y))
   nObjects = nrow(mds)
   f = function(a,b) mapply(function(a,b)
 (mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,]), a, b)
   D2 = outer(seq(nObjects), seq(nObjects), f)
}

#
# test
#
D2M2 = D2Mah2(iris[,1:4], iris[,5])
print(D2M2)

Gabor Grothendieck wrote:
 I think you could simplify this by replacing everything after the
 nObjects = nrow(mds) line with just these two statements.
 
   f - function(a,b) mapply(function(a,b)
 (mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,]), a,b)
 
   D2 - outer(seq(nObjects), seq(nObjects), f)
 
 This also eliminates dependence on gtools and the complexity
 of dealing with triangular matrices.
 
 Regards.
 
 Here it is in full:
 
 D2Mah2 = function(y, x) {
 
   stopifnot(is.data.frame(y), !missing(x))
   stopifnot(dim(y)[1] != dim(x)[1])
   y= as.matrix(y)
   x= as.factor(x)
   man  = manova(y ~ x)
   E= summary(man)$SS[2] #Matrix E
   S= as.matrix(E$Residuals)/man$df.residual
   InvS = solve(S)
   mds  = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y))
 
   library(gtools)
   nObjects = nrow(mds)
 
   ### changed part is next two statements
   f - function(a,b) mapply(function(a,b)
 (mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,]), a,b)
 
   D2 - outer(seq(nObjects), seq(nObjects), f)
 }
 
 #
 # test
 #
 D2M2 = D2Mah2(iris[,1:4], iris[,5])
 print(D2M2)
 
 
 
 
 On 7/10/05, Jose Claudio Faria [EMAIL PROTECTED] wrote:
 
Well, as I did not get a satisfactory reply to the original question I tried 
to
make a basic function that, I find, solve the question.

I think it is not the better function, but it is working.

So, perhaps it can be useful to other people.

#
# Calculate the matrix of Mahalanobis Distances between groups
# from data.frames
#
# by: José Cláudio Faria
# date: 10/7/05 13:23:48
#

D2Mah = function(y, x) {

  stopifnot(is.data.frame(y), !missing(x))
  stopifnot(dim(y)[1] != dim(x)[1])
  y= as.matrix(y)
  x= as.factor(x)
  man  = manova(y ~ x)
  E= summary(man)$SS[2] #Matrix E
  S= as.matrix(E$Residuals)/man$df.residual
  InvS = solve(S)
  mds  = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y))

  colnames(mds) = names(y)
  Objects   = levels(x)
  rownames(mds) = Objects

  library(gtools)
  nObjects = nrow(mds)
  comb = combinations(nObjects, 2)

  tmpD2 = numeric()
  for (i in 1:dim(comb)[1]){
a = comb[i,1]
b = comb[i,2]
tmpD2[i] = (mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,])
  }

  # Thanks Gabor for the below
  tmpMah = matrix(0, nObjects, nObjects, dimnames=list(Objects, Objects))
  tmpMah[lower.tri(tmpMah)] = tmpD2
  D2 = tmpMah + t(tmpMah)
  return(D2)
}

#
# To try
#
D2M = D2Mah(iris[,1:4], iris[,5])
print(D2M)

Thanks all for the complementary aid (specially to Gabor).

Regards,
--
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]
tel: 73-3634.2779

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

 
 
 Esta mensagem foi verificada pelo E-mail Protegido Terra.
 Scan engine: McAfee VirusScan / Atualizado em 08/07/2005 / Versão: 4.4.00 - 
 Dat 4531
 Proteja o seu e-mail Terra: http://mail.terra.com.br/
 
 
 


-- 
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
tel: 73-3634.2779

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Help with Mahalanobis

2005-07-10 Thread Jose Claudio Faria
I think we now have a very good R function here.
Thanks for all Gabor!

JCFaria

Gabor Grothendieck wrote:
 This one adds the labels:
 
 
 D2Mah4 = function(y, x) {
 
  stopifnot(is.data.frame(y), !missing(x))
  stopifnot(dim(y)[1] != dim(x)[1])
  y= as.matrix(y)
  x= as.factor(x)
  man  = manova(y ~ x)
  E= summary(man)$SS[2] #Matrix E
  S= as.matrix(E$Residuals)/man$df.residual
  InvS = solve(S)
  mds  = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y))
 
  f - function(a,b) mapply(function(a,b)
mahalanobis(mds[a,],mds[b,],InvS,TRUE), a, b)
  seq. - seq(length = nrow(mds))
  names(seq.) - levels(x)
  D2 - outer(seq., seq., f)
 }
 
 #
 # test
 #
 D2M4 = D2Mah4(iris[,1:4], iris[,5])
 print(D2M4)
 
 
 
 On 7/10/05, Jose Claudio Faria [EMAIL PROTECTED] wrote:
 
Indeed, it is very nice Gabor (as always)!

So, a doubt: how to preserve the 'rowname' and 'colname' of D2, like in the
first function? I think it is useful to posterior analyzes (as cluster, for
example).

Regards,

# A small correction (reference to gtools was eliminated)
D2Mah2 = function(y, x) {
  stopifnot(is.data.frame(y), !missing(x))
  stopifnot(dim(y)[1] != dim(x)[1])
  y= as.matrix(y)
  x= as.factor(x)
  man  = manova(y ~ x)
  E= summary(man)$SS[2] #Matrix E
  S= as.matrix(E$Residuals)/man$df.residual
  InvS = solve(S)
  mds  = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y))
  nObjects = nrow(mds)
  f = function(a,b) mapply(function(a,b)
(mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,]), a, b)
  D2 = outer(seq(nObjects), seq(nObjects), f)
}

#
# test
#
D2M2 = D2Mah2(iris[,1:4], iris[,5])
print(D2M2)

Gabor Grothendieck wrote:

I think you could simplify this by replacing everything after the
nObjects = nrow(mds) line with just these two statements.

  f - function(a,b) mapply(function(a,b)
(mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,]), a,b)

  D2 - outer(seq(nObjects), seq(nObjects), f)

This also eliminates dependence on gtools and the complexity
of dealing with triangular matrices.

Regards.

Here it is in full:

D2Mah2 = function(y, x) {

  stopifnot(is.data.frame(y), !missing(x))
  stopifnot(dim(y)[1] != dim(x)[1])
  y= as.matrix(y)
  x= as.factor(x)
  man  = manova(y ~ x)
  E= summary(man)$SS[2] #Matrix E
  S= as.matrix(E$Residuals)/man$df.residual
  InvS = solve(S)
  mds  = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y))

  library(gtools)
  nObjects = nrow(mds)

  ### changed part is next two statements
  f - function(a,b) mapply(function(a,b)
(mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,]), a,b)

  D2 - outer(seq(nObjects), seq(nObjects), f)
}

#
# test
#
D2M2 = D2Mah2(iris[,1:4], iris[,5])
print(D2M2)




On 7/10/05, Jose Claudio Faria [EMAIL PROTECTED] wrote:


Well, as I did not get a satisfactory reply to the original question I 
tried to
make a basic function that, I find, solve the question.

I think it is not the better function, but it is working.

So, perhaps it can be useful to other people.

#
# Calculate the matrix of Mahalanobis Distances between groups
# from data.frames
#
# by: José Cláudio Faria
# date: 10/7/05 13:23:48
#

D2Mah = function(y, x) {

 stopifnot(is.data.frame(y), !missing(x))
 stopifnot(dim(y)[1] != dim(x)[1])
 y= as.matrix(y)
 x= as.factor(x)
 man  = manova(y ~ x)
 E= summary(man)$SS[2] #Matrix E
 S= as.matrix(E$Residuals)/man$df.residual
 InvS = solve(S)
 mds  = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y))

 colnames(mds) = names(y)
 Objects   = levels(x)
 rownames(mds) = Objects

 library(gtools)
 nObjects = nrow(mds)
 comb = combinations(nObjects, 2)

 tmpD2 = numeric()
 for (i in 1:dim(comb)[1]){
   a = comb[i,1]
   b = comb[i,2]
   tmpD2[i] = (mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,])
 }

 # Thanks Gabor for the below
 tmpMah = matrix(0, nObjects, nObjects, dimnames=list(Objects, Objects))
 tmpMah[lower.tri(tmpMah)] = tmpD2
 D2 = tmpMah + t(tmpMah)
 return(D2)
}

#
# To try
#
D2M = D2Mah(iris[,1:4], iris[,5])
print(D2M)

Thanks all for the complementary aid (specially to Gabor).

Regards,
--
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]
tel: 73-3634.2779

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html



Esta mensagem foi verificada pelo E-mail Protegido Terra.
Scan engine: McAfee VirusScan / Atualizado em 08/07/2005 / Versão: 4.4.00 - 
Dat 4531
Proteja o seu e-mail Terra: http://mail.terra.com.br/





--
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]
tel: 73-3634.2779

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting

Re: [R] Help: Mahalanobis distances between 'Species' from iris

2005-07-10 Thread Jose Claudio Faria
Dear Mulholland,

I'm sorry, I was not clearly and objective sufficiently.
Below you can see what I'm was trying to do:

# D2Mah by JCFaria and Gabor Grothiendieck: 10/7/05 20:46:41
D2Mah = function(y, x) {

  stopifnot(is.data.frame(y), !missing(x))
  stopifnot(dim(y)[1] != dim(x)[1])
  y= as.matrix(y)
  x= as.factor(x)
  man  = manova(y ~ x)
  E= summary(man)$SS[2] #Matrix E
  S= as.matrix(E$Residuals)/man$df.residual
  InvS = solve(S)
  mds  = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y))

  f = function(a,b) mapply(function(a,b)
mahalanobis(mds[a,], mds[b,], InvS, TRUE), a, b)
  seq. = seq(length = nrow(mds))
  names(seq.) = levels(x)
  D2 = outer(seq., seq., f)
}

#
# test
#
D2M = D2Mah(iris[,1:4], iris[,5])
print(D2M)
dend = hclust(as.dist(sqrt(D2M)), method='complete')
plot(dend)

So, Thanks for the reply.
Best,

JCFaria

Mulholland, Tom wrote:
 Why don't you use mahalanobis in the stats package. The function Returns the 
 Mahalanobis distance of all rows in 'x' and the vector mu='center' with 
 respect to Sigma='cov'. This is (for vector 'x') defined as
 
  D^2 = (x - mu)' Sigma^{-1} (x - mu)
 
 I don't have any idea what this does but it appears to be talking about the 
 same topic. If it is not suitable package fpc has related functions and 
 adehabitat has a function for Habitat Suitability Mapping with Mahalanobis 
 Distances
 
 Tom
 
 
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of Jose 
Claudio Faria
Sent: Thursday, 7 July 2005 5:29 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Help: Mahalanobis distances between 'Species' from iris


Dear R list,

I'm trying to calculate Mahalanobis distances for 'Species' 
of 'iris' data
as obtained below:

Squared Distance to Species From Species:

   Setosa Versicolor Virginica
Setosa   0   89.86419 179.38471
Versicolor  89.86419  0  17.20107
Virginica  179.38471   17.20107 0

This distances above were obtained with proc 'CANDISC' of SAS, please,
see Output 21.1.2: Iris Data: Squared Mahalanobis Distances from
http://www.id.unizh.ch/software/unix/statmath/sas/sasdoc/stat/
chap21/sect19.htm

 From this distance my intention is to make a cluster 
analysis as below, using
the package 'mclust':

#
# --- Begin R script ---
#
# For units compatibility of 'iris' from R dataset and 'iris' 
data used in
# the SAS example:
Measures = iris[,1:4]*10
Species  = iris[,5]
irisSAS  = data.frame(Measures, Species)

n   = 3
Mah = c(0,
  89.86419,0,
 179.38471, 17.20107, 0)

# My Question is: how to obtain 'Mah' with R from 'irisSAS' data?

D = matrix(0, n, n)

nam = c('Set', 'Ver', 'Vir')
rownames(D) = nam
colnames(D) = nam

k = 0
for (i in 1:n) {
for (j in 1:i) {
   k  = k+1
   D[i,j] = Mah[k]
   D[j,i] = Mah[k]
}
}

D=sqrt(D) #D2 - D

library(mclust)
dendroS = hclust(as.dist(D), method='single')
dendroC = hclust(as.dist(D), method='complete')

win.graph(w = 3.5, h = 6)
split.screen(c(2, 1))
screen(1)
plot(dendroS, main='Single', sub='', xlab='', ylab='', col='blue')

screen(2)
plot(dendroC, main='Complete', sub='', xlab='', col='red')
#
# --- End R script ---
#

I always need of this type of analysis and I'm not founding 
how to make it in 
the CRAN documentation (Archives, packages: mclust, cluster, 
fpc and mva).

Regards,
-- 
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
  [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Help with Mahalanobis

2005-07-09 Thread Jose Claudio Faria
Christian Hennig wrote:

 Dear Jose,
 
 normal mixture clustering (mclust) operates on points times variables data
 and not on a distance matrix. Therefore
 it doesn't make sense to compute Mahalanobis distances before using
 mclust.
 Furthermore, cluster analysis based on distance matrices (hclust or pam,
 say) operates on a point by point distance matrix (be it Mahalanobis or
 something else). You show a group by group matrix below, for which I don't
 see any purpose in cluster analysis.
 Have you looked at function mahalanobis?
 
 Christian

Dear Christian,

First of all, thanks for the reply!

So, multivariate analysis is not my field of domain, I'm studying this because
it is necessary in my works.

I'm using 'iris' only as an example of my real problem, because I normally work
with many response variables (5 or more), with replicates (10 or more) of many
groups (20 or more). In these cases, I think, the final dendogram using 'mclust'
package is not very good/clear.

I learned, in these cases, that the generalized distance of Mahalanobis,
obtained as in the prior example (see script), is one of the best choice to
study the similarity between the groups. Do you agree?

If yes, I need to cluster the objects from this matrix of distances between the
groups. My option by 'mclust' package was because I'm studying also it, no more,
and I think that, for the purpose, it works nice.

Could you help me about another (and simple) choice of analyze?

JCFaria

 On Fri, 8 Jul 2005, Jose Claudio Faria wrote:
 
 
Dear R list,

I'm trying to calculate Mahalanobis distances for 'Species' of 'iris' data
as obtained below:

Squared Distance to Species From Species:

   Setosa Versicolor Virginica
Setosa   0   89.86419 179.38471
Versicolor  89.86419  0  17.20107
Virginica  179.38471   17.20107 0

These distances were obtained with proc 'CANDISC' of SAS, please,
see Output 21.1.2: Iris Data: Squared Mahalanobis Distances from
http://www.id.unizh.ch/software/unix/statmath/sas/sasdoc/stat/chap21/sect19.htm

 From these distances my intention is to make a cluster analysis as below, 
 using
the package 'mclust':

In prior mail, my basic question was: how to obtain this matrix with R
from 'iris' data?

Well, I think that the basic soluction to calculate this distances is:

#
# --- Begin R script 1 ---
#
x   = as.matrix(iris[,1:4])
tra = iris[,5]

man = manova(x ~ tra)

# Mahalanobis
E= summary(man)$SS[2] #Matrix E
S= as.matrix(E$Residuals)/man$df.residual
InvS = solve(S)
ms = matrix(unlist(by(x, tra, mean)), byrow=T, ncol=ncol(x))
colnames(ms) = names(iris[1:4])
rownames(ms) = c('Set', 'Ver', 'Vir')
D2.12 = (ms[1,] - ms[2,])%*%InvS%*%(ms[1,] - ms[2,])
print(D2.12)
D2.13 = (ms[1,] - ms[3,])%*%InvS%*%(ms[1,] - ms[3,])
print(D2.13)
D2.23 = (ms[2,] - ms[3,])%*%InvS%*%(ms[2,] - ms[3,])
print(D2.23)
#
# --- End R script 1 ---
#

Well, I would like to generalize a soluction to obtain
the matrices like 'Mah' (below) or a complete matrix like in the
Output 21.1.2. Somebody could help me?

#
# --- Begin R script 2 ---
#

Mah = c(0,
  89.86419,0,
 179.38471, 17.20107, 0)

n = 3
D = matrix(0, n, n)

nam = c('Set', 'Ver', 'Vir')
rownames(D) = nam
colnames(D) = nam

k = 0
for (i in 1:n) {
for (j in 1:i) {
   k  = k+1
   D[i,j] = Mah[k]
   D[j,i] = Mah[k]
}
}

D=sqrt(D) #D2 - D

library(mclust)
dendroS = hclust(as.dist(D), method='single')
dendroC = hclust(as.dist(D), method='complete')

win.graph(w = 3.5, h = 6)
split.screen(c(2, 1))
screen(1)
plot(dendroS, main='Single', sub='', xlab='', ylab='', col='blue')

screen(2)
plot(dendroC, main='Complete', sub='', xlab='', col='red')
#
# --- End R script 2 ---
#

I always need of this type of analysis and I'm not founding how to make it in
the CRAN documentation (Archives, packages: mclust, cluster, fpc and mva).

Regards,
--
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
  [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

 
 
 *** NEW ADDRESS! ***
 Christian Hennig
 University College London, Department of Statistical Science
 Gower St., London WC1E 6BT, phone +44 207 679 1698
 [EMAIL PROTECTED], www.homepages.ucl.ac.uk/~ucakche
 
 Esta mensagem foi verificada pelo E-mail Protegido Terra.
 Scan engine: McAfee VirusScan / Atualizado em 08/07/2005 / Versão: 4.4.00 - 
 Dat 4531
 Proteja o seu e-mail Terra: http://mail.terra.com.br/
 
 


-- 
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
tel: 73-3634.2779

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http

[R] Help to make a specific matrix

2005-07-09 Thread Jose Claudio Faria
Dear R users,

The solution is probably simple but I need someone to point me to it.
How can I to generate a matrix from a numeric sequence of 1:10 like 'A' or 'B' 
below:

A)
||
|  1  2  3  4  5 |
||
| 1 |  0 |
| 2 |  1  0  |
| 3 |  2  5  0   |
| 4 |  3  6  8  0|
| 5 |  4  7  9 10  0 |
||

B)
||
|  1  2  3  4  5 |
||
| 1 |  0  1  2  3  4 |
| 2 |  1  0  5  6  7 |
| 3 |  2  5  0  8  9 |
| 4 |  3  6  8  0 10 |
| 5 |  4  7  9 10  0 |
||

This question is related with the possible combinations of five objects two the 
two, i.e, C(5,2).

Any help would be greatly appreciated.

Regards,
-- 
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
tel: 73-3634.2779

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Help to make a specific matrix

2005-07-09 Thread Jose Claudio Faria
Gabor Grothendieck wrote:
 On 7/9/05, Jose Claudio Faria [EMAIL PROTECTED] wrote:
 
Dear R users,

The solution is probably simple but I need someone to point me to it.
How can I to generate a matrix from a numeric sequence of 1:10 like 'A' or 'B'
below:

A)
||
|  1  2  3  4  5 |
||
| 1 |  0 |
| 2 |  1  0  |
| 3 |  2  5  0   |
| 4 |  3  6  8  0|
| 5 |  4  7  9 10  0 |
||

B)
||
|  1  2  3  4  5 |
||
| 1 |  0  1  2  3  4 |
| 2 |  1  0  5  6  7 |
| 3 |  2  5  0  8  9 |
| 4 |  3  6  8  0 10 |
| 5 |  4  7  9 10  0 |
||

 
 
 Try this and see ?lower.tri
 
 A - matrix(0,5,5)
 A[lower.tri(A)] - 1:10
 B - A + t(A)

Dear Gabor,

Thank you, very much: this soluction is nice!

Best,
-- 
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
tel: 73-3634.2779

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Help with Mahalanobis

2005-07-08 Thread Jose Claudio Faria
Dear R list,

I'm trying to calculate Mahalanobis distances for 'Species' of 'iris' data
as obtained below:

Squared Distance to Species From Species:

   Setosa Versicolor Virginica
Setosa 0   89.86419 179.38471
Versicolor  89.86419  0  17.20107
Virginica  179.38471   17.20107 0

These distances were obtained with proc 'CANDISC' of SAS, please,
see Output 21.1.2: Iris Data: Squared Mahalanobis Distances from
http://www.id.unizh.ch/software/unix/statmath/sas/sasdoc/stat/chap21/sect19.htm

 From these distances my intention is to make a cluster analysis as below, using
the package 'mclust':

In prior mail, my basic question was: how to obtain this matrix with R
from 'iris' data?

Well, I think that the basic soluction to calculate this distances is:

#
# --- Begin R script 1 ---
#
x   = as.matrix(iris[,1:4])
tra = iris[,5]

man = manova(x ~ tra)

# Mahalanobis
E= summary(man)$SS[2] #Matrix E
S= as.matrix(E$Residuals)/man$df.residual
InvS = solve(S)
ms = matrix(unlist(by(x, tra, mean)), byrow=T, ncol=ncol(x))
colnames(ms) = names(iris[1:4])
rownames(ms) = c('Set', 'Ver', 'Vir')
D2.12 = (ms[1,] - ms[2,])%*%InvS%*%(ms[1,] - ms[2,])
print(D2.12)
D2.13 = (ms[1,] - ms[3,])%*%InvS%*%(ms[1,] - ms[3,])
print(D2.13)
D2.23 = (ms[2,] - ms[3,])%*%InvS%*%(ms[2,] - ms[3,])
print(D2.23)
#
# --- End R script 1 ---
#

Well, I would like to generalize a soluction to obtain
the matrices like 'Mah' (below) or a complete matrix like in the
Output 21.1.2. Somebody could help me?

#
# --- Begin R script 2 ---
#

Mah = c(0,
  89.86419,0,
 179.38471, 17.20107, 0)

n = 3
D = matrix(0, n, n)

nam = c('Set', 'Ver', 'Vir')
rownames(D) = nam
colnames(D) = nam

k = 0
for (i in 1:n) {
for (j in 1:i) {
   k  = k+1
   D[i,j] = Mah[k]
   D[j,i] = Mah[k]
}
}

D=sqrt(D) #D2 - D

library(mclust)
dendroS = hclust(as.dist(D), method='single')
dendroC = hclust(as.dist(D), method='complete')

win.graph(w = 3.5, h = 6)
split.screen(c(2, 1))
screen(1)
plot(dendroS, main='Single', sub='', xlab='', ylab='', col='blue')

screen(2)
plot(dendroC, main='Complete', sub='', xlab='', col='red')
#
# --- End R script 2 ---
#

I always need of this type of analysis and I'm not founding how to make it in
the CRAN documentation (Archives, packages: mclust, cluster, fpc and mva).

Regards,
--
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
  [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Tables: Invitation to make a collective package

2005-07-07 Thread Jose Claudio Faria
, breaks, right)
 tmpList - c(tmpList, list(tbl))
   }
 }
 valCol - logCol[logCol]
 names(tmpList) - names(valCol)
 return(tmpList)
   }

   # User defines one factor
   else {
 namesdf - names(df)
 pos - which(namesdf == by)
 stopifnot(is.factor((df[[pos]])))
 numF- table(df[[pos]])
 for(i in 1:length(numF)) {
   tmpdf  - subset(df, df[[pos]] == names(numF[i]))
   logCol - sapply(tmpdf, is.numeric)
   for (j in 1:ncol(tmpdf)) {
 if (logCol[j]) {
   x- as.matrix(tmpdf[ ,j])
   tbl  - tb.make.table.II(x, k, breaks, right)
   newFY- list(tbl)
   nameF- names(numF[i])
   nameY- names(logCol[j])
   nameFY   - paste(nameF,'.', nameY, sep=)
   names(newFY) - sub(' +$', '', nameFY)
   tmpList  - c(tmpList, newFY)
 }
   }
 }
   }
   return(tmpList)
}


# Tables package   #
# to try   #


# 1.Tables
# 1.1. Tables from vectors

# Making a vector
set.seed(1)
x=rnorm(100, 5, 1)
#x=as.factor(rep(1:10, 10))  # to try

tbl - tb.table(x)
print(tbl); cat('\n')

# Equal to above
tbl - tb.table(x, breaks='Sturges')
print(tbl); cat('\n')

tbl - tb.table(x, breaks='Scott')
print(tbl); cat('\n')

tbl - tb.table(x, breaks='FD')
print(tbl); cat('\n')

tbl - tb.table(x, breaks='F', right=T)
print(tbl); cat('\n')

tbl - tb.table(x, k=4)
print(tbl); cat('\n')

tbl - tb.table(x, k=20)
print(tbl); cat('\n')

# Partial
tbl - tb.table(x, start=4, end=6)
print(tbl); cat('\n')

# Partial
tbl - tb.table(x, start=4.5, end=5.5)
print(tbl); cat('\n')

# Nonsense
tbl - tb.table(x, start=0, end=10, h=.5)
print(tbl); cat('\n')

# First and last class forced (fi=0)
tbl - tb.table(x, start=1, end=9, h=1)
print(tbl); cat('\n')

tbl - tb.table(x, start=1, end=10, h=2)
print(tbl); cat('\n')


# 1.2. Tables from data.frame

# 1.2.1. Making a data.frame
mdf=data.frame(X1=rep(LETTERS[1:4], 25),
X2=as.factor(rep(1:10, 10)),
Y1=c(NA, NA, rnorm(96, 10, 1), NA, NA),
Y2=rnorm(100, 58, 4),
Y3=c(NA, NA, rnorm(98, -20, 2)))

tbl - tb.table(mdf)
print(tbl)

# Equal to above
tbl - tb.table(mdf, breaks='Sturges')
print(tbl)

tbl - tb.table(mdf, breaks='Scott')
print(tbl)

tbl - tb.table(mdf, breaks='FD')
print(tbl)

tbl - tb.table(mdf, k=4)
print(tbl)

tbl - tb.table(mdf, k=10)
print(tbl)

levels(mdf$X1)
tbl=tb.table(mdf, k=5, by='X1')
length(tbl)
names(tbl)
print(tbl)

tbl=tb.table(mdf, breaks='FD', by='X1')
print(tbl)

# A 'big' result: X2 is a factor with 10 levels!
tbl=tb.table(mdf, breaks='FD', by='X2')
print(tbl)

# 1.2.2. Using 'iris'
tbl=tb.table(iris, k=5)
print(tbl)

levels(iris$Species)
tbl=tb.table(iris, k=5, by='Species')
length(tbl)
names(tbl)
print(tbl)

tbl=tb.table(iris, k=5, by='Species', right=T)
print(tbl)

tbl=tb.table(iris, breaks='FD', by='Species')
print(tbl)

library(MASS)
levels(Cars93$Origin)
tbl=tb.table(Cars93, k=5, by='Origin')
names(tbl)
print(tbl)

tbl=tb.table(Cars93, breaks='FD', by='Origin')
print(tbl)

I find that this package would be very useful and would like to hear the 
opinion 
of the interested parties in participating.

Best regards,
-- 
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
  [EMAIL PROTECTED]
tel: 73-3634.2779

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Help: Mahalanobis distances between 'Species' from iris

2005-07-06 Thread Jose Claudio Faria
Dear R list,

I'm trying to calculate Mahalanobis distances for 'Species' of 'iris' data
as obtained below:

Squared Distance to Species From Species:

   Setosa Versicolor Virginica
Setosa 0   89.86419 179.38471
Versicolor  89.86419  0  17.20107
Virginica  179.38471   17.20107 0

This distances above were obtained with proc 'CANDISC' of SAS, please,
see Output 21.1.2: Iris Data: Squared Mahalanobis Distances from
http://www.id.unizh.ch/software/unix/statmath/sas/sasdoc/stat/chap21/sect19.htm

 From this distance my intention is to make a cluster analysis as below, using
the package 'mclust':

#
# --- Begin R script ---
#
# For units compatibility of 'iris' from R dataset and 'iris' data used in
# the SAS example:
Measures = iris[,1:4]*10
Species  = iris[,5]
irisSAS  = data.frame(Measures, Species)

n   = 3
Mah = c(0,
  89.86419,0,
 179.38471, 17.20107, 0)

# My Question is: how to obtain 'Mah' with R from 'irisSAS' data?

D = matrix(0, n, n)

nam = c('Set', 'Ver', 'Vir')
rownames(D) = nam
colnames(D) = nam

k = 0
for (i in 1:n) {
for (j in 1:i) {
   k  = k+1
   D[i,j] = Mah[k]
   D[j,i] = Mah[k]
}
}

D=sqrt(D) #D2 - D

library(mclust)
dendroS = hclust(as.dist(D), method='single')
dendroC = hclust(as.dist(D), method='complete')

win.graph(w = 3.5, h = 6)
split.screen(c(2, 1))
screen(1)
plot(dendroS, main='Single', sub='', xlab='', ylab='', col='blue')

screen(2)
plot(dendroC, main='Complete', sub='', xlab='', col='red')
#
# --- End R script ---
#

I always need of this type of analysis and I'm not founding how to make it in 
the CRAN documentation (Archives, packages: mclust, cluster, fpc and mva).

Regards,
-- 
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
  [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Cluster of iris data set from Mahalanobis distances

2005-07-02 Thread Jose Claudio Faria
Dear R list,

My question:

I'm trying to calculate Mahalanobis distances for 'Species' from the iris data 
set as below:

cat('\n')
cat('Cluster analyse of iris data\n')
cat('from Mahalanobis distance obtained of SAS\n')
cat('\n')

n   = 3
dat = c(0,
   89.86419,   0,
  179.38471,17.20107,   0)

D = matrix(0, n, n)

nam = c('Set', 'Ver', 'Vir')
rownames(D) = nam
colnames(D) = nam

k = 0
for (i in 1:n) {
   for (j in 1:i) {
  k  = k+1
  D[i,j] = dat[k]
  D[j,i] = dat[k]
   }
}

D=sqrt(D) #D2  - D

dendroS = hclust(as.dist(D), method='single')
dendroC = hclust(as.dist(D), method='complete')

win.graph(w = 3.5, h = 6)
split.screen(c(2, 1))
screen(1)
plot(dendroS, main='Single', sub='', xlab='', ylab='', col='blue')

screen(2)
plot(dendroC, main='Complete', sub='', xlab='', col='red')


I'm not fouding how to make it in the CRAN documentation (Archives, packages: 
mclust, cluster, fpc and mva).

Could help me?

Regards,
-- 
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
  [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Re: R GUI for Linux?

2005-06-01 Thread Jose Claudio Faria

Dears,

I know a little about Object Pascal language and I've been working (in the last 
two years) with the Tinn-R development (www.sciview.org/Tinn-R).


This work started adapting Tinn (a good frame, but with limitations) as an R 
script editor.


Tinn is a small ASCII file editor primarily intended as a better replacement
of the default Notepad.exe under Windows. Tinn is the recursive acronym
'Tinn is not Notepad'. Tinn-R is an extension of Tinn that provides additional
tools to control R (Rgui in MDI or SDI mode, see http://cran.r-project.org,
SciViews R console, see http://www.sciviews.org/SciViews-R) or S-Plus. As such, 
Tinn-R is a feature-rich replacement of the basic script editor provided with 
Rgui. It provides advanced syntax-highlighting, submission of code in whole, or 
line-by-line, and many other useful tools to ease writing and debugging of R 
code. Both Tinn and Tinn-R are distributed under the GPL 2 or above license.


So, I think (but I'm very suspicious), it's a nice R script editor running only 
under Windows.


Otherwise, I don't know nothing about C/C++ language.
My question is, why we not provide (starting from an open source editor like 
Bluefish, Kate, or another good editor under Linux) the resources of Tinn-R. I 
think is not very hard (or impossible) translate the Tinn-R functions/procedures 
from Object Pascal to C/C++.


I think (as coordinator of Tinn-R team) that we can help with this work. All we 
need is start it and that one person (C/C++ programmer) coordinate the works.


Emacs + ESS, in my opinion, is not adequate for beginning.

Best regards,
--
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]
tel: 73-3634.2779

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Contingency tables from data.frames

2005-05-27 Thread Jose Claudio Faria

The final version with the help of Gabor Grotendieck (thanks Gabor, very much!)

###
#   EasieR - Package  #
###

# Common function
er.make.table - function(x,
  start,
  end,
  h,
  right) {
  # Absolut frequency
  f - table(cut(x, br=seq(start, end, h), right=right))

  # Relative frequency
  fr - f/length(x)

  # Relative frequency, %
  frP - 100*(f/length(x))

  # Cumulative frequency
  fac - cumsum(f)

  # Cumulative frequency, %
  facP - 100*(cumsum(f/length(x)))

  fi   - round(f, 2)
  fr   - round(as.numeric(fr), 2)
  frP  - round(as.numeric(frP), 2)
  fac  - round(as.numeric(fac), 2)
  facP - round(as.numeric(facP),2)

  # Make final table
  res - data.frame(fi, fr, frP, fac, facP)
  names(res) - c('Class limits', 'fi', 'fr', 'fr(%)', 'fac', 'fac(%)')
  return(res)

}

#With Gabor Grotendieck suggestions (thanks Gabor, very much!)
er.table - function(x, ...) UseMethod(er.table)

er.table.default - function(x,
 k,
 start,
 end,
 h,
 breaks=c('Sturges', 'Scott', 'FD'),
 right=FALSE) {

  #User define nothing or not 'x' isn't numeric - stop
  stopifnot(is.numeric(x))

  #User define only 'x'
  #(x, {k, start, end, h}, [breaks, right])
  if (missing(k)  missing(start)  missing(end)  missing(h) ){

x - na.omit(x)

brk - match.arg(breaks)
switch(brk,
   Sturges = k - nclass.Sturges(x),
   Scott   = k - nclass.scott(x),
   FD  = k - nclass.FD(x))

tmp   - range(x)
start - tmp[1] - abs(tmp[2])/100
end   - tmp[2] + abs(tmp[2])/100
R - end-start
h - R/k

  }

  #User define 'x' and 'k'
  #(x, k, {start, end, h}, [breaks, right])
  else if (missing(start)  missing(end)  missing(h)) {

stopifnot(length(k) = 1)

x - na.omit(x)

tmp   - range(x)
start - tmp[1] - abs(tmp[2])/100
end   - tmp[2] + abs(tmp[2])/100
R - end-start
h - R/abs(k)

  }

  #User define 'x', 'start' and 'end'
  #(x, {k,} start, end, {h,} [breaks, right])
  else if (missing(k)  missing(h)) {

stopifnot(length(start) = 1, length(end) =1)

x - na.omit(x)

tmp - range(x)
R   - end-start
k   - sqrt(abs(R))
if (k  5)  k - 5 #min value of k
h   - R/k

  }

  #User define 'x', 'start', 'end' and 'h'
  #(x, {k,} start, end, h, [breaks, right])
  else if (missing(k)) {

stopifnot(length(start) = 1, length(end) = 1, length(h) = 1)
x - na.omit(x)

  }

  else stop('Error, please, see the function sintax!')

  tbl - er.make.table(x, start, end, h, right)
  return(tbl)

}

er.table.data.frame - function(df,
k,
breaks=c('Sturges', 'Scott', 'FD'),
right=FALSE) {

  stopifnot(is.data.frame(df))

  tmpList - list()
  logCol  - sapply(df, is.numeric)

  for (i in 1:ncol(df)) {

if (logCol[i]) {

  x - as.matrix(df[ ,i])
  x - na.omit(x)

  #User define only x and/or 'breaks'
  #(x, {k,}[breaks, right])
  if (missing(k)) {

brk - match.arg(breaks)
switch(brk,
   Sturges = k - nclass.Sturges(x),
   Scott   = k - nclass.scott(x),
   FD  = k - nclass.FD(x))

tmp   - range(x)
start - tmp[1] - abs(tmp[2])/100
end   - tmp[2] + abs(tmp[2])/100
R - end-start
h - R/k

  }

  #User define 'x' and 'k'
  #(x, k,[breaks, right])
  else {

tmp   - range(x)
start - tmp[1] - abs(tmp[2])/100
end   - tmp[2] + abs(tmp[2])/100
R - end-start
h - R/abs(k)

  }

  tbl - er.make.table(x, start, end, h, right)
  tmpList - c(tmpList, list(tbl))

}

  }

  valCol - logCol[logCol]
  names(tmpList) - names(valCol)
  return(tmpList)

}

Best,
--
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]
tel: 73-3634.2779

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Contingency tables from data.frames

2005-05-25 Thread Jose Claudio Faria

Gabor Grothendieck wrote:

On 5/24/05, Jose Claudio Faria [EMAIL PROTECTED] wrote:


Dear list,

I'm trying to do a set of generic functions do make contingency tables from
data.frames. It is just running nice (I'm learning R), but I think it can be
better.

I would like to filter the data.frame, i.e, eliminate all not numeric variables.
And I don't know how to make it: please, help me.

Below one of the my functions ('er' is a mention to EasieR, because I'm trying
to do a package for myself and the my students):

#2. Tables from data.frames
#2.1---er.table.df.br (User define breaks and right)
er.table.df.br - function(df,
  breaks = c('Sturges', 'Scott', 'FD'),
  right = FALSE) {

 if (is.data.frame(df) != 'TRUE')
   stop('need data.frame data')

 dim_df - dim(df)

 tmpList - list()

 for (i in 1:dim_df[2]) {

   x - as.matrix(df[ ,i])
   x - na.omit(x)

   k - switch(breaks[1],
   'Sturges' = nclass.Sturges(x),
   'Scott'   = nclass.scott(x),
   'FD'  = nclass.FD(x),
   stop('breaks' must be 'Sturges', 'Scott' or 'FD'))

   tmp  - range(x)
   classIni - tmp[1] - tmp[2]/100
   classEnd - tmp[2] + tmp[2]/100
   R- classEnd-classIni
   h- R/k

   # Absolut frequency
   f - table(cut(x, br = seq(classIni, classEnd, h), right = right))

   # Relative frequency
   fr - f/length(x)

   # Relative frequency, %
   frP - 100*(f/length(x))

   # Cumulative frequency
   fac - cumsum(f)

   # Cumulative frequency, %
   facP - 100*(cumsum(f/length(x)))

   fi   - round(f, 2)
   fr   - round(as.numeric(fr), 2)
   frP  - round(as.numeric(frP), 2)
   fac  - round(as.numeric(fac), 2)
   facP - round(as.numeric(facP),2)

   # Table
   res - data.frame(fi, fr, frP, fac, facP)
   names(res) - c('Class limits', 'fi', 'fr', 'fr(%)', 'fac', 'fac(%)')
   tmpList - c(tmpList, list(res))
 }
 names(tmpList) - names(df)
 return(tmpList)
}

To try the function:

#a) runing nice
y1=rnorm(100, 10, 1)
y2=rnorm(100, 58, 4)
y3=rnorm(100, 500, 10)
mydf=data.frame(y1, y2, y3)
#tbdf=er.table.df.br (mydf, breaks = 'Sturges', right=F)
#tbdf=er.table.df.br (mydf, breaks = 'Scott', right=F)
tbdf=er.table.df.br (mydf, breaks = 'FD', right=F)
print(tbdf)


#b) One of the problems
y1=rnorm(100, 10, 1)
y2=rnorm(100, 58, 4)
y3=rnorm(100, 500, 10)
y4=rep(letters[1:10], 10)
mydf=data.frame(y1, y2, y3, y4)
tbdf=er.table.df.br (mydf, breaks = 'Scott', right=F)
print(tbdf)




Try this:

sapply(my.data.frame, is.numeric)

Also you might want to look up:

?match.arg
?stopifnot
?ncol
?sapply
?lapply



Thanks Gabor, you suggestion solve my basic problem.
I'm working is same basic (but I think useful) functions
for begginiers.

Below you can see the set of functions:

###
#   EasyeR - Package  #
###

# Common function---
er.make.table - function(x,
  classIni,
  classEnd,
  h,
  right) {
  # Absolut frequency
  f - table(cut(x, br = seq(classIni, classEnd, h), right = right))

  # Relative frequency
  fr - f/length(x)

  # Relative frequency, %
  frP - 100*(f/length(x))

  # Cumulative frequency
  fac - cumsum(f)

  # Cumulative frequency, %
  facP - 100*(cumsum(f/length(x)))

  fi   - round(f, 2)
  fr   - round(as.numeric(fr), 2)
  frP  - round(as.numeric(frP), 2)
  fac  - round(as.numeric(fac), 2)
  facP - round(as.numeric(facP),2)

  # Table
  res - data.frame(fi, fr, frP, fac, facP)
  names(res) - c('Class limits', 'fi', 'fr', 'fr(%)', 'fac', 'fac(%)')
  return(res)
}


#1. Tables from vectors
#1.1---er.table.br (User define breaks and right)---
er.table.br - function(x,
breaks = c('Sturges', 'Scott', 'FD'),
right = FALSE) {

  if (is.factor(x) || mode(x) != 'numeric') stop('need numeric data')

  x - na.omit(x)

  k - switch(breaks[1],
  'Sturges' = nclass.Sturges(x),
  'Scott'   = nclass.scott(x),
  'FD'  = nclass.FD(x),
  stop('breaks' must be 'Sturges', 'Scott' or 'FD'))

  tmp  - range(x)
  classIni - tmp[1] - abs(tmp[2])/100
  classEnd - tmp[2] + abs(tmp[2])/100
  R- classEnd-classIni
  h- R/k

  tbl - er.make.table(x, classIni, classEnd, h, right)
  return(tbl)
}


#1.2---er.table.kr (User define the class number (k) and right)-
er.table.kr - function(x,
k,
right = FALSE) {

  if (is.factor(x) || mode(x) != 'numeric') stop('need numeric data')
  if ((k == '') || (k == ' ')) stop('k not defined')

  x - na.omit(x)

  tmp  - range(x)
  classIni - tmp[1] - abs(tmp[2])/100
  classEnd - tmp[2] + abs(tmp[2])/100
  R- classEnd-classIni
  h- R/k

  tbl - er.make.table(x, classIni, classEnd, h, right

[R] Contingency tables from data.frames

2005-05-24 Thread Jose Claudio Faria

Dear list,

I'm trying to do a set of generic functions do make contingency tables from 
data.frames. It is just running nice (I'm learning R), but I think it can be 
better.


I would like to filter the data.frame, i.e, eliminate all not numeric variables.
And I don't know how to make it: please, help me.

Below one of the my functions ('er' is a mention to EasieR, because I'm trying 
to do a package for myself and the my students):


#2. Tables from data.frames
#2.1---er.table.df.br (User define breaks and right)
er.table.df.br - function(df,
   breaks = c('Sturges', 'Scott', 'FD'),
   right = FALSE) {

  if (is.data.frame(df) != 'TRUE')
stop('need data.frame data')

  dim_df - dim(df)

  tmpList - list()

  for (i in 1:dim_df[2]) {

x - as.matrix(df[ ,i])
x - na.omit(x)

k - switch(breaks[1],
'Sturges' = nclass.Sturges(x),
'Scott'   = nclass.scott(x),
'FD'  = nclass.FD(x),
stop('breaks' must be 'Sturges', 'Scott' or 'FD'))

tmp  - range(x)
classIni - tmp[1] - tmp[2]/100
classEnd - tmp[2] + tmp[2]/100
R- classEnd-classIni
h- R/k

# Absolut frequency
f - table(cut(x, br = seq(classIni, classEnd, h), right = right))

# Relative frequency
fr - f/length(x)

# Relative frequency, %
frP - 100*(f/length(x))

# Cumulative frequency
fac - cumsum(f)

# Cumulative frequency, %
facP - 100*(cumsum(f/length(x)))

fi   - round(f, 2)
fr   - round(as.numeric(fr), 2)
frP  - round(as.numeric(frP), 2)
fac  - round(as.numeric(fac), 2)
facP - round(as.numeric(facP),2)

# Table
res - data.frame(fi, fr, frP, fac, facP)
names(res) - c('Class limits', 'fi', 'fr', 'fr(%)', 'fac', 'fac(%)')
tmpList - c(tmpList, list(res))
  }
  names(tmpList) - names(df)
  return(tmpList)
}

To try the function:

#a) runing nice
y1=rnorm(100, 10, 1)
y2=rnorm(100, 58, 4)
y3=rnorm(100, 500, 10)
mydf=data.frame(y1, y2, y3)
#tbdf=er.table.df.br (mydf, breaks = 'Sturges', right=F)
#tbdf=er.table.df.br (mydf, breaks = 'Scott', right=F)
tbdf=er.table.df.br (mydf, breaks = 'FD', right=F)
print(tbdf)


#b) One of the problems
y1=rnorm(100, 10, 1)
y2=rnorm(100, 58, 4)
y3=rnorm(100, 500, 10)
y4=rep(letters[1:10], 10)
mydf=data.frame(y1, y2, y3, y4)
tbdf=er.table.df.br (mydf, breaks = 'Scott', right=F)
print(tbdf)

Could anyone give me a hint how to work around this?

PS: Excuse my bad English ;-)
--
Jose Claudio Faria
Brasil/Bahia/UESC/DCET
Estatistica Experimental/Prof. Adjunto
mails:
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html