Re: [R] Block comments in R?
Hi all, Indeed, block comment is more clean and elegant than line by line. If the R interpreter will recognizes it (I'm not sure if already recognizes), we will spend no more than few hours to make the syntax of the main editors compatible, isn't it? Regards, -- Jose Claudio Faria Brasil/Bahia/Ilheus/UESC/DCET Estatística Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] tel: 73-3634.2779 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Small help with plot.mca
Dear list, Please, how can I get all the levels of the same factor to have the same color and different color for each factor? Script: library(MASS) plot(mca(farms, abbrev = TRUE), rows=F) Thanks in advance, -- Jose Claudio Faria Brasil/Bahia/Ilheus/UESC/DCET Estatística Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] tel: 73-3634.2779 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Doubt about Student t distribution simulation
Dear R list, I would like to illustrate the origin of the Student t distribution using R. So, if (sample.mean - pop.mean) / standard.error(sample.mean) has t distribution with (sample.size - 1) degree free, what is wrong with the simulation below? I think that the theoretical curve should agree with the relative frequencies of the t values calculated: #== begin options= # parameters mu= 10 sigma = 5 # size of sample n = 3 # repetitions nsim = 1 # histogram parameter nchist = 150 #== end options=== t = numeric() pop = rnorm(1, mean = mu, sd = sigma) for (i in 1:nsim) { amo.i = sample(pop, n, replace = TRUE) t[i] = (mean(amo.i) - mu) / (sigma / sqrt(n)) } win.graph(w = 5, h = 7) split.screen(c(2,1)) screen(1) hist(t, main = histogram, breaks = nchist, col = lightgray, xlab = '', ylab = Fi, font.lab = 2, font = 2) screen(2) hist(t, probability = T, main= 'f.d.p and histogram', breaks = nchist, col = 'lightgray', xlab= 't', ylab = 'f(t)', font.lab= 2, font = 2) x = t curve(dt(x, df = n-1), add = T, col = red, lwd = 2) Many thanks for any help, ___ Jose Claudio Faria Brasil/Bahia/Ilheus/UESC/DCET Estatística Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Doubt about Student t distribution simulation
Dears John, Peter and Sundar, Many thanks for the quick answers!!! .. and sorry for all.. []s ___ Jose Claudio Faria Brasil/Bahia/Ilheus/UESC/DCET Estatística Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] John Fox escreveu: Dear Jose, The problem is that you're using the population standard deviation (sigma) rather than the sample SD of each sample [i.e., t[i] = (mean(amo.i) - mu) / (sd(amo.i) / sqrt(n)) ], so your values should be normally distributed, as they appear to be. A couple of smaller points: (1) Even after this correction, you're sampling from a discrete population (albeit with replacement) and so the values won't be exactly t-distributed. You could draw the samples directly from N(mu, sigma) instead. (2) It would be preferable to make a quantile-comparison plot against the t-distribution, since you'd get a better picture of what's going on in the tails. I hope this helps, John John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jose Claudio Faria Sent: Friday, August 04, 2006 3:09 PM To: R-help@stat.math.ethz.ch Subject: [R] Doubt about Student t distribution simulation Dear R list, I would like to illustrate the origin of the Student t distribution using R. So, if (sample.mean - pop.mean) / standard.error(sample.mean) has t distribution with (sample.size - 1) degree free, what is wrong with the simulation below? I think that the theoretical curve should agree with the relative frequencies of the t values calculated: #== begin options= # parameters mu= 10 sigma = 5 # size of sample n = 3 # repetitions nsim = 1 # histogram parameter nchist = 150 #== end options=== t = numeric() pop = rnorm(1, mean = mu, sd = sigma) for (i in 1:nsim) { amo.i = sample(pop, n, replace = TRUE) t[i] = (mean(amo.i) - mu) / (sigma / sqrt(n)) } win.graph(w = 5, h = 7) split.screen(c(2,1)) screen(1) hist(t, main = histogram, breaks = nchist, col = lightgray, xlab = '', ylab = Fi, font.lab = 2, font = 2) screen(2) hist(t, probability = T, main= 'f.d.p and histogram', breaks = nchist, col = 'lightgray', xlab= 't', ylab = 'f(t)', font.lab= 2, font = 2) x = t curve(dt(x, df = n-1), add = T, col = red, lwd = 2) Many thanks for any help, ___ Jose Claudio Faria Brasil/Bahia/Ilheus/UESC/DCET Estatística Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Esta mensagem foi verificada pelo E-mail Protegido Terra. Scan engine: McAfee VirusScan / Atualizado em 04/08/2006 / Versão: 4.4.00/4822 Proteja o seu e-mail Terra: http://mail.terra.com.br/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Axis Title in persp() Overlaps with Axis Labels
Dear Kilian, Also give a looked at: http://wiki.r-project.org/rwiki/doku.php?id=graph_gallery:new-graphics You will see a new and very flexible function to 3D plot. Regards, __ Jose Claudio Faria Brasil/Bahia/Ilheus/UESC/DCET Estatística Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] Paul Murrell p.murrell at auckland.ac.nz writes: Hi Kilian Plank wrote: Good morning, in a 3D plot based on persp() the axis title (of dimension z) overlaps with the axis labels. How can the distance (between axis labels and axis title) be increased? Paul Another way to do it: get the perspective matrix back from persp() and use trans3d() to redo essentially the same calculations that persp() does to decide where to put the label: x - seq(-10, 10, length= 30) y - x f - function(x,y) { r - sqrt(x^2+y^2); 10 * sin(r)/r } z - outer(x, y, f) z[is.na(z)] - 1 par(mfrow=c(2, 2)) persp(x, y, z, theta = 30, phi = 30, expand = 0.5, col = lightblue, ticktype=detailed) persp(x, y, z, theta = 30, phi = 30, expand = 0.5, col = lightblue, ticktype=detailed, zlab=\n\n\n\nz) p1 - persp(x, y, z, theta = 30, phi = 30, expand = 0.5, col = lightblue, ticktype=detailed,zlab=) ranges - t(sapply(list(x,y,z),range)) means - rowMeans(ranges) ## label offset distance, as a fraction of the plot width labelspace - 0.12 ## tweak this until you like the result xpos - min(x)-(diff(range(x)))*labelspace ypos - min(y)-(diff(range(y)))*labelspace labelbot3d - c(xpos,ypos,min(z)) labeltop3d - c(xpos,ypos,max(z)) labelmid3d - c(xpos,ypos,mean(range(z))) trans3dfun - function(v) { trans3d(v[1],v[2],v[3],p1) } labelbot2d - trans3dfun(labelbot3d) labelmid2d - trans3dfun(labelmid3d) labeltop2d - trans3dfun(labeltop3d) labelang - 180/pi*atan2(labeltop2d$y-labelbot2d$y,labeltop2d$x-labelbot2d$x) par(xpd=NA,srt=labelang) ## disable clipping and set string rotation text(labelmid2d[1]$x,labelmid2d$y,z label) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] ANOVA: Help with SSQ decomposition and contrasts
# Dear R list, # # A have a doubt about SSQ decomposition and contrasts with ANOVAs. # So, I would like a tip from more advanced R users. # Below my data, the basic script and my doubts: # Data r = paste('r', gl(3, 8), sep='') e = paste('e', rep(gl(2, 4), 3), sep='') tra = sort(paste('t', rep(1:6, 4), sep='')) y = c(26.2, 26.0, 25.0, 25.4, 24.8, 24.6, 26.7, 25.2, 25.7, 26.3, 25.1, 26.4, 19.6, 21.1, 19.0, 18.6, 22.8, 19.4, 18.8, 19.2, 19.8, 21.4, 22.8, 21.3) dF = data.frame(r, e, tra, y) # Graphic par(mfrow=c(2,1)) interaction.plot(dF$r, dF$e, dF$y, col = 'blue', ylab = 'Y', xlab = 'R') interaction.plot(dF$e, dF$r, dF$y, col = 'blue', ylab = 'Y', xlab = 'R') # ANOVAs av0 = aov(y ~ tra, data=dF) summary(av0) av1 = aov(y ~ r*e, data=dF) summary(av1) av2 = aov(y ~ r/e, data=dF) e_r = summary(av2, split = list('r:e' = list( 'e1 vs e2/r1' = 1, 'e1 vs e2/r2' = 2, 'e1 vs e2/r3' = 3))) e_r av3 = aov(y ~ e/r, data=dF) r_e = summary(av3, split = list('e:r' = list( 'r/e1' = c(1,3), 'r/e2' = c(2,4 r_e # My Doubts # a) How to make SSQ decomposition to complete the ANOVA below? #Df Sum Sq Mean Sq F valuePr(F) # e1 19.082 19.082 14.875 0.001155 # e:r 4 156.622 39.155 30.524 8.438e-08 # e:r: r/e1 2 87.122 43.561 33.958 7.776e-07 # r1 vs (r2, r3) 1 ? ? ? ? # r2 vs r3 1 ? ? ? ? # e:r: r/e2 2 69.500 34.750 27.090 3.730e-06 # r1 vs (r2, r3) 1 ? ? ? ? # r2 vs r3 1 ? ? ? ? # Residuals 18 23.090 1.283 # b) How to make SSQ decomposition to complete the ANOVA below? #Df Sum Sq Mean Sq F valuePr(F) # e1 19.082 19.082 14.875 0.001155 # e:r 4 156.622 39.155 30.524 8.438e-08 # e:r: r/e1 2 87.122 43.561 33.958 7.776e-07 # r2 vs (r1, r3) 1 ? ? ? ? # r1 vs r3 1 ? ? ? ? # e:r: r/e2 2 69.500 34.750 27.090 3.730e-06 # r2 vs (r1, r3) 1 ? ? ? ? # r1 vs r3 1 ? ? ? ? # Residuals 18 23.090 1.283 # TIA, -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] tel: 73-3634.2779 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] linear contrasts with anova
Date: Mon, 23 Jan 2006 13:25:33 +0100 From: Christoph Buser [EMAIL PROTECTED] Subject: Re: [R] linear contrasts with anova To: Posta Univ. Cagliari [EMAIL PROTECTED] Cc: r-help@stat.math.ethz.ch Message-ID: [EMAIL PROTECTED] Content-Type: text/plain; charset=us-ascii Dear Marco If you are interested in a comparison of a reference level against each other level (in your case level 1 against level 2 and level 1 against level 3), you can use the summary.lm(), since this is the default contrast (see ?contr.treatment) ar - data.frame(GROUP = factor(rep(1:3, each = 8)), DIP = c(3.0, 3.0, 4.0, 5.0, 6.0, 7.0, 3.0, 2.0, 1.0, 6.0, 5.0, 7.0, 2.0, 3.0, 1.5, 1.7, 17.0, 12.0, 15.0, 16.0, 12.0, 23.0, 19.0, 21.0)) r.aov10 - aov(DIP ~ GROUP, data = ar) anova(r.aov10) summary.lm(r.aov10) As result you will get the comparison GROUP 2 against GROUP 1, denoted by GROUP2 and the comparison GROUP 3 against GROUP 1, denoted by GROUP3. Be careful. In your example you include both GROUP and C1 or C2, respectively in your model. This results in a over parameterized model and you get a warning that not all coefficients have been estimated, due to singularities. It is possible to use other contrasts than contr.treatment, contr.sum, contr.helmert, contr.poly, but then you have to specify the correctly in the model. Regards, Christoph Buser -- Christoph Buser [EMAIL PROTECTED] Seminar fuer Statistik, LEO C13 ETH (Federal Inst. Technology)8092 Zurich SWITZERLAND phone: x-41-44-632-4673 fax: 632-1228 http://stat.ethz.ch/~buser/ -- Dear Marco Try also the the below: # Loading packages library(gplots) library(gmodels) # Testing the nature of dF is.data.frame(dF) is.factor(dF$GROUP) is.numeric(dF$DIP) #Plotting GROUPS win.graph(w = 4, h = 5) plotmeans(DIP ~ GROUP, data = dF, mean.labels = TRUE, digits = 3, col = 'blue', connect = FALSE, ylab = 'DIP', xlab = 'GROUP', pch='') # Contrasts attach(dF) #1 2 3 GROUP cmat = rbind('1 vs. 3' = c( 1, 0, -1,), '1 vs. 2' = c( 1, -1, 0,)) av = aov(DIP ~ GROUP, data = dF, contrasts = list(GROUP = make.contrasts(cmat))) sav = summary(av1, split = list(GROUP=list('1 vs. 3'=1, '1 vs. 2'=2))) sav detach(dF) # Another option attach(dF) #A B C fit.contrast(av, GROUP, c( 1, 0, -1)) # from gmodels fit.contrast(av, GROUP, c( 1, -1, 0)) detach(dF) HTH, []s, -- Jose Claudio Faria Brasil/Bahia/Ilhéus/UESC/DCET Estatística Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] R (2.2.0), R-DCOM and Delphi
Dear Dieter, First, thank you for your work! Below the error message I got when trying to compile your RDCom.dpr [Warning] STATCONNECTORCLNTLib_TLB.pas(319): Unsafe type 'EventDispIDs: Pointer' [Warning] STATCONNECTORCLNTLib_TLB.pas(320): Unsafe type 'LicenseKey: Pointer' [Warning] STATCONNECTORCLNTLib_TLB.pas(324): Unsafe code '@ operator' [Warning] STATCONNECTORCLNTLib_TLB.pas(367): Unsafe type 'EventDispIDs: Pointer' [Warning] STATCONNECTORCLNTLib_TLB.pas(368): Unsafe type 'LicenseKey: Pointer' [Warning] STATCONNECTORCLNTLib_TLB.pas(372): Unsafe code '@ operator' [Warning] STATCONNECTORCLNTLib_TLB.pas(374): Unsafe code '@ operator' [Warning] STATCONNECTORSRVLib_TLB.pas(376): Unsafe type 'LicenseKey: Pointer' [Warning] STATCONNECTORSRVLib_TLB.pas(379): Unsafe code '@ operator' [Warning] RCom.pas(93): Unsafe code 'String index to var param' [Error] RCom.pas(119): Undeclared identifier: 'VarType' [Error] RCom.pas(124): Undeclared identifier: 'VarArrayDimCount' [Error] RCom.pas(127): Undeclared identifier: 'VarArrayHighBound' [Error] RCom.pas(140): Undeclared identifier: 'VarType' [Error] RCom.pas(145): Undeclared identifier: 'VarArrayDimCount' [Error] RCom.pas(148): Undeclared identifier: 'VarArrayHighBound' [Error] RCom.pas(161): Undeclared identifier: 'VarType' [Error] RCom.pas(166): Undeclared identifier: 'VarArrayDimCount' [Error] RCom.pas(169): Undeclared identifier: 'VarArrayHighBound' [Error] RCom.pas(181): Undeclared identifier: 'VarArrayCreate' [Error] RCom.pas(196): Undeclared identifier: 'VarArrayCreate' [Error] RCom.pas(211): Undeclared identifier: 'VarArrayCreate' [Error] RCom.pas(226): Undeclared identifier: 'VarArrayCreate' [Error] RCom.pas(253): Undeclared identifier: 'VarType' [Error] RCom.pas(258): Undeclared identifier: 'VarArrayDimCount' [Error] RCom.pas(261): Undeclared identifier: 'VarArrayHighBound' [Fatal Error] RDComMain.pas(14): Could not compile used unit 'RCom.pas' I'm using Delphi 7 under WinXP pro/SP2. Could you give me a tip? Regards, -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] tel: 73-3634.2779 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Help with lattice, regressions and respective lines
# Dear R list, # # I'm needing help with lattice, regression and respective lines. # My data is below: bra = gl(2, 24, label = c('c', 's')) em = rep(gl(3, 8, label = c('po', 'pov', 'ce')), 2) tem = rep(c(0, 0, 30, 30, 60, 60, 90, 90), 6) tem2 = tem^2 r= rep(1:2, 24) y= c(40.58, 44.85, 32.55, 35.68, 64.86, 51.95, 42.52, 52.21, 40.58, 44.85, 33.46, 46.09, 12.75, 18.01, 16.82, 13.69, 40.58, 44.85, 34.45, 29.89, 34.91, 28.10, 27.52, 22.24, 48.68, 47.25, 45.58, 45.33, 41.03, 51.20, 45.85, 54.45, 48.68, 47.25, 19.88, 19.67, 16.20, 13.49, 13.75, 18.80, 48.68, 47.25, 42.19, 39.91, 34.69, 34.11, 32.74, 34.24) Df = data.frame(bra, em, tem, tem2, r, y) # Regressions attach(Df) Dfs1=subset(Df, (bra=='s' em=='pov'), select=c(bra, em, tem, tem2, r, y)) Dfs1 rlin1=lm(y ~ tem + tem2, data=Dfs1) summary(rlin1) Dfs2=subset(Df, (bra=='s' em=='po'), select=c(bra, em, tem, r, y)) Dfs2 rlin2=lm(y ~ tem, data=Dfs2) summary(rlin2) Dfs3=subset(Df, (bra=='s' em=='ce'), select=c(bra, em, tem, tem2, r, y)) Dfs3 rlin3=lm(y ~ tem + tem2, data=Dfs3) summary(rlin3) detach(Df) # I would like to plot with lattice 'y ~ tem | em', # with the panels ('po', 'pov' and 'ce'), # and the its respective regressions lines: # a) linear for panel 'po' or better, without line; # b) quadratic for 'pov' and 'ce' # Is it possible? Could somebody hel me? # I'm trying: library(lattice) attach(Df) Dfs=subset(Df, bra=='s', select=c(bra, em, tem, y)) Dfs xyplot(y ~ tem | em, data = Dfs, ylim=c(10, 60), xlim=c(-10, 110), ylab='y', xlab='Time, days', layout = c(3,1)) detach(Df) TIA, -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] tel: 73-3634.2779 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Help with a more flexible funtion for multiple comparisio n of means
Rogers, James A [PGRD Groton] wrote: Jose - Before implementing SNK and Duncan's, you may want to be aware of some criticisms of these methods: From Hsu (1996), Newman-Keuls multiple range test is not a confident inequalities method and cannot be recommended. Duncan's multiple range test is not a confident inequalities method and cannot be recommended either. In the words of Tukey (1991), Duncan's multiple range test was a 'distraction' in the history of multiple comparisons, amounting to 'talking 5% while using more than 5% simultaneous' @Book{Hsu1996, author = {Jason C. Hsu}, title = {Multiple Comparisons: Theory and Methods}, publisher = {Chapman Hall}, year = {1996} } -- Jim Ok Jim, I know these limitations, but these tests exists. BTW, I am making this function for academic reasons and to learning how to make more advanced R functions. Thanks for all. Best, -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] tel: 73-3634.2779 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] help with one matrix
Gabor Grothendieck wrote: On 9/10/05, Jose Claudio Faria [EMAIL PROTECTED] wrote: Dear R-list, Could anybody tell me how to make one matrix as the below: [,1] [,2] [,3] [,4] [,5] [,6] [1,]-23456 [2,]2-2345 [3,]32-234 [4,]432-23 [5,]5432-2 [6,]65432- Assuming that - means NA dd - diag(NA, 6) abs(col(dd) - row(dd)) + 1 + dd or abs(outer(1:6, 1:6, -)) + 1 + diag(NA,6) or f - function(x,y) ifelse(x==y, NA, abs(x-y)+1) outer(1:6, 1:6, f) Hi, You are always solving (and teaching) my R doubts: thanks Gabor, very much! Because I need one, I've been trying to make a more flexible function for multiple comparison test of means (Tukey, SNK and Duncan). The matrix above is necessary for SNK and Duncan tests. So, when running I will to sent it for you for suggestions. Best, -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] tel: 73-3634.2779 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Help with a more flexible funtion for multiple comparision of means
Dear R-list, Could anybody tell me (or give me a tip) of how to implement the Duncan distribution in R? I've been trying to make a new and more flexible function for multiple comparison of means: Tukey, SNK and Duncan, from 'aov' objects, like TukeyHSD function. For while, it is running nice (Tukey and SNK), for simple design (completely randomized, randomized block and Latin squares) and simple experimental schemes (one factor). I'm needing only two informations: 'qduncan' and 'pduncan', similar to already available in R 'qtukey' and 'ptukey'. The basic algorithm implemented with SNK test will be used for Duncan test. Below a sample: a) Generating data and calling the function: tra = gl(4, 5, label = c('A', 'B', 'C', 'D')) blo = rep(1:5, 4) pro = c(NA, 26, 20, 23, 21, 31, 25, 28, 27, 24, 22, 26, NA, 25, 29, 33, 29, 31, 34, NA) x = aov(pro ~ tra) #or x= aov(pro ~ blo + tra) res = mctm(x, which='tra', test='SNK', conf.level=0.95) print(res) b) The R output: $Table Tables of means Grand mean 26.70588 tra A BC D 22.5 27 25.5 31.75 rep 4.0 5 4.0 4.00 $Ordered means tra D B C A 31.75 27.00 25.50 22.50 $Result D B C A D - * * * B * - ns ns C * ns - ns A * ns ns - $Test [1] SNK $Conf.level [1] 0.95 $Mean differences DBCA D 0.00 4.75 6.25 9.25 B 4.75 0.00 1.50 4.50 C 6.25 1.50 0.00 3.00 A 9.25 4.50 3.00 0.00 $Minimum Significative Differences - MSD DBCA D 0.00 3.83 4.93 5.48 B 3.83 0.00 3.83 4.68 C 4.93 3.83 0.00 4.04 A 5.48 4.68 4.04 0.00 $Replicates number D B C A D - 4:5 4:4 4:4 B 5:4 - 5:4 5:4 C 4:4 4:5 - 4:4 A 4:4 4:5 4:4 - Thanks in advance, -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] tel: 73-3634.2779 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] help with one matrix
Hi Jim, Many thanks for the function! -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] tel: 73-3634.2779 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] help with one matrix
Dear R-list, Could anybody tell me how to make one matrix as the below: [,1] [,2] [,3] [,4] [,5] [,6] [1,]-23456 [2,]2-2345 [3,]32-234 [4,]432-23 [5,]5432-2 [6,]65432- Thanks in advance, -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] tel: 73-3634.2779 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Statistics with R
Hi Vincent, Thanks for your good work! Best, -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] tel: 73-3634.2779 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] An small suggestion for the R team
Hi all, I would like to suggest that all R functions/etc like: codes-deprecated grid-internal ns-alt ns-dblcolon ns-hooks ns-internals ns-lowlev ns-reflect.Rd tools-internal ts-defunct utils-deprecated utils-internal ... and another i.e, function/word separate for '-' were all substituted by codes_deprecated grid_internal ns_alt ns_dblcolon ns_hooks ns_internals ns_lowlev ns_reflect.Rd tools_internal ts_defunct utils_deprecated utils_internal ... and another i.e, by '_' Because it is impossible to make a good highlighter with the first one. How to differentiating myValue: varOne = 100 varTwo = 50 myValue = varOne-varTwo from codes-deprecated, or ns-alt, for example. TIA, -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] tel: 73-3634.2779 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] An small suggestion for the R team
Gabor Grothendieck wrote: On 8/4/05, Gabor Grothendieck [EMAIL PROTECTED] wrote: On 8/4/05, Jose Claudio Faria [EMAIL PROTECTED] wrote: Hi all, I would like to suggest that all R functions/etc like: codes-deprecated grid-internal ns-alt ns-dblcolon ns-hooks ns-internals ns-lowlev ns-reflect.Rd tools-internal ts-defunct utils-deprecated utils-internal ... and another i.e, function/word separate for '-' were all substituted by codes_deprecated grid_internal ns_alt ns_dblcolon ns_hooks ns_internals ns_lowlev ns_reflect.Rd tools_internal ts_defunct utils_deprecated utils_internal ... and another i.e, by '_' Because it is impossible to make a good highlighter with the first one. How to differentiating myValue: varOne = 100 varTwo = 50 myValue = varOne-varTwo from codes-deprecated, or ns-alt, for example. One can create a variable with a minus in it by using backquotes: `a-b` - 3 a - 10 b - 4 `a-b` + a-b [1] 9 # 3 + 10 - 4 One other point. The help system, i.e. ?, accommodates minus signs like this where the last two statements below give the same result: library(survival) ?survival-internal internal?survival Another example is, library(dyn) package?dyn Indeed, the above are very good examples of how to contour the original problem Gabor! But the central problem is to make Good R highlighter, as I'm trying to do for the programs Tinn-R and jEdit with this functions. Is not possible differentiate the signal minus as a simple character (word-word) from the mathematical operator minus (object-object). I think that is more easy to remove all this functions/reserved words. TIA, -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] tel: 73-3634.2779 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] An small suggestion for the R team
Hi THomas, This is very a good information: thanks very much! So, I will just now to remove all from the highlighters. BTW, do you know if the signal minus is used in some function or reserved word? TIA, Jose Claudio Faria Thomas Lumley wrote: On Thu, 4 Aug 2005, Jose Claudio Faria wrote: I would like to suggest that all R functions/etc like: codes-deprecated grid-internal ns-alt ns-dblcolon ns-hooks ns-internals ns-lowlev ns-reflect.Rd tools-internal ts-defunct utils-deprecated utils-internal ... and another These are not R functions or reserved works. They are just the names of help pages. They should not occur in code (except in double-quoted strings). -thomas __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] An small suggestion for the R team
Ok Duncan, Thank you very much for the tip! I will to remove all these identifiers. Regards, -- Jose Claudio Faria Duncan Murdoch wrote: Jose Claudio Faria wrote: Hi all, I would like to suggest that all R functions/etc like: codes-deprecated grid-internal ns-alt ns-dblcolon ns-hooks ns-internals ns-lowlev ns-reflect.Rd tools-internal ts-defunct utils-deprecated utils-internal ... and another i.e, function/word separate for '-' were all substituted by codes_deprecated grid_internal ns_alt ns_dblcolon ns_hooks ns_internals ns_lowlev ns_reflect.Rd tools_internal ts_defunct utils_deprecated utils_internal ... and another i.e, by '_' Because it is impossible to make a good highlighter with the first one. Your suggested names are all valid identifiers. The - gives strings that are not. That's intentional; the help system relies on it. You find utils-deprecated using deprecated?utils. If your highlighter has problems with the -, then don't bother highlighting. Those names aren't very common. Or... How to differentiating myValue: varOne = 100 varTwo = 50 myValue = varOne-varTwo from codes-deprecated, or ns-alt, for example. ...you could do it by context. codes-deprecated will only show up in *.Rd files as a possible search term: an alias, or a name. Duncan Murdoch __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Help with Mahalanobis
Well, as I did not get a satisfactory reply to the original question I tried to make a basic function that, I find, solve the question. I think it is not the better function, but it is working. So, perhaps it can be useful to other people. # # Calculate the matrix of Mahalanobis Distances between groups # from data.frames # # by: José Cláudio Faria # date: 10/7/05 13:23:48 # D2Mah = function(y, x) { stopifnot(is.data.frame(y), !missing(x)) stopifnot(dim(y)[1] != dim(x)[1]) y= as.matrix(y) x= as.factor(x) man = manova(y ~ x) E= summary(man)$SS[2] #Matrix E S= as.matrix(E$Residuals)/man$df.residual InvS = solve(S) mds = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y)) colnames(mds) = names(y) Objects = levels(x) rownames(mds) = Objects library(gtools) nObjects = nrow(mds) comb = combinations(nObjects, 2) tmpD2 = numeric() for (i in 1:dim(comb)[1]){ a = comb[i,1] b = comb[i,2] tmpD2[i] = (mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,]) } # Thanks Gabor for the below tmpMah = matrix(0, nObjects, nObjects, dimnames=list(Objects, Objects)) tmpMah[lower.tri(tmpMah)] = tmpD2 D2 = tmpMah + t(tmpMah) return(D2) } # # To try # D2M = D2Mah(iris[,1:4], iris[,5]) print(D2M) Thanks all for the complementary aid (specially to Gabor). Regards, -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] tel: 73-3634.2779 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Help with Mahalanobis
Indeed, it is very nice Gabor (as always)! So, a doubt: how to preserve the 'rowname' and 'colname' of D2, like in the first function? I think it is useful to posterior analyzes (as cluster, for example). Regards, # A small correction (reference to gtools was eliminated) D2Mah2 = function(y, x) { stopifnot(is.data.frame(y), !missing(x)) stopifnot(dim(y)[1] != dim(x)[1]) y= as.matrix(y) x= as.factor(x) man = manova(y ~ x) E= summary(man)$SS[2] #Matrix E S= as.matrix(E$Residuals)/man$df.residual InvS = solve(S) mds = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y)) nObjects = nrow(mds) f = function(a,b) mapply(function(a,b) (mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,]), a, b) D2 = outer(seq(nObjects), seq(nObjects), f) } # # test # D2M2 = D2Mah2(iris[,1:4], iris[,5]) print(D2M2) Gabor Grothendieck wrote: I think you could simplify this by replacing everything after the nObjects = nrow(mds) line with just these two statements. f - function(a,b) mapply(function(a,b) (mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,]), a,b) D2 - outer(seq(nObjects), seq(nObjects), f) This also eliminates dependence on gtools and the complexity of dealing with triangular matrices. Regards. Here it is in full: D2Mah2 = function(y, x) { stopifnot(is.data.frame(y), !missing(x)) stopifnot(dim(y)[1] != dim(x)[1]) y= as.matrix(y) x= as.factor(x) man = manova(y ~ x) E= summary(man)$SS[2] #Matrix E S= as.matrix(E$Residuals)/man$df.residual InvS = solve(S) mds = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y)) library(gtools) nObjects = nrow(mds) ### changed part is next two statements f - function(a,b) mapply(function(a,b) (mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,]), a,b) D2 - outer(seq(nObjects), seq(nObjects), f) } # # test # D2M2 = D2Mah2(iris[,1:4], iris[,5]) print(D2M2) On 7/10/05, Jose Claudio Faria [EMAIL PROTECTED] wrote: Well, as I did not get a satisfactory reply to the original question I tried to make a basic function that, I find, solve the question. I think it is not the better function, but it is working. So, perhaps it can be useful to other people. # # Calculate the matrix of Mahalanobis Distances between groups # from data.frames # # by: José Cláudio Faria # date: 10/7/05 13:23:48 # D2Mah = function(y, x) { stopifnot(is.data.frame(y), !missing(x)) stopifnot(dim(y)[1] != dim(x)[1]) y= as.matrix(y) x= as.factor(x) man = manova(y ~ x) E= summary(man)$SS[2] #Matrix E S= as.matrix(E$Residuals)/man$df.residual InvS = solve(S) mds = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y)) colnames(mds) = names(y) Objects = levels(x) rownames(mds) = Objects library(gtools) nObjects = nrow(mds) comb = combinations(nObjects, 2) tmpD2 = numeric() for (i in 1:dim(comb)[1]){ a = comb[i,1] b = comb[i,2] tmpD2[i] = (mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,]) } # Thanks Gabor for the below tmpMah = matrix(0, nObjects, nObjects, dimnames=list(Objects, Objects)) tmpMah[lower.tri(tmpMah)] = tmpD2 D2 = tmpMah + t(tmpMah) return(D2) } # # To try # D2M = D2Mah(iris[,1:4], iris[,5]) print(D2M) Thanks all for the complementary aid (specially to Gabor). Regards, -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] tel: 73-3634.2779 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Esta mensagem foi verificada pelo E-mail Protegido Terra. Scan engine: McAfee VirusScan / Atualizado em 08/07/2005 / Versão: 4.4.00 - Dat 4531 Proteja o seu e-mail Terra: http://mail.terra.com.br/ -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] tel: 73-3634.2779 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Help with Mahalanobis
I think we now have a very good R function here. Thanks for all Gabor! JCFaria Gabor Grothendieck wrote: This one adds the labels: D2Mah4 = function(y, x) { stopifnot(is.data.frame(y), !missing(x)) stopifnot(dim(y)[1] != dim(x)[1]) y= as.matrix(y) x= as.factor(x) man = manova(y ~ x) E= summary(man)$SS[2] #Matrix E S= as.matrix(E$Residuals)/man$df.residual InvS = solve(S) mds = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y)) f - function(a,b) mapply(function(a,b) mahalanobis(mds[a,],mds[b,],InvS,TRUE), a, b) seq. - seq(length = nrow(mds)) names(seq.) - levels(x) D2 - outer(seq., seq., f) } # # test # D2M4 = D2Mah4(iris[,1:4], iris[,5]) print(D2M4) On 7/10/05, Jose Claudio Faria [EMAIL PROTECTED] wrote: Indeed, it is very nice Gabor (as always)! So, a doubt: how to preserve the 'rowname' and 'colname' of D2, like in the first function? I think it is useful to posterior analyzes (as cluster, for example). Regards, # A small correction (reference to gtools was eliminated) D2Mah2 = function(y, x) { stopifnot(is.data.frame(y), !missing(x)) stopifnot(dim(y)[1] != dim(x)[1]) y= as.matrix(y) x= as.factor(x) man = manova(y ~ x) E= summary(man)$SS[2] #Matrix E S= as.matrix(E$Residuals)/man$df.residual InvS = solve(S) mds = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y)) nObjects = nrow(mds) f = function(a,b) mapply(function(a,b) (mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,]), a, b) D2 = outer(seq(nObjects), seq(nObjects), f) } # # test # D2M2 = D2Mah2(iris[,1:4], iris[,5]) print(D2M2) Gabor Grothendieck wrote: I think you could simplify this by replacing everything after the nObjects = nrow(mds) line with just these two statements. f - function(a,b) mapply(function(a,b) (mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,]), a,b) D2 - outer(seq(nObjects), seq(nObjects), f) This also eliminates dependence on gtools and the complexity of dealing with triangular matrices. Regards. Here it is in full: D2Mah2 = function(y, x) { stopifnot(is.data.frame(y), !missing(x)) stopifnot(dim(y)[1] != dim(x)[1]) y= as.matrix(y) x= as.factor(x) man = manova(y ~ x) E= summary(man)$SS[2] #Matrix E S= as.matrix(E$Residuals)/man$df.residual InvS = solve(S) mds = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y)) library(gtools) nObjects = nrow(mds) ### changed part is next two statements f - function(a,b) mapply(function(a,b) (mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,]), a,b) D2 - outer(seq(nObjects), seq(nObjects), f) } # # test # D2M2 = D2Mah2(iris[,1:4], iris[,5]) print(D2M2) On 7/10/05, Jose Claudio Faria [EMAIL PROTECTED] wrote: Well, as I did not get a satisfactory reply to the original question I tried to make a basic function that, I find, solve the question. I think it is not the better function, but it is working. So, perhaps it can be useful to other people. # # Calculate the matrix of Mahalanobis Distances between groups # from data.frames # # by: José Cláudio Faria # date: 10/7/05 13:23:48 # D2Mah = function(y, x) { stopifnot(is.data.frame(y), !missing(x)) stopifnot(dim(y)[1] != dim(x)[1]) y= as.matrix(y) x= as.factor(x) man = manova(y ~ x) E= summary(man)$SS[2] #Matrix E S= as.matrix(E$Residuals)/man$df.residual InvS = solve(S) mds = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y)) colnames(mds) = names(y) Objects = levels(x) rownames(mds) = Objects library(gtools) nObjects = nrow(mds) comb = combinations(nObjects, 2) tmpD2 = numeric() for (i in 1:dim(comb)[1]){ a = comb[i,1] b = comb[i,2] tmpD2[i] = (mds[a,] - mds[b,])%*%InvS%*%(mds[a,] - mds[b,]) } # Thanks Gabor for the below tmpMah = matrix(0, nObjects, nObjects, dimnames=list(Objects, Objects)) tmpMah[lower.tri(tmpMah)] = tmpD2 D2 = tmpMah + t(tmpMah) return(D2) } # # To try # D2M = D2Mah(iris[,1:4], iris[,5]) print(D2M) Thanks all for the complementary aid (specially to Gabor). Regards, -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] tel: 73-3634.2779 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Esta mensagem foi verificada pelo E-mail Protegido Terra. Scan engine: McAfee VirusScan / Atualizado em 08/07/2005 / Versão: 4.4.00 - Dat 4531 Proteja o seu e-mail Terra: http://mail.terra.com.br/ -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] tel: 73-3634.2779 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting
Re: [R] Help: Mahalanobis distances between 'Species' from iris
Dear Mulholland, I'm sorry, I was not clearly and objective sufficiently. Below you can see what I'm was trying to do: # D2Mah by JCFaria and Gabor Grothiendieck: 10/7/05 20:46:41 D2Mah = function(y, x) { stopifnot(is.data.frame(y), !missing(x)) stopifnot(dim(y)[1] != dim(x)[1]) y= as.matrix(y) x= as.factor(x) man = manova(y ~ x) E= summary(man)$SS[2] #Matrix E S= as.matrix(E$Residuals)/man$df.residual InvS = solve(S) mds = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y)) f = function(a,b) mapply(function(a,b) mahalanobis(mds[a,], mds[b,], InvS, TRUE), a, b) seq. = seq(length = nrow(mds)) names(seq.) = levels(x) D2 = outer(seq., seq., f) } # # test # D2M = D2Mah(iris[,1:4], iris[,5]) print(D2M) dend = hclust(as.dist(sqrt(D2M)), method='complete') plot(dend) So, Thanks for the reply. Best, JCFaria Mulholland, Tom wrote: Why don't you use mahalanobis in the stats package. The function Returns the Mahalanobis distance of all rows in 'x' and the vector mu='center' with respect to Sigma='cov'. This is (for vector 'x') defined as D^2 = (x - mu)' Sigma^{-1} (x - mu) I don't have any idea what this does but it appears to be talking about the same topic. If it is not suitable package fpc has related functions and adehabitat has a function for Habitat Suitability Mapping with Mahalanobis Distances Tom -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Jose Claudio Faria Sent: Thursday, 7 July 2005 5:29 AM To: r-help@stat.math.ethz.ch Subject: [R] Help: Mahalanobis distances between 'Species' from iris Dear R list, I'm trying to calculate Mahalanobis distances for 'Species' of 'iris' data as obtained below: Squared Distance to Species From Species: Setosa Versicolor Virginica Setosa 0 89.86419 179.38471 Versicolor 89.86419 0 17.20107 Virginica 179.38471 17.20107 0 This distances above were obtained with proc 'CANDISC' of SAS, please, see Output 21.1.2: Iris Data: Squared Mahalanobis Distances from http://www.id.unizh.ch/software/unix/statmath/sas/sasdoc/stat/ chap21/sect19.htm From this distance my intention is to make a cluster analysis as below, using the package 'mclust': # # --- Begin R script --- # # For units compatibility of 'iris' from R dataset and 'iris' data used in # the SAS example: Measures = iris[,1:4]*10 Species = iris[,5] irisSAS = data.frame(Measures, Species) n = 3 Mah = c(0, 89.86419,0, 179.38471, 17.20107, 0) # My Question is: how to obtain 'Mah' with R from 'irisSAS' data? D = matrix(0, n, n) nam = c('Set', 'Ver', 'Vir') rownames(D) = nam colnames(D) = nam k = 0 for (i in 1:n) { for (j in 1:i) { k = k+1 D[i,j] = Mah[k] D[j,i] = Mah[k] } } D=sqrt(D) #D2 - D library(mclust) dendroS = hclust(as.dist(D), method='single') dendroC = hclust(as.dist(D), method='complete') win.graph(w = 3.5, h = 6) split.screen(c(2, 1)) screen(1) plot(dendroS, main='Single', sub='', xlab='', ylab='', col='blue') screen(2) plot(dendroC, main='Complete', sub='', xlab='', col='red') # # --- End R script --- # I always need of this type of analysis and I'm not founding how to make it in the CRAN documentation (Archives, packages: mclust, cluster, fpc and mva). Regards, -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Help with Mahalanobis
Christian Hennig wrote: Dear Jose, normal mixture clustering (mclust) operates on points times variables data and not on a distance matrix. Therefore it doesn't make sense to compute Mahalanobis distances before using mclust. Furthermore, cluster analysis based on distance matrices (hclust or pam, say) operates on a point by point distance matrix (be it Mahalanobis or something else). You show a group by group matrix below, for which I don't see any purpose in cluster analysis. Have you looked at function mahalanobis? Christian Dear Christian, First of all, thanks for the reply! So, multivariate analysis is not my field of domain, I'm studying this because it is necessary in my works. I'm using 'iris' only as an example of my real problem, because I normally work with many response variables (5 or more), with replicates (10 or more) of many groups (20 or more). In these cases, I think, the final dendogram using 'mclust' package is not very good/clear. I learned, in these cases, that the generalized distance of Mahalanobis, obtained as in the prior example (see script), is one of the best choice to study the similarity between the groups. Do you agree? If yes, I need to cluster the objects from this matrix of distances between the groups. My option by 'mclust' package was because I'm studying also it, no more, and I think that, for the purpose, it works nice. Could you help me about another (and simple) choice of analyze? JCFaria On Fri, 8 Jul 2005, Jose Claudio Faria wrote: Dear R list, I'm trying to calculate Mahalanobis distances for 'Species' of 'iris' data as obtained below: Squared Distance to Species From Species: Setosa Versicolor Virginica Setosa 0 89.86419 179.38471 Versicolor 89.86419 0 17.20107 Virginica 179.38471 17.20107 0 These distances were obtained with proc 'CANDISC' of SAS, please, see Output 21.1.2: Iris Data: Squared Mahalanobis Distances from http://www.id.unizh.ch/software/unix/statmath/sas/sasdoc/stat/chap21/sect19.htm From these distances my intention is to make a cluster analysis as below, using the package 'mclust': In prior mail, my basic question was: how to obtain this matrix with R from 'iris' data? Well, I think that the basic soluction to calculate this distances is: # # --- Begin R script 1 --- # x = as.matrix(iris[,1:4]) tra = iris[,5] man = manova(x ~ tra) # Mahalanobis E= summary(man)$SS[2] #Matrix E S= as.matrix(E$Residuals)/man$df.residual InvS = solve(S) ms = matrix(unlist(by(x, tra, mean)), byrow=T, ncol=ncol(x)) colnames(ms) = names(iris[1:4]) rownames(ms) = c('Set', 'Ver', 'Vir') D2.12 = (ms[1,] - ms[2,])%*%InvS%*%(ms[1,] - ms[2,]) print(D2.12) D2.13 = (ms[1,] - ms[3,])%*%InvS%*%(ms[1,] - ms[3,]) print(D2.13) D2.23 = (ms[2,] - ms[3,])%*%InvS%*%(ms[2,] - ms[3,]) print(D2.23) # # --- End R script 1 --- # Well, I would like to generalize a soluction to obtain the matrices like 'Mah' (below) or a complete matrix like in the Output 21.1.2. Somebody could help me? # # --- Begin R script 2 --- # Mah = c(0, 89.86419,0, 179.38471, 17.20107, 0) n = 3 D = matrix(0, n, n) nam = c('Set', 'Ver', 'Vir') rownames(D) = nam colnames(D) = nam k = 0 for (i in 1:n) { for (j in 1:i) { k = k+1 D[i,j] = Mah[k] D[j,i] = Mah[k] } } D=sqrt(D) #D2 - D library(mclust) dendroS = hclust(as.dist(D), method='single') dendroC = hclust(as.dist(D), method='complete') win.graph(w = 3.5, h = 6) split.screen(c(2, 1)) screen(1) plot(dendroS, main='Single', sub='', xlab='', ylab='', col='blue') screen(2) plot(dendroC, main='Complete', sub='', xlab='', col='red') # # --- End R script 2 --- # I always need of this type of analysis and I'm not founding how to make it in the CRAN documentation (Archives, packages: mclust, cluster, fpc and mva). Regards, -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html *** NEW ADDRESS! *** Christian Hennig University College London, Department of Statistical Science Gower St., London WC1E 6BT, phone +44 207 679 1698 [EMAIL PROTECTED], www.homepages.ucl.ac.uk/~ucakche Esta mensagem foi verificada pelo E-mail Protegido Terra. Scan engine: McAfee VirusScan / Atualizado em 08/07/2005 / Versão: 4.4.00 - Dat 4531 Proteja o seu e-mail Terra: http://mail.terra.com.br/ -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] tel: 73-3634.2779 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http
[R] Help to make a specific matrix
Dear R users, The solution is probably simple but I need someone to point me to it. How can I to generate a matrix from a numeric sequence of 1:10 like 'A' or 'B' below: A) || | 1 2 3 4 5 | || | 1 | 0 | | 2 | 1 0 | | 3 | 2 5 0 | | 4 | 3 6 8 0| | 5 | 4 7 9 10 0 | || B) || | 1 2 3 4 5 | || | 1 | 0 1 2 3 4 | | 2 | 1 0 5 6 7 | | 3 | 2 5 0 8 9 | | 4 | 3 6 8 0 10 | | 5 | 4 7 9 10 0 | || This question is related with the possible combinations of five objects two the two, i.e, C(5,2). Any help would be greatly appreciated. Regards, -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] tel: 73-3634.2779 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Help to make a specific matrix
Gabor Grothendieck wrote: On 7/9/05, Jose Claudio Faria [EMAIL PROTECTED] wrote: Dear R users, The solution is probably simple but I need someone to point me to it. How can I to generate a matrix from a numeric sequence of 1:10 like 'A' or 'B' below: A) || | 1 2 3 4 5 | || | 1 | 0 | | 2 | 1 0 | | 3 | 2 5 0 | | 4 | 3 6 8 0| | 5 | 4 7 9 10 0 | || B) || | 1 2 3 4 5 | || | 1 | 0 1 2 3 4 | | 2 | 1 0 5 6 7 | | 3 | 2 5 0 8 9 | | 4 | 3 6 8 0 10 | | 5 | 4 7 9 10 0 | || Try this and see ?lower.tri A - matrix(0,5,5) A[lower.tri(A)] - 1:10 B - A + t(A) Dear Gabor, Thank you, very much: this soluction is nice! Best, -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] tel: 73-3634.2779 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Help with Mahalanobis
Dear R list, I'm trying to calculate Mahalanobis distances for 'Species' of 'iris' data as obtained below: Squared Distance to Species From Species: Setosa Versicolor Virginica Setosa 0 89.86419 179.38471 Versicolor 89.86419 0 17.20107 Virginica 179.38471 17.20107 0 These distances were obtained with proc 'CANDISC' of SAS, please, see Output 21.1.2: Iris Data: Squared Mahalanobis Distances from http://www.id.unizh.ch/software/unix/statmath/sas/sasdoc/stat/chap21/sect19.htm From these distances my intention is to make a cluster analysis as below, using the package 'mclust': In prior mail, my basic question was: how to obtain this matrix with R from 'iris' data? Well, I think that the basic soluction to calculate this distances is: # # --- Begin R script 1 --- # x = as.matrix(iris[,1:4]) tra = iris[,5] man = manova(x ~ tra) # Mahalanobis E= summary(man)$SS[2] #Matrix E S= as.matrix(E$Residuals)/man$df.residual InvS = solve(S) ms = matrix(unlist(by(x, tra, mean)), byrow=T, ncol=ncol(x)) colnames(ms) = names(iris[1:4]) rownames(ms) = c('Set', 'Ver', 'Vir') D2.12 = (ms[1,] - ms[2,])%*%InvS%*%(ms[1,] - ms[2,]) print(D2.12) D2.13 = (ms[1,] - ms[3,])%*%InvS%*%(ms[1,] - ms[3,]) print(D2.13) D2.23 = (ms[2,] - ms[3,])%*%InvS%*%(ms[2,] - ms[3,]) print(D2.23) # # --- End R script 1 --- # Well, I would like to generalize a soluction to obtain the matrices like 'Mah' (below) or a complete matrix like in the Output 21.1.2. Somebody could help me? # # --- Begin R script 2 --- # Mah = c(0, 89.86419,0, 179.38471, 17.20107, 0) n = 3 D = matrix(0, n, n) nam = c('Set', 'Ver', 'Vir') rownames(D) = nam colnames(D) = nam k = 0 for (i in 1:n) { for (j in 1:i) { k = k+1 D[i,j] = Mah[k] D[j,i] = Mah[k] } } D=sqrt(D) #D2 - D library(mclust) dendroS = hclust(as.dist(D), method='single') dendroC = hclust(as.dist(D), method='complete') win.graph(w = 3.5, h = 6) split.screen(c(2, 1)) screen(1) plot(dendroS, main='Single', sub='', xlab='', ylab='', col='blue') screen(2) plot(dendroC, main='Complete', sub='', xlab='', col='red') # # --- End R script 2 --- # I always need of this type of analysis and I'm not founding how to make it in the CRAN documentation (Archives, packages: mclust, cluster, fpc and mva). Regards, -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Tables: Invitation to make a collective package
, breaks, right) tmpList - c(tmpList, list(tbl)) } } valCol - logCol[logCol] names(tmpList) - names(valCol) return(tmpList) } # User defines one factor else { namesdf - names(df) pos - which(namesdf == by) stopifnot(is.factor((df[[pos]]))) numF- table(df[[pos]]) for(i in 1:length(numF)) { tmpdf - subset(df, df[[pos]] == names(numF[i])) logCol - sapply(tmpdf, is.numeric) for (j in 1:ncol(tmpdf)) { if (logCol[j]) { x- as.matrix(tmpdf[ ,j]) tbl - tb.make.table.II(x, k, breaks, right) newFY- list(tbl) nameF- names(numF[i]) nameY- names(logCol[j]) nameFY - paste(nameF,'.', nameY, sep=) names(newFY) - sub(' +$', '', nameFY) tmpList - c(tmpList, newFY) } } } } return(tmpList) } # Tables package # # to try # # 1.Tables # 1.1. Tables from vectors # Making a vector set.seed(1) x=rnorm(100, 5, 1) #x=as.factor(rep(1:10, 10)) # to try tbl - tb.table(x) print(tbl); cat('\n') # Equal to above tbl - tb.table(x, breaks='Sturges') print(tbl); cat('\n') tbl - tb.table(x, breaks='Scott') print(tbl); cat('\n') tbl - tb.table(x, breaks='FD') print(tbl); cat('\n') tbl - tb.table(x, breaks='F', right=T) print(tbl); cat('\n') tbl - tb.table(x, k=4) print(tbl); cat('\n') tbl - tb.table(x, k=20) print(tbl); cat('\n') # Partial tbl - tb.table(x, start=4, end=6) print(tbl); cat('\n') # Partial tbl - tb.table(x, start=4.5, end=5.5) print(tbl); cat('\n') # Nonsense tbl - tb.table(x, start=0, end=10, h=.5) print(tbl); cat('\n') # First and last class forced (fi=0) tbl - tb.table(x, start=1, end=9, h=1) print(tbl); cat('\n') tbl - tb.table(x, start=1, end=10, h=2) print(tbl); cat('\n') # 1.2. Tables from data.frame # 1.2.1. Making a data.frame mdf=data.frame(X1=rep(LETTERS[1:4], 25), X2=as.factor(rep(1:10, 10)), Y1=c(NA, NA, rnorm(96, 10, 1), NA, NA), Y2=rnorm(100, 58, 4), Y3=c(NA, NA, rnorm(98, -20, 2))) tbl - tb.table(mdf) print(tbl) # Equal to above tbl - tb.table(mdf, breaks='Sturges') print(tbl) tbl - tb.table(mdf, breaks='Scott') print(tbl) tbl - tb.table(mdf, breaks='FD') print(tbl) tbl - tb.table(mdf, k=4) print(tbl) tbl - tb.table(mdf, k=10) print(tbl) levels(mdf$X1) tbl=tb.table(mdf, k=5, by='X1') length(tbl) names(tbl) print(tbl) tbl=tb.table(mdf, breaks='FD', by='X1') print(tbl) # A 'big' result: X2 is a factor with 10 levels! tbl=tb.table(mdf, breaks='FD', by='X2') print(tbl) # 1.2.2. Using 'iris' tbl=tb.table(iris, k=5) print(tbl) levels(iris$Species) tbl=tb.table(iris, k=5, by='Species') length(tbl) names(tbl) print(tbl) tbl=tb.table(iris, k=5, by='Species', right=T) print(tbl) tbl=tb.table(iris, breaks='FD', by='Species') print(tbl) library(MASS) levels(Cars93$Origin) tbl=tb.table(Cars93, k=5, by='Origin') names(tbl) print(tbl) tbl=tb.table(Cars93, breaks='FD', by='Origin') print(tbl) I find that this package would be very useful and would like to hear the opinion of the interested parties in participating. Best regards, -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] tel: 73-3634.2779 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Help: Mahalanobis distances between 'Species' from iris
Dear R list, I'm trying to calculate Mahalanobis distances for 'Species' of 'iris' data as obtained below: Squared Distance to Species From Species: Setosa Versicolor Virginica Setosa 0 89.86419 179.38471 Versicolor 89.86419 0 17.20107 Virginica 179.38471 17.20107 0 This distances above were obtained with proc 'CANDISC' of SAS, please, see Output 21.1.2: Iris Data: Squared Mahalanobis Distances from http://www.id.unizh.ch/software/unix/statmath/sas/sasdoc/stat/chap21/sect19.htm From this distance my intention is to make a cluster analysis as below, using the package 'mclust': # # --- Begin R script --- # # For units compatibility of 'iris' from R dataset and 'iris' data used in # the SAS example: Measures = iris[,1:4]*10 Species = iris[,5] irisSAS = data.frame(Measures, Species) n = 3 Mah = c(0, 89.86419,0, 179.38471, 17.20107, 0) # My Question is: how to obtain 'Mah' with R from 'irisSAS' data? D = matrix(0, n, n) nam = c('Set', 'Ver', 'Vir') rownames(D) = nam colnames(D) = nam k = 0 for (i in 1:n) { for (j in 1:i) { k = k+1 D[i,j] = Mah[k] D[j,i] = Mah[k] } } D=sqrt(D) #D2 - D library(mclust) dendroS = hclust(as.dist(D), method='single') dendroC = hclust(as.dist(D), method='complete') win.graph(w = 3.5, h = 6) split.screen(c(2, 1)) screen(1) plot(dendroS, main='Single', sub='', xlab='', ylab='', col='blue') screen(2) plot(dendroC, main='Complete', sub='', xlab='', col='red') # # --- End R script --- # I always need of this type of analysis and I'm not founding how to make it in the CRAN documentation (Archives, packages: mclust, cluster, fpc and mva). Regards, -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Cluster of iris data set from Mahalanobis distances
Dear R list, My question: I'm trying to calculate Mahalanobis distances for 'Species' from the iris data set as below: cat('\n') cat('Cluster analyse of iris data\n') cat('from Mahalanobis distance obtained of SAS\n') cat('\n') n = 3 dat = c(0, 89.86419, 0, 179.38471,17.20107, 0) D = matrix(0, n, n) nam = c('Set', 'Ver', 'Vir') rownames(D) = nam colnames(D) = nam k = 0 for (i in 1:n) { for (j in 1:i) { k = k+1 D[i,j] = dat[k] D[j,i] = dat[k] } } D=sqrt(D) #D2 - D dendroS = hclust(as.dist(D), method='single') dendroC = hclust(as.dist(D), method='complete') win.graph(w = 3.5, h = 6) split.screen(c(2, 1)) screen(1) plot(dendroS, main='Single', sub='', xlab='', ylab='', col='blue') screen(2) plot(dendroC, main='Complete', sub='', xlab='', col='red') I'm not fouding how to make it in the CRAN documentation (Archives, packages: mclust, cluster, fpc and mva). Could help me? Regards, -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Re: R GUI for Linux?
Dears, I know a little about Object Pascal language and I've been working (in the last two years) with the Tinn-R development (www.sciview.org/Tinn-R). This work started adapting Tinn (a good frame, but with limitations) as an R script editor. Tinn is a small ASCII file editor primarily intended as a better replacement of the default Notepad.exe under Windows. Tinn is the recursive acronym 'Tinn is not Notepad'. Tinn-R is an extension of Tinn that provides additional tools to control R (Rgui in MDI or SDI mode, see http://cran.r-project.org, SciViews R console, see http://www.sciviews.org/SciViews-R) or S-Plus. As such, Tinn-R is a feature-rich replacement of the basic script editor provided with Rgui. It provides advanced syntax-highlighting, submission of code in whole, or line-by-line, and many other useful tools to ease writing and debugging of R code. Both Tinn and Tinn-R are distributed under the GPL 2 or above license. So, I think (but I'm very suspicious), it's a nice R script editor running only under Windows. Otherwise, I don't know nothing about C/C++ language. My question is, why we not provide (starting from an open source editor like Bluefish, Kate, or another good editor under Linux) the resources of Tinn-R. I think is not very hard (or impossible) translate the Tinn-R functions/procedures from Object Pascal to C/C++. I think (as coordinator of Tinn-R team) that we can help with this work. All we need is start it and that one person (C/C++ programmer) coordinate the works. Emacs + ESS, in my opinion, is not adequate for beginning. Best regards, -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] tel: 73-3634.2779 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Contingency tables from data.frames
The final version with the help of Gabor Grotendieck (thanks Gabor, very much!) ### # EasieR - Package # ### # Common function er.make.table - function(x, start, end, h, right) { # Absolut frequency f - table(cut(x, br=seq(start, end, h), right=right)) # Relative frequency fr - f/length(x) # Relative frequency, % frP - 100*(f/length(x)) # Cumulative frequency fac - cumsum(f) # Cumulative frequency, % facP - 100*(cumsum(f/length(x))) fi - round(f, 2) fr - round(as.numeric(fr), 2) frP - round(as.numeric(frP), 2) fac - round(as.numeric(fac), 2) facP - round(as.numeric(facP),2) # Make final table res - data.frame(fi, fr, frP, fac, facP) names(res) - c('Class limits', 'fi', 'fr', 'fr(%)', 'fac', 'fac(%)') return(res) } #With Gabor Grotendieck suggestions (thanks Gabor, very much!) er.table - function(x, ...) UseMethod(er.table) er.table.default - function(x, k, start, end, h, breaks=c('Sturges', 'Scott', 'FD'), right=FALSE) { #User define nothing or not 'x' isn't numeric - stop stopifnot(is.numeric(x)) #User define only 'x' #(x, {k, start, end, h}, [breaks, right]) if (missing(k) missing(start) missing(end) missing(h) ){ x - na.omit(x) brk - match.arg(breaks) switch(brk, Sturges = k - nclass.Sturges(x), Scott = k - nclass.scott(x), FD = k - nclass.FD(x)) tmp - range(x) start - tmp[1] - abs(tmp[2])/100 end - tmp[2] + abs(tmp[2])/100 R - end-start h - R/k } #User define 'x' and 'k' #(x, k, {start, end, h}, [breaks, right]) else if (missing(start) missing(end) missing(h)) { stopifnot(length(k) = 1) x - na.omit(x) tmp - range(x) start - tmp[1] - abs(tmp[2])/100 end - tmp[2] + abs(tmp[2])/100 R - end-start h - R/abs(k) } #User define 'x', 'start' and 'end' #(x, {k,} start, end, {h,} [breaks, right]) else if (missing(k) missing(h)) { stopifnot(length(start) = 1, length(end) =1) x - na.omit(x) tmp - range(x) R - end-start k - sqrt(abs(R)) if (k 5) k - 5 #min value of k h - R/k } #User define 'x', 'start', 'end' and 'h' #(x, {k,} start, end, h, [breaks, right]) else if (missing(k)) { stopifnot(length(start) = 1, length(end) = 1, length(h) = 1) x - na.omit(x) } else stop('Error, please, see the function sintax!') tbl - er.make.table(x, start, end, h, right) return(tbl) } er.table.data.frame - function(df, k, breaks=c('Sturges', 'Scott', 'FD'), right=FALSE) { stopifnot(is.data.frame(df)) tmpList - list() logCol - sapply(df, is.numeric) for (i in 1:ncol(df)) { if (logCol[i]) { x - as.matrix(df[ ,i]) x - na.omit(x) #User define only x and/or 'breaks' #(x, {k,}[breaks, right]) if (missing(k)) { brk - match.arg(breaks) switch(brk, Sturges = k - nclass.Sturges(x), Scott = k - nclass.scott(x), FD = k - nclass.FD(x)) tmp - range(x) start - tmp[1] - abs(tmp[2])/100 end - tmp[2] + abs(tmp[2])/100 R - end-start h - R/k } #User define 'x' and 'k' #(x, k,[breaks, right]) else { tmp - range(x) start - tmp[1] - abs(tmp[2])/100 end - tmp[2] + abs(tmp[2])/100 R - end-start h - R/abs(k) } tbl - er.make.table(x, start, end, h, right) tmpList - c(tmpList, list(tbl)) } } valCol - logCol[logCol] names(tmpList) - names(valCol) return(tmpList) } Best, -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] tel: 73-3634.2779 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Contingency tables from data.frames
Gabor Grothendieck wrote: On 5/24/05, Jose Claudio Faria [EMAIL PROTECTED] wrote: Dear list, I'm trying to do a set of generic functions do make contingency tables from data.frames. It is just running nice (I'm learning R), but I think it can be better. I would like to filter the data.frame, i.e, eliminate all not numeric variables. And I don't know how to make it: please, help me. Below one of the my functions ('er' is a mention to EasieR, because I'm trying to do a package for myself and the my students): #2. Tables from data.frames #2.1---er.table.df.br (User define breaks and right) er.table.df.br - function(df, breaks = c('Sturges', 'Scott', 'FD'), right = FALSE) { if (is.data.frame(df) != 'TRUE') stop('need data.frame data') dim_df - dim(df) tmpList - list() for (i in 1:dim_df[2]) { x - as.matrix(df[ ,i]) x - na.omit(x) k - switch(breaks[1], 'Sturges' = nclass.Sturges(x), 'Scott' = nclass.scott(x), 'FD' = nclass.FD(x), stop('breaks' must be 'Sturges', 'Scott' or 'FD')) tmp - range(x) classIni - tmp[1] - tmp[2]/100 classEnd - tmp[2] + tmp[2]/100 R- classEnd-classIni h- R/k # Absolut frequency f - table(cut(x, br = seq(classIni, classEnd, h), right = right)) # Relative frequency fr - f/length(x) # Relative frequency, % frP - 100*(f/length(x)) # Cumulative frequency fac - cumsum(f) # Cumulative frequency, % facP - 100*(cumsum(f/length(x))) fi - round(f, 2) fr - round(as.numeric(fr), 2) frP - round(as.numeric(frP), 2) fac - round(as.numeric(fac), 2) facP - round(as.numeric(facP),2) # Table res - data.frame(fi, fr, frP, fac, facP) names(res) - c('Class limits', 'fi', 'fr', 'fr(%)', 'fac', 'fac(%)') tmpList - c(tmpList, list(res)) } names(tmpList) - names(df) return(tmpList) } To try the function: #a) runing nice y1=rnorm(100, 10, 1) y2=rnorm(100, 58, 4) y3=rnorm(100, 500, 10) mydf=data.frame(y1, y2, y3) #tbdf=er.table.df.br (mydf, breaks = 'Sturges', right=F) #tbdf=er.table.df.br (mydf, breaks = 'Scott', right=F) tbdf=er.table.df.br (mydf, breaks = 'FD', right=F) print(tbdf) #b) One of the problems y1=rnorm(100, 10, 1) y2=rnorm(100, 58, 4) y3=rnorm(100, 500, 10) y4=rep(letters[1:10], 10) mydf=data.frame(y1, y2, y3, y4) tbdf=er.table.df.br (mydf, breaks = 'Scott', right=F) print(tbdf) Try this: sapply(my.data.frame, is.numeric) Also you might want to look up: ?match.arg ?stopifnot ?ncol ?sapply ?lapply Thanks Gabor, you suggestion solve my basic problem. I'm working is same basic (but I think useful) functions for begginiers. Below you can see the set of functions: ### # EasyeR - Package # ### # Common function--- er.make.table - function(x, classIni, classEnd, h, right) { # Absolut frequency f - table(cut(x, br = seq(classIni, classEnd, h), right = right)) # Relative frequency fr - f/length(x) # Relative frequency, % frP - 100*(f/length(x)) # Cumulative frequency fac - cumsum(f) # Cumulative frequency, % facP - 100*(cumsum(f/length(x))) fi - round(f, 2) fr - round(as.numeric(fr), 2) frP - round(as.numeric(frP), 2) fac - round(as.numeric(fac), 2) facP - round(as.numeric(facP),2) # Table res - data.frame(fi, fr, frP, fac, facP) names(res) - c('Class limits', 'fi', 'fr', 'fr(%)', 'fac', 'fac(%)') return(res) } #1. Tables from vectors #1.1---er.table.br (User define breaks and right)--- er.table.br - function(x, breaks = c('Sturges', 'Scott', 'FD'), right = FALSE) { if (is.factor(x) || mode(x) != 'numeric') stop('need numeric data') x - na.omit(x) k - switch(breaks[1], 'Sturges' = nclass.Sturges(x), 'Scott' = nclass.scott(x), 'FD' = nclass.FD(x), stop('breaks' must be 'Sturges', 'Scott' or 'FD')) tmp - range(x) classIni - tmp[1] - abs(tmp[2])/100 classEnd - tmp[2] + abs(tmp[2])/100 R- classEnd-classIni h- R/k tbl - er.make.table(x, classIni, classEnd, h, right) return(tbl) } #1.2---er.table.kr (User define the class number (k) and right)- er.table.kr - function(x, k, right = FALSE) { if (is.factor(x) || mode(x) != 'numeric') stop('need numeric data') if ((k == '') || (k == ' ')) stop('k not defined') x - na.omit(x) tmp - range(x) classIni - tmp[1] - abs(tmp[2])/100 classEnd - tmp[2] + abs(tmp[2])/100 R- classEnd-classIni h- R/k tbl - er.make.table(x, classIni, classEnd, h, right
[R] Contingency tables from data.frames
Dear list, I'm trying to do a set of generic functions do make contingency tables from data.frames. It is just running nice (I'm learning R), but I think it can be better. I would like to filter the data.frame, i.e, eliminate all not numeric variables. And I don't know how to make it: please, help me. Below one of the my functions ('er' is a mention to EasieR, because I'm trying to do a package for myself and the my students): #2. Tables from data.frames #2.1---er.table.df.br (User define breaks and right) er.table.df.br - function(df, breaks = c('Sturges', 'Scott', 'FD'), right = FALSE) { if (is.data.frame(df) != 'TRUE') stop('need data.frame data') dim_df - dim(df) tmpList - list() for (i in 1:dim_df[2]) { x - as.matrix(df[ ,i]) x - na.omit(x) k - switch(breaks[1], 'Sturges' = nclass.Sturges(x), 'Scott' = nclass.scott(x), 'FD' = nclass.FD(x), stop('breaks' must be 'Sturges', 'Scott' or 'FD')) tmp - range(x) classIni - tmp[1] - tmp[2]/100 classEnd - tmp[2] + tmp[2]/100 R- classEnd-classIni h- R/k # Absolut frequency f - table(cut(x, br = seq(classIni, classEnd, h), right = right)) # Relative frequency fr - f/length(x) # Relative frequency, % frP - 100*(f/length(x)) # Cumulative frequency fac - cumsum(f) # Cumulative frequency, % facP - 100*(cumsum(f/length(x))) fi - round(f, 2) fr - round(as.numeric(fr), 2) frP - round(as.numeric(frP), 2) fac - round(as.numeric(fac), 2) facP - round(as.numeric(facP),2) # Table res - data.frame(fi, fr, frP, fac, facP) names(res) - c('Class limits', 'fi', 'fr', 'fr(%)', 'fac', 'fac(%)') tmpList - c(tmpList, list(res)) } names(tmpList) - names(df) return(tmpList) } To try the function: #a) runing nice y1=rnorm(100, 10, 1) y2=rnorm(100, 58, 4) y3=rnorm(100, 500, 10) mydf=data.frame(y1, y2, y3) #tbdf=er.table.df.br (mydf, breaks = 'Sturges', right=F) #tbdf=er.table.df.br (mydf, breaks = 'Scott', right=F) tbdf=er.table.df.br (mydf, breaks = 'FD', right=F) print(tbdf) #b) One of the problems y1=rnorm(100, 10, 1) y2=rnorm(100, 58, 4) y3=rnorm(100, 500, 10) y4=rep(letters[1:10], 10) mydf=data.frame(y1, y2, y3, y4) tbdf=er.table.df.br (mydf, breaks = 'Scott', right=F) print(tbdf) Could anyone give me a hint how to work around this? PS: Excuse my bad English ;-) -- Jose Claudio Faria Brasil/Bahia/UESC/DCET Estatistica Experimental/Prof. Adjunto mails: [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html