[R] cbind(NULL,zoo.object)?
Folks, I often build up R objects starting from NULL and then repeatedly using rbind() or cbind(). This yields code like:

a <- NULL
for (...) {
  onerow <- ...craft one more row...
  a <- rbind(a, onerow)
}

This works because rbind() and cbind() are forgiving when presented with a NULL arg: they act as if nothing happened, and you get all.equal(x, rbind(NULL,x)) and all.equal(x, cbind(NULL,x)). The same idea is useful in building up a zoo matrix, by cbinding one column after another. I find this quite elegant, though I'm conscious that it leads to a lot of expensive memory management.

This approach breaks with zoo, because cbind(NULL, z) doesn't work:

> tmp <- structure(c(2.65917577210146, 1.17190441781228, 0.838363890267146,
    -0.293979039853021, -0.132437820890186, 0.262002985990861,
    2.16601367541420, 0.375245019370496, 0.0108451048014047,
    1.56555812127923),
    index = structure(c(12509, 12510, 12513, 12514, 12515, 12516,
    12521, 12524, 12528, 12529), class = "Date"), class = "zoo")
> tmp
2004-04-01 2004-04-02 2004-04-05 2004-04-06 2004-04-07 2004-04-08
2.65917577 1.17190442 0.83836389 -0.29397904 -0.13243782 0.26200299
2004-04-13 2004-04-16 2004-04-20 2004-04-21
2.16601368 0.37524502 0.01084510 1.56555812
> cbind(NULL, tmp)
     12509      12510      12513      12514      12515      12516
2.65917577 1.17190442 0.83836389 -0.29397904 -0.13243782 0.26200299
     12521      12524      12528      12529
2.16601368 0.37524502 0.01084510 1.56555812
Warning message:
In merge.zoo(..., all = all, fill = fill, suffixes = suffixes, retclass = "zoo") :
  Index vectors are of different classes: integer Date

Is there an easy workaround; or am I deeply confused; is there a way out? :-)

-- Ajay Shah http://www.mayin.org/ajayshah ajays...@mayin.org http://ajayshahblog.blogspot.com *(:-? - wizard who doesn't know the answer. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
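The base-R behaviour the poster relies on, as a runnable sketch (the row-building step is a stand-in for whatever the loop actually computes):

```r
# rbind()/cbind() silently drop a NULL argument, so an object can be
# grown from NULL one row at a time.
a <- NULL
for (i in 1:3) {
  onerow <- c(x = i, y = i^2)   # stand-in for "craft one more row"
  a <- rbind(a, onerow)
}
stopifnot(isTRUE(all.equal(a, rbind(NULL, a))))
```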
[R] Please help me to understand environments
I want to write:

zap <- function(v) {
  if (exists(v)) rm(v)
}

But of course this doesn't work, because the rm() acts on v, which is within zap(), and doesn't affect the parent environment:

> x <- 1
> zap(x)
> x
[1] 1

How would I make this zap() function work?

-- Ajay Shah http://www.mayin.org/ajayshah ajays...@mayin.org http://ajayshahblog.blogspot.com *(:-? - wizard who doesn't know the answer.
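One common way to make this work (a sketch, not the only approach) is to capture the argument's name and point exists()/rm() at the caller's environment:

```r
zap <- function(v) {
  name <- deparse(substitute(v))   # name of the argument, e.g. "x"
  env  <- parent.frame()           # the caller's environment
  if (exists(name, envir = env, inherits = FALSE)) {
    rm(list = name, envir = env)
  }
}

x <- 1
zap(x)
exists("x", inherits = FALSE)   # FALSE when run at top level
```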
Re: [R] SEM model testing with identical goodness of fits (2)
Dear John, Thanks for the prompt reply! Sorry, I did not supply more detailed information. The target model consists of three latent factors: a general risk scale from Weber's domain risk scales, a time perspective scale from Zimbardo (future time oriented only), and a travel risk attitude scale. Variables with the prob_ prefix are items of the general risk scale, variables o1 to o12 are items of future time perspective, and v5 to v13 are items of the travel risk scale. The purpose is to explore or find a best-fitting model that correctly represents the underlying relationship among the three scales. So far, the correlated model has the best fit indices, so I'd like to check whether there is a higher-level factor that governs all three factors, hence the second model. The data are all 5-point Likert scale scores by respondents (N=397). The example listed below does not show the prob_ variables (their names are too long). Given the following model structure, if they are indeed observationally indistinguishable, are there possible adjustments to test the higher-level factor effects?
Thanks,

###
# data example, partial
#
   id o1 o2 o3 o4 o5 o6 o7 o8 o9 o10 o11 o12 o13 v5 v13 v14 v16 v17
14602  2  2  4  4  5  5  2  3  2   4   3   4   2  5   2   2   4   2
14601  2  4  5  4  5  5  2  5  3   4   5   4   5  5   3   4   4   2
14606  1  3  5  5  5  5  3  3  5   3   5   5   5  5   5   5   5   3
14610  2  1  4  5  4  5  3  4  4   2   4   2   1  5   3   5   5   5
14609  4  3  2  2  5  5  2  5  2   4   4   2   2  4   2   4   4   4

# correlated model, three scales correlated with each other
model.correlated <- specify.model()
weber <-> tp, e.webertp, NA
tp <-> tr, e.tptr, NA
tr <-> weber, e.trweber, NA
weber <-> weber, NA, 1
tp <-> tp, e.tp, NA
tr <-> tr, e.trv, NA
weber -> prob_wild_camp, alpha2, NA
weber -> prob_book_hotel_in_short_time, alpha3, NA
weber -> prob_safari_Kenia, alpha4, NA
weber -> prob_sail_wild_water, alpha5, NA
weber -> prob_dangerous_sport, alpha7, NA
weber -> prob_bungee_jumping, alpha8, NA
weber -> prob_tornado_tracking, alpha9, NA
weber -> prob_ski, alpha10, NA
prob_wild_camp <-> prob_wild_camp, ep2, NA
prob_book_hotel_in_short_time <-> prob_book_hotel_in_short_time, ep3, NA
prob_safari_Kenia <-> prob_safari_Kenia, ep4, NA
prob_sail_wild_water <-> prob_sail_wild_water, ep5, NA
prob_dangerous_sport <-> prob_dangerous_sport, ep7, NA
prob_bungee_jumping <-> prob_bungee_jumping, ep8, NA
prob_tornado_tracking <-> prob_tornado_tracking, ep9, NA
prob_ski <-> prob_ski, ep10, NA
tp -> o1, NA, 1
tp -> o3, beta3, NA
tp -> o4, beta4, NA
tp -> o5, beta5, NA
tp -> o6, beta6, NA
tp -> o7, beta7, NA
tp -> o9, beta9, NA
tp -> o10, beta10, NA
tp -> o11, beta11, NA
tp -> o12, beta12, NA
o1 <-> o1, eo1, NA
o3 <-> o3, eo3, NA
o4 <-> o4, eo4, NA
o5 <-> o5, eo5, NA
o6 <-> o6, eo6, NA
o7 <-> o7, eo7, NA
o9 <-> o9, eo9, NA
o10 <-> o10, eo10, NA
o11 <-> o11, eo11, NA
o12 <-> o12, eo12, NA
tr -> v5, NA, 1
tr -> v13, gamma2, NA
tr -> v14, gamma3, NA
tr -> v16, gamma4, NA
tr -> v17, gamma5, NA

sem.correlated <- sem(model.correlated, cov(riskninfo_s), 397)
summary(sem.correlated)
samelist <- c('weber', 'tp', 'tr')
minlist <- c(names(rk), names(tp))
maxlist <- NULL
path.diagram(sem2, out.file = "e:/sem2.dot", same.rank = samelist,
             min.rank = minlist, max.rank = maxlist,
             edge.labels = "values", rank.direction = 'LR')

##
# high level latent scale, a high level factor exists
##
model.rsk <- specify.model()
rsk -> tp, e.rsktp, NA
rsk -> tr, e.rsktr, NA
rsk -> weber, e.rskweber, NA
rsk <-> rsk, NA, 1
weber <-> weber, e.weber, NA
tp <-> tp, e.tp, NA
tr <-> tr, e.trv, NA
weber -> prob_wild_camp, NA, 1
weber -> prob_book_hotel_in_short_time, alpha3, NA
weber -> prob_safari_Kenia, alpha4, NA
weber -> prob_sail_wild_water, alpha5, NA
weber -> prob_dangerous_sport, alpha7, NA
weber -> prob_bungee_jumping, alpha8, NA
weber -> prob_tornado_tracking, alpha9, NA
weber -> prob_ski, alpha10, NA
prob_wild_camp <-> prob_wild_camp, ep2, NA
prob_book_hotel_in_short_time <-> prob_book_hotel_in_short_time, ep3, NA
prob_safari_Kenia <-> prob_safari_Kenia, ep4, NA
prob_sail_wild_water <-> prob_sail_wild_water, ep5, NA
prob_dangerous_sport <-> prob_dangerous_sport, ep7, NA
prob_bungee_jumping <-> prob_bungee_jumping, ep8, NA
prob_tornado_tracking <-> prob_tornado_tracking, ep9, NA
prob_ski ->
Re: [R] read.xls question
On Sat, 14 Mar 2009, Mathew, Abraham T wrote:

> I'm an R newbie and had a question about the read.xls function. I've
> heard that this is often not a reliable function to use for importing
> data. However, I have created numerous xls files which contain
> information about voter turnout and macroeconomic indicators in India.
> I'm writing a paper on the relationship between economic growth and
> voter turnout. This is the command I use:
>
> data <- read.xls("India.xls", header=TRUE)
>
> I get the following error message:
>
> Error: could not find function "read.xls"
>
> Anyone have ideas? Thanks, Abraham

Since you are a beginner, it is possible you missed a couple of steps. read.xls is part of the package xlsReadWrite, so you need to first install that package, which is only possible if you are using Windows. Then you need to load the package with the command

library(xlsReadWrite)

Just checking on CRAN: xlsReadWrite is not currently available. There is an archived version available, however.

David Scott

_________________
David Scott  Department of Statistics
The University of Auckland, PB 92019
Auckland 1142, NEW ZEALAND
Phone: +64 9 373 7599 ext 85055  Fax: +64 9 373 7018
Email: d.sc...@auckland.ac.nz
Graduate Officer, Department of Statistics
Director of Consulting, Department of Statistics
Re: [R] discrepancy between aov F test and tukey contrasts results with mixed effects model
lba...@montana.edu wrote:

> Hello, I have some conflicting output from an aov summary and Tukey
> contrasts with a mixed-effects model that I was hoping someone could
> clarify. I am comparing the abundance of a species across three willow
> stand types. Since I have 2 or 3 sites within a habitat, I have
> included site as a random effect in the lme model. My confusion is
> that the F test given by aov(model) indicates there is no difference
> between habitats, but the Tukey contrasts using the multcomp package
> show that one pair of habitats is significantly different from each
> other. Why is there a discrepancy? Have I specified my model
> correctly? I included the code and output below. Thank you.

Looks like glht() is ignoring degrees of freedom. So what it does is wrong, but it is not easy to do it right (whatever "right" is in these cases). If I understand correctly, what you have is that stand is strictly coarser than site, presumably with the stands representing 2, 2, and 3 sites respectively, with a varying number of replications within each site. Since the between-site variation is considered random, you end up with a comparison of stands based on essentially only 7 pieces of information. (The latter statement requires some qualification, but let's not go there today.) If you have roughly equal replication within each site, I'd be strongly tempted to reduce the analysis to a simple one-way ANOVA of the site averages.
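That reduction can be sketched as follows, assuming roughly equal replication and the thread's column names (coye, stand, site); `dat` stands in for the poster's data frame:

```r
# Collapse to one mean per site, then run a one-way ANOVA of stand
# on the site means (7 sites, 3 stands).
site.means <- aggregate(coye ~ site + stand, data = dat, FUN = mean)
fit <- aov(coye ~ stand, data = site.means)
summary(fit)
```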
> co.lme <- lme(coye ~ stand, data = t, random = ~1 | site)
> summary(co.lme)
Linear mixed-effects model fit by REML
 Data: R
       AIC      BIC    logLik
  53.76606 64.56047 -21.88303

Random effects:
 Formula: ~1 | site
        (Intercept)  Residual
StdDev:   0.3122146 0.2944667

Fixed effects: coye ~ stand
                 Value Std.Error DF    t-value p-value
(Intercept)  0.4936837 0.2305072 60  2.1417277  0.0363
stand2       0.4853222 0.3003745  4  1.6157240  0.1815
stand3      -0.3159230 0.3251201  4 -0.9717117  0.3862
 Correlation:
       (Intr) stand2
stand2 -0.767
stand3 -0.709  0.544

Standardized Within-Group Residuals:
       Min         Q1        Med         Q3        Max
-2.4545673 -0.5495609 -0.3148274  0.7527378  2.5151476

Number of Observations: 67
Number of Groups: 7

> anova(co.lme)
            numDF denDF   F-value p-value
(Intercept)     1    60 23.552098  <.0001
stand           2     4  3.738199  0.1215

> summary(glht(co.lme, linfct = mcp(stand = "Tukey")))
 Simultaneous Tests for General Linear Hypotheses
Multiple Comparisons of Means: Tukey Contrasts
Fit: lme.formula(fixed = coye ~ stand, data = R, random = ~1 | site)
Linear Hypotheses:
           Estimate Std. Error z value Pr(>|z|)
2 - 1 == 0   0.4853     0.3004   1.616   0.2385
3 - 1 == 0  -0.3159     0.3251  -0.972   0.5943
3 - 2 == 0  -0.8012     0.2994  -2.676   0.0202 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)

Lisa Baril
Masters Candidate
Department of Ecology
Montana State University - Bozeman
406.994.2670

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalga...@biostat.ku.dk)              FAX: (+45) 35327907
[R] Stuck on building a package
Folks, I have a personal package which used to build fine. Today when I tried to build the package again, some errors popped up. Could you please help? When I paste the offending function into an R session it works correctly, but the package build breaks.

$ R CMD check ansecon
* checking for working pdflatex ... OK
* using log directory '/Users/ajayshah/L/R/packages/ansecon.Rcheck'
* using R version 2.8.1 (2008-12-22)
* using session charset: ASCII
* checking for file 'ansecon/DESCRIPTION' ... OK
* this is package 'ansecon' version '0.1'
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking for executable files ... OK
* checking whether package 'ansecon' can be installed ... ERROR
Installation failed.
See '/Users/ajayshah/L/R/packages/ansecon.Rcheck/00install.out' for details.
make: *** [check] Error 1

And:

$ more /Users/ajayshah/L/R/packages/ansecon.Rcheck/00install.out
* Installing *source* package 'ansecon' ...
** R
** preparing package for lazy loading
Loading required package: zoo

Attaching package: 'zoo'

The following object(s) are masked from package:base :
     as.Date.numeric

Error in parse(n = -1, file = file) : unexpected end of input at
63: ft=winsorised.left, winsorised.right=winsorised.right)
64: }
Calls: <Anonymous> -> code2LazyLoadDB -> sys.source -> parse
Execution halted
ERROR: lazy loading failed for package 'ansecon'
** Removing '/Users/ajayshah/L/R/packages/ansecon.Rcheck/ansecon'

-- Ajay Shah http://www.mayin.org/ajayshah ajays...@mayin.org http://ajayshahblog.blogspot.com *(:-? - wizard who doesn't know the answer.
Re: [R] full screen graph, and close x11
Thanks, I will look at that. I'm on both Windows and Linux.

2009/3/12 Eik Vettorazzi e.vettora...@uke.uni-hamburg.de

> Hi, see ?dev.off. But I think (guessing you use a Windows system) you
> would be better off using win.metafile() instead of x11() when
> initiating the graph; see ?win.metafile. There is no need to show all
> 100+ graphs on your display if you actually want them in files. hth.
>
> BaKaLeGuM schrieb:
>> Hi everybody! I'm looking for a function or option to make a full
>> screen plot. Can you help me? I would also like to know if it's
>> possible to automatically close an x11 window, because I have more
>> than 100 graphs to generate, and what I want is:
>> 1 - make a full screen graph
>> 2 - save this plot in emf format
>> 3 - close the plot (goto 1)
>> thanks for your advice
>
> --
> Eik Vettorazzi
> Institut für Medizinische Biometrie und Epidemiologie
> Universitätsklinikum Hamburg-Eppendorf
> Martinistr. 52
> 20246 Hamburg
> T ++49/40/42803-8243
> F ++49/40/42803-7790
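Eik's suggestion as a minimal sketch (Windows only, since win.metafile() is a Windows device; the filename pattern is made up for illustration):

```r
# Write each graph straight to an .emf file instead of an x11() window,
# so nothing needs to be closed by hand.
for (i in 1:100) {
  win.metafile(sprintf("graph%03d.emf", i))
  plot(rnorm(100), main = paste("graph", i))  # draw the i-th graph here
  dev.off()                                   # closes and saves the file
}
```

On Linux, a png() or pdf() device can be substituted for win.metafile().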
Re: [R] cbind(NULL,zoo.object)?
I've just committed a fix to the zoo svn repository. You can grab it like this:

source("http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/*checkout*/pkg/R/merge.zoo.R?rev=575&root=zoo")

On Sun, Mar 15, 2009 at 2:54 AM, Ajay Shah ajays...@mayin.org wrote:
> Folks, I often build up R objects starting from NULL and then
> repeatedly using rbind() or cbind(). [...] This approach breaks with
> zoo, because cbind(NULL, z) doesn't work. [...] Is there an easy
> workaround; or am I deeply confused; is there a way out? :-)
>
> -- Ajay Shah http://www.mayin.org/ajayshah
Re: [R] read.xls question
To make the function available you have to load the library first; to do so:

library(xlsReadWrite)

Then you can use your read.xls command. I'm supposing that you have already installed the xlsReadWrite package. A better alternative to this package is RODBC, which also reads Excel 2007 files. Take a look at it on the CRAN web site. Hope it helps. Best wishes.

__________
Rodrigo Aluizio

Em 15/03/2009, às 00:27, Mathew, Abraham T amat...@ku.edu escreveu:
> I'm an R newbie and had a question about the read.xls function. [...]
> This is the command I use:
>
> data <- read.xls("India.xls", header=TRUE)
>
> I get the following error message:
>
> Error: could not find function "read.xls"
>
> Anyone have ideas? Thanks, Abraham
Re: [R] read.xls question
There are two read.xls functions: one is in the gdata package and one is in xlsReadWrite, and neither is unreliable. You've simply not loaded the package. There is more info on interfacing to Excel here:

http://wiki.r-project.org/rwiki/doku.php?id=tips:data-io:ms_windows

You should read the Introduction manual and see ?library for loading a package.

On Sat, Mar 14, 2009 at 11:27 PM, Mathew, Abraham T amat...@ku.edu wrote:
> I'm an R newbie and had a question about the read.xls function. [...]
>
> data <- read.xls("India.xls", header=TRUE)
>
> I get the following error message:
>
> Error: could not find function "read.xls"
>
> Anyone have ideas? Thanks, Abraham
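Putting the two steps together, a sketch using the gdata flavour of read.xls (package choice per this reply; the file name is the original poster's; gdata's read.xls requires Perl):

```r
# Install once, then load the package in every session before calling
# read.xls().
install.packages("gdata")   # or "xlsReadWrite" on Windows
library(gdata)
voters <- read.xls("India.xls", header = TRUE)
```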
Re: [R] Please help me to understand environments
You might want to look at rm to see how that function handles the problem.

-- David Winsemius

On Mar 15, 2009, at 3:14 AM, Ajay Shah wrote:
> I want to write:
>
> zap <- function(v) {
>   if (exists(v)) rm(v)
> }
>
> But of course this doesn't work, because the rm() acts on v, which is
> within zap(), and doesn't affect the parent environment:
>
> > x <- 1
> > zap(x)
> > x
> [1] 1
>
> How would I make this zap() function work?
>
> -- Ajay Shah

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
[R] Help_Unable to run glmer
Can anyone help with this? I have been running glmer code for linear mixed models on my work PC, and am now working on my laptop (which runs Vista). When I load the lme4 package I get the message below and cannot run any models. Does anyone have any ideas? Emma

trying URL 'http://cran.uk.r-project.org/bin/windows/contrib/2.8/Matrix_0.999375-21.zip'
Error in download.file(url, destfile, method, mode = "wb", ...) :
  cannot open URL 'http://cran.uk.r-project.org/bin/windows/contrib/2.8/Matrix_0.999375-21.zip'
In addition: Warning message:
In download.file(url, destfile, method, mode = "wb", ...) :
  cannot open: HTTP status was '404 Not Found'
Warning in download.packages(p0, destdir = tmpd, available = available, :
  download of package 'Matrix' failed
trying URL 'http://cran.uk.r-project.org/bin/windows/contrib/2.8/lme4_0.999375-28.zip'
Content type 'application/zip' length 978644 bytes (955 Kb)
opened URL
downloaded 955 Kb
Re: [R] Help_Unable to run glmer
It's a 404 message. Try a repository that does have it. The CMU repository has a copy, but the one you picked does not seem to have one at the moment. Maybe there was a run on the British bit and the bit banks are a wee short?

http://lib.stat.cmu.edu/R/CRAN/bin/windows/contrib/2.8/

-- David Winsemius

On Mar 15, 2009, at 6:45 AM, Emma Stone wrote:
> trying URL 'http://cran.uk.r-project.org/bin/windows/contrib/2.8/Matrix_0.999375-21.zip'
> cannot open: HTTP status was '404 Not Found'

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
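Following that advice is a one-liner; a sketch (the CMU mirror URL is the one from the message above):

```r
# Install Matrix from a mirror that still carries the 2.8 binary;
# lme4 should then load with its dependency satisfied.
install.packages("Matrix", repos = "http://lib.stat.cmu.edu/R/CRAN")
library(lme4)
```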
Re: [R] persp plot + plotting grid lines
This wiki page has the answer. Draw the plot first to establish the coordinate system and create the viewing transformation matrix. Draw the grid lines with lines(trans3d()), then put a par(new=TRUE), and then redraw the plot over the gridlines.

http://wiki.r-project.org/rwiki/doku.php?id=tips:graphics-base:add_grid

x <- seq(-10, 10, length = 30)
y <- x
f <- function(x, y) { r <- sqrt(x^2 + y^2); 10 * sin(r)/r }
z <- outer(x, y, f)
z[is.na(z)] <- 1
op <- par(bg = "white")
res <- persp(x, y, z, theta = 30, phi = 30, expand = 0.5, col = "lightblue",
             ltheta = 120, shade = 0.75, ticktype = "detailed",
             xlab = "X", ylab = "Y", zlab = "Sinc( r )")
for (ix in seq(-10, 10, by = 5))
  lines(trans3d(x = ix, y = seq(-10, 10, by = 5), z = -10, pmat = res),
        col = "red", lty = "dotted")
for (iy in seq(-10, 10, by = 5))
  lines(trans3d(x = seq(-10, 10, by = 5), y = iy, z = -10, pmat = res),
        col = "red", lty = "dotted")
for (ix in seq(-10, 10, by = 5))
  lines(trans3d(x = ix, y = 10, z = seq(-10, 10, by = 5), pmat = res),
        col = "red", lty = "dotted")
for (iz in seq(-10, 10, by = 5))
  lines(trans3d(x = seq(-10, 10, by = 5), y = 10, z = iz, pmat = res),
        col = "red", lty = "dotted")
for (iy in seq(-10, 10, by = 5))
  lines(trans3d(x = -10, y = iy, z = seq(-10, 10, by = 5), pmat = res),
        col = "red", lty = "dotted")
for (iz in seq(-10, 10, by = 5))
  lines(trans3d(x = -10, y = seq(-10, 10, by = 5), z = iz, pmat = res),
        col = "red", lty = "dotted")
par(new = TRUE)
persp(x, y, z, theta = 30, phi = 30, expand = 0.5, col = "lightblue",
      ltheta = 120, shade = 0.75, ticktype = "detailed",
      xlab = "X", ylab = "Y", zlab = "Sinc( r )")
par(op)

-- David Winsemius, MD
Heritage Laboratories
West Hartford, CT

On Mar 14, 2009, at 12:02 PM, Pedro Mardones wrote:
> Dear all; Does anyone know how to add grid lines to a persp plot? I've
> tried using lines(trans3d..), but the lines of course are superimposed
> on the actual 3d surface, and what I need is something like the plot
> shown in the following link:
> http://thermal.gg.utah.edu/tutorials/matlab/matlab_tutorial.html
> I'll appreciate any ideas. Thanks, PM
Re: [R] Stuck on building a package
On 15/03/2009 4:46 AM, Ajay Shah wrote:
> Folks, I have a personal package which used to build fine. Today when
> I tried to build the package again, some errors popped up. Could you
> please help?

Generally questions about building packages are better asked on R-devel.

> When I paste the offending function into an R session it works
> correctly. But the package build breaks.
>
> $ R CMD check ansecon
> [...]
> * checking whether package 'ansecon' can be installed ... ERROR
> Installation failed.
> See '/Users/ajayshah/L/R/packages/ansecon.Rcheck/00install.out' for details.
> make: *** [check] Error 1
>
> [...]
> Error in parse(n = -1, file = file) : unexpected end of input at
> 63: ft=winsorised.left, winsorised.right=winsorised.right)
> 64: }

That's a syntax error in one of your files. Usually an unexpected end of input means you opened more braces than you closed, but there are other ways to get the same error: it just means the statement being parsed was incomplete when the parser ran out of input. The line numbers 63 and 64 are probably not helpful; the package check process manipulates the code a bit before it passes it to the parser. (I think R-devel will give better error messages in this regard.)
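The "unexpected end of input" failure can be reproduced in a plain R session; a minimal sketch:

```r
# An unclosed brace leaves the parser waiting for more input, which is
# the same condition the package installer hits.
res <- try(parse(text = "f <- function(x) { x + 1"), silent = TRUE)
# res is a try-error whose message reports "unexpected end of input"
```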
Duncan Murdoch

> Calls: <Anonymous> -> code2LazyLoadDB -> sys.source -> parse
> Execution halted
> ERROR: lazy loading failed for package 'ansecon'
> ** Removing '/Users/ajayshah/L/R/packages/ansecon.Rcheck/ansecon'
Re: [R] What is the best package for large data cleaning (not statistical analysis)?
Dear Jim: Thanks for your reply. It looks to me as though you were using batching; I used batching to digest large data in Matlab before. I still wonder about the answers to the two specific questions without resorting to batching. Thanks. -Sean

On Sat, Mar 14, 2009 at 10:13 PM, jim holtman jholt...@gmail.com wrote:
> Exactly what type of cleaning do you want to do on them? Can you read
> in the data a block at a time (e.g., 1M records), clean them up and
> then write them back out? You would have the choice of putting them
> back as a text file or possibly storing them using 'filehash'. I have
> used that technique to segment a year's worth of data that was
> probably 3GB of text into monthly objects that were about 70MB data
> frames that I stored using filehash. These I then read back in to do
> processing where I could summarize by month. So it all depends on what
> you want to do. You could read in the chunks, clean them and then
> reshape them into data frames that you could process later. You will
> still probably have the problem that all the data won't fit in memory.
> One thing I did: since the data frames were stored as binary objects
> in filehash, it was pretty fast to retrieve them, pick out the data I
> needed from each month, and create a subset of just the data I needed
> that would then fit in memory. So it all depends ...
>
> On Sat, Mar 14, 2009 at 8:46 PM, Sean Zhang seane...@gmail.com wrote:
>> Dear R helpers: I am a newbie to R and have a question related to
>> cleaning large data frames in R. So far, I have been using SAS for
>> data cleaning because my data sets are relatively large (handling
>> multiple files, each could be as large as 5-10 GB). I am not a fan of
>> SAS at all and am eager to move data cleaning tasks into R
>> completely. It seems to me there are 3 options: using SQL, ff, or
>> filehash. I do not want to learn SQL, so my question is more related
>> to ff and filehash. Specifically,
>> (1) for merging two large data frames, which one is better, ff or
>> filehash?
>> (2) for reshaping a large data frame (say from long to wide or the
>> opposite), which one is better, ff or filehash?
>> If you can provide examples, that will be even better. Many thanks in
>> advance. -Sean
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
> What is the problem that you are trying to solve?
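A sketch of the chunk-and-store approach Jim describes, using the filehash API (the database name, key, and cleaning step are made up for illustration):

```r
library(filehash)

# Create (once) and open a disk-backed key-value database.
dbCreate("clean.db")
db <- dbInit("clean.db")

# Clean one manageable chunk at a time and store it under a key.
chunk <- data.frame(id = 1:3, value = c(2.5, NA, 4.0))
chunk$value[is.na(chunk$value)] <- 0   # example cleaning step
dbInsert(db, "chunk01", chunk)

# Later, fetch only the chunks needed, so memory use stays bounded.
part <- dbFetch(db, "chunk01")
```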
[R] Tukey, planned contrasts or t-test without ANOVA? What is correct?
Dear R community, I compare mean monthly body temperature between two age classes of turtles overwintering underground.

lm(body_tem ~ Month*Year*Age_Class)
TukeyHSD(aov(body_tem ~ Month*Year*Age_Class, a))

The Tukey HSD as well as the planned contrasts method showed significant differences between the two age classes, but no significant differences between the two age classes at the same levels of Month. In contrast, a t-test for comparison of independent means (i.e., without using the ANOVA) showed no significant differences between the two age classes. Which result is correct? Thanks, Julia
Re: [R] SEM model testing with identical goodness of fits (2)
Dear hyena,

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of hyena
Sent: March-15-09 4:25 AM
To: r-h...@stat.math.ethz.ch
Subject: Re: [R] SEM model testing with identical goodness of fits (2)

> Dear John, Thanks for the prompt reply! Sorry, I did not supply more
> detailed information. The target model consists of three latent
> factors [...] So far, the correlated model has the best fit indices,
> so I'd like to check whether there is a higher-level factor that
> governs all three factors, hence the second model.

Both models are very odd. In the first, each of tr, weber, and tp has direct effects on a different subset of the endogenous variables. The implicit claim of these models is that, e.g., the prob_* variables are conditionally independent of tr and tp given weber, and that the correlations among the prob_* variables are entirely accounted for by their dependence on weber. The structural coefficients are just the simple regressions of each prob_* on weber. The second model is the same, except that the variances and covariances among weber, tr, and tp are parametrized differently. I'm not sure why you set the models up in this manner, or why your research requires a structural-equation model. I would have expected each of the prob_*, v*, and o* variables to comprise indicators of a latent variable (risk-taking, etc.). The models that you specified seem so strange that I think you'd do well to find competent local help to sort out what you're doing in relation to the goals of the research.
Of course, maybe I'm just having a failure of imagination. The data are all 5-point Likert scale scores by respondents (N=397). It's problematic to treat ordinal variables as if they were metric (and to fit SEMs of this complexity to a small sample). The example listed below does not show the prob_ variables (their names are too long). Given the following model structure, if they are indeed observationally indistinguishable, are there some possible adjustments to test the higher-level factor effects? No. Because the models necessarily fit the same, you'd have to decide between them on grounds of plausibility. Moreover, both models fit very badly. Regards, John Thanks,

### data example, partial
id o1 o2 o3 o4 o5 o6 o7 o8 o9 o10 o11 o12 o13 v5 v13 v14 v16 v17
14602 2 2 4 4 5 5 2 3 2 4 3 4 2 5 2 2 4 2
14601 2 4 5 4 5 5 2 5 3 4 5 4 5 5 3 4 4 2
14606 1 3 5 5 5 5 3 3 5 3 5 5 5 5 5 5 5 3
14610 2 1 4 5 4 5 3 4 4 2 4 2 1 5 3 5 5 5
14609 4 3 2 2 5 5 2 5 2 4 4 2 2 4 2 4 4 4

# correlated model, three scales correlated with each other
model.correlated <- specify.model()
weber <-> tp, e.webertp, NA
tp <-> tr, e.tptr, NA
tr <-> weber, e.trweber, NA
weber <-> weber, NA, 1
tp <-> tp, e.tp, NA
tr <-> tr, e.trv, NA
weber -> prob_wild_camp, alpha2, NA
weber -> prob_book_hotel_in_short_time, alpha3, NA
weber -> prob_safari_Kenia, alpha4, NA
weber -> prob_sail_wild_water, alpha5, NA
weber -> prob_dangerous_sport, alpha7, NA
weber -> prob_bungee_jumping, alpha8, NA
weber -> prob_tornado_tracking, alpha9, NA
weber -> prob_ski, alpha10, NA
prob_wild_camp <-> prob_wild_camp, ep2, NA
prob_book_hotel_in_short_time <-> prob_book_hotel_in_short_time, ep3, NA
prob_safari_Kenia <-> prob_safari_Kenia, ep4, NA
prob_sail_wild_water <-> prob_sail_wild_water, ep5, NA
prob_dangerous_sport <-> prob_dangerous_sport, ep7, NA
prob_bungee_jumping <-> prob_bungee_jumping, ep8, NA
prob_tornado_tracking <-> prob_tornado_tracking, ep9, NA
prob_ski <-> prob_ski, ep10, NA
tp -> o1, NA, 1
tp -> o3, beta3, NA
tp -> o4, beta4, NA
tp -> o5, beta5, NA
tp -> o6, beta6, NA
tp -> o7, beta7, NA
tp -> o9, beta9, NA
tp -> o10, beta10, NA
tp -> o11, beta11, NA
tp -> o12, beta12, NA
o1 <-> o1, eo1, NA
o3 <-> o3, eo3, NA
o4 <-> o4, eo4, NA
o5 <-> o5, eo5, NA
o6 <-> o6, eo6, NA
o7 <-> o7, eo7, NA
o9 <-> o9, eo9, NA
o10 <-> o10, eo10, NA
o11 <-> o11, eo11, NA
o12 <-> o12, eo12, NA
tr -> v5, NA, 1
tr -> v13, gamma2, NA
tr -> v14, gamma3, NA
tr -> v16, gamma4, NA
tr -> v17, gamma5, NA
Re: [R] [OT] two question about color space.
I've put together a rough R port of that C code [*] in a package on r-forge: http://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/spectral/?root=photonics http://r-forge.r-project.org/R/?group_id=160 (package spectral, will be built overnight). There is a lot of optimization and cleaning to do: it seems much slower than it should be, and the documentation is empty at the moment, but it reproduces the result for the black body radiation.

library(spectral)
library(grid)
data(wavelength)
data(cie_colour_match)
data(colourSystem)

test <- function(bbTemp=5000){
  mxyz <- spectrum_to_xyz(specfun=bb_spectrum, bbTemp=bbTemp)
  rgb <- xyz_to_rgb(colourSystem[3, ], mxyz)
  norm_rgb(constrain_rgb(rgb))
}

temperatures <- seq(1000, 10000, by=500)
# temperatures <- seq(1000, 10000, length=300)
res <- abs(sapply(temperatures, test))
grid.newpage()
start <- seq(0.1, 0.9, length=dim(res)[2])
end <- start + mean(diff(start))
grid.segments(x0=start, x1=end, y0=0*start+0.5, y1=0*temperatures+0.5,
              gp=gpar(col=rgb(res[1, ], res[2, ], res[3, ], alpha=1),
                      lwd=10, lineend=3))

Best wishes, baptiste [*] from http://www.fourmilab.ch/documents/specrend/ On 14 Mar 2009, at 12:33, baptiste auguie wrote: Hi, For a good discussion of the link between colour and spectra I would suggest http://www.fourmilab.ch/documents/specrend/ which provides open-source C code to perform the conversion you ask for. I asked for some advice on how to wrap an R function around this code last week but sadly I didn't get anywhere. Do let me know if you succeed. (Alternatively, one could port the implementation in pure R, as the code is not too complicated or computationally demanding.) Hope this helps, baptiste On 14 Mar 2009, at 12:09, Jinsong Zhao wrote: Hi there, I am trying to plot the visible light spectrum (380nm~780nm) with color corresponding to the specific wavelength. However, I can't find a function that could do this.
Another question: is it possible to plot a color space chromaticity diagram like this: http://upload.wikimedia.org/wikipedia/commons/thumb/0/02/CIExy1931.svg/300px-CIExy1931.svg.png Thanks in advance! Jinsong _ Baptiste Auguié School of Physics University of Exeter Stocker Road, Exeter, Devon, EX4 4QL, UK Phone: +44 1392 264187 http://newton.ex.ac.uk/research/emag
Re: [R] primitives again
Edna Bell wrote: How do I find the functions which are primitives, please? you can scan the whole search path for functions that are primitives:

primitives = sapply(search(), function(path)
    with(as.environment(path),
         Filter(is.primitive, sapply(ls(), get))))

primitives is a list of named lists of primitives, one sublist for each attached package (most sublists will be empty, i guess). vQ
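Since nearly all primitives live in the base package, a narrower check against `baseenv()` alone gives much the same answer; a minimal sketch (the variable name `prims` is mine, not from the thread):

```r
# Collect every primitive function visible in the base environment.
prims <- Filter(is.primitive, mget(ls(baseenv()), envir = baseenv()))
length(prims)             # base R ships a couple of hundred primitives
head(sort(names(prims)))
```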
Re: [R] builtin vs. closure
On 15/03/2009 12:00 AM, Edna Bell wrote: Dear R Gurus: I'm working slowly through R Programming for Bioinformatics, which is really interesting! Anyway, my question now is: what determines if a function is a builtin vs. a closure, please? Closure is the normal type of function written in R code. The other special types are built in to R; users can't create them except by modifying the R source code. For details see the R Internals manual. Duncan Murdoch For instance: typeof(sqrt) [1] "builtin" typeof(mean) [1] "closure" Thanks, Edna Bell
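Duncan's distinction can be checked directly: `typeof()` reports "builtin" or "special" for the two kinds of primitive (depending on whether arguments are evaluated before the call) and "closure" for ordinary functions written in R. A small sketch:

```r
typeof(sqrt)        # "builtin" -- primitive, arguments evaluated
typeof(`if`)        # "special" -- primitive, arguments not pre-evaluated
typeof(mean)        # "closure" -- ordinary function written in R
is.primitive(sqrt)  # TRUE
is.primitive(mean)  # FALSE
```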
[R] Help plase with simple loop
Dear All, I'm a new R user. How can I write the code below more efficiently (using a loop, for example)?

f1 <- function(par){x1 <- par[1]; x2 <- par[2]; x3 <- par[3]; x4 <- par[4]; (1+x1)}
f2 <- function(par){x1 <- par[1]; x2 <- par[2]; x3 <- par[3]; x4 <- par[4]; (1+x2)}
f3 <- function(par){x1 <- par[1]; x2 <- par[2]; x3 <- par[3]; x4 <- par[4]; (1+x3)}
f4 <- function(par){x1 <- par[1]; x2 <- par[2]; x3 <- par[3]; x4 <- par[4]; (1+x4)}

Each function takes the same four-element parameter vector, independently of whether a specific variable appears explicitly in the function body or not. Many thanks for your help! Axel.
Re: [R] What is the best package for large data cleaning (not statistical analysis)?
Hi Sean, you should think about storing the data externally in an SQL database. This makes you very flexible, and you can do a lot of the manipulation directly in the db. With the help of stored procedures, for example in a PostgreSQL db, you can use almost any preferred language to manipulate the data before loading it into R. There is also a procedural language based on R with which you can do a lot of things already inside PostgreSQL databases. And keep in mind: learning SQL isn't more difficult than R. best, josuah On 15.03.2009 at 13:13, Sean Zhang wrote: Dear Jim: Thanks for your reply. It looks to me as if you were using batching. I used batching to digest large data in Matlab before. I still wonder about the answers to the two specific questions without resorting to batching. Thanks. -Sean On Sat, Mar 14, 2009 at 10:13 PM, jim holtman jholt...@gmail.com wrote: Exactly what type of cleaning do you want to do on them? Can you read in the data a block at a time (e.g., 1M records), clean them up and then write them back out? You would have the choice of putting them back as a text file or possibly storing them using 'filehash'. I have used that technique to segment a year's worth of data that was probably 3GB of text into monthly objects that were about 70MB dataframes that I stored using filehash. These I then read back in to do processing where I could summarize by month. So it all depends on what you want to do. You could read in the chunks, clean them and then reshape them into dataframes that you could process later. You will still probably have the problem that all the data still won't fit in memory. Now one thing I did was that since the dataframes were stored as binary objects in filehash, it was pretty fast to retrieve them, pick out the data I needed from each month and create a subset of just the data I needed that would now fit in memory. So it all depends ...
On Sat, Mar 14, 2009 at 8:46 PM, Sean Zhang seane...@gmail.com wrote: Dear R helpers: I am a newbie to R and have a question related to cleaning large data frames in R. So far, I have been using SAS for data cleaning because my data sets are relatively large (handling multiple files, each could be as large as 5-10 GB). I am not a fan of SAS at all and am eager to move data-cleaning tasks into R completely. It seems to me there are 3 options: using SQL, ff, or filehash. I do not want to learn SQL, so my question is more related to ff and filehash. Specifically: (1) for merging two large data frames, which one is better, ff vs. filehash? (2) for reshaping a large data frame (say from long to wide or the opposite), which one is better, ff vs. filehash? If you can provide examples, that will be even better. Many thanks in advance. -Sean -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
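As a concrete illustration of the database route suggested above, here is a minimal sketch using the DBI and RSQLite packages (assumed installed; the table and column names are made up). One cleaning step runs in SQL before the result ever reaches R:

```r
library(DBI)

# An in-memory SQLite database; use a file path instead for persistent storage.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "raw", data.frame(id = 1:5, x = c(1, NA, 3, NA, 5)))

# Drop incomplete rows on the database side, then pull only the clean subset.
clean <- dbGetQuery(con, "SELECT id, x FROM raw WHERE x IS NOT NULL")
nrow(clean)   # 3
dbDisconnect(con)
```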
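Jim's block-at-a-time approach can also be sketched in base R alone. This toy version (file handling and chunk size are my own illustration) writes a small messy file, then cleans it chunk by chunk without ever holding the whole table in memory:

```r
# Hypothetical demo: create a small input file with one incomplete row.
tmp_in  <- tempfile(fileext = ".csv")
tmp_out <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:10, x = c(1:9, NA)), tmp_in, row.names = FALSE)

con <- file(tmp_in, open = "r")
header <- readLines(con, n = 1)
first <- TRUE
repeat {
  chunk <- readLines(con, n = 4)          # tiny chunk size for the demo
  if (length(chunk) == 0) break
  df <- read.csv(text = c(header, chunk))
  df <- df[complete.cases(df), ]          # the "cleaning" step
  write.table(df, tmp_out, sep = ",", row.names = FALSE,
              col.names = first, append = !first)
  first <- FALSE
}
close(con)
nrow(read.csv(tmp_out))   # 9: the row with the NA was dropped
```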
Re: [R] Help plase with simple loop
I don't think you have gotten the basic idea of functions in R. Those functions would return only the value of the last evaluated expression, e.g. (1+par[1]) in the case of f1, but most of the code appears to have no effect. See this simple example:

x1 <- 4
f1 <- function(x) x1 <- x
f1(1)
x1
[1] 4

# nothing happens to the x1 in the enclosing environment; the one inside the function is gone. The x1 inside the function f1 is local to the function and has no persistence after the evaluation is completed.

f4 <- function(par) {1+par[4]} # would have the same effect as your f4
f4(1:4)
[1] 5

You should tell us what you actually want to do. You generally call functions with a particular set of arguments for which you have provided no examples, and then expect those functions to return some sort of object which you have also not defined. -- David Winsemius On Mar 15, 2009, at 9:36 AM, Axel Urbiz wrote: Dear All, I'm a new R user. How can I write the code below more efficiently (using a loop, for example)? f1 <- function(par){x1 <- par[1]; x2 <- par[2]; x3 <- par[3]; x4 <- par[4]; (1+x1)} f2 <- function(par){x1 <- par[1]; x2 <- par[2]; x3 <- par[3]; x4 <- par[4]; (1+x2)} f3 <- function(par){x1 <- par[1]; x2 <- par[2]; x3 <- par[3]; x4 <- par[4]; (1+x3)} f4 <- function(par){x1 <- par[1]; x2 <- par[2]; x3 <- par[3]; x4 <- par[4]; (1+x4)} Each function takes the same four-element parameter vector, independently of whether a specific variable appears explicitly in the function body or not. Many thanks for your help! Axel.
David Winsemius, MD Heritage Laboratories West Hartford, CT
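To answer the original question directly: the four functions differ only in the index used, so they can be generated in a loop. A minimal sketch (the list name `fs` is mine, not from the thread):

```r
# Build f1..f4 programmatically; force(i) pins each closure to its own i.
fs <- lapply(1:4, function(i) {
  force(i)
  function(par) 1 + par[i]
})
fs[[2]](c(10, 20, 30, 40))   # 21, same as the original f2
```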
Re: [R] builtin vs. closure
Thank you On Sun, Mar 15, 2009 at 8:36 AM, Duncan Murdoch murd...@stats.uwo.ca wrote: On 15/03/2009 12:00 AM, Edna Bell wrote: Dear R Gurus: I'm working slowly through R Programming for Bioinformatics, which is really interesting! Anyway, my question now is: what determines if a function is a builtin vs. a closure, please? Closure is the normal type of function written in R code. The other special types are built in to R; users can't create them except by modifying the R source code. For details see the R Internals manual. Duncan Murdoch For instance: typeof(sqrt) [1] "builtin" typeof(mean) [1] "closure" Thanks, Edna Bell -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com
Re: [R] primitives again
Thank you On Sun, Mar 15, 2009 at 8:23 AM, Wacek Kusnierczyk waclaw.marcin.kusnierc...@idi.ntnu.no wrote: Edna Bell wrote: How do I find the functions which are primitives, please? you can scan the whole search path for functions that are primitives: primitives = sapply(search(), function(path) with(as.environment(path), Filter(is.primitive, sapply(ls(), get)))) primitives is a list of named lists of primitives, one sublist for each attached package (most sublists will be empty, i guess). vQ -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com
Re: [R] persp plot + plotting grid lines
On 15/03/2009 10:07 AM, Kingsford Jones wrote: Building on Duncan's code, here's an approximation to the Matlab 'peaks' plot referred to by Pedro:

peaks <- function(x, y) {
  3 * (1-x)^2 * exp(-(x^2)-(y+1)^2) - 10 * (x/5-x^3-y^5) * exp(-x^2-y^2) -
  1/3*exp(-(x+1)^2-y^2)}
x <- y <- seq(-3, 3, .1)
z <- outer(x, y, peaks)
z2 <- 10 * round(c(z) + abs(min(z)) + 1)
jet.colors = colorRampPalette(c("#00007F", "blue", "#007FFF", "cyan",
                                "#7FFF7F", "yellow", "#FF7F00", "red", "#7F0000"))
color <- jet.colors(160)[z2]
library(rgl)
persp3d(x, y, z, color=color, smooth=FALSE)
surface3d(x, y, z+0.001, front="lines", back="culled")

Using the textured grid as in my second option looks a bit better here. The other big difference between rgl output and the Matlab output is that Matlab's looks flat, whereas rgl does some fairly complicated lighting calculations. But if you don't want the shiny plastic look, you can partially turn it off by including the following: persp3d(x, y, z, color=color, smooth=FALSE, specular="black") Or turn it off completely with persp3d(x, y, z, color=color, smooth=FALSE, lit=FALSE) Kingsford Jones On Sat, Mar 14, 2009 at 3:51 PM, Duncan Murdoch murd...@stats.uwo.ca wrote: On 14/03/2009 12:02 PM, Pedro Mardones wrote: Dear all; Does anyone know how to add grid lines to a persp plot? I've tried using lines(trans3d..) but the lines of course are superimposed onto the actual 3d surface, and what I need is something like the plot shown in the following link: http://thermal.gg.utah.edu/tutorials/matlab/matlab_tutorial.html I'll appreciate any ideas I just posted a couple of demos of this using the rgl function persp3d, in the thread Re: [R] can I draw 3D plot like this using R?. Duncan Murdoch
Re: [R] SEM model testing with identical goodness of fits (2)
Dear John, Thanks for the reply. Maybe I had used the wrong terminology; as you pointed out, variables prob*, o* and v* are in fact indicators of three latent variables (scales): weber, tp, and tr respectively. So variables prob*, o* and v* are exogenous variables; e.g., variable prob_dangerous_sport holds the answers to the question how likely do you think you will engage in a dangerous sport? (1 - very unlikely to 5 - very likely). Variables weber, tr, tp are latent variables representing risk attitudes in different domains (recreation, planned behaviour, travel choice). Hope this makes sense of the models. Exploratory analysis had shown consistency (Cronbach's alpha) in each scale (latent variable tr, tp, weber), and significant correlations among the three scales. The two models mentioned in previous posts are efforts to find out if there is a more general factor that can account for the correlations and make the three scales its subscales. In this sense, SEM is used more as a CFA (sem is the only package I know that can do so; I did not search very hard, of course). And indeed the model fit is quite bad. regards, John Fox wrote: Dear hyena, -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of hyena Sent: March-15-09 4:25 AM To: r-h...@stat.math.ethz.ch Subject: Re: [R] SEM model testing with identical goodness of fits (2) Dear John, Thanks for the prompt reply! Sorry I did not supply more detailed information. The target model consists of three latent factors: a general risk scale from Weber's domain risk scales, a time perspective scale from Zimbardo (future time oriented only), and a travel risk attitude scale. Variables with the prob_ prefix are items of the general risk scale, variables o1 to o12 are items of future time perspective, and v5 to v13 are items of the travel risk scale. The purpose is to explore or find a best-fit model that correctly represents the underlying relationship of the three scales.
Re: [R] SEM model testing with identical goodness of fits (2)
Dear Hyena, Your model is of three correlated factors accounting for the observed variables. Those three correlations may be accounted for equally well by correlations (loadings) of the lower-order factors with a general factor. Those two models are indeed equivalent models and will, as a consequence, have exactly equal fits and dfs. Call the three correlations rab, rac, rbc. Then a higher-order factor model will have loadings of fa, fb and fc, where fa*fb = rab, fa*fc = rac, and fb*fc = rbc. You can solve for fa, fb and fc in terms of the factor inter-correlations. You cannot compare the one to the other, for they are equivalent models. You can examine how much of the underlying variance of the original items is due to the general factor by considering a bi-factor solution, where the general factor loads on each of the observed variables and a set of residual group factors account for the covariances within your three domains. This can be done in an Exploratory Factor Analysis (EFA) context using the omega function in the psych package. It is possible to then take that model and test it using John Fox's sem package to evaluate the size of each of the general and group factor loadings. (A discussion of how to do that is at http://www.personality-project.org/r/book/psych_for_sem.pdf ). Bill At 4:25 PM +0800 3/15/09, hyena wrote: Dear John, Thanks for the prompt reply! Sorry I did not supply more detailed information. The target model consists of three latent factors: a general risk scale from Weber's domain risk scales, a time perspective scale from Zimbardo (future time oriented only), and a travel risk attitude scale. Variables with the prob_ prefix are items of the general risk scale, variables o1 to o12 are items of future time perspective, and v5 to v13 are items of the travel risk scale. The purpose is to explore or find a best-fit model that correctly represents the underlying relationship of the three scales.
So far, the correlated model has the best fit indices, so I'd like to check if there is a higher-level factor that governs all three factors, thus the second model. The data are all 5-point Likert scale scores by respondents (N=397). The example listed below did not show the prob_ variables (their names are too long). Given the following model structure, if they are indeed observationally indistinguishable, are there some possible adjustments to test the higher-level factor effects? Thanks, [...data example and correlated-model specification as quoted earlier, continuing:]

v5 <-> v5, ev1, NA
v13 <-> v13, ev2, NA
v14 <-> v14, ev3, NA
v16 <-> v16, ev4, NA
v17 <-> v17, ev5, NA
sem.correlated <- sem(model.correlated, cov(riskninfo_s), 397)
summary(sem.correlated)
samelist =
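Bill's algebra can be checked numerically; a small sketch with made-up correlation values (mine, not from the data):

```r
# Solve fa*fb = rab, fa*fc = rac, fb*fc = rbc for the higher-order loadings.
rab <- 0.42; rac <- 0.30; rbc <- 0.35   # hypothetical factor intercorrelations
fa <- sqrt(rab * rac / rbc)
fb <- sqrt(rab * rbc / rac)
fc <- sqrt(rac * rbc / rab)
c(fa * fb, fa * fc, fb * fc)            # recovers rab, rac, rbc
```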
[R] Contour plots of four two-dimensional matrices
I have four large two-dimensional matrices of which I want to create contour plots. Something like filled.contour(matrix) contourplot(matrix) works but only gives me one plot at a time. If I combine the four matrices into one three-dimensional array, which I'll name seven, there should be a way of doing something like this contourplot(seven[,,k] for k in 1 to 4) such that they come out as one plot rather than four. I couldn't figure out how to do this, so I tried a disgusting alternative that involved generating x, y and k vectors, but I'd rather do it properly. Tom
[R] xyplot of a specific ID number
Dear R list members, I have a question regarding xyplot. I managed to make an xyplot of all the IDs by using the syntax below: xyplot(PA ~ CRPC + CRPT | ID, data = redinteract) Now, I'd like to make a graph of a specific ID number (e.g., only ID number 301). I thought I could use subset, but it seems not to be working. Could anyone let me know how I can make a graph of a specific ID? Thank you for your help in advance!
Re: [R] Contour plots of four two-dimensional matrices
What is it that you want to do with these 4 plots? Overlay them with different color contours, or plot them side by side on the same page? ?par # for filled.contour, but the implementation will be different for those two options. contourplot is a lattice plotting function; see Figure 6.10 in Sarkar's Lattice book. levelplot is the closest analog to filled.contour in lattice. -- David Winsemius On Mar 15, 2009, at 12:22 PM, Thomas Levine wrote: I have four large two-dimensional matrices of which I want to create contour plots. Something like filled.contour(matrix) contourplot(matrix) works but only gives me one plot at a time. If I combine the four matrices into one three-dimensional array, which I'll name seven, there should be a way of doing something like this contourplot(seven[,,k] for k in 1 to 4) such that they come out as one plot rather than four. I couldn't figure out how to do this, so I tried a disgusting alternative that involved generating x, y and k vectors, but I'd rather do it properly. Tom David Winsemius, MD Heritage Laboratories West Hartford, CT
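The lattice route David mentions can be sketched concretely: stack the four matrices into a 3-D array, flatten it into a data frame, and condition on the slice index (the array `arr` and data frame `d` below are my own stand-ins for the poster's "seven"):

```r
library(lattice)

# Stand-in for the poster's four 2-D matrices, stacked into one 3-D array.
arr <- array(rnorm(20 * 20 * 4), dim = c(20, 20, 4))

# Flatten to long form: one row per cell, k indexes the source matrix.
d <- data.frame(z = as.vector(arr),
                x = rep(1:20, times = 20 * 4),
                y = rep(rep(1:20, each = 20), times = 4),
                k = factor(rep(1:4, each = 20 * 20)))

# One trellis object with four conditioned panels on a single page.
p <- levelplot(z ~ x * y | k, data = d)
```

`print(p)` then draws all four panels together; `contourplot(z ~ x * y | k, data = d)` works the same way.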
Re: [R] xyplot of a specific ID number
It would have helped if you had shown the complete output from your attempts at using subset. This should work (on the assumption that ID is of type numeric): xyplot(PA ~ CRPC + CRPT, data = subset(redinteract, ID == 301)) On Mar 15, 2009, at 12:35 PM, Sachi Ito wrote: Dear R list members, I have a question regarding xyplot. I managed to make an xyplot of all the IDs by using the syntax below: xyplot(PA ~ CRPC + CRPT | ID, data = redinteract) Now, I'd like to make a graph of a specific ID number (e.g., only ID number 301). I thought I could use subset, but it seems not to be working. Could anyone let me know how I can make a graph of a specific ID? Thank you for your help in advance! David Winsemius, MD Heritage Laboratories West Hartford, CT
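For completeness, lattice functions also accept a `subset` argument directly, which avoids building a temporary data frame. A sketch with made-up data (the column names follow the question; the values are mine):

```r
library(lattice)

# Toy stand-in for the poster's redinteract data frame.
redinteract <- data.frame(ID = rep(c(301, 302), each = 5),
                          PA = rnorm(10), CRPC = rnorm(10), CRPT = rnorm(10))

# Equivalent to subsetting the data frame first, but done inside xyplot.
p <- xyplot(PA ~ CRPC + CRPT, data = redinteract, subset = ID == 301)
```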
Re: [R] SEM model testing with identical goodness of fits (2)
Dear Hyena, OK -- I see that what you're trying to do is simply to fit a confirmatory factor-analysis model. The two models that you're considering aren't really different -- they are, as I said, observationally equivalent, and fit the data poorly. You can *assume* a common higher-level factor and estimate the three loadings on it for the lower-level factors, but you can't test this model against the first model. I'm not sure what you gain from the CFA beyond what you learned from an exploratory factor analysis. Using the same data first in an EFA and then for a CFA essentially invalidates the CFA, which is no longer confirmatory. One would, then, expect a CFA following an EFA to fit the data well, since the CFA was presumably specified to do so, but I suspect that a closer examination of the EFA will show that the items don't divide so neatly into the three sets. Regards, John -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of hyena Sent: March-15-09 12:00 PM To: r-h...@stat.math.ethz.ch Subject: Re: [R] SEM model testing with identical goodness of fits (2) Dear John, Thanks for the reply. Maybe I had used the wrong terminology; as you pointed out, variables prob*, o* and v* are in fact indicators of three latent variables (scales): weber, tp, and tr respectively. So variables prob*, o* and v* are exogenous variables; e.g., variable prob_dangerous_sport holds the answers to the question how likely do you think you will engage in a dangerous sport? (1 - very unlikely to 5 - very likely). Variables weber, tr, tp are latent variables representing risk attitudes in different domains (recreation, planned behaviour, travel choice). Hope this makes sense of the models. Exploratory analysis had shown consistency (Cronbach's alpha) in each scale (latent variable tr, tp, weber), and significant correlations among these three scales.
The two models mentioned in previous posts are attempts to find out whether there is a more general factor that can account for the correlations and make the three scales its sub-scales. In this sense, SEM is being used more as a CFA (sem is the only package I know of that does this; I did not search very hard, of course). And indeed the model fit is quite bad. Regards,

John Fox wrote: Dear hyena,

-----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of hyena Sent: March-15-09 4:25 AM To: r-h...@stat.math.ethz.ch Subject: Re: [R] SEM model testing with identical goodness of fits (2)

Dear John, Thanks for the prompt reply! Sorry I did not supply more detailed information. The target model consists of three latent factors: a general risk scale from Weber's domain risk scales, a time perspective scale from Zimbardo (future-time-oriented only), and a travel risk attitude scale. Variables with the prob_ prefix are items of the general risk scale, variables o1 to o12 are items of future time perspective, and v5 to v13 are items of the travel risk scale. The purpose is to explore or find a best-fitting model that correctly represents the underlying relationship of the three scales. So far, the correlated model has the best fit indices, so I'd like to check whether there is a higher-level factor that governs all three factors; thus the second model.

Both models are very odd. In the first, each of tr, weber, and tp has direct effects on different subsets of the endogenous variables. The implicit claim of these models is that, e.g., the prob_* variables are conditionally independent of tr and tp given weber, and that the correlations among the prob_* variables are entirely accounted for by their dependence on weber. The structural coefficients are just the simple regressions of each prob_* on weber. The second model is the same except that the variances and covariances among weber, tr, and tp are parametrized differently.
I'm not sure why you set the models up in this manner, and why your research requires a structural-equation model. I would have expected each of the prob_*, v*, and o* variables to be indicators of a latent variable (risk-taking, etc.). The models that you specified seem so strange that I think you'd do well to find competent local help to sort out what you're doing in relation to the goals of the research. Of course, maybe I'm just having a failure of imagination.

The data are all 5-point Likert scale scores from respondents (N = 397).

It's problematic to treat ordinal variables as if they were metric (and to fit SEMs of this complexity to a small sample).

The example listed below does not show the prob_ variables (their names are too long). Given the following model structure, if the models are indeed observationally indistinguishable, are there possible adjustments to test the higher-level factor effects?

No. Because
Re: [R] xyplot of a specific ID number
Using the built-in data frame CO2, try this, where Type is a 2-level factor:

library(lattice)
xyplot(uptake ~ conc | Type, CO2)
xyplot(uptake ~ conc | Type, CO2)[1]
xyplot(uptake ~ conc | Type, CO2)[2]

On Sun, Mar 15, 2009 at 12:35 PM, Sachi Ito s.ito@gmail.com wrote: Dear R list members, I have a question regarding xyplot. I managed to make an xyplot of all the IDs by using the syntax below: xyplot(PA ~ CRPC + CRPT | ID, data = redinteract) Now, I'd like to make a graph of a specific ID number (e.g., only ID number 301). I thought I could use subset, but it seems not to be working. Could anyone let me know how I can make a graph of a specific ID? Thank you for your help in advance!
Re: [R] discrepancy between aov F test and Tukey contrasts results with mixed effects model
Thanks, Peter, for the advice and quick response. I just want to clarify what you suggest: should I average values within a site and then do a one-way ANOVA to test for differences between sites based on the 2 to 3 new samples per stand type -- and not use random effects for site? Or, because I've reduced the data, have I 'corrected' the problem with the glht multiple comparisons, so that I can use the p-values from that summary if I include site as a random effect? Thanks again for your advice.

lba...@montana.edu wrote: Hello, I have some conflicting output from an aov summary and Tukey contrasts with a mixed-effects model that I was hoping someone could clarify. I am comparing the abundance of a species across three willow stand types. Since I have 2 or 3 sites within a habitat, I have included site as a random effect in the lme model. My confusion is that the F test given by aov(model) indicates there is no difference between habitats, but the Tukey contrasts using the multcomp package show that one pair of habitats is significantly different. Why is there a discrepancy? Have I specified my model correctly? I included the code and output below. Thank you.

It looks like glht() is ignoring degrees of freedom. So what it does is wrong, but it is not easy to do it right (whatever "right" is in these cases). If I understand correctly, what you have is that stand is strictly coarser than site, presumably with the stands representing 2, 2, and 3 sites respectively, with a varying number of replications within each site. Since the between-site variation is considered random, you end up with a comparison of stands based on essentially only 7 pieces of information. (The latter statement requires some qualification, but let's not go there today.) If you have roughly equal replication within each site, I'd be strongly tempted to reduce the analysis to a simple one-way ANOVA of the site averages.
co.lme <- lme(coye ~ stand, data = t, random = ~1 | site)
summary(co.lme)

Linear mixed-effects model fit by REML
 Data: R
      AIC      BIC    logLik
 53.76606 64.56047 -21.88303

Random effects:
 Formula: ~1 | site
        (Intercept)  Residual
StdDev:   0.3122146 0.2944667

Fixed effects: coye ~ stand
                 Value Std.Error DF   t-value p-value
(Intercept)  0.4936837 0.2305072 60  2.1417277  0.0363
stand2       0.4853222 0.3003745  4  1.6157240  0.1815
stand3      -0.3159230 0.3251201  4 -0.9717117  0.3862
 Correlation:
       (Intr) stand2
stand2 -0.767
stand3 -0.709  0.544

Standardized Within-Group Residuals:
       Min         Q1        Med         Q3        Max
-2.4545673 -0.5495609 -0.3148274  0.7527378  2.5151476

Number of Observations: 67
Number of Groups: 7

anova(co.lme)
            numDF denDF   F-value p-value
(Intercept)     1    60 23.552098  <.0001
stand           2     4  3.738199  0.1215

summary(glht(co.lme, linfct = mcp(stand = "Tukey")))

	 Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts

Fit: lme.formula(fixed = coye ~ stand, data = R, random = ~1 | site)

Linear Hypotheses:
           Estimate Std. Error z value Pr(>|z|)
2 - 1 == 0   0.4853     0.3004   1.616   0.2385
3 - 1 == 0  -0.3159     0.3251  -0.972   0.5943
3 - 2 == 0  -0.8012     0.2994  -2.676   0.0202 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)

Lisa Baril, Masters Candidate, Department of Ecology, Montana State University - Bozeman, 406.994.2670

-- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph.
K (*) \(*) -- University of Copenhagen, Denmark. Ph: (+45) 35327918 ~~ (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907

Lisa Baril, Masters Candidate, Department of Ecology, Montana State University - Bozeman, 406.994.2670
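Peter's suggestion of a one-way ANOVA on the site averages could be sketched roughly as follows; this is not from the thread, and the `aggregate` step and data-frame name `d` are my assumptions about the poster's data (`coye`, `site`, `stand`):

```r
# Sketch: collapse to one average per site, then a one-way ANOVA of
# the 7 site means on stand type. No random effect is needed, since
# the between-site variation now *is* the error term.
site.means <- aggregate(coye ~ site + stand, data = d, FUN = mean)
fit <- aov(coye ~ stand, data = site.means)
summary(fit)  # F test on 2 and 4 df, matching the 7 sites in the lme output
```

With unequal replication per site this discards some information, which is why Peter hedges with "roughly equal replication within each site".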
Re: [R] SEM model testing with identical goodness of fits (2)
Thanks for the clear explanation. The suggested bi-factor solution sounds attractive; I am going to check it in detail. Regards,

William Revelle wrote: Dear Hyena, Your model is of three correlated factors accounting for the observed variables. Those three correlations may be accounted for equally well by correlations (loadings) of the lower-order factors with a general factor. Those two models are indeed equivalent models and will, as a consequence, have exactly equal fits and dfs. Call the three correlations rab, rac, rbc. Then a higher-order factor model will have loadings fa, fb and fc, where fa*fb = rab, fa*fc = rac, and fb*fc = rbc. You can solve for fa, fb and fc in terms of the factor intercorrelations. You cannot compare the one to the other, for they are equivalent models. You can examine how much of the underlying variance of the original items is due to the general factor by considering a bi-factor solution, where the general factor loads on each of the observed variables and a set of residual group factors accounts for the covariances within your three domains. This can be done in an Exploratory Factor Analysis (EFA) context using the omega function in the psych package. It is then possible to take that model and test it using John Fox's sem package to evaluate the size of each of the general and group factor loadings. (A discussion of how to do that is at http://www.personality-project.org/r/book/psych_for_sem.pdf ). Bill

At 4:25 PM +0800 3/15/09, hyena wrote: Dear John, Thanks for the prompt reply! Sorry I did not supply more detailed information. The target model consists of three latent factors: a general risk scale from Weber's domain risk scales, a time perspective scale from Zimbardo (future-time-oriented only), and a travel risk attitude scale. Variables with the prob_ prefix are items of the general risk scale, variables o1 to o12 are items of future time perspective, and v5 to v13 are items of the travel risk scale.
The purpose is to explore or find a best-fitting model that correctly represents the underlying relationship of the three scales. So far, the correlated model has the best fit indices, so I'd like to check whether there is a higher-level factor that governs all three factors; thus the second model. The data are all 5-point Likert scale scores from respondents (N = 397). The example listed below does not show the prob_ variables (their names are too long). Given the following model structure, if they are indeed observationally indistinguishable, are there possible adjustments to test the higher-level factor effects? Thanks,

###
# data example, partial
id    o1 o2 o3 o4 o5 o6 o7 o8 o9 o10 o11 o12 o13 v5 v13 v14 v16 v17
14602  2  2  4  4  5  5  2  3  2  4   3   4   2   5  2   2   4   2
14601  2  4  5  4  5  5  2  5  3  4   5   4   5   5  3   4   4   2
14606  1  3  5  5  5  5  3  3  5  3   5   5   5   5  5   5   5   3
14610  2  1  4  5  4  5  3  4  4  2   4   2   1   5  3   5   5   5
14609  4  3  2  2  5  5  2  5  2  4   4   2   2   4  2   4   4   4

# correlated model: three scales correlated with each other
model.correlated <- specify.model()
weber <-> tp, e.webertp, NA
tp <-> tr, e.tptr, NA
tr <-> weber, e.trweber, NA
weber <-> weber, NA, 1
tp <-> tp, e.tp, NA
tr <-> tr, e.trv, NA
weber -> prob_wild_camp, alpha2, NA
weber -> prob_book_hotel_in_short_time, alpha3, NA
weber -> prob_safari_Kenia, alpha4, NA
weber -> prob_sail_wild_water, alpha5, NA
weber -> prob_dangerous_sport, alpha7, NA
weber -> prob_bungee_jumping, alpha8, NA
weber -> prob_tornado_tracking, alpha9, NA
weber -> prob_ski, alpha10, NA
prob_wild_camp <-> prob_wild_camp, ep2, NA
prob_book_hotel_in_short_time <-> prob_book_hotel_in_short_time, ep3, NA
prob_safari_Kenia <-> prob_safari_Kenia, ep4, NA
prob_sail_wild_water <-> prob_sail_wild_water, ep5, NA
prob_dangerous_sport <-> prob_dangerous_sport, ep7, NA
prob_bungee_jumping <-> prob_bungee_jumping, ep8, NA
prob_tornado_tracking <-> prob_tornado_tracking, ep9, NA
prob_ski <-> prob_ski, ep10, NA
tp -> o1, NA, 1
tp -> o3, beta3, NA
tp -> o4, beta4, NA
tp -> o5, beta5, NA
tp -> o6, beta6, NA
tp -> o7, beta7, NA
tp -> o9, beta9, NA
tp -> o10, beta10, NA
tp -> o11, beta11, NA
tp -> o12, beta12, NA
o1 <-> o1, eo1, NA
o3 <-> o3, eo3, NA
o4 <-> o4, eo4, NA
o5 <-> o5, eo5, NA
o6 <-> o6, eo6, NA
o7 <-> o7, eo7, NA
o9 <-> o9, eo9, NA
o10 <-> o10, eo10, NA
o11 <-> o11, eo11, NA
o12 <-> o12, eo12, NA
tr -> v5, NA, 1
tr -> v13, gamma2, NA
tr -> v14, gamma3, NA
tr -> v16, gamma4, NA
tr -> v17, gamma5, NA
v5 <-> v5, ev1, NA
v13 <-> v13, ev2, NA
v14 <-> v14, ev3, NA
v16 <-> v16, ev4, NA
v17 <-> v17, ev5, NA

sem.correlated <- sem(model.correlated, cov(riskninfo_s), 397)
summary(sem.correlated)
samelist = c('weber', 'tp', 'tr')
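William Revelle's loading identities earlier in the thread (fa*fb = rab, fa*fc = rac, fb*fc = rbc) can be checked with a few lines of R. This is an illustrative sketch with made-up correlations, not part of the original posts:

```r
# Hypothetical inter-factor correlations (made-up values for illustration)
rab <- 0.6; rac <- 0.48; rbc <- 0.4

# Solve fa*fb = rab, fa*fc = rac, fb*fc = rbc for the general-factor loadings
fa <- sqrt(rab * rac / rbc)
fb <- sqrt(rab * rbc / rac)
fc <- sqrt(rac * rbc / rab)

# The three products reproduce the three correlations exactly, which is why
# the higher-order model cannot be tested against the correlated-factors
# model: with three lower-order factors it is just a reparametrization.
stopifnot(all.equal(c(fa * fb, fa * fc, fb * fc), c(rab, rac, rbc)))
```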
[R] Yet another large file question.
I am using the XML package to form a rather large XML file. It seems to go OK until the file length gets larger than about 30 MB; then it seems that the last tokens of the XML file are unmatched. It is almost as if the file hasn't been flushed, because the file is OK with the exception of the last lines. I was wondering two things: 1) Has anyone else run into similar limitations with saveXML in the XML package, and is there a work-around? 2) Am I better off using more primitive write methods to get the data to a file? If so, any suggestions? Thank you. Kevin
Re: [R] Contour plots of four two-dimensional matrices
I want to plot them side by side.

On Sun, Mar 15, 2009 at 12:41 PM, David Winsemius dwinsem...@comcast.net wrote: What is it that you want to do with these 4 plots? Overlay them with different color contours, or plot them side by side on the same page? ?par # for filled.contour, but the implementation will be different for those two options. contourplot is a lattice plotting function; see Figure 6.10 in Sarkar's Lattice book. levelplot is the closest analog to filled.contour in lattice. -- David Winsemius

On Mar 15, 2009, at 12:22 PM, Thomas Levine wrote: I have four large two-dimensional matrices of which I want to create contour plots. Something like filled.contour(matrix) or contourplot(matrix) works but only gives me one plot at a time. If I combine the four matrices into one three-dimensional array, which I'll name seven, there should be a way of doing something like contourplot(seven[,,k]) for k in 1 to 4 such that they come out as one plot rather than four. I couldn't figure out how to do this, so I tried a disgusting alternative that involved generating x, y and k vectors, but I'd rather do it properly. Tom

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
[R] Axes crossing at origin
Hi, I'm having a bit of trouble with the axes in my plots. I don't like the way R does not have them cross at the origin. Is there another plot/axis function? I tried using abline as suggested by someone from this list, but in my case this gives no satisfactory result, as the abline sometimes lies on top of the y axis and sometimes not, depending on how I scale the image...

# bild16 umkehrfunktion
source("normpdf.r")
png("umkehrfunktion.png", width=18, height=18, units="cm", res=600, pointsize=16)
newdata=data.frame(x=seq(0, 0.6*max(fit$df$x), length=200))
newy=predict(fit, newdata)
plot.new()
plot.window(xlim=c(0, 0.6*max(fit$df$x)), ylim=c(15.7, 17.5))
lines(newdata$x, newy)
axis(1)
axis(2)
abline(v=0, h=0)
mtext("FITC-insulin [mol]", 1, 2.4)
mtext("log intensity [log(LAU)]", 2, 2.4)
myydata=c(16.35, 16.5, 16.65, 16.95, 17.1, 17.25)
source("predict_amount.r")
myxdata=predict_amount(fit, myydata, uselog=TRUE)
for(i in 1:length(myydata)){
  lines(c(myxdata[i], myxdata[i], max(newdata)), c(min(newy), myydata[i], myydata[i]), lty=3)
}
y1=seq(16.95, 17.25, 0.003)
y2=seq(16.35, 16.65, 0.003)
x1=predict_amount(fit, y1, TRUE)
x2=predict_amount(fit, y2, TRUE)
norm1=normpdf(y1, 17.1, 0.045)
norm1=min(newy)+norm1/max(norm1)*.5
norm2=normpdf(y2, 16.5, 0.045)
norm2=min(newy)+norm2/max(norm2)*.5
lines(x1, norm1, col="#FF")
lines(x2, norm2, col="#FF")
normy1=normpdf(y1, 17.1, 0.045)
normy1=normy1/max(normy1)*-.3e-10+max(newdata)
normy2=normpdf(y2, 16.5, 0.045)
normy2=normy2/max(normy2)*-.3e-10+max(newdata)
lines(normy1, y1, col="#FF")
lines(normy2, y2, col="#FF")
dev.off()

Thanks, Markus Nenniger
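One standard way to make the axes cross at a chosen point in base graphics, not taken from the thread, is the pos argument of axis(), which pins each axis at a data coordinate rather than at the plot margin:

```r
# Sketch: draw both axes through (0, 0) with axis(..., pos = 0)
# rather than abline(), so they stay put when the device is rescaled.
x <- seq(0, 10, length.out = 50)
y <- sin(x)
plot(x, y, type = "l", axes = FALSE, xlab = "", ylab = "")
axis(1, pos = 0)  # x axis drawn at y = 0
axis(2, pos = 0)  # y axis drawn at x = 0
```

pos can be any data coordinate (e.g. the lower ylim value), so the same idea works when the origin lies outside the plotting region, as in the ylim = c(15.7, 17.5) plot above.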
Re: [R] Contour plots of four two-dimensional matrices
You would use layout to set up the page in base graphics; it sets up the page to receive multiple plots. Unfortunately, this will *not* give you side-by-side plots, because filled.contour is restricted to a full page per its help page:

layout(matrix(c(1,2,3,4), 2, 2, byrow=TRUE))
for (i in 1:4) { filled.contour(seven[ , , i]) }

Lattice graphics looks to be your only option: levelplot (in package lattice) has methods for arrays. This is what its help page says: "Both levelplot and wireframe have methods for matrix, array, and table objects, in which case x provides the z vector described above, while its rows and columns are interpreted as the x and y vectors respectively. This is similar to the form used in filled.contour and image. For higher-dimensional arrays and tables, further dimensions are used as conditioning variables." Note that the matrix type is limited to 2 dimensions, and you would need to use an array rather than a matrix. I just tested contourplot with the three example below and got encouraging results as well, so I think you are in luck. I would try simply this:

library(lattice)
contourplot(seven)  # can it really be this simple ?!?!

So your data arrangement is in accord with that description. The desired 2 x 2 plot might happen automagically with the third dimension of the array = 4. The other, more typical, way to do it would be with a dataframe object that has x, y, z and grouping variables, and to specify a formula like z ~ x + y | group. There is an example in the help page. You can get to that form with as.data.frame.table. Run this demo:

three <- array(1:27, c(3,3,3))
three
three.long <- as.data.frame.table(three)
# would need to relabel variable names
names(three.long) <- c("row", "col", "instance", "Z")

HTH; David Winsemius

On Mar 15, 2009, at 2:15 PM, Thomas Levine wrote: I want to plot them side by side. On Sun, Mar 15, 2009 at 12:41 PM, David Winsemius dwinsem...@comcast.net wrote: What is it that you want to do with these 4 plots?
Overlay them with different color contours, or plot them side by side on the same page? ?par # for filled.contour, but the implementation will be different for those two options. contourplot is a lattice plotting function; see Figure 6.10 in Sarkar's Lattice book. levelplot is the closest analog to filled.contour in lattice. -- David Winsemius

On Mar 15, 2009, at 12:22 PM, Thomas Levine wrote: I have four large two-dimensional matrices of which I want to create contour plots. Something like filled.contour(matrix) or contourplot(matrix) works but only gives me one plot at a time. If I combine the four matrices into one three-dimensional array, which I'll name seven, there should be a way of doing something like contourplot(seven[,,k]) for k in 1 to 4 such that they come out as one plot rather than four. I couldn't figure out how to do this, so I tried a disgusting alternative that involved generating x, y and k vectors, but I'd rather do it properly. Tom

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
Re: [R] SEM model testing with identical goodness of fits (2)
The purpose of carrying out this CFA is to test the validity of a newly developed scale, tr, with the v* items; the other two scales, weber and tp, are existing scales that measure specific risk attitudes. I am not sure whether a simple correlation analysis would be adequate for this purpose, thus the CFA test. Further, although a PCA has tested the dimensionality of all items, they are not divided as the PCA result suggested; rather, their original grouping remains. The indicators are indeed not very well divided in the PCA; mainly, the o* items are located in two components. Originally, the EFA was carried out on the first half of the sample and the CFA on the second half. Due to the low fit indices from the CFA on the partial sample, the full sample was tested in the CFA to see whether sample size matters much, and the result is as poor as before. It seems it is time to read more about scale development. And thanks for all this input. Regards,

John Fox wrote: Dear Hyena, OK -- I see that what you're trying to do is simply to fit a confirmatory factor-analysis model. The two models that you're considering aren't really different -- they are, as I said, observationally equivalent, and fit the data poorly. You can *assume* a common higher-level factor and estimate the three loadings on it for the lower-level factors, but you can't test this model against the first model. I'm not sure what you gain from the CFA beyond what you learned from an exploratory factor analysis. Using the same data first in an EFA and then for a CFA essentially invalidates the CFA, which is no longer confirmatory. One would, then, expect a CFA following an EFA to fit the data well, since the CFA was presumably specified to do so, but I suspect that a closer examination of the EFA will show that the items don't divide so neatly into the three sets.
Regards, John

-----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of hyena Sent: March-15-09 12:00 PM To: r-h...@stat.math.ethz.ch Subject: Re: [R] SEM model testing with identical goodness of fits (2)

Dear John, Thanks for the reply. Maybe I had used the wrong terminology; as you pointed out, variables prob*, o* and v* are in fact indicators of three latent variables (scales): weber, tp, and tr respectively. So variables prob*, o* and v* are exogenous variables. E.g., the variable prob_dangerous_sport records answers to the question "how likely do you think you will engage in a dangerous sport?" (1 = very unlikely to 5 = very likely). The variables weber, tr, and tp are latent variables representing risk attitudes in different domains (recreation, planned behaviour, travel choice). Hope this makes sense of the models. Exploratory analysis showed internal consistency (Cronbach's alpha) within each scale (latent variables tr, tp, weber), and significant correlations among the three scales. The two models mentioned in previous posts are attempts to find out whether there is a more general factor that can account for the correlations and make the three scales its sub-scales. In this sense, SEM is being used more as a CFA (sem is the only package I know of that does this; I did not search very hard, of course). And indeed the model fit is quite bad. Regards,

John Fox wrote: Dear hyena,

-----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of hyena Sent: March-15-09 4:25 AM To: r-h...@stat.math.ethz.ch Subject: Re: [R] SEM model testing with identical goodness of fits (2)

Dear John, Thanks for the prompt reply! Sorry I did not supply more detailed information. The target model consists of three latent factors: a general risk scale from Weber's domain risk scales, a time perspective scale from Zimbardo (future-time-oriented only), and a travel risk attitude scale.
Variables with the prob_ prefix are items of the general risk scale, variables o1 to o12 are items of future time perspective, and v5 to v13 are items of the travel risk scale. The purpose is to explore or find a best-fitting model that correctly represents the underlying relationship of the three scales. So far, the correlated model has the best fit indices, so I'd like to check whether there is a higher-level factor that governs all three factors; thus the second model.

Both models are very odd. In the first, each of tr, weber, and tp has direct effects on different subsets of the endogenous variables. The implicit claim of these models is that, e.g., the prob_* variables are conditionally independent of tr and tp given weber, and that the correlations among the prob_* variables are entirely accounted for by their dependence on weber. The structural coefficients are just the simple regressions of each prob_* on weber. The second model is the same except that the variances and covariances among weber, tr, and tp are parametrized differently. I'm not sure why you set the models up in this manner, and why your research requires a structural-equation model. I would have
[R] mvpart error - is.leaf
Hello, When trying to run mvpart, either specifying my own parameters or using the defaults, I get the following error:

Error in all(is.leaf) : unused argument(s) (c(FALSE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE))

As far as I can tell, is.leaf is part of the dendrogram functions, so I'm assuming there's some problem with the graphical parameters. However, running the same formula and data through rpart yields no errors, and I don't remember seeing this when I ran the same data through mvpart a couple of years ago. Any suggestions as to where a solution might lie? Full output:

summary(thesis.mvp)
Call:
mvpart(form = bat_sp ~ ., data = alltrees.df, size = 6, xval = 8,
    xvmult = 1000, plot.add = TRUE, text.add = TRUE, all.leaves = TRUE,
    bars = TRUE, legend = TRUE, bord = TRUE, prn = TRUE)
n= 78

          CP nsplit rel error    xerror      xstd
1 0.23076923      0 1.0000000 1.0000000 0.1132277
2 0.1667          1 0.7692308 1.0940256 0.1122883
3 0.05128205      3 0.4358974 0.9313077 0.1125267

Error in all(is.leaf) : unused argument(s) (c(FALSE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE))

Thanks in advance, Joshua Stumpf
[R] Testing for Inequality à la select case
Using R 2.7.0 under WinXP. I need to write a function that takes a non-negative vector and returns the parallel maximum between a percentage of this argument and a fixed value. Both the percentages and the fixed values depend on which interval x falls in. Intervals are as follows:

From      | To        | % of x | Minimum
----------------------------------------
0         | 20,000    | 65     | 0
20,000    | 100,000   | 40     | 14,000
100,000   | 250,000   | 30     | 40,000
250,000   | 700,000   | 25     | 75,000
700,000   | 1,000,000 | 20     | 175,000
1,000,000 | inf       | --     | 250,000

Once the interval is determined, the values in x are multiplied by the percentage in the 3rd column. If the result is less than the fourth column, then the latter is used. For values of x falling in the last interval, 250,000 must be used. My best attempt at it in R:

MyRange <- function(x){
  range_aux = ifelse(x <= 20000, 1,
                ifelse(x <= 100000, 2,
                  ifelse(x <= 250000, 3,
                    ifelse(x <= 700000, 4,
                      ifelse(x <= 1000000, 5, 6)))))
  percent = c(0.65, 0.4, 0.3, 0.25, 0.2, 0)
  minimum = c(0, 14000, 40000, 75000, 175000, 250000)
  pmax(x * percent[range_aux], minimum[range_aux])
}

This could be done much more tidily in Excel, in my opinion (especially the range_aux part), element by element (cell by cell), with a VBA function as follows:

Function MyRange(x As Double) As Double
  Select Case x
    Case Is <= 20000
      MyRange = 0.65 * x
    Case Is <= 100000
      MyRange = IIf(0.4 * x < 14000, 14000, 0.4 * x)
    Case Is <= 250000
      MyRange = IIf(0.3 * x < 40000, 40000, 0.3 * x)
    Case Is <= 700000
      MyRange = IIf(0.25 * x < 75000, 75000, 0.25 * x)
    Case Is <= 1000000
      MyRange = IIf(0.2 * x < 175000, 175000, 0.2 * x)
    Case Else
      ' This is always 250,000. I left it this way so it is analogous to the R function
      MyRange = IIf(0 * x < 250000, 250000, 0 * x)
  End Select
End Function

Any way to improve my R function? I have searched the help archive and the closest I have found is the switch function, which tests for equality only. Thank you in advance for reading this.
- ~~ Diego Mazzeo, Actuarial Science Student, Facultad de Ciencias Económicas, Universidad de Buenos Aires, Buenos Aires, Argentina
Re: [R] Testing for Inequality à la select case
Hi, I think you could get a cleaner solution using ?cut to split your data into the given ranges (the breaks argument), and then using the resulting factor to pick the appropriate percentage. Hope this helps, baptiste

On 15 Mar 2009, at 20:12, diegol wrote: Using R 2.7.0 under WinXP. I need to write a function that takes a non-negative vector and returns the parallel maximum between a percentage of this argument and a fixed value. Both the percentages and the fixed values depend on which interval x falls in. [...]
_________________________________
Baptiste Auguié
School of Physics
University of Exeter
Stocker Road, Exeter, Devon, EX4 4QL, UK
Phone: +44 1392 264187
http://newton.ex.ac.uk/research/emag
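A sketch of the cut-based approach baptiste suggests; the breakpoints and lookup vectors are taken from the thread, while the function name MyRange2 is mine:

```r
# Pick the bracket for each x with cut(), then index parallel lookup
# vectors of percentages and floors (values from the thread).
MyRange2 <- function(x) {
  breaks  <- c(0, 20000, 100000, 250000, 700000, 1000000, Inf)
  percent <- c(0.65, 0.40, 0.30, 0.25, 0.20, 0)
  minimum <- c(0, 14000, 40000, 75000, 175000, 250000)
  idx <- cut(x, breaks, labels = FALSE, include.lowest = TRUE)
  pmax(x * percent[idx], minimum[idx])
}

MyRange2(c(10000, 50000, 2e6))  # 6500 20000 250000
```

Unlike the which()-based version, cut() is vectorized over x, so this handles a whole vector in one call.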
Re: [R] Testing for Inequality à la select case
On Sun, Mar 15, 2009 at 4:12 PM, diegol diego...@gmail.com wrote:

> ...This could be done in Excel much tidier in my opinion (especially the range_aux part), element by element (cell by cell)...

If you'd do it element-by-element in Excel, why not do it element-by-element in R? Create a table with the limits of the ranges

range <- c(20,100,250,700,1000,Inf)*1000

and then find the index of the appropriate case using something like

idx <- which(x <= range)[1]

Then the formula becomes simply

pmax( x*perc[idx], min[idx] )

Putting it all together:

mr <- local({
  # Local constants
  range <- c(20,100,250,700,1000,Inf)*1000
  perc  <- c(65,40,30,25,20,0)/100
  min   <- c(0,14,40,75,175,250)*1000
  function(x) {
    idx <- which(x <= range)[1]
    pmax( x*perc[idx], min[idx] )
  }
})

Make sense? -s
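The mr() function above can be exercised as follows. A minimal sketch; the test values are made up for illustration and are not from the thread. Note that which() compares the whole of x against range at once, so mr() as written is only correct for a single value at a time:

```r
# Sketch of the approach from the post above; test values are illustrative.
mr <- local({
  range <- c(20, 100, 250, 700, 1000, Inf) * 1000
  perc  <- c(65, 40, 30, 25, 20, 0) / 100
  min   <- c(0, 14, 40, 75, 175, 250) * 1000
  function(x) {
    idx <- which(x <= range)[1]     # first interval whose upper limit covers x
    pmax(x * perc[idx], min[idx])
  }
})

mr(10000)    # 0.65 * 10000 = 6500
mr(150000)   # max(0.30 * 150000, 40000) = 45000
mr(2e6)      # last interval: fixed 250000

# For a vector, each element needs its own interval lookup, e.g.:
sapply(c(10000, 150000, 2e6), mr)
```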
Re: [R] Tukey, planned contrasts or t-test without ANOVA? What is correct?
J S yulya258 at hotmail.com writes:

> I compare mean monthly body temperature between two age classes of turtles overwintering underground.
>
> lm(body_tem ~ Month*Year*Age_Class)
> TukeyHSD(aov(body_tem ~ Month*Year*Age_Class, a))
>
> The Tukey HSD as well as the planned contrasts method showed significant differences between the two age classes, but non-significant differences between the two age classes at the same levels of months. In contrast, a t-test for comparison of independent means (i.e. without using the ANOVA) showed non-significant differences between the two age classes. Which result is correct?

If the more detailed breakdown shows an effect, use it, because it tells you that the grouping was over different classes.

Dieter
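The comparison being discussed can be reproduced on simulated data. A sketch only: the variable names follow the post, but the data and effect sizes are entirely made up:

```r
# Illustrative only: simulated data with the factors from the post
# (Month, Year, Age_Class); values and design are assumptions.
set.seed(1)
a <- expand.grid(Month = factor(1:4), Year = factor(2007:2008),
                 Age_Class = factor(c("juvenile", "adult")))
a <- a[rep(seq_len(nrow(a)), each = 5), ]           # 5 replicates per cell
a$body_tem <- rnorm(nrow(a), mean = 10) +
              0.5 * (a$Age_Class == "adult")         # built-in class effect

fit <- aov(body_tem ~ Month * Year * Age_Class, data = a)
tk  <- TukeyHSD(fit, which = "Age_Class")            # adjusted pairwise CIs
tt  <- t.test(body_tem ~ Age_Class, data = a)        # unadjusted two-sample test
```

The Tukey intervals are adjusted for the family of comparisons within the factorial fit, while the t-test ignores the other factors entirely, which is one source of the discrepancy the poster describes.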
Re: [R] Testing for Inequality à la select case
Hello Stavros,

> If you'd do it element-by-element in Excel, why not do it element-by-element in R?

Well, actually I was hoping for a vectorized solution so as to avoid looping. I need to use this formula on rather lengthy vectors and I wanted to put R's efficiency to some good use. In any case, I had not come up with your solution. For now, I'd stick to my ugly version.

> Make sense?

Perfectly.

x <- 1:150 * 10000
y <- numeric(150)
for (i in 1:150) y[i] <- mr(x[i])
identical(MyRange(x), y)
[1] TRUE

I would however use max instead of pmax, since the argument for mr() must be a vector of length 1. The final version looks like this (also added a line to guard against vectors of length greater than 1):

mr <- local({
  # Local constants
  range <- c(20,100,250,700,1000,Inf)*1000
  perc  <- c(65,40,30,25,20,0)/100
  min   <- c(0,14,40,75,175,250)*1000
  function(x) {
    if (length(x) > 1) stop("x must have length 1")
    idx <- which(x <= range)[1]
    max( x*perc[idx], min[idx] )
  }
})

Thank you very much for your help. Diego

Stavros Macrakis-2 wrote: ...
Diego Mazzeo
Actuarial Science Student
Facultad de Ciencias Económicas
Universidad de Buenos Aires
Buenos Aires, Argentina
Re: [R] Testing for Inequality à la select case
Hello Baptiste, I am not very sure how I'd go about that. Taking the range, perc and min vectors from Stavros' response:

range <- c(20,100,250,700,1000,Inf)*1000
perc  <- c(65,40,30,25,20,0)/100
min   <- c(0,14,40,75,175,250)*1000

For range to work as the breaks argument to cut, I think an additional first element is needed:

range <- c(0, range)

Now I create a dummy vector x and apply cut to create a factor z:

x <- 1:150 * 10000
z <- cut(x = x, breaks = range)

The thing is, I cannot seem to figure out how to use this z factor to create vectors of the same length as x with the corresponding elements of percent and min defined above. Admittedly I have never felt very comfortable with factors. Could you please give me some advice? Thank you very much.

baptiste auguie-2 wrote: ...
Diego Mazzeo
Actuarial Science Student
Facultad de Ciencias Económicas
Universidad de Buenos Aires
Buenos Aires, Argentina
Re: [R] Testing for Inequality à la select case
Hi, I don't use ?cut and ?split very much either, so this may not be good advice. From what I understood of your problem, I would try something along those lines:

range <- c(20,100,250,700,1000,Inf)*1000
perc  <- c(65,40,30,25,20,0)/100
min   <- c(0,14,40,75,175,250)*1000
range <- c(0, range)

x <- 1:150 * 10000
percent <- x
minimum <- x
z <- cut(x = x, breaks = range)
levs <- levels(z)
split(percent, z, drop = FALSE) <- perc
split(minimum, z, drop = FALSE) <- min

mydf <- data.frame(x, range = z, percent, minimum)
mydf <- within(mydf, product <- x * percent)
mydf$result <- with(mydf, ifelse(product < minimum, minimum, product))
str(mydf)
head(mydf)

But it's getting late here so I may well be missing an important thing. Hope this helps, baptiste

On 15 Mar 2009, at 23:19, diegol wrote: ...
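There is a shorter route from the factor to the lookup values, which answers the question directly: the integer codes of the cut() factor index the percentage and minimum tables, so the split()<- steps can be collapsed. A sketch; the x values here are made up:

```r
# Sketch: index the lookup tables by the factor's integer codes.
# range/perc vectors as in the thread; the x values are illustrative.
range <- c(0, c(20, 100, 250, 700, 1000, Inf) * 1000)
perc  <- c(65, 40, 30, 25, 20, 0) / 100
minm  <- c(0, 14, 40, 75, 175, 250) * 1000

x <- c(10000, 50000, 150000, 800000, 2e6)
z <- cut(x, breaks = range)     # right-closed intervals (a, b]
i <- as.integer(z)              # interval number, 1..6, one per element of x
result <- pmax(x * perc[i], minm[i])
result
```

Because as.integer() on a factor returns its level codes, perc[i] and minm[i] are automatically the same length as x, which is exactly what the poster was asking for.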
Re: [R] discrepancy between aov F test and Tukey contrasts results with mixed effects model
lba...@montana.edu wrote:

> Thanks Peter for the advice and quick response. I just want to clarify what you suggest. I should average values within a site, then do a one-way ANOVA to test for differences between sites based on the 2 to 3 new samples per stand type, and not use random effects for site? Or, because I've reduced the data, I've 'corrected' the problem with the glht multiple comparisons and can use the p-values from that summary if I include site as a random effect? Thanks again for your advice.

This is tricky to say in a few lines, but the basic idea of a random effects model is that the site averages vary more than they should according to within-site variability. In the balanced case (equal number of observations per site), it turns out that the mixed-effects analysis is _equivalent_ to modeling the site averages. This is not ignoring the random effect of site; rather, it is coalescing it with the residual, since the variance of a site average is v_site + (1/k) v_res, where k is the number of within-site observations. In the unbalanced case it is not strictly correct to analyze averages, because they have different variances. However, the differences can be slight (when the k's are similar or v_site dominates in the above formula). A side effect of looking at averages is that you are fitting a plain lm model rather than lme, and glht in that case knows how to handle the degrees of freedom adjustment. (Assuming that the averages are normally distributed, which is as always dubious; but presumably it is better than not correcting at all.)

lba...@montana.edu wrote:

> Hello, I have some conflicting output from an aov summary and Tukey contrasts with a mixed effects model I was hoping someone could clarify. I am comparing the abundance of a species across three willow stand types. Since I have 2 or 3 sites within a habitat, I have included site as a random effect in the lme model.
> My confusion is that the F test given by aov(model) indicates there is no difference between habitats, but the Tukey contrasts using the multcomp package show that one pair of habitats is significantly different from each other. Why is there a discrepancy? Have I specified my model correctly? I included the code and output below. Thank you.

Looks like glht() is ignoring degrees of freedom. So what it does is wrong, but it is not easy to do it right (whatever "right" is in these cases). If I understand correctly, what you have is that stand is strictly coarser than site, presumably with the stands representing 2, 2, and 3 sites respectively, with a varying number of replications within each site. Since the between-site variation is considered random, you end up with a comparison of stands based on essentially only 7 pieces of information. (The latter statement requires some qualification, but let's not go there today.) If you have roughly equal replications within each site, I'd be strongly tempted to reduce the analysis to a simple 1-way ANOVA of the site averages.
co.lme <- lme(coye ~ stand, data = t, random = ~1 | site)
summary(co.lme)

Linear mixed-effects model fit by REML
 Data: R
      AIC      BIC    logLik
 53.76606 64.56047 -21.88303

Random effects:
 Formula: ~1 | site
        (Intercept)  Residual
StdDev:   0.3122146 0.2944667

Fixed effects: coye ~ stand
                 Value Std.Error DF    t-value p-value
(Intercept)  0.4936837 0.2305072 60  2.1417277  0.0363
stand2       0.4853222 0.3003745  4  1.6157240  0.1815
stand3      -0.3159230 0.3251201  4 -0.9717117  0.3862
 Correlation:
       (Intr) stand2
stand2 -0.767
stand3 -0.709  0.544

Standardized Within-Group Residuals:
       Min         Q1        Med         Q3        Max
-2.4545673 -0.5495609 -0.3148274  0.7527378  2.5151476

Number of Observations: 67
Number of Groups: 7

anova(co.lme)
            numDF denDF   F-value p-value
(Intercept)     1    60 23.552098  <.0001
stand           2     4  3.738199  0.1215

summary(glht(co.lme, linfct = mcp(stand = "Tukey")))

Simultaneous Tests for General Linear Hypotheses
Multiple Comparisons of Means: Tukey Contrasts

Fit: lme.formula(fixed = coye ~ stand, data = R, random = ~1 | site)

Linear Hypotheses:
           Estimate Std. Error z value Pr(>|z|)
2 - 1 == 0   0.4853     0.3004   1.616   0.2385
3 - 1 == 0  -0.3159     0.3251  -0.972   0.5943
3 - 2 == 0  -0.8012     0.2994  -2.676   0.0202 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)

Lisa Baril
Masters Candidate
Department of Ecology
Montana State University - Bozeman
406.994.2670

--
Peter Dalgaard
Dept. of Biostatistics
Øster Farimagsgade 5, Entr. B
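The site-average analysis suggested in the reply can be sketched as follows. The data here are simulated, with made-up sample sizes; only the variable names (coye, site, stand) follow the thread, and glht() would then be applied to the resulting lm fit via the multcomp package:

```r
# Sketch of the suggested site-average analysis; data are simulated.
set.seed(2)
d <- data.frame(
  site  = rep(paste0("site", 1:7), times = c(10, 9, 11, 8, 10, 9, 10)),
  stand = rep(c("1", "1", "2", "2", "3", "3", "3"),
              times = c(10, 9, 11, 8, 10, 9, 10))
)
d$coye <- rnorm(nrow(d), mean = 0.5)

# Reduce to one average per site, then a plain one-way ANOVA on stands:
site.means <- aggregate(coye ~ site + stand, data = d, FUN = mean)
fit <- lm(coye ~ stand, data = site.means)
anova(fit)
# glht(fit, linfct = mcp(stand = "Tukey")) would now use the correct
# residual degrees of freedom (requires the multcomp package).
```

With 7 site averages and 3 stand levels, the comparison rests on 4 residual degrees of freedom, matching the denDF in the lme anova table above.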
Re: [R] Testing for Inequality à la select case
Hello Baptiste, Thanks so much for your help. This function, which is basically your input wrapped with curly brackets, seems to work alright:

mr_2 <- function(x){
  range <- c(20,100,250,700,1000,Inf)*1000
  perc  <- c(65,40,30,25,20,0)/100
  min   <- c(0,14,40,75,175,250)*1000
  range <- c(0, range)
  percent <- x
  minimum <- x
  z <- cut(x = x, breaks = range)
  levs <- levels(z)
  split(percent, z, drop = FALSE) <- perc
  split(minimum, z, drop = FALSE) <- min
  mydf <- data.frame(x, range = z, percent, minimum)
  mydf <- within(mydf, product <- x * percent)
  mydf$result <- with(mydf, ifelse(product < minimum, minimum, product))
  mydf$result
}

# Basic Test
x <- 1:150 * 10000
identical(MyRange(x), mr_2(x))
[1] TRUE

# Yet another test
# (I will have a more in-depth look at split, with and within to feel more comfortable)
x <- 150:1 * 10000
identical(TramosAutos(x), mr_2(x))
[1] TRUE

Again, thank you very much to both of you. Have a great week. Diego

baptiste auguie-2 wrote: ...
Re: [R] Problem with figure size when embedding fonts
On Sat, Mar 14, 2009 at 1:51 PM, Frank E Harrell Jr f.harr...@vanderbilt.edu wrote:

> Dear Colleagues: I need to make a graphic that uses the Nimbus rather than Helvetica font family so that the font can be embedded in the encapsulated PostScript file. This is to satisfy a requirement from a journal for electronic submission of figures. I do the following:
>
> postscript('lowess.eps', onefile=FALSE, pointsize=18, horizontal=FALSE, family='NimbusSan')
> spar(mfrow=c(3,2))
> . . .
> dev.off()
>
> At this point lowess.eps looks fine. When I run:
>
> embedFonts('lowess.eps')
>
> or
>
> embedFonts('lowess.eps', options='-sPAPERSIZE=letter')
>
> the figures are wider and the right panel of the 3x2 matrix of plots expands past the paper edge. Advice welcomed.

Hello Frank: This is an interesting post. I've not played with this until you brought it to my attention. I made a little progress. It appears to me either that there is something wrong with embedFonts as applied to postscript files, or you and I need to learn a lot of Ghostscript options. I created a small test and I see the same problem you do: the margins bump if you embed the fonts. I think the problem may be worse than that; it looks like embedFonts wrecks the bounding box of the eps file. Do this:

x <- rnorm(100)
y <- rnorm(100)
ml <- loess(y~x)
plot(x,y)
lines(ml)

postscript('lowess.eps', onefile=FALSE, pointsize=18, horizontal=FALSE, family="NimbusSan")
plot(x,y)
lines(ml)
text(-1, 1, "Really big writing")
dev.off()

Copy lowess.eps to a safe place, then run

embedFonts("lowess.eps")

When you compare the two eps files, the margins are quite different. I see the difference more sharply when I add the features I usually have in postscript:

postscript('lowess.eps', onefile=FALSE, pointsize=18, horizontal=FALSE, family="NimbusSan", height=6, width=6, paper="special")
plot(x,y)
lines(ml)
text(-1, 1, "Really big writing")
dev.off()

If you run that, save a copy of lowess.eps, then run embedFonts, you see the new lowess.eps no longer has the proper bounding box.
Here's the top of lowess-orig.eps:

%!PS-Adobe-3.0 EPSF-3.0
%%DocumentNeededResources: font NimbusSanL-Regu
%%+ font NimbusSanL-Bold
%%+ font NimbusSanL-ReguItal
%%+ font NimbusSanL-BoldItal
%%+ font StandardSymL
%%Title: R Graphics Output
%%Creator: R Software
%%Pages: (atend)
%%BoundingBox: 0 0 432 432
%%EndComments
%%BeginProlog

On the other hand, the one that results from embedFonts is not EPS anymore, and the bounding box is, well, confusing:

%!PS-Adobe-3.0
%%Pages: (atend)
%%BoundingBox: 0 0 0 0
%%HiResBoundingBox: 0.00 0.00 0.00 0.00
%.
%%Creator: GPL Ghostscript 863 (pswrite)
%%CreationDate: 2009/03/15 20:00:36
%%DocumentData: Clean7Bit
%%LanguageLevel: 2
%%EndComments
%%BeginProlog

It appears to me that running the eps output file through ps2epsi will re-create a tight bounding box; perhaps that is sufficient for your need. Rather than wrestle with PostScript and Ghostscript, I re-did this on a pdf output device. It appears to me that the margin placement is not affected by embedFonts:

pdf('lowess.pdf', onefile=FALSE, pointsize=18, family="NimbusSan", height=6, width=6, paper="special")
plot(x,y)
lines(ml)
text(-1, 1, "Really big writing")
dev.off()

I believe it is important to specify paper="special" here. I wondered about your experience with par(mfcol=...). Lately, I've had more luck printing out the smaller separate components and putting them on the same LaTeX figure. That way, I have a little more control, and the fonts in the figure captions are identical to my text, which I like. PJ

--
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas
Re: [R] Testing for Inequality à la select case
Using cut/split seems like gross overkill here. Among other things, you don't need to generate labels for all the different ranges. which(x <= range)[1] seems straightforward enough to me, but you could also use the built-in function findInterval. -s
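A fully vectorised version along those lines might look like this. A sketch, not from the thread; note that the left.open argument of findInterval(), available in newer versions of R than the 2.7.0 the original poster used, reproduces the "x <= limit" boundary convention of the thread's code:

```r
# Vectorised lookup via findInterval(); constants as in the thread,
# function name and test values are illustrative.
mr_vec <- function(x) {
  limits <- c(20, 100, 250, 700, 1000) * 1000   # finite upper limits
  perc   <- c(65, 40, 30, 25, 20, 0) / 100
  minm   <- c(0, 14, 40, 75, 175, 250) * 1000
  # findInterval() is vectorised; left.open = TRUE makes each interval
  # (limits[i], limits[i+1]], matching which(x <= range)[1] above
  idx <- findInterval(x, limits, left.open = TRUE) + 1
  pmax(x * perc[idx], minm[idx])
}

mr_vec(c(10000, 20000, 150000, 2e6))
# 6500 13000 45000 250000
```

No loop or per-element call is needed, which addresses the efficiency concern raised earlier in the thread.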
[R] BOOTSTRAP_CROSS VALIDATION
I need a script that works for bootstrap validation. Could someone explain to me when I should use the bootstrap technique versus the jackknife? Thanks
Re: [R] primitives again
On Sun, 15 Mar 2009 14:23:40 +0100 Wacek Kusnierczyk waclaw.marcin.kusnierc...@idi.ntnu.no wrote:

> Edna Bell wrote:
> > How do I find the functions which are primitives, please?
>
> you can scan the whole search path for functions that are primitives:
>
> primitives = sapply(search(), function(path)
>     with(as.environment(path),
>         Filter(is.primitive, lapply(ls(), get))))
>
> primitives is a list of named lists of primitives, one sublist for each attached package (most sublists will be empty, i guess).

The code above will miss some primitives in package:base, namely those that start with a dot:

R> primitives <- sapply(search(), function(path)
+      with(as.environment(path),
+           Filter(is.primitive, lapply(ls(), get))))
R> sapply(primitives, length)
       .GlobalEnv     package:stats  package:graphics package:grDevices
                0                 0                 0                 0
    package:utils  package:datasets   package:methods         Autoloads
                0                 0                 2                 0
     package:base
              176
R> primitives <- sapply(search(), function(path)
+      with(as.environment(path),
+           Filter(is.primitive, lapply(ls(all=TRUE), get))))
R> sapply(primitives, length)
       .GlobalEnv     package:stats  package:graphics package:grDevices
                0                 0                 0                 0
    package:utils  package:datasets   package:methods         Autoloads
                0                 0                 2                 0
     package:base
              188

Also, though that is a matter of taste, it could be preferable to use sapply instead of lapply, so that the entries are named:

R> primitives$`package:methods`
[[1]]
function (expr) .Primitive("quote")
[[2]]
.Primitive("[[<-")
R> head(primitives$`package:base`)
[[1]]
function (x) .Primitive("!")
[[2]]
function (e1, e2) .Primitive("!=")
[[3]]
.Primitive("$")
[[4]]
.Primitive("$<-")
[[5]]
function (e1, e2) .Primitive("%%")
[[6]]
function (x, y) .Primitive("%*%")
R> primitives <- sapply(search(), function(path)
+      with(as.environment(path),
+           Filter(is.primitive, sapply(ls(all=TRUE), get))))
R> primitives$`package:methods`
$Quote
function (expr) .Primitive("quote")
$`el<-`
.Primitive("[[<-")
R> head(primitives$`package:base`)
$`!`
function (x) .Primitive("!")
$`!=`
function (e1, e2) .Primitive("!=")
$`$`
.Primitive("$")
$`$<-`
.Primitive("$<-")
$`%%`
function (e1, e2) .Primitive("%%")
$`%*%`
function (x, y) .Primitive("%*%")

Cheers, Berwin
[R] transfer function in R
I want to forecast a time series Y using a model that includes previous values of Y and an exogenous time series X via a transfer function. The standard procedure, as described in Box and Jenkins and numerous other references, is to first fit an ARIMA model to X, use that model to compute residuals for X, and then apply the same ARIMA filter to Y to compute residuals for Y. The cross-correlation between these two sets of residuals should then allow discovery of the structure of the transfer function that relates X to Y. My question is how to identify this transfer function model using R. Any help would be highly appreciated.
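The prewhitening procedure described above can be sketched in base R as follows. Everything here is illustrative: the series are simulated, and the lag-2 dependence and all parameter values are made up:

```r
# Prewhitening sketch: simulated x (AR(1)) drives y at lag 2.
set.seed(42)
n <- 300
x <- arima.sim(model = list(ar = 0.6), n = n)
y <- 0.8 * c(0, 0, head(x, -2)) + rnorm(n, sd = 0.5)   # y_t depends on x_{t-2}

# 1. Fit an ARIMA model (here AR(1)) to the input series x
fit.x <- arima(x, order = c(1, 0, 0), include.mean = FALSE)
ar1   <- coef(fit.x)[["ar1"]]

# 2. Apply the same AR filter to both series ("prewhitening")
wx <- x[-1] - ar1 * x[-n]
wy <- y[-1] - ar1 * y[-n]

# 3. Cross-correlate the prewhitened series; the spike identifies the lag
cc <- ccf(wx, wy, lag.max = 10, plot = FALSE)
cc$lag[which.max(abs(cc$acf))]   # the dominant lag; here x leads y by 2
```

Because ccf(wx, wy) at lag k estimates cor(wx[t+k], wy[t]), a dependence of y on x two periods earlier shows up as a spike at lag -2.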
Re: [R] Problem with figure size when embedding fonts
Paul Johnson wrote: ...
Paul, Thank you very much. Your pdf fix did the trick.
What is your favorite way of composing the multiple plots into a LaTeX figure? ps2epsi resulted in a good bounding box but a huge graphics file.

Thanks, Frank

-- Frank E Harrell Jr, Professor and Chair, Department of Biostatistics, School of Medicine, Vanderbilt University

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
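One way to do what Paul describes (separate components composed into one LaTeX figure) is to write each panel to its own small pdf and place them side by side with \includegraphics in a single figure environment. A minimal sketch, with illustrative panel names and sizes:

```r
# Write each panel as its own small PDF; compose them later in LaTeX,
# e.g. two \includegraphics per row inside one figure environment.
set.seed(1)
x <- rnorm(100)
y <- rnorm(100)
panels <- list(scatter = function() plot(x, y),
               hist_x  = function() hist(x, main = ""))
for (nm in names(panels)) {
  pdf(paste0("panel-", nm, ".pdf"), pointsize = 12,
      family = "NimbusSan", height = 3, width = 3, paper = "special")
  panels[[nm]]()
  dev.off()
}
```

Because each panel is its own file, LaTeX controls the layout and spacing, and the caption is typeset in the document's own font, which is the advantage Paul mentions.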
Re: [R] Testing for Inequality à la select case
That's what I meant by element-by-element. A vector in R corresponds to a row or a column in Excel, and a vector operation in R corresponds to a row or column of formulae, e.g.

Excel:
    A   B   C
1)  5  10   a1+b1 (= 15)
2)  3   2   a2+b2 (= 5)
etc.

R:
A <- c(5, 3)
B <- c(10, 2)
C <- A + B

Steve, I still don't understand the analogy. I agree that in this case the R approach is vectorized. However, your function just as you first proposed it will not work without a loop.

max and pmax are equivalent in this case. I just use pmax as my default because it acts like other arithmetic operators (+, *, etc.) which perform pointwise (element-by-element) operations.

It's true. I changed it because I had applied your original version of mr() to the entire vector x, which gave an incorrect result (perhaps range was recycled in idx <- which(x <= range)[1]). If I used max instead of pmax, and ever happened to use mr() without a loop, the length of the result would be strange enough for me to realise the error. But then again, I added the line if (length(x) > 1) stop("x must have length 1"), so using max or pmax now doesn't really make a difference, apart perhaps from run time.

Using cut/split seems like gross overkill here. Among other things, you don't need to generate labels for all the different ranges.

which(x <= range)[1] seems straightforward enough to me. I could edit the mr_2() function a little bit to make it more efficient; I left it mostly unchanged for the thread to be easier to follow. For example I could replace the last four lines with only:

product <- x * percent
ifelse(product < minimum, minimum, product)

But I believe you refer rather to the cut/split functions. I agree that which(x <= range)[1] is straightforward, but using such an expression will require a loop to pull the trick, which I don't intend. Am I missing something?

Regards, Diego

Stavros Macrakis-2 wrote: Using cut/split seems like gross overkill here. Among other things, you don't need to generate labels for all the different ranges.
which(x <= range)[1] seems straightforward enough to me, but you could also use the built-in function findInterval. -s

-- Diego Mazzeo, Actuarial Science Student, Facultad de Ciencias Económicas, Universidad de Buenos Aires, Buenos Aires, Argentina
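To illustrate Stavros's findInterval suggestion: assuming sorted bracket upper bounds and per-bracket multipliers (the breakpoints, percentages, and minimum below are made-up values, not from the thread), the per-element lookup which(x <= brackets)[1] can be done for a whole vector at once, without the loop:

```r
# Vectorized bracket lookup: for each element of x, find the index of the
# first breakpoint b with x <= b, then apply that bracket's multiplier.
brackets <- c(100, 200, 300, Inf)    # upper bound of each bracket (illustrative)
percent  <- c(0.10, 0.15, 0.20, 0.25)
minimum  <- 20

x <- c(50, 100, 150, 250, 400)

# Loop-free equivalent of sapply(x, function(xi) which(xi <= brackets)[1]);
# left.open = TRUE makes the boundary case x == brackets[i] match which().
idx <- findInterval(x, brackets, left.open = TRUE) + 1

product <- x * percent[idx]
result  <- ifelse(product < minimum, minimum, product)
# idx is 1 1 2 3 4; result is 20 20 22.5 50 100
```

findInterval does one binary-search pass over the breakpoints per element, so this scales well even when both x and the bracket table are large.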
[R] Fw: Fitting GUMBEL Distribution - CDF function and P P Plot
Dear R Helpers, I am re-posting my query. Please guide me. Maithili

--- On Fri, 3/13/09, Maithili Shiva maithili_sh...@yahoo.com wrote:

I am trying to fit the Gumbel distribution to a data set. I am using the lmom package. I am getting a problem with the cumulative distribution function of the Gumbel distribution, as I am getting it as a series of 0's and 1's, thereby affecting the P-P plot. My R code is as follows.

library(quantreg)
library(RODBC)
library(MASS)
library(actuar)
library(lmom)

x <- c(986.78, 1067.76, 1046.47, 1034.71, 1004.53, 1007.89, 964.94, 1060.24, 1188.07, 1085.63, 988.33, 972.71, 1177.71, 972.48, 1203.20, 1047.27, 1062.95, 1113.65, 995.97, 1093.98)

# Estimating the parameters for the GUMBEL distribution
N <- length(x)
lmom <- samlmu(x); lmom
parameters_of_GUMBEL <- pelgum(lmom); parameters_of_GUMBEL
# Parameters are xi = 1019.4003, alpha = 59.5327

# _____________ P-P Plot
e <- c(1:N)
f <- c((e - .5) / N)
Fx <- cdfgum(x, para = parameters_of_GUMBEL)
g <- sort(Fx)

png(filename = "GUMBEL_P-P.png")
a <- par('bg' = "#CC")
plot(f, g, bg = a, fg = "#804000", main = "P-P Plot", ylab = "Cumulative Distribution Function", xlab = "i", font.main = 2, cex.main = 1, col = "#66", bty = "o", col.main = "black", col.axis = "black", col.lab = "black")
abline(rq(g ~ f, tau = .5), col = "red")
dev.off()

# Fx RETURNS 0 1 1 1 0 0 0 1 1 1 0 0 1 0 1 1 1 1 0 1, and thus the plot is not proper.

Please guide me as to where I am going wrong.

With regards, Maithili

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
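As a cross-check on cdfgum(), the Gumbel CDF can be computed directly from its closed form, F(x) = exp(-exp(-(x - xi)/alpha)). With the fitted parameters quoted above, the values should lie strictly between 0 and 1 rather than collapsing to 0's and 1's. A sketch, assuming the parameter values printed by pelgum() and avoiding lmom entirely:

```r
# Direct Gumbel (EV1) CDF as a cross-check on lmom::cdfgum().
x <- c(986.78, 1067.76, 1046.47, 1034.71, 1004.53, 1007.89, 964.94,
       1060.24, 1188.07, 1085.63, 988.33, 972.71, 1177.71, 972.48,
       1203.20, 1047.27, 1062.95, 1113.65, 995.97, 1093.98)
xi    <- 1019.4003   # location parameter, from pelgum() above
alpha <-   59.5327   # scale parameter,    from pelgum() above

Fx <- exp(-exp(-(x - xi) / alpha))   # Gumbel CDF, elementwise

N <- length(x)
f <- (seq_len(N) - 0.5) / N          # plotting positions (i - 0.5)/N
g <- sort(Fx)

plot(f, g, main = "P-P Plot",
     xlab = "(i - 0.5)/N", ylab = "Sorted fitted CDF")
abline(0, 1, col = "red")            # 45-degree reference line
```

If Fx computed this way lies strictly inside (0, 1) while cdfgum() returns 0's and 1's, the parameter object being passed to cdfgum() would be the first thing to inspect.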