[R] Putting R Based open source analytics for collobrative spreadsheet working on the Cloud
Dear All, I just posted an interview with Karim Chine of http://www.biocep.net/ who has successfully built a latform for on demand data mining enabled by the cloud through R. Here is an except BIOCEP is built on top of R and Scilab and anything that you can do within those environments is accessible through BIOCEP. Here is what you have uniquely with this new R/Scilab-based e-platform: - *High productivity* via the most advanced cross-platform workbench available *for the R environment*. - *Advanced Graphics: with BIOCEP, a graphic transducer allows the rendering on client side of graphics produced on server s*ide and enables advanced capabilities like zooming/unzooming/scrolling for R graphics. a client side mouse tracker allows to display dynamically information related to the graphics and depending on coordinates. Several virtual R Devices showing different data can be coupled in zooming/scrolling and this helps comparing visually complex graphics. - *Extensibility with plug-ins:* new views (IDE-like views, analytical interfaces...) can be created very easily either programmatically or via drag-and-drop GUI designers. - *Extensibility with server-side extensions: any java code can be packaged and used on server side.* The code can interact seamlessly with R and Scilab or provide generic bridges to other software. For example, I provide an extension that allows you to use openoffice as a universal converter between various files formats on server side. - *Seamless High Performance Computing:* working with an R or Scilab on clusters/grids/clouds becomes as simple as working with them locally. Distributed computing becomes seamless, creating a large number R and Scilab remote engines and using them to solve large scale problems becomes easier than ever. From the R console the user can create logical links to existing R engines or to newly created ones and use those logical links to pilot the remote workers from within his R session. R functions enable using the logical links to import/export variables from the R session to the different workers and vice versa. R commands/scripts can be executed by the R workers synchronously or asynchronously. Many logical R links can be aggregated into one logical cluster variable that can be used to pilot the R workers in a coordinated way. A cluster.apply function allows the usage of the logical cluster to apply a function to a big data structure by slicing it and sending elementary execution commands to the workers. The workers apply the user's function to the slices in parallel. The elementary results are aggregated to compose the final result that becomes available within the R session. - *Collaboration:* your R/scilab server running in the cloud can be accessed simultaneously by you and your collaborators. Everything gets broadcasted including Graphics. A spreadsheet enables to view and edit data collaboratively. Anyone can write plug-ins to take advantage of the collaborative capabilities of the frameworks. If your IP address is public, you can provide a URL to anyone and get him connect to your locally running R. *- Powerful frameworks for Java developers:* BIOCEP provides Frameworks and tools to use R as if it was an Object Oriented Java Toolkit or a Web Toolkit for R-based dynamic application. - *Webservices for C#, Perl, Python users/developers:* Most of the capabilities of BIOCEP including piloting of R/Scilab engines on the cloud for distributed computing or for building scalable analytical web application are accessible from most of the programming languages thanks to the SOAP front-end. - *RESTful API:* simple URLs can perform computing using R/Scilab engines and return the result as an XML or as graphics in any format. This works like google charts and has all the power of R since the graphic is described with an R script provided as a parameter of the URL. The same API can be exposed on demand by the workbench. This allow for example to integrate a Cloud-R with Excel or OpenOffice. The workbench works as a bridge between the cloud and those applications. While a screenshot is attached- you can read the rest of the interview at http://tr.im/Rcloud or http://www.decisionstats.com/ Thanks, Ajay Ohri __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] problem with checking wether file is present or not
Hi all, I have a problem with checking File is present in the directory or not like I have a sequence of files in one folder I have to take each file in order and have to caliculate on those files data but in order some files are missing for that I have to check and load those files for that I am using condition like if(file.exists(findings)==TRUE){} its giving results for files which are present in that folder but for files which are not present in that folder its giving error but it have to skip and run for remaining files can any one suggest any other way to make it work for all those files thanks in advance kiran [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] [Rd] Floating point precision / guard digits? (PR#13771)
(re-posting to the appropriate list) On 6/20/09, Dr. D. P. Kreil dpkr...@gmail.com wrote: you can suggest an online resource to help me use the right vocabulary and better understand the fundamental concepts, I am of course There is in R the accuracy [1] package. It has a vignette (and paper) dealing with various computational errors (in R). Liviu [1] http://cran.r-project.org/web/packages/accuracy/index.html __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Error when using step
I have two questions about the built-in function step. Ultimately I want to apply a lm fitting and subsequent step procedure to thousands of data sets groups by a factor defined as a unique ID. Q1. The code below creates a data.frame comprising three marginally noisy surfaces. The code below fails when I use step in a function but summary seems to show the model fits are legitimate. Any ideas on what I'm doing wrong? # y = x1 + x2 for grp A # y = 2 + 2x1 + 4x2 for grp B # y = 3 + 2x1 + 4x2 + 6x1x2 for grp C ind - matrix(runif(200), ncol=2, dimnames=list(c(), c(x1,x2))) d1 - data.frame(ind, y=ind[,x1]+ind[,x2]+rnorm(100,0,0.05), grp=rep(A,100)) d2 - data.frame(ind, y=2+2*ind[,x1]+4*ind[,x2]+rnorm(100,0,0.05),grp=rep(B, 100)) d3 - data.frame(ind, y=3+2*ind[,x1]+4*ind[,x2]+6*ind[,x1]*ind[,x2]+rnorm(100,0,0.05),grp=rep(C,100)) data2 - rbind(d1,d2,d3) # Fit each surface by grp model - y ~ x1*x2 fits - lapply(split(data2, list(data2$grp), drop=TRUE), function(x){lm(model, data = x)}) lapply(fits, function(x){summary(x)}) # FAIL lapply(fits, function(x){step(x)}) # Output Start: AIC=-601.41 y ~ x1 * x2 Df Sum of Sq RSS AIC - x1:x2 1 0.001590.23 -602.71 none 0.23 -601.41 Error in as.data.frame.default(data) : cannot coerce class lm into a data.frame I get a different error if I use step directly on the first fit in the list. step(fits[[1]]) # Output Start: AIC=-601.41 y ~ x1 * x2 Df Sum of Sq RSS AIC - x1:x2 1 0.001590.23 -602.71 none 0.23 -601.41 Error in eval(predvars, data, env) : numeric 'envir' arg not of length one However step works if I do the fit outside of a function. fit - lm(model, data=data2[which(data2$grp==A),]) step(fit) # Output Start: AIC=-601.41 y ~ x1 * x2 Df Sum of Sq RSS AIC - x1:x2 1 0.001590.23 -602.71 none 0.23 -601.41 Step: AIC=-602.71 y ~ x1 + x2 Df Sum of Sq RSS AIC none 0.23 -602.71 - x21 8.368.58 -241.54 - x11 9.559.78 -228.47 Call: lm(formula = y ~ x1 + x2, data = data2[which(data2$grp == A), ]) Coefficients: (Intercept) x1 x2 0.001457 0.999313 1.010880 Q2. How can I setup step to exclude the intercept. In this example above both the intercept and interaction terms are insignificant. I have tried setting direction to both and forward but that didn't help. Thanks for your help. -- View this message in context: http://www.nabble.com/Error-when-using-step-tp24142487p24142487.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] problem with checking wether file is present or not
On Jun 22, 2009, at 2:21 AM, venkata kirankumar wrote: Hi all, I have a problem with checking File is present in the directory or not like I have a sequence of files in one folder I have to take each file in order and have to caliculate on those files data but in order some files are missing for that I have to check and load those files for that I am using condition like if(file.exists(findings)==TRUE){} What is findings? Have you defined it elsewhere? What was the code? Is findings a properly constructed file name for R's interaction with whatever filesystem you are using? file.exists( file=/Users/davidwinsemius/Documents/R_folder/ actuar.pdf) [1] TRUE #would have been FALSE if only the file name was used, since that is not my working directory. its giving results for files which are present in that folder but for files which are not present in that folder its giving error but it have to skip and run for remaining files And am unable to reproduce the problem with a similar call. file.exists(xyz)==TRUE [1] FALSE You should note that file.exists returns a logical vector, so the comparison with TRUE is superfluous. can any one suggest any other way to make it work for all those files Without more specifics it's not possible to reproduce your problem. Include full code and the results of sessionInfo. ?sessionInfo thanks in advance kiran [[alternative HTML version deleted]] *** PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. ! David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] how to customize the CV function in Boosting?
Hi all, Is there a way to write my own error criteria function in the cross-validation part of the Boosting algorithm? I am talking about the GBM package. Of course, if you know other good Boosting implementation in R, please give me some pointers! I am looking for a good classifier (not neccessarily Boosting) in R. Thank you very much! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] rbind
I'm guessing that your 'data' and 'data1' are just vectors so your 'rbind' command returns a 2 by 3 matrix. Jim showed you already that: rbind(as.matrix(data), as.matrix(data1)) will probably get you what you are looking for. However, I'm suspicious that just: c(data, data1) will serve you just as well. What are you planning on doing with a one-column matrix? Patrick Burns patr...@burns-stat.com +44 (0)20 8525 0696 http://www.burns-stat.com (home of The R Inferno and A Guide for the Unwilling S User) Xiaogang Yang wrote: Hi, I have a array like this data: 1 5 2 2342 3 33 and another data1: 1 6 2 5 3 7 when I do rbind(data,data1) I get not what I want they become 1 5 2 2342 3 33 101 6 102 5 103 7 but I want to make the index as increasing one by one. like 1 .. 2 .. 3 .. 4 .. 5 .. 6 .. So what command I should use thank you. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error when using step
Chris Friedl cfriedalek at gmail.com writes: I have two questions about the built-in function step. Ultimately I want to apply a lm fitting and subsequent step procedure to thousands of data sets groups by a factor defined as a unique ID. Q1. The code below creates a data.frame comprising three marginally noisy surfaces. The code below fails when I use step in a function but summary seems to show the model fits are legitimate. Any ideas on what I'm doing wrong? ... Well designed example code omitted function(x){lm(model, data = x)}) lapply(fits, function(x){summary(x)}) # FAIL lapply(fits, function(x){step(x)}) Error in as.data.frame.default(data) : cannot coerce class lm into a data.frame Looks like an environment problem. I could not find a workaround quickly, but you might have a look at http://finzi.psych.upenn.edu/R/Rhelp02a/archive/16599.html We call it Ripley's Game here, because variants of it can help you quite often. Dieter __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] rbind
Now that I've actually read the question, I'm in a better position to answer it. I have no idea how you are getting the results that you show, but you can use 'rownames' to set whatever row names you like. As in: rownames(result) - 1:6 Pat Patrick Burns wrote: I'm guessing that your 'data' and 'data1' are just vectors so your 'rbind' command returns a 2 by 3 matrix. Jim showed you already that: rbind(as.matrix(data), as.matrix(data1)) will probably get you what you are looking for. However, I'm suspicious that just: c(data, data1) will serve you just as well. What are you planning on doing with a one-column matrix? Patrick Burns patr...@burns-stat.com +44 (0)20 8525 0696 http://www.burns-stat.com (home of The R Inferno and A Guide for the Unwilling S User) Xiaogang Yang wrote: Hi, I have a array like this data: 1 5 2 2342 3 33 and another data1: 1 6 2 5 3 7 when I do rbind(data,data1) I get not what I want they become 1 5 2 2342 3 33 101 6 102 5 103 7 but I want to make the index as increasing one by one. like 1 .. 2 .. 3 .. 4 .. 5 .. 6 .. So what command I should use thank you. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Convert NA to '.'
Dear R-users, For reporting purpose (using Sweave and LaTeX), I am creating complex tables with the cat function such as x-c(A, B, C, NA) cat(x, '\n') A B C NA For convenience, I would like to change all my NA value to something else like '.' (as in SAS for example). Is there a global option which allows this change? Or should I change all my code to work with the print function and the na.print argument? Best regards, David -- View this message in context: http://www.nabble.com/Convert-NA-to-%27.%27-tp24142187p24142187.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help with installation of local gzip-ped packages
mau...@alice.it wrote: I was suggested to install two tar-red and gzip-ped packages that are not part of CRAN or BioConductors yet. I read the R manual about Administration and could only find a good description of how to install packages not canonically included in CRAN repository, on UNIX systems. Which part of the section Installing packages, including the subsection Windows, is not clear? If you need some example, the R Help Desk article called Package Management in R News 3 (3), pp. 37-39, might be of interest. Best, Uwe Ligges I work on Linux/SuSE and Windows ... so I am stuck with such an installation. Any suggestion in very welcome. Thank you. Maura tutti i telefonini TIM! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] coloring agnes plots
Hi all, I am creating dendrograms using agnes and was wondering if it is possible to add color to the leaves (and just the leaves). For example, in the documentation, they have an example using the votes.repub data set. If I wanted to make the word Washington green (and only Washington), is it possible and if so how? I can apply summary to the object created by agnes and even see the names of the objects: Order of objects: [1] AlabamaGeorgiaArkansas Louisiana Mississippi ... but I don't know how to designate colors to the objects... Thank you! Ray __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] filtering number of values in a data frame
This does exactly what is needed. Thanks guys! -Original Message- From: markle...@verizon.net [mailto:markle...@verizon.net] Sent: Thursday, June 18, 2009 5:30 PM To: rene.schoenem...@tu-berlin.de Subject: Re: [R] filtering number of values in a data frame do.call(rbind,lapply(unname(split(DF,list(DF$X1,DF$START),drop=TRUE)), function(.df) { data.frame(.df[1,1:4],count=nrow(.df)) })) On Jun 18, 2009, René Schönemann rene.schoenem...@tu-berlin.de wrote: Dear list, given is the following data frame df(): Number Place Start End 1 218024740787 HHO 5 263 2008-01-02 00:21:14 2008-01-03 15:25:16 2 218024740787 HHO 5 263 2008-01-02 00:21:14 2008-01-02 00:21:14 3 318039091794 HHO 5 263 2008-01-02 00:21:14 2008-01-02 13:22:54 4 318039091794 HHO 5 263 2008-01-02 00:21:14 2008-01-02 00:21:14 5 318039379900 HHO 1 104 2008-01-02 06:45:01 2008-01-02 09:15:23 Now, I want to count the number of equal values of column Start but I also want the other columns to be preserved. Using: rle(as.character(df$Start)) - m n - data.frame(m$values, m$lengths) produces a list of items according to their frequency of the Start point: m.values m.lengths 1 2008-01-02 00:21:14 4 2 2008-01-02 06:45:01 1 I want now also other columns to be in this new data frame. It should look like that: Number Place m.values m.lengths 1 218024740787 HHO 5 263 2008-01-02 00:21:14 4 2 318039379900 HHO 1 104 2008-01-02 06:45:01 1 Does anybody can help me with this? Thanking you in advance! René Schönemann -- __ Technische Universität Berlin Institut für Land- und Seeverkehr Fachgebiet Schienenfahrwege und Bahnbetrieb Prof. Dr.-Ing. habil. Jürgen Siegmann Post Sekretariat SG 18 Salzufer 17-19 D-10587 Berlin Telefon +49 (0)30 314 - 23 314 Internet http://www.railways.tu-berlin.de __ Dipl.-Verk.wirtsch. René Schönemann - Wissenschaftlicher Mitarbeiter - Telefon +49 (0)30 314 - 22 710 Telefax +49 (0)30 314 - 25 530 E-Mail rschoenem...@railways.tu-berlin.de __ Technische Universität Berlin Körperschaft öffentlichen Rechts Präsident Prof. Dr. Kurt Kutzler __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Need help to optimize a piece of code involving zoo objects
Hi, everybody OK, I got it working with recursive. Don't know why this argument slipped my mind, as I use filter() so often! Now it is 44 times faster, which is good enough for me. :-) Thank you, Gabor and Jim. Best, Sergey On Fri, Jun 19, 2009 at 15:23, jim holtmanjholt...@gmail.com wrote: check out 'filter' to see if it does what you want with the 'recursive' option. On Fri, Jun 19, 2009 at 3:33 AM, Sergey Goriatchev serg...@gmail.com wrote: Hello, everyone I have a long script that uses zoo objects. In this script I used simple moving averages and these I can very efficiently calculate with filter() functions. Now, I have to use special exponential moving averages, and the only way I could write the code was with a for-loop, which makes everything extremely slow. I don't know how to optimize the code, but I need to find a solution. I hope someone can help me. The special moving average is calculated in the following way: EMA = ( K x ( C - P ) ) + P where, C = Current Value P = Previous periods EMA (A SMA is used for the first period's calculation) K = Exponential smoothing constant K = 2 / ( 1 + Periods ) Below is the code with the for-loop. -temp contains C -Periods is variable j in the for loop (so K varies) - I first produce a vector of simple equally weighted moving average, and use the first non-NA value to initiate the second for-loop x.Date - as.Date(2003-02-01) + seq(1,1100) - 1 temp - zoo(rnorm(1100, 0, 10)+100, x.Date) start.time - proc.time() for(j in seq(5,100,by=5)){ #PRODUCE FAST MOVING AVERAGE #Create equally weighted MA vector (we need only the first value) smafast - zoo(coredata(filter(coredata(temp[,1]), filter=rep(1/j, j), sides=1)), order.by=time(temp)) #index of first non-NA value, which is the first SMA needed #which(is.na(smafast))[length(which(is.na(smafast)))]+1 #Calculate decay factor K #number of periods is j K - 2/(1+j) #Calculate recursively the EMA for the fast index (starting with second non-NA value) for (k in (which(is.na(smafast))[length(which(is.na(smafast)))]+2):length(smafast)) { smafast[k] - coredata(smafast[k-1])+K*(coredata(temp[k,1])-coredata(smafast[k-1])) } #PRODUCE SLOW MOVING AVERAGE #Create equally weighted MA vector (we need only the first value) smaslow - zoo(coredata(filter(coredata(temp[,1]), filter=rep(1/(j*4), (j*4)), sides=1)), order.by=time(temp)) K - 2/(1+j*4) #Calculate EMA for (k in (which(is.na(smaslow))[length(which(is.na(smaslow)))]+2):length(smaslow)) { smaslow[k] - coredata(smaslow[k-1])+K*(coredata(temp[k,1])-coredata(smaslow[k-1])) } #COMBINE DIFFERENCES OF FAST AND SLOW temp - merge(temp, ma=smafast-smaslow) } proc.time()-start.time -- I'm not young enough to know everything. /Oscar Wilde Experience is one thing you can't get for nothing. /Oscar Wilde When you are finished changing, you're finished. /Benjamin Franklin Tell me and I forget, teach me and I remember, involve me and I learn. /Benjamin Franklin Luck is where preparation meets opportunity. /George Patten __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? -- I'm not young enough to know everything. /Oscar Wilde Experience is one thing you can't get for nothing. /Oscar Wilde When you are finished changing, you're finished. /Benjamin Franklin Tell me and I forget, teach me and I remember, involve me and I learn. /Benjamin Franklin Luck is where preparation meets opportunity. /George Patten __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] png() resolution problem {was Silhouette ...}
Martin Maechler wrote: Hallo Sebastian, SP == Sebastian Pölsterl s...@k-d-w.org on Sun, 14 Jun 2009 14:04:52 +0200 writes: SP Hello Martin, SP I plotting the silhouette of a clustering and storing it as png. When I SP try to store the image as png the bars are missing. The bars are plotted SP when I use x11 or postscript as device. In addition, it seems to work SP when I use a smaller matrix (e.g. ruspini). SP Would be great if you have look at this issue. Hmm, I've been at a conference in Italy... The silhouette plot only uses standard R plotting functions, so any problem with it exposes problems in standard R graphics. -- Such a message should really go to R-help. to which I CC now. -- library(cluster) nmat - matrix(rnorm(2500*300), ncol=300, nrow=2500) rmat - matrix(rchisq(1000, 300, 50), ncol=300, nrow=1000) mat - rbind(nmat, rmat) pr - pam(mat, 2) sil - silhouette(pr) png(sil.png) #postscript(sil.ps) plot(sil) dev.off() -- Anyway, I can confirm the problem, but of course, it has not much to do with the silhouette function, but rather with the png() device which produces a bitmap, and the lines you draw are too fine (in the bitmap resolution) and so are rounded to invisible. You can reproduce the problem much more simply: set.seed(1); x - rlnorm(5000) png(bar.png);barplot(x,col=gray,border=0,horiz=TRUE);dev.off() system(eog bar.png ) ## which is also empty, and the completely analogue, replacing ## png [bitmap] with pdf [vector graphic] pdf(bar.pdf);barplot(x,col=gray,border=0,horiz=TRUE);dev.off() system(evince bar.pdf ) ## gives a very nice plot, into which you can zoom and see all details. Now in principle you should be able to use png() with a much higher resolution than the default one, but replacing the above png(bar.bng) with png(bar.bng, res = 1200) did not help, as we now get the infamous Error in plot.new() : figure margins too large Other R-help readers will be able to make the png() example work for such cases, where you need so many lines. {but let's stick with barplot(*, border=0, *)} Well, it's pretty hard, but one of the few cases where conversion from some vector image to the bitmap produces fair results, hence I'd use bitmap() with ghostscript to generate the png as in: bitmap(sil.png, type=pnggray, res=300) plot(sil) dev.off() Best, Uwe Regards, Martin Maechler, ETH Zurich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Help using substitute and expression functions
I'm stopped at a browser in a loop where the following objects look like this: Browse[1] jk [1] 1 Browse[1] leg.ab[jk] [1] Snails Rep1 Browse[1] top.k [1] LT95=7.5; LT99=8.8 I can join them and a few other characters together like this easily enough: Browse[1] paste(jk, : , leg.ab[jk], [, top.k, ], sep = ) [1] 1: Snails Rep1 [LT95=7.5; LT99=8.8] Browse[1] Now, suppose that instead of a simple character string for top.k, I had an expression like this: Browse[1] leg.k.exp expression(paste(LT[95] == 7.5, ; , LT[99] == 8.8)) Browse[1] which was created in a slightly complicated loop joining the 95 and 99 bits together. That code is designed to have a variable number of those bits. I tried using substitute(), bquote and expression() to join the leg.k.exp object in with the other characters, but it always falls over trying to parse the : character string instead of using it just as a string. I can see in the help for plot.math lots of ways of adding mathematical symbols and Greeek letters, but nothing for what I wish to do. There's a trick or two I'm missing. Apologies for the lack of specifics of what I've tried. That's on another computer inaccessible from this one, and that one doesn't have a usable email client. TIA -- ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. ___Patrick Connolly {~._.~} Great minds discuss ideas _( Y )_ Average minds discuss events (:_~*~_:) Small minds discuss people (_)-(_) . Eleanor Roosevelt ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Help on creating a sequence of vectors
I want to create a number of vectors like : vec1 - rnorm(1) vec2 - rnorm(2) vec3 - rnorm(3) and so on... Here I tried following : for (i in 1:10) paste(vec, i, sep=) - rnorm(i) However obviously that is not working. Here vectors I need to be seperated i.e I do not want to create a list. How to modify above code? -- View this message in context: http://www.nabble.com/Help-on-creating-a-sequence-of-vectors-tp24144347p24144347.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help on creating a sequence of vectors
megh: I want to create a number of vectors like : vec1 - rnorm(1) vec2 - rnorm(2) vec3 - rnorm(3) and so on... Maybe try the assign() function. Something like: for (i in 1:10) assign ( paste ( vec , i , sep = ) , rnorm(i) ) Kind regards, Nikos __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Customize axis labels in xyplot
I have attached 2 files sample.csv and sample.r to illustrate the problem. Thank you for considering. -- View this message in context: http://www.nabble.com/Customize-axis-labels-in-xyplot-tp24126788p24143742.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] xyplot: subscripts, groups and subset
Hi, I'm running the following code to produce lattice plots of microfibril angle versus ring number in Scots pine. There are 12 trees and 5 sample positions (Position) in each tree: xyplot(MFA ~ RN | Tree, data = MFA.data, groups = Position, subscripts=TRUE, auto.key=list(space = top, points = FALSE, lines = TRUE, reverse.rows=TRUE, title=Disc Position (1=1.3m), cex=0.5, text=paste(Disc:, levels(MFA.data$Position))), xlab = Ring Number, ylab = MFA, panel = function(x, y, subscripts, groups) { panel.grid(h=-1, v= 2) panel.xyplot(x, y, subscripts=subscripts, groups=groups, type=a) panel.superpose(x=x, y=y, groups=Position, subscripts=subscripts,lty=8, cex=0.25) }) But it is giving me the error message: }) Error: unexpected '}' in} This code has worked for me previously but now will not output the plots. Does anyone know what's going on here? Thanks in advance, Dave Dave Auty MSc PhD Student Forest Research NRS Roslin Midlothian EH25 9SY Tel: +44(0)131 445 6936 Fax: +44(0)131 445 5124 Mob: +44(0)7896 685895 www.forestresearch.gov.uk/timberproperties + The Forestry Commission's computer systems may be monitored and communications carried out on them recorded, to secure the effective operation of the system and for other lawful purposes. + The original of this email was scanned for viruses by the Government Secure Intranet (GSi) virus scanning service supplied exclusively by Cable Wireless in partnership with MessageLabs. On leaving the GSi this email was certified virus-free [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] New line operator in mtext
Dear R Users, I'm finding that when I execute the following bit of code, that the new line argument is actually displayed as text in the graphics device. How do I avoid this happening? mtext(side=2, line=5.5, expression(paste(Monthly Summed Runoff (mm/month), /n, and Summed Monthly Precipitation (mm x ,10^2,/month I suspect that I've done, or omitted, something fairly obvious, but as yet cannot see it! Thanks for your help, Steve __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] QQplots of probability vector data
I'm trying to determine if a set of data is normal from a qq plot but seem to be having a bit of difficulty. I have a file of the following form 9 36 3 37 6 38 7 39 . where the left column is the frequency of the number in the right column. I've found the probabilities of each number and put it in a file of the form 36 .0009 37 .0003 38 .0006 39 .0007 where the first column is the number and the second is the probability of that number. Now what I've done so far is as follows: null=read.table(probfile.txt,header=FALSE)#Prob file is the 2nd file where $V1 is the number and $V2 isthe probability and I want to do x=qnorm(null$V2, ... ) but I don't know how to get the mean and sd from those 2 files. When I looked up mean() and sd(), I had to give a vector of numbers. As I only have the number of occurances and their probability, I can't really get a vector of this data unless I make another file that has the frequency of the numbers written out -- which is intractable given I have 1000 data points. I mean I could write a quick program to get the mean (which I did in perl) but I'd rather not do that for the sd as I am sure there is an easier way to do this. Either way, once I have the qnorm stored in x, all I have to do is qqnorm(x) right? -- View this message in context: http://www.nabble.com/QQplots-of-probability-vector-data-tp24145321p24145321.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] New line operator in mtext
Steve Murray wrote: Dear R Users, I'm finding that when I execute the following bit of code, that the new line argument is actually displayed as text in the graphics device. How do I avoid this happening? mtext(side=2, line=5.5, expression(paste(Monthly Summed Runoff (mm/month), /n, and Summed Monthly Precipitation (mm x ,10^2,/month use \n rather than /n Uwe Ligges I suspect that I've done, or omitted, something fairly obvious, but as yet cannot see it! Thanks for your help, Steve __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Testing if all elements are equal in a vector/matrix
Hi jim holtman jholt...@gmail.com napsal dne 19.06.2009 15:06:55: I have wondered about this way of testing for equality: x - c(1,0,3,0) x[1] * length(x) == sum(x) [1] TRUE x - rep(1,4) x[1] * length(x) == sum(x) [1] TRUE This would seem to indicate that both vectors contain the same values, but not necessarily true. My solution has some flaws however if user is be reasonably sure that such condition is improbable it could be used. The problem was stated as this I just want to check whether all the elements of a vector are same. My vector has one million elements and it is highly likely that there are distinct elements in the first few itself. For example: x = c(1,2,rep(1,10)) and for that kind of vectors it could be quite safe. Although I would prefer something like this. fff2-function(x) length(unique(x))==1 system.time(print(fff2(x))) [1] FALSE user system elapsed 0.390.080.47 Regards Petr On Fri, Jun 19, 2009 at 8:18 AM, Petr PIKAL petr.pi...@precheza.cz wrote: Hi utkarshsinghal utkarsh.sing...@global-analytics.com napsal dne 17.06.2009 15:29:34: I will wait for the next version-2.9.1 and presently using Petr's suggestion, i.e., (x[1]*length(x))==sum(x) which significantly reduced the run time. The problem is now there might be only small differences ,say, of the order of 10^-10 which I want to ignore. So I used: isTRUE(all.equal((x[1]*length(x)),sum(x))) as suggested in the documentation of all.equal. But this again increased the run time to five times. 1) Is there any faster way of doing the same? Maybe (not tested) (x[1]*length(x))==round(sum(x),10) Petr 2) Will the function anyDuplicated treat almost equal values as duplicated or not? Actually I need both the options. Regards Utkarsh Prof Brian Ripley wrote: On Tue, 16 Jun 2009, Prof Brian Ripley wrote: On Tue, 16 Jun 2009, jim holtman wrote: I think the only way that you are going to get it to stop on the first mismatch is to write your own function in C if you are concerned about the time. Matching on character vectors will be even more costly since it is having to loop to check the equality of each character in each element. This is one of the places it might pay to convert to factors and then the comparison only uses the integer values assigned to the factors. Not so in a recent R: comparison of character vectors is now done by comparing pointers in the first instance so (at least on a 32-bit platform) is as fast as comparing integers. And on x86_64 Linux: x - as.character(c(1,2,rep(1,1000))) system.time(print(all(x[1] == x))) [1] FALSE user system elapsed 0.123 0.019 0.142 system.time(xx - as.factor(x)) user system elapsed 9.874 0.284 10.159 system.time(print(all(xx[1] == xx))) [1] FALSE user system elapsed 0.511 0.145 0.656 Recent pre-release versions of R (e.g. 2.9.1 beta) allow system.time(anyDuplicated(x)) user system elapsed 0.034 0.078 0.113 system.time(anyDuplicated(xx)) user system elapsed 0.037 0.076 0.113 I'm sorry, a line got reverted here: I had edited this to say 'which is a C-level speedup of the sort the original poster seemed to be looking for' On Tue, Jun 16, 2009 at 8:31 AM, utkarshsinghal utkarsh.sing...@global-analytics.com wrote: Hi Jim, What you are saying is correct. Although, my computer might not have same speed and I am getting the following for 10M entries: user system elapsed 0.559 0.038 0.607 Moreover, in the case of character vectors, it gets more than double. In my modeling, which is already highly time consuming, I need to do check this for few thousand vectors and the entries can easily be 10M in each vector. So I am just looking for any possibilities of time saving. I am pretty sure that whenever elements are not all equal, it can be concluded from any few entries (most of the times). It will be worth if I can find a way which stops checking further the moment it find two distinct elements. Regards Utkarsh jim holtman wrote: Just check that the first (or any other element) is equal to all the rest: x = c(1,2,rep(1,1000)) # 10,000,000 system.time(print(all(x[1] == x))) [1] FALSE user system elapsed 0.180.000.19 This was for 10M entries. On Tue, Jun 16, 2009 at 7:42 AM, utkarshsinghal utkarsh.sing...@global-analytics.com wrote: Hi All, There are several replies to the question below, but I think there must exist a better way of doing so. I just want to check whether all the elements of a vector are same. My vector has one million elements and it is highly likely that there are distinct elements in the first few itself. For example: x = c(1,2,rep(1,10)) I want the answer as FALSE, which
Re: [R] New line operator in mtext
Thanks for the response, however, whilst this eliminates the 'new line' character from appearing, it doesn't actually cause a new line to be printed! Instead, I have the last few characters of the first character string overlapping with the first few characters of the next. How best should I change the code to execute a new line? Thanks again! _ [[elided Hotmail spam]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help on creating a sequence of vectors
Consider the use of a 'list' so you don't clutter up the global enviroment with a lot of objects that you might forget about: vec - lapply(1:10, rnorm) vec [[1]] [1] -0.6264538 [[2]] [1] 0.1836433 -0.8356286 [[3]] [1] 1.5952808 0.3295078 -0.8204684 [[4]] [1] 0.4874291 0.7383247 0.5757814 -0.3053884 [[5]] [1] 1.5117812 0.3898432 -0.6212406 -2.2146999 1.1249309 [[6]] [1] -0.04493361 -0.01619026 0.94383621 0.82122120 0.59390132 0.91897737 [[7]] [1] 0.78213630 0.07456498 -1.98935170 0.61982575 -0.05612874 -0.15579551 -1.47075238 [[8]] [1] -0.47815006 0.41794156 1.35867955 -0.10278773 0.38767161 -0.05380504 -1.37705956 -0.41499456 [[9]] [1] -0.3942900 -0.0593134 1.1000254 0.7631757 -0.1645236 -0.2533617 0.6969634 0.5566632 -0.6887557 [[10]] [1] -0.7074952 0.3645820 0.7685329 -0.1123462 0.8811077 0.3981059 -0.6120264 0.3411197 -1.1293631 1.4330237 On Mon, Jun 22, 2009 at 5:38 AM, megh megh700...@yahoo.com wrote: I want to create a number of vectors like : vec1 - rnorm(1) vec2 - rnorm(2) vec3 - rnorm(3) and so on... Here I tried following : for (i in 1:10) paste(vec, i, sep=) - rnorm(i) However obviously that is not working. Here vectors I need to be seperated i.e I do not want to create a list. How to modify above code? -- View this message in context: http://www.nabble.com/Help-on-creating-a-sequence-of-vectors-tp24144347p24144347.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Roxygen vs Sweave for S4 documentation
TobiasV == Tobias Verbeke tobias.verb...@openanalytics.be on Sun, 21 Jun 2009 08:25:07 +0200 writes: TobiasV Hi Ken, I have been using R for a while. Recently, I have begun converting my package into S4 classes. I was previously using Rdoc for documentation. Now, I am looking to use the best tool for S4 documentation. It seems that the best choices for me are Roxygen and Sweave (I am fine with tex). Are there any users of Roxygen or Sweave who can comment on the strengths or weaknesses of one or othe other? Thanks in advance. TobiasV For the moment proper documentation of S4 classes (with a @slot tag TobiasV e.g.) is not implemented yet, how did you define proper here? I know that the result of promptClass() may not always be perfect. As most things are not perfect, I would not quickly call this improper ... Or is using *.Rd not proper for you, as it does not have all of code + docs in one file? Regards, Martin TobiasV but my secret hope is that this will TobiasV be implemented before Peter and Manuel (in cc) will present Roxygen TobiasV at DSC2009. Maybe they have further comments ? TobiasV Kind regards, TobiasV Tobias __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Roxygen vs Sweave for S4 documentation
On Mon, Jun 22, 2009 at 7:12 AM, Martin Maechlermaech...@stat.math.ethz.ch wrote: TobiasV == Tobias Verbeke tobias.verb...@openanalytics.be on Sun, 21 Jun 2009 08:25:07 +0200 writes: TobiasV Hi Ken, I have been using R for a while. Recently, I have begun converting my package into S4 classes. I was previously using Rdoc for documentation. Now, I am looking to use the best tool for S4 documentation. It seems that the best choices for me are Roxygen and Sweave (I am fine with tex). Are there any users of Roxygen or Sweave who can comment on the strengths or weaknesses of one or othe other? Thanks in advance. TobiasV For the moment proper documentation of S4 classes (with a @slot tag TobiasV e.g.) is not implemented yet, how did you define proper here? I know that the result of promptClass() may not always be perfect. As most things are not perfect, I would not quickly call this improper ... Or is using *.Rd not proper for you, as it does not have all of code + docs in one file? I think Tobias was referring to Roxygen support for S4 classes, not the existence of .Rd tags. Regards, Martin TobiasV but my secret hope is that this will TobiasV be implemented before Peter and Manuel (in cc) will present Roxygen TobiasV at DSC2009. Maybe they have further comments ? TobiasV Kind regards, TobiasV Tobias __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] What has happened to the R-Help Google Groups Archive? Alternative?
Greetings, I usually read this mailing list through google groups (http://groups.google.com/group/r-help-archive), but when I opened the webpage this morning it said: The group named r-help-archive has been removed because it violated Google's Terms Of Service. Is there an alternative website which uses a similar structure to google groups? I had a quick browse on the R Wiki (http://wiki.r-project.org/rwiki/doku.php?id=links:links) but didn't see a page with this sort of info. Thank you kindly, Tony Breyal __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] New line operator in mtext
On 6/22/2009 7:58 AM, Steve Murray wrote: Thanks for the response, however, whilst this eliminates the 'new line' character from appearing, it doesn't actually cause a new line to be printed! Instead, I have the last few characters of the first character string overlapping with the first few characters of the next. How best should I change the code to execute a new line? This works for me. mtext(side=2, line=5.5, expression(paste(Monthly Summed Runoff (mm/month)\nand Summed Monthly Precipitation (mm x ,10^2,/month Note that the newline is embedded in the string that you want to split, it's not a separate element of the label. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] What has happened to the R-Help Google Groups Archive? Alternative?
On 6/22/2009 7:23 AM, Tony Breyal wrote: Greetings, I usually read this mailing list through google groups (http://groups.google.com/group/r-help-archive), but when I opened the webpage this morning it said: The group named r-help-archive has been removed because it violated Google's Terms Of Service. Is there an alternative website which uses a similar structure to google groups? I had a quick browse on the R Wiki (http://wiki.r-project.org/rwiki/doku.php?id=links:links) but didn't see a page with this sort of info. gmane.org has an archive at http://dir.gmane.org/gmane.comp.lang.r.general, but I don't know if it is similar enough for you. You might want to contact Google to ask what terms of service were violated. Duncan Murdoch Thank you kindly, Tony Breyal __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Convert NA to '.'
Dear David, Try this: x - c(A, B, C, NA) x[is.na(x)] - . x HTH, Jorge On Mon, Jun 22, 2009 at 2:11 AM, David_D david.da...@gmail.com wrote: Dear R-users, For reporting purpose (using Sweave and LaTeX), I am creating complex tables with the cat function such as x-c(A, B, C, NA) cat(x, '\n') A B C NA For convenience, I would like to change all my NA value to something else like '.' (as in SAS for example). Is there a global option which allows this change? Or should I change all my code to work with the print function and the na.print argument? Best regards, David -- View this message in context: http://www.nabble.com/Convert-NA-to-%27.%27-tp24142187p24142187.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to make try to catch warnings in logistic glm
Dear list, From an earlier post I got the impression that one could promote warnings from a glm to errors (presumably by putting options(warn=1)?), then try() would flag them as errors. I’ve spent half the day trying to do this, but no luck. Do you have an explicit solution? My problems is that I am trying to figure out during what conditions one may find 5 significant parameters in a logistic regression with only 21 bad and 28 outcomes (this appears in a published article) and I would like to have my.glm-try(glm(outcome~ald + fat+ hstop + btun + time, data=DF, family=binomial)) to indicate if warnings (convergence / probabilities equal to 0/1) occurs (see attempt to code below if my explanation is terse). Best regards, Fredrik Nilsson -- Fredrik Nilsson, PhD Competence Centre for Clinical Research Lund University Hospital Warnings I'd like to catch: Warning in glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart, : algorithm did not converge Warning in glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart, : fitted probabilities numerically 0 or 1 occurred script ngood-28 nbad-21 resmat-DFr j-0 nsim-100 res-matrix(NA, nrow=nsim, ncol=6) ow - options(warn) options(warn= 1) outcome-rep(c(G,B), times=c(ngood, nbad)) outcome-as.factor(outcome) agood-rep(c(0,1), times=c(27,1)) abad-rep(c(0,1), times=c(14,7)) thgood-rep(c(0,1), times=c(24,4)) thbad-rep(c(0,1), times=c(12,9)) for (i in 1:nsim) { tgood-sample(agood, ngood) tbad-sample(abad, nbad) hstop-c(tgood, tbad) hstop-as.factor(hstop) aldgood-rnorm(ngood, mean=54, sd=8/0.675) aldbad-rnorm(nbad, mean=64, sd=8/0.675) ald-c(aldgood, aldbad) fatgood-exp(rnorm(ngood, mean=log(23.9)-0.06^2/2,sd=0.06)) fatbad-exp(rnorm(nbad, mean=log(27.4)-0.09^2/2, sd=0.09)) fat-c(fatgood, fatbad) thgood-sample(thgood, ngood) thbad-sample(thbad, nbad) btun-c(thgood, thbad) btun-as.factor(btun) timegood-exp(rnorm(ngood, mean=log(443)-0.5^2/2, sd=0.5)) timebad- exp(rnorm(nbad, mean=log(555)-0.6^2/2, sd=0.6)) time-c(timegood, timebad) DF-data.frame(outcome, ald, fat, hstop, btun, time) my.glm-try(glm(outcome~ald + fat+ hstop + btun + time, data=DF, family=binomial)) test-!is.null(attr(my.glm, try-error)) if(test | !my.glm$converged) { res[i, 1:6]-NA } else { W-summary(my.glm)$coefficients lres-as.numeric(W[,4]=0.05) if (sum(lres)==5) { j-j+1 assign(paste(resmat, j, sep=., collapse=), DF) } res[i,1:6]-lres } } options(ow) From: Dieter Menne dieter.menne_at_menne-biomed.de Date: Fri, 14 Dec 2007 20:49:20 + (UTC) Caroline Paulsen cpaulsen at u.washington.edu writes: I'm attempting to run 250 permutations of a negative binomial GLM model for data on fish counts. Many of the models are fit appropriately, but some issue warnings such as convergence not reached or step size truncated due to divergence. I'd like to figure out a way to flag the models that have warnings and output them into either a table or into R console. Then I could evaluate the problems associated with these models and know not choose them as the best models for fitting the data. You could promote the options to errors (?warning) and use try() Dieter __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] xyplot: subscripts, groups and subset
Auty, Dave dave.auty at forestry.gsi.gov.uk writes: I'm running the following code to produce lattice plots of microfibril angle versus ring number in Scots pine. There are 12 trees and 5 sample positions (Position) in each tree: xyplot(MFA ~ RN | Tree, data = MFA.data, groups = Position, subscripts=TRUE, subscripts=subscripts,lty=8, cex=0.25) }) But it is giving me the error message: }) Error: unexpected '}' in} The code syntactically works for me, but you have posted it out of context, so problem might be BEFORE the posted sample. It is also possible that some unprinting code might have interfered, which is difficult to find. So if there is really no problem above the xyplot(), try to explicitly retype the whole function call. Dieter __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Roxygen vs Sweave for S4 documentation
On Mon, Jun 22, 2009 at 2:18 PM, Douglas Batesba...@stat.wisc.edu wrote: On Mon, Jun 22, 2009 at 7:12 AM, Martin Maechlermaech...@stat.math.ethz.ch wrote: TobiasV == Tobias Verbeke tobias.verb...@openanalytics.be on Sun, 21 Jun 2009 08:25:07 +0200 writes: TobiasV Hi Ken, I have been using R for a while. Recently, I have begun converting my package into S4 classes. I was previously using Rdoc for documentation. Now, I am looking to use the best tool for S4 documentation. It seems that the best choices for me are Roxygen and Sweave (I am fine with tex). Are there any users of Roxygen or Sweave who can comment on the strengths or weaknesses of one or othe other? Thanks in advance. TobiasV For the moment proper documentation of S4 classes (with a @slot tag TobiasV e.g.) is not implemented yet, how did you define proper here? I know that the result of promptClass() may not always be perfect. As most things are not perfect, I would not quickly call this improper ... Or is using *.Rd not proper for you, as it does not have all of code + docs in one file? I think Tobias was referring to Roxygen support for S4 classes, not the existence of .Rd tags. Yes, I was just referring to the Roxygen support and as a matter of fact I recommended Roxygen to mimick the nice job done by promptClass a while ago https://lists.r-forge.r-project.org/pipermail/roxygen-devel/2009-February/23.html No intent to criticize nor current Rd tags nor Rd as a documentation system, although having code + docs in one file (as in Roxygen) seems to be convenient to some. Best, Tobias __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to make try to catch warnings in logistic glm
Fredrik Nilsson-5 wrote: From an earlier post I got the impression that one could promote warnings from a glm to errors (presumably by putting options(warn=1)?), then try() would flag them as errors. I’ve spent half the day trying to do this, but no luck. Do you have an explicit solution? You were rather close... 1) use warn=2 2) make sure to get the | and || right (this is nasty, I never learn it) Dieter options(warn=2) # my.glm-try(glm(outcome~ald + fat+ hstop + btun + time, data=DF, family=binomial),FALSE) failed - inherits(my.glm,try-error) if(failed || !my.glm$converged) { res[i, 1:6]-NA } else { -- View this message in context: http://www.nabble.com/How-to-make-try-to-catch-warnings-in-logistic-glm-tp24146726p24147228.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Lattice group colors?
Dear list, I have been struggling to find how I would go about changing the bakground colors of groups in a lattice barchart in a way so that the auto.key generated also does the right thing and pick it up for the key. I have used the col argument (which I guess is sent to par()) in a way so that the colors are like I would want them, but now I guess I need to know which part part of the theme I need to change in order to make this change permanent for all the plots, and all the keys. I am of course thankful for all the help I can get. (Also, how does one know these things about the theme variables? Is there some documentation somewhere?) /Fredrik -- Life is like a trumpet - if you don't put anything into it, you don't get anything out of it. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Lattice group colors?
Look at show.settings() and str(trellis.par.get()). This will show you what the default settings are. The group colors are set by the superpose.* elements (e.g. superpose.line is for group lines). To set them, I usually create a list and pass it to par.settings. For example, my.theme - list(superpose.line = list(col = c(darkred, darkblue), lty = 1:2)) xyplot(y ~ x, data = mydata, groups = g, par.settings = my.theme, auto.key = list(points = FALSE, lines = TRUE)) HTH, --sundar On Mon, Jun 22, 2009 at 6:30 AM, Fredrik Karlssondargo...@gmail.com wrote: Dear list, I have been struggling to find how I would go about changing the bakground colors of groups in a lattice barchart in a way so that the auto.key generated also does the right thing and pick it up for the key. I have used the col argument (which I guess is sent to par()) in a way so that the colors are like I would want them, but now I guess I need to know which part part of the theme I need to change in order to make this change permanent for all the plots, and all the keys. I am of course thankful for all the help I can get. (Also, how does one know these things about the theme variables? Is there some documentation somewhere?) /Fredrik -- Life is like a trumpet - if you don't put anything into it, you don't get anything out of it. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] The gradient of a multivariate normal density with respect to its parameters
Does anybody know of a function that implements the derivative (gradient) of the multivariate normal density with respect to the *parameters*? It’s easy enough to implement myself, but I’d like to avoid reinventing the wheel (with some bugs) if possible. Here’s a simple example of the result I’d like, using numerical differentiation: library(mvtnorm) library(numDeriv) f=function(pars, xx, yy) { mu=pars[1:2] sig1=pars[3] sig2=pars[4] rho=pars[5] sig=matrix(c(sig1^2,rho*sig1*sig2,rho*sig1*sig2,sig2^2),2) dmvnorm(cbind(x,y),mu,sig) } mu1=1 mu2=2 sig1=3 sig2=4 rho=.5 x=2 # or a x vector y=3 # or a y vector jacobian(f,c(mu1,mu2,sig1,sig2,rho),xx=x,yy=y) # (Can replace ‘jacobian’ with ‘grad’ if x and y have length 1.) -- Karl Ove Hufthammer __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Eigen value calculation
Hi all, Eigen vectors obtained from the function eigen() are ortho-normal? I see the documentation however there is no formal mention on that. If no, then is there any direct function to do the same? -- View this message in context: http://www.nabble.com/Eigen-value-calculation-tp24147807p24147807.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] problem with checking wether file is present or not
What error is it giving? Please include the exact error. What happens if you do this: if (file.exists(findings)) cat('File',findings,'exists\n') else cat('File',findings,not found\n') Your description suggests that you are using 'if' expression inside a loop. If that is the case, try if ( !file.exists(findings) ) next inside your loop, and near the beginning of the loop. This should immediately skip to the next file, if the file is missing (and if your loop is written they way I assume it is) -Don At 11:51 AM +0530 6/22/09, venkata kirankumar wrote: Hi all, I have a problem with checking File is present in the directory or not like I have a sequence of files in one folder I have to take each file in order and have to caliculate on those files data but in order some files are missing for that I have to check and load those files for that I am using condition like if(file.exists(findings)==TRUE){} its giving results for files which are present in that folder but for files which are not present in that folder its giving error but it have to skip and run for remaining files can any one suggest any other way to make it work for all those files thanks in advance kiran [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https:// stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http:// www. R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- -- Don MacQueen Environmental Protection Department Lawrence Livermore National Laboratory Livermore, CA, USA 925-423-1062 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] New line operator in mtext
Thanks again for a very useful comment. That seems to have separated the text and put it onto separate lines. However, whilst this results in the text being centralised in relation to the axis, it means that the lower line is left-justified in relation to the upper line, rather than being centralised. How do I go about centralising the lower line in relation to the upper line, whilst keeping it central to the axis? Many thanks again, Steve __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] The gradient of a multivariate normal density with respect to its parameters
Hi Karl, I am not aware of any. May I ask what your purpose is? You don't really need this if you are going to use it in optimization, since most optimizers use a simple finite-difference approximation if you don't provide the gradient. Using the numerical approximation from numDeriv will be quite time-consuming in an optimization routine, since numDeriv uses a high-order Richardosn extrapolation to compute an accurate approximation of the gradient. Regardless of your purpose, there is a small bug in your function. You should change `dmvnorm(cbind(x,y),mu,sig)' to `dmvnorm(cbind(xx,yy),mu,sig)'. Also, taking the derivative of log-density might be better. Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: rvarad...@jhmi.edu Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h tml -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Karl Ove Hufthammer Sent: Monday, June 22, 2009 10:10 AM To: r-h...@stat.math.ethz.ch Subject: [R] The gradient of a multivariate normal density with respect to its parameters Does anybody know of a function that implements the derivative (gradient) of the multivariate normal density with respect to the *parameters*? It's easy enough to implement myself, but I'd like to avoid reinventing the wheel (with some bugs) if possible. Here's a simple example of the result I'd like, using numerical differentiation: library(mvtnorm) library(numDeriv) f=function(pars, xx, yy) { mu=pars[1:2] sig1=pars[3] sig2=pars[4] rho=pars[5] sig=matrix(c(sig1^2,rho*sig1*sig2,rho*sig1*sig2,sig2^2),2) dmvnorm(cbind(x,y),mu,sig) } mu1=1 mu2=2 sig1=3 sig2=4 rho=.5 x=2 # or a x vector y=3 # or a y vector jacobian(f,c(mu1,mu2,sig1,sig2,rho),xx=x,yy=y) # (Can replace 'jacobian' with 'grad' if x and y have length 1.) -- Karl Ove Hufthammer __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error when using step
On Jun 22, 2009, at 4:08 AM, Dieter Menne wrote: Chris Friedl cfriedalek at gmail.com writes: I have two questions about the built-in function step. Ultimately I want to apply a lm fitting and subsequent step procedure to thousands of data sets groups by a factor defined as a unique ID. Q1. The code below creates a data.frame comprising three marginally noisy surfaces. The code below fails when I use step in a function but summary seems to show the model fits are legitimate. Any ideas on what I'm doing wrong? ... Well designed example code omitted function(x){lm(model, data = x)}) lapply(fits, function(x){summary(x)}) # FAIL lapply(fits, function(x){step(x)}) Error in as.data.frame.default(data) : cannot coerce class lm into a data.frame Looks like an environment problem. I could not find a workaround quickly, Perhaps: by(data2, data2$grp, function(x) step(lm(model, data=x))) but you might have a look at http://finzi.psych.upenn.edu/R/Rhelp02a/archive/16599.html We call it Ripley's Game here, because variants of it can help you quite often. Dieter David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] New line operator in mtext
On 6/22/2009 10:30 AM, Steve Murray wrote: Thanks again for a very useful comment. That seems to have separated the text and put it onto separate lines. However, whilst this results in the text being centralised in relation to the axis, it means that the lower line is left-justified in relation to the upper line, rather than being centralised. How do I go about centralising the lower line in relation to the upper line, whilst keeping it central to the axis? I think two calls to mtext, one for each line, should do it. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] GAM function with interaction
Hello R Users, I have a question regarding fitting a model with GAM{mgcv}. I have data from several predictor (X) variables I wish to use to develop a model to predict one Y variable. I am working with ecological data, so have data collected many times (about 20) over the course of two years. Plotting data independently for each date there appears to be relationships between Y (fish density) and at least several X variables (temperature and light). However, the actual value of X variables (e.g., temperature) changes with date/season. In other words, fish distribution is likely related to temperature, but available temperatures change through the season. Thus, when data from all dates are combined to create a model from the entire dataset, I think I need to include some type of metric/variable/interaction term to account for this date relationship. I have written the following code using a by term: Distribution.s.temp.logwm2.deltaT-gam(yoyras~s(temp,by=datecode)+s(logwm2,by=datecode)+s(DeltaT,by=datecode),data=AllData) However, I am not convinced this is the correct way to account for this relationship. What do you think? Is there another way to include this in my model? Maybe I should simply include date (datecode) as another term in the model? I also believe there may be an interaction between temperature and light (logwm2), and based on what I have read the by method may be the best way to include this. Correct? Thank you for any input, tips, or advice you may be able to offer. I am new to R, so especially grateful! Thanks again, Paul Simonin (PhD student) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] what's the R command to make the following?
Hi, I have a simple question, suppose I have the date 05/16/2008, what would be the command to get the month, day and year? Thanks, -Jack [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] what's the R command to make the following?
Couple of choices depending on what you want to do with the data: x - as.POSIXct(05/16/2008, format=%m/%d/%Y) # if you want to use the date format(x, %d) # day [1] 16 format(x, %m) # month [1] 05 format(x, %Y) # year [1] 2008 # or y - strsplit(05/16/2008, '/') # to just get the characters y [[1]] [1] 05 16 2008 On Mon, Jun 22, 2009 at 11:36 AM, Jack Luo jluo.rh...@gmail.com wrote: Hi, I have a simple question, suppose I have the date 05/16/2008, what would be the command to get the month, day and year? Thanks, -Jack [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] standard error and p-value for the estimated parameter in AR model
Dear All, I used an AR(1) model to explain the process of the stationary residual and have used an 'ar' command in R. From the results, i tried to extract the standard error and p-value for the estimated parameter, but unfortunately, i never find any way to extract it from the output. What i did was, i assigned the residuals into the 'residual' object in R and used an 'ar' function as below. residual - residuals ar(residual, aic = TRUE, method = mle, order.max = 1) Could someone help me to extract the stadard error and the p-value for the estimated parameter, please? Thank you Fir [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fwd: question about using _apply and/or aggregate functions
Resending, as am not sure about the original To: address. Sorry for any redundancy. - Cliff -- Forwarded message -- From: Clifford Long gnolff...@gmail.com Date: Mon, Jun 22, 2009 at 11:04 AM Subject: question about using _apply and/or aggregate functions To: r-h...@lists.r-project.org Hi R-list, I'll apologize in advance for (1) the wordiness of my note (not sure how to avoid it) and (2) any deficiencies on my part that lead to my difficulties. I have an application with several stages that is meant to simulate and explore different scenarios with respect to product sales (in units sold per month). My session info is at the bottom of this note. The steps include (1) an initial linear regression model, (2) building an ARIMA model, (3) creating 12 months of simulated 'real' data - for various values of projected changes in slope from the linear regression - from the ARIMA model using arima.sim with subsequent undifferencing and appending to historical data, (3) one-step-ahead forecasting for 12 months using the 'fixed' arima model and simulated data, (4) taking the residuals from the forecasting (simulated minus forecast for each of the 12 months) and applying QC charting, and (5) counting the number (if any) of runs-rules violations detected. The simulation-aspect calculates 100 simulations for each of 50 values of slope. All of this seems to work fine (although I'm sure that the coding could be improved through functions, vectorization, etc. ... ). However, the problem I'm having is at the end where I had hoped that things would be easier. I want to summarize and graph the probability of detecting a runs-rule violation vs. the amount of the shift in slope (of logunit sales). The output data array passed from the qcc section at the end includes: - the adjustment made to the slope (a multiplier) - the actual value of the slope - the iteration number of the simulation loop (within each value of slope) - the count of QC charting limits violations - the count of QC charting runs rules violations My code is in the attached file (generic_code.R) and my initial sales data needed to prime the simulation is in the other attached file (generic_data.csv). The relevant section of my code is at the bottom of the .R file after the end of the outer loop. I've tried to provide meaningful comments. I've read the help files for _apply, aggregate, arrays and data types, and have also consulted with several texts (Maindonald and Braun; Spector; Venebles and Ripley for S-plus). Somehow I still don't get it. My attempts usually result in a message like the following: agg.result = aggregate(simarray.part[,3], by=list4, FUN = mean) Error in FUN(X[[1L]], ...) : arguments must have same length But when I check the length of the arguments, they appear to match. (??) length(simarray.part[,3]) [1] 5000 length(simarray.part[,4]) [1] 5000 length(list4) [1] 5000 I would have rather passed along a subset of the simulation/loop output dataset, but was unsure how to save it to preserve whatever properties that I may have missed that are causing my difficulties. If anyone still wants to help at this point, I believe that you'll need to load the .csv data and run the simulation (maybe reducing the number of iterations). Many thanks to anyone who can shed some light on my difficulties (whether self-induced or otherwise). Cliff I'm using a pre-compiled binary version of R for Windows. Session info: sessionInfo() R version 2.9.0 (2009-04-17) i386-pc-mingw32 locale: LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] qcc_1.3 forecast_1.24 tseries_0.10-18 zoo_1.5-5 [5] quadprog_1.4-11 loaded via a namespace (and not attached): [1] grid_2.9.0 lattice_0.17-22 Sys.getlocale() [1] LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] question about using _apply and/or aggregate functions
On Jun 22, 2009, at 12:04 PM, Clifford Long wrote: Hi R-list, I'll apologize in advance for (1) the wordiness of my note (not sure how to avoid it) and (2) any deficiencies on my part that lead to my difficulties. I have an application with several stages that is meant to simulate and explore different scenarios with respect to product sales (in units sold per month). My session info is at the bottom of this note. The steps include (1) an initial linear regression model, (2) building an ARIMA model, (3) creating 12 months of simulated 'real' data - for various values of projected changes in slope from the linear regression - from the ARIMA model using arima.sim with subsequent undifferencing and appending to historical data, (3) one-step-ahead forecasting for 12 months using the 'fixed' arima model and simulated data, (4) taking the residuals from the forecasting (simulated minus forecast for each of the 12 months) and applying QC charting, and (5) counting the number (if any) of runs-rules violations detected. The simulation-aspect calculates 100 simulations for each of 50 values of slope. All of this seems to work fine (although I'm sure that the coding could be improved through functions, vectorization, etc. ... ). However, the problem I'm having is at the end where I had hoped that things would be easier. I want to summarize and graph the probability of detecting a runs-rule violation vs. the amount of the shift in slope (of logunit sales). The output data array passed from the qcc section at the end includes: - the adjustment made to the slope (a multiplier) - the actual value of the slope - the iteration number of the simulation loop (within each value of slope) - the count of QC charting limits violations - the count of QC charting runs rules violations My code is in the attached file (generic_code.R) and my initial sales data needed to prime the simulation is in the other attached file (generic_data.csv). The relevant section of my code is at the bottom of the .R file after the end of the outer loop. I've tried to provide meaningful comments. I've read the help files for _apply, aggregate, arrays and data types, and have also consulted with several texts (Maindonald and Braun; Spector; Venebles and Ripley for S-plus). Somehow I still don't get it. My attempts usually result in a message like the following: agg.result = aggregate(simarray.part[,3], by=list4, FUN = mean) Error in FUN(X[[1L]], ...) : arguments must have same length I cannot comment on the overall strategy, but wondered if this minor mod to the code might succeed; agg.result = aggregate(simarray.part[,3], by=list(list4), FUN = mean) My personal experience with aggregate() is not positive. I generally end up turning to tapply() (which is at the heart of aggregate() anyway) probably because I forget to wrap the second argument in a list. Slow learner, I guess. But when I check the length of the arguments, they appear to match. (??) length(simarray.part[,3]) [1] 5000 length(simarray.part[,4]) [1] 5000 length(list4) [1] 5000 I would have rather passed along a subset of the simulation/loop output dataset, but was unsure how to save it to preserve whatever properties that I may have missed that are causing my difficulties. If anyone still wants to help at this point, I believe that you'll need to load the .csv data and run the simulation (maybe reducing the number of iterations). Many thanks to anyone who can shed some light on my difficulties (whether self-induced or otherwise). Cliff I'm using a pre-compiled binary version of R for Windows. Session info: sessionInfo() R version 2.9.0 (2009-04-17) i386-pc-mingw32 locale: LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] qcc_1.3 forecast_1.24 tseries_0.10-18 zoo_1.5-5 [5] quadprog_1.4-11 loaded via a namespace (and not attached): [1] grid_2.9.0 lattice_0.17-22 Sys.getlocale() [1] LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Roxygen vs Sweave for S4 documentation
On 6/22/2009 9:23 AM, Tobias Verbeke wrote: On Mon, Jun 22, 2009 at 2:18 PM, Douglas Batesba...@stat.wisc.edu wrote: On Mon, Jun 22, 2009 at 7:12 AM, Martin Maechlermaech...@stat.math.ethz.ch wrote: TobiasV == Tobias Verbeke tobias.verb...@openanalytics.be on Sun, 21 Jun 2009 08:25:07 +0200 writes: TobiasV Hi Ken, I have been using R for a while. Recently, I have begun converting my package into S4 classes. I was previously using Rdoc for documentation. Now, I am looking to use the best tool for S4 documentation. It seems that the best choices for me are Roxygen and Sweave (I am fine with tex). Are there any users of Roxygen or Sweave who can comment on the strengths or weaknesses of one or othe other? Thanks in advance. TobiasV For the moment proper documentation of S4 classes (with a @slot tag TobiasV e.g.) is not implemented yet, how did you define proper here? I know that the result of promptClass() may not always be perfect. As most things are not perfect, I would not quickly call this improper ... Or is using *.Rd not proper for you, as it does not have all of code + docs in one file? I think Tobias was referring to Roxygen support for S4 classes, not the existence of .Rd tags. Yes, I was just referring to the Roxygen support and as a matter of fact I recommended Roxygen to mimick the nice job done by promptClass a while ago https://lists.r-forge.r-project.org/pipermail/roxygen-devel/2009-February/23.html No intent to criticize nor current Rd tags nor Rd as a documentation system, although having code + docs in one file (as in Roxygen) seems to be convenient to some. Hopefully the Rd changes coming in R 2.10.0 will enhance either way of generating an Rd file. There are some things (e.g. all the methods for a generic, or all the generics with methods for a class, all the classes extending a given one) that aren't known at the time you run promptClass or Roxygen, but which are known at display time; the new system should allow those to be computed just before displaying the page. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Convert NA to '.'
Jorge, Thanks a lot for your reply. It's indeed working in this case. However, with data frames mixing numeric and character is not working anymore. Morever I am working with many variables and I don't want to modify them. What I would really appreciate is a global option (in the Rprofile?) that allow to change the display nd printing of NA by any symbol? Does it exist? Best regards, David Jorge Ivan Velez wrote: Dear David, Try this: x - c(A, B, C, NA) x[is.na(x)] - . x HTH, Jorge On Mon, Jun 22, 2009 at 2:11 AM, David_D david.da...@gmail.com wrote: Dear R-users, For reporting purpose (using Sweave and LaTeX), I am creating complex tables with the cat function such as x-c(A, B, C, NA) cat(x, '\n') A B C NA For convenience, I would like to change all my NA value to something else like '.' (as in SAS for example). Is there a global option which allows this change? Or should I change all my code to work with the print function and the na.print argument? Best regards, David -- View this message in context: http://www.nabble.com/Convert-NA-to-%27.%27-tp24142187p24142187.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/Convert-NA-to-%27.%27-tp24142187p24147157.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] The gradient of a multivariate normal density with respect to its parameters
Ravi Varadhan skreiv: I am not aware of any. May I ask what your purpose is? You don't really need this if you are going to use it in optimization, since most optimizers use a simple finite-difference approximation if you don't provide the gradient. Using the numerical approximation from numDeriv will be quite time-consuming in an optimization routine, since numDeriv uses a high-order Richardosn extrapolation to compute an accurate approximation of the gradient. No, I don’t use it in an optimisation. The expression is part of a more complicated formula used for calculating some estimates in a special nonparametric model. I won’t use the numerical approximation; the alternative would be to calculate the analytical expressions myself. It’s not too difficult, but tedious, and the expressions I end up with may not be the fastest or most numerically accurate, so if there was a package implementing them in a good way, it would be nice. :) Regardless of your purpose, there is a small bug in your function. You should change `dmvnorm(cbind(x,y),mu,sig)' to `dmvnorm(cbind(xx,yy),mu,sig)'. Yes, of course. I originally used x and y when creating the example, but then discovered that the jacobian() function already used x as an argument for something else, so I renamed them to xx and yy (though obviously not everywhere!). I really should have tested it in a completely clean environment before posting. -- Karl Ove Hufthammer __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Help needed: Fraction for Histogram 1 ???
I have been trying to draw histogram for my manscript and found some strange things that I could not figure out why. Using the same code listed below I have successfully draw histograms for a few figures with fraction labeled on Y axis less than 1 (acturally between 0 to 0.1). But one dataset gives the Y axis label 0 to 5 as fraction. This is not true, as fraction are less than 1, although the value distribution on the figure seems to me is right. The only difference between the first few datasets and last dataset is: All values for the first few data sets 1. The values for the last data sets between 0 and 1. Any idea why this happens. Your help is highly appreciated. Charles === postscript(Figure.eps, paper=letter, horizontal=FALSE) par(mfrow=c(3,3)) par(omi=c(2,0.2,1.8,0.2), mai=c(0.4, 0.4, 0.5, 0.1) ) my.input - read.table(input.data, header=FALSE, sep=\t) my.input.exp - read.table(exp.data, header=FALSE, sep=\t) hist(my.input.exp[,2], breaks=40,freq = FALSE,xlab=, border = grey30, ylab=, main=) hist(my.input[,2], breaks=40,freq = FALSE,xlab=, ylab=, main=, border = red,add =TRUE) mtext(Fraction, side=2, line=2, cex=0.7) box() == -- View this message in context: http://www.nabble.com/Help-needed%3A-Fraction-for-Histogram-%3E-1-tp24150345p24150345.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Roxygen vs Sweave for S4 documentation
DM == Duncan Murdoch murd...@stats.uwo.ca on Mon, 22 Jun 2009 09:41:12 -0400 writes: DM On 6/22/2009 9:23 AM, Tobias Verbeke wrote: On Mon, Jun 22, 2009 at 2:18 PM, Douglas Batesba...@stat.wisc.edu wrote: On Mon, Jun 22, 2009 at 7:12 AM, Martin Maechlermaech...@stat.math.ethz.ch wrote: TobiasV == Tobias Verbeke tobias.verb...@openanalytics.be on Sun, 21 Jun 2009 08:25:07 +0200 writes: TobiasV Hi Ken, I have been using R for a while. Recently, I have begun converting my package into S4 classes. I was previously using Rdoc for documentation. Now, I am looking to use the best tool for S4 documentation. It seems that the best choices for me are Roxygen and Sweave (I am fine with tex). Are there any users of Roxygen or Sweave who can comment on the strengths or weaknesses of one or othe other? Thanks in advance. TobiasV For the moment proper documentation of S4 classes (with a @slot tag TobiasV e.g.) is not implemented yet, how did you define proper here? I know that the result of promptClass() may not always be perfect. As most things are not perfect, I would not quickly call this improper ... Or is using *.Rd not proper for you, as it does not have all of code + docs in one file? I think Tobias was referring to Roxygen support for S4 classes, not the existence of .Rd tags. Yes, I was just referring to the Roxygen support and as a matter of fact I recommended Roxygen to mimick the nice job done by promptClass a while ago https://lists.r-forge.r-project.org/pipermail/roxygen-devel/2009-February/23.html No intent to criticize nor current Rd tags nor Rd as a documentation system, although having code + docs in one file (as in Roxygen) seems to be convenient to some. DM Hopefully the Rd changes coming in R 2.10.0 will enhance either way of DM generating an Rd file. There are some things (e.g. all the methods for DM a generic, or all the generics with methods for a class, all the classes DM extending a given one) that aren't known at the time you run promptClass DM or Roxygen, but which are known at display time; the new system should DM allow those to be computed just before displaying the page. Yes, indeed! S4 classes and methods are indeed a very good example why a dynamic help system as you are envisaging is very desirable. Thanks a lot, Duncan, for your work on this! Martin __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] what's the R command to make the following?
There are many ways to do this in R. See R News 4/1 for info on dates. Here is one method: library(chron) mdy - month.day.year(05/16/2008) mdy $month [1] 5 $day [1] 16 $year [1] 2008 mdy$month [1] 5 On Mon, Jun 22, 2009 at 11:36 AM, Jack Luojluo.rh...@gmail.com wrote: Hi, I have a simple question, suppose I have the date 05/16/2008, what would be the command to get the month, day and year? Thanks, -Jack [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] The gradient of a multivariate normal density with respect toits parameters
Karl, You may want to look at the paper by Dwyer on Some applications of matrix derivatives in multivariate analysis (JASA 1967), especially the Table 2 on p. 617. Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: rvarad...@jhmi.edu Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h tml -Original Message- From: Karl Ove Hufthammer [mailto:karl.huftham...@math.uib.no] Sent: Monday, June 22, 2009 12:12 PM To: r-h...@stat.math.ethz.ch Cc: Ravi Varadhan Subject: Re: [R] The gradient of a multivariate normal density with respect toits parameters Ravi Varadhan skreiv: I am not aware of any. May I ask what your purpose is? You don't really need this if you are going to use it in optimization, since most optimizers use a simple finite-difference approximation if you don't provide the gradient. Using the numerical approximation from numDeriv will be quite time-consuming in an optimization routine, since numDeriv uses a high-order Richardosn extrapolation to compute an accurate approximation of the gradient. No, I don't use it in an optimisation. The expression is part of a more complicated formula used for calculating some estimates in a special nonparametric model. I won't use the numerical approximation; the alternative would be to calculate the analytical expressions myself. It's not too difficult, but tedious, and the expressions I end up with may not be the fastest or most numerically accurate, so if there was a package implementing them in a good way, it would be nice. :) Regardless of your purpose, there is a small bug in your function. You should change `dmvnorm(cbind(x,y),mu,sig)' to `dmvnorm(cbind(xx,yy),mu,sig)'. Yes, of course. I originally used x and y when creating the example, but then discovered that the jacobian() function already used x as an argument for something else, so I renamed them to xx and yy (though obviously not everywhere!). I really should have tested it in a completely clean environment before posting. -- Karl Ove Hufthammer __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problems trying to run R2jags on a Linux Cluster
Dear R users, I am having problems in trying to run R2jags on a Linux Cluster. When I tried to run the model given in the R2jags manual I get the following error error message: Error in FUN(X[[3L]], ...) : object 'J' not found Can any one help me with this problem? Looking to hear from you all. Regards, -- Luwis Diya, Leuven Biostatistics and Statistical Bioinformatics Centre (L-BioStat), Kapucijnenvoer 35 blok d - bus 7001, 3000 Leuven, Belgium Tel: +32 16 336886 or +32 16 336892 Fax: +32 16 337015 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] User defined GLM?
Hello, I have this generalized linear formula: log(x[i]/n[i])=log(sum(x)/sum(n)) + beta[i] where the the x[i] and the n[i] are known. Is there a way to program the GLM procedure to input the formula above and get the beta[i] estimates? If not the GLM is there another procedure to do that? The aim also afterwards is to estimate the profile-likelihood CIs for the beta parameters. -- View this message in context: http://www.nabble.com/User-defined-GLM--tp24150859p24150859.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help needed: Fraction for Histogram 1 ???
When freq=FALSE then the y axis is not the proportion in each group (what I am assuming you mean by fraction), but rather is scaled so that the total area of the histogram is 1 (making comparing to theoretical densities easier). If all the data values are between 0 and 1, then the height of at least one bar needs to be = 1 for the total area to equal 1. If you want the y-axis to show relative frequency (proportion, fraction, etc.), then you either need to plot the y-axis yourself or use a different function than 'hist'. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of charles78 Sent: Monday, June 22, 2009 10:08 AM To: r-help@r-project.org Subject: [R] Help needed: Fraction for Histogram 1 ??? I have been trying to draw histogram for my manscript and found some strange things that I could not figure out why. Using the same code listed below I have successfully draw histograms for a few figures with fraction labeled on Y axis less than 1 (acturally between 0 to 0.1). But one dataset gives the Y axis label 0 to 5 as fraction. This is not true, as fraction are less than 1, although the value distribution on the figure seems to me is right. The only difference between the first few datasets and last dataset is: All values for the first few data sets 1. The values for the last data sets between 0 and 1. Any idea why this happens. Your help is highly appreciated. Charles === postscript(Figure.eps, paper=letter, horizontal=FALSE) par(mfrow=c(3,3)) par(omi=c(2,0.2,1.8,0.2), mai=c(0.4, 0.4, 0.5, 0.1) ) my.input - read.table(input.data, header=FALSE, sep=\t) my.input.exp - read.table(exp.data, header=FALSE, sep=\t) hist(my.input.exp[,2], breaks=40,freq = FALSE,xlab=, border = grey30, ylab=, main=) hist(my.input[,2], breaks=40,freq = FALSE,xlab=, ylab=, main=, border = red,add =TRUE) mtext(Fraction, side=2, line=2, cex=0.7) box() == -- View this message in context: http://www.nabble.com/Help-needed%3A- Fraction-for-Histogram-%3E-1-tp24150345p24150345.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Convert NA to '.'
Hi David, It works for me when handling data frames mixing characters and numeric by using either print() or a function called foo: # Some data x - c(A, B, C, 3.5, 1, 0, NA, a character,NA, another character) mydata - data.frame(x = x, y = sample(x)) mydata # Option 1: print()ing print(mydata, na.print = .) # Option 2: using a function foo foo - function(x){ y - as.character(x) y[is.na(y)] - . y } mydata[,] - apply(mydata, 2, foo) mydata See ?print for more information. HTH, Jorge On Mon, Jun 22, 2009 at 9:22 AM, David_D david.da...@gmail.com wrote: Jorge, Thanks a lot for your reply. It's indeed working in this case. However, with data frames mixing numeric and character is not working anymore. Morever I am working with many variables and I don't want to modify them. What I would really appreciate is a global option (in the Rprofile?) that allow to change the display nd printing of NA by any symbol? Does it exist? Best regards, David Jorge Ivan Velez wrote: Dear David, Try this: x - c(A, B, C, NA) x[is.na(x)] - . x HTH, Jorge On Mon, Jun 22, 2009 at 2:11 AM, David_D david.da...@gmail.com wrote: Dear R-users, For reporting purpose (using Sweave and LaTeX), I am creating complex tables with the cat function such as x-c(A, B, C, NA) cat(x, '\n') A B C NA For convenience, I would like to change all my NA value to something else like '.' (as in SAS for example). Is there a global option which allows this change? Or should I change all my code to work with the print function and the na.print argument? Best regards, David -- View this message in context: http://www.nabble.com/Convert-NA-to-%27.%27-tp24142187p24142187.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/Convert-NA-to-%27.%27-tp24142187p24147157.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Convert NA to '.'
On Jun 22, 2009, at 1:21 PM, Jorge Ivan Velez wrote: Hi David, It works for me when handling data frames mixing characters and numeric by using either print() or a function called foo: # Some data x - c(A, B, C, 3.5, 1, 0, NA, a character,NA, another character) mydata - data.frame(x = x, y = sample(x)) mydata Those are not numeric. They are factors. # Option 1: print()ing print(mydata, na.print = .) Does not work as expected with numeric NA's. mydata - data.frame(x = x, y = sample(x), c=sample(c(1:2,NA),length(x),replace=TRUE)) print(mydata, na.print = .) x y c 1 A . NA 2 B . NA 3 C another character NA 43.5 0 1 5 1 B 1 6 0 C 1 7 . 1 1 8a character a character 1 9 . 3.5 2 10 another character A 1 ?format.data.frame # Option 2: using a function foo foo - function(x){ y - as.character(x) y[is.na(y)] - . y } mydata[,] - apply(mydata, 2, foo) mydata But this does work, even when when given numeric NA's mydata[,] - apply(mydata, 2, foo) mydata x y c 1 A . . 2 B . . 3 C another character . 43.5 0 1 5 1 B 1 6 0 C 1 7 . 1 1 8a character a character 1 9 . 3.5 2 10 another character A 1 On Mon, Jun 22, 2009 at 9:22 AM, David_D david.da...@gmail.com wrote: Jorge, Thanks a lot for your reply. It's indeed working in this case. However, with data frames mixing numeric and character is not working anymore. Morever I am working with many variables and I don't want to modify them. What I would really appreciate is a global option (in the Rprofile?) that allow to change the display nd printing of NA by any symbol? Does it exist? Best regards, David Jorge Ivan Velez wrote: Dear David, Try this: x - c(A, B, C, NA) x[is.na(x)] - . x HTH, Jorge On Mon, Jun 22, 2009 at 2:11 AM, David_D david.da...@gmail.com wrote: Dear R-users, For reporting purpose (using Sweave and LaTeX), I am creating complex tables with the cat function such as x-c(A, B, C, NA) cat(x, '\n') A B C NA For convenience, I would like to change all my NA value to something else like '.' (as in SAS for example). Is there a global option which allows this change? Or should I change all my code to work with the print function and the na.print argument? Best regards, David -- View this message in context: http://www.nabble.com/Convert-NA-to-%27.%27- tp24142187p24142187.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/Convert-NA-to-%27.%27-tp24142187p24147157.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] User defined GLM?
On Jun 22, 2009, at 12:35 PM, francogrex wrote: Hello, I have this generalized linear formula: log(x[i]/n[i])=log(sum(x)/sum(n)) + beta[i] where the the x[i] and the n[i] are known. Is there a way to program the GLM procedure to input the formula above and get the beta[i] estimates? If not the GLM is there another procedure to do that? The aim also afterwards is to estimate the profile-likelihood CIs for the beta parameters. Not as stated. That would be expecting R to have symbolic algebraic capabilities. Maybe next year^Wdecade? That model is equivalent to: log(x[i]/n[i])=log(sum(x)) - log(sum(n)) + beta[i] So you could use a no-intercept model with log link in glm() and an offset term for the first two terms of the r.h.s. probi - (x[i]/n[i]) glm(probi ~ 0 + offset(log(sum(x)) - log(sum(n) ), family=poisson) -- David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] SAS-like method of recoding variables?
Dear R-helpers, I am helping a SAS user run some analyses in R that she cannot do in SAS and she is complaining about R's peculiar (to her!) way of recoding variables. In particular, she is wondering if there is an R package that allows this kind of SAS recoding: IF TYPE='TRUCK' and count=12 THEN VEHICLES=TRUCK+((CAR+BIKE)/2.2); Thanks for any help or suggestions you might be able to provide! Mark Na __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problem with storing a sequence of lmer() model fit into a list
Dear R-helpers: May I ask a question related to storing a number of lmer model fit into a list. Basically, I have a for-loop (see towards the bottom of this email) in the loop, I am very sure that the i-th model fit (i.e.,fit_i) is successfully generated and the character string (i.e., tmp_i) is created correctly. The problem stems from the following line in the for-loop #trouble making line below fit.list[[tmp_i]] - fit_i I tried the following example which stores glm() model fit without a problem. #the following code can store glm() model fit into a list --- x1-runif(200) x2-rnorm(200) y-x1+x2 testdf-data.frame(y=y, x1=x1, x2=x2) indepvec-c(x1,x2) fit.list-NULL fit_1-glm(y~x1,data=testdf) fit_2-glm(y~x2,data=testdf) fit.list[[paste('fit_',indepvec[1],sep='')]]-fit_1 fit.list[[paste('fit_',indepvec[12],sep='')]]-fit_2 so why cannot I store lmer() model fit in a list? Would someone kindly explain to me what the R error message(last line of this email) really means? Your kind help will be highly appreciated! -Sean #the following for-loop intends to store lmer() random poisson model output into list (fit.list), it does not work --- fit.list-NULL for (i in seq_along(depvar_vec)) { #I found that s_sex, ses1 and race are not useful fit_i - lmer(as.formula(gen.ranpoisson.fml.jh(depvar_vec[i], offsetvar ,factorindepvars, nonfactorindepvars ,ranintvar )), family=quasipoisson(link=log),verbose=F, data=indf) tmp_i-paste('ranpoi_', depvar_vec[i], sep='') fit.list[[tmp_i]] - fit_i #assign also does not work #assign(fit.list$parse(text = tmp_i), fit_i) } --- #R gives the following error message. Error in fit.list[[tmp_i]] - fit_i : invalid type/length (S4/0) in vector allocation [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SAS-like method of recoding variables?
On Jun 22, 2009, at 2:27 PM, Mark Na wrote: Dear R-helpers, I am helping a SAS user run some analyses in R that she cannot do in SAS and she is complaining about R's peculiar (to her!) way of recoding variables. In particular, she is wondering if there is an R package that allows this kind of SAS recoding: IF TYPE='TRUCK' and count=12 THEN VEHICLES=TRUCK+((CAR+BIKE)/2.2); ?ifelse Thanks for any help or suggestions you might be able to provide! Mark Na __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SAS-like method of recoding variables?
On 6/22/2009 2:27 PM, Mark Na wrote: Dear R-helpers, I am helping a SAS user run some analyses in R that she cannot do in SAS and she is complaining about R's peculiar (to her!) way of recoding variables. In particular, she is wondering if there is an R package that allows this kind of SAS recoding: IF TYPE='TRUCK' and count=12 THEN VEHICLES=TRUCK+((CAR+BIKE)/2.2); Thanks for any help or suggestions you might be able to provide! If the variables are in a data frame called mydf, she might do something like this: mydf$VEHICLE - with(mydf, ifelse(TYPE=='TRUCK' count==12, TRUCK+((CAR+BIKE)/2.2), NA)) or mydf - transform(mydf, VEHICLE = ifelse(TYPE=='TRUCK' count==12, TRUCK+((CAR+BIKE)/2.2), NA)) Mark Na __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Chuck Cleland, Ph.D. NDRI, Inc. (www.ndri.org) 71 West 23rd Street, 8th floor New York, NY 10010 tel: (212) 845-4495 (Tu, Th) tel: (732) 512-0171 (M, W, F) fax: (917) 438-0894 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SAS-like method of recoding variables?
Mark Na wrote: Dear R-helpers, I am helping a SAS user run some analyses in R that she cannot do in SAS and she is complaining about R's peculiar (to her!) way of recoding variables. In particular, she is wondering if there is an R package that allows this kind of SAS recoding: IF TYPE='TRUCK' and count=12 THEN VEHICLES=TRUCK+((CAR+BIKE)/2.2); vehicles - ifelse(TYPE=='TRUCK' count=12, TRUCK+((CAR+BIKE)/2.2), NA) or maybe newdata - within(mydata, vehicles - ifelse(TYPE=='TRUCK' count=12, TRUCK+((CAR+BIKE)/2.2), NA) ) or transform(mydata, vehicles=) or, if you insist on only having the calculation done for the subgroup, sub - with(mydata, TYPE=='TRUCK' count=12) sub - sub !is.na(sub) mydata$vehicles - NA mydata$vehicles[sub] - with(mydata[sub,], TRUCK+((CAR+BIKE)/2.2) ) (all assuming that there's no vehicles in the data to begin with). -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] User defined GLM?
Le lundi 22 juin 2009 à 09:35 -0700, francogrex a écrit : Hello, I have this generalized linear formula: log(x[i]/n[i])=log(sum(x)/sum(n)) + beta[i] I do not understand the problem as stated. if x[i] and n[i] are known, and unless sum(n)=0, your dataset reduces to a set of nrow(dataset) independent linear equations with nrow(dataset) unknowns (the beta[i]), whose solution is trivially beta[i]=log(x[i]/n[i])-log(sum(x)/sum(n), except when n[i]=0 in which case your equation has no solution. Could you try to re-express your problem ? Emmanuel Charpentier __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] What has happened to the R-Help Google Groups Archive? Alternative?
Gmane has already been mentioned, and you might also want to consider Nabble -- judging from my referrer logs many people use it to read r-help. If you use Gmail, you might also want to consider subscribing to the list and using a simple filter. Details and links at blog.revolution-computing.com, here: http://bit.ly/FE2s9 # David Smith On Mon, Jun 22, 2009 at 4:23 AM, Tony Breyaltony.bre...@googlemail.com wrote: Greetings, I usually read this mailing list through google groups (http://groups.google.com/group/r-help-archive), but when I opened the webpage this morning it said: The group named r-help-archive has been removed because it violated Google's Terms Of Service. Is there an alternative website which uses a similar structure to google groups? I had a quick browse on the R Wiki (http://wiki.r-project.org/rwiki/doku.php?id=links:links) but didn't see a page with this sort of info. Thank you kindly, Tony Breyal -- David M Smith da...@revolution-computing.com Director of Community, REvolution Computing www.revolution-computing.com Tel: +1 (206) 577-4778 x3203 (San Francisco, USA) Check out our upcoming events schedule at www.revolution-computing.com/events __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Shapiro.test on data frame
Hi, I need help to perform a Shapiro.test on a data frame, I know that this test works only with vector but I guess there most be a way to permor it on a data frame instead of vactor by vector (i.e. I've got 40 variables to analyze and its kinda annoying to do it one by one) Thanks to anyone that can help me. Gonzalo Quiroga __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] maps maptools
Hey, A basic question. Is there anyone that knows a good text on using the R packages map and maptools for R? I have read the references for both but having trouble getting started. What I want to do is to load a map (.shp file) and display statistics on this maps using colours. Thanks in advance. Kind regards, John __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] User defined GLM?
Emmanuel Charpentier wrote: I do not understand the problem as stated. if x[i] and n[i] are known, and unless sum(n)=0, your dataset reduces to a set of nrow(dataset) independent linear equations with nrow(dataset) unknowns (the beta[i]), whose solution is trivially beta[i]=log(x[i]/n[i])-log(sum(x)/sum(n), except when n[i]=0 in which case your equation has no solution. Could you try to re-express your problem ? This is taken from this article: Methods for confidence interval estimation of a ratio parameter with application to location quotients. The authors argue that it's not the solution to find beta that they're after but the side effect of having the profile-likelihood confidence intervals in the estimation process. Those guys say they used the SAS Genmod procedure. I thought I could use glm in R and find the profile-likelihood CIs using confint on the glm results. -- View this message in context: http://www.nabble.com/User-defined-GLM--tp24150859p24154447.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help needed: Fraction for Histogram 1 ???
It is not the case as you described. In any case, the total area should be 1 and labeled fraction on y axis should be far less than 1, since I have more than 1 data points. I also test differerent bin size by change the break. I draw the graph using only 1 group, the same result was obtained. Any othe suggestion? Charles. Greg Snow-2 wrote: When freq=FALSE then the y axis is not the proportion in each group (what I am assuming you mean by fraction), but rather is scaled so that the total area of the histogram is 1 (making comparing to theoretical densities easier). If all the data values are between 0 and 1, then the height of at least one bar needs to be = 1 for the total area to equal 1. If you want the y-axis to show relative frequency (proportion, fraction, etc.), then you either need to plot the y-axis yourself or use a different function than 'hist'. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of charles78 Sent: Monday, June 22, 2009 10:08 AM To: r-help@r-project.org Subject: [R] Help needed: Fraction for Histogram 1 ??? I have been trying to draw histogram for my manscript and found some strange things that I could not figure out why. Using the same code listed below I have successfully draw histograms for a few figures with fraction labeled on Y axis less than 1 (acturally between 0 to 0.1). But one dataset gives the Y axis label 0 to 5 as fraction. This is not true, as fraction are less than 1, although the value distribution on the figure seems to me is right. The only difference between the first few datasets and last dataset is: All values for the first few data sets 1. The values for the last data sets between 0 and 1. Any idea why this happens. Your help is highly appreciated. Charles === postscript(Figure.eps, paper=letter, horizontal=FALSE) par(mfrow=c(3,3)) par(omi=c(2,0.2,1.8,0.2), mai=c(0.4, 0.4, 0.5, 0.1) ) my.input - read.table(input.data, header=FALSE, sep=\t) my.input.exp - read.table(exp.data, header=FALSE, sep=\t) hist(my.input.exp[,2], breaks=40,freq = FALSE,xlab=, border = grey30, ylab=, main=) hist(my.input[,2], breaks=40,freq = FALSE,xlab=, ylab=, main=, border = red,add =TRUE) mtext(Fraction, side=2, line=2, cex=0.7) box() == -- View this message in context: http://www.nabble.com/Help-needed%3A- Fraction-for-Histogram-%3E-1-tp24150345p24150345.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/Help-needed%3A-Fraction-for-Histogram-%3E-1-tp24150345p24154095.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Shapiro.test on data frame
Try this: x - data.frame(A = runif(10), B = rnorm(10)) lapply(x, shapiro.test) On Mon, Jun 22, 2009 at 3:15 PM, Gonzalo Quiroga quirogagonz...@gmail.comwrote: Hi, I need help to perform a Shapiro.test on a data frame, I know that this test works only with vector but I guess there most be a way to permor it on a data frame instead of vactor by vector (i.e. I've got 40 variables to analyze and its kinda annoying to do it one by one) Thanks to anyone that can help me. Gonzalo Quiroga __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with storing a sequence of lmer() model fit into a list
(a) Your code is unnecessarily convoluted. (b) The example of things *not* working is not reproducible. (Read the posting guide!!!) (c) Nonetheless the phenomenon you describe is weird/interesting. On my system, the following runs without error: fit.list - NULL a - factor(rep(1:10,each=20)) set.seed(42) for(i in 1:10) { x1-runif(200) x2-rnorm(200) y - x1+x2 testdf-data.frame(y=y, x1=x1) fit - lm(y~x1+a,data=testdf) fit.list[[i]] - fit } However if the call to lm() is replace by a call to lmer() library(lme4) fit.list - NULL a - factor(rep(1:10,each=20)) set.seed(42) for(i in 1:10) { x1-runif(200) x2-rnorm(200) y - x1+x2 testdf-data.frame(y=y, x1=x1) fit - lmer(y~x1 + (1|a),data=testdf) fit.list[[i]] - fit } I get the same error that you reported, namely Error in fit.list[[i]] - fit : invalid type/length (S4/0) in vector allocation Finally if fit.list is initialized to an empty list (which is the way it *should* be initialized anyway) rather than to NULL, then things appear to work smoothly even with a call to lmer(): library(lme4) fit.list - list() a - factor(rep(1:10,each=20)) set.seed(42) for(i in 1:10) { x1-runif(200) x2-rnorm(200) y - x1+x2 testdf-data.frame(y=y, x1=x1) fit - lmer(y~x1 + (1|a),data=testdf) fit.list[[i]] - fit } So a ``fix'' for the problem would seem to be to start with fit.list being an empty list, rather than NULL. But why the problem shows up only with calls to lmer() is completely mysterious to me. Perhaps others will be able to shed light. HTH cheers, Rolf Turner On 23/06/2009, at 6:34 AM, Sean Zhang wrote: Dear R-helpers: May I ask a question related to storing a number of lmer model fit into a list. Basically, I have a for-loop (see towards the bottom of this email) in the loop, I am very sure that the i-th model fit (i.e.,fit_i) is successfully generated and the character string (i.e., tmp_i) is created correctly. The problem stems from the following line in the for-loop #trouble making line below fit.list[[tmp_i]] - fit_i I tried the following example which stores glm() model fit without a problem. #the following code can store glm() model fit into a list --- x1-runif(200) x2-rnorm(200) y-x1+x2 testdf-data.frame(y=y, x1=x1, x2=x2) indepvec-c(x1,x2) fit.list-NULL fit_1-glm(y~x1,data=testdf) fit_2-glm(y~x2,data=testdf) fit.list[[paste('fit_',indepvec[1],sep='')]]-fit_1 fit.list[[paste('fit_',indepvec[12],sep='')]]-fit_2 so why cannot I store lmer() model fit in a list? Would someone kindly explain to me what the R error message(last line of this email) really means? Your kind help will be highly appreciated! -Sean #the following for-loop intends to store lmer() random poisson model output into list (fit.list), it does not work -- - fit.list-NULL for (i in seq_along(depvar_vec)) { #I found that s_sex, ses1 and race are not useful fit_i - lmer(as.formula(gen.ranpoisson.fml.jh(depvar_vec[i], offsetvar ,factorindepvars, nonfactorindepvars ,ranintvar )), family=quasipoisson(link=log),verbose=F, data=indf) tmp_i-paste('ranpoi_', depvar_vec[i], sep='') fit.list[[tmp_i]] - fit_i #assign also does not work #assign(fit.list$parse(text = tmp_i), fit_i) } --- #R gives the following error message. Error in fit.list[[tmp_i]] - fit_i : invalid type/length (S4/0) in vector allocation ## Attention:\ This e-mail message is privileged and confid...{{dropped:9}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Shapiro.test on data frame
On Monday 22 June 2009, Gonzalo Quiroga wrote: Hi, I need help to perform a Shapiro.test on a data frame, I know that this test works only with vector but I guess there most be a way to permor it on a data frame instead of vactor by vector (i.e. I've got 40 variables to analyze and its kinda annoying to do it one by one) Thanks to anyone that can help me. Gonzalo Quiroga __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. are you looking to perform this column-wise or row-wise? see ?apply for ideas cheers, Dylan -- Dylan Beaudette Soil Resource Laboratory http://casoilresource.lawr.ucdavis.edu/ University of California at Davis 530.754.7341 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help needed: Fraction for Histogram 1 ???
On Mon, 22 Jun 2009, charles78 wrote: It is not the case as you described. In any case, the total area should be 1 and labeled fraction on y axis should be far less than 1, since I have more than 1 data points. I also test differerent bin size by change the break. Try reading Greg's response again. Then calculate the area of one of the bars in your histogram, remembering that the area of a rectangle is not the same as its height, but is the width times height. If the area of the bar is greater than 1, then you can report a problem. -thomas I draw the graph using only 1 group, the same result was obtained. Any othe suggestion? Charles. Greg Snow-2 wrote: When freq=FALSE then the y axis is not the proportion in each group (what I am assuming you mean by fraction), but rather is scaled so that the total area of the histogram is 1 (making comparing to theoretical densities easier). If all the data values are between 0 and 1, then the height of at least one bar needs to be = 1 for the total area to equal 1. If you want the y-axis to show relative frequency (proportion, fraction, etc.), then you either need to plot the y-axis yourself or use a different function than 'hist'. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of charles78 Sent: Monday, June 22, 2009 10:08 AM To: r-help@r-project.org Subject: [R] Help needed: Fraction for Histogram 1 ??? I have been trying to draw histogram for my manscript and found some strange things that I could not figure out why. Using the same code listed below I have successfully draw histograms for a few figures with fraction labeled on Y axis less than 1 (acturally between 0 to 0.1). But one dataset gives the Y axis label 0 to 5 as fraction. This is not true, as fraction are less than 1, although the value distribution on the figure seems to me is right. The only difference between the first few datasets and last dataset is: All values for the first few data sets 1. The values for the last data sets between 0 and 1. Any idea why this happens. Your help is highly appreciated. Charles = == postscript(Figure.eps, paper=letter, horizontal=FALSE) par(mfrow=c(3,3)) par(omi=c(2,0.2,1.8,0.2), mai=c(0.4, 0.4, 0.5, 0.1) ) my.input - read.table(input.data, header=FALSE, sep=\t) my.input.exp - read.table(exp.data, header=FALSE, sep=\t) hist(my.input.exp[,2], breaks=40,freq = FALSE,xlab=, border = grey30, ylab=, main=) hist(my.input[,2], breaks=40,freq = FALSE,xlab=, ylab=, main=, border = red,add =TRUE) mtext(Fraction, side=2, line=2, cex=0.7) box() = = -- View this message in context: http://www.nabble.com/Help-needed%3A- Fraction-for-Histogram-%3E-1-tp24150345p24150345.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/Help-needed%3A- Fraction-for-Histogram-%3E-1-tp24150345p24154095.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. Thomas Lumley Assoc. Professor, Biostatistics tlum...@u.washington.eduUniversity of Washington, Seattle __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] nls vs nlme: parameter constraints
Hello list, I'm trying to fit a model like beta[trt]/(1+alpha*x) where the data include some grouping factor. The problem is that the estimate for alpha is undefined for some of the treatments - any value greater than 20 is equally good and a step function would suffice. Ignoring the grouping structure, I can fit this using nls with the port algorithm by restricting the upper value of alpha to 20. Is there a way to do something similar in nlme? Thanks! Manuel -- http://mutualism.williams.edu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] maps maptools
Hi John, The SIDS examples in the spatial graphics gallery may be helpful: http://r-spatial.sourceforge.net/gallery/ For a general reference, see Applied Spatial Data Analysis with R. Note that the book's webpage includes figures with code http://www.asdar-book.org/ There's also the spatial Task View http://cran.r-project.org/web/views/Spatial.html and an active spatial mailing list https://www.stat.math.ethz.ch/mailman/listinfo/R-SIG-Geo/ hth, Kingsford Jones On Mon, Jun 22, 2009 at 1:01 PM, John Lipkinsjohn.lipk...@googlemail.com wrote: Hey, A basic question. Is there anyone that knows a good text on using the R packages map and maptools for R? I have read the references for both but having trouble getting started. What I want to do is to load a map (.shp file) and display statistics on this maps using colours. Thanks in advance. Kind regards, John __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] question about using _apply and/or aggregate functions
Hi R-list, I'll apologize in advance for (1) the wordiness of my note (not sure how to avoid it) and (2) any deficiencies on my part that lead to my difficulties. I have an application with several stages that is meant to simulate and explore different scenarios with respect to product sales (in units sold per month). My session info is at the bottom of this note. The steps include (1) an initial linear regression model, (2) building an ARIMA model, (3) creating 12 months of simulated 'real' data - for various values of projected changes in slope from the linear regression - from the ARIMA model using arima.sim with subsequent undifferencing and appending to historical data, (3) one-step-ahead forecasting for 12 months using the 'fixed' arima model and simulated data, (4) taking the residuals from the forecasting (simulated minus forecast for each of the 12 months) and applying QC charting, and (5) counting the number (if any) of runs-rules violations detected. The simulation-aspect calculates 100 simulations for each of 50 values of slope. All of this seems to work fine (although I'm sure that the coding could be improved through functions, vectorization, etc. ... ). However, the problem I'm having is at the end where I had hoped that things would be easier. I want to summarize and graph the probability of detecting a runs-rule violation vs. the amount of the shift in slope (of logunit sales). The output data array passed from the qcc section at the end includes: - the adjustment made to the slope (a multiplier) - the actual value of the slope - the iteration number of the simulation loop (within each value of slope) - the count of QC charting limits violations - the count of QC charting runs rules violations My code is in the attached file (generic_code.R) and my initial sales data needed to prime the simulation is in the other attached file (generic_data.csv). The relevant section of my code is at the bottom of the .R file after the end of the outer loop. I've tried to provide meaningful comments. I've read the help files for _apply, aggregate, arrays and data types, and have also consulted with several texts (Maindonald and Braun; Spector; Venebles and Ripley for S-plus). Somehow I still don't get it. My attempts usually result in a message like the following: agg.result = aggregate(simarray.part[,3], by=list4, FUN = mean) Error in FUN(X[[1L]], ...) : arguments must have same length But when I check the length of the arguments, they appear to match. (??) length(simarray.part[,3]) [1] 5000 length(simarray.part[,4]) [1] 5000 length(list4) [1] 5000 I would have rather passed along a subset of the simulation/loop output dataset, but was unsure how to save it to preserve whatever properties that I may have missed that are causing my difficulties. If anyone still wants to help at this point, I believe that you'll need to load the .csv data and run the simulation (maybe reducing the number of iterations). Many thanks to anyone who can shed some light on my difficulties (whether self-induced or otherwise). Cliff I'm using a pre-compiled binary version of R for Windows. Session info: sessionInfo() R version 2.9.0 (2009-04-17) i386-pc-mingw32 locale: LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] qcc_1.3 forecast_1.24 tseries_0.10-18 zoo_1.5-5 [5] quadprog_1.4-11 loaded via a namespace (and not attached): [1] grid_2.9.0 lattice_0.17-22 Sys.getlocale() [1] LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] question about using _apply and/or aggregate functions
Hi David, I appreciate the advice. I had coerced 'list4' to as.list, but forgot to specify list=() in the call to aggregate. I made the correction, and now get the following: slope.mult = simarray[,1] adj.slope.value = simarray[,2] adj.slope.level = simarray[,2] qc.run.violation = simarray[,5] simarray.part = cbind(slope.mult, adj.slope.value, qc.run.violation, adj.slope.level) list4 = as.list(simarray.part[,4]) agg.result = aggregate(simarray.part[,3], by=list(list4), FUN = mean) Error in sort.list(unique.default(x), na.last = TRUE) : 'x' must be atomic for 'sort.list' Have you called 'sort' on a list? ... I'm not sure what this means that I've done wrong. I did check 'list4' using is.list, and get TRUE back as an answer, so feel that my mistake is some other fundamental aspect of R that I'm failing to grasp. To your note on 'tapply' ... I did try this as well (actually, tried it first) with no initial success. On your recommendation, I gave tapply another go, and get something recognizable: vtt = tapply(simarray.part[,3], simarray.part[,2], mean) dim(vtt) [1] 50 length(vtt) [1] 50 vtt[1:5] 0.003132 0.006264 0.009396 0.012528 0.01566 0.29 0.24 0.23 0.16 0.22 vtt[1] 0.003132 0.29 vtt[[1]][1] [1] 0.29 I see that the output stored in vtt has one dimension with length=50. But each place in vtt appears to hold two values. I'll admit that I'm not sure how to access/reference the entirety of all 50 values = 0.29 0.24 0.23 0.16 0.22 (and so on). I don't appear to be able to access/reference only what appears to be an embedded index = 0.003132 0.006264 0.009396 etc. What am I missing? Is there a reference that I need to re-read? I would like to be able to plot one against the other. Thanks again for taking the time outside of your day job for your earlier reply! Cliff On Mon, Jun 22, 2009 at 11:28 AM, David Winsemiusdwinsem...@comcast.net wrote: On Jun 22, 2009, at 12:04 PM, Clifford Long wrote: Hi R-list, I'll apologize in advance for (1) the wordiness of my note (not sure how to avoid it) and (2) any deficiencies on my part that lead to my difficulties. I have an application with several stages that is meant to simulate and explore different scenarios with respect to product sales (in units sold per month). My session info is at the bottom of this note. The steps include (1) an initial linear regression model, (2) building an ARIMA model, (3) creating 12 months of simulated 'real' data - for various values of projected changes in slope from the linear regression - from the ARIMA model using arima.sim with subsequent undifferencing and appending to historical data, (3) one-step-ahead forecasting for 12 months using the 'fixed' arima model and simulated data, (4) taking the residuals from the forecasting (simulated minus forecast for each of the 12 months) and applying QC charting, and (5) counting the number (if any) of runs-rules violations detected. The simulation-aspect calculates 100 simulations for each of 50 values of slope. All of this seems to work fine (although I'm sure that the coding could be improved through functions, vectorization, etc. ... ). However, the problem I'm having is at the end where I had hoped that things would be easier. I want to summarize and graph the probability of detecting a runs-rule violation vs. the amount of the shift in slope (of logunit sales). The output data array passed from the qcc section at the end includes: - the adjustment made to the slope (a multiplier) - the actual value of the slope - the iteration number of the simulation loop (within each value of slope) - the count of QC charting limits violations - the count of QC charting runs rules violations My code is in the attached file (generic_code.R) and my initial sales data needed to prime the simulation is in the other attached file (generic_data.csv). The relevant section of my code is at the bottom of the .R file after the end of the outer loop. I've tried to provide meaningful comments. I've read the help files for _apply, aggregate, arrays and data types, and have also consulted with several texts (Maindonald and Braun; Spector; Venebles and Ripley for S-plus). Somehow I still don't get it. My attempts usually result in a message like the following: agg.result = aggregate(simarray.part[,3], by=list4, FUN = mean) Error in FUN(X[[1L]], ...) : arguments must have same length I cannot comment on the overall strategy, but wondered if this minor mod to the code might succeed; agg.result = aggregate(simarray.part[,3], by=list(list4), FUN = mean) My personal experience with aggregate() is not positive. I generally end up turning to tapply() (which is at the heart of aggregate() anyway) probably because I forget to wrap the second argument in a list. Slow learner, I guess. But when I check the length of the arguments, they appear to match. (??)
[R] Calculating row standard deviations
Hi R-helpers, I have been struggling with calculating row and column statistics, e.g. standard deviation. I know that datac$Mean-rowMeans(datac,na.rm=TRUE) will give me row means. I have tried to replicate those row means with the apply function: datac$Mean2-apply(datac,2,mean) so that I can replace the function argument with sd (instead of mean) to get standard deviations. But, I'm running into this error: dim(datac) [1] 17 271 datac$Mean2-apply(datac,2,mean) Error in dimnames(x) - dn : length of 'dimnames' [2] not equal to array extent Can anyone see what I'm doing wrong? Thanks! Mark Na __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lattice logaritmic scale (basis e ), rewriting labels using xscale.component
thanks Deepayan, that works great! 2009/6/19 Deepayan Sarkar deepayan.sar...@gmail.com: On 6/18/09, Katharina May may.kathar...@googlemail.com wrote: Hi there, sorry for troubling everybody once again, I've got a problem rewriting Sarkar's function for rewriting the tick locations in a logaritmic way (s. http://lmdvr.r-forge.r-project.org/code/Chapter08.R): His example works for log 2 but I need log e (natural logarithm). My problem is that if I replace 2 with e (using paste()), I get the error message that the location isn't a numeric value. R doesn't have a constant that represents e, but tick.at - logTicks(exp(lim), loc = c(1, 3)) instead of tick.at - logTicks(paste(e^,lim,sep=), loc = c(1, 3)) should give you e^lim. Or, if it makes it easier, e - exp(1) tick.at - logTicks(e^lim, loc = c(1, 3)) I don't think you need to change the logTicks function. -Deepayan Is there any way to get this working somehow or do I have to take a different approach? Thanks, Katharina Here my failing approach: require(lattice) data(Earthquake, package = MEMSS) xscale.components.log - function(lim, ...) { ans - xscale.components.default(lim = lim, ...) tick.at - logTicks(paste(e^,lim,sep=), loc = c(1, 3)) ans$bottom$ticks$at - log(tick.at, 2) ans$bottom$labels$at - log(tick.at, 2) ans$bottom$labels$labels - as.character(tick.at) ans } logTicks - function (lim, loc = c(1, 5)) { ii - floor(log(range(lim))) + c(-1, 2) main - paste(e^,(ii[1]:ii[2]),sep=) r - as.numeric(outer(loc, main, *)) r[lim[1] = r r = lim[2]] } xyplot(accel ~ distance, data=Earthquake, scales = list(log = e), xscale.components = xscale.components.log, Here is the original code of Sarkar: logTicks - function (lim, loc = c(1, 5)) { ii - floor(log10(range(lim))) + c(-1, 2) main - 10^(ii[1]:ii[2]) r - as.numeric(outer(loc, main, *)) r[lim[1] = r r = lim[2]] } xscale.components.log2 - function(lim, ...) { ans - xscale.components.default(lim = lim, ...) tick.at - logTicks(2^lim, loc = c(1, 3)) ans$bottom$ticks$at - log(tick.at, 2) ans$bottom$labels$at - log(tick.at, 2) ans$bottom$labels$labels - as.character(tick.at) ans } __ r-h...@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Time flies like an arrow, fruit flies like bananas. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Calculating row standard deviations
Tena koe Mark I think you might want apply(datac, 1, mean) i.e., apply the function to the first dimension (rows) rather than the second (columns). HTH ... Peter Alspach -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Mark Na Sent: Tuesday, 23 June 2009 10:20 a.m. To: r-help@r-project.org Subject: [R] Calculating row standard deviations Hi R-helpers, I have been struggling with calculating row and column statistics, e.g. standard deviation. I know that datac$Mean-rowMeans(datac,na.rm=TRUE) will give me row means. I have tried to replicate those row means with the apply function: datac$Mean2-apply(datac,2,mean) so that I can replace the function argument with sd (instead of mean) to get standard deviations. But, I'm running into this error: dim(datac) [1] 17 271 datac$Mean2-apply(datac,2,mean) Error in dimnames(x) - dn : length of 'dimnames' [2] not equal to array extent Can anyone see what I'm doing wrong? Thanks! Mark Na __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. The contents of this e-mail are confidential and may be subject to legal privilege. If you are not the intended recipient you must not use, disseminate, distribute or reproduce all or any part of this e-mail or attachments. If you have received this e-mail in error, please notify the sender and delete all material pertaining to this e-mail. Any opinion or views expressed in this e-mail are those of the individual sender and may not represent those of The New Zealand Institute for Plant and Food Research Limited. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Calculating row standard deviations
Mark Na wrote: Hi R-helpers, I have been struggling with calculating row and column statistics, e.g. standard deviation. I know that datac$Mean-rowMeans(datac,na.rm=TRUE) will give me row means. I have tried to replicate those row means with the apply function: datac$Mean2-apply(datac,2,mean) so that I can replace the function argument with sd (instead of mean) to get standard deviations. But, I'm running into this error: dim(datac) [1] 17 271 datac$Mean2-apply(datac,2,mean) Error in dimnames(x) - dn : length of 'dimnames' [2] not equal to array extent Can anyone see what I'm doing wrong? Not really, but you need apply(datac,1,mean) for row means: apply(airquality,2,mean) Ozone Solar.R Wind Temp Month Day NANA 9.957516 77.882353 6.993464 15.803922 which are obviously column means. However, I see a different error if I try to assign that back airquality$m - apply(airquality,2,mean) Error in `$-.data.frame`(`*tmp*`, m, value = c(NA, NA, 9.95751633986928, : replacement has 6 rows, data has 153 Can't get much closer without a REPRODUCIBLE example. -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] easiest way to extend and recompile a package?
Hi all, I am thinking of extending a package by directly adding stuff to its C++ code. And then I have to recompile the package. Do I have to download the whole R source repository, in order to do the recompilation? What is the minimal setup requirement for such a recompilation? I am using MSVC Express 2008 on Windows, is that okay? Please give me some guidance. Thank you! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Recursive partitioning algorithms in R vs. alia
I have used all 3 packages for decision trees (SAS/EM, CART and R). As another user on the list commented, the algorithms CART uses are proprietary. I also know that since the algorithms are proprietary, the decision tree that you get from SAS is based on a slightly different algorithm so as to not violate copyright laws. When I first started using R (rpart) I benchmarked it (in terms of results obtained) for my particular problem at the time against Salford Systems CART. R gave me an identical tree with the splitting value being different in the 2nd or 3rd decimal place from what I recall. I did not have SAS/EM at that particular company and so could not benchmark it. Salford Systems CART does have additional types of splitting criteria such as towing etc., but again, these may be of value in certain types of problems. The splitting criteria found in R are good enough. I do have SAS/EM right now but prefer R to SAS/EM since R can be programmed and SAS/EM cannot. This may not be relevant for decision trees but for neural networks, for example, if I want to build hundreds of neural networks (since there are no variable selection methods for neural networks) with different predictors and different number of neurons, I can do this easily in R but cannot do this in SAS/EM. SAS/EM does have a variable selection node but that is independent of the neural network node, so, from what I understand, you have to select the variables and then pass them to the neural network node. In general, you get prettier output with CART and SAS/EM for trees. However, there are packages in R that can give you prettier output than rpart does. One GUI that you may want to explore, that works with R, is Rattle. This builds trees, neural network, boosting, etc. and you can see the generated R code as well. In terms of handling large volumes of data, SAS/EM is probably the best. However, if you have a 64 bit operating system with lots of RAM, and use random sampling, R should suffice. It is debatable whether the extra features like pretty output and variable importance is worth the huge costs you have to pay for those products, unless you really need these features. With R you can do what you want, and that is build a good tree. From what I have read, variable importance measures can be biased as they are affected by factors such as multicollinearity, variables with many categories, etc., so their usefulness is questionable (however, end-users may love them). SAS/EM is by far the most expensive product, and Salford Systems CART is pretty expensive as well. So depending on your needs, R may be good enough or the best, because you can program it, and the latest methodologies will always be implemented in R first. For comparisons of the programming capabilities of SAS (macros) versus R you may want to look at what Frank Harrell and Terry Thearneu (who wrote rpart) have to say. Both are experts in SAS and R. Hope this helps. Jude Carlos wrote: Dear R-helpers, I had a conversation with a guy working in a business intelligence department at a major Spanish bank. They rely on recursive partitioning methods to rank customers according to certain criteria. They use both SAS EM and Salford Systems' CART. I have used package R part in the past, but I could not provide any kind of feature comparison or the like as I have no access to any installation of the first two proprietary products. Has anybody experience with them? Is there any public benchmark available? Is there any very good --although solely technical-- reason to pay hefty software licences? How would the algorithms implemented in rpart compare to those in SAS and/or CART? Best regards, Carlos J. Gil Bellosta http://www.datanalytics.com http://www.datanalytics.com/ ___ Jude Ryan Director, Client Analytical Services Strategy Business Development UBS Financial Services Inc. 1200 Harbor Boulevard, 4th Floor Weehawken, NJ 07086-6791 Tel. 201-352-1935 Fax 201-272-2914 Email: jude.r...@ubs.com Please do not transmit orders or instructions regarding a UBS account electronically, including but not limited to e-mail, fax, text or instant messaging. The information provided in this e-mail or any attachments is not an official transaction confirmation or account statement. For your protection, do not include account numbers, Social Security numbers, credit card numbers, passwords or other non-public information in your e-mail. Because the information contained in this message may be privileged, confidential, proprietary or otherwise protected from disclosure, please notify us immediately by replying to this message and deleting it from your computer if you have received this communication in error. Thank you. UBS Financial Services Inc. UBS International Inc. UBS Financial Services Incorporated of Puerto Rico UBS AG UBS reserves the right to retain all messages.
[R] reading data in rjags
I am trying to run jags.model and need the data read in a suitable format. The package (rjags) documentation describes read.data, while the classic-bugs examples use read.jagsdata. I am running R (R 2.8.1) on a PowerBook G4 (Mac OS X 10.5.7), and my installation recognizes neither read command. Am I missing something? TIA, David Cross www.davidcross.us __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Convert ragged list to matrix
Hi, I have a list made up of character strings with each item a different length (each item is between 1and 6 strings long). Is there a way to convert a ragged list to a matrix such that each item is its own row? Here is a simple example: a=list(); a[[1]] = c(a, b, c); a[[2]] = c(d, e); a[[3]] = c(f, g, h, i); I would like to convert the list to a matrix (or data frame) in this form, with the missing entries NA or something similar: [,1] [,2] [,3] [,4] [1,] a b c NA [2,] d e NA NA [3,] f g h i Any suggestions? Thanks! Ken kat...@psu.edu [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Convert ragged list to matrix
try this: a [[1]] [1] a b c [[2]] [1] d e [[3]] [1] f g h i # find max row length rowMax - max(sapply(a, length)) # now create output matrix by lengthening rows do.call(rbind, lapply(a, function(x){ + length(x) - rowMax + x + })) [,1] [,2] [,3] [,4] [1,] a b c NA [2,] d e NA NA [3,] f g h i On Mon, Jun 22, 2009 at 7:15 PM, Kenneth Takagi kat...@psu.edu wrote: Hi, I have a list made up of character strings with each item a different length (each item is between 1and 6 strings long). Is there a way to convert a ragged list to a matrix such that each item is its own row? Here is a simple example: a=list(); a[[1]] = c(a, b, c); a[[2]] = c(d, e); a[[3]] = c(f, g, h, i); I would like to convert the list to a matrix (or data frame) in this form, with the missing entries NA or something similar: [,1] [,2] [,3] [,4] [1,] a b c NA [2,] d e NA NA [3,] f g h i Any suggestions? Thanks! Ken kat...@psu.edu [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Convert ragged list to matrix
Try this: matrix(unlist(lapply(a, '[', 1:max(sapply(a, length, ncol = 4, byrow = TRUE) or do.call(rbind, lapply(a, '[', 1:max(sapply(a, length On Mon, Jun 22, 2009 at 8:15 PM, Kenneth Takagi kat...@psu.edu wrote: Hi, I have a list made up of character strings with each item a different length (each item is between 1and 6 strings long). Is there a way to convert a ragged list to a matrix such that each item is its own row? Here is a simple example: a=list(); a[[1]] = c(a, b, c); a[[2]] = c(d, e); a[[3]] = c(f, g, h, i); I would like to convert the list to a matrix (or data frame) in this form, with the missing entries NA or something similar: [,1] [,2] [,3] [,4] [1,] a b c NA [2,] d e NA NA [3,] f g h i Any suggestions? Thanks! Ken kat...@psu.edu [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Convert ragged list to matrix
Thanks for all the feedback; I found a previous post covering this question: https://stat.ethz.ch/pipermail/r-help/2009-February/189232.html Problem solved! Kenneth Takagi kat...@psu.edu From: jim holtman [mailto:jholt...@gmail.com] Sent: Monday, June 22, 2009 7:24 PM To: Kenneth Takagi Cc: r-help@r-project.org Subject: Re: [R] Convert ragged list to matrix try this: a [[1]] [1] a b c [[2]] [1] d e [[3]] [1] f g h i # find max row length rowMax - max(sapply(a, length)) # now create output matrix by lengthening rows do.call(rbind, lapply(a, function(x){ + length(x) - rowMax + x + })) [,1] [,2] [,3] [,4] [1,] a b c NA [2,] d e NA NA [3,] f g h i On Mon, Jun 22, 2009 at 7:15 PM, Kenneth Takagi kat...@psu.edu wrote: Hi, I have a list made up of character strings with each item a different length (each item is between 1and 6 strings long). Is there a way to convert a ragged list to a matrix such that each item is its own row? Here is a simple example: a=list(); a[[1]] = c(a, b, c); a[[2]] = c(d, e); a[[3]] = c(f, g, h, i); I would like to convert the list to a matrix (or data frame) in this form, with the missing entries NA or something similar: [,1] [,2] [,3] [,4] [1,] a b c NA [2,] d e NA NA [3,] f g h i Any suggestions? Thanks! Ken kat...@psu.edu [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Convert ragged list to matrix
Try this: unname(t(do.call(cbind, lapply(a, ts [,1] [,2] [,3] [,4] [1,] a b c NA [2,] d e NA NA [3,] f g h i On Mon, Jun 22, 2009 at 7:15 PM, Kenneth Takagikat...@psu.edu wrote: Hi, I have a list made up of character strings with each item a different length (each item is between 1and 6 strings long). Is there a way to convert a ragged list to a matrix such that each item is its own row? Here is a simple example: a=list(); a[[1]] = c(a, b, c); a[[2]] = c(d, e); a[[3]] = c(f, g, h, i); I would like to convert the list to a matrix (or data frame) in this form, with the missing entries NA or something similar: [,1] [,2] [,3] [,4] [1,] a b c NA [2,] d e NA NA [3,] f g h i Any suggestions? Thanks! Ken kat...@psu.edu [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] question about using _apply and/or aggregate functions
On Jun 22, 2009, at 6:16 PM, Clifford Long wrote: Hi David, I appreciate the advice. I had coerced 'list4' to as.list, but forgot to specify list=() in the call to aggregate. I made the correction, and now get the following: slope.mult = simarray[,1] adj.slope.value = simarray[,2] adj.slope.level = simarray[,2] qc.run.violation = simarray[,5] simarray.part = cbind(slope.mult, adj.slope.value, qc.run.violation, adj.slope.level) list4 = as.list(simarray.part[,4]) agg.result = aggregate(simarray.part[,3], by=list(list4), FUN = mean) Error in sort.list(unique.default(x), na.last = TRUE) : 'x' must be atomic for 'sort.list' Have you called 'sort' on a list? ... I'm not sure what this means that I've done wrong. I did check 'list4' using is.list, and get TRUE back as an answer, so feel that my mistake is some other fundamental aspect of R that I'm failing to grasp. To your note on 'tapply' ... I did try this as well (actually, tried it first) with no initial success. On your recommendation, I gave tapply another go, and get something recognizable: vtt = tapply(simarray.part[,3], simarray.part[,2], mean) dim(vtt) [1] 50 length(vtt) [1] 50 vtt[1:5] 0.003132 0.006264 0.009396 0.012528 0.01566 0.29 0.24 0.23 0.16 0.22 vtt[1] 0.003132 0.29 vtt[[1]][1] [1] 0.29 I see that the output stored in vtt has one dimension with length=50. But each place in vtt appears to hold two values. Nope, that's just the output from an implicit print(vtt). vtt is an array with one row and an associated group of labels. If you doubt me (and I had some trouble with this myself) at that point, try is.matrix(vtt) and I predict will you get TRUE. I'll admit that I'm not sure how to access/reference the entirety of all 50 values = 0.29 0.24 0.23 0.16 0.22 (and so on). I don't appear to be able to access/reference only what appears to be an embedded index = 0.003132 0.006264 0.009396 etc. What am I missing? names(vtt) Is there a reference that I need to re-read? ?tapply Value When FUN is present, tapply calls FUN for each cell that has any data in it. If FUN returns a single atomic value for each such cell (e.g., functions mean or var) and when simplify is TRUE, tapplyreturns a multi-way array containing the values, and NA for the empty cells. I would like to be able to plot one against the other. plot(names(vtt), vtt) Thanks again for taking the time outside of your day job for your earlier reply! Cliff On Mon, Jun 22, 2009 at 11:28 AM, David Winsemiusdwinsem...@comcast.net wrote: On Jun 22, 2009, at 12:04 PM, Clifford Long wrote: Hi R-list, I'll apologize in advance for (1) the wordiness of my note (not sure how to avoid it) and (2) any deficiencies on my part that lead to my difficulties. I have an application with several stages that is meant to simulate and explore different scenarios with respect to product sales (in units sold per month). My session info is at the bottom of this note. The steps include (1) an initial linear regression model, (2) building an ARIMA model, (3) creating 12 months of simulated 'real' data - for various values of projected changes in slope from the linear regression - from the ARIMA model using arima.sim with subsequent undifferencing and appending to historical data, (3) one-step-ahead forecasting for 12 months using the 'fixed' arima model and simulated data, (4) taking the residuals from the forecasting (simulated minus forecast for each of the 12 months) and applying QC charting, and (5) counting the number (if any) of runs-rules violations detected. The simulation-aspect calculates 100 simulations for each of 50 values of slope. All of this seems to work fine (although I'm sure that the coding could be improved through functions, vectorization, etc. ... ). However, the problem I'm having is at the end where I had hoped that things would be easier. I want to summarize and graph the probability of detecting a runs-rule violation vs. the amount of the shift in slope (of logunit sales). The output data array passed from the qcc section at the end includes: - the adjustment made to the slope (a multiplier) - the actual value of the slope - the iteration number of the simulation loop (within each value of slope) - the count of QC charting limits violations - the count of QC charting runs rules violations My code is in the attached file (generic_code.R) and my initial sales data needed to prime the simulation is in the other attached file (generic_data.csv). The relevant section of my code is at the bottom of the .R file after the end of the outer loop. I've tried to provide meaningful comments. I've read the help files for _apply, aggregate, arrays and data types, and have also consulted with several texts (Maindonald and Braun; Spector; Venebles and Ripley for S-plus). Somehow I still don't get it. My attempts usually result in a message like the
Re: [R] Calculating row standard deviations
On Jun 22, 2009, at 6:19 PM, Mark Na wrote: Hi R-helpers, I have been struggling with calculating row and column statistics, e.g. standard deviation. I know that datac$Mean-rowMeans(datac,na.rm=TRUE) will give me row means. I have tried to replicate those row means with the apply function: datac$Mean2-apply(datac,2,mean) so that I can replace the function argument with sd (instead of mean) to get standard deviations. But, I'm running into this error: dim(datac) [1] 17 271 datac$Mean2-apply(datac,2,mean) Error in dimnames(x) - dn : length of 'dimnames' [2] not equal to array extent If you are trying to create a group means value for each element in an array or data.frame then the function to use is ave with its default function is mean. Other functions can be used but that is not necessary here. You could try: datac$Mean2-apply(datac,2,ave) Can anyone see what I'm doing wrong? Thanks! Mark Na David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Automatically Adding Categories
Greetings, I have two files which contain responses to a series of multiple choice questions. One file contains responses before an intervention and the other contains the responses afterward. There were three possible responses to each question: D, F, T (for Don't Know, False, and True). I would like to try McNemar's test to determine if there was any significant difference between before and after. I read the files. I create tables using: firstQuestion - table( PreSurveyData$q1, PostSurveyData$q2) for example. The problem is that for several of the questions not all of the possible responses appear. So I get a table like this: T D 6 F 2 T 12 Which cannot be used in mcnemar.test because it is not a square table and does not have enough rows and columns. Is there some way to specify that R should count the occurrences of D, F, and T even though they do not appear in the Data? Or some easy way to add the missing columns? Thank you, Jeffrey Edgington __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] question about using _apply and/or aggregate functions
On Jun 22, 2009, at 7:55 PM, David Winsemius wrote: On Jun 22, 2009, at 6:16 PM, Clifford Long wrote: Hi David, I appreciate the advice. I had coerced 'list4' to as.list, but forgot to specify list=() in the call to aggregate. I made the correction, and now get the following: slope.mult = simarray[,1] adj.slope.value = simarray[,2] adj.slope.level = simarray[,2] qc.run.violation = simarray[,5] simarray.part = cbind(slope.mult, adj.slope.value, qc.run.violation, adj.slope.level) list4 = as.list(simarray.part[,4]) agg.result = aggregate(simarray.part[,3], by=list(list4), FUN = mean) Error in sort.list(unique.default(x), na.last = TRUE) : 'x' must be atomic for 'sort.list' Have you called 'sort' on a list? ... I'm not sure what this means that I've done wrong. I did check 'list4' using is.list, and get TRUE back as an answer, so feel that my mistake is some other fundamental aspect of R that I'm failing to grasp. To your note on 'tapply' ... I did try this as well (actually, tried it first) with no initial success. On your recommendation, I gave tapply another go, and get something recognizable: vtt = tapply(simarray.part[,3], simarray.part[,2], mean) snipped other stuff... I would like to be able to plot one against the other. plot(names(vtt), vtt) Or perhaps: plot(as.numeric(names(vtt)), vtt) -- David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Automatically Adding Categories
Try this: table(factor(pre, levels = c(D, F, T)), factor(post, levels = c(D, F, T))) On Mon, Jun 22, 2009 at 8:56 PM, Jeffrey Edgington jedgi...@du.edu wrote: Greetings, I have two files which contain responses to a series of multiple choice questions. One file contains responses before an intervention and the other contains the responses afterward. There were three possible responses to each question: D, F, T (for Don't Know, False, and True). I would like to try McNemar's test to determine if there was any significant difference between before and after. I read the files. I create tables using: firstQuestion - table( PreSurveyData$q1, PostSurveyData$q2) for example. The problem is that for several of the questions not all of the possible responses appear. So I get a table like this: T D 6 F 2 T 12 Which cannot be used in mcnemar.test because it is not a square table and does not have enough rows and columns. Is there some way to specify that R should count the occurrences of D, F, and T even though they do not appear in the Data? Or some easy way to add the missing columns? Thank you, Jeffrey Edgington __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.