Re: [R] Storing and managing custom R functions for re-use
I think most of us are in a similar situation. I've usually kept mine in a file which is sourced when I start R. The main problem I have with this is that it clutters up my environment with a lot of stuff I don't need all the time. I'm in the process of creating a custom package which will be lazy-loaded; I believe a previous discussion of this topic suggested that as the preferred method. On 07/09/2011 07:30 AM, Simon Chamaillé-Jammes wrote: Dear all, sorry if this is a bit of a sidetrack for R-help. As a regular R user I have developed quite a lot of custom R functions, to the point of not always remembering what I have already programmed, where the file is, and so on. I was wondering what other people do in this regard. A basic file with all your functions, a custom R package, or something integrated directly into a profile file? I'm considering that a blog with tagged posts may be a good solution (and really good ones could join R-bloggers, maybe). If someone is happy to share what (s)he considers good practice, thanks. simon __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
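For the "clutters up my environment" problem, a middle ground between a plain source() and a full package is to source the helpers into their own environment and attach it. A minimal sketch (the helper file and the cv() function below are invented for illustration; in practice you would point sys.source() at your own file of functions):

```r
## Keep helper functions out of the global workspace by sourcing them
## into a separate environment and attaching that to the search path.
## The tiny helper file written here is only for illustration.
helper_file <- tempfile(fileext = ".R")
writeLines("cv <- function(x) sd(x) / mean(x)", helper_file)

myfuns <- new.env()
sys.source(helper_file, envir = myfuns)   # parsed into myfuns, not .GlobalEnv
attach(myfuns, name = "myfuns")           # functions found via the search path

cv(c(2, 4, 6))   # callable, yet ls() no longer shows it
```

Put the sys.source()/attach() lines in a .First function in ~/.Rprofile and the helpers are available in every session without polluting ls().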
Re: [R] Confidence bands in ggplot2
You can easily do this by:

qplot(x = as.factor(sch), y = est, geom = 'point', colour = 'red') +
  geom_pointrange(aes(x = as.factor(sch), y = est, ymin = lower.95ci, ymax = upper.95ci)) +
  xlab('School') + ylab('Value-added') + theme_bw()

On 07/07/2011 05:55 PM, Christopher Desjardins wrote: Hi, I have the following data:

est
    sch190     sch107     sch290     sch256     sch287      sch130     sch139
4.16656026 2.64306071 4.22579866 6.12024789 4.49624748 11.12799127 1.17353917
    sch140      sch282      sch161      sch193     sch156     sch288     sch352
3.48197696 -0.29659410 -1.99194986 10.23489859 7.77342138 6.77624539 9.66795001
    sch368     sch225     sch301     sch105      sch353      sch291      sch179
7.20229569 4.41989204 5.61586860 5.99460203 -2.65019242 -9.42614560 -0.25874193
    sch134      sch135      sch324     sch360         bb1
3.26432479 10.52555091 -0.09637968 2.49668858 -3.24173545

se
  sch190   sch107   sch290   sch256   sch287   sch130   sch139   sch140
3.165127 3.710750 4.680911 6.335386 3.896302 4.907679 4.426284 4.266303
  sch282   sch161   sch193   sch156   sch288   sch352   sch368   sch225
3.303747 4.550193 3.995261 5.787374 5.017278 7.820763 7.253183 4.483988
  sch301   sch105    sch353   sch291   sch179   sch134    sch135   sch324
4.076570 7.564359 10.456522 5.705474 4.247927 5.671536 10.567093 4.138356
  sch360      bb1
4.943779 1.935142

sch
 [1] 190 107 290 256 287 130 139 140 282 161 193 156 288
[14] 352 368 225 301 105 353 291 179 134 135 324 360 BB

From this data I have created 95% confidence intervals assuming a normal distribution:

lower.95ci <- est - se*qnorm(.975)
upper.95ci <- est + se*qnorm(.975)

What I'd like to do is plot the estimate (est) and have lines attach to the points located in lower.95ci and upper.95ci. Presently I am doing the following:

qplot(x = as.factor(sch), y = lower.95ci) +
  geom_point(aes(x = as.factor(sch), y = upper.95ci), colour = 'black') +
  geom_point(aes(x = as.factor(sch), y = est), colour = 'red') +
  ylab('Value-Added') + xlab('School') + theme_bw()

Which creates this graph --- http://dl.dropbox.com/u/1501309/value_added_test.pdf That's fine except that it doesn't connect the points vertically.
Does anyone know how I could make the 'black' points connect to the 'red' point, i.e. show confidence bands? Thanks, Chris
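In full, a self-contained sketch of the geom_pointrange() approach (ggplot2 assumed installed; the three schools and their values here are made up, not taken from Chris's data):

```r
library(ggplot2)

## Invented example data: one estimate and standard error per school
d <- data.frame(sch = factor(c(190, 107, 290)),
                est = c(4.17, 2.64, 4.23),
                se  = c(3.17, 3.71, 4.68))
d$lower.95ci <- d$est - qnorm(.975) * d$se
d$upper.95ci <- d$est + qnorm(.975) * d$se

## One pointrange per school: the point at est, the bar from lower to upper
p <- ggplot(d, aes(x = sch, y = est, ymin = lower.95ci, ymax = upper.95ci)) +
  geom_pointrange(colour = "red") +
  xlab("School") + ylab("Value-added") + theme_bw()
print(p)
```

geom_pointrange() draws the point and its vertical interval as one geom, so there is no need for three separate point layers.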
Re: [R] linear regression in a data.frame using recast -- A fortunes candidate??
Seconded On 03/16/2011 05:37 PM, Bert Gunter wrote: Ha! -- A fortunes candidate? -- Bert If this is really a time series, then you will have serious validity problems due to auto-correlation among non-independent units. (But if you are just searching for a way to pull the wool over the eyes of the statistically uninformed, then I guess there's no stopping you.) -- David Winsemius, MD West Hartford, CT
Re: [R] Revolution Analytics reading SAS datasets
I'm sure the legal ground is tricky. However, OpenOffice, LibreOffice, and KWord have been able to open the (proprietary) MS Word doc format for a while now, and they are open source (and LibreOffice might even be GPL'd), so the algorithm is in fact published in Jeremy's sense, and has been for several years. I figure the reason for keeping the SAS reading functionality proprietary is Revolution's (perfectly legitimate) wish to make money by separating their product from GNU R and adding features that would make people want to buy rather than just download from CRAN. Within GNU R there is of course sas.get in the Hmisc package (which requires SAS). It should also be quite easy to write a wrapper around dsread, a command-line closed-source product freely downloadable in a limited form which will convert sas7bdat files to csv or tsv format (and SQL if you pay). This latter path won't require SAS locally. I'm also sure that SAS has a way to export its datasets into R, since the current version of IML Studio will in fact interact with R. On 02/10/2011 03:11 PM, Jeremy Miles wrote: On 10 February 2011 12:01, Matt Shotwell m...@biostatmatt.com wrote: On Thu, 2011-02-10 at 10:44 -0800, David Smith wrote: The SAS import/export feature of Revolution R Enterprise 4.2 isn't open-source, so we can't release it in open-source Revolution R Community, or to CRAN as we do with the ParallelR packages (foreach, doMC, etc.). Judging by the language of Dr. Nie's comments on the page linked below, it seems unlikely this feature is the result of a licensing agreement with SAS. Is that correct? There was some discussion of this on the SAS email list. People who seem to know what they were talking about said that they would have had to reverse engineer it to decode the file format. It's slightly tricky legal ground - the file format can't be copyrighted, but publishing the algorithm might not be allowed.
I guess if they release it as open source, that could be construed as publishing the algorithm. (SPSS and WPS both can open SAS files, and I'd be surprised if SAS licensed to them. Esp. WPS, whom SAS are (or were) suing for all kinds of things in court in London.) Jeremy
Re: [R] Programmaticly finding number of processors by R code
If you have installed multicore (for unix/mac), you can find the number of cores by multicore:::detectCores() On 10/3/10 1:03 PM, Ajay Ohri wrote: Dear List Sorry if this question seems very basic. Is there a function to programmatically find the number of processors in my system? I want to pass this as a parameter to snow in some serial-to-parallel code functions Regards Ajay Websites- http://decisionstats.com http://dudeofdata.com Linkedin- www.linkedin.com/in/ajayohri -- Abhijit Dasgupta, PhD Director and Principal Statistician ARAASTAT Ph: 301.385.3067 E: adasgu...@araastat.com W: http://www.araastat.com
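For reference, current R bundles this in the parallel package (which superseded multicore), so nothing extra needs to be installed:

```r
## Number of logical cores, using base R's parallel package
library(parallel)
n_cores <- detectCores()   # may be NA on unusual platforms
n_cores
```

The result can be passed straight to snow/parallel cluster constructors, e.g. makeCluster(n_cores).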
Re: [R] Creating R objects in Java
On 10/1/10 9:18 AM, lord12 wrote: How do you call R methods from Java? I want to create a GUI using Swing in Java that calls R methods. Look in the documentation for the rJava package. -- Abhijit Dasgupta, PhD Director and Principal Statistician ARAASTAT Ph: 301.385.3067 E: adasgu...@araastat.com W: http://www.araastat.com
Re: [R] R Code for paper?
Look at the qvalue package by Dabney and Storey, which might satisfy your last query. On 09/30/2010 06:40 PM, Jim Silverton wrote: Does anyone have the R code for Gilbert's 2005 paper on the discrete FDR and Tarone's 1990 paper? And Storey's pFDR?
Re: [R] R Code for paper?
Reading Gilbert's paper and references, and going on the web, I see that Gilbert provided Fortran source code for his method as well as Tarone's method. It might be possible to wrap this in R. On 09/30/2010 06:40 PM, Jim Silverton wrote: Does anyone have the R code for Gilbert's 2005 paper on the discrete FDR and Tarone's 1990 paper? And Storey's pFDR?
Re: [R] speeding up regressions using ddply
There has been a recent addition of parallel processing capabilities to plyr (I believe v1.2 and later), along with a dataframe iterator construct. Both have improved performance of ddply greatly for multicore/cluster computing. So we now have the niceness of plyr's grammar with pretty good performance. From the plyr NEWS file: Version 1.2 (2010-09-09) -- NEW FEATURES * l*ply, d*ply, a*ply and m*ply all gain a .parallel argument that, when TRUE, applies functions in parallel using a parallel backend registered with the foreach package: x <- seq_len(20) wait <- function(i) Sys.sleep(0.1) system.time(llply(x, wait)) # user system elapsed # 0.007 0.005 2.005 library(doMC) registerDoMC(2) system.time(llply(x, wait, .parallel = TRUE)) # user system elapsed # 0.020 0.011 1.038 On 9/22/10 10:41 AM, Ista Zahn wrote: Hi Alison, On Wed, Sep 22, 2010 at 11:05 AM, Alison Macalady a...@kmhome.org wrote: Hi, I have a data set that I'd like to run logistic regressions on, using ddply to speed up the computation of many models with different combinations of variables. In my experience ddply is not particularly fast. I use it a lot because it is flexible and has easy-to-understand syntax, not for its speed. I would like to run regressions on every unique two-variable combination in a portion of my data set, but I can't quite figure out how to do it using ddply. I'm not sure ddply is the tool for this job.
The data set looks like this, with status as the binary dependent variable and V1:V8 as potential independent variables in the logistic regression: m <- matrix(rnorm(288), nrow = 36) colnames(m) <- paste('V', 1:8, sep = '') x <- data.frame( status = factor(rep(rep(c('D','L'), each = 6), 3)), as.data.frame(m)) You can use combn to determine the combinations you want: Varcombos <- combn(names(x)[-1], 2) From there you can do a loop, something like results <- list() for(i in 1:dim(Varcombos)[2]) { log.glm <- glm(as.formula(paste('status ~ ', Varcombos[1,i], ' + ', Varcombos[2,i], sep='')), family=binomial(link='logit'), na.action=na.omit, data=x) glm.summary <- summary(log.glm) aic <- extractAIC(log.glm) coef <- coef(glm.summary) results[[i]] <- list(Est1=coef[1,2], Est2=coef[3,2], AIC=aic[2]) #or whatever other output here names(results)[i] <- paste(Varcombos[1,i], Varcombos[2,i], sep='_') } I'm sure you could replace the loop with something more elegant, but I'm not really sure how to go about it. I used melt to put my data frame into a more workable format require(reshape) xm <- melt(x, id = 'status') Here is the basic shape of the function I'd like to apply to every combination of variables in the dataset: h <- function(df) { attach(df) log.glm <- glm(status ~ value1 + value2, family=binomial(link='logit'), na.action=na.omit) #What I can't figure out is how to specify 2 different variables (I've put value1 and value2 as placeholders) from xm to include in the model glm.summary <- summary(log.glm) aic <- extractAIC(log.glm) coef <- coef(glm.summary) list(Est1=coef[1,2], Est2=coef[3,2], AIC=aic[2]) #or whatever other output here } And then I'd like to use ddply to speed up the computations. require(plyr) output <- ddply(xm, .(variable), as.data.frame.function(h)) output I can easily do this using ddply when I only want to use 1 variable in the model, but can't figure out how to do it with two variables. I don't think this approach can work.
You are saying split up xm by variable and then expecting to be able to reference different levels of variable within each split, an impossible request. Hope this helps, Ista Many thanks for any hints! Ali Alison Macalady Ph.D. Candidate University of Arizona School of Geography and Development Laboratory of Tree Ring Research -- Abhijit Dasgupta, PhD Director and Principal Statistician ARAASTAT Ph: 301.385.3067 E: adasgu...@araastat.com W: http://www.araastat.com
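For what it's worth, the explicit loop above can be written compactly in base R with combn() and reformulate(), with no plyr or melt at all; a sketch on the same simulated data:

```r
## One logistic fit per two-variable combination, base R only
set.seed(1)
m <- matrix(rnorm(288), nrow = 36)
colnames(m) <- paste('V', 1:8, sep = '')
x <- data.frame(status = factor(rep(rep(c('D', 'L'), each = 6), 3)),
                as.data.frame(m))

combos <- combn(names(x)[-1], 2, simplify = FALSE)   # 28 pairs
fits <- lapply(combos, function(v)
  glm(reformulate(v, response = 'status'),
      family = binomial, na.action = na.omit, data = x))

aics <- sapply(fits, function(f) extractAIC(f)[2])
names(aics) <- sapply(combos, paste, collapse = '_')
head(sort(aics))   # pairs ordered by AIC
```

reformulate(c("V1","V2"), response = "status") builds status ~ V1 + V2 directly, which avoids the error-prone paste()/as.formula() step.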
Re: [R] Creating publication-quality plots for use in Microsoft Word
On 9/15/10 10:38 AM, dadrivr wrote: Hi everyone, I am trying to make some publication-quality plots for use in Microsoft Word, but I am having trouble creating high-quality plots that are supported by Microsoft Word. If I use the R plot function to create the figure, the lines are jagged, and the picture is not of high quality (same with the jpeg(), tiff(), and png() functions). I have tried using the Cairo package, but it distorts my dashed lines, and win.metafile results in a picture of terrible quality. The only way I have succeeded in getting a high-quality picture in a file is by using the pdf() function to save the plot as a pdf file, but all my attempts to convert the image in the pdf file to a TIFF or other file type accepted by Word result in considerably degraded quality. Do you have any suggestions for creating publication-quality plots in R that can be placed in Word documents? What packages, functions (along with options), and/or conversions would you use? Thanks so much for your help! Another option I've used is to export to PDF (which seems to give the best quality) and then use the (free) ImageMagick program to convert the PDF to a high-resolution PNG. This worked for some involved heatmaps that were submitted to a journal. ImageMagick can be downloaded directly for Windows or via Cygwin. Suppose your figure is in fig1.pdf. You can use the following command (once ImageMagick is downloaded and in your path): system("convert -density 300x300 fig1.pdf fig1.png") -- Abhijit Dasgupta, PhD Director and Principal Statistician ARAASTAT Ph: 301.385.3067 E: adasgu...@araastat.com W: http://www.araastat.com
Re: [R] aggregate, by, *apply
I would approach this slightly differently. I would make func a function of x and y. func <- function(x, y){ m <- median(x) return(m > 2 && m < y) } Now generate tmp just as you have. Then: require(plyr) res <- daply(tmp, .(z), summarise, res = func(x, y)) I believe this does the trick Abhijit On 9/15/10 5:45 PM, Mark Ebbert wrote: Dear R gurus, I regularly come across a situation where I would like to apply a function to a subset of data in a dataframe, but I have not found an R function to facilitate exactly what I need. More specifically, I'd like my function to have a context of where the data it's analyzing came from. Here is an example: ### BEGIN ### func <- function(x){ m <- median(x$x) if(m > 2 && m < x$y){ return(T) } return(F) } tmp <- data.frame(x=1:10, y=c(rep(34,3),rep(35,3),rep(34,4)), z=c(rep('a',3),rep('b',3),rep('c',4))) res <- aggregate(tmp, list(z), func) ### END ### The values in the example are trivial, but the problem is that only one column is passed to my function at a time, so I can't determine how 'm' relates to 'x$y'. Any tips/guidance is appreciated. Mark T. W. Ebbert -- Abhijit Dasgupta, PhD Director and Principal Statistician ARAASTAT Ph: 301.385.3067 E: adasgu...@araastat.com W: http://www.araastat.com
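The same per-group computation can also be done in base R with split(), which makes the whole-data-frame context explicit. A sketch (the comparison m > 2 && all(m < y) is my reading of the intended test, since the original operators were lost in transit):

```r
## Split the data frame by z; each piece keeps all its columns,
## so func can see both x and y of the group at once.
func <- function(x, y) {
  m <- median(x)
  m > 2 && all(m < y)    # assumed intent of the original comparison
}
tmp <- data.frame(x = 1:10,
                  y = c(rep(34, 3), rep(35, 3), rep(34, 4)),
                  z = c(rep('a', 3), rep('b', 3), rep('c', 4)))
res <- sapply(split(tmp, tmp$z), function(d) func(d$x, d$y))
res   # one named logical per level of z
```

split()/sapply() is the base-R analogue of ddply: the function receives the whole sub-data-frame, so there is no "one column at a time" restriction as with aggregate().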
Re: [R] Saving/loading custom R scripts
You can create a .First function in your .Rprofile file (which will be at ~/.Rprofile). For example .First <- function(){ source("Friedman-Test-with-Post-Hoc.r.txt") } You can also create your own package (mylibrary) down the line (see the R manual for creating extensions at http://cran.fhcrc.org/doc/manuals/R-exts.pdf), which will be a collection of the custom scripts you have written, and then you can automatically load them using .First <- function(){ library(mylibrary) } Hope this helps. Abhijit On 9/8/10 3:25 AM, DrCJones wrote: Hi, How does R automatically load functions so that they are available from the workspace? Is it anything like Matlab - you just specify a directory path and it finds it? The reason I ask is because I found a really nice script that I would like to use on a regular basis, and it would be nice not to have to copy and paste it into R on every startup: http://www.r-statistics.com/wp-content/uploads/2010/02/Friedman-Test-with-Post-Hoc.r.txt This would be for Ubuntu, if that makes any difference. Cheers -- Abhijit Dasgupta, PhD Director and Principal Statistician ARAASTAT Ph: 301.385.3067 E: adasgu...@araastat.com W: http://www.araastat.com
Re: [R] Something similar to layout in lattice or ggplot
Hi Thierry, It's really the latter I want... independent plots. I use faceting quite a bit, but I need things like a page of plots for simulations under different conditions. I suppose I can still use faceting combined with reshape, but I'd rather not go that route if I can help it. Abhijit On 9/7/10 10:44 AM, ONKELINX, Thierry wrote: Dear Abhijit, In ggplot you can use facetting (facet_grid() or facet_wrap()) to create subplots based on the same dataset. Or you can work with viewport() if you want several independent plots. HTH, Thierry ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek team Biometrie Kwaliteitszorg Gaverstraat 4 9500 Geraardsbergen Belgium Research Institute for Nature and Forest team Biometrics Quality Assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 thierry.onkel...@inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey -Original message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of Abhijit Dasgupta Sent: Tuesday, 7 September 2010 16:38 To: r-help@r-project.org Subject: [R] Something similar to layout in lattice or ggplot Hi, Is there a function similar to the layout function in base graphics in either lattice or ggplot?
I'm hoping someone has written a function wrapper to the appropriate commands in grid that would make this easier :) Abhijit -- Abhijit Dasgupta, PhD Director and Principal Statistician ARAASTAT Ph: 301.385.3067 E: adasgu...@araastat.com W: http://www.araastat.com
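A sketch of Thierry's viewport() route for independent ggplots on one layout()-like page (ggplot2 assumed installed; wrappers such as gridExtra's grid.arrange() now package this up):

```r
library(ggplot2)
library(grid)

## Two unrelated plots to place side by side
p1 <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p2 <- ggplot(mtcars, aes(factor(cyl))) + geom_bar()

## A 1 x 2 grid of viewports, then print each plot into its cell
grid.newpage()
pushViewport(viewport(layout = grid.layout(1, 2)))
print(p1, vp = viewport(layout.pos.row = 1, layout.pos.col = 1))
print(p2, vp = viewport(layout.pos.row = 1, layout.pos.col = 2))
```

grid.layout() also accepts widths/heights arguments, which gives unequal panels much like the matrix argument to base layout().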
Re: [R] Something similar to layout in lattice or ggplot
Thank you all for the suggestions. They have all been immensely helpful. Abhijit On 9/7/10 10:44 AM, ONKELINX, Thierry wrote: Dear Abhijit, In ggplot you can use facetting (facet_grid() or facet_wrap()) to create subplots based on the same dataset. Or you can work with viewport() if you want several independent plots. HTH, Thierry ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek team Biometrie Kwaliteitszorg Gaverstraat 4 9500 Geraardsbergen Belgium Research Institute for Nature and Forest team Biometrics Quality Assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 thierry.onkel...@inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey -Original message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of Abhijit Dasgupta Sent: Tuesday, 7 September 2010 16:38 To: r-help@r-project.org Subject: [R] Something similar to layout in lattice or ggplot Hi, Is there a function similar to the layout function in base graphics in either lattice or ggplot? I'm hoping someone has written a function wrapper to the appropriate commands in grid that would make this easier :) Abhijit
-- Abhijit Dasgupta, PhD Director and Principal Statistician ARAASTAT Ph: 301.385.3067 E: adasgu...@araastat.com W: http://www.araastat.com
Re: [R] ggplot inside cycle
You haven't wrapped p in the print command, which is one of the ways to make sure the plot gets printed when we need it. print(p + geom_point(aes(size=3))) does the trick On 08/26/2010 06:08 AM, Petr PIKAL wrote: Dear all I want to save several ggplots in one pdf document. I tried this for (i in names(iris)[2:4]) { p <- ggplot(iris, aes(x=Sepal.Length, y=iris[,i], colour=Species)) p + geom_point(aes(size=3)) } with different variations of y input but was not successful. In the past I used qplot in a similar fashion, which worked for(i in names(mleti)[7:15]) print(qplot(sito, mleti1[,i], facets=~typ, ylab=i, geom=c('point', 'line'), colour=ordered(minuty), data=mleti1)) So I wonder if anybody has used ggplot in a cycle and how to solve the input of variables throughout the cycle Thank you Petr
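A sketch of the full loop writing one page per variable. Note two assumptions: ggplot2 is installed, and it is a recent version supporting the .data pronoun, which also fixes the fragile y=iris[,i] mapping (in older ggplot2, aes_string() played the same role):

```r
library(ggplot2)

out <- tempfile(fileext = ".pdf")  # use your own file name in practice
pdf(out)
for (i in names(iris)[2:4]) {
  p <- ggplot(iris, aes(x = Sepal.Length, y = .data[[i]], colour = Species)) +
    geom_point(size = 3) +   # size as a fixed value, not inside aes()
    ylab(i)
  print(p)   # inside for(), plots must be print()ed explicitly
}
dev.off()
```

Automatic printing only happens at the top level of the console; inside for(), functions, or sourced scripts, an un-print()ed ggplot object is silently discarded, which is why the original loop produced an empty pdf.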
Re: [R] How to remove rows based on frequency of factor and then difference date scores
An answer to 1) x = data.frame(Type=c('A','A','B','B'), ID=c(1,1,3,1), Date = c('16/09/2010','23/09/2010','18/8/2010','13/5/2010'), Value=c(8,9,7,6)) x Type ID Date Value 1 A 1 16/09/2010 8 2 A 1 23/09/2010 9 3 B 3 18/8/2010 7 4 B 1 13/5/2010 6 x$Date = as.Date(x$Date, format='%d/%m/%Y') library(plyr) x$uniqueID = paste(x$Type, x$ID, sep='') nobs = daply(x, ~uniqueID, nrow) keep = names(nobs)[nobs > 1] newx = x[x$uniqueID %in% keep, ] An answer to 2) require(plyr) ddply(newx, ~uniqueID, transform, newDate = as.numeric(Date - min(Date) + 1)) On 08/24/2010 01:19 PM, Chris Beeley wrote: Hello- A basic question which has nonetheless floored me entirely. I have a dataset which looks like this: Type ID Date Value A 1 16/09/2020 8 A 1 23/09/2010 9 B 3 18/8/2010 7 B 1 13/5/2010 6 There are two Types, which correspond to different individuals in different conditions, and loads of ID labels (1:50) corresponding to the different individuals in each condition, and measurements at different times (from 1 to 10 measurements) for each individual. I want to perform the following operations: 1) Delete all individuals for whom only one measurement is available. In the dataset above, you can see that I want to delete the row Type B ID 3, and Type B ID 1, but without deleting the Type A ID 1 data, because there is more than one measurement for Type A ID 1 (but not for Type B ID 1) 2) Produce difference scores for each of the Dates, so each individual (Type A ID 1 and all the others for whom more than one measurement exists) starts at Date 1 and goes up in integers according to how many days have elapsed. I just know there's some incredibly cunning R-ish way of doing this but after many hours of fiddling I have had to admit defeat. I would be very grateful for any words of advice.
Many thanks, Chris Beeley, Institute of Mental Health, UK -- Abhijit Dasgupta, PhD Director and Principal Statistician ARAASTAT Ph: 301.385.3067 E: adasgu...@araastat.com W: http://www.araastat.com
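Both steps can also be done in base R with ave() and interaction(), with no plyr dependency; a sketch on the same example data:

```r
x <- data.frame(Type = c('A', 'A', 'B', 'B'),
                ID   = c(1, 1, 3, 1),
                Date = as.Date(c('16/09/2010', '23/09/2010',
                                 '18/8/2010',  '13/5/2010'),
                               format = '%d/%m/%Y'),
                Value = c(8, 9, 7, 6))

## one individual = one (Type, ID) pair
x$uid <- interaction(x$Type, x$ID, drop = TRUE)

## 1) keep individuals with more than one measurement
newx <- x[ave(seq_len(nrow(x)), x$uid, FUN = length) > 1, ]

## 2) days elapsed since each individual's first measurement, starting at 1
newx$newDate <- ave(as.numeric(newx$Date), newx$uid,
                    FUN = function(d) d - min(d) + 1)
newx
```

interaction() does the Type-plus-ID grouping directly, so no paste()-d identifier is needed, and ave() returns its result in the original row order.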
Re: [R] How to remove rows based on frequency of factor and then difference date scores
The only problem with this is that Chris's unique individuals are a combination of Type and ID, as I understand it. So Type=A, ID=1 is a different individual from Type=B,ID=1. So we need to create a unique identifier per person, simplistically by uniqueID=paste(Type, ID, sep=''). Then, using this new identifier, everything follows. On 08/24/2010 01:53 PM, David Winsemius wrote: On Aug 24, 2010, at 1:19 PM, Chris Beeley wrote: Hello- A basic question which has nonetheless floored me entirely. I have a dataset which looks like this: Type ID DateValue A 116/09/2020 8 A 1 23/09/2010 9 B 3 18/8/20107 B 1 13/5/20106 There are two Types, which correspond to different individuals in different conditions, and loads of ID labels (1:50) corresponding to the different individuals in each condition, and measurements at different times (from 1 to 10 measurements) for each individual. I want to perform the following operations: 1) Delete all individuals for whom only one measurement is available. In the dataset above, you can see that I want to delete the row Type B ID 3, and Type B ID 1, but without deleting the Type A ID 1 data because there is more than one measurement for Type A ID 1 (but not for Type B ID1) 2) Produce difference scores for each of the Dates, so each individual (Type A ID1 and all the others for whom more than one measurement exists) starts at Date 1 and goes up in integers according to how many days have elapsed. I just know there's some incredibly cunning R-ish way of doing this but after many hours of fiddling I have had to admit defeat. Not sure about terribly cunning. 
Let's assume your dataframe was read in with stringsAsFactors=FALSE and is called txt.df:

txt.df$dt2 <- as.Date(txt.df$Date, format="%d/%m/%Y")
txt.df
  Type ID       Date Value        dt2
1    A  1 16/09/2020     8 2020-09-16
2    A  1 23/09/2010     9 2010-09-23
3    B  3  18/8/2010     7 2010-08-18
4    B  1  13/5/2010     6 2010-05-13

txt.df$nn <- ave(txt.df$ID, txt.df$ID, FUN=length)
txt.df
  Type ID       Date Value        dt2 nn
1    A  1 16/09/2020     8 2020-09-16  3
2    A  1 23/09/2010     9 2010-09-23  3
3    B  3  18/8/2010     7 2010-08-18  1
4    B  1  13/5/2010     6 2010-05-13  3

txt.df[ -which( txt.df$nn == 1), ]
  Type ID       Date Value        dt2 nn
1    A  1 16/09/2020     8 2020-09-16  3
2    A  1 23/09/2010     9 2010-09-23  3
4    B  1  13/5/2010     6 2010-05-13  3

# Task #1 accomplished

tapply(txt.df$dt2, txt.df$ID, function(x) x[1] - x)
$`1`
Time differences in days
[1]    0 3646 3779
$`3`
Time difference of 0 days

unlist( tapply(txt.df$dt2, txt.df$ID, function(x) x[1] - x) )
  11   12   13    3
   0 3646 3779    0

txt.df$diffdays <- unlist( tapply(txt.df$dt2, txt.df$ID, function(x) x[1] - x) )
txt.df
  Type ID       Date Value        dt2 nn diffdays
1    A  1 16/09/2020     8 2020-09-16  3        0
2    A  1 23/09/2010     9 2010-09-23  3     3646
3    B  3  18/8/2010     7 2010-08-18  1     3779
4    B  1  13/5/2010     6 2010-05-13  3        0

I would be very grateful for any words of advice. Many thanks, Chris Beeley, Institute of Mental Health, UK David Winsemius, MD West Hartford, CT
--
Abhijit Dasgupta, PhD
Director and Principal Statistician
ARAASTAT
Ph: 301.385.3067
E: adasgu...@araastat.com
W: http://www.araastat.com
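Putting the two messages together, here is a minimal self-contained sketch of the approach (base R only). The data frame is re-typed by hand from Chris's example, individuals are taken to be (Type, ID) pairs as Abhijit suggests, and `x - min(x)` is used in place of `x[1] - x` so the day differences are non-negative regardless of row order:

```r
## Re-typed example data (values are assumptions copied from the post)
txt.df <- data.frame(
  Type  = c("A", "A", "B", "B"),
  ID    = c(1, 1, 3, 1),
  Date  = c("16/09/2020", "23/09/2010", "18/8/2010", "13/5/2010"),
  Value = c(8, 9, 7, 6),
  stringsAsFactors = FALSE
)
txt.df$dt2 <- as.Date(txt.df$Date, format = "%d/%m/%Y")

## Count measurements per individual, where an individual is a
## (Type, ID) pair, then drop anyone with a single measurement.
txt.df$nn <- ave(seq_len(nrow(txt.df)), txt.df$Type, txt.df$ID, FUN = length)
new.df <- txt.df[txt.df$nn > 1, ]

## Days elapsed since each individual's earliest measurement.
new.df$diffdays <- ave(as.numeric(new.df$dt2), new.df$Type, new.df$ID,
                       FUN = function(x) x - min(x))
new.df
```

Using ave() rather than unlist(tapply(...)) sidesteps the row-ordering issue discussed below, since ave() returns its result in the original row order.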
Re: [R] How to remove rows based on frequency of factor and then difference date scores
The paste-y argument is my usual trick in these situations. I forget that tapply can take multiple grouping arguments :)

Abhijit

On 08/24/2010 02:17 PM, David Winsemius wrote:
On Aug 24, 2010, at 1:59 PM, Abhijit Dasgupta, PhD wrote:

The only problem with this is that Chris's unique individuals are a combination of Type and ID, as I understand it. So Type=A, ID=1 is a different individual from Type=B, ID=1. So we need to create a unique identifier per person, simplistically by uniqueID <- paste(Type, ID, sep=''). Then, using this new identifier, everything follows.

I see your point. I agree that a tapply method should present both factors in the indices argument.

new.df <- txt.df[ -which( txt.df$nn == 1), ]
new.df <- new.df[ with(new.df, order(Type, ID) ), ]  # and possibly needs to be ordered?
new.df$diffdays <- unlist( tapply(new.df$dt2, list(new.df$ID, new.df$Type), function(x) x[1] - x) )
new.df
  Type ID       Date Value        dt2 nn diffdays
1    A  1 16/09/2020     8 2020-09-16  3        0
2    A  1 23/09/2010     9 2010-09-23  3     3646
4    B  1  13/5/2010     6 2010-05-13  3        0

But I do not agree that you need, in this case at least, to create a paste()-y index.

Agreed; however, such a construction can be useful in other situations.
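To make the contrast concrete, a tiny illustration on toy data (not from the thread) of the two equivalent grouping styles:

```r
## Toy data: two individuals who share ID=1 but differ in Type
Type <- c("A", "A", "B", "B")
ID   <- c(1, 1, 1, 1)
val  <- c(10, 4, 7, 2)

## paste()-y composite identifier: one flat grouping vector
tapply(val, paste(Type, ID, sep = ""), sum)

## list-of-indices form: tapply crosses the factors itself
## and returns a (ID x Type) array instead of a flat vector
tapply(val, list(ID, Type), sum)
```

Both give the same group sums (A1 = 14, B1 = 9); the list form keeps the two factors distinguishable in the result's dimnames, while the pasted key collapses them into a single label.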
Re: [R] How to apply apply?!
For 1, an easy way is

dat <- transform(dat, CLOSE2 = 2*CLOSE)

For 2:

apply(dat, 1, fun)

On 08/06/2010 03:06 PM, Raghuraman Ramachandran wrote:

guRus

I have, say, a dataframe, d, and I wish to do the following:

1) For each row, I want to take one particular value of the row and multiply it by 2. How do I do it? Say the data frame is as below:

  OPEN   HIGH    LOW  CLOSE
1931.2 1931.2 1931.2 1931.2
     0      0      0 999.05
     0      0      0 1052.5
     0      0      0  987.8
     0      0      0  925.6
     0      0      0    866
     0      0      0 1400.2
     0      0      0  754.5
     0      0      0  702.6
     0      0      0 653.25
     0      0      0    348
     0      0      0    801
866.55 866.55 866.55 866.55
 783.1  783.1 742.25 742.25
   575    575    575    575
     0      0      0    493
   470    470    420    425
   355    360    343    360
312.05 312.05    274 280.85
257.35 257.35    197 198.75
   182 185.95    137 150.75
120.25    129   90.7 101.25
 91.85  91.85     57   66.6

How do I multiply only the close of every row using the 'apply' function? And once multiplied, how do I obtain a new table that also contains the new 2*CLOSE column (without cbind)?

2) Also, how do I run a generic function per row? Say, for example, I want to calculate the implied volatility for each row of this data frame (using the Rmetrics packages). How do I do that, please, using the apply function? I am focusing on apply because I like the vectorisation concept in R and I do not want to use a for loop etc.

Many thanks for the enlightenment,
Raghu
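Both suggestions in one runnable sketch, on a toy slice of the data (only the first few rows are re-typed, so treat the numbers as assumptions; `fun` is a hypothetical stand-in for a real per-row routine such as an implied-volatility calculation):

```r
## Toy version of the data frame (first three distinct rows)
dat <- data.frame(OPEN  = c(1931.2, 0, 866.55),
                  HIGH  = c(1931.2, 0, 866.55),
                  LOW   = c(1931.2, 0, 866.55),
                  CLOSE = c(1931.2, 999.05, 866.55))

## 1) Add a doubled CLOSE column without an explicit cbind():
##    transform() returns the data frame with the new column attached.
dat <- transform(dat, CLOSE2 = 2 * CLOSE)

## 2) Apply an arbitrary function to each row; apply() passes each
##    row as a named vector, so columns can be picked out by name.
fun <- function(row) (row["HIGH"] - row["LOW"]) / max(row["CLOSE"], 1)
apply(dat[, c("HIGH", "LOW", "CLOSE")], 1, fun)
```

Note that for task 1 no apply() is actually needed: `2 * CLOSE` is already vectorised over rows, which is why transform() alone suffices.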
Re: [R] How to extract se(coef) from cph?
If the cph model fit is m1, you can try

sqrt(diag(m1$var))

This is how it is coded in print.cph.fit (library(rms)).

On 08/05/2010 04:03 PM, Biau David wrote:

Hello, I am modeling some survival data with cph (Design). I have modeled a predictor which showed a non-linear effect with restricted cubic splines. I would like to retrieve the se(coef) for the other, linear, predictors. This is just to make nice LaTeX tables automatically. I have the coefficients with coef(). How do I do that?

Thanks,
David Biau.
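A minimal sketch of the suggestion, assuming the rms package (the successor to Design) is installed; the data are simulated, so the model and numbers are illustrative only:

```r
## Simulated survival data with one non-linear and one linear predictor
library(rms)
set.seed(1)
d <- data.frame(time  = rexp(100),
                event = rbinom(100, 1, 0.7),
                age   = rnorm(100, 50, 10),
                x     = rnorm(100))
dd <- datadist(d); options(datadist = "dd")

## Restricted cubic spline for age, linear term for x
m1 <- cph(Surv(time, event) ~ rcs(age, 3) + x, data = d)

## Standard errors of all coefficients, spline terms included;
## pair them with coef() for a LaTeX-ready table.
se <- sqrt(diag(m1$var))
cbind(coef = coef(m1), se = se)
```

From there a package such as xtable (or manual sprintf formatting) can turn the coef/se matrix into a LaTeX table.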