Re: [R] by function ??
Or, more generally (if you need to include more than one variable from TestData), something like

by(TestData, LEAID, function(x) median(x$RATIO))

Agreed, this is less appealing for the given example than Ista's code, but it might help to better understand by() and to generalize its use to other situations.

Michael

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Ista Zahn
Sent: Wednesday, 9 December 2009 02:54
To: L.A.
Cc: r-help@r-project.org
Subject: Re: [R] by function ??

Hi, I think you want

by(TestData[, "RATIO"], LEAID, median)

-Ista

On Tue, Dec 8, 2009 at 8:36 PM, L.A. ro...@millect.com wrote:

I'm just learning and this is probably very simple, but I'm stuck. I'm trying to understand by(). This works:

by(TestData, LEAID, summary)

But this doesn't:

by(TestData, LEAID, median(RATIO))

ERROR: could not find function FUN

HELP! Thanks, LA

--
View this message in context: http://n4.nabble.com/by-function-tp955789p955789.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org
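A small self-contained illustration of the two suggested calls; the TestData values below are made up for demonstration and are not the poster's data:

# Hedged stand-in for the poster's TestData
TestData <- data.frame(LEAID = rep(c("a", "b"), each = 5),
                       RATIO = 1:10)

# Ista's form: pass the single column plus the grouping variable
by(TestData$RATIO, TestData$LEAID, median)

# Michael's more general form: pass the whole data frame and
# compute on the relevant column inside the function
by(TestData, TestData$LEAID, function(x) median(x$RATIO))

Note that `median(RATIO)` in the original call fails because by() expects a function as its third argument, not the already-evaluated result of calling one.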
[R] R: Serial Correlation in panel data regression
Dear Sayan, no, unfortunately I don't think it will. Here's basically how coeftest() works: if you call the coeftest() function on a model object, say 'mymodel', it will apply both a 'coef' and a 'vcov' method to mymodel in order to extract beta and vcov(beta) and do a Wald test. coeftest() works with many different kinds of models, represented by 'lm', 'glm', 'plm' objects and so on, each containing a 'standard' covariance matrix, so the default behaviour is just to extract the latter. Alternatively, you can supply a vcov method of your choice to coeftest() and have it do robust testing etc., but it will still have to be one that fits your kind of model. So if 'mymodel' is a plm object, then coeftest(mymodel, vcov=vcovHC) will use the White-Arellano covariance matrix, which as observed is robust vs. serial correlation in its peculiar way, different from the Newey-West-based vcovHAC for 'lm' objects.

I'm too ignorant of the subject to give advice on tobit models, but a quick glance (?tobit) reveals that 'tobit' class objects inherit from 'survreg' ones, so that's the direction in which to look. Maybe you are in a position to simply pool the data and use standard tobit and vcovHAC? Panel data would have N observations out of NT that are serially uncorrelated by construction, and of course this would imply the assumption of no individual effects whatsoever (but I am just guessing here...).

Best wishes, Giovanni

From: sayan dasgupta [mailto:kitt...@gmail.com]
Sent: Wednesday, 9 December 2009 06:59
To: Millo Giovanni; Achim Zeileis; yves.croiss...@let.ish-lyon.cnrs.fr
Cc: r-help@r-project.org
Subject: Re: Serial Correlation in panel data regression

Dear Sir, thanks for your reply, but there is still a catch. Basically I want to do panel tobit. I am using the tobit function from the package AER on panel data.
Suppose that Gasoline$lgaspcar is zero-inflated data and I do

m1 <- tobit(as.formula(paste("lgaspcar ~", rhs)), data=Gasoline)

then if I do

library(lmtest)
coeftest(m1, vcovHC)

will it take account of the heteroskedasticity and serial correlation (within country) of the data?

Regards, Sayan Dasgupta

On Tue, Dec 8, 2009 at 8:29 PM, Millo Giovanni giovanni_mi...@generali.com wrote:

Dear Sayan, there is a vcovHC method for panel models doing the White-Arellano covariance matrix, which is robust vs. heteroskedasticity *and* serial correlation, although in a different way from that of vcovHAC. You can supply it to coeftest as well, just as you did. The point is in estimating the model as a panel model in the first place. So this should do what you need:

data("Gasoline", package="plm")
Gasoline$f.year <- as.factor(Gasoline$year)
library(plm)
rhs <- "-1 + f.year + lincomep + lrpmg + lcarpcap"
pm1 <- plm(as.formula(paste("lgaspcar ~", rhs)), data=Gasoline, model="pooling")
library(lmtest)
coeftest(pm1, vcov=vcovHC)

Please refer to the package vignette for 'plm' to check what it does exactly. Let me know if there are any issues.
Best, Giovanni

-----Original Message-----
From: Achim Zeileis [mailto:achim.zeil...@wu-wien.ac.at]
Sent: Tue 08/12/2009 13:48
To: sayan dasgupta
Cc: r-help@R-project.org; yves.croiss...@let.ish-lyon.cnrs.fr; Millo Giovanni
Subject: Re: Serial Correlation in panel data regression

On Tue, 8 Dec 2009, sayan dasgupta wrote:

Dear R users, I have a question here.

library(AER)
library(plm)
library(sandwich)

## take the following data
data("Gasoline", package="plm")
Gasoline$f.year <- as.factor(Gasoline$year)

Now I run the following regression:

rhs <- "-1 + f.year + lincomep + lrpmg + lcarpcap"
m1 <- lm(as.formula(paste("lgaspcar ~", rhs)), data=Gasoline)

### Now I want to find the autocorrelation- and heteroskedasticity-adjusted
### standard errors as part of coeftest.
### Basically I would like to take care of the within-country serial correlation,
### that is, I want to do:
coeftest(m1, vcov=function(x) vcovHAC(x, order.by=...))

Please suggest what should be the argument of order.by and whether that will give me the desired result.

Currently, the default vcovHAC() method just implements the time series case. A generalization to panel data is not yet available. Maybe Yves and Giovanni (authors of plm) have done something in that direction... sorry, Z
Re: [R] grep() exclude certain patterns?
Hi, just a quick note regarding Google and R: I use www.rseek.org almost exclusively, and it tends to give me the results I need. It is based on Google, but uses a number of smart tricks to ferret out R-relevant information.

/Gustaf

--
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address: Essingetorget 40, 112 66 Stockholm, SE
skype: gustaf_rydevik
[R] {Lattice} cloud() help
Dear all, I'm a lattice graphics newbie. I'm trying to make a cut through a 3D scatterplot drawn with cloud(), i.e. to add a partially transparent plane perpendicular to the z-axis in order to separate the cloud.

Code:

library(lattice)
cloud(Sepal.Width ~ Petal.Length * Petal.Width, data=iris)

Could someone give me some hints on how to manipulate the panel functions in this case? Thanks!

Jimmy
Re: [R] Exporting Contingency Tables with xtable
Try the latex function in the Hmisc package, using the state.* variables built into R for the sake of example:

library(Hmisc)
latex(table(state.division, state.region), rowlabel = "X", collabel = "Y", file = "")

On Wed, Dec 9, 2009 at 12:04 AM, Na'im R. Tyson nty...@clovermail.net wrote:

Dear R-philes: I am having an issue with exporting contingency tables with xtable(). I set up a contingency table and convert it to a matrix for passing to xtable() as shown below.

v.cont.table <- table(v_lda$class, grps, dnn=c("predicted", "observed"))
v.cont.mat <- as.matrix(v.cont.table)

Both produce output as follows:

         observed
predicted  uh uh~
      uh  201  30
      uh~   6  10

However, when I construct the LaTeX table with xtable(v.cont.mat), I get a good table but without the headings "predicted" and "observed":

\begin{table}[ht]
\begin{center}
\begin{tabular}{rrr}
\hline
 & uh & uh\~{} \\
\hline
uh & 201 & 30 \\
uh\~{} & 6 & 10 \\
\hline
\end{tabular}
\end{center}
\end{table}

Question: is there an easy way to retain or re-insert the dimension names from the contingency table and matrix?
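One possible workaround (a sketch of my own, not from the thread): since xtable() drops the dnn headings, fold them into the row and column labels before exporting. The example below uses the built-in state.* data rather than the poster's v_lda objects.

# Reproducible contingency table with dimension names set via dnn
tab <- table(predicted = state.region, observed = state.division)
m <- as.matrix(tab)

# Fold the otherwise-lost dimnames into the labels so they survive xtable()
rownames(m) <- paste(names(dimnames(tab))[1], rownames(m), sep = ": ")
colnames(m) <- paste(names(dimnames(tab))[2], colnames(m), sep = ": ")
# xtable(m)  # the headings now appear as part of the row/column labels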
[R] New version of ecological modelling software available with improved R interface
Dear R users, again with the hope that this application can also be useful for some R users, I would like to announce a new Windows version of the ecological modeling software Bio7. For information: Bio7 uses a pure Rserve approach to interface Java and R, and has a feature-rich GUI to access and execute R methods from Java, embedded in a Rich Client Platform based on Eclipse.

In this release (1.4) a new R perspective is available with a spreadsheet component and the R-Shell view, to import and export data from Excel, OpenOffice and CSV files and to transfer them to an R workspace. In addition, new methods have been added to the R-Shell interface to transfer data from and to the spreadsheet component.

http://www.uni-bielefeld.de/biologie/Oekosystembiologie/bio7app/flashtut/rperspective.htm

Furthermore, Bio7 1.4 embeds several tools for:

- Creation and analysis of spatially explicit simulation models.
- Image analysis (embedded ImageJ). Transfer of images to and from R is supported! For the limits see: http://n4.nabble.com/Transfer-images-limits-and-tests-td622869.html#a622869
- Fast communication between R and Java (with Rserve) and the possibility to use R methods inside Java, BeanShell and Groovy.
- Interpretation of Java and script creation (with BeanShell).
- Direct dynamic compilation of Java (Janino).
- Creation of methods for Java, BeanShell, Groovy and R (integrated editors for Java, R, BeanShell+Groovy).
- Sensitivity analysis with an embedded flowchart editor in which scripts, macros and compiled code can be dragged and executed.
- Creation of 3D OpenGL (JOGL) models. Dynamic data visualization from R is possible.
- Visualizations and simulations on an embedded 3D globe (World Wind Java SDK), see: http://www.uni-bielefeld.de/biologie/Oekosystembiologie/bio7app/flashtut/worldwinddynamic.htm

Overview of changes in 1.4: http://n4.nabble.com/Bio7-1-4-released-td931650.html#a931650

Bio7 1.4 is available for Windows (with R and JRE embedded) and Linux (JRE embedded) and can be downloaded from: http://www.uni-bielefeld.de/biologie/Oekosystembiologie/bio7app/index.html

With kind regards,
M. Austenfeld
Re: [R] Convert a list of N dataframes to N dataframes
On Mon, 7 Dec 2009 13:23:06 -0600, Mark Na mtb...@gmail.com wrote:

This worked very nicely (thanks for plyr, Hadley) but now I would like to unlist my list into the individual dataframes, preferably with their original names (data1, etc). I've tried to do this with:

ldply(datalist, unlist)

Are you perhaps looking for ?attach ?

--
Karl Ove Hufthammer
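If the goal is really to turn the list back into separate objects named data1, data2, ... in the workspace, one option (a sketch of my own, not from the thread; list2env() exists in recent R versions) is:

# Hypothetical small stand-in for the poster's datalist
datalist <- list(data1 = data.frame(x = 1:3),
                 data2 = data.frame(x = 4:6))

# Create one free-standing object per list element, named after the list names
list2env(datalist, envir = globalenv())
head(data1)

Unlike attach(), this creates real copies in the target environment rather than putting the list on the search path.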
Re: [R] problem with split eating giga-bytes of memory
Here is an example:

# create test data
N <- 1000000
x <- data.frame(a=sample(LETTERS, N, TRUE), b=sample(letters, N, TRUE),
                c=as.numeric(1:N), d=runif(N))
system.time({
    x.df <- split(x, x$a)  # split
    print(sapply(x.df, function(a) sum(a$c)))
})

          A           B           C           D           E           F           G           H
19132375146 19261600080 19290064552 19355472666 19143448231 18973627622 19278423676 19362576931
          I           J           K           L           M           N           O           P
19405443596 19295695044 19052377988 19236047192 19143226220 19197703946 19297192525 19129252399
          Q           R           S           T           U           V           W           X
19272964991 19315856972 19355660155 19303178409 19242322477 19081573240 19309444512 19077003863
          Y           Z
19259313705 19228653862
   user  system elapsed
   1.27    0.02    1.28

# now use indices
system.time({
    x.indx <- split(seq(nrow(x)), x$a)  # create list of indices
    print(sapply(x.indx, function(a) sum(x$c[a])))
})

          A           B           C           D           E           F           G           H
19132375146 19261600080 19290064552 19355472666 19143448231 18973627622 19278423676 19362576931
          I           J           K           L           M           N           O           P
19405443596 19295695044 19052377988 19236047192 19143226220 19197703946 19297192525 19129252399
          Q           R           S           T           U           V           W           X
19272964991 19315856972 19355660155 19303178409 19242322477 19081573240 19309444512 19077003863
          Y           Z
19259313705 19228653862
   user  system elapsed
   0.23    0.00    0.23

On Tue, Dec 8, 2009 at 10:26 PM, Mark Kimpel mwkim...@gmail.com wrote:

Jim, could you provide a code snippet to illustrate what you mean? Hadley, good point, I did not know that. Mark

Mark W. Kimpel MD ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine
15032 Hunter Court, Westfield, IN 46074
(317) 490-5129 Work, Mobile VoiceMail
(317) 399-1219 Skype No Voicemail please

On Tue, Dec 8, 2009 at 11:00 PM, jim holtman jholt...@gmail.com wrote:

Also, instead of 'splitting' the data frame, I split the indices and then use those to access the information in the original data frame.
On Tue, Dec 8, 2009 at 9:54 PM, Mark Kimpel mwkim...@gmail.com wrote:

Hadley, just as you were apparently writing, I had the same thought and did exactly what you suggested, converting all columns, except the one that I want to split on, to character. Executed almost instantaneously without problem. Thanks! Mark

On Tue, Dec 8, 2009 at 10:48 PM, hadley wickham h.wick...@gmail.com wrote:

Hi Mark, why are you using factors? I think for this case you might find characters are faster and more space efficient. Alternatively, you can have a look at the plyr package, which uses some tricks to keep memory usage down. Hadley

On Tue, Dec 8, 2009 at 9:46 PM, Mark Kimpel mwkim...@gmail.com wrote:

Charles, I suspect you are correct regarding copying of the attributes. First off, selectSubAct.df is my real data, which turns out to be of the same dim() as myDataFrame below, but each column is made up of strings, not simple letters, and there are many levels in each column, which I did not properly duplicate in my first example. I have amended that below, and with the split the new object size is now not 10X the size of the original, but 100X. My real data is even more complex than this, so I suspect that is where the problem lies. I need to search for a better solution to my problem than split, for which I will start a separate thread if I can't figure something out.
Thanks for pointing me in the right direction, Mark

myDataFrame <- data.frame(matrix(paste("The rain in Spain", as.character(1:1400), sep = "."),
                                 ncol = 7, nrow = 399000))
mySplitVar <- factor(paste("Rainy days and Mondays", as.character(1:1400), sep = "."))
myDataFrame <- cbind(myDataFrame, mySplitVar)
object.size(myDataFrame)
## 12860880 bytes  # ~ 13MB
myDataFrame.split <- split(myDataFrame, myDataFrame$mySplitVar)
object.size(myDataFrame.split)
## 1274929792 bytes  # ~ 1.2GB
object.size(selectSubAct.df)
## 52348272 bytes  # ~ 52MB

On Tue, Dec 8, 2009 at 10:22 PM, Charles C. Berry cbe...@tajo.ucsd.edu wrote: On Tue, 8 Dec 2009, Mark Kimpel wrote:
Re: [R] conditionally merging adjacent rows in a data frame
On Wed, Dec 9, 2009 at 12:11 AM, Gabor Grothendieck ggrothendi...@gmail.com wrote:

Here are a couple of solutions. The first uses by and the second sqldf:

Brilliant! Now I have a whole collection of solutions. I did a simple performance comparison with a data frame that has 7929 lines. The results were as follows (loading the appropriate packages is not included in the measurements):

times <- c(0.248, 0.551, 41.080, 0.16, 0.190)
names(times) <- c("aggregate", "summaryBy", "by+transform", "sqldf", "tapply")
barplot(times, log="y", ylab="log(s)")

So sqldf clearly wins, followed by tapply and aggregate. summaryBy is slower than necessary because it computes, for both x and dur, mean /and/ sum. by+transform presumably suffers from the construction of many intermediate data frames.

Are there any canonical places where R recipes are collected? If yes, I would write up a summary.

These were the competitors:

# Gary's and Nikhil's aggregate solution:
aggregate.fixations1 <- function(d) {
  idx <- c(TRUE, diff(d$roi)!=0)
  d2 <- d[idx,]
  idx <- cumsum(idx)
  d2$dur <- aggregate(d$dur, list(idx), sum)[2]
  d2$x <- aggregate(d$x, list(idx), mean)[2]
  d2
}

# Marek's summaryBy:
library(doBy)
aggregate.fixations2 <- function(d) {
  idx <- c(TRUE, diff(d$roi)!=0)
  d2 <- d[idx,]
  d$idx <- cumsum(idx)
  d2$r <- summaryBy(dur+x~idx, data=d, FUN=c(sum, mean))[c("dur.sum", "x.mean")]
  d2
}

# Gabor's by+transform solution:
aggregate.fixations3 <- function(d) {
  idx <- cumsum(c(TRUE, diff(d$roi)!=0))
  d2 <- do.call(rbind, by(d, idx, function(x)
    transform(x, dur = sum(dur), x = mean(x))[1,,drop = FALSE]))
  d2
}

# Gabor's sqldf solution:
library(sqldf)
aggregate.fixations4 <- function(d) {
  idx <- c(TRUE, diff(d$roi)!=0)
  d2 <- d[idx,]
  d$idx <- cumsum(idx)
  d2$r <- sqldf("select sum(dur), avg(x) x from d group by idx")
  d2
}

# Titus' solution using plain old tapply:
aggregate.fixations5 <- function(d) {
  idx <- c(TRUE, diff(d$roi)!=0)
  d2 <- d[idx,]
  idx <- cumsum(idx)
  d2$dur <- tapply(d$dur, idx, sum)
  d2$x <- tapply(d$x, idx, mean)
  d2
}
Re: [R] conditionally merging adjacent rows in a data frame
On Wed, Dec 9, 2009 at 7:59 AM, Titus von der Malsburg malsb...@gmail.com wrote:

[...] Are there any canonical places where R recipes are collected? If yes, I would write up a summary.

If you google for "R wiki", it's the first hit.
Re: [R] conditionally merging adjacent rows in a data frame
This is great!! sqldf is exactly the kind of thing I was looking for, among other stuff. I suppose you can speed up both functions 1 and 5 by using aggregate and tapply only once, as was suggested earlier. But it comes at the expense of readability.

Nikhil

On 9 Dec 2009, at 7:59 AM, Titus von der Malsburg wrote:

[...]
[R] Bootstrapping in R
Dear all, I get an error when trying to bootstrap from a matrix. The error message is:

Error in sample(n, n * R, replace = TRUE) : element 2 is empty;
the part of the args list of '*' being evaluated was: (n, R)

vv <- c(0.5, 3.2, 5.4, 1.1, 1.4, 1.2, 2.3, 2.0)
Reg <- matrix(data=vv, nrow = 4, ncol = 2)
bootcoeff <- function(x){
  coefficients(lm(x[,1] ~ x[,2]))[2] + 1
}
boot(Reg, bootcoeff)

It is just an example; in reality I have a matrix in whose rows I have x and y, for which I need to run a regression to find the slope coefficient, bootstrapping over rows. Thanks a lot for the help.
Re: [R] Bootstrapping in R
I missed the number of bootstrap replicates R:

boot(Reg, bootcoeff, R=10)

but it still doesn't work:

Error in statistic(data, original, ...) : unused argument(s) (original)

On Wed, Dec 9, 2009 at 2:46 PM, Trafim Vanishek rdapam...@gmail.com wrote:

[...]
Re: [R] Why cannot get the expected values in my function
On Tue, 2009-12-08 at 23:22 -0500, David Winsemius wrote:
On Dec 8, 2009, at 11:07 PM, rusers.sh wrote:

Hi, in the following function I hope to save my simulated data into the "result" dataset, but the final "result" dataset does not seem to be generated.

# Function
simdata <- function(nsim) {
# Instead why not:
#   cbind(x=runif(nsim), y=runif(nsim))
# or:
#   m <- matrix(runif(nsim*2), ncol = 2)
#   ## if names on m needed
#   colnames(m) <- c("x", "y")
}

# simulation
simdata(10)

# correct result
              x           y
 [1,] 0.2655087 0.372123900
 [2,] 0.1848823 0.702374036
 [3,] 0.1680415 0.807516399
 [4,] 0.5858003 0.008945796
 [5,] 0.2002145 0.685218596
 [6,] 0.6062683 0.937641973
 [7,] 0.9889093 0.397745453
 [8,] 0.4662952 0.207823317
 [9,] 0.2216014 0.024233910
[10,] 0.5074782 0.306768506

But the dataset "result" was not assigned the above values. What is the problem?

result  # wrong result??
       x  y
 [1,] NA NA
 [2,] NA NA
 [3,] NA NA
 [4,] NA NA
 [5,] NA NA
 [6,] NA NA
 [7,] NA NA
 [8,] NA NA
 [9,] NA NA
[10,] NA NA

Thanks a lot.
--
Jane Chang
Queen's

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Dr. Gavin Simpson      [t] +44 (0)20 7679 0522
ECRC, UCL Geography,   [f] +44 (0)20 7679 0565
Pearson Building,      [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London   [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.
[w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
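A minimal sketch of the usual fix (my reconstruction, since the original function body is garbled in the archive): have the function return the matrix and assign the result at the call site, rather than expecting an assignment inside the function to fill a pre-existing 'result' object.

simdata <- function(nsim) {
  m <- matrix(runif(nsim * 2), ncol = 2)
  colnames(m) <- c("x", "y")
  m  # the function's value; nothing is written to the caller's workspace
}

result <- simdata(10)  # assign at the call site

Assignments made inside a function body live in the function's own environment and vanish when it returns, which is why the poster's global 'result' stayed full of NAs.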
[R] code
Dear support, I want to compute the highest probability density for any data. Please, do you have any code to help me with this subject? I am looking forward to hearing from you as soon as possible. rahim
Re: [R] R echo code chunk runs off the page using LyX and Sweave
I somehow missed the response posted by Ben Bolker. He is quite correct (happily for me!):

\SweaveOpts{keep.source=TRUE}

in your LaTeX code will (I think) keep whatever manual formatting you do, in all code chunks, or use keep.source=TRUE for particular code chunks of concern. This information has made its way into the latest Sweave manual at http://www.statistik.lmu.de/~leisch/Sweave/.
Re: [R] code
?density
?max

Rahim Alhamzawi rahimalhamz...@yahoo.co.uk
Sent by: r-help-boun...@r-project.org
12/09/2009 05:23 AM
To: r-h...@stat.math.ethz.ch
Subject: [R] code

Dear support, I want to compute the highest probability density for any data. [...]
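Putting the ?density / ?max hint together, a sketch of finding the location of the highest estimated density for a numeric vector (the variable names here are my own):

set.seed(1)                     # for reproducibility
x <- rnorm(1000)                # example data
d <- density(x)                 # kernel density estimate
mode.x <- d$x[which.max(d$y)]   # x-value where the estimated density peaks
mode.x

For standard normal data this lands near 0, the true mode.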
Re: [R] problem in labeling the nodes of tree drawn by rpart
"In the nodes of the tree, the values of the covariates are represented with a, b or c (tree attached)."

Try help('text.rpart') and note the 'pretty' argument therein. There is often not enough room for long labels, and so the default is to do the severe truncation you speak of.

Terry Therneau
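For example (a sketch using the built-in iris data, not the original poster's tree), pretty = 0 asks for full factor level names instead of the abbreviated a/b/c codes:

library(rpart)

fit <- rpart(Species ~ ., data = iris)  # small classification tree
plot(fit)
text(fit, pretty = 0)  # pretty = 0: full level names in the node labels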
Re: [R] Printing 'k' levels of factors 'n' times each, but 'n' is unequal for all levels ?
A Singh wrote:

Dear List, I need to print out each of 'k' levels of a factor 'n' times each, where 'n' is the number of elements belonging to each factor. I know that this can normally be done using the gl() command, but in my case, each level 'k' has an unequal number of elements. Example with code is as below:

vc <- read.table("P:\\Transit\\CORRECT files\\Everything-newest.csv", header=T,
                 sep=",", dec=".", na.strings="NA", strip.white=T)
vcdf <- data.frame(vc)
tempdf <- data.frame(cbind(vcdf[,1:3], vcdf[,429]))
newtemp <- na.exclude(tempdf)
newtemp[,2] <- factor(newtemp[,2])
groupmean <- tapply(newtemp[,4], newtemp[,2], mean)
newmark <- factor(groupmean, exclude=(groupmean==0 | groupmean==1))
newmark

This is what the output is (going up to 61 levels):

    1                  2             3             4
   NA  0.142857142857143         0.444            NA
    5                  6             8             9
 0.33      0.09090909090  0.3846153846            NA
  ...
   61
   NA

The variable 'groupmean' holds the means of newtemp[,4] for the 61 levels (k) specified in newtemp[,2]. I now want to be able to print out each value of 'groupmean' as many times as there are elements in the group for which it was calculated. So, for example, if level 1 of newtemp[,2] has about 15 elements, NA should be printed 15 times, level 2 = 12 times 0.1428, and so on. Is there a way of specifying that a list needs to be populated with replicates of groupmeans based on values got from newtemp[,2]?

See ?mapply and ?rep, hence

mapply(rep, values, replicates)

where values and replicates are corresponding vectors.

Uwe Ligges

I just can't seem to figure this out by myself. Many thanks for your help. Aditi

--
A Singh
aditi.si...@bristol.ac.uk
School of Biological Sciences
University of Bristol
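A tiny illustration of that pattern, with made-up values standing in for the poster's groupmean and group sizes:

values <- c(NA, 0.1428, 0.444)   # hypothetical group means
replicates <- c(3, 2, 4)         # hypothetical group sizes

# mapply() repeats each mean once per element of its group;
# unlist() flattens the resulting list into one vector
unlist(mapply(rep, values, replicates))

Since the per-group result lengths differ, mapply() returns a list here, hence the unlist().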
Re: [R] Bootstrapping in R
On Dec 9, 2009, at 9:05 AM, Trafim Vanishek wrote: I missed the number of bootstrap replicates, R: boot(Reg, bootcoeff, R=10) but still it doesn't work. Error in statistic(data, original, ...) : unused argument(s) (original) On Wed, Dec 9, 2009 at 2:46 PM, Trafim Vanishek rdapam...@gmail.com wrote: Dear all, I have some error trying to bootstrap from a matrix. The error message is Error in sample(n, n * R, replace = TRUE) : element 2 is empty; the part of the args list of '*' being evaluated was: (n, R) vv <- c(0.5,3.2,5.4,1.1,1.4,1.2,2.3,2.0) Reg <- matrix(data=vv, nrow = 4, ncol = 2) bootcoeff <- function(x){ coefficients(lm(x[,1]~x[,2]))[2]+1 } boot(Reg, bootcoeff) ?boot And in particular you need to read the Arguments material more closely. The boot function is more complicated than you expected. Then work through the examples. You may also get help by doing some searching in www.rseek.org or with the RSiteSearch function, e.g.: RSiteSearch("lm coef boot") It is just an example; in reality I have a matrix in whose rows I have x and y, for which I need to fit a regression to find the slope coefficient, bootstrapping from rows. Thanks a lot for the help. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
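The error "unused argument(s) (original)" arises because boot() calls statistic(data, indices): the statistic function must accept an index vector as its second argument and apply it itself. A sketch under that reading, resampling the rows of the poster's 4x2 matrix (the boot package ships with R):

```r
library(boot)

vv  <- c(0.5, 3.2, 5.4, 1.1, 1.4, 1.2, 2.3, 2.0)
Reg <- matrix(vv, nrow = 4, ncol = 2)

## statistic(data, i): 'i' is the row-index vector for one resample
bootcoeff <- function(data, i) {
  d <- data[i, , drop = FALSE]            # the bootstrap sample of rows
  coefficients(lm(d[, 1] ~ d[, 2]))[2] + 1
}

set.seed(1)
b <- boot(Reg, bootcoeff, R = 10)         # 10 bootstrap replicates
```

With only 4 rows some resamples can be degenerate (all rows identical), in which case the slope is NA; boot() still completes and records the NA.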
[R] Greek symbols on ylab= using barchart() {Lattice}
Hi All, I'm trying to write ug/m3 as the y-label, with the Greek letter mu replacing u AND the 3 as a power. These commands work in general: plot.new() text(0.5, 0.5, expression(symbol(m))) But I'm not sure how to do it using barchart() from lattice. Can anyone help please? Thanks, Peng Cai [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Greek symbols on ylab= using barchart() {Lattice}
Hi, try this, barchart(1:2, ylab=expression(mu*g/m^3)) ?plotmath baptiste 2009/12/9 Peng Cai pengcaimaill...@gmail.com: Hi All, I'm trying to write ug/m3 as y-label, with greek letter mu replacing u AND 3 going as a power. These commands works in general: plot.new() text(0.5, 0.5, expression(symbol(m))) But, I'm sure about how to do it using barchart() from Lattice. Can anyone help please? Thanks, Peng Cai [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Warning for data.table (with ref)?
I get the following message dim(refdata) and dimnames(refdata) no longer allow parameter ref=TRUE, use dim(derefdata(refdata)), dimnames(derefdata(refdata)) instead when I load data.table. Is it from the package ref? Could it be fixed? Or is there something wrong with my installation? library(data.table) Loading required package: ref dim(refdata) and dimnames(refdata) no longer allow parameter ref=TRUE, use dim(derefdata(refdata)), dimnames(derefdata(refdata)) instead sessionInfo() R version 2.10.0 (2009-10-26) x86_64-unknown-linux-gnu locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] data.table_1.2 ref_0.97 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Significant performance difference between split of a data.frame and split of vectors
On Tue, Dec 8, 2009 at 11:06 PM, David Winsemius dwinsem...@comcast.net wrote: On Dec 9, 2009, at 12:00 AM, Peng Yu wrote: On Tue, Dec 8, 2009 at 10:37 PM, David Winsemius dwinsem...@comcast.net wrote: On Dec 8, 2009, at 11:28 PM, Peng Yu wrote: I have the following code, which tests the split on a data.frame and the split on each column (as vector) separately. The runtimes differ by a factor of ten. When m and k increase, the difference becomes even bigger. I'm wondering why the performance on data.frame is so bad. Is it a bug in R? Can it be improved? You might want to look at the data.table package. The author claims significant speed improvements over data.frames This bug was found a long time back and a package has been developed for it. Should the fix be integrated into data.frame rather than implemented in an additional package? What bug? Is the slow speed in splitting a data.frame a performance bug? David. system.time(split(as.data.frame(x),f)) user system elapsed 1.700 0.010 1.786 system.time(lapply( + 1:dim(x)[[2]] + , function(i) { + split(x[,i],f) + } + ) + ) user system elapsed 0.170 0.000 0.167 ### m=3 n=6 k=3000 set.seed(0) x=replicate(n,rnorm(m)) f=sample(1:k, size=m, replace=T) system.time(split(as.data.frame(x),f)) system.time(lapply( 1:dim(x)[[2]] , function(i) { split(x[,i],f) } ) ) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Assign variables in a loop to a list
Dear R helpers, I am new in R and I am having trouble with a function. I am programming a genetic analysis and there is a script that generates a lot of different matrices, for example x, y and z. What I am trying to do is loop the same script over the different variables the user will define in the function, so that the matrices x, y and z are created for each variable. myfun <- function(...){ # the variables in (...), for example (DAP, Vol) Vec = matrix(c(...)) for (i in seq(along = Vec)){ ... ## generates x, y and z # The scripts are not here because they are big assign(paste("x",i,sep=""),x) assign(paste("y",i,sep=""),y) # this generates x1, y1, z1, x2, y2, and z2 for the example with two variables assign(paste("z",i,sep=""),z) } Here is the step I can't solve: I want to assign those variables in a list structure(list(...), class = "genotype") ## In the example it would be ## structure(list(varX1 = x1, varX2 = x2, varY1 = y1, varY2 = y2, varZ1 = z1, varZ2 = z2), class = "genotype") } # end of function However I don't know how to assign those variables in this list because I don't know how many variables the user will declare. I am not sure if I was clear; I know it is hard without the whole script, but I think it wouldn't make any difference. It can be thought of as 3 randomly generated matrices each time (each loop). Thank you very much for the help and for the time spent -- View this message in context: http://n4.nabble.com/Assign-variables-in-a-loop-to-a-list-tp956207p956207.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
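One common pattern for this situation is to skip assign() entirely and grow a named list, whose length adapts to however many variables the user passes. This is only a sketch: the random 2x2 matrices are placeholders for the poster's real generating scripts, and the names varX1, varY1, ... are the hypothetical ones from the example.

```r
myfun <- function(...) {
  vars <- list(...)                  # e.g. myfun(DAP = ..., Vol = ...)
  out  <- list()
  for (i in seq_along(vars)) {
    x <- matrix(rnorm(4), 2)         # placeholder for the real x
    y <- matrix(rnorm(4), 2)         # placeholder for the real y
    z <- matrix(rnorm(4), 2)         # placeholder for the real z
    out[[paste0("varX", i)]] <- x    # list grows by name, no assign()
    out[[paste0("varY", i)]] <- y
    out[[paste0("varZ", i)]] <- z
  }
  structure(out, class = "genotype")
}

res <- myfun(DAP = 1, Vol = 2)       # two input variables -> 6 matrices
```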
Re: [R] Can elements of a list be passed as multiple arguments?
On Tue, Dec 8, 2009 at 11:05 PM, David Winsemius dwinsem...@comcast.net wrote: On Dec 8, 2009, at 11:37 PM, Peng Yu wrote: I want to split a matrix; 'u' and 'w' below are the results of two possible approaches. However, whenever 'n' changes, the argument passed to mapply() has to change. Is there a way to pass the elements of a list as multiple arguments? You need to explain what you want in more detail. In your example mapply did exactly what you told it to. No errors. Three matrices. What were you expecting when you gave it three lists in each argument? I want a general solution so that I don't have to always write v[[1]], v[[2]], ..., v[[n]] like in the following, because that way would not work if 'n' is an arbitrary number. w=mapply(function(x,y) {cbind(x,y)}, v[[1]], v[[2]], ..., v[[n]]) One way that I can think of is to somehow expand a list (i.e., v in this case) to a set of arguments that can be passed to 'mapply()'. m=10 n=2 k=3 set.seed(0) x=replicate(n,rnorm(m)) f=sample(1:k, size=m, replace=T) u=split(as.data.frame(x),f) v=lapply( 1:dim(x)[[2]] , function(i) { split(x[,i],f) } ) w=mapply( function(x,y) { cbind(x,y) } , v[[1]], v[[2]] ) -- David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
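The usual answer to "expand a list into arguments" is do.call(), which calls a function with an argument list built at run time. A sketch using the poster's own setup, so the number of columns n can be arbitrary:

```r
m <- 10; n <- 2; k <- 3
set.seed(0)
x <- replicate(n, rnorm(m))
f <- sample(1:k, size = m, replace = TRUE)

## one split-vector list per column of x
v <- lapply(seq_len(ncol(x)), function(i) split(x[, i], f))

## equivalent to mapply(cbind, v[[1]], ..., v[[n]]) for any n:
w <- do.call(mapply, c(list(FUN = cbind), v, SIMPLIFY = FALSE))
```

Each element of w is a matrix whose n columns are the corresponding group's values from each column of x.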
Re: [R] Greek symbols on ylab= using barchart() {Lattice}
Hi Baptiste and Others, Thanks for your help. I'm writing: ylab=expression(Concentration(mu*g/m^3)) And it's working fine, but is it possible to add a space between Concentration and (mu*g/m^3)? Thanks again, Peng Cai On Wed, Dec 9, 2009 at 12:02 PM, baptiste auguie baptiste.aug...@googlemail.com wrote: Hi, try this, barchart(1:2, ylab=expression(mu*g/m^3)) ?plotmath baptiste 2009/12/9 Peng Cai pengcaimaill...@gmail.com: Hi All, I'm trying to write ug/m3 as y-label, with greek letter mu replacing u AND 3 going as a power. These commands works in general: plot.new() text(0.5, 0.5, expression(symbol(m))) But, I'm sure about how to do it using barchart() from Lattice. Can anyone help please? Thanks, Peng Cai [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Significant performance difference between split of a data.frame and split of vectors
On Wed, 9 Dec 2009, Peng Yu wrote: On Tue, Dec 8, 2009 at 11:06 PM, David Winsemius dwinsem...@comcast.net wrote: On Dec 9, 2009, at 12:00 AM, Peng Yu wrote: On Tue, Dec 8, 2009 at 10:37 PM, David Winsemius dwinsem...@comcast.net wrote: On Dec 8, 2009, at 11:28 PM, Peng Yu wrote: I have the following code, which tests the split on a data.frame and the split on each column (as vector) separately. The runtimes differ by a factor of ten. When m and k increase, the difference becomes even bigger. I'm wondering why the performance on data.frame is so bad. Is it a bug in R? Can it be improved? You might want to look at the data.table package. The author claims significant speed improvements over data.frames This bug was found a long time back and a package has been developed for it. Should the fix be integrated into data.frame rather than implemented in an additional package? What bug? Is the slow speed in splitting a data.frame a performance bug? NO! The two computations are not equivalent. One is a list whose elements are split vectors, and the other is a list of data.frames containing those vectors. If you take the trouble to assemble that list of data frames from the list of split vectors you will see that it is very time consuming. Read up on memory management issues. Think about what the computer actually has to do in terms of memory access to split a data.frame versus split a vector. --- And even if it were simply a matter of having code that is slow for some application, that would not be a bug. Read the FAQ! Chuck David. 
system.time(split(as.data.frame(x),f)) user system elapsed 1.700 0.010 1.786 system.time(lapply( + 1:dim(x)[[2]] + , function(i) { + split(x[,i],f) + } + ) + ) user system elapsed 0.170 0.000 0.167 ### m=3 n=6 k=3000 set.seed(0) x=replicate(n,rnorm(m)) f=sample(1:k, size=m, replace=T) system.time(split(as.data.frame(x),f)) system.time(lapply( 1:dim(x)[[2]] , function(i) { split(x[,i],f) } ) ) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry(858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Announcing a new R news site: R-bloggers.com
Hello Elijah, You could not have made me happier with your letter. It is a very satisfying compliment for me to be appreciated in general, and specifically by you. I will start by responding to the simpler parts of your e-mail and then proceed to the more interesting parts. My response grew to be quite long, though I tried to keep it fresh. I hope it turned out ok. First, thank you for the historical context of PlanetR and also for putting it up in the first place. I have already taken some steps to contact people on the PlanetR list, but not all have responded to me. Now regarding the more general (implicit) questions that come up from your letter and from Dirk's: what is the purpose of R-bloggers.com and of PlanetR (especially when services like Google Reader and other such sources are abundant)? For me there are two audiences: One is that of the web 2.0 power users. That is, people who know what RSS is and use it, and maybe even write their own blogs. These people have only one problem (as I see it) that R-bloggers tries to solve, and that is to know who else lives in their ecosystem, who else they should follow. For that, Google Reader's recommendation system is great, but not enough. A much better system is to have one place where all R bloggers go, write down their website, and all of us would know they exist. That is what R-bloggers offers the power users. I think this is also why over 20 of them have subscribed to the site RSS feed. BTW, the origin of this idea came to me when I was trying to find all the dance bloggers for my wife (who is a dance researcher and blogger herself). After a while we started http://www.dancebloggers.com/ while knowing of only 10 bloggers. That list now has over 80 bloggers, most of whom we would not have known about without this hub. 
The same thing I am trying to do for the R community; that is why I hope more R bloggers will write about the service, so that their networks of readers, which include other R bloggers, will add themselves and we will all know about them. If that were my only purpose, a simple directory would have been enough. But I also have a second one, and that is to help the second audience. The second audience I am thinking of are people of our community who are not so much early adopters (and actually quite late adopters) of the facilities that the new web (a.k.a. web 2.0) provides. To them the whole RSS thing is too much to look at, and they are used to e-mails. Because of that they are (until now) disconnected from many of the R bloggers out there, simply because it is inefficient for them to go through all these blogs each day (or even week). So for them, to see all the content in one place (and even get an e-mail about it) would be (I hope) a service. I believe that's why 5 of them (so far) have subscribed via e-mail. I also hope teachers will direct their students to this as a resource for getting a sense of what people who are using R are doing. Another thing that hints to me about the R community is seeing how the Facebook fan box is still empty. Which tells me that (sadly) very few R users are actively using Facebook as a means of connecting with the outer networks of people out there. All I wrote also explains why R-bloggers will only take feeds of bloggers, and only (as much as can be said) their posts that are centered around R (hence the website name :) ). It both follows what Gabor talked about, having a site whose content is only about R, but also what I wish for, which is to have content in the sense of articles to read (mostly), and not so much things like news feeds of Wikipedia or new packages published. Regarding what you suggested of turning the site into more of a community enterprise, I don't see how to do that. 
Right now, adding feeds is a very simple process and the rate of people adding themselves is very low, so I don't think I will need help with that. I would love to see more people in our community becoming even more social online, but I don't think that R-bloggers (http://www.r-bloggers.com/) should be the place for that; rather it should be on each of the blogs that write about R, and also on services like http://crantastic.org/ which I really hope will somehow be pushed more by the R core team so as to serve all of us with more input from the R community of users. I hope this was at least an interesting read for some of you :) And Elijah, *thanks again* for your kind words! Best regards, Tal Contact Details:--- Contact me: tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com/ (English) -- On Tue, Dec 8, 2009 at 7:12 PM, Elijah Wright elijah.wri...@gmail.com wrote: Hi Tal! First let me say that I deeply appreciate the work that
[R] equivalent of ifelse
Hi, Is there any equivalent of ifelse (other than if (cond) expr1 else expr2) which takes an atomic element as its condition but returns a whole vector? ifelse returns an object of the same length as its condition. x = c(1,2,3) y = c(4,5,6,7) z = 3 ifelse(z >= 3, x, y) would return x and not 1 thanks __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Greek symbols on ylab= using barchart() {Lattice}
barchart(1:2, ylab=expression("Concentration ("*mu*g/m^3*")")) 2009/12/9 Peng Cai pengcaimaill...@gmail.com: Hi Baptiste and Others, Thanks for your help. I'm writing: ylab=expression(Concentration(mu*g/m^3)) And its working fine, but is it possible to add a space between Concentration and (mu*g/m^3). Thanks again, Peng Cai On Wed, Dec 9, 2009 at 12:02 PM, baptiste auguie baptiste.aug...@googlemail.com wrote: Hi, try this, barchart(1:2, ylab=expression(mu*g/m^3)) ?plotmath baptiste 2009/12/9 Peng Cai pengcaimaill...@gmail.com: Hi All, I'm trying to write ug/m3 as y-label, with greek letter mu replacing u AND 3 going as a power. These commands works in general: plot.new() text(0.5, 0.5, expression(symbol(m))) But, I'm sure about how to do it using barchart() from Lattice. Can anyone help please? Thanks, Peng Cai [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] equivalent of ifelse
Try this: list('TRUE' = x, 'FALSE' = y)[[as.character(z >= 3)]] On Wed, Dec 9, 2009 at 3:40 PM, carol white wht_...@yahoo.com wrote: Hi, Is there any equivalent for ifelse (except if (cond) expr1 else expr2) which takes an atomic element as argument but returns vector since ifelse returns an object of the same length as its argument? x = c(1,2,3) y = c(4,5,6,7) z = 3 ifelse(z >= 3,x,y) would return x and not 1 thanks __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Greek symbols on ylab= using barchart() {Lattice}
Thanks, it worked! On Wed, Dec 9, 2009 at 12:46 PM, baptiste auguie baptiste.aug...@googlemail.com wrote: barchart(1:2, ylab=expression(Concentration (*mu*g/m^3*))) 2009/12/9 Peng Cai pengcaimaill...@gmail.com: Hi Baptiste and Others, Thanks for your help. I'm writing: ylab=expression(Concentration(mu*g/m^3)) And its working fine, but is it possible to add a space between Concentration and (mu*g/m^3). Thanks again, Peng Cai On Wed, Dec 9, 2009 at 12:02 PM, baptiste auguie baptiste.aug...@googlemail.com wrote: Hi, try this, barchart(1:2, ylab=expression(mu*g/m^3)) ?plotmath baptiste 2009/12/9 Peng Cai pengcaimaill...@gmail.com: Hi All, I'm trying to write ug/m3 as y-label, with greek letter mu replacing u AND 3 going as a power. These commands works in general: plot.new() text(0.5, 0.5, expression(symbol(m))) But, I'm sure about how to do it using barchart() from Lattice. Can anyone help please? Thanks, Peng Cai [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html http://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
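Putting the thread's answer together as a runnable sketch (lattice ships with R; the quoted string carries both the literal text and the space, and the pdf(tempfile()) device is only there so the example runs non-interactively):

```r
library(lattice)

## plotmath: quoted pieces are literal text, mu is the Greek letter,
## ^ gives the superscript, * juxtaposes the pieces without gaps.
ylab_expr <- expression("Concentration ("*mu*g/m^3*")")

pdf(tempfile(fileext = ".pdf"))
print(barchart(1:2, ylab = ylab_expr))   # print() needed in scripts
dev.off()
```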
Re: [R] equivalent of ifelse
On Dec 9, 2009, at 12:40 PM, carol white wrote: Hi, Is there any equivalent for ifelse (except if (cond) expr1 else expr2) which takes an atomic element as argument but returns vector since ifelse returns an object of the same length as its argument? x = c(1,2,3) y = c(4,5,6,7) z = 3 ifelse(z >= 3,x,y) would return x and not 1 I worry that this is too simple, so wonder if you have expressed your intent clearly. if(z >= 3) {x} else {y} [1] 1 2 3 David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Significant performance difference between split of a data.frame and split of vectors
On Wed, Dec 9, 2009 at 11:20 AM, Charles C. Berry cbe...@tajo.ucsd.edu wrote: On Wed, 9 Dec 2009, Peng Yu wrote: On Tue, Dec 8, 2009 at 11:06 PM, David Winsemius dwinsem...@comcast.net wrote: On Dec 9, 2009, at 12:00 AM, Peng Yu wrote: On Tue, Dec 8, 2009 at 10:37 PM, David Winsemius dwinsem...@comcast.net wrote: On Dec 8, 2009, at 11:28 PM, Peng Yu wrote: I have the following code, which tests the split on a data.frame and the split on each column (as vector) separately. The runtimes differ by a factor of ten. When m and k increase, the difference becomes even bigger. I'm wondering why the performance on data.frame is so bad. Is it a bug in R? Can it be improved? You might want to look at the data.table package. The author claims significant speed improvements over data.frames This bug was found a long time back and a package has been developed for it. Should the fix be integrated into data.frame rather than implemented in an additional package? What bug? Is the slow speed in splitting a data.frame a performance bug? NO! The two computations are not equivalent. One is a list whose elements are split vectors, and the other is a list of data.frames containing those vectors. I made a comparable example below. Still, splitting a data.frame is much slower compared with the second way that I'm showing. If you take the trouble to assemble that list of data frames from the list of split vectors you will see that it is very time consuming. It is not, as I show in the example below. Read up on memory management issues. Think about what the computer actually has to do in terms of memory access to split a data.frame versus split a vector. I'd like to read more on how R does memory management. Would you please point me to a good source? But again, R is not user friendly. It took me quite a long time to figure out that splitting a data.frame was a bottleneck in my program and to reduce the problem to a test case. 
I don't know how memory management is done in R, so I don't know whether it is possible to fix the problem of splitting a data.frame without perturbing the interface of data.frame. But if the speed of splitting a data.frame is so slow, maybe it can be forbidden and an alternative can be documented somewhere. --- And even if it were simply a matter of having code that is slow for some application, that would not be a bug. Read the FAQ! The definition of a bug in the FAQ is narrower than what I thought. No matter what the definition of a bug is, split() on a data.frame is a perfectly legitimate operation (in terms of an interface). A quick fix to this problem is to at least single out the case where the argument is a data.frame, and to do what I have been doing below. That is why I say this is a performance bug. Similar cases, where a faster alternative can be done but is not done, are exactly what would be called bugs, at least in many other languages. m=30 n=6 k=3 set.seed(0) x=replicate(n,rnorm(m)) f=sample(1:k, size=m, replace=T) system.time(split(as.data.frame(x),f)) user system elapsed 39.020 0.010 39.084 v=lapply( + 1:dim(x)[[2]] + , function(i) { + split(x[,i],f) + } + ) system.time(lapply( + 1:dim(x)[[2]] + , function(i) { + split(x[,i],f) + } + ) + ) user system elapsed 2.520 0.000 2.526 system.time( + mapply( + function(...) { + cbind(...) + } + , v[[1]], v[[2]], v[[3]], v[[4]], v[[5]], v[[6]] + ) + ) user system elapsed 0.920 0.000 0.927 David. 
system.time(split(as.data.frame(x),f)) user system elapsed 1.700 0.010 1.786 system.time(lapply( + 1:dim(x)[[2]] + , function(i) { + split(x[,i],f) + } + ) + ) user system elapsed 0.170 0.000 0.167 ### m=3 n=6 k=3000 set.seed(0) x=replicate(n,rnorm(m)) f=sample(1:k, size=m, replace=T) system.time(split(as.data.frame(x),f)) system.time(lapply( 1:dim(x)[[2]] , function(i) { split(x[,i],f) } ) ) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read
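A self-contained, scaled-down version of the comparison in this thread. Note the two results are not equivalent objects, which is Charles Berry's point: one is a list of data.frames, the other a list of lists of plain vectors. The sizes here are small so the example runs quickly; the gap grows with the number of groups.

```r
m <- 3000; n <- 6; k <- 300
set.seed(0)
x <- replicate(n, rnorm(m))
f <- sample(1:k, size = m, replace = TRUE)

## split the whole data.frame: one small data.frame per group
t_df  <- system.time(a <- split(as.data.frame(x), f))

## split each column as a vector: one list of vectors per column
t_vec <- system.time(b <- lapply(seq_len(n), function(i) split(x[, i], f)))

## a[[1]] is a data.frame; b[[1]] is a plain list of numeric vectors.
```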
Re: [R] equivalent of ifelse
On 09/12/2009 12:40 PM, carol white wrote: Hi, Is there any equivalent for ifelse (except if (cond) expr1 else expr2) which takes an atomic element as argument but returns vector since ifelse returns an object of the same length as its argument? I don't understand what's wrong with if (cond) expr1 else expr2. It can be used in an expression, e.g. w <- if (z >= 3) x else y which is I think exactly what you are asking for. Duncan Murdoch x = c(1,2,3) y = c(4,5,6,7) z = 3 ifelse(z >= 3,x,y) would return x and not 1 thanks __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
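The distinction the thread turns on, in runnable form: ifelse() is elementwise and returns a result the length of its condition, while if () ... else ... evaluates one scalar condition and returns the chosen branch whole. (The >= comparison is assumed from context; the mailing-list archive stripped the operator.)

```r
x <- c(1, 2, 3)
y <- c(4, 5, 6, 7)
z <- 3

w1 <- ifelse(z >= 3, x, y)   # condition has length 1 -> result is x[1]
w2 <- if (z >= 3) x else y   # returns all of x
```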
Re: [R] Time Series Rating Model
http://n4.nabble.com/file/n956255/TimeSeries%2BLikelihood.jpg From the attachment, I need to find the maximum likelihood value of ξ, but I am stuck on writing the R code. I will appreciate experts' advice. Thanks. ryusuke wrote: To R programming experts, I am an undergraduate student, now doing research on my own. I apply the diagonal bivariate Poisson (R package bivpois) with a stochastic weighting function (refer to dixoncoles97, sections 4.5 to 4.7). However I don't know how to fit this stochastic weighting function to the complete bivariate Poisson model. I know that some other references for dynamic soccer team rating apply the two methods below, but I am not familiar with them: 1. Brownian Motions ifs (CRAN package) sde (CRAN package) dvfBm (CRAN package) 2. Kalman Filters FKF (CRAN package) KFAS (CRAN package) Hereby I attach some references. I have also uploaded the model (R file model.RData) to my SkyDrive. I will appreciate it if you would share your advice or suggestions. Thank you. Best Regards, Ryusuke A Soccer Scores Modelling Enthusiast __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://n4.nabble.com/Time-Series-Rating-Model-tp930676p956255.html Sent from the R help mailing list archive at Nabble.com. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] equivalent of ifelse
David Winsemius wrote:
> On Dec 9, 2009, at 12:40 PM, carol white wrote: Hi, Is there any equivalent for ifelse (except if (cond) expr1 else expr2) which takes an atomic element as argument but returns a vector, since ifelse returns an object of the same length as its test? x = c(1,2,3); y = c(4,5,6,7); z = 3; ifelse(z <= 3, x, y) would return x and not 1.
>
> I worry that this is too simple, so wonder if you have expressed your intent clearly.
> if (z <= 3) {x} else {y}
> [1] 1 2 3

I was wondering, David, why are the {} necessary?

if (z <= 3) x else y
[1] 1 2 3

since without {} it comes to the same result? Thanks

David Winsemius, MD Heritage Laboratories West Hartford, CT

-- View this message in context: http://n4.nabble.com/equivalent-of-ifelse-tp956232p956258.html
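On the braces question: for a single expression per branch they are optional grouping, so both forms are equivalent (a quick check):

```r
z <- 3; x <- c(1, 2, 3); y <- c(4, 5, 6, 7)

# Braces delimit a block of statements; with one expression they change nothing.
r1 <- if (z <= 3) { x } else { y }
r2 <- if (z <= 3) x else y
```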
Re: [R] arrow plots
Thanks, all, for the help. Much obliged. I realize now that I should have said that I am using lattice graphics. The par() command has not been helpful in convincing lattice to plot outside of the default window. Any other advice is appreciated. Thanks again.
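Lattice draws with grid rather than base graphics, so par() settings are ignored. One common route, sketched here with made-up toy data, is to give print() a position, the lattice/grid way of placing a plot in part of the page:

```r
library(lattice)

# Two trellis objects; lattice plots are objects and are drawn by print().
p1 <- xyplot(1:10 ~ 1:10)
p2 <- xyplot(10:1 ~ 1:10)

# position = c(xmin, ymin, xmax, ymax) in [0, 1] figure coordinates;
# more = TRUE keeps the page open for the next plot.
print(p1, position = c(0, 0.5, 1, 1), more = TRUE)  # top half
print(p2, position = c(0, 0, 1, 0.5))               # bottom half
```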
Re: [R] grep() exclude certain patterns?
I think that we are talking past each other here. You are clearly not understanding (or at least not convinced by) what I am saying, and you are not convincing me (or possibly I am not understanding your arguments). So our efforts will probably be better spent on things other than continuing this discussion. After a few general comments on why I like where R is and the direction it is going, this is likely to be my last contribution to this conversation. You have expressed interest in books in the past; since the free documentation has not been sufficient for you, you may be interested in some of the books listed here: http://www.r-project.org/doc/bib/R-books.html Some that may fit your needs (others may as well, but I have not read everything on the list) include: S Programming; An R and S-Plus Companion to Applied Regression; Modern Applied Statistics with S; A Handbook of Statistical Analyses Using R (this one has the 1st chapter and all examples available for free). You say that R is written by statisticians rather than software engineers. 10 years ago I was in the last year of my graduate program, and my job at the time had me traveling and interacting with a lot of people who had purchased and were using the same commercial package as I was. These were both stats professors and professional statisticians. Some of the discussions were about added functionality that we wished for. When I tried to pass these suggestions on to one of the programmers (a CS graduate) of the commercial package, he proceeded to give me a lecture on what the users really wanted. I'll take the package created by statisticians for statisticians over the ones by software engineers who don't work in the field. You compared R to C++ a couple of times, but this is not really a valid comparison. C++ was never intended to be used interactively; S/R had this as a core concept from the beginning.
As a slightly better comparison: at one previous job I was working with programs written in C, and when I switched to using Perl (another language famous for, and proud of, nonstandard function calls and inconsistencies) my productivity doubled. This is not saying that C is bad -- I still use it where appropriate -- just that Perl was better for that job. I can see how someone could program a t-test in C++, but R would be a lot quicker; on the other hand, I would not choose R as the programming language if I were creating a full accounting system, the next word processor, an improved spreadsheet, or the next hot game (though I am guilty of programming games in R). R is not perfect; if it were, there would not be all the new releases. But I am happy with it and the direction it is going. You would like more structure, more standards committees, etc. Here is one example of why I don't like the idea of those things. A couple of years ago I posted to this list with a question about something I was trying to do; I included an example of what I had tried, what I was trying to accomplish, and how the results differed from what I wanted. My post appeared on Friday. On Saturday a member of R core responded that the functions I was using were never intended to work the way that I was trying to make them work, and it was unlikely that I would ever get them to work that way. He did, however, mention that he could see a possibility of a new function that did what I wanted. On Sunday another person replied and said they would also be interested in the new function. On Monday, the member of R core wrote again saying that he had just committed the new function, which did exactly what I asked for, to the development version of R.
Contrast that with the last time I contacted tech support for a commercial package that I was paying maintenance fees for: it took them longer than that to get back to me with their first answer, which did not even work, and even longer to get back with a working answer that turned out to be more complicated than what I had worked out for myself in the meantime. So I, for one, am very happy with R and the direction it is going. I am grateful to R core and all the others who are improving this great program. And I am trying to do my part in improving it. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111
Re: [R] choose.files limit?
Gunnar, Did you find a solution? I'm facing the same problem (I would like to load 2868 files in one go). Is it a bug, or just something that should be documented in the help? Also, the error message is misleading, as it says it «cannot find» a certain file (the name given here is truncated in my case), instead of «too many files selected». I'm not really familiar with bug filing; could someone with more experience tell whether it's really a bug? It's easy to reproduce: just select a lot of files (I couldn't find the exact threshold, unlike Gunnar) with choose.files(). Etienne

Gunnar W. Schade wrote:
> Howdy, When I use the choose.files command to read a large number of file names into a character vector inside a function, used to access these files one after the other, there appears to be a limit. I do not know whether it is arbitrary, but in this case the limit was 991 files. The file names are long. Does that matter? The error message that appears says that it cannot find file #992 (giving the file name), although it is certainly there (I tried changing which file is #992 and it did not matter). Suggestions? - Gunnar
> --- Dr. Gunnar W. Schade, Assistant Professor, Texas A&M University, Department of Atmospheric Sciences, 1104 Eller O&M Building, 3150 TAMU, College Station, TX 77843-3150 USA, ph.: 979 845 0633, Fax: 979 862 4466
Re: [R] code
There is an hpd function in the TeachingDemos package that may do what you want. There are other hpd-related functions in other packages as well. Which will work best for you depends on details that you did not provide. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Rahim Alhamzawi Sent: Wednesday, December 09, 2009 3:24 AM To: r-h...@stat.math.ethz.ch Subject: [R] code
> Dear support, I want to compute the highest probability density for any data. Please, do you have any code to help me with this subject? I am looking forward to hearing from you as soon as possible. rahim
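For a rough idea of what such functions compute, here is a sketch (not the TeachingDemos hpd() implementation, and the function name is invented): for a unimodal sample, the 95% HPD interval is simply the shortest interval containing 95% of the points.

```r
# Shortest-interval estimate of the highest (posterior) density region
# from a sample; assumes a unimodal distribution.
hpd_interval <- function(x, mass = 0.95) {
  x <- sort(x)
  n <- length(x)
  k <- ceiling(mass * n)                 # points each candidate interval must cover
  widths <- x[k:n] - x[1:(n - k + 1)]    # width of every covering interval
  i <- which.min(widths)                 # the shortest one wins
  c(lower = x[i], upper = x[i + k - 1])
}

set.seed(1)
ci <- hpd_interval(rnorm(10000))         # roughly (-1.96, 1.96) for N(0, 1)
```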
[R] .Rhistory in R.app
Dear R users, I am having a minor but annoying issue with R.app: it doesn't retain the history information from previous sessions. By history, I mean a record of commands/functions entered into R rather than the list of objects--that is properly recorded in the .Rdata file as well as in a workspace file I save separately. System details: R version 2.9.0; R.app GUI 1.28; Mac OS 10.6.2 (MacBook, Intel 2.4 GHz, 4 GB RAM). Things I've done: In R > Preferences > Startup > History, "Read history file on startup" is checked; "R history file directory" is specified with a path to my preferred directory (~/Documents/...). I've tried it with the default setting, too--it makes no difference. I've checked the permissions on the .Rhistory file. The default .Rhistory file created by R has the permissions set at -rw-r--r--. I've moved the .Rhistory file to a different location (Desktop), so that R would create a new one. Makes no difference--command history is still empty at startup. R has kept track of history on my system in the past--the file I moved to the desktop has a record of my work from about a year ago. (By the way, that file's permissions are -rwx-.) Judging by what is in the old .Rhistory file, the problem started around the time of my upgrade from 2.7.x to 2.8. I am reluctant to upgrade to R 2.10 in the middle of a project, because every R upgrade I've done in the past has broken something, and I've had nothing but grief with my open source apps after upgrading to Snow Leopard. So if there is some kind of a fix that doesn't involve upgrading R, I'd love to hear about it. Maria Gouskova
[R] Population Histogram
How would I make a population histogram in R from an Excel file? Thanks
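A minimal route (the file name and column name below are hypothetical stand-ins): export the sheet to CSV, read it with read.csv(), and call hist(). Packages such as gdata (read.xls) can also read Excel files directly.

```r
# Stand-in for: pop <- read.csv("mydata.csv")  -- substitute your own file
# and column header; "population" is made up for this sketch.
pop <- data.frame(population = c(120, 340, 560, 230, 410))

# hist() draws the histogram and (invisibly) returns the bin counts.
h <- hist(pop$population, main = "Population", xlab = "Population")
```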
Re: [R] Can elements of a list be passed as multiple arguments?
On Dec 9, 2009, at 12:14 PM, Peng Yu wrote:
> On Tue, Dec 8, 2009 at 11:05 PM, David Winsemius dwinsem...@comcast.net wrote:
>> On Dec 8, 2009, at 11:37 PM, Peng Yu wrote: I want to split a matrix, where 'u' and 'w' are the results of two possible approaches. However, whenever 'n' changes, the arguments passed to mapply() have to change. Is there a way to pass the elements of a list as multiple arguments?
>> You need to explain what you want in more detail. In your example mapply did exactly what you told it to. No errors. Three matrices. What were you expecting when you gave it three lists in each argument?
> I want a general solution so that I don't have to always write v[[1]], v[[2]], ..., v[[n]] like in the following, because that way would not work if 'n' is an arbitrary number.
> w = mapply(function(x,y) {cbind(x,y)}, v[[1]], v[[2]], ..., v[[n]])
> One way that I can think of is to somehow expand a list (i.e., v in this case) to a set of arguments that can be passed to mapply().

The functions illustrated on the help page for Reduce address the task of passing arbitrarily long lists of arguments to functions expecting two. It's possible that do.call might address this, but I have not come up with a strategy that deals with your structures. -- David.

> m = 10; n = 2; k = 3
> set.seed(0)
> x = replicate(n, rnorm(m))
> f = sample(1:k, size = m, replace = T)
> u = split(as.data.frame(x), f)
> v = lapply(1:dim(x)[[2]], function(i) { split(x[,i], f) })
> w = mapply(function(x,y) { cbind(x,y) }, v[[1]], v[[2]])
David Winsemius, MD Heritage Laboratories West Hartford, CT
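Following up on the do.call idea: it does handle this structure. do.call() expands the list v into separate arguments of mapply(), so nothing in the call changes when n does (a sketch with a toy v):

```r
# Toy stand-in for the list of split columns: n = 2 components,
# each itself a list of per-group vectors.
v <- list(list(a = 1:2, b = 3:4),
          list(a = 5:6, b = 7:8))

# Equivalent to mapply(FUN, v[[1]], v[[2]], ..., v[[n]]) for any n.
w <- do.call(mapply,
             c(list(FUN = function(...) cbind(...), SIMPLIFY = FALSE), v))
# w$a binds the group-"a" pieces column-wise; likewise w$b.
```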
Re: [R] Subset sum problem.
Geert Janssens (janssens-geert at telenet.be) writes:
> Hi, I'm quite new to the R project. I was advised to look into it because I am trying to solve the subset sum problem, which basically is: given a set of integers and an integer s, does any non-empty subset sum to s? (See http://en.wikipedia.org/wiki/Subset_sum_problem) I have been searching the web for quite some time now (which is how I eventually discovered that my problem is called subset sum), but I can't seem to find an easily applicable implementation. I did search the list archive, the R website, and used the help.search and apropos functions. I'm afraid nothing obvious showed up for me. Has anybody tackled this issue before in R? If so, I would be very grateful if you could share your solution with me.

Is it really true that you only want to see a Yes or No answer to the question whether a subset sums up to s --- without learning which numbers this subset is composed of (the pure SUBSET SUM problem)? Then the following procedure does that in a reasonable amount of time (returning 'TRUE' or 'FALSE' instead of Y-or-N):

# Exact algorithm for the SUBSET SUM problem
exactSubsetSum <- function(S, t) {
  S <- S[S <= t]
  if (sum(S) < t) return(FALSE)
  S <- sort(S, decreasing = TRUE)
  n <- length(S)
  L <- c(0)
  for (i in 1:n) {
    L <- unique(sort(c(L, L + S[i])))
    L <- L[L <= t]
    if (max(L) == t) return(TRUE)
  }
  return(FALSE)
}

# Example with a set of cardinality 64
amount <- 4748652
products <- c(30500,30500,30500,30500,42000,42000,42000,42000,
              42000,42000,42000,42000,42000,42000,71040,90900,
              76950,35100,71190,53730,456000,70740,70740,533600,
              83800,59500,27465,28000,28000,28000,28000,28000,
              26140,49600,77000,123289,27000,27000,27000,27000,
              27000,27000,8,33000,33000,55000,77382,48048,
              51186,4,35000,21716,63051,15025,15025,15025,
              15025,80,111,59700,25908,829350,1198000,1031655)

# Timing is not that bad
system.time( sol <- exactSubsetSum(products, amount) )
#   user  system elapsed
#  0.516   0.096   0.673
sol
# [1] TRUE

Thank you very much.
Geert
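If one also wants the numbers making up the subset (the question raised above), the same dynamic-programming idea can carry index sets along. A small-scale sketch (the function name is invented, and memory grows with the number of reachable sums):

```r
# For each reachable sum (stored as a character key), remember one set of
# indices producing it; stop as soon as the target t becomes reachable.
subsetSumMembers <- function(S, t) {
  reach <- list(`0` = integer(0))
  for (i in seq_along(S)) {
    for (s in as.numeric(names(reach))) {   # sums reachable before item i
      ns <- s + S[i]
      key <- as.character(ns)
      if (ns <= t && is.null(reach[[key]]))
        reach[[key]] <- c(reach[[as.character(s)]], i)
    }
    hit <- reach[[as.character(t)]]
    if (!is.null(hit)) return(S[hit])       # the members, not just TRUE
  }
  NULL                                      # no subset sums to t
}
```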
Re: [R] .Rhistory in R.app
I just experimented and I find that, exactly as Maria described, on my system no commands get added to .Rhistory when I start R using the GUI. The ``timestamp'' that is implemented in my .Rprofile gets added, but no commands that I typed in the GUI window appeared. I had never noticed this before since I never actually use the GUI. (Like ***all civilized*** people, I use the command line exclusively. :-) ) I am running R 2.10.0, so updating is not the issue. Maria: why don't you just start R from the command line, like a civilized person ( :-) ), and forget about the expletive-deleted GUI, which only gets in the way of serious work? cheers, Rolf Turner

P. S.:
sessionInfo()
R version 2.10.0 (2009-10-26) i386-apple-darwin8.11.1
locale: [1] en_NZ.UTF-8/en_NZ.UTF-8/C/C/en_NZ.UTF-8/en_NZ.UTF-8
attached base packages: [1] datasets utils stats graphics grDevices methods base
other attached packages: [1] misc_0.0-11 fortunes_1.3-6 MASS_7.3-3

On 10/12/2009, at 7:33 AM, Maria Gouskova wrote:
> Dear R users, I am having a minor but annoying issue with R.app: it doesn't retain the history information from previous sessions. By history, I mean a record of commands/functions entered into R rather than the list of objects--that is properly recorded in the .Rdata file as well as in a workspace file I save separately. System details: R version 2.9.0; R.app GUI 1.28; Mac OS 10.6.2 (MacBook, Intel 2.4 GHz, 4 GB RAM). Things I've done: In R > Preferences > Startup > History, "Read history file on startup" is checked; "R history file directory" is specified with a path to my preferred directory (~/Documents/...). I've tried it with the default setting, too--it makes no difference. I've checked the permissions on the .Rhistory file. The default .Rhistory file created by R has the permissions set at -rw-r--r--. I've moved the .Rhistory file to a different location (Desktop), so that R would create a new one. Makes no difference--command history is still empty at startup.
> R has kept track of history on my system in the past--the file I moved to the desktop has a record of my work from about a year ago. (By the way, that file's permissions are -rwx-.) Judging by what is in the old .Rhistory file, the problem started around the time of my upgrade from 2.7.x to 2.8. I am reluctant to upgrade to R 2.10 in the middle of a project, because every R upgrade I've done in the past has broken something, and I've had nothing but grief with my open source apps after upgrading to Snow Leopard. So if there is some kind of a fix that doesn't involve upgrading R, I'd love to hear about it. Maria Gouskova
Re: [R] .Rhistory in R.app
Maria, Try changing the name of .Rhistory in the Startup preferences to something like .Rosxhistory. Press enter to make sure the change is accepted, and try again. The problem is that R itself overwrites the file .Rhistory if it is told to save the workspace. Rob

On Dec 9, 2009, at 10:33 AM, Maria Gouskova wrote:
> Dear R users, I am having a minor but annoying issue with R.app: it doesn't retain the history information from previous sessions. By history, I mean a record of commands/functions entered into R rather than the list of objects--that is properly recorded in the .Rdata file as well as in a workspace file I save separately. System details: R version 2.9.0; R.app GUI 1.28; Mac OS 10.6.2 (MacBook, Intel 2.4 GHz, 4 GB RAM). Things I've done: In R > Preferences > Startup > History, "Read history file on startup" is checked; "R history file directory" is specified with a path to my preferred directory (~/Documents/...). I've tried it with the default setting, too--it makes no difference. I've checked the permissions on the .Rhistory file. The default .Rhistory file created by R has the permissions set at -rw-r--r--. I've moved the .Rhistory file to a different location (Desktop), so that R would create a new one. Makes no difference--command history is still empty at startup.
Maria Gouskova
Re: [R] Significant performance difference between split of a data.frame and split of vectors
On Tue, Dec 8, 2009 at 10:37 PM, David Winsemius dwinsem...@comcast.net wrote:
> On Dec 8, 2009, at 11:28 PM, Peng Yu wrote: I have the following code, which tests the split on a data.frame and the split on each column (as vector) separately. The runtimes differ by a factor of 10, and when m and k increase the difference becomes even bigger. I'm wondering why the performance on data.frame is so bad. Is it a bug in R? Can it be improved?
> You might want to look at the data.table package. The author claims significant speed improvements over data.frames.

'data.table' doesn't seem to help. You can try the other set of m, n, k. In both cases, using as.data.frame is faster than using as.data.table. Please let me know if I understand what you meant.

m = 10; n = 6; k = 3
# m = 30; n = 6; k = 3
set.seed(0)
x = replicate(n, rnorm(m))
f = sample(1:k, size = m, replace = T)
library(data.table)
Loading required package: ref
dim(refdata) and dimnames(refdata) no longer allow parameter ref=TRUE, use dim(derefdata(refdata)), dimnames(derefdata(refdata)) instead

system.time(split(as.data.frame(x), f))
   user  system elapsed
  0.000   0.000   0.003
system.time(split(as.data.table(x), f))
   user  system elapsed
  0.010   0.000   0.011
system.time(split(as.data.frame(x), f))
   user  system elapsed
  1.700   0.010   1.786
system.time(lapply(1:dim(x)[[2]], function(i) { split(x[,i], f) }))
   user  system elapsed
  0.170   0.000   0.167

###
m = 3; n = 6; k = 3000
set.seed(0)
x = replicate(n, rnorm(m))
f = sample(1:k, size = m, replace = T)
system.time(split(as.data.frame(x), f))
system.time(lapply(1:dim(x)[[2]], function(i) { split(x[,i], f) }))
David Winsemius, MD Heritage Laboratories West Hartford, CT
Re: [R] Why cannot get the expected values in my function
Thanks very, very much. It is really not easy to change from one language to another. :)

2009/12/9 Gavin Simpson gavin.simp...@ucl.ac.uk:
> On Tue, 2009-12-08 at 23:22 -0500, David Winsemius wrote:
>> On Dec 8, 2009, at 11:07 PM, rusers.sh wrote: Hi, In the following function, I hope to save my simulated data into the result dataset, but the final result dataset seems not to be generated.
>>
>> # Function
>> simdata <- function(nsim) {
>> # Instead why not: cbind(x = runif(nsim), y = runif(nsim)), or:
>> m <- matrix(runif(nsim * 2), ncol = 2)
>> ## if names on m are needed
>> colnames(m) <- c("x", "y")
>> G }
>>
>> # simulation
>> simdata(10)
>> # correct result
>>              x           y
>>  [1,] 0.2655087 0.372123900
>>  [2,] 0.1848823 0.702374036
>>  [3,] 0.1680415 0.807516399
>>  [4,] 0.5858003 0.008945796
>>  [5,] 0.2002145 0.685218596
>>  [6,] 0.6062683 0.937641973
>>  [7,] 0.9889093 0.397745453
>>  [8,] 0.4662952 0.207823317
>>  [9,] 0.2216014 0.024233910
>> [10,] 0.5074782 0.306768506
>> But the dataset result was not assigned the above values. What is the problem?
>> result # wrong result??
>>        x  y
>>  [1,] NA NA
>>  [2,] NA NA
>>  [3,] NA NA
>>  [4,] NA NA
>>  [5,] NA NA
>>  [6,] NA NA
>>  [7,] NA NA
>>  [8,] NA NA
>>  [9,] NA NA
>> [10,] NA NA
>> Thanks a lot. -- Jane Chang, Queen's

David Winsemius, MD Heritage Laboratories West Hartford, CT

-- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT.
[w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% -- - Jane Chang Queen's
[R] formula () problems
Hi. I am having difficulty creating a formula for use with glm(). I have a matrix of an unknown number of columns and wish to estimate a coefficient for each column, and one for each product of a column with another column. In the case of a five-column matrix this would be:

x <- matrix(rnorm(100), ncol = 5)
colnames(x) <- letters[1:5]
z <- rnorm(20)
lm(z ~ -1 + (a + b + c + d + e)^2, data = data.frame(x))

Call: lm(formula = z ~ -1 + (a + b + c + d + e)^2, data = data.frame(x))

Coefficients:
       a        b        c        d        e      a:b      a:c      a:d
-0.30021 -0.21465  0.12208  0.06308  0.28806  0.34482 -1.00072  0.48218
     a:e      b:c      b:d      b:e      c:d      c:e      d:e
 0.28786 -0.46306  0.39844  0.04436  0.32236 -0.09210 -1.06625

This is what I want: five single terms (a-e) and 5*(5-1)/2 = 10 cross terms (a:b to d:e). If there were 6 columns I would want (a+b+c+d+e+f)^2 and have 21 (= 6 + 15) terms. How do I create a formula that does this for an arbitrary number of columns? thanks Robin
Re: [R] formula () problems
On Dec 9, 2009, at 2:22 PM, Dr R.K.S. Hankin wrote:
> Hi. I am having difficulty creating a formula for use with glm(). I have a matrix of an unknown number of columns and wish to estimate a coefficient for each column, and one for each product of a column with another column. [five-column example snipped] This is what I want: five single terms (a-e) and 5*(5-1)/2 = 10 cross terms (a:b to d:e). If there were 6 columns I would want (a+b+c+d+e+f)^2 and have 21 (= 6 + 15) terms. How do I create a formula that does this for an arbitrary number of columns? thanks Robin

Robin, Try this:

lm(z ~ (.)^2 - 1, data = data.frame(x))

See the Details section of ?formula, which describes the use of '.' to refer to all columns not otherwise already in the formula. HTH, Marc Schwartz
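The same result can also be had by assembling the formula text for an arbitrary column count (a sketch equivalent to the '.' shortcut above, using the thread's setup):

```r
set.seed(1)
x <- matrix(rnorm(100), ncol = 5)
colnames(x) <- letters[1:5]
z <- rnorm(20)

# Paste the column names into "z ~ -1 + (a + b + c + d + e)^2"; this works
# unchanged for any ncol(x).
f <- as.formula(paste("z ~ -1 + (", paste(colnames(x), collapse = " + "), ")^2"))
fit <- lm(f, data = data.frame(x))
length(coef(fit))  # 5 main effects + choose(5, 2) = 15 coefficients
```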
Re: [R] Significant performance difference between split of a data.frame and split of vectors
On Dec 9, 2009, at 2:59 PM, Peng Yu wrote:
> On Tue, Dec 8, 2009 at 10:37 PM, David Winsemius dwinsem...@comcast.net wrote:
>> On Dec 8, 2009, at 11:28 PM, Peng Yu wrote: I have the following code, which tests the split on a data.frame and the split on each column (as vector) separately. The runtimes differ by a factor of 10, and when m and k increase the difference becomes even bigger. I'm wondering why the performance on data.frame is so bad. Is it a bug in R? Can it be improved?
>> You might want to look at the data.table package. The author claims significant speed improvements over data.frames.
> 'data.table' doesn't seem to help. You can try the other set of m, n, k. In both cases, using as.data.frame is faster than using as.data.table. Please let me know if I understand what you meant.

I was only suggesting that you look at it because it appeared in other situations to have efficiency advantages. As it turned out, that structure offered no advantage when I tested it. -- David.

> [test code from the earlier message snipped]

David Winsemius, MD Heritage Laboratories West Hartford, CT
[R] [R-pkgs] new version of RobASt-family of packages
The new version 0.7 of our RobASt family of packages has been available on CRAN for several days. As there were many changes, we will only sketch the most important ones here. For more details see the corresponding NEWS files (e.g. news(package = "RobAStBase"), or use function NEWS from package startupmsg, i.e. NEWS("RobAStBase")). First of all, the new package RobLoxBioC was added to the family, which includes S4 classes and methods for preprocessing omics data, in particular gene expression data.

## All packages (RandVar, RobAStBase, RobLox, RobLoxBioC, ROptEst, ROptEstOld, ROptRegTS, RobRex)
- a TOBEDONE file was added as a starting point for collaborations. The file can be displayed via function TOBEDONE from package startupmsg; e.g. TOBEDONE("distr")
- a tests/Examples folder for some automatic testing was introduced

## Package RandVar
- mainly fixing of warnings and bugs

## Package RobAStBase
- enhanced plotting, in particular methods for qqplot
- unified treatment of NAs
- extended implementation for total variation neighbourhoods
- implementation of k-step estimator construction extended

## Package RobLox
- na.rm argument added
- introduction of finite-sample correction

## Package ROptEst
- optional use of an alternative algorithm to obtain Lagrange multipliers using duality-based optimization
- extended implementation for total variation neighbourhoods
- solutions for general parameter transformations with nuisance components
- several extensions to the examples in folder scripts
- implementation of k-step estimator construction extended

## Package ROptEstOld
- still needed for packages ROptRegTS and RobRex
- removed Symmetry and DistributionSymmetry implementation to make ROptEstOld compatible with distr 2.2

## Package ROptRegTS
- still depends on ROptEstOld

## Package RobRex
- moved some of the examples into \dontrun{} to reduce check time ...
- some minor corrections in ExamplesEstimation.R in folder scripts Best Peter Matthias -- Dr. Matthias Kohl www.stamats.de ___ R-packages mailing list r-packa...@r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Significant performance difference between split of a data.frame and split of vectors
On Wed, 9 Dec 2009, Peng Yu wrote: On Wed, Dec 9, 2009 at 11:20 AM, Charles C. Berry cbe...@tajo.ucsd.edu wrote: On Wed, 9 Dec 2009, Peng Yu wrote: On Tue, Dec 8, 2009 at 11:06 PM, David Winsemius dwinsem...@comcast.net wrote: On Dec 9, 2009, at 12:00 AM, Peng Yu wrote: On Tue, Dec 8, 2009 at 10:37 PM, David Winsemius dwinsem...@comcast.net wrote: On Dec 8, 2009, at 11:28 PM, Peng Yu wrote: I have the following code, which tests the split on a data.frame and the split on each column (as vector) separately. The runtimes differ by a factor of 10. When m and k increase, the difference becomes even bigger. I'm wondering why the performance on a data.frame is so bad. Is it a bug in R? Can it be improved? You might want to look at the data.table package. The author claims significant speed improvements over data.frames. This bug was found a long time ago and a package has been developed for it. Should the fix be integrated in data.frame rather than be implemented in an additional package? What bug? Is the slow speed in splitting a data.frame a performance bug? NO! The two computations are not equivalent. One is a list whose elements are split vectors, and the other is a list of data.frames containing those vectors. I made a comparable example below. Still, splitting a data.frame is much slower compared with the second way that I'm showing. If you take the trouble to assemble that list of data frames from the list of split vectors you will see that it is very time consuming. It is not, as I show in the example below. You are comparing creating a matrix to creating a data.frame. system.time( + spl <- mapply( + function(...) { + cbind(...) + } + , v[[1]], v[[2]], v[[3]], v[[4]], v[[5]], v[[6]] + ) + ) user system elapsed 1.204 0.016 1.478 system.time( + spl <- mapply( + function(...) { + data.frame(...)
+ } + , v[[1]], v[[2]], v[[3]], v[[4]], v[[5]], v[[6]],SIMPLIFY=FALSE + ) + ) user system elapsed 56.088 0.104 56.478 If you just want a list of matrices, use system.time(split.data.frame(x,f)) user system elapsed 0.524 0.016 0.927 Read up on memory management issues. Think about what the computer actually has to do in terms of memory access to split a data.frame versus split a vector. I'd like to read more on how R do memory management. Would you please point me a good source? I see now that the timing issue was not one of memory, but of doing more work (see Rprof results below) to create a data.frame. But if you are interested you might look at Golub, Gene H.; Van Loan, Charles F. (1996), Matrix Computations (3rd ed.), Johns Hopkins, ISBN 978-0-8018-5414-9 . and/or Google BLAS memory But again, R is not user friendly. It took me quite a long time to figure out that splitting a data.frame is a bottle neck in my program and reduce the problem into a test case. See ?Rprof and note where the 'self.time's are largest below( not in split or split.data.frame) : Rprof() res - split(as.data.frame(x),f) Rprof(NULL) summaryRprof() $by.self self.time self.pct total.time total.pct attr 33.66 72.9 33.66 72.9 [.data.frame 3.26 7.1 45.70 98.9 inherits 1.52 3.3 2.06 4.5 anyDuplicated 1.04 2.3 1.42 3.1 [[.data.frame 1.00 2.2 4.76 10.3 [[ 0.74 1.6 5.50 11.9 match 0.66 1.4 2.96 6.4 Anonymous0.66 1.4 0.72 1.6 sys.call 0.46 1.0 0.46 1.0 all0.38 0.8 0.38 0.8 anyDuplicated.default 0.36 0.8 0.38 0.8 %in% 0.32 0.7 3.26 7.1 names 0.26 0.6 0.26 0.6 is.factor 0.24 0.5 2.30 5.0 length 0.20 0.4 0.20 0.4 attr- 0.18 0.4 0.18 0.4 as.character 0.16 0.3 0.16 0.3 [ 0.14 0.3 45.84 99.2 - 0.14 0.3 0.14 0.3 ! 0.12 0.3 0.12 0.3 .Call 0.12 0.3 0.12 0.3 != 0.10 0.2 0.10 0.2 vector 0.06 0.1 0.26 0.6 as.data.frame.matrix 0.06 0.1 0.08 0.2 | 0.06 0.1 0.06 0.1 lapply 0.04 0.1 46.12 99.8 0.04 0.1 0.04 0.1 any0.04 0.1 0.04 0.1 is.na
[R] [R-pkgs] doMPI 0.1-3
I'd like to announce the availability of the new doMPI package, a parallel backend for the foreach package, which acts as an adaptor to the Rmpi package. The package has been uploaded to CRAN and is now available. Like the doSNOW package, doMPI allows you to execute foreach loops in parallel using Rmpi as the underlying transport. But I was interested in experimenting with using Rmpi directly so that data that was used in all iterations of a foreach loop could be broadcast to the cluster workers using the Rmpi mpi.bcast function. I also wanted to write the package so it could fetch arguments and process results dynamically, allowing it to handle an arbitrary number of tasks in a memory efficient way. The package includes a number of example scripts and an introductory vignette, in addition to the standard help documentation. The vignette also attempts to explain how to run doMPI scripts using the Open MPI orterun command, which I hope helps people who are new to Rmpi get started running parallel R programs. - Steve Weston ___ R-packages mailing list r-packa...@r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] re-ordering labels issue using barchart().
Hi All, I have a question regarding re-ordering legend labels; I'm not sure if it is possible to do what I'm thinking... Using the following code, I reordered labels for the legend. Now in the stacked bars, we see 1931 values plotted below 1932, whereas in the legend 1931 comes above 1932. So in the stacked bars colour blue is below pink, while the legend shows it the other way around. Is there a way to re-order the legend so that the colours appear in the same order? Code:

library(lattice)
barley$year <- factor(barley$year, levels = c("1931", "1932"))
barchart(yield ~ variety | site, data = barley, groups = year,
         layout = c(1, 6), stack = TRUE,
         auto.key = list(points = FALSE, rectangles = TRUE, space = "right"),
         ylab = "Barley Yield (bushels/acre)",
         scales = list(x = list(rot = 45)))

Thanks a lot, Peng Cai
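One workaround (a sketch, not from the thread's replies): build the key by hand with the group levels and fill colours reversed, so the legend reads top-down in the same order as the stacked segments appear on screen.

```r
# Sketch: reverse both the key text and the key rectangles so the legend
# order matches the visual stacking order (level 1 at the bottom).
library(lattice)
barley$year <- factor(barley$year, levels = c("1931", "1932"))
cols <- trellis.par.get("superpose.polygon")$col[1:2]  # default fill colours
barchart(yield ~ variety | site, data = barley, groups = year,
         layout = c(1, 6), stack = TRUE, col = cols,
         key = list(space = "right",
                    rectangles = list(col = rev(cols)),
                    text = list(rev(levels(barley$year)))),
         ylab = "Barley Yield (bushels/acre)",
         scales = list(x = list(rot = 45)))
```

Here `key` replaces `auto.key` entirely, which gives full control over the ordering at the cost of listing the colours explicitly.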
[R] Fwd: Evaluating a function within a pre-defined environment?
Hi all, I have a somewhat confusing question that I was wondering if someone could help with. I have a pre-defined environment with some variables, and I would like to define a function, such that when it is called, it actually manipulates the variables in that environment, leaving them to be examined later. I see from the R language definition that When a function is called, a new environment (called the evaluation environment) is created, whose enclosure (see Environment objects) is the environment from the function closure. This new environment is initially populated with the unevaluated arguments to the function; as evaluation proceeds, local variables are created within it. So basically, I think I am asking if it is possible to pre-create my own evaluation environment and have it retain the state that it was in at the end of the function call? Example: e - new.env() e$x - 3 f - function(xx) x - x + xx can I then call f(2) and have it leave e$x at 5 after the function returns? I know that environment(f) - e goes part of the way, but I would like to let the function also write to the environment. Thanks for any advice. --David __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
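A minimal sketch of one way to get exactly this behaviour (complementing the replies below): set the function's enclosing environment to e and use the superassignment operator, so the update lands in e itself rather than in the throwaway evaluation frame.

```r
# Sketch: superassignment (<<-) searches the enclosing environments,
# finds x in e, and assigns there, so the state survives the call.
e <- new.env()
e$x <- 3
f <- function(xx) x <<- x + xx
environment(f) <- e   # make e the enclosure of f
f(2)
e$x   # 5
```

Note that with plain `x <- x + xx` the assignment would be local to the call frame and discarded when f returns, which is why the original example did not work.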
[R] What is the development cycle where there are code in tests/ for package development?
I see 'library(stats)' at the beginning of R-2.10.0/src/library/stats/tests/nls.R. I'm wondering, if I am developing my own package 'mypackage', whether I should put 'library(mypackage)' in a .R file in mypackage/tests/. If I do, then it seems awkward to me, because to use 'library(mypackage)' I have to first get 'mypackage' installed. So the development cycle is: try test cases in tests/ -> see bugs in 'mypackage' -> modify the code in 'mypackage' -> install 'mypackage' -> try test cases in tests/ again. But I think it would be faster if the step of installing the package were avoided. So instead of using 'library(mypackage)', I'd think to use 'source("some_file_in_mypackage.R")' in any file in tests/. Could somebody let me know what the current standard way of developing a package is? Why is 'library(mypackage)' rather than 'source("some_file_in_mypackage.R")' used?
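A sketch of one common workflow (paths hypothetical): during development, source the package's R files directly for a fast edit/reload loop, and reserve the install step for when you run the formal checks. The files in tests/ use 'library(mypackage)' because 'R CMD check' runs them against the *installed* package, which is what end users will actually load.

```r
# Quick iteration: load the package code without installing it.
# "mypackage/R" is a hypothetical path to your package source tree.
for (f in list.files("mypackage/R", pattern = "\\.R$", full.names = TRUE))
  source(f)

# When the code settles, install and run the official checks, e.g. from a shell:
#   R CMD INSTALL mypackage
#   R CMD check mypackage    # this executes every .R file under mypackage/tests/
```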
Re: [R] Fwd: Evaluating a function within a pre-defined environment?
e <- new.env()
e$x <- 2
f <- function(a, e) { e$x <- e$x + a; e$x }
f(3, e)
e$x # 5

Another way to accomplish this is to use the proto package which puts the whole thing into an object oriented framework. See http://r-proto.googlecode.com

library(proto)
p <- proto(x = 2, f = function(this, a) { this$x <- this$x + a; this$x })
p$f(3) # 5

On Wed, Dec 9, 2009 at 4:54 PM, David Reiss dre...@systemsbiology.org wrote: Hi all, I have a somewhat confusing question that I was wondering if someone could help with. I have a pre-defined environment with some variables, and I would like to define a function, such that when it is called, it actually manipulates the variables in that environment, leaving them to be examined later. I see from the R language definition that When a function is called, a new environment (called the evaluation environment) is created, whose enclosure (see Environment objects) is the environment from the function closure. This new environment is initially populated with the unevaluated arguments to the function; as evaluation proceeds, local variables are created within it. So basically, I think I am asking if it is possible to pre-create my own evaluation environment and have it retain the state that it was in at the end of the function call? Example: e <- new.env() e$x <- 3 f <- function(xx) x <- x + xx can I then call f(2) and have it leave e$x at 5 after the function returns? I know that environment(f) <- e goes part of the way, but I would like to let the function also write to the environment. Thanks for any advice. --David
Re: [R] Fwd: Evaluating a function within a pre-defined environment?
Ideally I would like to be able to use the function f (in my example) as-is, without having to designate the environment as an argument, or to otherwise have to use e$x in the function body. thanks for any further advice... On Wed, Dec 9, 2009 at 2:36 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote: e - new.env() e$x - 2 f - function(a, e) { e$x - e$x + a; e$x } f(3, e) e$x # 5 Another way to accomplish this is to use the proto package which puts the whole thing into an object oriented framework. See http://r-proto.googlecode.com library(proto) p - proto(x = 2, f = function(this, a) { this$x - this$x + a; this$x }) p$f(3) # 5 On Wed, Dec 9, 2009 at 4:54 PM, David Reiss dre...@systemsbiology.org wrote: Hi all, I have a somewhat confusing question that I was wondering if someone could help with. I have a pre-defined environment with some variables, and I would like to define a function, such that when it is called, it actually manipulates the variables in that environment, leaving them to be examined later. I see from the R language definition that When a function is called, a new environment (called the evaluation environment) is created, whose enclosure (see Environment objects) is the environment from the function closure. This new environment is initially populated with the unevaluated arguments to the function; as evaluation proceeds, local variables are created within it. So basically, I think I am asking if it is possible to pre-create my own evaluation environment and have it retain the state that it was in at the end of the function call? Example: e - new.env() e$x - 3 f - function(xx) x - x + xx can I then call f(2) and have it leave e$x at 5 after the function returns? I know that environment(f) - e goes part of the way, but I would like to let the function also write to the environment. Thanks for any advice. 
--David
Re: [R] Population Histogram
1. Read the posting guide: http://www.R-project.org/posting-guide.html 2. Did you install R? 2.1 If yes, go to the Help menu and click on "Manual in PDF". bests, miltinho On Wed, Dec 9, 2009 at 1:47 PM, terry johnson terry.johnson@gmail.com wrote: How would I make a population histogram in R from an excel file? Thanks
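A minimal sketch, assuming the Excel sheet has been saved as CSV and has a column of population values (the file name and column name below are hypothetical):

```r
# Export the sheet from Excel as CSV first (File > Save As > CSV),
# then read it and plot the histogram of the population column.
d <- read.csv("mydata.csv")
hist(d$population,
     main = "Population histogram",
     xlab = "Population",
     col  = "grey")
```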
Re: [R] Subset sum problem.
On Wednesday 9 December 2009, Hans W Borchers wrote: Geert Janssens janssens-geert at telenet.be writes: Hi, I'm quite new to the R-project. I was suggested to look into it because I am trying to solve the Subset sum problem, which basically is: Given a set of integers and an integer s, does any non-empty subset sum to s? (See http://en.wikipedia.org/wiki/Subset_sum_problem) I have been searching the web for quite some time now (which is how I eventually discovered that my problem is called subset sum), but I can't seem to find an easily applicable implementation. I did search the list archive, the R website and used the help.search and apropos functions. I'm afraid nothing obvious showed up for me. Has anybody tackled this issue before in R? If so, I would be very grateful if you could share your solution with me. Is it really true that you only want to see a Yes or No answer to this question whether a subset sums up to s --- without learning which numbers this subset is composed of (the pure SUBSET SUM problem)? Then the following procedure does that in a reasonable amount of time (returning 'TRUE' or 'FALSE' instead of Y-or-N): Unfortunately no. I do need the numbers in the subset. But thank you for presenting this code.
Geert

# Exact algorithm for the SUBSET SUM problem
exactSubsetSum <- function(S, t) {
  S <- S[S <= t]
  if (sum(S) < t) return(FALSE)
  S <- sort(S, decreasing=TRUE)
  n <- length(S)
  L <- c(0)
  for (i in 1:n) {
    L <- unique(sort(c(L, L + S[i])))
    L <- L[L <= t]
    if (max(L) == t) return(TRUE)
  }
  return(FALSE)
}

# Example with a set of cardinality 64
amount <- 4748652
products <- c(30500,30500,30500,30500,42000,42000,42000,42000, 42000,42000,42000,42000,42000,42000,71040,90900, 76950,35100,71190,53730,456000,70740,70740,533600, 83800,59500,27465,28000,28000,28000,28000,28000, 26140,49600,77000,123289,27000,27000,27000,27000, 27000,27000,8,33000,33000,55000,77382,48048, 51186,4,35000,21716,63051,15025,15025,15025, 15025,80,111,59700,25908,829350,1198000,1031655)

# Timing is not that bad
system.time( sol <- exactSubsetSum(products, amount) )
# user system elapsed
# 0.516 0.096 0.673
sol
# [1] TRUE
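Since the actual subset members are what is needed here, the following is a hedged sketch extending the same dynamic-programming idea: alongside each reachable sum, record the indices of the elements used to reach it, and return those elements when the target is hit. The function name is hypothetical, and like the yes/no version, the list of reachable sums can grow large in the worst case (it is bounded by t).

```r
# Sketch: return one subset of S summing to t, or NULL if none exists.
subsetSumMembers <- function(S, t) {
  S <- S[S <= t]
  if (sum(S) < t) return(NULL)
  reach <- c(0)                 # sums reachable so far
  from  <- list(integer(0))     # element indices producing each reachable sum
  for (i in seq_along(S)) {
    new  <- reach + S[i]
    keep <- new <= t & !(new %in% reach) & !duplicated(new)
    for (k in which(keep)) {
      reach <- c(reach, new[k])
      from[[length(from) + 1]] <- c(from[[k]], i)
      if (new[k] == t) return(S[from[[length(from)]]])
    }
  }
  NULL                          # no non-empty subset sums to t
}

subsetSumMembers(c(30500, 42000, 71040, 90900), 101540)
# returns c(30500, 71040), whose sum is the target
```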
Re: [R] Fwd: Evaluating a function within a pre-defined environment?
You could write a wrapper function that accepts the output of the function you don't want to change and then sets the values.

# f is function we don't want to change
f <- function(a) x + a

wrapper <- function(x, e) {
  environment(f) <- e
  e$x <- f(x)
}

e <- new.env()
e$x <- 2
wrapper(3, e)
e$x # 5

or with proto:

library(proto)
p <- proto(x = 2, f = f,
  wrapper = function(this, x) this$x <- with(this, f)(x))
p$wrapper(3)
p$x # 5

On Wed, Dec 9, 2009 at 5:48 PM, David Reiss dre...@systemsbiology.org wrote: Ideally I would like to be able to use the function f (in my example) as-is, without having to designate the environment as an argument, or to otherwise have to use e$x in the function body. thanks for any further advice... On Wed, Dec 9, 2009 at 2:36 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote: e <- new.env() e$x <- 2 f <- function(a, e) { e$x <- e$x + a; e$x } f(3, e) e$x # 5 Another way to accomplish this is to use the proto package which puts the whole thing into an object oriented framework. See http://r-proto.googlecode.com library(proto) p <- proto(x = 2, f = function(this, a) { this$x <- this$x + a; this$x }) p$f(3) # 5 On Wed, Dec 9, 2009 at 4:54 PM, David Reiss dre...@systemsbiology.org wrote: Hi all, I have a somewhat confusing question that I was wondering if someone could help with. I have a pre-defined environment with some variables, and I would like to define a function, such that when it is called, it actually manipulates the variables in that environment, leaving them to be examined later. I see from the R language definition that When a function is called, a new environment (called the evaluation environment) is created, whose enclosure (see Environment objects) is the environment from the function closure. This new environment is initially populated with the unevaluated arguments to the function; as evaluation proceeds, local variables are created within it.
So basically, I think I am asking if it is possible to pre-create my own evaluation environment and have it retain the state that it was in at the end of the function call? Example: e <- new.env() e$x <- 3 f <- function(xx) x <- x + xx can I then call f(2) and have it leave e$x at 5 after the function returns? I know that environment(f) <- e goes part of the way, but I would like to let the function also write to the environment. Thanks for any advice. --David
[R] Fwd: conditionally merging adjacent rows in a data frame
I've also made some comparisons and, taking execution time into account, sqldf wins. summaryBy is better than aggregate in some specific situations I have met in practice. I present this situation below. It assumes that there are at least two grouping variables with a high number of levels.

n <- 10
grp1 <- sample(1:750, n, replace=T)
grp2 <- sample(1:750, n, replace=T)
d <- data.frame(x=rnorm(n), y=rnorm(n), grp1=grp1, grp2=grp2)

# sqldf
library(sqldf)
Rprof('prof'); sqldf("select grp1, grp2, avg(x), avg(y) from d group by grp1, grp2"); Rprof(NULL); summaryRprof('prof')

# by
# do.call(rbind, by(d, list(d$grp1, d$grp2), function(x) transform(x, x = mean(x), y = mean(y))[1,,drop = FALSE]))

# doBy
library(doBy)
Rprof('prof'); summaryBy(x+y~grp1+grp2, data=d, FUN=c(mean)); Rprof(NULL); summaryRprof('prof')

# aggregate
Rprof('prof'); aggregate(d, list(d$grp1, d$grp2), function(x) mean(x)); Rprof(NULL); summaryRprof('prof')

-- Forwarded message -- From: Nikhil Kaza nikhil.l...@gmail.com Date: 2009/12/9 Subject: Re: [R] conditionally merging adjacent rows in a data frame To: Titus von der Malsburg malsb...@gmail.com DW: r-help@r-project.org This is great!! Sqldf is exactly the kind of thing I was looking for, other stuff. I suppose you can speed up both functions 1 and 5 using aggregate and tapply only once, as was suggested earlier. But it comes at the expense of readability. Nikhil On 9 Dec 2009, at 7:59AM, Titus von der Malsburg wrote: On Wed, Dec 9, 2009 at 12:11 AM, Gabor Grothendieck ggrothendi...@gmail.com wrote: Here are a couple of solutions. The first uses by and the second sqldf: Brilliant! Now I have a whole collection of solutions. I did a simple performance comparison with a data frame that has 7929 lines.
The results were as follows (loading the appropriate packages is not included in the measurements):

times <- c(0.248, 0.551, 41.080, 0.16, 0.190)
names(times) <- c("aggregate", "summaryBy", "by+transform", "sqldf", "tapply")
barplot(times, log="y", ylab="log(s)")

So sqldf clearly wins, followed by tapply and aggregate. summaryBy is slower than necessary because it computes, for both x and dur, mean /and/ sum. by+transform presumably suffers from the construction of many intermediate data frames. Are there any canonical places where R recipes are collected? If yes, I would write up a summary. These were the competitors:

# Gary's and Nikhil's aggregate solution:
aggregate.fixations1 <- function(d) {
  idx <- c(TRUE, diff(d$roi)!=0)
  d2 <- d[idx,]
  idx <- cumsum(idx)
  d2$dur <- aggregate(d$dur, list(idx), sum)[2]
  d2$x <- aggregate(d$x, list(idx), mean)[2]
  d2
}

# Marek's summaryBy:
library(doBy)
aggregate.fixations2 <- function(d) {
  idx <- c(TRUE, diff(d$roi)!=0)
  d2 <- d[idx,]
  d$idx <- cumsum(idx)
  d2$r <- summaryBy(dur+x~idx, data=d, FUN=c(sum, mean))[c("dur.sum", "x.mean")]
  d2
}

# Gabor's by+transform solution:
aggregate.fixations3 <- function(d) {
  idx <- cumsum(c(TRUE, diff(d$roi)!=0))
  d2 <- do.call(rbind, by(d, idx, function(x) transform(x, dur = sum(dur), x = mean(x))[1,,drop = FALSE]))
  d2
}

# Gabor's sqldf solution:
library(sqldf)
aggregate.fixations4 <- function(d) {
  idx <- c(TRUE, diff(d$roi)!=0)
  d2 <- d[idx,]
  d$idx <- cumsum(idx)
  d2$r <- sqldf("select sum(dur), avg(x) x from d group by idx")
  d2
}

# Titus' solution using plain old tapply:
aggregate.fixations5 <- function(d) {
  idx <- c(TRUE, diff(d$roi)!=0)
  d2 <- d[idx,]
  idx <- cumsum(idx)
  d2$dur <- tapply(d$dur, idx, sum)
  d2$x <- tapply(d$x, idx, mean)
  d2
}
-- Marek
[R] Plotting frequency curve over histogram
Hello, This is a problem for which there seem to be several solutions online, but not really. My question is about plotting a curve over a histogram. All the previous posts and messages talk about generating a *density histogram* using freq=F and then plotting the density curve. However, I find that that seriously distorts my data and the plot becomes confounding to the viewer. I was wondering if there's a way to do the following 2 things: 1) Plot both the histogram and the overlying frequency curve in one plot 2) Plot multiple frequency curves in a single plot I have been using the hist function for my job. I'd appreciate it if anyone could help me with the solution. Thanks, Gaurav
[R] Difficulty with terminal properly displaying help function in an ESS remote session
Hi all, I'm logging into a Debian server and running R remotely using ESS. The steps I use to do this are below (pasted from my webpage). However, we're having a problem whenever we want to use the help function, e.g., ?hist The remote buffer gives a warning: WARNING: terminal is not fully functional - (press RETURN) At this point we can't get back to our normal R session. When I use ESS locally on my MacPro, help screens open up in a split buffer below the one I'm working on, which is fine. So there are two issues: 1) How do we switch from the help screen back to the R session? 2) Is it possible to have help screens open up in separate buffers or in split buffers when using ESS remotely? Any help is appreciated! Matt Steps we take to use ESS remotely: 1) Open up the *.R script you’d like to use 2) Open a shell inside Emacs by typing “M-x shell” 3) From within this shell, ssh to the server you want to use. When doing this, you need to make sure to specify two important ssh options: compression (which compresses data coming to you, making the connection seem *much* faster) and X11 forwarding (which allows you to use interactive graphing features via X11). E.g.: ssh -XC usern...@servername.colorado.edu 4) You should now be logged into the server, just as you would be if you’d used terminal rather than emacs. Now open up R as you usually would on that server. E.g.: R --arch=x86_64 5) You should be in R now. To allow this R session to be linked to your *.R script, use this command in the remote R session: M-x ess-remote In the Emacs mini-buffer prompt, type: r 6) Now you should be able to send code from your *.R script to the remote R session as you normally would (e.g., C-c C-j). 7) Last, you might need to change the options in your remote R session to graph using X11 rather than whatever default driver is being used. To do this in R, type: options(device="x11") 8) That’s it.
Make sure it all works by typing something like: hist(rnorm(50)) #which should return a histogram of rnorm to your screen! -- Matthew C Keller Asst. Professor of Psychology University of Colorado at Boulder www.matthewckeller.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
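A hedged sketch of two workarounds for the pager warning: the message comes from the external pager (usually less) choking on the dumb terminal that Emacs' shell buffer provides, so either stop R from invoking a pager at all, or let ESS render help in its own buffer.

```r
# Workaround sketch: print help pages inline in the inferior-R buffer
# instead of handing them to an external pager.
options(pager = "cat")
?hist   # help text now scrolls directly into the R buffer
```

Alternatively, from the script buffer use C-c C-v (ess-display-help-on-object), which shows help in a separate Emacs buffer rather than through the remote process, addressing question 2) as well.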
Re: [R] Exporting Contingency Tables with xtable
Gabor, Thanks for the advice. Using the 'rowlabel' switch works, but when used with the 'collabel' switch, I received the following error in Latex 2e: ! LaTeX Error: Illegal character in array arg. The line that LaTeX has issue with is: \multicolumn{1}{observed}{uh} The entire table looks like: \begin{table}[!tbp] \begin{center} \begin{tabular}{lrr}\hline\hline \multicolumn{1}{l}{predicted} \multicolumn{1}{observed}{uh} \multicolumn{1}{l}{uh~} \tabularnewline \hline uh$201$$30$\tabularnewline uh~$ 6$$10$\tabularnewline \hline \end{tabular} \end{center} \end{table} I know that LaTeX has issues with unescaped tildes, but it does not explain why I get this error. Na'im On Dec 9, 2009, at 5:44 AM, Gabor Grothendieck wrote: Try the latex function in the Hmisc package. Using the state.* variables built into R for sake of example: library(Hmisc) latex(table(state.division, state.region), rowlabel = X, collabel = Y, file = ) On Wed, Dec 9, 2009 at 12:04 AM, Na'im R. Tyson nty...@clovermail.net wrote: Dear R-philes: I am having an issue with exporting contingency tables with xtable(). I set up a contingency and convert it to a matrix for passing to xtable() as shown below. v.cont.table - table(v_lda$class, grps, dnn=c(predicted, observed)) v.cont.mat - as.matrix(v.cont.table) Both produce output as follows: observed predicted uh uh~ uh 201 30 uh~ 6 10 However, when I construct the latex table with xtable(v.cont.mat), I get a good table without the headings of predicted and observed. \begin{table}[ht] \begin{center} \begin{tabular}{rrr} \hline uh uh\~{} \\ \hline uh 201 30 \\ uh\~{}6 10 \\ \hline \end{tabular} \end{center} \end{table} Question: is there any easy way to retain or re-insert the dimension names from the contingency table and matrix? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
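A hedged alternative sketch for the LaTeX error above: in Hmisc's latex(), a label spanning the data columns is normally supplied via the cgroup/n.cgroup arguments rather than collabel, which emits a well-formed \multicolumn (with a proper alignment argument) and so avoids "Illegal character in array arg".

```r
# Sketch using the built-in state.* data, as in Gabor's example:
# rowlabel names the row dimension, cgroup spans the column dimension.
library(Hmisc)
m <- table(state.division, state.region)
latex(unclass(m), rowlabel = "predicted",
      cgroup = "observed", n.cgroup = ncol(m), file = "")
```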
Re: [R] Plotting frequency curve over histogram
On 10/12/2009, at 12:52 PM, Gaurav Moghe wrote: Hello, This is a problem for which there seem to be several solutions online, but not really. My question was about plotting a curve over the histogram. All the previous posts and messages talk about generating a *density histogram*using (freq=F) and then plotting the density curve. However, I find that that seriously distorts my data and the plot becomes confounding to the viewer. How does this ``distort'' your data? You are simply changing the scale on the y-axis. I was wondering if there's a way to do the following 2 things: 1) Plot both histogram and the overlying frequency curve in one plot If you want to keep your histogram on the ``count'' scale, just multiply your density curve by the constant by which you would have divided the histogram values to change counts into density values. 2) Plot multiple frequency curves in a single plot ?lines I have been using the hist function for my job. I'd appreciate if anyone could help me with the solution cheers, Rolf Turner
Re: [R] Plotting frequency curve over histogram
On 09-Dec-09 23:52:20, Gaurav Moghe wrote: Hello, This is a problem for which there seem to be several solutions online, but not really. My question was about plotting a curve over the histogram. All the previous posts and messages talk about generating a *density histogram* using (freq=F) and then plotting the density curve. However, I find that that seriously distorts my data and the plot becomes confounding to the viewer. I was wondering if there's a way to do the following 2 things: 1) Plot both histogram and the overlying frequency curve in one plot 2) Plot multiple frequency curves in a single plot I have been using the hist function for my job. I'd appreciate if anyone could help me with the solution Thanks, Gaurav You presumably mean that the viewer expects to see a histogram of counts, with the corresponding estimated curve of expected counts for each bin-interval (NB *not* density!!) plotted over it. The following is an example of how to achieve this.

set.seed(54321)
N <- 1000
x <- rnorm(N)
H <- hist(x, breaks=50)
dx <- (H$breaks[2]-H$breaks[1])
m <- mean(x)
s <- sd(x)
x0 <- H$breaks
x1 <- c(x0[1]-dx/2, x0+dx/2)
y0 <- H$counts
lines(x1, N*dnorm((x1 - m)/s)/s*dx)

In the above, m and s are the estimated Mean and SD of the fitted Normal distribution. Therefore the estimated *density* at x is dnorm((x - m)/s)/s, and a good approximation to the probability contained in a given bin whose midpoint is at x1 is dnorm((x1 - m)/s)/s*dx, where dx is the width of the bin. The total sample size being N, the expected count for that bin is N*dnorm((x1 - m)/s)/s*dx. With this explanation, the above should now be clear! Ted. E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk Fax-to-email: +44 (0)870 094 0861 Date: 10-Dec-09 Time: 00:51:58 -- XFMail --
Re: [R] Fwd: Evaluating a function within a pre-defined environment?
On Wed, 9 Dec 2009, David Reiss wrote: Ideally I would like to be able to use the function f (in my example) as-is, without having to designate the environment as an argument, or to otherwise have to use e$x in the function body. thanks for any further advice...

Perhaps you want something along the lines of the open.account example in R-intro, section 10.7 (Scope)? Chuck

On Wed, Dec 9, 2009 at 2:36 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote:

e <- new.env()
e$x <- 2
f <- function(a, e) { e$x <- e$x + a; e$x }
f(3, e)
e$x # 5

Another way to accomplish this is to use the proto package, which puts the whole thing into an object-oriented framework. See http://r-proto.googlecode.com

library(proto)
p <- proto(x = 2, f = function(this, a) { this$x <- this$x + a; this$x })
p$f(3) # 5

On Wed, Dec 9, 2009 at 4:54 PM, David Reiss dre...@systemsbiology.org wrote: Hi all, I have a somewhat confusing question that I was wondering if someone could help with. I have a pre-defined environment with some variables, and I would like to define a function such that, when it is called, it actually manipulates the variables in that environment, leaving them to be examined later. I see from the R language definition that When a function is called, a new environment (called the evaluation environment) is created, whose enclosure (see Environment objects) is the environment from the function closure. This new environment is initially populated with the unevaluated arguments to the function; as evaluation proceeds, local variables are created within it. So basically, I think I am asking if it is possible to pre-create my own evaluation environment and have it retain the state that it was in at the end of the function call? Example:

e <- new.env()
e$x <- 3
f <- function(xx) x <- x + xx

Can I then call f(2) and have it leave e$x at 5 after the function returns? I know that environment(f) <- e goes part of the way, but I would like to let the function also write to the environment. Thanks for any advice.
--David __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry(858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to figure out which the version of split is used?
There are a number of functions that split() can dispatch to. methods('split') [1] split.data.frame split.Date split.default split.POSIXct Is there a way to figure out which of these variants is actually dispatched to when I call split()? I know that if the argument is of class data.frame, split.data.frame will be called. Is it the case that if the argument is not of class data.frame, Date or POSIXct, split.default will be called?
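No answer to this post is archived above. For what it's worth, S3 dispatch for split() follows the class() vector of its first argument, trying split.<class> for each class in turn and falling back to split.default when none matches. A small base-R sketch:

```r
x <- data.frame(a = 1:4, g = c("u", "u", "v", "v"))
class(x)                                  # "data.frame": split.data.frame runs
f <- getS3method("split", "data.frame")   # inspect the method that will be used

# a plain integer vector has implicit class "integer"/"numeric";
# there is no split.integer, so split.default is dispatched to
y <- 1:4
identical(split(y, c(1, 1, 2, 2)), split.default(y, c(1, 1, 2, 2)))
```

So yes: for objects whose classes have no specific split method, split.default is what runs.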
Re: [R] confint for glm (general linear model)
On Dec 9, 2009, at 9:21 PM, casperyc wrote: does no one know this? Have you read the Posting Guide? -- View this message in context: http://n4.nabble.com/confint-for-glm-general-linear-model-tp954071p956658.html Sent from the R help mailing list archive at Nabble.com. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Have you used RGoogleDocs and RGoogleData?
Both of these applications fulfill a great need of mine: to read data directly from google spreadsheets that are private to myself and one or two collaborators. Thanks to the authors. I had been using RGoogleDocs for the about 6 months (maybe more) but have had to stop using it in the past month since for some reason that I do not understand it no longer reads google spreadsheets. I loved it. Its loss depresses me. I started using RGoogleData which works. I have noticed that both packages read data slowly. RGoogleData is much slower than RGoogleDocs used to be. Both seem a lot slower than if one manually downloaded a google spreadsheet as a csv and then used read.csv function - but then I would not be able to use scripts and execute without finding and futzing. Can anyone explain in English why these packages read slower than a csv download? Can anyone explain what the core difference is between the two packages? Can anyone share their experience with reading Google data straight into R? Farrel Buchinsky Google Voice Tel: (412) 567-7870 Sent from Pittsburgh, Pennsylvania, United States [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] confint for glm (general linear model)
I think the help page are exactly the same... I just want to verify the confidence interval manually. That's all I want. Thanks. casper brestat wrote: This functions are different. I advice you study them: ?confint # profile likelihood ?confint.default # t-distribution Walmes Zeviani - Brazil casperyc wrote: Hi, I have a glm gives summary as follows, Estimate Std. Errorz valuePr(|z|) (Intercept) -2.03693352 1.449574526 -1.405194 0.159963578 A0.01093048 0.006446256 1.695633 0.089955471 N0.41060119 0.224860819 1.826024 0.067846690 S -0.20651005 0.067698863 -3.050421 0.002285206 then I use confint(k.glm) to obtain a confidnece interval for the estimates. confint(k.glm,level=0.97) Waiting for profiling to be done... 1.5 % 98.5 % (Intercept) -5.471345995 0.94716503 A -0.002340863 0.02631582 N -0.037028592 0.95590178 S -0.365570347 -0.06573675 while reading the help for 'confint', i found something like confint.glm for general linear model. I load the MASS package by clicking on the Menu( or otherwise how should I load the package?) then I still cant use the confint.glm command, what have I dont wrong? How do I calculate this confidence interval for glm estimate manually?? for A, I use 0.01093048 + c(-1,1) * 0.006446256 * qt(0.985,df=77) which is a different interval i got from the confint(k.glm,level=0.97) above. To be short, what's the right command to find the confidence interval for glm estimats? How do I verify it manully? Thanks. casper -- View this message in context: http://n4.nabble.com/confint-for-glm-general-linear-model-tp954071p956671.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] confint for glm (general linear model)
On Dec 9, 2009, at 9:50 PM, casperyc wrote: I think the help page are exactly the same... I cannot tell what you maen by this. I just want to verify the confidence interval manually. That's all I want. Then provide some reproducible code and data. ... as the Posting Guide explains and provides explicit examples of the manner for most effectively presenting data objects. -- David Thanks. casper brestat wrote: This functions are different. I advice you study them: ?confint # profile likelihood ?confint.default # t-distribution Walmes Zeviani - Brazil casperyc wrote: Hi, I have a glm gives summary as follows, Estimate Std. Errorz value Pr(|z|) (Intercept) -2.03693352 1.449574526 -1.405194 0.159963578 A0.01093048 0.006446256 1.695633 0.089955471 N0.41060119 0.224860819 1.826024 0.067846690 S -0.20651005 0.067698863 -3.050421 0.002285206 then I use confint(k.glm) to obtain a confidnece interval for the estimates. confint(k.glm,level=0.97) Waiting for profiling to be done... 1.5 % 98.5 % (Intercept) -5.471345995 0.94716503 A -0.002340863 0.02631582 N -0.037028592 0.95590178 S -0.365570347 -0.06573675 while reading the help for 'confint', i found something like confint.glm for general linear model. I load the MASS package by clicking on the Menu( or otherwise how should I load the package?) then I still cant use the confint.glm command, what have I dont wrong? How do I calculate this confidence interval for glm estimate manually?? for A, I use 0.01093048 + c(-1,1) * 0.006446256 * qt(0.985,df=77) which is a different interval i got from the confint(k.glm,level=0.97) above. To be short, what's the right command to find the confidence interval for glm estimats? How do I verify it manully? Thanks. casper -- View this message in context: http://n4.nabble.com/confint-for-glm-general-linear-model-tp954071p956671.html Sent from the R help mailing list archive at Nabble.com. 
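To supplement this thread: the t-based interval casperyc computed cannot match confint() output for a glm, because confint.default() uses normal (Wald) quantiles, estimate ± z·SE, while confint() on a glm profiles the likelihood and is asymmetric by design. A sketch of the Wald check, reusing the numbers for coefficient A from the posted summary (k.glm itself is not reproducible here):

```r
est <- 0.01093048      # estimate for A, from the posted summary
se  <- 0.006446256     # its standard error
level <- 0.97
z <- qnorm(1 - (1 - level) / 2)   # the 0.985 normal quantile, about 2.17
wald_ci <- est + c(-1, 1) * z * se
round(wald_ci, 6)
```

This reproduces confint.default(k.glm, level = 0.97); the profile-likelihood interval from confint(k.glm, level = 0.97) will still differ, and that difference is expected, not an error.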
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] [Rd] split() is slow on data.frame (PR#14123)
I make a version for matrix. Because, it would be more efficient to split each column of a matrix than to convert a matrix to a data.frame then call split() on the data.frame. Note that the version for a matrix and a data.frame is slightly different. Would somebody add this in R as well? split.matrix-function(x,f) { #print('processing matrix') v=lapply( 1:dim(x)[[2]] , function(i) { base:::split.default(x[,i],f)#the difference is here } ) w=lapply( seq(along=v[[1]]) , function(i) { result=do.call( cbind , lapply(v, function(vj) { vj[[i]] } ) ) colnames(result)=colnames(x) return(result) } ) names(w)=names(v[[1]]) return(w) } On Wed, Dec 9, 2009 at 5:44 PM, Charles C. Berry cbe...@tajo.ucsd.edu wrote: On Wed, 9 Dec 2009, William Dunlap wrote: Here are some differences between the current and proposed split.data.frame. Adding 'drop=FALSE' fixes this case. See in line correction below. Chuck d-data.frame(Matrix=I(matrix(1:10, ncol=2)), Named=c(one=1,two=2,three=3,four=4,five=5), row.names=as.character(1001:1005)) group-c(A,B,A,A,B) split.data.frame(d,group) $A Matrix.1 Matrix.2 Named 1001 1 6 1 1003 3 8 3 1004 4 9 4 $B Matrix.1 Matrix.2 Named 1002 2 7 2 1005 5 10 5 mysplit.data.frame(d,group) # lost row.names and 2nd column of Matrix [1] processing data.frame $A Matrix Named [1,] 1 1 [2,] 3 3 [3,] 4 4 $B Matrix Named [1,] 2 2 [2,] 5 5 Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf Of pengyu...@gmail.com Sent: Wednesday, December 09, 2009 2:10 PM To: r-de...@stat.math.ethz.ch Cc: r-b...@r-project.org Subject: [Rd] split() is slow on data.frame (PR#14123) Please see the following code for the runtime comparison between split() and mysplit.data.frame() (they do the same thing semantically). mysplit.data.frame() is a fix of split() in term of performance. 
Could somebody include this fix (with possible checking for corner cases) in future version of R and let me know the inclusion of the fix? m=30 n=6 k=3 set.seed(0) x=replicate(n,rnorm(m)) f=sample(1:k, size=m, replace=T) mysplit.data.frame-function(x,f) { print('processing data.frame') v=lapply( 1:dim(x)[[2]] , function(i) { split(x[,i],f) Change to: split(x[,i,drop=FALSE],f) } ) w=lapply( seq(along=v[[1]]) , function(i) { result=do.call( cbind , lapply(v, function(vj) { vj[[i]] } ) ) colnames(result)=colnames(x) return(result) } ) names(w)=names(v[[1]]) return(w) } system.time(split(as.data.frame(x),f)) system.time(mysplit.data.frame(as.data.frame(x),f)) __ r-de...@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ r-de...@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel Charles C. Berry (858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] [Rd] split() is slow on data.frame (PR#14123)
Sorry. I sent this to r-help by mistake. Could somebody help delete it from the archive? On Wed, Dec 9, 2009 at 9:29 PM, Peng Yu pengyu...@gmail.com wrote: I make a version for matrix. Because, it would be more efficient to split each column of a matrix than to convert a matrix to a data.frame then call split() on the data.frame. Note that the version for a matrix and a data.frame is slightly different. Would somebody add this in R as well? split.matrix-function(x,f) { #print('processing matrix') v=lapply( 1:dim(x)[[2]] , function(i) { base:::split.default(x[,i],f)#the difference is here } ) w=lapply( seq(along=v[[1]]) , function(i) { result=do.call( cbind , lapply(v, function(vj) { vj[[i]] } ) ) colnames(result)=colnames(x) return(result) } ) names(w)=names(v[[1]]) return(w) } On Wed, Dec 9, 2009 at 5:44 PM, Charles C. Berry cbe...@tajo.ucsd.edu wrote: On Wed, 9 Dec 2009, William Dunlap wrote: Here are some differences between the current and proposed split.data.frame. Adding 'drop=FALSE' fixes this case. See in line correction below. Chuck d-data.frame(Matrix=I(matrix(1:10, ncol=2)), Named=c(one=1,two=2,three=3,four=4,five=5), row.names=as.character(1001:1005)) group-c(A,B,A,A,B) split.data.frame(d,group) $A Matrix.1 Matrix.2 Named 1001 1 6 1 1003 3 8 3 1004 4 9 4 $B Matrix.1 Matrix.2 Named 1002 2 7 2 1005 5 10 5 mysplit.data.frame(d,group) # lost row.names and 2nd column of Matrix [1] processing data.frame $A Matrix Named [1,] 1 1 [2,] 3 3 [3,] 4 4 $B Matrix Named [1,] 2 2 [2,] 5 5 Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf Of pengyu...@gmail.com Sent: Wednesday, December 09, 2009 2:10 PM To: r-de...@stat.math.ethz.ch Cc: r-b...@r-project.org Subject: [Rd] split() is slow on data.frame (PR#14123) Please see the following code for the runtime comparison between split() and mysplit.data.frame() (they do the same thing semantically). 
mysplit.data.frame() is a fix of split() in term of performance. Could somebody include this fix (with possible checking for corner cases) in future version of R and let me know the inclusion of the fix? m=30 n=6 k=3 set.seed(0) x=replicate(n,rnorm(m)) f=sample(1:k, size=m, replace=T) mysplit.data.frame-function(x,f) { print('processing data.frame') v=lapply( 1:dim(x)[[2]] , function(i) { split(x[,i],f) Change to: split(x[,i,drop=FALSE],f) } ) w=lapply( seq(along=v[[1]]) , function(i) { result=do.call( cbind , lapply(v, function(vj) { vj[[i]] } ) ) colnames(result)=colnames(x) return(result) } ) names(w)=names(v[[1]]) return(w) } system.time(split(as.data.frame(x),f)) system.time(mysplit.data.frame(as.data.frame(x),f)) __ r-de...@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ r-de...@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel Charles C. Berry (858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Assigning variables into an environment.
I am working with a somewhat complicated structure in which I need to deal with a function that takes ``basic'' arguments and also depends on a number of parameters which change depending on circumstances. I thought that a sexy way of dealing with this would be to assign the parameters as objects in the environment of the function in question. The following toy example gives a bit of the flavour of what I am trying to do:

foo <- function(x, zeta) {
  for (nm in names(zeta)) assign(nm, zeta[nm], envir = environment(bar))
  bar(x)
}
bar <- function(x) { alpha + beta*exp(gamma*x) }
v <- c(alpha = 2, beta = 3, gamma = -4)
ls()
[1] "bar" "foo" "v"
foo(0.1, v)
  alpha
4.01096
2 + 3*exp(-4*0.1)
[1] 4.01096   # Check; yes it's working; but ...
ls()
[1] "alpha" "bar" "beta" "foo" "gamma" "v"

The parameters got assigned in the global environment (as well as in the environment of bar()? Or instead of?). I didn't want that to happen. Questions: (a) What did I do wrong? (b) What am I not understanding about environments? (c) How can I get the parameters to be assigned in the environment of bar() and ***NOT*** in the global environment? (d) Is it time to go to the pub yet? [Please don't make suggestions about doing it all some other way, e.g. using the ``...'' argument facility. I know there are other ways to attack my problem (but they may not be so efficacious in the context of my real --- as opposed to toy --- example). I want to try to get the environment idea to work, and I want to understand more about environments and how they work.] Thanks for any insights. cheers, Rolf Turner P. S. Are there any articles about environments (in the sense used above) out there in the literature? I searched the contents of the R Journal/R News but turned up nothing on environments. R. T.
Re: [R] How to make the assignment in a for-loop not affect variables outside the loop?
2009/11/22 Uwe Ligges lig...@statistik.tu-dortmund.de: Either use local() as in:

n <- 10
local(for (i in 1:n) {
  n <- 3
  print(n)
})
print(n)

'local()' makes everything inside it unavailable outside of it. Is there a way to make 'n' unavailable outside but still make 'b' available outside, without using a function?

n <- 10
b <- 1
local(for (i in 1:n) {
  n <- 3
  print(n)
  b <- b*i
})
print(n)
print(b)

or write a function that is evaluated in its own environment:

n <- 10
MyLoopFoo <- function() {
  for (i in 1:n) {
    n <- 3
    print(n)
  }
}
MyLoopFoo()
print(n)

Uwe Ligges

Peng Yu wrote: I know that R is a dynamic programming language. But I'm wondering if there is a way to make an assignment in a for-loop not affect variables outside the loop.

n <- 10
for (i in 1:n) {
+   n <- 3
+   print(n)
+ }
[1] 3 [1] 3 [1] 3 [1] 3 [1] 3 [1] 3 [1] 3 [1] 3 [1] 3 [1] 3
print(n)
[1] 3
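On Peng Yu's follow-up (hide 'n' but expose 'b'): local() returns the value of its last expression, so one common idiom is to assign that return value outside while the loop's working variables stay inside. A minimal sketch:

```r
n <- 10
b <- local({
  b <- 1
  for (i in 1:n) b <- b * i   # i and this inner b never leak out
  b                           # last expression: returned to the caller
})
print(b)   # 3628800, i.e. 10!
print(n)   # still 10
```

Anything the loop should publish is simply made the value of the local() block; everything else is discarded with the temporary environment.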
[R] What is the function to test if a vector is ordered or not?
I did a search on www.rseek.org to look for a function to test whether a vector is ordered or not, but I can't find one. Could somebody let me know what function I should use?
[R] Counting Frequencies
Hi - I'm having difficulty with frequencies in R. I have a table with a variable (column) called difference and 600 observations (rows). I would like to know how many values are < -0.5 as well as how many are > 0.5. The rest are obviously in the middle. In SAS I could do this immediately, but am unable to do it in R. Thanks for your help.
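No reply survives in this excerpt. In R, counting values past a cutoff is just summing a logical vector. A minimal sketch, assuming the data frame and column are called dat and difference (names guessed from the post, data invented):

```r
set.seed(42)
dat <- data.frame(difference = rnorm(600, sd = 0.5))  # stand-in for the real data

n_low  <- sum(dat$difference < -0.5)   # TRUE counts as 1 when summed
n_high <- sum(dat$difference >  0.5)
n_mid  <- nrow(dat) - n_low - n_high
c(low = n_low, mid = n_mid, high = n_high)
```

table(cut(dat$difference, c(-Inf, -0.5, 0.5, Inf))) gives the same three counts in one call.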
Re: [R] confint for glm (general linear model)
These functions are different. I advise you to study them: ?confint # profile likelihood ?confint.default # normal (Wald) approximation Walmes Zeviani - Brazil

casperyc wrote: Hi, I have a glm whose summary is as follows:

               Estimate   Std. Error   z value    Pr(>|z|)
(Intercept) -2.03693352  1.449574526 -1.405194 0.159963578
A            0.01093048  0.006446256  1.695633 0.089955471
N            0.41060119  0.224860819  1.826024 0.067846690
S           -0.20651005  0.067698863 -3.050421 0.002285206

Then I use confint(k.glm) to obtain a confidence interval for the estimates:

confint(k.glm, level = 0.97)
Waiting for profiling to be done...
                   1.5 %      98.5 %
(Intercept) -5.471345995  0.94716503
A           -0.002340863  0.02631582
N           -0.037028592  0.95590178
S           -0.365570347 -0.06573675

While reading the help for 'confint', I found something like confint.glm for generalized linear models. I load the MASS package by clicking on the menu (or otherwise, how should I load the package?), but I still can't use the confint.glm command; what have I done wrong? How do I calculate this confidence interval for a glm estimate manually? For A, I use 0.01093048 + c(-1,1) * 0.006446256 * qt(0.985, df=77), which is a different interval from the one I got from confint(k.glm, level=0.97) above. In short, what's the right command to find the confidence interval for glm estimates, and how do I verify it manually? Thanks. casper
Re: [R] Subset sum problem.
On Wednesday 9 December 2009, Hans W Borchers wrote: Geert Janssens janssens-geert at telenet.be writes: Hi, I'm quite new to the R-project. I was suggested to look into it because I am trying to solve the Subset sum problem, which basically is: Given a set of integers and an integer s, does any non-empty subset sum to s? (See http://en.wikipedia.org/wiki/Subset_sum_problem) I have been searching the web for quite some time now (which is how I eventually discovered that my problem is called subset sum), but I can't seem to find an easily applicable implementation. I did search the list archive, the R website and used the help.search and apropos function. I'm afraid nothing obvious showed up for me. Has anybody tackled this issue before in R ? If so, I would be very grateful if you could share your solution with me. Is it really true that you only want to see a Yes or No answer to this question whether a subset sums up to s --- without learning which numbers this subset is composed of (the pure SUBSET SUM problem)? Then the following procedure does that in a reasonable amount of time (returning 'TRUE' or 'FALSE' instead of Y-or-N): Unfortunatly no. I do need the numbers in the subset. But thank you for presenting this code. 
Geert

# Exact algorithm for the SUBSET SUM problem
exactSubsetSum <- function(S, t) {
  S <- S[S <= t]
  if (sum(S) < t) return(FALSE)
  S <- sort(S, decreasing = TRUE)
  n <- length(S)
  L <- c(0)
  for (i in 1:n) {
    L <- unique(sort(c(L, L + S[i])))
    L <- L[L <= t]
    if (max(L) == t) return(TRUE)
  }
  return(FALSE)
}

# Example with a set of cardinality 64
amount <- 4748652
products <- c(30500,30500,30500,30500,42000,42000,42000,42000,
  42000,42000,42000,42000,42000,42000,71040,90900,
  76950,35100,71190,53730,456000,70740,70740,533600,
  83800,59500,27465,28000,28000,28000,28000,28000,
  26140,49600,77000,123289,27000,27000,27000,27000,
  27000,27000,8,33000,33000,55000,77382,48048,
  51186,4,35000,21716,63051,15025,15025,15025,
  15025,80,111,59700,25908,829350,1198000,1031655)

# Timing is not that bad
system.time(sol <- exactSubsetSum(products, amount))
#  user  system elapsed
# 0.516   0.096   0.673
sol
# [1] TRUE

-- Kobalt W.I.T. Web Information Technology Brusselsesteenweg 152 1850 Grimbergen Tel : +32 479 339 655 Email: i...@kobaltwit.be
Re: [R] Plotting frequency curve over histogram
Hi, Guarav go for this site.He is the one designed R. http://www.stat.auckland.ac.nz/~ihaka/courses/120/lectures.html http://www.stat.auckland.ac.nz/~ihaka/?Teaching It might be helpful.I am not sure. thanks. -- View this message in context: http://n4.nabble.com/Plotting-frequency-curve-over-histogram-tp956565p956592.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] simple data manipulation question
Hi there, I have a data frame with a whole lot of variables. Let's say one of my variables is gender; how do I simply get an average of all the other variables by gender?
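No answer is archived for this post. One base-R idiom is aggregate() with a formula, which computes a summary of every remaining column within each level of the grouping variable. A minimal sketch with invented data (column names are assumptions):

```r
df <- data.frame(gender = c("m", "f", "m", "f"),
                 height = c(180, 165, 175, 160),
                 weight = c(80, 60, 75, 55))

# mean of every other column, split by gender
aggregate(. ~ gender, data = df, FUN = mean)
```

by() or split() plus colMeans() would work as well; if the data frame contains non-numeric columns besides the grouping variable, select the numeric ones first.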
[R] Need help to forecasting the data of the time series .
Hi, This is time series data collected monthly from 2001 to 2008, so there are 96 entries. I have done basic statistics on it. I need to find a fitted model to forecast this data. It is the mixed-paper collection for recycling on the campus. 13251 13754 19061 12631 17414 21350 25384 23646 20312 20740 14007 17175 13910 17191 17113 20250 35003 11975 19665 20490 20436 22885 17075 18205 15720 25264 16258 33430 31598 19764 21006 29210 35750 27881 25751 27601 27316 20893 27308 27182 28178 28057 35623 51094 36365 29301 18718 22683 53898 40339 28462 31555 32484 40497 28547 40509 31220 48399 38998 44489 41588 47240 57035 54919 50513 42296 39124 36217 43173 56311 50726 49621 52430 56236 59573 66819 66345 44838 45847 51066 49688 52978 45205 51043 48693 65470 45073 55923 58766 41289 50514 45901 51198 63914 57128 50702 Please advise me on analysis tips so I can understand this data well. thanks,
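No replies appear in this excerpt. As a starting point, monthly data like this is usually wrapped in a ts object so that the seasonal structure is visible to the modelling functions. A hedged sketch using the first three years of the posted numbers and base R's HoltWinters() (chosen purely as an illustration, not a claim that it is the best model for this series):

```r
# first 36 of the posted values, as a stand-in for the full series
paper <- c(13251, 13754, 19061, 12631, 17414, 21350, 25384, 23646,
           20312, 20740, 14007, 17175, 13910, 17191, 17113, 20250,
           35003, 11975, 19665, 20490, 20436, 22885, 17075, 18205,
           15720, 25264, 16258, 33430, 31598, 19764, 21006, 29210,
           35750, 27881, 25751, 27601)
y <- ts(paper, start = c(2001, 1), frequency = 12)  # monthly, from Jan 2001
fit  <- HoltWinters(y)              # level + trend + additive seasonality
pred <- predict(fit, n.ahead = 12)  # one year of forecasts
```

plot(fit) and monthplot(y) are quick ways to judge whether the trend and seasonal components the model assumes are actually present.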
Re: [R] What is the function to test if a vector is ordered or not?
On Dec 9, 2009, at 10:10 PM, Peng Yu wrote: I did a search on www.rseek.org to look for the function to test if a vector is ordered or not. But I don't find it. Could somebody let me know what function I should use? If by ordered, you mean sorted, then ?is.unsorted is.unsorted(c(1, 4, 2, 6, 7)) [1] TRUE is.unsorted(sort(c(1, 4, 2, 6, 7))) [1] FALSE If you mean to test a factor to see if it is an ordered factor, then ? is.ordered is.ordered(factor(letters)) [1] FALSE is.ordered(factor(letters, ordered = TRUE)) [1] TRUE HTH, Marc Schwartz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
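Marc's is.unsorted() covers increasing order only; a decreasing check can reuse it on the reversed vector. A small sketch (the helper name is invented):

```r
is_sorted_any <- function(x) {
  # TRUE if x is non-decreasing or non-increasing
  !is.unsorted(x) || !is.unsorted(rev(x))
}
is_sorted_any(c(1, 2, 2, 5))   # TRUE
is_sorted_any(c(5, 3, 1))      # TRUE (decreasing)
is_sorted_any(c(1, 4, 2))      # FALSE
```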
Re: [R] Subset sum problem.
Geert Janssens janssens-geert at telenet.be writes: On Wednesday 9 December 2009, Hans W Borchers wrote: Geert Janssens janssens-geert at telenet.be writes: [ ... ] Has anybody tackled this issue before in R ? If so, I would be very grateful if you could share your solution with me. Is it really true that you only want to see a Yes or No answer to this question whether a subset sums up to s --- without learning which numbers this subset is composed of (the pure SUBSET SUM problem)? Then the following procedure does that in a reasonable amount of time (returning 'TRUE' or 'FALSE' instead of Y-or-N): Unfortunately no. I do need the numbers in the subset. But thank you for presenting this code. Geert Okay then, here we go. But don't tell later that your requirement was to generate _all_ subsets that add up to a certain amount. I will generate only one (with largest elements). For simplicity I assume that the set is prepared s.t. it is decreasingly ordered, has no elements larger than the amount given, and has a total sum larger than this amount. 
# Assume S decreasing, no elements > t, total sum >= t
solveSubsetSum <- function(S, t) {
  L <- c(0)
  inds <- NULL
  for (i in 1:length(S)) {
    # L <- unique(sort(c(L, L + S[i])))
    L <- c(L, L + S[i])
    L <- L[L <= t]
    if (max(L) == t) {
      inds <- c(i)
      t <- t - S[i]
      while (t > 0) {
        K <- c(0)
        for (j in 1:(i-1)) {
          K <- c(K, K + S[j])
          if (t %in% K) break
        }
        inds <- c(inds, j)
        t <- t - S[j]
      }
      break
    }
  }
  return(inds)
}

# former example
amount <- 4748652
products <- c(30500,30500,30500,30500,42000,42000,42000,42000,
  42000,42000,42000,42000,42000,42000,71040,90900,
  76950,35100,71190,53730,456000,70740,70740,533600,
  83800,59500,27465,28000,28000,28000,28000,28000,
  26140,49600,77000,123289,27000,27000,27000,27000,
  27000,27000,8,33000,33000,55000,77382,48048,
  51186,4,35000,21716,63051,15025,15025,15025,
  15025,80,111,59700,25908,829350,1198000,1031655)

# prepare set
prods <- products[products <= amount]   # no elements > amount
prods <- sort(prods, decreasing = TRUE) # decreasing order

# now find one solution
system.time(is <- solveSubsetSum(prods, amount))
#  user  system elapsed
# 0.320   0.032   0.359
prods[is]
# [1]   70740   70740   71190   76950   77382       8   83800
# [8]   90900  456000  533600  829350     111 1198000
sum(prods[is]) == amount
# [1] TRUE

Note that running times and memory needs will be much higher when more summands are needed. To mention that too: I have not tested the code extensively. Regards Hans Werner
Re: [R] Assigning variables into an environment.
R uses lexical scoping, not dynamic scoping. It does not matter where bar is called from; what matters is where bar was defined, and since bar was defined in the global environment that is where its free variables are looked up. Thus environment(bar) is the global environment. Try changing foo to the following, which creates a new function, also called bar, that only exists for the duration of the call to foo and whose free variables are looked up within foo. We assign the variables within foo and then call this newly created bar.

foo <- function(x, zeta) {
  environment(bar) <- environment()
  for (nm in names(zeta)) assign(nm, zeta[nm])
  bar(x)
}

If we regard bar as a method acting on an object whose variables are alpha, beta and gamma, then we can use the proto package like this:

library(proto) # see http://r-proto.googlecode.com
p <- proto(alpha = 2, beta = 3, gamma = -4,
           bar = function(this, x) this$alpha + this$beta * exp(this$gamma * x))
p$bar(0.1)
# we could change any of the variables and run again
# e.g. p$alpha <- 3; p$bar(0.1)
# or we could create a child of p, q.
# q inherits all of p's variables and methods
# however, let's explicitly override alpha
q <- p$proto(alpha = 12)
q$bar(0.1)

On Wed, Dec 9, 2009 at 10:43 PM, Rolf Turner r.tur...@auckland.ac.nz wrote: I am working with a somewhat complicated structure in which I need to deal with a function that takes ``basic'' arguments and also depends on a number of parameters which change depending on circumstances. I thought that a sexy way of dealing with this would be to assign the parameters as objects in the environment of the function in question. The following toy example gives a bit of the flavour of what I am trying to do:

foo <- function(x, zeta) {
  for (nm in names(zeta)) assign(nm, zeta[nm], envir = environment(bar))
  bar(x)
}
bar <- function(x) { alpha + beta*exp(gamma*x) }
v <- c(alpha = 2, beta = 3, gamma = -4)
ls()
[1] "bar" "foo" "v"
foo(0.1, v)
  alpha
4.01096
2 + 3*exp(-4*0.1)
[1] 4.01096   # Check; yes it's working; but ...
ls()
[1] "alpha" "bar" "beta" "foo" "gamma" "v"

The parameters got assigned in the global environment (as well as in the environment of bar()? Or instead of?). I didn't want that to happen. Questions: (a) What did I do wrong? (b) What am I not understanding about environments? (c) How can I get the parameters to be assigned in the environment of bar() and ***NOT*** in the global environment? (d) Is it time to go to the pub yet? [Please don't make suggestions about doing it all some other way, e.g. using the ``...'' argument facility. I know there are other ways to attack my problem (but they may not be so efficacious in the context of my real --- as opposed to toy --- example). I want to try to get the environment idea to work, and I want to understand more about environments and how they work.] Thanks for any insights. cheers, Rolf Turner

P. S. Are there any articles about environments (in the sense used above) out there in the literature? I searched the contents of the R Journal/R News but turned up nothing on environments. R. T.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] What is the function to test if a vector is ordered or not?
Try all(diff(order(yourVector)) == 1)

On Wed, Dec 9, 2009 at 10:10 PM, Peng Yu pengyu...@gmail.com wrote: I did a search on www.rseek.org to look for the function to test if a vector is ordered or not. But I don't find it. Could somebody let me know what function I should use?

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
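As an aside, base R also provides is.unsorted() for exactly this question; a small sketch (the example vector is arbitrary):

```r
x <- c(1, 2, 2, 5)
!is.unsorted(x)    # TRUE: x is in non-decreasing order
all(diff(x) >= 0)  # an equivalent check via successive differences
```

Note that is.unsorted() allows ties by default; pass strictly = TRUE if repeated values should not count as ordered.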
Re: [R] Assigning variables into an environment.
On Wed, Dec 9, 2009 at 9:43 PM, Rolf Turner r.tur...@auckland.ac.nz wrote: I am working with a somewhat complicated structure in which I need to deal with a function that takes ``basic'' arguments and also depends on a number of parameters which change depending on circumstances. I thought that a sexy way of dealing with this would be to assign the parameters as objects in the environment of the function in question. The following toy example gives a bit of the flavour of what I am trying to do:

foo <- function(x, zeta) {
  for (nm in names(zeta)) assign(nm, zeta[nm], envir = environment(bar))
  bar(x)
}
bar <- function(x) {
  alpha + beta * exp(gamma * x)
}
v <- c(alpha = 2, beta = 3, gamma = -4)
ls()
[1] "bar" "foo" "v"
foo(0.1, v)
  alpha
4.01096
2 + 3 * exp(-4 * 0.1)
[1] 4.01096
# Check; yes it's working; but ...
ls()
[1] "alpha" "bar" "beta" "foo" "gamma" "v"

The parameters got assigned in the global environment (as well as in the environment of bar()? Or instead of?). I didn't want that to happen. Questions:

(a) What did I do wrong?

The environment of bar is the environment in which it exists - the global environment.

(b) What am I not understanding about environments?

See above.

(c) How can I get the parameters to be assigned in the environment of bar() and ***NOT*** in the global environment?

Define bar inside foo and rely on the usual lexical scoping rules.

(d) Is it time to go to the pub yet?

Yes.

Hadley -- http://had.co.nz/
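A sketch of that suggestion, reusing the toy functions and parameter names (alpha, beta, gamma) from Rolf's example: because bar is created inside foo, its enclosing environment is foo's evaluation frame, so assign() there is invisible to the global environment.

```r
foo <- function(x, zeta) {
  bar <- function(x) alpha + beta * exp(gamma * x)  # bar's environment is foo's frame
  for (nm in names(zeta)) assign(nm, zeta[[nm]])    # assigns into foo's frame, not globally
  bar(x)
}
v <- c(alpha = 2, beta = 3, gamma = -4)
foo(0.1, v)  # 4.01096, and alpha/beta/gamma never appear in the global workspace
```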
Re: [R] Counting Frequencies
x <- runif(10, 0, 1)
x2 <- x > 0.5
x2
 [1]  TRUE  TRUE FALSE FALSE FALSE  TRUE  TRUE FALSE  TRUE FALSE
table(x2)
x2
FALSE  TRUE
    5     5

On Wed, Dec 9, 2009 at 6:36 PM, BIGBEEF martin.beze...@gmail.com wrote: Hi - I'm having difficulty with frequencies in R. I have a table with a variable (column) called 'difference' with 600 observations (rows). I would like to know how many values are < -0.5 as well as how many are > 0.5. The rest are obviously in the middle. In SAS I could do this immediately but am unable to do it in R. Thanks for your help. -- View this message in context: http://n4.nabble.com/Counting-Frequencies-tp956556p956556.html Sent from the R help mailing list archive at Nabble.com.

-- == WenSui Liu Blog : statcompute.spaces.live.com Tough Times Never Last. But Tough People Do. - Robert Schuller ==
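For the two counts themselves (rather than a full table), sum() over a logical vector is the usual idiom; a sketch with simulated data standing in for the real 600-row column:

```r
set.seed(1)
difference <- runif(600, -1, 1)      # stand-in for the real column
low  <- sum(difference < -0.5)       # how many fall below -0.5
high <- sum(difference >  0.5)       # how many exceed 0.5
mid  <- sum(abs(difference) <= 0.5)  # the rest, "in the middle"
low + high + mid                     # 600: every observation is counted exactly once
```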
Re: [R] Population Histogram
On 12/10/2009 05:47 AM, terry johnson wrote: How would I make a population histogram in R from an excel file? Thanks

Hi Terry,

library(gdata)
library(plotrix)
ozpop <- read.xls("ozpop.xls")
xycol <- color.gradient(c(0,0,0.5,1), c(0,0,0.5,1), c(1,1,0.5,1), 18)
xxcol <- color.gradient(c(1,1,0.5,1), c(0.5,0.5,0.5,1), c(0.5,0.5,0.5,1), 18)
par(mar = pyramid.plot(ozpop$Male, ozpop$Female, labels = ozpop$Age,
    main = "Australian population pyramid 2002", xycol = xycol, xxcol = xxcol))

Jim
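If gdata or plotrix are not installed, a back-to-back horizontal barplot in base graphics gives a rough pyramid; the age groups and counts below are made up purely for illustration:

```r
age    <- paste(seq(0, 80, 10), seq(9, 89, 10), sep = "-")
male   <- c(9, 10, 10, 11, 10, 9, 7, 5, 3)  # hypothetical percentages
female <- c(9,  9, 10, 11, 10, 9, 8, 6, 4)
# Negate one side so the bars extend left, then overlay the other side
barplot(-male, horiz = TRUE, names.arg = age, las = 1, xlim = c(-12, 12),
        col = "lightblue", main = "Population pyramid (sketch)")
barplot(female, horiz = TRUE, add = TRUE, col = "pink", axes = FALSE)
```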
Re: [R] Counting Frequencies
On 12/10/2009 10:36 AM, BIGBEEF wrote: Hi - I'm having difficulty with frequencies in R. I have a table with a variable (column) called 'difference' with 600 observations (rows). I would like to know how many values are < -0.5 as well as how many are > 0.5. The rest are obviously in the middle. In SAS I could do this immediately but am unable to do it in R.

Hi Martin,

difference <- runif(100) * 2 - 1
table(cut(difference,
          breaks = c(min(difference) - 0.1, -0.5, 0.5, max(difference))))

The -0.1 is to ensure that the lowest value is included in the cut.

Jim
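An alternative to the -0.1 adjustment is cut()'s include.lowest argument, which pulls the minimum into the first interval so the exact breaks can be used:

```r
set.seed(2)
difference <- runif(100) * 2 - 1
counts <- table(cut(difference,
                    breaks = c(min(difference), -0.5, 0.5, max(difference)),
                    include.lowest = TRUE))
sum(counts)  # 100: no observation is dropped at the boundary breaks
```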
Re: [R] Population Histogram
On 12/10/2009 04:39 PM, Jim Lemon wrote: On 12/10/2009 05:47 AM, terry johnson wrote: How would I make a population histogram in R from an excel file? Thanks

Hi Terry,

library(gdata)
library(plotrix)
ozpop <- read.xls("ozpop.xls")
xycol <- color.gradient(c(0,0,0.5,1), c(0,0,0.5,1), c(1,1,0.5,1), 18)
xxcol <- color.gradient(c(1,1,0.5,1), c(0.5,0.5,0.5,1), c(0.5,0.5,0.5,1), 18)
par(mar = pyramid.plot(ozpop$Male, ozpop$Female, labels = ozpop$Age,
    main = "Australian population pyramid 2002", xycol = xycol, xxcol = xxcol))

Jim

Oops, sorry, XLS files must be verboten, as the example data I sent seems to have disappeared.

Jim