Re: [R] vectorisation
On 02-02-2013, at 17:38, Brett Robinson brett.robin...@7dials.com wrote:

> Hi
> I'm trying to set up a simulation problem without resorting to (m)any loops. I want to set entries in a data frame of zeros ('starts' in the code below) to 1 at certain points. The points have been randomly generated and stored in a separate data.frame ('ml'), which has the same number of columns.
>
> An example of the procedure is as follows:
>
>     ml <- data.frame(matrix(sample(1:50, 80, replace=TRUE), 20, 4))
>     mm <- apply(ml, 2, cumsum)
>     starts <- data.frame(matrix(0, 600, 4))
>
> I can achieve the result I want with a loop:
>
>     for (i in 1:4) {
>       starts[,i][mm[,i]] <- 1
>     }
>
> But as I want to use a large number of columns I would like to do away with the loop. Can anyone suggest how this might be done?

Another way is this:

    f2 <- function(starts, mm) {
      mn <- cbind(as.vector(mm), rep(1:ncol(mm), each=nrow(mm)))
      x <- as.matrix(starts)
      x[mn] <- 1
      as.data.frame(x)
    }

    starts2 <- f2(starts, mm)
    # identical(starts2, starts1)
    # [1] TRUE

Collect all the options presented so far in functions, use the compiler package to see if that helps, and do some speed tests with Arun's parameters.

    # Brett
    f1 <- function(starts, mm) {
      for (i in 1:ncol(mm)) {
        starts[,i][mm[,i]] <- 1
      }
      starts
    }

    # Berend
    f2 <- function(starts, mm) {
      mn <- cbind(as.vector(mm), rep(1:ncol(mm), each=nrow(mm)))
      x <- as.matrix(starts)
      x[mn] <- 1
      as.data.frame(x)
    }

    # Rui
    f3 <- function(s2, mm) {
      s2[] <- lapply(seq_len(ncol(mm)), function(i) {s2[,i][mm[,i]] <- 1; s2[,i]})
      s2
    }

    # Arun
    f4 <- function(starts, mm) {
      starts2 <- as.data.frame(do.call(cbind, lapply(1:ncol(mm),
                   function(i) {starts[,i][mm[,i]] <- 1; starts[,i]})))
      colnames(starts2) <- colnames(starts)
      starts2
    }

    library(compiler)
    f1c <- cmpfun(f1)
    f2c <- cmpfun(f2)
    f3c <- cmpfun(f3)
    f4c <- cmpfun(f4)

    library(rbenchmark)

    # Arun's test
    set.seed(11)
    starts <- data.frame(matrix(0, 1e6, 4))
    ml <- data.frame(matrix(sample(1:1e4, 1e3, replace=TRUE), 100, 4))
    mm <- apply(ml, 2, cumsum)

    z1 <- f1(starts, mm)
    z2 <- f2(starts, mm)
    z3 <- f3(starts, mm)
    z4 <- f4(starts, mm)
    z1c <- f1c(starts, mm)
    z2c <- f2c(starts, mm)
    z3c <- f3c(starts, mm)
    z4c <- f4c(starts, mm)

    identical(z2, z1)
    identical(z3, z1)
    identical(z4, z1)
    identical(z1c, z1)
    identical(z2c, z1)
    identical(z3c, z1)
    identical(z4c, z1)

    benchmark(f1(starts,mm), f2(starts,mm),
              f1c(starts,mm), f2c(starts,mm),
              f3(starts,mm), f4(starts,mm),
              f3c(starts,mm), f4c(starts,mm),
              replications=1, order="relative",
              columns=c("test","relative","elapsed","replications"))

Result:

    # All seven identical() calls return
    # [1] TRUE
    #
    #              test relative elapsed replications
    # 2  f2(starts, mm)    1.000   0.195            1
    # 4 f2c(starts, mm)    1.005   0.196            1
    # 1  f1(starts, mm)    2.990   0.583            1
    # 3 f1c(starts, mm)    3.082   0.601            1
    # 7 f3c(starts, mm)    3.903   0.761            1
    # 5  f3(starts, mm)    3.949   0.770            1
    # 8 f4c(starts, mm)    4.436   0.865            1
    # 6  f4(starts, mm)    4.462   0.870            1

Compiling doesn't deliver significant speed gains in this case. Function f2 is the quickest.

Berend

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
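
The trick behind f2 is that a matrix can be indexed by a two-column matrix of (row, column) pairs, so all cells are set in one assignment. The following is a minimal sketch on a small matrix, not part of the original thread; the object names `idx`, `vec` and `loop` are chosen here for illustration only.

```r
set.seed(1)

# Small stand-in for 'mm': each column holds strictly increasing row positions
mm <- apply(matrix(sample(1:5, 12, replace = TRUE), 3, 4), 2, cumsum)
starts <- matrix(0, 20, 4)

# Build one (row, column) pair per cell that should become 1
idx <- cbind(row = as.vector(mm),
             col = rep(seq_len(ncol(mm)), each = nrow(mm)))

# Vectorised version: a single assignment through the index matrix
vec <- starts
vec[idx] <- 1

# Loop version, for comparison only
loop <- starts
for (i in seq_len(ncol(mm))) loop[mm[, i], i] <- 1

identical(vec, loop)
```

The as.matrix() / as.data.frame() round trip in f2 exists because this sketch's style of index-matrix assignment is simplest on a plain matrix; whether it pays off for a given data frame depends on its size and column types.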
Re: [R] Split xts data set into weeks
Hi, Jeff

Thank you for your advice.

> Your example of the problem is not reproducible [1]. This behavior could arise due to small discrepancies in the index values, or from specifying frequency instead of f as the second argument, or perhaps you have found a bug that only your data triggers. Any verification of what your problem is will require a reproducible example.
>
> [1] http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

I tried to make a piece of reproducible code. Would you please paste the following code into the R console and confirm?

    #Codes start from here
    library(quantmod)
    tmp <- structure(c(112.34, 112.89, 112.75, 113.5, 115.16, 115.21, 114.84,
        114.93, 115.05, 114.46, 113.34, 113.71, 113.56, 115.08, 115.97,
        115.26, 115.22, 115.24, 115.24, 114.98, 111.96, 112.75, 112.5,
        113.1, 114.85, 114.55, 114.55, 114.75, 114.2, 112.92, 112.87,
        112.8, 113.54, 115.05, 115.06, 114.85, 114.93, 115.09, 114.28,
        113.92), class = c("xts", "zoo"), .indexCLASS = "Date",
        tclass = "Date", .indexTZ = "", tzone = "",
        index = structure(c(1298818800, 1298905200, 1298991600, 1299078000,
            1299164400, 1299423600, 1299510000, 1299596400, 1299682800,
            1299769200), tzone = "", tclass = "Date"),
        .Dim = c(10L, 4L),
        .Dimnames = list(NULL, c("Open", "High", "Low", "Close")))
    class(tmp)
    (res1 <- split(tmp, f="weeks"))
    (res2 <- split(tmp, frequency="weeks"))
    #Codes end here

The original data is saved in tmp and the split() results are saved in res1 and res2. res1 is the result of using f and res2 is the result of using frequency. Both break up the week that starts on 2011-02-28, and the result with frequency is even worse.

BTW, my R version is as follows:

    version
                   _
    platform       i686-pc-linux-gnu
    arch           i686
    os             linux-gnu
    system         i686, linux-gnu
    status
    major          2
    minor          15.2
    year           2012
    month          10
    day            26
    svn rev        61015
    language       R
    version.string R version 2.15.2 (2012-10-26)
    nickname       Trick or Treat

Thank you.

Seimizu Joukan
Re: [R] Relative Risk in logistic regression
At 10:49 30/01/2013, aminreza Aamini wrote:

> Hi all,
> I am very grateful to all those who write to me
> 1) how I can obtain relative risk (risk ratio) in logistic regression in R.

    @TECHREPORT{lumley06,
      author = {Lumley, T and Kronmal, R and Ma, S},
      year = 2006,
      title = {Relative risk regression in medical research: models,
               contrasts, estimators, and algorithms},
      number = 293,
      institution = {{UW} Biostatistics Working Paper Series},
      keywords = {glm, Poisson},
      url = {http://www.bepress.com/uwbiostat/paper293}
    }

> 2) how to obtain the predicted risk for a certain individual using fitted regression model in R.
> Many thanks, in advance, for your help.
> Amin.

Michael Dewey
i...@aghmed.fsnet.co.uk
http://www.aghmed.fsnet.co.uk/home.html
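
Question 2 is not answered in this message. As a minimal sketch, assuming an ordinary logistic model fitted with glm(), the predicted risk for one individual can be read off with predict(..., type = "response"). The data below are simulated and the variable names (`age`, `exposed`, `disease`, `new_person`) are invented for illustration; they are not from the thread.

```r
set.seed(42)

# Simulated cohort (illustrative only): risk rises with age and exposure
n <- 200
d <- data.frame(age = round(runif(n, 30, 80)),
                exposed = rbinom(n, 1, 0.5))
d$disease <- rbinom(n, 1, plogis(-4 + 0.05 * d$age + 0.7 * d$exposed))

# Ordinary logistic regression
fit <- glm(disease ~ age + exposed, family = binomial, data = d)

# Predicted risk (a probability) for one hypothetical individual
new_person <- data.frame(age = 60, exposed = 1)
risk <- predict(fit, newdata = new_person, type = "response")

# The same number by hand: inverse logit of the linear predictor
risk_by_hand <- plogis(sum(coef(fit) * c(1, 60, 1)))
```

Without type = "response", predict() returns the linear predictor on the log-odds scale rather than a probability.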
Re: [R] Split xts data set into weeks
On Sun, Feb 3, 2013 at 6:57 AM, Seimizu Joukan saim...@gmail.com wrote:

> Would you please paste the following codes to R console and make a confirmation?

Indeed, well done and much appreciated.

> class(tmp)
> (res1 <- split(tmp, f="weeks"))
> (res2 <- split(tmp, frequency="weeks"))

Looking at args(split.xts) I think you actually do want split(..., f = ) here, not split(..., frequency = ); the latter would be ignored, so the split defaults to months.

I get the following for res1, running R-Devel on OS X 10.6.8:

    > res1
    [[1]]
                 Open   High    Low  Close
    2011-02-27 112.34 113.34 111.96 112.87

    [[2]]
                 Open   High    Low  Close
    2011-02-28 112.89 113.71 112.75 112.80
    2011-03-01 112.75 113.56 112.50 113.54
    2011-03-02 113.50 115.08 113.10 115.05
    2011-03-03 115.16 115.97 114.85 115.06
    2011-03-06 115.21 115.26 114.55 114.85

    [[3]]
                 Open   High    Low  Close
    2011-03-07 114.84 115.22 114.55 114.93
    2011-03-08 114.93 115.24 114.75 115.09
    2011-03-09 115.05 115.24 114.20 114.28
    2011-03-10 114.46 114.98 112.92 113.92

so I think it's likely a timezone issue. Try setting

    indexTZ(tmp) <- "GMT"

or something similar and giving it another shot.

You might also want to move to the R-SIG-Finance list, where the authors of xts are more frequently seen. It might also help to report Sys.timezone() in addition to your specific Linux distro.

Cheers,
MW
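
To see why the time zone matters here, the following base-R sketch (no xts needed, and not part of the original message) takes the first index value from the posted data and formats it in two zones. It assumes the "Asia/Tokyo" zone name is available in the system's time zone database.

```r
# First index value from the posted structure(), in seconds since the epoch
t0 <- as.POSIXct(1298818800, origin = "1970-01-01", tz = "UTC")

# Calendar date of the same instant in two time zones
utc_day   <- format(t0, "%Y-%m-%d", tz = "UTC")
tokyo_day <- format(t0, "%Y-%m-%d", tz = "Asia/Tokyo")

# ISO weekday number (1 = Monday, 7 = Sunday) in each zone
utc_wday   <- format(t0, "%u", tz = "UTC")
tokyo_wday <- format(t0, "%u", tz = "Asia/Tokyo")

c(utc_day, tokyo_day, utc_wday, tokyo_wday)
```

The same instant is a Sunday in UTC but a Monday in Tokyo, so a split by week can put this row in different groups depending on which zone the index is interpreted in.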
Re: [R] Split xts data set into weeks
Hi, Michael

Thank you very much!

> Looking at args(split.xts) I think you actually do want split(..., f = ) here, not split(..., frequency = ), which would ignore and default to months.

Yes, split(..., f = ) is what I want.

> so I think it's likely a timezone issue. Try setting indexTZ(tmp) <- "GMT"

Yes, I think you are right. When I set indexTZ(tmp) to GMT or JST (Japan), I got the same result as yours.

When I run Sys.timezone(), I get an empty result. I am afraid that R failed to read my Ubuntu system's environment variable. Perhaps there are some problems with the timezone because I am using Ubuntu on VMware. Not sure! :(

I know little about timezones, so I will go on to investigate and learn something about them.

Thank you for your help!

Best regards!
Seimizu Joukan
Re: [R] Question: write an R script with help information available to the user
On Sun, Feb 3, 2013 at 1:50 AM, Bert Gunter gunter.ber...@gene.com wrote:

> A related approach which, if memory serves, was originally in S eons ago, is to define a "doc" attribute of any function (or object, for that matter) that you wish to document that contains text for documentation and a doc() function of the form:
>
>     doc <- function(obj) cat(attr(obj, "doc"))
>
> used as:
>
>     f <- function(x) NULL
>     attr(f, "doc") <- "Some text\n\n"
>     doc(f)
>     # Some text
>
> This is pretty primitive, but I suppose you could instead have the attribute point to something like an HTML file and the doc() function open it in a web browser, which is basically what R's built-in package document system does anyway. Except you wouldn't have to build a package and don't have to learn or follow R's procedures. Which means you don't get R's standardization and organization and no one but a private bunch of users will be able to use your function. But maybe that's sufficient for your needs.

To further build on this, try the above idea using the comment function:

    f <- function() NULL
    comment(f) <- "Help goes here"
    comment(f)
    # [1] "Help goes here"

or combine it with a redefinition of ? like this:

    `?` <- function(...)
      if (!is.null(doc <- comment(get(match.call()[[2]])))) cat(doc, "\n")
      else help(...)

    ?f    # displays: Help goes here
    ?dim  # normal help

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
Re: [R] Relative Risk in logistic regression
Dear Colleagues,

As my friend John mentioned, *the measure of association from a logistic regression is the odds ratio, not the relative risk*. But the point is that in follow-up studies it is commonly preferred to estimate a risk ratio rather than an odds ratio. That's why I'm looking for the RR in logistic models.

By the way, thank you all for your consideration.

Amin
Re: [R] Relative Risk in logistic regression
Amin,

It is incorrect to use the relative risk as a measure of association in a logistic regression. The measure of association in a logistic regression is the odds ratio. The odds ratio is an approximation of the relative risk. The approximation becomes progressively better as the disease becomes progressively rarer. Regardless of whether the disease is rare or not, inferences drawn from a logistic regression are valid. Please do not report a logistic regression using relative risk. It is not correct to do so.

John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
Re: [R] Relative Risk in logistic regression
On Feb 3, 2013, at 8:15 AM, aminreza Aamini wrote:

> Dear Colleagues,
> As my friend John mentioned, *the measure of association from a logistic regression is the odds ratio, not the relative risk*. But the point is that in follow-up studies it is commonly preferred to estimate a risk ratio rather than an odds ratio. That's why I'm looking for the RR in logistic models.
> By the way, thank you all for your consideration.
> Amin

I agree that the relative risk is generally preferred in presenting the results of follow-up studies. The question should be: why do you want to use a logistic link? The technical report out of the University of Washington Biostatistics Department explains a variety of approaches, including using a log-binomial model and Poisson regression. Either of those can be done in R with glm. The Poisson regression model is particularly simple to develop. If you want a more specific answer, you should explain a) what sort of data you have in greater detail and b) your reasons for using the logistic link when arguably better alternatives are available.

--
David

David Winsemius, MD
Alameda, CA, USA
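
As a rough illustration of the Poisson route mentioned above, here is a minimal sketch on simulated data (the variable names and the 0.30 vs 0.15 risks are invented for this example). With a single binary exposure, exp() of the Poisson coefficient reproduces the ratio of the two observed risks, whereas the logistic model returns an odds ratio that lies further from 1.

```r
set.seed(7)

# Simulated follow-up data (illustrative only): risk 0.30 if exposed, 0.15 if not
n <- 2000
exposed <- rbinom(n, 1, 0.5)
disease <- rbinom(n, 1, ifelse(exposed == 1, 0.30, 0.15))

# Poisson regression with a log link applied to the binary outcome
fit_rr <- glm(disease ~ exposed, family = poisson(link = "log"))
rr_glm <- unname(exp(coef(fit_rr)["exposed"]))

# Risk ratio computed directly from the two observed proportions
rr_raw <- mean(disease[exposed == 1]) / mean(disease[exposed == 0])

# Logistic regression on the same data gives an odds ratio instead
fit_or <- glm(disease ~ exposed, family = binomial)
or_glm <- unname(exp(coef(fit_or)["exposed"]))

c(RR_glm = rr_glm, RR_raw = rr_raw, OR_glm = or_glm)
```

One caveat: the model-based standard errors from summary(fit_rr) are not appropriate for a binary outcome, because the Poisson variance assumption does not hold; a robust (sandwich) variance estimator is usually used with this approach. That step is left out of this sketch.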
[R] Empty cluster / segfault using vanilla kmeans with version 2.15.2
Dear experts,

I am encountering a version-dependent issue.

My laptop runs Ubuntu 12.04 LTS 64-bit, R 2.14.1; the issue explained below never occurred with this version of R.
My desktop runs Ubuntu 11.10 64-bit, R 2.13.2; what follows applies to this setup.

The data I'm clustering consists of the rows of a 320 x 6 matrix containing integers ranging from 1 to 7, with no missing data. I applied kmeans() to this matrix, literally, 256 x 10^6 times using R version 2.13.2 or 2.14.1, without ever experiencing the slightest problem. My usual setup is k=5, nstart=256, iter.max=50.

After upgrading to R 2.15.2, I experienced either a warning message ('Empty cluster. Choose a better set of initial centers') or a catastrophic segfault. The only way I can get a solution at all is by setting nstart to its default value, i.e. 1. However, when I simply repeat the clustering, the same issue still happens. Moreover, this is vastly suboptimal, because of the risk of local minima.

Something similar was reported many years ago, see https://stat.ethz.ch/pipermail/r-help/2003-November/041784.html. It was then suggested that R's behaviour was correct. I'm not familiar with such an early R version, but the up-to-date documentation of kmeans clearly states that "Except for the Lloyd-Forgy method, k clusters will always be returned if a number is specified." I am using the default Hartigan-Wong, and I specify an exact number k: thus, k clusters should be returned. They aren't, so the empty cluster is more likely the symptom of a bug than the outcome of a 'true' local minimum.

Using synaptic, I managed to downgrade R to version 2.13.2. The problem disappeared, i.e. the previous message/segfault no longer occurred.

Summarizing: given the same dataset, either an unreasonable message or a segfault regularly occurs in version 2.15.2 when invoking kmeans() on an Ubuntu 11.10 64-bit machine. This does not happen at all in previous versions of R, on the same machine and operating system.

I respectfully suggest that the behaviour shown in the aforementioned versions 2.13.2 and 2.14.1 should be considered 'normal', and that version 2.15.2 should revert to that.

Kind regards,
Luca Nanetti.
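
The report does not include code or data. A reproducible sketch of the described setup might look like the following; it assumes simulated integers in place of the real 320 x 6 matrix, so it may well not trigger the reported behaviour, and the seed is arbitrary.

```r
set.seed(123)

# Simulated stand-in for the real data: 320 rows, 6 columns, integers 1..7
x <- matrix(sample(1:7, 320 * 6, replace = TRUE), nrow = 320, ncol = 6)

# The setup described in the report (Hartigan-Wong is the default algorithm)
fit <- kmeans(x, centers = 5, nstart = 256, iter.max = 50)

# Cluster sizes: an empty cluster would show up as a zero here
fit$size
```

Posting something of this shape together with sessionInfo() would let others check whether the problem depends on the data or on the R build.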
[R] Fractional logit in GLM?
Hi,

Does anyone know of a function in R that can handle a fractional variable as the dependent variable? The catch is that the function has to be inclusive of 0 and 1, which betareg() does not.

It seems like GLM might be able to handle the fractional logit model, but I can't figure it out. How do you format GLM to do so?

Best,
Rachael
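
One commonly used way to fit a fractional logit with glm() is a quasi-binomial family with a logit link, which accepts responses anywhere in the closed interval [0, 1]. The sketch below is an illustration on simulated data, not an answer from the thread; the variable names are invented.

```r
set.seed(99)

# Simulated fractional response in [0, 1], including exact 0s and 1s
n <- 300
x <- rnorm(n)
y <- plogis(0.5 + 1.0 * x + rnorm(n, sd = 0.8))
idx <- sample(n, 40)
y[idx[1:20]]  <- 0   # some exact zeros
y[idx[21:40]] <- 1   # some exact ones

# Fractional logit: logit link, quasi-likelihood instead of a full binomial model
fit <- glm(y ~ x, family = quasibinomial(link = "logit"))

coef(fit)
range(fitted(fit))   # fitted means stay strictly inside (0, 1)
```

Using family = binomial here would give the same coefficient estimates but warns about non-integer successes; quasibinomial avoids the warning and estimates a dispersion parameter. Robust standard errors are often recommended with this model and are not shown here.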
[R] Looping through rows of all elements of a list that has variable length
Dear R-ers,

I have a list of data frames such that the length of the list is unknown in advance (it could be 1 or 2 or more). Each element of the list contains a data frame. I need to loop through all rows of list element 1 AND (if applicable) of list element 2 etc. and do something at each iteration. I am trying to figure out how to write code that is generic, i.e., loops through the rows of all elements of my list even if the total number of list elements is unknown in advance.

Below is an example.

    a <- expand.grid(1:2, 1:2)
    b <- expand.grid(1:2, 1:2, 1:2)

    # My list that can have 1 element, e.g.:
    l.short <- vector("list", 1)
    l.short[[1]] <- a

    # I need to loop through rows of l.short[[1]] and do something
    # (it's unimportant what exactly) with them, e.g.:
    out <- vector("list", nrow(l.short[[1]]))
    for (i in 1:nrow(l.short[[1]])) {  # i <- 1
      out[[i]] <- sum(l.short[[1]][i,])
    }
    (out)

    # Or my list could have more than 1 element, e.g., 2 like below (or 3 or more).
    # The total length of my list varies.
    l.long <- list(a, b)

    # I need to loop through rows of l.long[[1]] AND of l.long[[2]] simultaneously
    # and do something with both, see example below.
    # Below, I am doing it manually by using expand.grid to create all
    # combinations of rows of the 2 elements of 'l.long':
    mygrid <- expand.grid(1:nrow(l.long[[1]]), 1:nrow(l.long[[2]]))
    out <- vector("list", nrow(mygrid))
    for (gridrow in 1:nrow(mygrid)) {  # gridrow <- 1
      row.a <- mygrid[gridrow, 1]
      row.b <- mygrid[gridrow, 2]
      out[[gridrow]] <- sum(l.long[[1]][row.a,]) + sum(l.long[[2]][row.b,])
    }

Thank you very much for any suggestions!

--
Dimitri Liakhovitski
gfk.com http://marketfusionanalytics.com/
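
One possible way to make the manual expand.grid step generic is to build the grid of row indices with do.call() over however many data frames the list holds. This is a sketch under that assumption, not a reply from the thread; `loop_row_combos` is a hypothetical helper name, and the per-combination operation (a sum) just mirrors the example above.

```r
a <- expand.grid(1:2, 1:2)
b <- expand.grid(1:2, 1:2, 1:2)

# Hypothetical helper: visits every combination of rows across all list elements
loop_row_combos <- function(lst) {
  # One vector of row numbers per data frame in the list
  idx <- lapply(lst, function(d) seq_len(nrow(d)))
  # All combinations of row numbers; one column per list element
  grid <- do.call(expand.grid, idx)
  vapply(seq_len(nrow(grid)), function(g) {
    rows <- unlist(grid[g, ])
    # Sum the selected row of each data frame, then add the sums together
    sum(mapply(function(d, r) sum(d[r, ]), lst, rows))
  }, numeric(1))
}

out.short <- loop_row_combos(list(a))      # 4 results, one per row of 'a'
out.long  <- loop_row_combos(list(a, b))   # 4 * 8 = 32 results
```

The grid grows multiplicatively with the number of rows in each element, so for long lists of large data frames it may be better to iterate lazily rather than materialise every combination.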
[R] Compare each element of a list to a vector
Hello R-helpers,

I have a vector

    x <- c(1,2,3)

and a list that contains vectors

    datalist <- list(c(1,2,3), c(2,3,4), c(3,4,5), c(4,5,6))

and I would like to identify those list elements that are identical to x.

I tried

    > datalist %in% x
    [1] FALSE FALSE FALSE FALSE

but I am obviously using %in% incorrectly.

I also tried messing around with lapply but I can't figure out how to specify the function within lapply.

I would appreciate any suggestions you may have. Many thanks!

Mark Na
Re: [R] Compare each element of a list to a vector
try this:

    > x <- c(1,2,3)
    > datalist <- list(c(1,2,3), c(2,3,4), c(3,4,5), c(4,5,6))
    > result <- sapply(datalist, function(.vec){
    +     all(.vec == x)
    + })
    > result
    [1]  TRUE FALSE FALSE FALSE

On Sun, Feb 3, 2013 at 1:15 PM, mtb...@gmail.com wrote:

> I have a vector x <- c(1,2,3) and a list that contains vectors datalist <- list(c(1,2,3), c(2,3,4), c(3,4,5), c(4,5,6)) and I would like to identify those list elements that are identical to x.

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
Re: [R] Compare each element of a list to a vector
Try

    > datalist %in% list(x)
    [1]  TRUE FALSE FALSE FALSE

Both arguments, e1 and e2, of e1 %in% e2 should be of the same type: e1 %in% e2 is comparing e1[i] and e2[j].

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
Re: [R] Compare each element of a list to a vector
My attempt similar to Jim's is:

    which(sapply(datalist, function(z) all(z == x)))

However, a safer approach is:

    which(sapply(datalist, function(z) isTRUE(all.equal(z, x))))

This latter approach avoids Circle 1 of 'The R Inferno'.

http://www.burns-stat.com/documents/books/the-r-inferno/

Pat

--
Patrick Burns
pbu...@pburns.seanet.com
twitter: @burnsstat @portfolioprobe
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of: 'Impatient R' 'The R Inferno' 'Tao Te Programming')
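
A small sketch of why the all.equal() form is safer, using values invented for illustration: exact comparison with == fails on floating-point round-off, and it also recycles silently when the lengths differ.

```r
x <- c(0.3, 2, 3)
datalist <- list(c(0.1 + 0.2, 2, 3),   # equal to x up to round-off
                 c(2, 3, 4))

# Exact comparison: the first element is missed because 0.1 + 0.2 != 0.3 in floating point
exact  <- sapply(datalist, function(z) all(z == x))

# Comparison with numerical tolerance: the first element is found
approx <- sapply(datalist, function(z) isTRUE(all.equal(z, x)))

# Recycling pitfall: a length-6 vector "matches" a length-3 vector under ==
x2 <- c(1, 2, 3)
z6 <- c(1, 2, 3, 1, 2, 3)
recycled <- all(z6 == x2)                  # TRUE, although the vectors differ
tolerant <- isTRUE(all.equal(z6, x2))      # FALSE, lengths differ

list(exact = exact, approx = approx, recycled = recycled, tolerant = tolerant)
```

If bit-for-bit equality is really what is wanted, identical(z, x) is the stricter alternative; it does not recycle either.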
Re: [R] Compare each element of a list to a vector
Thanks Jim, William and Patrick for your ideas. I appreciate your help.

Avoiding a circle of the R Inferno sounds good, so I'm going to use Patrick's 2nd suggestion for now, but I learned something from the others too.

Cheers, Mark

On Sun, Feb 3, 2013 at 12:33 PM, Patrick Burns pbu...@pburns.seanet.com wrote:

> My attempt similar to Jim's is:
>
>     which(sapply(datalist, function(z) all(z == x)))
>
> However, a safer approach is:
>
>     which(sapply(datalist, function(z) isTRUE(all.equal(z, x))))
>
> This latter approach avoids Circle 1 of 'The R Inferno'.
[R] RandomForest, Party and Memory Management
Dear All,

For a data mining project, I am relying heavily on the randomForest and party packages. Due to the large size of the data set, I often have memory problems (in particular with the party package; randomForest seems to use less memory). I really have two questions at this point.

1) Please see how I am using the party and randomForest packages. Any comment is welcome and useful.

    myparty <- cforest(SalePrice ~ ModelID + ProductGroup + ProductGroupDesc +
                         MfgYear + saledate3 + saleday + salemonth,
                       data = trainRF,
                       control = cforest_unbiased(mtry = 3, ntree = 300, trace = TRUE))

    rf_model <- randomForest(SalePrice ~ ModelID + ProductGroup + ProductGroupDesc +
                               MfgYear + saledate3 + saleday + salemonth,
                             data = trainRF, na.action = na.omit,
                             importance = TRUE, do.trace = 100, mtry = 3, ntree = 300)

2) I have another question: sometimes R crashes after telling me that it is unable to allocate e.g. an array of 1.5 Gb. However, I have 4 Gb of RAM on my box, so technically the memory is there. Is there a way to enable R to use more of it?

Many thanks

Lorenzo
Re: [R] RandomForest, Party and Memory Management
Neither of your questions meets the Posting Guidelines (see footer of any email).

1) Not reproducible. [1]

2) Very operating-system specific and a FAQ. You have not indicated what your OS is (via sessionInfo), nor what reading you have already done to address memory problems (use a search engine... or begin with the FAQs in R help or on CRAN).

[1] http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.
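As a rough aid for the memory question in this thread, the sketch below estimates how much memory a numeric matrix needs. The helper `approx_gib` and the dimensions are hypothetical, chosen only to land near the 1.5 Gb figure mentioned above; the only assumption about R itself is that a double takes 8 bytes per cell.

```r
# Hypothetical helper: approximate size in GiB of a numeric (double) matrix.
approx_gib <- function(n_rows, n_cols, bytes_per_cell = 8) {
  n_rows * n_cols * bytes_per_cell / 1024^3
}

# Illustrative dimensions (assumed, not from the thread): about 1.5 GiB.
est <- approx_gib(4e5, 500)

# Sanity check against a small real object: 1000 x 50 doubles
# occupy 400000 bytes plus a small header.
m <- matrix(0, nrow = 1000, ncol = 50)
obj_bytes <- as.numeric(object.size(m))
```

An allocation of this size has to fit in one block of address space, which may explain a failure on a machine with 4 Gb of RAM, particularly under a 32-bit build; reporting sessionInfo(), as requested above, would show which build is in use.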
[R] Using relaimpo or relimp with PLM
Dear all,

Unfortunately, the packages relaimpo and relimp do not seem to work with the plm function (plm package). Does anyone know of a workaround for these incompatibilities, or have any ideas on the matter?

Thanks in advance!

Richard A.
Re: [R] Fractional logit in GLM?
glm() will handle fractional logit with some tweaks. The session below is copied from a Python example on my blog; however, you should be able to see the R code in it.

In [12]: # Address the same type of model with R by Pyper
In [13]: import pyper as pr
In [14]: r = pr.R(use_pandas = True)
In [15]: r.r_data = data
In [16]: # Indirect Estimation of Discrete Dependent Variable Models
In [17]: r('data <- rbind(cbind(r_data, y = 1, wt = r_data$LEV_LT3), cbind(r_data, y = 0, wt = 1 - r_data$LEV_LT3))')
Out[17]: 'try({data <- rbind(cbind(r_data, y = 1, wt = r_data$LEV_LT3), cbind(r_data, y = 0, wt = 1 - r_data$LEV_LT3))})\n'
In [18]: r('mod <- glm(y ~ COLLAT1 + SIZE1 + PROF2 + LIQ + IND3A, weights = wt, subset = (wt > 0), data = data, family = binomial)')
Out[18]: 'try({mod <- glm(y ~ COLLAT1 + SIZE1 + PROF2 + LIQ + IND3A, weights = wt, subset = (wt > 0), data = data, family = binomial)})\nWarning message:\nIn eval(expr, envir, enclos) : non-integer #successes in a binomial glm!\n'
In [19]: print r('summary(mod)')
try({summary(mod)})

Call:
glm(formula = y ~ COLLAT1 + SIZE1 + PROF2 + LIQ + IND3A, family = binomial,
    data = data, weights = wt, subset = (wt > 0))

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.0129  -0.4483  -0.3173  -0.1535   2.5379

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -7.24979    0.56734 -12.779  < 2e-16 ***
COLLAT1      1.23715    0.26012   4.756 1.97e-06 ***
SIZE1        0.35901    0.03746   9.584  < 2e-16 ***
PROF2       -3.14313    0.73895  -4.254 2.10e-05 ***
LIQ         -1.38249    0.35749  -3.867  0.00011 ***
IND3A        0.54658    0.14136   3.867  0.00011 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 2692.0  on 5536  degrees of freedom
Residual deviance: 2456.4  on 5531  degrees of freedom
AIC: 1995.4

Number of Fisher Scoring iterations: 6

On Sun, Feb 3, 2013 at 11:17 AM, Rachael Garrett rachaeldgarr...@gmail.com wrote:

Hi,

Does anyone know of a function in R that can handle a fractional variable as the dependent variable? The catch is that the function has to be inclusive of 0 and 1, which betareg() does not. It seems like GLM might be able to handle the fractional logit model, but I can't figure it out. How do you format GLM to do so?

Best, Rachael

--
WenSui Liu
Credit Risk Manager, 53 Bancorp
wensui@53.com
513-295-4370
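For readers who want the same idea without the Python wrapper, here is a minimal sketch in plain R on simulated data. The variable names (`frac`, `x`, `dat`, `stacked`) and the simulation itself are assumptions for illustration only, not the data used above. It fits the stacked, weighted binomial model from the session and, for comparison, a quasibinomial glm() on the fraction directly; the two should give the same coefficients because they solve the same estimating equations.

```r
set.seed(1)
n <- 200
x <- rnorm(n)

# Simulated fractional response in [0, 1], with some exact 0s and 1s.
frac <- plogis(-0.5 + 0.8 * x + rnorm(n, sd = 0.5))
frac[1:5]  <- 0
frac[6:10] <- 1
dat <- data.frame(frac = frac, x = x)

# Approach from the session above: duplicate every row, once as a
# "success" weighted by frac and once as a "failure" weighted by 1 - frac.
stacked <- rbind(cbind(dat, y = 1, wt = dat$frac),
                 cbind(dat, y = 0, wt = 1 - dat$frac))

# The "non-integer #successes" warning is expected here, so it is suppressed.
mod_stacked <- suppressWarnings(
  glm(y ~ x, weights = wt, subset = wt > 0, data = stacked, family = binomial)
)

# Alternative: model the fraction directly with a quasibinomial family.
mod_quasi <- glm(frac ~ x, data = dat, family = quasibinomial)

coef_stacked <- coef(mod_stacked)
coef_quasi   <- coef(mod_quasi)
```

The point estimates agree, but the reported standard errors do not: the stacked fit fixes the dispersion at 1 while the quasibinomial fit estimates it, so neither set should be taken at face value without further thought.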
[R] Adding complex new columns to data frame depending on existing column
Hello

I have a data frame as below

    V1   V2     V3     V4  V5 V6
    chr1 18884  C      C   2  0
    chr1 135419 TATACA T   2  0
    chr1 332045 T      TTG 0  2
    chr1 453838 T      TAC 2  0
    chr1 567652 T      TG  1  0
    chr1 602541 TTTA   T   2  0

on which I want to perform a complex rearrangement such that:

if V3 is a string of length > 1 (i.e. line 2), then I generate 2 new columns where

    first new column = V2-1
    second new column = V2+(length of string in V3)+1

therefore, for line 2 the output would look like:

    chr1 135419 TATACA T 2 0 135418 135426

if the length of the string in V3 = 1 and V4 is a string of length >= 1 (i.e. line 1), then

    first new column = V2
    second new column = V2+2

and the output for line 1 would be:

    chr1 18884 C C 2 0 18884 18886

I am not sure:
a) how to use R to substitute the length of the string in V3 with the number representing this length
b) whether apply would be best to use here

Thanks
[R] package installation error in Mac OS X
Hi,

I installed R on Mac OS X and am trying to install a package. R is not allowing me to install the meboot package. Below is the exact message I got from R:

    installation of package ‘meboot’ had non-zero exit status
    trying URL 'http://cran.ma.imperial.ac.uk/src/contrib/meboot_1.1-5.tar.gz'
    Content type 'application/x-gzip' length 411681 bytes (402 Kb)
    opened URL
    ==
    downloaded 402 Kb

    * installing *source* package ‘meboot’ ...
    ** package ‘meboot’ successfully unpacked and MD5 sums checked
    ** libs
    *** arch - i386
    sh: make: command not found
    ERROR: compilation failed for package ‘meboot’
    * removing ‘/Users/ravshonbek/Library/R/2.15/library/meboot’

    The downloaded source packages are in
    ‘/private/var/folders/c7/jrjv78_x6f53l3sw_w715gk0gn/T/RtmphnbWao/downloaded_packages’

Can anyone please shed some light on it?

Thank you
Re: [R] cumulative sum by group and under some criteria
Hi,

If you need to extract only the columns `m1` and `n1` for the rows that satisfy the condition:

    res2[,1:2][res2$cterm1_P1L<0.01 & res2$cterm1_P1L!=0,]
    #   m1 n1
    #20  3  2
    #21  3  2

If you want a structure() call like the one shown below for `d`, use dput(res2).

A.K.

----- Original Message -----
From: zjoanna2...@gmail.com
To: smartpink...@yahoo.com
Sent: Sunday, February 3, 2013 3:58 PM
Subject: Re: cumulative sum by group and under some criteria

Hi,

Let me restate my questions. I need to get the m1 and n1 that satisfy some criteria; for example, in this case, within each group, the maximum cterm1_P1L (the last row in this group) is < 0.01. I need to extract m1=3, n1=2; I only need m1 and n1 from that row. Also, how do I create the structure from the data.frame? I am new to R, and I need to change maxN and run the loop on different data.

Thanks very much for your help!

arun kirshna wrote:

HI, I think this should be more correct:

    maxN <- 9
    c11 <- 0.2
    c12 <- 0.2
    p0L <- 0.05
    p0H <- 0.05
    p1L <- 0.20
    p1H <- 0.20
    d <- structure(list(
      m1 = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3),
      n1 = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2),
      x1 = c(0, 0, 0, 1, 1, 1, 2, 2, 2, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3),
      y1 = c(0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2),
      Fmm = c(0, 0, 0, 0.7, 0.59, 0.64, 1, 1, 1, 0, 0, 0, 0, 0.63, 0.7, 0.74, 0.68, 1, 1, 1, 1, 0, 0, 0, 0.62, 0.63, 0.6, 0.63, 0.6, 0.68, 1, 1, 1),
      Fnn = c(0, 0.64, 1, 0, 0.51, 1, 0, 0.67, 1, 0, 0.62, 0.69, 1, 0, 0.54, 0.62, 1, 0, 0.63, 0.73, 1, 0, 0.63, 1, 0, 0.7, 1, 0, 0.7, 1, 0, 0.58, 1),
      Qm = c(1, 1, 1, 0.65, 0.45, 0.36, 0.5, 0.165, 0, 1, 1, 1, 1, 0.685, 0.38, 0.32, 0.32, 0.5, 0.185, 0.135, 0, 1, 1, 1, 0.69, 0.37, 0.4, 0.685, 0.4, 0.32, 0.5, 0.21, 0),
      Qn = c(1, 0.36, 0, 0.65, 0.45, 0, 0.5, 0.165, 0, 1, 0.38, 0.31, 0, 0.685, 0.38, 0.32, 0, 0.5, 0.185, 0.135, 0, 1, 0.37, 0, 0.69, 0.3, 0, 0.685, 0.3, 0, 0.5, 0.21, 0),
      term1_p0 = c(0.81450625, 0.0857375, 0.00225625, 0.0857375, 0.009025, 0.0002375, 0.00225625, 0.0002375, 6.25e-06, 0.7737809375, 0.1221759375, 0.0064303124999, 0.0001128125, 0.081450625, 0.012860625, 0.000676875, 1.1875e-05, 0.0021434375, 0.0003384375, 1.78125e-05, 3.125e-07, 0.7737809375, 0.081450625, 0.0021434375, 0.1221759375, 0.012860625, 0.0003384375, 0.0064303124999, 0.000676875, 1.78125e-05, 0.0001128125, 1.1875e-05, 3.125e-07),
      term1_p1 = c(0.4096, 0.2048, 0.0256, 0.2048, 0.1024, 0.0128, 0.0256, 0.0128, 0.0016, 0.32768, 0.24576, 0.06144, 0.00512, 0.16384, 0.12288, 0.03072, 0.00256, 0.02048, 0.01536, 0.00384, 0.00032, 0.32768, 0.16384, 0.02048, 0.24576, 0.12288, 0.01536, 0.06144, 0.03072, 0.00384, 0.00512, 0.00256, 0.00032)),
      .Names = c("m1", "n1", "x1", "y1", "Fmm", "Fnn", "Qm", "Qn", "term1_p0", "term1_p1"),
      row.names = c(NA, 33L), class = "data.frame")

    library(zoo)
    lst1 <- split(d, list(d$m1, d$n1))
    res2 <- do.call(rbind, lapply(lst1[lapply(lst1, nrow) != 0], function(x) {
      x[,11:14] <- NA
      x[,11:12][x$Qm<=c11,] <- cumsum(x[,9:10][x$Qm<=c11,])
      x[,13:14][x$Qn<=c12,] <- cumsum(x[,9:10][x$Qn<=c12,])
      colnames(x)[11:14] <- c("cterm1_P0L", "cterm1_P1L", "cterm1_P0H", "cterm1_P1H")
      x1 <- na.locf(x)
      x1[,11:14][is.na(x1[,11:14])] <- 0
      x1}))
    row.names(res2) <- 1:nrow(res2)
    res2
    # m1 n1 x1 y1 Fmm Fnn Qm Qn term1_p0 term1_p1 cterm1_P0L cterm1_P1L cterm1_P0H cterm1_P1H
    #1 2 2 0 0 0.00 0.00 1.000 1.000 0.8145062500 0.40960 0.00 0.0 0.00 0.0
    #2 2 2 0 1 0.00 0.64 1.000 0.360 0.0857375000 0.20480 0.00 0.0 0.00 0.0
    #3 2 2 0 2 0.00 1.00 1.000 0.000 0.0022562500 0.02560 0.00 0.0 0.0022562500 0.02560
    #4 2 2 1 0 0.70 0.00 0.650 0.650 0.0857375000 0.20480 0.00 0.0 0.0022562500 0.02560
    #5 2 2 1 1 0.59 0.51 0.450 0.450 0.009025 0.10240 0.00 0.0 0.0022562500 0.02560
    #6 2 2 1 2 0.64 1.00 0.360 0.000 0.0002375000 0.01280 0.00 0.0 0.0024937500 0.03840
    #7 2 2 2 0 1.00 0.00 0.500 0.500 0.0022562500 0.02560 0.00 0.0 0.0024937500 0.03840
    #8 2 2 2 1 1.00 0.67 0.165 0.165 0.0002375000 0.01280 0.0002375000 0.01280 0.0027312500 0.05120
    #9 2 2 2 2 1.00 1.00 0.000 0.000 0.062500 0.00160 0.0002437500 0.01440 0.0027375000 0.05280
    #10 3 2 0 0 0.00 0.00 1.000 1.000 0.7737809375 0.32768 0.00 0.0 0.00 0.0
    #11 3 2 0 1 0.00 0.63 1.000 0.370 0.0814506250 0.16384 0.00 0.0 0.00 0.0
    #12 3 2 0 2 0.00 1.00 1.000 0.000 0.0021434375 0.02048 0.00 0.0 0.0021434375 0.02048
    #13 3 2 1 0 0.62 0.00 0.690 0.690 0.1221759375 0.24576 0.00 0.0 0.0021434375 0.02048
[R] ggplot2 plotting errorbars.
Hi,

I'm using these lines of code:

    dodge <- position_dodge(width=0.9)
    ggplot(dfm, aes(x = X, y = value)) +
      geom_bar(aes(fill = variable), position=dodge, stat="identity") +
      geom_errorbar(aes(ymin=value-er, ymax=value+er), width=0.25, position=dodge, stat="identity")

to plot this data frame:

      X variable  value    er
    1 A       X4  58.74  9.44
    2 B       X4  52.41 10.01
    3 C       X4  95.52  4.88
    4 A       X1  75.51  8.54
    5 B       X1   0.73 23.20
    6 C       X1  96.66  1.18
    7 A       X5  76.70  9.60
    8 B       X5   0.56 34.50
    9 C       X5 100.58 10.87

In the resulting plot, the error bars are still very much wrongly positioned. How do I solve this?

Thanks for the help!
Re: [R] Fortan to R
eliza botto eliza_botto at hotmail.com writes:

Dear UseRs, How can I connect my FTN95 Fortran compiler with R in Windows 7?

Take a look at the R extensions manual, http://cran.r-project.org/doc/manuals/R-exts.html, section 5 ...
Re: [R] package installation error in Mac OS X
Hi,

Did you install the Xcode Developer Tools on your machine?

HTH, Pascal
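The error in the original post was "sh: make: command not found", so one quick check from inside R is whether a `make` program is visible at all. This is a small sketch using base R only; the variable names are illustrative.

```r
# Sys.which() returns the full path of a program found on the PATH,
# or an empty string when it cannot be found.
make_path <- Sys.which("make")
has_make  <- nzchar(unname(make_path))

if (has_make) {
  message("make found at: ", make_path)
} else {
  message("make not found; source packages with compiled code cannot be built")
}
```

If `has_make` is FALSE, installing the developer tools mentioned above is the likely next step before retrying the source installation.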
Re: [R] ggplot2 plotting errorbars.
Hi,

It seems to be a problem with using aes both in ggplot and in geom_bar. You could specify the fill property for your geom_bar in the ggplot initialization to avoid this issue (you could do the same for the ymin and ymax properties of the errorbar :P), i.e.:

    dodge <- position_dodge(width=0.9)
    ggplot(dfm, aes(x=X, y=value, fill=variable, ymin=value-er, ymax=value+er)) +
      geom_bar(position=dodge) +
      geom_errorbar(position=dodge, width=0.25)

Hope it helps.

--
Rafael R.
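To make the suggestion above easy to try, here is a self-contained sketch that rebuilds the data frame from the original post and moves `fill` into the top-level aes(), so that the bars and the error bars share the same grouping and are dodged together. It assumes the ggplot2 package is installed; `stat = "identity"` is spelled out because current ggplot2 versions require it when y values are supplied to geom_bar().

```r
library(ggplot2)

# Data frame from the original post.
dfm <- data.frame(
  X        = rep(c("A", "B", "C"), times = 3),
  variable = rep(c("X4", "X1", "X5"), each = 3),
  value    = c(58.74, 52.41, 95.52, 75.51, 0.73, 96.66, 76.70, 0.56, 100.58),
  er       = c(9.44, 10.01, 4.88, 8.54, 23.20, 1.18, 9.60, 34.50, 10.87)
)

dodge <- position_dodge(width = 0.9)

# fill is mapped once, at the top level, so both layers inherit it.
p <- ggplot(dfm, aes(x = X, y = value, fill = variable,
                     ymin = value - er, ymax = value + er)) +
  geom_bar(stat = "identity", position = dodge) +
  geom_errorbar(position = dodge, width = 0.25)

# print(p) draws the plot.
```

Keeping `fill` only inside geom_bar(), as in the original code, leaves the error-bar layer without a grouping variable, which is the likely reason the bars and error bars did not line up.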
Re: [R] Adding complex new columns to data frame depending on existing column
Hi,

Maybe this helps:

    df1 <- read.table(text="
    V1 V2 V3 V4 V5 V6
    chr1 18884 C C 2 0
    chr1 135419 TATACA T 2 0
    chr1 332045 T TTG 0 2
    chr1 453838 T TAC 2 0
    chr1 567652 T TG 1 0
    chr1 602541 TTTA T 2 0
    ", header=TRUE, sep="", stringsAsFactors=FALSE)

    df1$newCol1 <- ifelse(nchar(df1$V3)>1, df1$V2-1,
                          ifelse(nchar(df1$V3)==1 & nchar(df1$V4)>=1, df1$V2, NA))
    df1$newCol2 <- ifelse(nchar(df1$V3)>1, df1$V2+nchar(df1$V3)+1,
                          ifelse(nchar(df1$V3)==1 & nchar(df1$V4)>=1, df1$V2+2, NA))
    df1
    #    V1     V2     V3  V4 V5 V6 newCol1 newCol2
    #1 chr1  18884      C   C  2  0   18884   18886
    #2 chr1 135419 TATACA   T  2  0  135418  135426
    #3 chr1 332045      T TTG  0  2  332045  332047
    #4 chr1 453838      T TAC  2  0  453838  453840
    #5 chr1 567652      T  TG  1  0  567652  567654
    #6 chr1 602541   TTTA   T  2  0  602540  602546

A.K.
[R] rJava works with 32-bit but not 64
Hello:

rJava works for me under 32-bit but not under 64-bit R; see below. Suggestions?

Thanks, Spencer

    > library(rJava)
    Error : .onLoad failed in loadNamespace() for 'rJava', details:
      call: stop("No CurrentVersion entry in '", key, "'! Try re-installing Java and make sure R and Java have matching architectures.")
      error: object 'key' not found
    Error: package/namespace load failed for 'rJava'
    > sessionInfo()
    R version 2.15.2 (2012-10-26)
    Platform: x86_64-w64-mingw32/x64 (64-bit)

    locale:
    [1] LC_COLLATE=English_United States.1252
    [2] LC_CTYPE=English_United States.1252
    [3] LC_MONETARY=English_United States.1252
    [4] LC_NUMERIC=C
    [5] LC_TIME=English_United States.1252

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    ##

    > library(rJava)
    > sessionInfo()
    R version 2.15.2 (2012-10-26)
    Platform: i386-w64-mingw32/i386 (32-bit)

    locale:
    [1] LC_COLLATE=English_United States.1252
    [2] LC_CTYPE=English_United States.1252
    [3] LC_MONETARY=English_United States.1252
    [4] LC_NUMERIC=C
    [5] LC_TIME=English_United States.1252

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] rJava_0.9-3

--
Spencer Graves, PE, PhD
President and Chief Technology Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph: 408-655-4567
web: www.structuremonitoring.com
[R] Wide character in print?
Hello:

I get "Wide character in print" from trying read.xls("22_data.xls") in the gdata package, with 22_data.xls downloaded from Varieties_Country_A-E.xls at http://www.reinhartandrogoff.com/data/browse-by-topic/topics/7/:

    > library(gdata)
    > read.xls("22_data.xls")
    Wide character in print at C:/Users/sgraves/pgms/R/R-2.15.2/library/gdata/perl/xls2csv.pl line 270.
    > sessionInfo()
    R version 2.15.2 (2012-10-26)
    Platform: x86_64-w64-mingw32/x64 (64-bit)

    locale:
    [1] LC_COLLATE=English_United States.1252
    [2] LC_CTYPE=English_United States.1252
    [3] LC_MONETARY=English_United States.1252
    [4] LC_NUMERIC=C
    [5] LC_TIME=English_United States.1252

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] gdata_2.12.0

    loaded via a namespace (and not attached):
    [1] gtools_2.7.0

I get the same message from xls2sep("22_data.xls").

It's only a comment, so I suppose I could ignore it. However, it's generated by a function I'm adding to the Ecdat package, and I'd rather find a way to avoid it. (I suppose I could dump it to sink, but that's pretty extreme and could mask other problems.)

Thanks, Spencer

--
Spencer Graves, PE, PhD
President and Chief Technology Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph: 408-655-4567
web: www.structuremonitoring.com
Re: [R] rJava works with 32-bit but not 64
Hello,

Do you have a 64-bit version of Java? The rJava error message tells you:

    call: stop("No CurrentVersion entry in '", key, "'! Try re-installing Java and make sure R and Java have matching architectures.")

Regards, Pascal
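Since the message asks for matching architectures, a quick way to see which architecture the running R session uses is sketched below, with base R only. The variable names are illustrative, and reading JAVA_HOME is just one possible place to look for a Java installation; it may well be unset, and on Windows rJava may locate Java by other means.

```r
# Pointer size in bytes times 8 gives the bitness of the running R build.
r_bits <- 8 * .Machine$sizeof.pointer   # 32 or 64
r_arch <- R.version$arch                # e.g. "x86_64" or "i386"

# JAVA_HOME is optional; NA here simply means it is not set.
java_home <- Sys.getenv("JAVA_HOME", unset = NA)

message("R is a ", r_bits, "-bit build (", r_arch, "); JAVA_HOME = ", java_home)
```

If R reports 64 bits while only a 32-bit Java is installed, that would be consistent with rJava loading under 32-bit R and failing under 64-bit R as described in this thread.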
[R] gettext weirdness
Hi,

I am trying to use the gettext() function to translate some text. I have never used this function before, so it's entirely possible that I am doing something wrong. The issue that I am encountering is that gettext() properly translates some text, but not other text.

Natural language support was compiled into my R (installed from the Debian repositories):

    $ R
    R version 2.15.1 (2012-06-22) -- "Roasted Marshmallows"
    [...]
      Natural language support but running in an English locale
    [...]
    > q()

Here is some text that has a translation in the file ./po/fr.po:

    #: src/main/errors.c:290
    msgid "invalid option \"warning.expression\""
    msgstr "option incorrecte \"warning.expression\""
    [...]
    #: src/main/errors.c:582
    msgid "Error in "
    msgstr "Erreur dans "

Start R in French and see if I can get something translated to French:

    $ LANG=fr_FR.UTF8 R
    > stop('This is an error')
    Erreur : This is an error
    > bindtextdomain("R")  # does not seem necessary, but just to be safe...
    [1] "/usr/share/R/share/locale"
    > gettext("Error in ", domain="R")
    [1] "Error in "
    > "invalid option \"warning.expression\"" -> msg; gettext(msg, domain="R")
    [1] "option incorrecte \"warning.expression\""

So, the stop() function translates successfully. I can also manually translate some entries, but why does it not work for gettext("Error in ", domain="R")? Any idea?

Thanks

Florent