Re: [R] storing element number of a list in a column data frame
Yes, I like this: a very elegant and neat solution (in my very humble opinion). Sometimes it is so difficult for me to think of a solution in such simple and effective terms: less is more!

Thank you,
max

On 03/10/2013 17:12, David Carlson wrote:

Try this

    i=which(!sapply(mytest, is.null))
    n=do.call(rbind, mytest[i])
    mydf <- data.frame(i, n)
    mydf
      i  n
    1 1 45
    2 3 18
    3 5 99

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Massimo Bressan
Sent: Thursday, October 3, 2013 9:42 AM
To: r-help@r-project.org
Subject: [R] storing element number of a list in a column data frame

    #let's suppose I have a list like this
    mytest <- list(45, NULL, 18, NULL, 99)
    #to note that this is just an amended example because in fact
    #I'm dealing with a long list (more than 400 elements)
    #with no evident pattern of the NULL values

    #I want to end up with a data frame like the following
    data.frame(i=c(1,3,5), n=c(45,18,99))
    #i.e. a data frame storing in
    #column i the number of corresponding element list
    #column n the unique component of that element

    #I've been trying with
    do.call(rbind, mytest)
    #or
    do.call(rbind.data.frame, mytest)
    #but this approach is not properly achieving the desired result
    #now I'm in trouble on how to store each element number of the list in the first column data frame
    #any help for this?
    #thanks

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
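A minimal variant of the approach quoted in this thread, assuming (as in the example) that every non-NULL element holds a single number. It uses vapply() and unlist() instead of sapply() and do.call(rbind, ...); the column names i and n follow the original request.

```r
# Example list from the thread: NULL entries with no evident pattern
mytest <- list(45, NULL, 18, NULL, 99)

# Positions of the non-NULL elements (vapply guarantees a logical vector)
keep <- which(!vapply(mytest, is.null, logical(1)))

# One row per non-NULL element: its position and its value
mydf <- data.frame(i = keep, n = unlist(mytest[keep]))
mydf
```

If the elements could hold more than one value each, unlist() would no longer line up with the positions, and the rbind-based version from the thread would be the safer choice.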
Re: [R] Importing odf file into R
On Fri, Oct 4, 2013 at 5:57 AM, Peter Maclean pmaclean2...@yahoo.com wrote:

Anyone aware of a package or technique to import odf data file into R, I will appreciate his/her help.

A quick scan of R-help points me here: http://www.omegahat.org/ROpenOffice/

Reports of that working (or not) would be appreciated - I've not tried it. Alternatively, there's python code for reading ODF so you could interface to that in a number of ways.

Barry
Re: [R] lattice multi-panel layout
Hi, thanks. Printing the plots one by one seems to be the only working solution. I also tried the grid.arrange function but couldn't get the output I am after. Now the plots are placed on one page but I get the message

    error using packet 1: promise already under evaluation: recursive default argument reference or earlier problems?

This error seems to be associated with the panel option I pass to the xyplot function:

    plott[[i]] <- xyplot( bin_avg ~ dist , type=("p")
        ,panel = automap:::autokrige.vgm.panel
        ,labels = as.character(pop), shift = 0.03
        ,model = auto_Sph_h[[i]]$var_model )

If the panel option is commented out the error disappears. Any further suggestions?

Thanks
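For the "printing one by one" route mentioned above, a minimal sketch with lattice's print method and its split and more arguments may help. The data frame d and its grouping column grp are made up for illustration; the thread's own data and the automap panel function are not used here.

```r
library(lattice)

# Hypothetical example data; the thread's own data are not available
set.seed(1)
d <- data.frame(dist = rep(1:10, 4), bin_avg = runif(40), grp = rep(1:4, each = 10))

# One trellis object per group, stored in a list as in the thread
plots <- lapply(split(d, d$grp),
                function(s) xyplot(bin_avg ~ dist, data = s, type = "p"))

# Print each object into its own cell of a 2 x 2 grid on a single page
pdf(NULL)  # null device so the sketch runs without opening a window
for (k in seq_along(plots)) {
  print(plots[[k]],
        split = c((k - 1) %% 2 + 1, (k - 1) %/% 2 + 1, 2, 2),  # column, row, ncol, nrow
        more = k < length(plots))                               # keep the page open until the last plot
}
dev.off()
```

The split vector is c(column, row, number of columns, number of rows), so the grid size has to be chosen by hand to match the number of plots.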
Re: [R] Random Projection
On 3 Oct 2013, at 22:39, Monaghan, David dmonag...@gc.cuny.edu wrote:

I was wondering, has anyone encountered an R package that performs random projection/random mapping? RP is a procedure that is akin to Principal Components Analysis in that it accomplishes dimensionality reduction, but is far more computationally efficient. I have been searching for some time, but haven't seen anything on CRAN-r yet.

The experimental wordspace package available from R-Forge has an implementation of RI in the dsm.projection() function. See https://r-forge.r-project.org/R/?group_id=783 for download / installation.

Best,
Stefan
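For readers who only want to see what a random projection does, here is a minimal base-R sketch of a Gaussian random projection. It is not the wordspace implementation mentioned above; the matrix sizes and the names X, R and Y are assumptions made for illustration.

```r
set.seed(1)
n <- 50    # observations
p <- 200   # original dimensions
k <- 10    # reduced dimensions

# Hypothetical data matrix
X <- matrix(rnorm(n * p), nrow = n, ncol = p)

# Random projection matrix with N(0, 1/k) entries, so that squared
# distances are preserved in expectation
R <- matrix(rnorm(p * k, sd = 1 / sqrt(k)), nrow = p, ncol = k)

# Projected data: n rows, k columns
Y <- X %*% R

# Compare pairwise distances before and after projection
ratio <- as.vector(dist(Y)) / as.vector(dist(X))
summary(ratio)
```

No eigen-decomposition is involved, only one matrix product, which is where the computational advantage over PCA comes from; the price is that distances are only approximately preserved, more loosely so for small k.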
Re: [R] Importing odf file into R
Assuming you want to read in data from an AOO or LO spreadsheet, have a look at the gnumeric package. I have only used it once or twice but it seems to work well and is quite flexible.

John Kane
Kingston ON Canada

-Original Message-
From: pmaclean2...@yahoo.com
Sent: Thu, 3 Oct 2013 21:57:32 -0700 (PDT)
To: r-help@r-project.org
Subject: Re: [R] Importing odf file into R

Anyone aware of a package or technique to import odf data file into R, I will appreciate his/her help.

Peter Maclean
Department of Economics
UDSM
Re: [R] prcomp - surprising structure
On Oct 3, 2013, at 16:30 , Hermann Norpois wrote:

Thanks for answering. I already started hunting. But my first doubt was whether I used prcomp correctly (and this is at the moment my most important point). As far as I understood, your answer is yes. Is that correct?

Yes. There are a couple of slightly dubious aspects: the missing value imputation, the scaling could be wrong, and the data are highly discrete, but I don't see how any of those would explain the effect. On the off chance that it could be a bug triggered by the large number of columns, you could consider disregarding SNP columns (say) 1:5 and 25:30, rerun the prcomp(), and see if you still get the columns in the same positions.

I am puzzled by the fact that these columns are more or less in the middle of my snp-data.

...or to be precise, there are narrow _ranges_ of SNPs in which the eigenvectors have many extreme coordinate values (positive or negative). I suspect that you need to consider what these ranges represent, both in biological terms and in technological terms. Are they parts of chromosomes, or maybe something to do with the SNP-chip layout?

(Unless you come up with an actual bug, we probably should not continue the discussion on this list.)

-pd

However, it could also be a biological effect. Are your ids by any chance from the same pedigree? If so, you might be seeing something like the effect of a crossover event in a distant ancestor.

No, there is no such pedigree scheme. Things like this are ruled out by IBD-measurement. (Further, the data is checked by an EIGENSTRAT analysis.)

(b) the scaling by sqrt(pi*(1-pi)) implicitly requiring Hardy-Weinberg equilibrium, so if your data are all 0 or 2 (aa or AA) there will be overdispersion.

This is a good point. But why do I find such effects in the middle of my data?

Thanks
Hermann

2013/10/3 peter dalgaard pda...@gmail.com

It's not so obvious to me that this is an artifact.
What prcomp() says is that some of the eigenvectors have a lot of activity in some relatively narrow ranges of SNPs (on the same chromosome, perhaps?). If something artificial is going on, I could imagine effects not so much of centering columns but maybe one of

(a) imputing zero for missing values
(b) the scaling by sqrt(pi*(1-pi)) implicitly requiring Hardy-Weinberg equilibrium, so if your data are all 0 or 2 (aa or AA) there will be overdispersion.

However, it could also be a biological effect. Are your ids by any chance from the same pedigree? If so, you might be seeing something like the effect of a crossover event in a distant ancestor. (Talk to a geneticist, I just play one on TV.)

To investigate further, you could go looking at the individual scores and see who is having extreme values on components 2-4 and then go back and see if there is something peculiar about their SNPs in the strange region.

Of course, you might have stumbled upon a bug in R, but I doubt it.

Happy hunting!
-pd

On Oct 3, 2013, at 11:41 , Hermann Norpois wrote:

Hello,

I did a pca with over 20 snps for 340 observations (ids). If I plot the eigenvectors (called rotation in prcomp) 2, 3 and 4 (e.g. plot (rotation[,2])) I see a strange column in my data (see attachment). I suspect it is an artefact (but of what?).

I used prcomp this way: prcomp (mat), where mat is a matrix with the column means already subtracted, followed by a normalisation procedure (see below for details). Is that okay? Or does prcomp repeat the subtraction step? Originally my approach was driven by the idea of computing a covariance matrix followed by the use of eigen, but the covariance matrix was too huge to handle. So I switched to prcomp. As I guess that the columns in my plots reflect some artefact, I hope to get some help.
In case my use of prcomp was not okay, could you please give me instructions on how to use it, including the normalisation procedure that I need to apply before doing a pca.

Thanks
Hermann

    #
    # mat: matrix with genotypes coded as 0,1 and 2 (columns); IDs (observations) as rows.
    #
    prcomp.snp <- function (mat)
    {
        m <- ncol (mat)
        n <- nrow (mat)
        snp.namen <- colnames (mat)
        for (i in 1:m) { # snps in columns
            ui <- mat[,i]
            n <- length (which (!is.na(ui)))
            # see methods Price et al. as correction
            pi <- (1+ sum(ui, na.rm=TRUE))/(2+2*n)
            # subtract mean
            ui <- ui - mean (ui, na.rm=TRUE)
            # NAs set to zero
            ui[is.na(ui)] <- 0
            # normalisation of the genotype for each ID important normalisation step
            ui <- ui/ (sqrt (pi*(1-pi)))
            # fill matrix with ui
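On the question of whether prcomp() repeats the subtraction step: by default prcomp() centres the columns (center = TRUE) and does not rescale them (scale. = FALSE). Centring a matrix that is already centred changes nothing beyond rounding error, so passing a pre-centred, pre-normalised matrix should be harmless. A small sketch on made-up genotype-like data (the matrix below is an assumption for illustration, not the poster's data):

```r
set.seed(42)

# Hypothetical genotype matrix: 20 ids (rows), 6 SNPs (columns), coded 0/1/2
mat <- matrix(sample(0:2, 20 * 6, replace = TRUE), nrow = 20)

# Subtract the column means by hand, as the poster did
centred <- scale(mat, center = TRUE, scale = FALSE)

p1 <- prcomp(centred)                  # default: centres again
p2 <- prcomp(centred, center = FALSE)  # no second centring

# The standard deviations and the loadings (up to sign) should agree
all.equal(p1$sdev, p2$sdev)
all.equal(abs(p1$rotation), abs(p2$rotation))
```

The comparison uses absolute loadings because the sign of each principal component is arbitrary.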
Re: [R] lattice multi-panel layout
Hi

I have never used the automap package and your syntax for xyplot does not seem to be in the correct format for lattice.

A quick search showed that vgm.panel.xyplot from the gstat package may give you some ideas. It appears that there are some particular adaptations of lattice for spatial plots and I am not up on them.

Regards
Duncan

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of efx
Sent: Friday, 4 October 2013 18:38
To: r-help@r-project.org
Subject: Re: [R] lattice multi-panel layout

Hi, thanks. Printing the plots one by one seems to be the only working solution. I also tried the grid.arrange function but couldn't get the output I am after. Now the plots are placed on one page but I get the message

    error using packet 1: promise already under evaluation: recursive default argument reference or earlier problems?

This error seems to be associated with the panel option I pass to the xyplot function:

    plott[[i]] <- xyplot( bin_avg ~ dist , type=("p")
        ,panel = automap:::autokrige.vgm.panel
        ,labels = as.character(pop), shift = 0.03
        ,model = auto_Sph_h[[i]]$var_model )

If the panel option is commented out the error disappears. Any further suggestions?

Thanks
Re: [R] Counting numbers in R
I have a set of data and I need to find out how many points are below a certain value but R will not calculate this properly for me.

R will. But you aren't.

Negative numbers seem to be causing the issue.

You haven't got any negative numbers in your data set. In fact, you haven't got any numbers. It's all character strings. Is there a reason for that?

Assuming there is, if you have your data in a data frame 'A' and just want the count:

    table(as.numeric(A$Tm_ugL) <= 0.0002)

If you just want a complete vector of TRUE or FALSE

    as.numeric(d$Tm_ugL) <= 0.0002

does that. If you want to add that to your data frame (is it called A?) that looks like

    A$Censored <- as.numeric(d$Tm_ugL) <= 0.0002

But you really shouldn't have numbers in character format; read it as numeric. Then it's just

    table(d$Tm_ugL <= 0.0002)

and so on. If it's refusing to read as numeric, find out why and fix the data.

And some comments on code, while I'm here:

    for (i in one:nrow(A))
    ...
    if (A[i,two]<=A_LLD)

Variables called 'one' and 'two' look like a really bad idea. If they are equal to 1 and 2, use 1 and 2 (or 1L and 2L if you want to be _sure_ they are integer). If not, the names are going to be pretty confusing, no?

    (A_Censored[i,two]<-"TRUE")

Why use a character string like "TRUE" that R can't interpret as logical instead of the logical values TRUE and FALSE?

S Ellison
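The advice above can be put together in a short sketch. The data frame A and its values are made up; only the column name Tm_ugL and the threshold 0.0002 come from the thread.

```r
# Hypothetical data in which the numbers were read in as character strings
A <- data.frame(Tm_ugL = c("0.0001", "0.0002", "0.0005", "-0.0003", "0.0010"),
                stringsAsFactors = FALSE)

# Convert once, up front, so every later comparison is numeric
A$Tm_ugL <- as.numeric(A$Tm_ugL)

# Logical column: TRUE where the value is at or below the threshold
A$Censored <- A$Tm_ugL <= 0.0002

# Count of values at or below the threshold
n_below <- sum(A$Censored)
table(A$Censored)
```

If as.numeric() produces NA values with a warning, some entries are not valid numbers, and those rows are the ones to inspect in the raw data.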
[R] SSweibull() : problems with step factor and singular gradient
I think you have chosen a model that is ill-suited to the data.

My initial thoughts were simply that the issue was the usual nls() "singular gradient" (actually jacobian if you want to be understood in the optimization community) woes, but in this case the jacobian really is bad.

My quick and dirty tries give some insight, but do not provide a satisfactory answer. Note that the last two columns of the nlxb summary are the gradient and the Jacobian singular values, so one can see how bad things are.

    days <- c(163,168,170,175,177,182,185,189,196,203,211,217,224)
    height <- c(153,161,171,173,176,173,185,192,195,187,195,203,201)
    dat <- as.data.frame(cbind(days,height))
    fit <- try(nls(y ~ SSweibull(x, Asym, Drop, lrc, pwr), data = dat, trace=T,
                   control=nls.control(minFactor=1/10)))
    ## failed

    fdata <- data.frame(x=days, y=height)

    require(nlmrt)
    strt2 <- c(Asym=250, Drop=1, lrc=1, pwr=1)
    fit2 <- nlxb(y ~ Asym - (Drop * ( exp(-(exp(lrc)*(x^pwr))))), data=fdata, start=strt2, trace=TRUE)

    strt3 <- c(Asym=250, Drop=.5, lrc=.1, pwr=2)
    fit3 <- nlxb(y ~ Asym - (Drop * ( exp(-(exp(lrc)*(x^pwr))))), data=fdata, start=strt3, trace=TRUE)

    strt4 <- c(Asym=200, Drop=.5, lrc=.1, pwr=2)
    fit4 <- nlxb(y ~ Asym - (Drop * ( exp(-(exp(lrc)*(x^pwr))))), data=fdata, start=strt4, trace=TRUE,
                 masked=c("Asym"))

    d50 <- days-160
    fd2 <- data.frame(x=d50, y=height)
    fit5 <- nlxb(y ~ Asym - (Drop * ( exp(-(exp(lrc)*(x^pwr))))), data=fd2, start=strt3, trace=TRUE)
    fit5

John Nash

On 13-10-04 02:19 AM, r-help-requ...@r-project.org wrote:

Message: 40
Date: Thu, 3 Oct 2013 20:49:36 +0200
From: aline.fr...@wsl.ch
To: r-help@r-project.org
Subject: [R] SSweibull() : problems with step factor and singular gradient

SSweibull() : problems with step factor and singular gradient

Hello

I am working with growth data of ~4000 tree seedlings and trying to fit non-linear Weibull growth curves through the data of each plant.
Since they differ a lot in their shape, initial parameters cannot be set for all plants. That's why I use the self-starting function SSweibull(). However, I often got two error messages:

1)

    # Example
    days <- c(163,168,170,175,177,182,185,189,196,203,211,217,224)
    height <- c(153,161,171,173,176,173,185,192,195,187,195,203,201)
    dat <- as.data.frame(cbind(days,height))
    fit <- nls(y ~ SSweibull(x, Asym, Drop, lrc, pwr), data = dat, trace=T,
               control=nls.control(minFactor=1/10))

    Error in nls(y ~ cbind(1, -exp(-exp(lrc) * x^pwr)), data = xy, algorithm = "plinear", :
      step factor 0.000488281 reduced below 'minFactor' of 0.000976562

I tried to avoid this error by reducing the step factor below the standard minFactor of 1/1024 using the nls.control function (shown in the example above). However, this didn't work, as shown in the example (minFactor still the standard). Thus, does nls.control() not work for self-starting functions like SSweibull()? Or is there another explanation?

2) In other cases, a second error message showed up:

    Error in nls(y ~ cbind(1, -exp(-exp(lrc) * x^pwr)), data = xy, algorithm = "plinear", :
      singular gradient

Is there a way to avoid the problem of a singular gradient?

I'd be very glad about helpful comments. Thanks a lot.

Aline
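When fitting several thousand plants in a loop, one practical pattern is to wrap each fit so that a failure is recorded rather than stopping the run. The sketch below uses plain nls() with an explicit Weibull formula and explicit starting values, so that the nls.control() settings apply to the fit itself; the helper name fit_weibull, the control values and the shift of days by 160 (borrowed from the reply above) are illustrative assumptions. For the example data the fit may well fail and return NULL, which is consistent with the reply's point that the model is ill-suited here.

```r
# Hypothetical helper: returns the fitted nls object, or NULL if the fit fails
fit_weibull <- function(x, y, start) {
  tryCatch(
    nls(y ~ Asym - Drop * exp(-exp(lrc) * x^pwr),
        start = start,
        control = nls.control(maxiter = 200, minFactor = 1e-6)),
    error = function(e) NULL
  )
}

# Example data from the thread
days   <- c(163,168,170,175,177,182,185,189,196,203,211,217,224)
height <- c(153,161,171,173,176,173,185,192,195,187,195,203,201)

# Shifted time axis and starting values taken from the reply above
fit <- fit_weibull(days - 160, height,
                   start = c(Asym = 250, Drop = 0.5, lrc = 0.1, pwr = 2))

if (is.null(fit)) message("fit failed for this plant") else print(coef(fit))
```

In a loop over plants, the NULL results identify the seedlings that need different starting values or a different model.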
Re: [R] vif
Dear Arun,

Thanks indeed

Date: Thu, 3 Oct 2013 10:22:38 -0700
From: smartpink...@yahoo.com
To: r-help@r-project.org
Subject: Re: [R] vif

Hi Eliza,

Then, res needs a slight modification

    library(car)
    res <- lapply(colnames(h), function(x) {
        x1 <- h[,x]
        dat1 <- do.call(rbind, lapply(seq_len(ncol(mat1)), function(i) {
            x2 <- m[,mat1[,i]]
            GG <- lm(x1~x2[,1]+x2[,2]+x2[,3]+x2[,4])
            GGsum <- summary(GG)
            data.frame(Models=paste(colnames(x2),collapse=","),
                       Multiple_Rsq=GGsum$r.squared,
                       Adjusted_Rsq=GGsum$adj.r.squared,
                       Pval=paste(GGsum$coef[-1,4],collapse=","),
                       Vif=paste(vif(GG),collapse=","),
                       stringsAsFactors=FALSE)
        }))
        dat1[rev(order(dat1[,3])),][1:10,]
    })
    names(res) <- colnames(h)

A.K.

From: eliza botto eliza_bo...@hotmail.com
To: smartpink...@yahoo.com smartpink...@yahoo.com
Sent: Thursday, October 3, 2013 12:42 PM
Subject: vif

Dear Arun,

There is one small question however: what if I also want a column in the table for the vif values of each model? vif values can be generated for any model in the following way

    GG <- lm(h[,any column]~m[,any column]+m[,any other column]+m[,any other column]+m[,any other column])
    library(car)
    vif(GG)

Here we will get 4 vif values. I want to make a new column which would contain these values separated by commas, very much like the way we did with the Pr(>|t|) values.
thanks in advance
elisa

Date: Thu, 3 Oct 2013 09:01:53 -0700
From: smartpink...@yahoo.com
To: r-help@r-project.org
Subject: Re: [R] a simple question

Hi,

Try:

    set.seed(494)
    h <- matrix(sample(1:40,4*124,replace=TRUE),ncol=4)
    set.seed(39)
    m <- matrix(sample(1:100,10*124,replace=TRUE),ncol=10)
    colnames(h) <- paste0("h",1:4)
    colnames(m) <- paste0("m",1:10)
    mat1 <- combn(colnames(m),4)
    res <- lapply(colnames(h), function(x) {
        x1 <- h[,x]
        dat1 <- do.call(rbind, lapply(seq_len(ncol(mat1)), function(i) {
            x2 <- m[,mat1[,i]]
            GG <- lm(x1~x2[,1]+x2[,2]+x2[,3]+x2[,4])
            GGsum <- summary(GG)
            data.frame(Models=paste(colnames(x2),collapse=","),
                       Multiple_Rsq=GGsum$r.squared,
                       Adjusted_Rsq=GGsum$adj.r.squared,
                       Pval=paste(GGsum$coef[-1,4],collapse=","),
                       stringsAsFactors=FALSE)
        }))
        dat1[rev(order(dat1[,3])),][1:10,]
    })
    names(res) <- colnames(h)

A.K.

From: eliza botto eliza_bo...@hotmail.com
To: smartpink...@yahoo.com smartpink...@yahoo.com
Sent: Thursday, October 3, 2013 11:07 AM
Subject: a simple question

Dear Arun,

I hope you are fine. I actually wanted to discuss the following problem. I have a linear model of the following form.

    GG <- lm(h[,any column]~m[,any column]+m[,any other column]+m[,any other column]+m[,any other column])

where
h is a matrix with 4 columns and 124 rows
m is a matrix with 10 columns and 124 rows

What I want is a loop that runs the linear model for all the possible combinations of columns of m with each column of h. More precisely, if I take column 1 of matrix h, it should be modelled against every combination of 4 of the 10 columns of m (210 combinations). All the columns of h and m have certain names (you can suppose any). The summary(GG) will give Multiple R-squared, Adjusted R-squared and 4 values of Pr(>|t|). I want in the end a table in the following format.
    Models                     Multiple R-squared   Adjusted R-squared   Pr(>|t|)
    Names of columns of m      Multiple R-squared   Adjusted R-squared   Pr(>|t|) values
    separated by comma                                                   separated by comma

For example

    Models                       Multiple R-squared   Adjusted R-squared   Pr(>|t|)
    eliza, allen, murphy, jack   0.544                0.56                 0.000114,0.000112,0.01114,0.002114

where eliza, allen, murphy, jack are column names.

The models are to be listed in order of their Adjusted R-squared values. The models with the highest Adjusted R-squared value should be at the top and so on. I am only interested in the top 10 models, so the remaining ones should be ignored. I tried to put everything in my question but if there is anything wrong please inform me.

Thank you very much in advance,
Eliza
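As background to the Vif column requested in this thread: for a model with only main-effect numeric predictors, the variance inflation factor of predictor j is 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the other predictors. A base-R sketch on made-up predictors (the matrix m below is an assumption for illustration and does not use the car package):

```r
set.seed(39)

# Hypothetical predictor matrix: 124 rows, 4 named columns
m <- matrix(rnorm(124 * 4), ncol = 4,
            dimnames = list(NULL, paste0("m", 1:4)))

# VIF of each column: regress it on the remaining columns
vif_manual <- sapply(seq_len(ncol(m)), function(j) {
  r2 <- summary(lm(m[, j] ~ m[, -j]))$r.squared
  1 / (1 - r2)
})
names(vif_manual) <- colnames(m)

# Collapsed into one comma-separated string, as in the requested table column
paste(round(vif_manual, 3), collapse = ",")
```

Values close to 1 indicate little collinearity among the predictors; the response variable plays no part in the calculation.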
Re: [R] Interpreting the result of a Wilcoxon (Mann-Whitney U) test
-Original Message-
Got it! I agree it should have been more obvious to me... :)

I wouldn't feel too bad about that. I've spent most of the last 25 years discovering the hard way that statistics is very much a field where things are 'obvious' only _after_ you know the answer...

S
[R] Trying it 'isolate' correctly in Shiny
Hello!

I am learning Shiny. In my ui I am allowing the user to read in 3 files. Here is a piece of my ui.r code:

    sidebarPanel(
        fileInput('file1','Select File 1:'),
        fileInput('file2','Select File 2:'),
        fileInput('fileopt','Select Optional File:'),
        actionButton("goButton","RUN")
    ),

Then I want my server code (below) to do something based on EITHER all 3 files that were read in OR, if the 3rd file was not uploaded, based on just the first 2 files. However, it's not working. Maybe what I am doing with myflag is wrong? Or maybe I am isolating incorrectly at the very bottom?

Thank you so much for any pointers!
Dimitri

    shinyServer(function(input,output){

        # Drop-down selection box for Input file 1:
        renderUI({
            fileInput('utils')
        })

        # Drop-down selection box for Input file 2:
        renderUI({
            fileInput('attrib')
        })

        # Drop-down selection box for Input file 3 (optional):
        renderUI({
            fileInput('test')
        })

        myflag=1

        bigone <- reactive({
            inFile1 <- input$file1
            inFile2 <- input$file2
            inFile3 <- input$fileopt
            if (is.null(inFile3)) myflag=0
            forout1 <- read.csv(inFile1$datapath)
            forout2 <- read.csv(inFile2$datapath)
            if(myflag==1) forout3 <- read.csv(inFile3$datapath)
            if(myflag==1){
                out1 <- cbind(forout1,forout2,forout3)
                out2 <- cbind(forout2,forout1,forout3)
            } else {
                out1 <- cbind(forout1,forout2)
                out2 <- cbind(forout2,forout1)
            }
            return(list(out1,out2))
        })

        # Summary statistics for the moments data frame (for testing purposes):
        output$myoutput1 <- renderPrint({
            input$goButton
            # Isolating:
            isolate(bigone()$out1)
            # bigone()$out1
        })

        output$myoutput2 <- renderPrint({
            input$goButton
            # Isolating:
            isolate(bigone()$out2)
        })
    })
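One thing that can be checked outside Shiny: the reactive above returns list(out1,out2) without names, so bigone()$out1 and bigone()$out2 are NULL regardless of isolate(). A possible way to structure the "two or three files" logic is a plain helper that takes the data frames and returns a named list; the helper name combine_inputs is hypothetical and the sketch does not touch the Shiny API itself.

```r
# Hypothetical helper: combine two mandatory data frames and one optional one.
# Returns a *named* list so that $out1 and $out2 work on the result.
combine_inputs <- function(df1, df2, df3 = NULL) {
  if (is.null(df3)) {
    list(out1 = cbind(df1, df2),
         out2 = cbind(df2, df1))
  } else {
    list(out1 = cbind(df1, df2, df3),
         out2 = cbind(df2, df1, df3))
  }
}

# Stand-ins for the uploaded files
a  <- data.frame(a = 1:2)
b  <- data.frame(b = 3:4)
c3 <- data.frame(c = 5:6)

two_files   <- combine_inputs(a, b)        # third file not uploaded
three_files <- combine_inputs(a, b, c3)    # all three uploaded
names(two_files$out1)
names(three_files$out2)
```

Inside the reactive, such a helper could be called with the three read.csv() results, passing NULL for the third when input$fileopt is NULL; that wiring is left as an assumption here since it cannot be run without a Shiny session.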
[R] Trying to avoid nested loop
Dear R users.

I'm trying to avoid using nested loops in the following code but I'm not sure how to proceed. Any help would be greatly appreciated.

With regards,
Phil

    X = matrix(rnorm(100), 10, 10)

    ## Version with nested loops
    result = 0
    for(m in 1:nrow(X)){
        for(n in 1:ncol(X)){
            if(X[m,n] != 0){
                result = result + (X[m,n] / (1 + abs(m - n)))
            }
        }
    }

    ## No loop
    sum(ifelse(M 0, M/??? , 0))
Re: [R] Counting numbers in R
I got sorted, thanks all.

On Fri, Oct 4, 2013 at 2:03 PM, S Ellison s.elli...@lgcgroup.com wrote:

I have a set of data and I need to find out how many points are below a certain value but R will not calculate this properly for me.

R will. But you aren't.

Negative numbers seem to be causing the issue.

You haven't got any negative numbers in your data set. In fact, you haven't got any numbers. It's all character strings. Is there a reason for that?

Assuming there is, if you have your data in a data frame 'A' and just want the count:

    table(as.numeric(A$Tm_ugL) <= 0.0002)

If you just want a complete vector of TRUE or FALSE

    as.numeric(d$Tm_ugL) <= 0.0002

does that. If you want to add that to your data frame (is it called A?) that looks like

    A$Censored <- as.numeric(d$Tm_ugL) <= 0.0002

But you really shouldn't have numbers in character format; read it as numeric. Then it's just

    table(d$Tm_ugL <= 0.0002)

and so on. If it's refusing to read as numeric, find out why and fix the data.

And some comments on code, while I'm here:

    for (i in one:nrow(A))
    ...
    if (A[i,two]<=A_LLD)

Variables called 'one' and 'two' look like a really bad idea. If they are equal to 1 and 2, use 1 and 2 (or 1L and 2L if you want to be _sure_ they are integer). If not, the names are going to be pretty confusing, no?

    (A_Censored[i,two]<-"TRUE")

Why use a character string like "TRUE" that R can't interpret as logical instead of the logical values TRUE and FALSE?

S Ellison
Re: [R] Trying to avoid nested loop
I'm trying to avoid using nested loops in the following code but I'm not sure how to proceed. Any help would be greatly appreciated.

With regards,
Phil

    X = matrix(rnorm(100), 10, 10)
    result = 0
    for(m in 1:nrow(X)){
        for(n in 1:ncol(X)){
            if(X[m,n] != 0){
                result = result + (X[m,n] / (1 + abs(m - n)))
            }
        }
    }

First, you don't need the 'if', do you? If X[m,n]==0 (rare for a floating point number), X[m,n] / (1 + abs(m - n)) will be zero anyway.

Then, depending on the matrix size, you can probably do the whole thing using an index array. Something like:

    idx <- as.matrix( expand.grid(1:nrow(X), 1:ncol(X)) )
    result <- sum( X[idx] / apply(idx,1, function(x) 1+abs(diff(x))) )

#... which seemed to do exactly the same thing as your loop when I tried it.

S Ellison
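A further variant, not from the thread, that avoids both the loops and the index array: base R's row() and col() return matrices of row and column indices with the same shape as X, so the weights 1 + |m - n| can be built in one step. The sketch checks the result against the original nested loop.

```r
set.seed(123)
X <- matrix(rnorm(100), 10, 10)

# Reference: the nested loop from the original question (without the 'if')
result_loop <- 0
for (m in seq_len(nrow(X))) {
  for (n in seq_len(ncol(X))) {
    result_loop <- result_loop + X[m, n] / (1 + abs(m - n))
  }
}

# Vectorised: row(X) - col(X) holds m - n for every cell
result_vec <- sum(X / (1 + abs(row(X) - col(X))))

all.equal(result_loop, result_vec)
```

The two results are compared with all.equal() rather than == because the summation order differs, which can change the last few bits of a floating point sum.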
Re: [R] climstats 'spatial_sync_raster' function
The function I have been trying to use is: spatial_sync_raster

The maintainer has let me know that this is available in his spatial.tools package, which you can get from CRAN:

    install.packages("spatial.tools")

Problem solved. Thanks
jenny

-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net]
Sent: 03 October 2013 20:45
To: Jenny Williams
Cc: Prof Brian Ripley; r-help@r-project.org
Subject: Re: [R] climstats

On Oct 3, 2013, at 10:00 AM, Jenny Williams wrote:

It seems to load now on 3.0.2 32bit and 64bit but NOT 3.0.1.

    install.packages("climstats", repos="http://R-Forge.R-project.org", type="source")

I did have to manually install some of the dependencies. There were 2 of us that tried loading climstats on different machines so there must have been a blip with our firewall or something.

Now that I have climstats loaded the function I am trying to use doesn't work. I can bring up the help file:

    ?spatial_sync_raster

but I get this error when I try to use the function:

    Error: could not find function "spatial_sync_raster"

I was going to say a case of failing to load the package but you say that you got information from ?spatial_sync_raster # be sure to check exact spelling
.. so that seems unlikely. (You are asked to show sessionInfo() and the exact code, which you have failed to provide.)

Maybe you should ask the package maintainer. Type:

    maintainer("climstats")

-- David

On Sep 30, 2013, at 10:39 AM, Prof Brian Ripley wrote:

On 30/09/2013 18:19, David Winsemius wrote:

On Sep 30, 2013, at 3:25 AM, Jenny Williams wrote:

I have been trying to download the climstats package: https://r-forge.r-project.org/R/?group_id=861 but it doesn't seem to run on R 3.0.2 or 3.0.1

What makes you say this? What errors are reported? ("Doesn't seem to run" is a bit vague.)

and the zipfile is empty.

I was able to install version 1.0 from sources with:

    install.packages("climstats", repos="http://R-Forge.R-project.org", type="source")

(I agree that the zipfile for Windows was not found.)
R version 3.0.1 Patched (2013-07-23 r63392). Running Mac OS 10.7.5.

It appears to require a fair number of external packages, so you would need to check the Depends in the description file.

    Depends: R (>= 2.13), raster, rgdal, chron, zoo, sp, ncdf, R.utils

It did not appear to do any C or Fortran compiling, so I think that means you do not need to have RTools installed on Windows. But since it requires rgdal, you would need to have GDAL installed if you were to get it to load.

Why do you say that? On both Windows and OS X, GDAL is part of the rgdal binary.

My error apparently. I have in the past had incorrect installations of GDAL that prevented rgdal from loading properly and my sometimes fuzzy memory was that I fixed this by reinstalling GDAL. So I thought they were independent installations. Apologies for the noise.

-- David.

Does anyone know the status of this package or where I can download it?

Thanks

Jenny Williams
Spatial Information Scientist, GIS Unit
Herbarium, Library, Art Archives Directorate
Royal Botanic Gardens, Kew
Richmond, TW9 3AB, UK
Tel: +44 (0)208 332 5277
email: jenny.willi...@kew.org
David Winsemius
Alameda, CA, USA
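When a "could not find function" error appears even though the package seems to be installed, one quick check is whether the package actually exports a function of that name. A minimal base-R sketch; the helper name fun_available is hypothetical, and the stats package is used only because it is always present (the same call could be made with "spatial_sync_raster" and "spatial.tools" or "climstats" on a machine where those are installed).

```r
# Hypothetical helper: TRUE if package 'pkg' is installed and exports 'fun'
fun_available <- function(fun, pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(FALSE)                      # package not installed / not loadable
  }
  fun %in% getNamespaceExports(pkg)    # exported names of the package
}

fun_available("median", "stats")                 # function exists and is exported
fun_available("no_such_function_xyz", "stats")   # package present, function absent
fun_available("median", "noSuchPackageXYZ")      # package not installed
```

If the check returns TRUE but the function still cannot be found, the package is probably installed but not attached, and library(pkg) or pkg::fun() should resolve it.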
Re: [R] Trying to avoid nested loop
Thank you for your answer. This is what I needed.

From: s.elli...@lgcgroup.com
To: r-help@r-project.org
Date: Fri, 4 Oct 2013 15:13:49 +0100
Subject: Re: [R] Trying to avoid nested loop

I'm trying to avoid using nested loops in the following code but I'm not sure how to proceed. Any help would be greatly appreciated.

With regards,
Phil

    X = matrix(rnorm(100), 10, 10)
    result = 0
    for(m in 1:nrow(X)){
        for(n in 1:ncol(X)){
            if(X[m,n] != 0){
                result = result + (X[m,n] / (1 + abs(m - n)))
            }
        }
    }

First, you don't need the 'if', do you? If X[m,n]==0 (rare for a floating point number), X[m,n] / (1 + abs(m - n)) will be zero anyway.

Then, depending on the matrix size, you can probably do the whole thing using an index array. Something like:

    idx <- as.matrix( expand.grid(1:nrow(X), 1:ncol(X)) )
    result <- sum( X[idx] / apply(idx,1, function(x) 1+abs(diff(x))) )

#... which seemed to do exactly the same thing as your loop when I tried it.

S Ellison
Re: [R] time series has no or less than 2 periods
Bill,

Thanks for replying. The data is a weekly time series. Assume there are 52 weeks in the year. Of the 52 weeks, I typically only have data for weeks 8 through 40.

4-Apr-10, 8, 27.2
11-Apr-10, 9, 32.3
18-Apr-10, 10, 31.7
DataXYZ, 40, 13.4

data <- c(0,24.57,29.93,24.19,12.25,48.07,36.68,24.78,48.69,30.39,48.17,36.51,36.43,36.52,48.75,24.17,37.07,0,18.89)
ts <- ts(data = data, start = 8, end = 40, frequency = )

There is a weekly seasonality effect. What should I set my frequency value to?

Thanks,
Dan Hickman

From: William Dunlap
Sent: Thursday, October 3, 2013 3:57 PM
To: David Winsemius, Daniel Hickman
Cc: r-help@r-project.org

> ts <- ts(data$QtyPerWeek, frequency=52)
> HoltWinters(ts,0.46924,0.05,0.2)
>
> This results in the following error.
>
> Error in decompose(ts(x[1L:wind], start = start(x), frequency = f), seasonal) :
>   time series has no or less than 2 periods

Since you have set the frequency of the time series to 52, you need to have 104 observations to get the initial estimate of the seasonal pattern. How many observations are in 'ts'? If you don't have enough you can omit the seasonal component (HoltWinters(gamma=FALSE,...)), change start.periods from the default 2 to 1, or supply a 52-long vector of the initial seasonal pattern as the s.start argument. If you do have more than 104 observations then you will have to tell us more about the data.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David Winsemius
Sent: Thursday, October 03, 2013 12:39 PM
To: Daniel Hickman
Cc: r-help@r-project.org
Subject: Re: [R] time series has no or less than 2 periods

On Oct 3, 2013, at 8:32 AM, Daniel Hickman wrote:

> Hello, I have been tasked with taking an Excel file in which my colleague had implemented Triple Exponential Smoothing and recreating it using R. The following image shows the before and after of smoothing out a fixed-interval time series using Triple Exponential Smoothing inside of Excel.
>
> enter image description here

The image file formats that I know are acceptable are .ps, .pdf or .png. Not sure about jpeg.

> I am trying to perform the same triple exponential smoothing in R. I created a csv file with the before-smoothing data. The csv file is attached and can also be found here.

Need to send with .txt extension.

> I found the HoltWinters method but I keep getting an error when I try to apply HoltWinters against the csv.
>
> setwd("C:/temp")
> data <- read.table("TripleExpSmoothingXLS.csv", header=TRUE, sep=",")
> ts <- ts(data$QtyPerWeek, frequency=52)
> HoltWinters(ts,0.46924,0.05,0.2)
>
> This results in the following error.
>
> Error in decompose(ts(x[1L:wind], start = start(x), frequency = f), seasonal) :
>   time series has no or less than 2 periods

Perhaps a data entry problem. We would need to see either the file or the output of str(data).

> In case it helps, the Excel file with the triple exponential smoothing formulas and original data can be found here.

Again there is no here here.

> Any advice? Thanks, Dan

David Winsemius
Alameda, CA, USA
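Bill Dunlap's point about needing two full periods can be checked with the 19 values Dan posted. The sketch below is only an illustration: the variable names (qty, seasonal_fit, trend_fit) are made up, and the smoothing parameters are simply the first two values from Dan's original call, not recommended settings.

```r
# The 19 weekly values posted in the thread
qty <- c(0, 24.57, 29.93, 24.19, 12.25, 48.07, 36.68, 24.78, 48.69, 30.39,
         48.17, 36.51, 36.43, 36.52, 48.75, 24.17, 37.07, 0, 18.89)
x <- ts(qty, frequency = 52)

# With frequency = 52 the seasonal start values need 2 * 52 = 104
# observations, so the seasonal model stops with an error
seasonal_fit <- try(HoltWinters(x, 0.46924, 0.05, 0.2), silent = TRUE)
print(inherits(seasonal_fit, "try-error"))

# Omitting the seasonal component (gamma = FALSE) fits level and trend only
trend_fit <- HoltWinters(x, alpha = 0.46924, beta = 0.05, gamma = FALSE)
print(trend_fit$SSE)
```

Whether dropping the seasonal term is acceptable depends on the data; with only one partial season per year there is nothing in a single year's series from which a 52-week pattern could be estimated.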
Re: [R] Trying to avoid nested loop
Hi,

set.seed(49)
X = matrix(rnorm(100), 10, 10)
X1 <- X
result <- 0
for(m in 1:nrow(X)){
  for(n in 1:ncol(X)){
    if(X[m,n] != 0){
      result = result + (X[m,n] / (1 + abs(m - n)))
    }
  }
}

indx <- which(X != 0, arr.ind=TRUE)
indx1 <- 1 + abs(indx[,1] - indx[,2])
X1[indx] <- X1[indx]/indx1

#or
res1 <- sapply(seq_len(nrow(X)), function(m) do.call(rbind, lapply(seq_len(ncol(X)), function(n) {if(X[m,n] != 0) X[m,n]/(1 + abs(m-n))})))
res2 <- t(X1)
identical(res1,res2)
#[1] TRUE
res3 <- sum(res2)
all.equal(res3,result)
#[1] TRUE

A.K.

----- Original Message -----
From: philippe massicotte pmassico...@hotmail.com
To: r-help@r-project.org
Cc:
Sent: Friday, October 4, 2013 9:49 AM
Subject: [R] Trying to avoid nested loop

> Dear R users. I'm trying to avoid using nested loops in the following code but I'm not sure how to proceed. Any help would be greatly appreciated.
> With regards, Phil
>
> X = matrix(rnorm(100), 10, 10)
> ## Version with nested loops
> result = 0
> for(m in 1:nrow(X)){
>   for(n in 1:ncol(X)){
>     if(X[m,n] != 0){
>       result = result + (X[m,n] / (1 + abs(m - n)))
>     }
>   }
> }
> ## No loop-sum(ifelse(M 0, M/??? , 0))
Re: [R] Tex-mining in R
Thanks Ista,

Can you please suggest any useful link(s) which explain the RcmdrPlugin.temis and tm packages, other than the CRAN ones?

Thanks,
Umesh

On Tue, Oct 1, 2013 at 11:40 PM, Ista Zahn istaz...@gmail.com wrote:

> Do you know about task views? Try http://cran.r-project.org/web/views/NaturalLanguageProcessing.html
>
> Best, Ista
>
> On Tue, Oct 1, 2013 at 6:06 AM, umesh khatri khatriumes...@gmail.com wrote:
>
>> Can anyone please guide me on any useful links or resource regarding text mining in R?
>>
>> -- Regards Umesh Khatri

-- Regards Umesh Khatri
Re: [R] R-help Digest, Vol 128, Issue 5
Hi Peter,

The ssconvert tool (part of gnumeric) is very good at converting spreadsheets to csv files. There is a wrapper in the gnumeric package on CRAN.

Cheers,
Thomas

Date: Fri, 4 Oct 2013 09:08:50 +0100
From: Barry Rowlingson b.rowling...@lancaster.ac.uk
To: Peter Maclean pmaclean2...@yahoo.com
Cc: r-help@r-project.org
Subject: Re: [R] Importing odf file into R

On Fri, Oct 4, 2013 at 5:57 AM, Peter Maclean pmaclean2...@yahoo.com wrote:

> Anyone aware of a package or technique to import odf data file into R, I will appreciate his/her help.

A quick scan of R-help points me here: http://www.omegahat.org/ROpenOffice/

Reports of that working (or not) would be appreciated - I've not tried it. Alternatively, there's python code for reading ODF so you could interface to that in a number of ways.

Barry
[R] Tab Separated File Reading Error
Hello,

I have a seemingly simple problem: a tab-delimited file can't be read in.

annoTranscripts <- read.table("matched.txt", sep = '\t', stringsAsFactors = FALSE)
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
  line 5933 did not have 12 elements

However, all lines do have 12 columns.

lines <- readLines("matched.txt")
tabsPosns <- gregexpr("\t", lines)
table(sapply(tabsPosns, length))

    11
367274

system("wc -l matched.txt")
367274 matched.txt

You can obtain the file from https://dl.dropboxusercontent.com/u/37992150/matched.txt

The line does not contain comment or quote characters. What can you suggest?

sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8
 [5] LC_MONETARY=en_AU.UTF-8    LC_MESSAGES=en_AU.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base

loaded via a namespace (and not attached):
[1] tools_3.0.1

--
Dario Strbenac
PhD Student
University of Sydney
Camperdown NSW 2050 Australia
[R] Subsetting Timestamped data
Hi,

I have a data frame, data, containing two columns: the first is the TimeStamp, formatted using

data$TimeStamp <- as.POSIXct(as.character(data$TimeStamp), format = "%d/%m/%Y %H:%M")

and the second is the data value. The data frame has been read from a .csv file and should contain 48 values for each day of the year (values sampled at 30 minute intervals). However, there are only 15,948 observations, i.e. only approx. 332 days' worth of data. I would therefore like to remove any days that do not contain all 48 values.

My question: how would I go about doing this?

Many thanks,
-A.
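One possible approach, offered as a sketch rather than an answer from the list: count the readings per calendar day and keep only the rows whose day has all 48. The data below are made up for illustration (three days with five readings deleted from the second), and the names dat, day, n_per_day and complete are hypothetical; the UTC time zone is also an assumption.

```r
# Hypothetical data: three days at 30-minute resolution
stamps <- seq(as.POSIXct("2013-01-01 00:00", tz = "UTC"),
              by = "30 min", length.out = 3 * 48)
dat <- data.frame(TimeStamp = stamps, value = seq_along(stamps))
dat <- dat[-(50:54), ]   # the second day now has only 43 readings

# Calendar day of each reading, as a character key
day <- format(dat$TimeStamp, "%Y-%m-%d", tz = "UTC")

# ave() returns, for every row, the number of rows sharing its day
n_per_day <- ave(seq_along(day), day, FUN = length)

# Keep only rows that belong to a complete day
complete <- dat[n_per_day == 48, ]
print(table(format(complete$TimeStamp, "%Y-%m-%d", tz = "UTC")))
```

If the timestamps were recorded in local time with daylight saving changes, some days would legitimately have 46 or 50 readings, so the time zone used to derive the day matters.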
[R] abline is not plotting
Hello there,

I have some data I want to plot together with a best-fit line (see MWE below). The points from the first plot appear as expected, but the abline does not appear, no matter what I change. I removed the log parameter before, but then the abline is a very steep line around the origin. I really want to keep the logarithmic scale, plus a working abline.

Can someone help me with that? What am I doing wrong?

Thanks in advance,
Hans

d = data.frame(
  x = c(154471, 517423, 704286, 236117, 10664898, 21887, 104994, 794101, 289567, 74818, 63920, 251053, 263583, 84882, 55075, 741076, 92000, 137799, 59856, 184992, 8292355),
  y = c(624, 1681, 590, 2073, 12189, 42, 343, 365, 969, 108, 366, 1664, 738, 420, 318, 1278, 887, 395, 462, 1376, 17907)
)
plot(d, log = "xy")
abline(lm(x ~ y, data = d))

--
View this message in context: http://r.789695.n4.nabble.com/abline-is-not-plotting-tp4677583.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] abline is not plotting
Well, you logged the x and y values before plotting but did not log the lm(). I think this means you have plotted abline() off the scale. I'm not sure how to fix it though.

John Kane
Kingston ON Canada

> -----Original Message-----
> From: hans_han...@gmx.de
> Sent: Fri, 4 Oct 2013 07:16:49 -0700 (PDT)
> To: r-help@r-project.org
> Subject: [R] abline is not plotting
>
> Hello there, I have some data I want to plot together with a best-fit line (see MWE below). The points from the first plot appear as expected, but the abline does not appear, no matter what I change. I removed the log parameter before, but then the abline is a very steep line around the origin. I really want to keep the logarithmic scale, plus a working abline. Can someone help me with that? What am I doing wrong?
>
> Thanks in advance, Hans
>
> d = data.frame(
>   x = c(154471, 517423, 704286, 236117, 10664898, 21887, 104994, 794101, 289567, 74818, 63920, 251053, 263583, 84882, 55075, 741076, 92000, 137799, 59856, 184992, 8292355),
>   y = c(624, 1681, 590, 2073, 12189, 42, 343, 365, 969, 108, 366, 1664, 738, 420, 318, 1278, 887, 395, 462, 1376, 17907)
> )
> plot(d, log = "xy")
> abline(lm(x ~ y, data = d))
Re: [R] time series has no or less than 2 periods
Perhaps looking at your data will suggest an appropriate number, viz.

plot(data, type="b", xlim=c(0,20), ylim=c(0,50))
par(new=T)
ind <- 1:19 # in this case where the data length is 19
data.ind <- data.frame(ind, data)
data.lo <- loess(data~ind, data.ind)
data.pre <- predict(data.lo, data.frame(ind = seq(1,19,1)))
plot(data.pre, pch=3, col=2, xlim=c(0,20), ylim=c(0,50))

If you now plot the difference between the data and the loess prediction,

data.ind <- cbind(data.ind, data.pre)
data.diff <- with(data.ind, data - data.pre)
data.ind <- cbind(data.ind, data.diff)
with(data.ind, plot(ind, data.diff, type="b"))
abline(h=0)

there is also a pretty strong two-week signal. Is that of any interest? Now you should be able to decide how to proceed.

Clint

Clint Bowman            INTERNET: cl...@ecy.wa.gov
Air Quality Modeler     INTERNET: cl...@math.utah.edu
Department of Ecology   VOICE:    (360) 407-6815
PO Box 47600            FAX:      (360) 407-7534
Olympia, WA 98504-7600
USPS:    PO Box 47600, Olympia, WA 98504-7600
Parcels: 300 Desmond Drive, Lacey, WA 98503-1274

On Fri, 4 Oct 2013, Daniel Hickman wrote:

> Bill, Thanks for replying. The data is a weekly time series. Assume there are 52 weeks in the year. Of the 52 weeks, I typically only have data for weeks 8 through 40.
>
> 4-Apr-10, 8, 27.2
> 11-Apr-10, 9, 32.3
> 18-Apr-10, 10, 31.7
> DataXYZ, 40, 13.4
>
> data <- c(0,24.57,29.93,24.19,12.25,48.07,36.68,24.78,48.69,30.39,48.17,36.51,36.43,36.52,48.75,24.17,37.07,0,18.89)
> ts <- ts(data = data, start = 8, end = 40, frequency = )
>
> There is a weekly seasonality effect. What should I set my frequency value to?
>
> Thanks, Dan Hickman
>
> From: William Dunlap
> Sent: Thursday, October 3, 2013 3:57 PM
> To: David Winsemius, Daniel Hickman
> Cc: r-help@r-project.org
>
>> ts <- ts(data$QtyPerWeek, frequency=52)
>> HoltWinters(ts,0.46924,0.05,0.2)
>> This results in the following error.
>> Error in decompose(ts(x[1L:wind], start = start(x), frequency = f), seasonal) :
>>   time series has no or less than 2 periods
>
> Since you have set the frequency of the time series to 52, you need to have 104 observations to get the initial estimate of the seasonal pattern. How many observations are in 'ts'? If you don't have enough you can omit the seasonal component (HoltWinters(gamma=FALSE,...)), change start.periods from the default 2 to 1, or supply a 52-long vector of the initial seasonal pattern as the s.start argument. If you do have more than 104 observations then you will have to tell us more about the data.
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
Re: [R] abline is not plotting
> I have some data I want to plot together with a best-fit line. (see MWE below)
> ...
> Can someone help me with that? What am I doing wrong?

Not logging the lm. Also, you've calculated lm() the wrong way round; you've regressed x on y. Try

plot(log(d), xlab="log(x)", ylab="log(y)")
abline(lm(y ~ x, data = log(d)))

S Ellison
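If the logarithmic axes themselves should be kept, as Hans asked, a variant that goes slightly beyond S Ellison's reply is to fit on the base-10 log scale and draw the line on a plot made with log = "xy". This relies on abline() drawing, by default (untf = FALSE), a straight line in the transformed coordinate system, which for log axes is log10. The name fit is chosen only for this sketch.

```r
d <- data.frame(
  x = c(154471, 517423, 704286, 236117, 10664898, 21887, 104994, 794101,
        289567, 74818, 63920, 251053, 263583, 84882, 55075, 741076, 92000,
        137799, 59856, 184992, 8292355),
  y = c(624, 1681, 590, 2073, 12189, 42, 343, 365, 969, 108, 366, 1664,
        738, 420, 318, 1278, 887, 395, 462, 1376, 17907)
)

# Regress y on x, both on the log10 scale used by log-scaled axes
fit <- lm(log10(y) ~ log10(x), data = d)

# Axes stay logarithmic with tick labels in the original units
plot(d, log = "xy")

# The log10-scale intercept and slope draw a straight line on the log-log plot
abline(fit)

print(coef(fit))
```

Natural logs would give the same slope but a different intercept, so a fit made with log() should not be passed to abline() on axes produced by log = "xy".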
[R] quote a column of a dataframe by its name
Dear All,

I have a question. Suppose X is a data frame with column names x1, x2, x3, ..., and I would like to use the i-th column via X[,'xi']. But it seems that single quotes and double quotes are different, so if I run X[, names(X)[i]], I get an error. Please use the example code below:

X = matrix(rnorm(50), ncol = 5)
X = data.frame(X)
names(X) = c("x1","x2","x3","x4","x5")
# pick the 4th column
X[,'x4']             # working
X[,names(X)[4]]      # not working, so how to modify this line?
names(X)[4]          # returns "x4"
sQuote(names(X)[4])  # returns 'x4'

Best,
Re: [R] quote a column of a dataframe by its name
Sorry, this sample code seems to be OK. I will look into my original problem and update it soon.

Best wishes,

On Fri, Oct 4, 2013 at 12:06 PM, Jie jimmycl...@gmail.com wrote:

> Dear All, I have a question. Suppose X is a data frame with column names x1, x2, x3, ..., and I would like to use the i-th column via X[,'xi']. But it seems that single quotes and double quotes are different, so if I run X[, names(X)[i]], I get an error. Please use the example code below:
>
> X = matrix(rnorm(50), ncol = 5)
> X = data.frame(X)
> names(X) = c("x1","x2","x3","x4","x5")
> # pick the 4th column
> X[,'x4']             # working
> X[,names(X)[4]]      # not working, so how to modify this line?
> names(X)[4]          # returns "x4"
> sQuote(names(X)[4])  # returns 'x4'
>
> Best,
Re: [R] quote a column of a dataframe by its name
Hello,

I had no problems running your code, and there shouldn't be any. What exactly do you mean by "not working"?

Hope this helps,
Rui Barradas

On 04-10-2013 17:06, Jie wrote:

> Dear All, I have a question. Suppose X is a data frame with column names x1, x2, x3, ..., and I would like to use the i-th column via X[,'xi']. But it seems that single quotes and double quotes are different, so if I run X[, names(X)[i]], I get an error. Please use the example code below:
>
> X = matrix(rnorm(50), ncol = 5)
> X = data.frame(X)
> names(X) = c("x1","x2","x3","x4","x5")
> # pick the 4th column
> X[,'x4']             # working
> X[,names(X)[4]]      # not working, so how to modify this line?
> names(X)[4]          # returns "x4"
> sQuote(names(X)[4])  # returns 'x4'
>
> Best,
Re: [R] quote a column of a dataframe by its name
X[,names(X)[4]] works fine for me. I had never thought of doing this. Neat idea.

John Kane
Kingston ON Canada

> -----Original Message-----
> From: jimmycl...@gmail.com
> Sent: Fri, 4 Oct 2013 12:06:50 -0400
> To: r-help@r-project.org
> Subject: [R] quote a column of a dataframe by its name
>
> Dear All, I have a question. Suppose X is a data frame with column names x1, x2, x3, ..., and I would like to use the i-th column via X[,'xi']. But it seems that single quotes and double quotes are different, so if I run X[, names(X)[i]], I get an error. Please use the example code below:
>
> X = matrix(rnorm(50), ncol = 5)
> X = data.frame(X)
> names(X) = c("x1","x2","x3","x4","x5")
> # pick the 4th column
> X[,'x4']             # working
> X[,names(X)[4]]      # not working, so how to modify this line?
> names(X)[4]          # returns "x4"
> sQuote(names(X)[4])  # returns 'x4'
>
> Best,
Re: [R] Tab Separated File Reading Error
> annoTranscripts <- read.table("matched.txt", sep = '\t', stringsAsFactors = FALSE)
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>   line 5933 did not have 12 elements
>
> However, all lines do have 12 columns.
>
> lines <- readLines("matched.txt")
> ...[many omitted lines]...
> The line does not contain comment or quote characters. What can you suggest?

I suggest looking at the lines preceding the one where the error was found, with both print and cat:

print(lines[5933 - (10:0)])
cat(lines[5933 - (10:0)], sep="\n")

If things are not obvious after looking at them, see if read.table can read just those lines:

read.table(text=lines[5933 - (10:0)], sep="\t", stringsAsFactors=FALSE)

If it can, try backing up more than 10 lines.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Dario Strbenac
> Sent: Friday, October 04, 2013 5:01 AM
> To: r-help@r-project.org
> Subject: [R] Tab Separated File Reading Error
>
> Hello, I have a seemingly simple problem: a tab-delimited file can't be read in.
>
> annoTranscripts <- read.table("matched.txt", sep = '\t', stringsAsFactors = FALSE)
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>   line 5933 did not have 12 elements
>
> However, all lines do have 12 columns.
>
> lines <- readLines("matched.txt")
> tabsPosns <- gregexpr("\t", lines)
> table(sapply(tabsPosns, length))
>
>     11
> 367274
>
> system("wc -l matched.txt")
> 367274 matched.txt
>
> You can obtain the file from https://dl.dropboxusercontent.com/u/37992150/matched.txt
>
> The line does not contain comment or quote characters. What can you suggest?
>
> --
> Dario Strbenac
> PhD Student
> University of Sydney
> Camperdown NSW 2050 Australia
Re: [R] quote a column of a dataframe by its name
Is there ever a case where X[,names(X)[4]] would give a different result than X[,4]? Or is this just a case of "the longest distance between any 2 points is a shortcut"?

Well, I guess if X has non-unique names then you might see a difference, but having a data frame with non-unique names and using the longer version above is more likely to be asking for trouble than doing anything useful.

On Fri, Oct 4, 2013 at 10:15 AM, John Kane jrkrid...@inbox.com wrote:

> X[,names(X)[4]] works fine for me. I had never thought of doing this. Neat idea.
>
> John Kane
> Kingston ON Canada
>
>> -----Original Message-----
>> From: jimmycl...@gmail.com
>> Sent: Fri, 4 Oct 2013 12:06:50 -0400
>> To: r-help@r-project.org
>> Subject: [R] quote a column of a dataframe by its name
>>
>> Dear All, I have a question. Suppose X is a data frame with column names x1, x2, x3, ..., and I would like to use the i-th column via X[,'xi']. But it seems that single quotes and double quotes are different, so if I run X[, names(X)[i]], I get an error. Please use the example code below:
>>
>> X = matrix(rnorm(50), ncol = 5)
>> X = data.frame(X)
>> names(X) = c("x1","x2","x3","x4","x5")
>> # pick the 4th column
>> X[,'x4']             # working
>> X[,names(X)[4]]      # not working, so how to modify this line?
>> names(X)[4]          # returns "x4"
>> sQuote(names(X)[4])  # returns 'x4'
>>
>> Best,

--
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com
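Greg's remark about non-unique names can be illustrated with a small made-up data frame. This is only a sketch: the frame is hypothetical and is built with check.names = FALSE precisely so that the duplicate name survives; by_name and by_position are illustrative names.

```r
# Hypothetical data frame with a deliberately duplicated column name
X <- data.frame(a = 1:3, a = 4:6, check.names = FALSE)
print(names(X))

# Indexing by the *name* of column 2 finds the first column called "a"
by_name <- X[, names(X)[2]]

# Indexing by position really does return the second column
by_position <- X[, 2]

print(by_name)
print(by_position)
```

With unique names, which data.frame() enforces by default, the two forms select the same column.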
Re: [R] Tab Separated File Reading Error
Hi,

Try:

annoTranscripts <- read.csv("matched.txt", sep = '\t', stringsAsFactors = FALSE, quote="", header=FALSE)
str(annoTranscripts)
'data.frame': 367274 obs. of 12 variables:
 $ V1 : chr "comp103529_c0_seq1" "comp129123_c0_seq1" "comp129123_c0_seq1" "comp129124_c0_seq1" ...
 $ V2 : chr "XM_003723822" "XM_778057" "EU116908" "XM_786928" ...
 $ V3 : chr "PREDICTED: Strongylocentrotus purpuratus neuromedin-U receptor 2-like (LOC100888633), mRNA" "PREDICTED: Strongylocentrotus purpuratus 60S ribosomal protein L30-like (LOC577852), mRNA" "Barentsia elongata putative ribosomal protein L30 mRNA, complete cds" "PREDICTED: Strongylocentrotus purpuratus 60S ribosomal protein L29-1-like (LOC587182), mRNA" ...
 $ V4 : int 91 392 69 149 149 451 399 203 193 185 ...
 $ V5 : int 136 479 203 209 209 541 463 451 456 472 ...
 $ V6 : int 15 16 40 20 20 24 20 71 83 85 ...
 $ V7 : int 0 11 4 0 0 5 1 10 4 9 ...
 $ V8 : num 2e-38 0e+00 6e-26 2e-70 2e-70 ...
 $ V9 : int 1 22 210 135 135 131 189 205 196 185 ...
 $ V10: int 136 499 410 343 343 669 650 650 649 653 ...
 $ V11: int 576 159 27 1 1 1 21 23 140 22 ...
 $ V12: int 441 627 227 209 209 538 483 468 593 487 ...
dim(annoTranscripts)
[1] 367274     12

A.K.

----- Original Message -----
From: Dario Strbenac dstr7...@uni.sydney.edu.au
To: r-help@r-project.org
Cc:
Sent: Friday, October 4, 2013 8:00 AM
Subject: [R] Tab Separated File Reading Error

> Hello, I have a seemingly simple problem: a tab-delimited file can't be read in.
>
> annoTranscripts <- read.table("matched.txt", sep = '\t', stringsAsFactors = FALSE)
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>   line 5933 did not have 12 elements
>
> However, all lines do have 12 columns.
>
> lines <- readLines("matched.txt")
> tabsPosns <- gregexpr("\t", lines)
> table(sapply(tabsPosns, length))
>
>     11
> 367274
>
> system("wc -l matched.txt")
> 367274 matched.txt
>
> You can obtain the file from https://dl.dropboxusercontent.com/u/37992150/matched.txt
>
> The line does not contain comment or quote characters. What can you suggest?
>
> --
> Dario Strbenac
> PhD Student
> University of Sydney
> Camperdown NSW 2050 Australia
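The thread does not say why quote="" helps. A likely explanation, stated here as an assumption about the file rather than a verified fact, is that some description field contains an apostrophe: read.table treats both " and ' as quote characters by default, so a stray apostrophe can open a quoted string that swallows tabs and line breaks. The three-line text below is made up to show the effect; txt, fields_default, fields_noquote and good are illustrative names.

```r
# Hypothetical tab-delimited text with apostrophes inside two fields
txt <- "id1\tprotein 5'-end\t10\nid2\tplain\t20\nid3\tother 3'-end\t30\n"

# Field counts per line with the default quote set, which includes "'"
fields_default <- count.fields(textConnection(txt), sep = "\t")

# Field counts with quoting switched off
fields_noquote <- count.fields(textConnection(txt), sep = "\t", quote = "")

print(fields_default)
print(fields_noquote)

# With quoting disabled every line parses as three tab-separated fields
good <- read.table(text = txt, sep = "\t", quote = "",
                   stringsAsFactors = FALSE)
print(good)
```

Counting tabs with gregexpr(), as in the original question, ignores quoting altogether, which would explain how every line can contain 11 tabs while read.table still reports a short line.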
Re: [R] Drawing garbled
Do you have the correct fonts installed on Windows?

John Kane
Kingston ON Canada

-----Original Message-----
From: cels...@163.com
Sent: Wed, 2 Oct 2013 23:51:58 +0800 (CST)
To: r-help@r-project.org
Subject: [R] Drawing garbled

Hi,

I am Chinese. I am developing a Java application deployed to Tomcat 7.0.42, and I use rJava to call R. When I start Tomcat from the command line, the R drawing is rendered correctly (see attachment histview.png), but when I start Tomcat as a Windows service, the R drawing comes out garbled (see attachment histview2.png). I don't know why. Can you give me a suggestion? My OS is Windows 7 Home. Thanks for your help!
[R] Applying and labeling scenarios
Folks,

I have a working version of the code below for my real world problem. I was wondering if there is a better way to do this. I have 3 sets of parameters and I want to iterate over all combinations and label the results. Would some of the plyr tools make this easier?

# Input Scenarios
scens <- list(A = c(.2, .3), B = c(.2, .4), C = 2:3)

# All combinations
allPerms <- expand.grid(scens)

# Create labels
labels <- sapply(1:nrow(allPerms), function(x, y) paste0("[", paste0(y[x,], collapse = "|"), "]"), allPerms)

# Simple example computation
vals <- sapply(1:nrow(allPerms), function(x, y) cumsum(t(y[x,])), allPerms)

# Apply labels to columns
colnames(vals) <- labels

# Beautiful output!
dput(vals)
structure(c(0.2, 0.4, 2.4, 0.3, 0.5, 2.5, 0.2, 0.6, 2.6, 0.3, 0.7, 2.7, 0.2, 0.4, 3.4, 0.3, 0.5, 3.5, 0.2, 0.6, 3.6, 0.3, 0.7, 3.7), .Dim = c(3L, 8L), .Dimnames = list(NULL, c("[0.2|0.2|2]", "[0.3|0.2|2]", "[0.2|0.4|2]", "[0.3|0.4|2]", "[0.2|0.2|3]", "[0.3|0.2|3]", "[0.2|0.4|3]", "[0.3|0.4|3]")))

Thanks for your time,
KW
--
Re: [R] Applying and labeling scenarios
labels1 <- paste0("[", with(allPerms, paste(A, B, C, sep = "|")), "]")
vals2 <- as.matrix(cumsum(as.data.frame(t(allPerms))))
dimnames(vals2) <- list(NULL, labels1)
all.equal(vals, vals2)
#[1] TRUE

A.K.

----- Original Message -----
From: Keith S Weintraub kw1...@gmail.com
To: r-help@r-project.org
Sent: Friday, October 4, 2013 2:38 PM
Subject: [R] Applying and labeling scenarios

Folks,

I have a working version of the code below for my real world problem. I was wondering if there is a better way to do this. I have 3 sets of parameters and I want to iterate over all combinations and label the results. Would some of the plyr tools make this easier?

# Input Scenarios
scens <- list(A = c(.2, .3), B = c(.2, .4), C = 2:3)

# All combinations
allPerms <- expand.grid(scens)

# Create labels
labels <- sapply(1:nrow(allPerms), function(x, y) paste0("[", paste0(y[x,], collapse = "|"), "]"), allPerms)

# Simple example computation
vals <- sapply(1:nrow(allPerms), function(x, y) cumsum(t(y[x,])), allPerms)

# Apply labels to columns
colnames(vals) <- labels

# Beautiful output!
dput(vals)
structure(c(0.2, 0.4, 2.4, 0.3, 0.5, 2.5, 0.2, 0.6, 2.6, 0.3, 0.7, 2.7, 0.2, 0.4, 3.4, 0.3, 0.5, 3.5, 0.2, 0.6, 3.6, 0.3, 0.7, 3.7), .Dim = c(3L, 8L), .Dimnames = list(NULL, c("[0.2|0.2|2]", "[0.3|0.2|2]", "[0.2|0.4|2]", "[0.3|0.4|2]", "[0.2|0.2|3]", "[0.3|0.2|3]", "[0.2|0.4|3]", "[0.3|0.4|3]")))

Thanks for your time,
KW
--
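For comparison, here is a base-R sketch of the same idea that does not hard-code the parameter names A, B and C. It is not from the thread; the names labels2 and vals3 are introduced only for this illustration.

```r
# Same inputs as in the original post.
scens <- list(A = c(.2, .3), B = c(.2, .4), C = 2:3)
allPerms <- expand.grid(scens)

# Paste all columns row-wise, whatever they are called.
labels2 <- paste0("[", do.call(paste, c(allPerms, sep = "|")), "]")

# apply() over rows returns one column per scenario.
vals3 <- apply(allPerms, 1, cumsum)
colnames(vals3) <- labels2
vals3
```

Unlike the sapply version, apply keeps the parameter names as row names, so a comparison with the original vals needs all.equal(vals, vals3, check.attributes = FALSE) or unname().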
[R] Why 'gbm' is not giving me error when I change the response from numeric to categorical?
This reproducible example is from the help of 'gbm' in R. I ran the following code in R, and it works fine as long as the response is numeric. The problem starts when I convert the response from numeric to binary (0/1): it gives me an error. My question is, can converting the response from numeric to binary have this much effect?

Help page code:

library(gbm) # needed for gbm()

N <- 1000
X1 <- runif(N)
X2 <- 2*runif(N)
X3 <- ordered(sample(letters[1:4],N,replace=TRUE),levels=letters[4:1])
X4 <- factor(sample(letters[1:6],N,replace=TRUE))
X5 <- factor(sample(letters[1:3],N,replace=TRUE))
X6 <- 3*runif(N)
mu <- c(-1,0,1,2)[as.numeric(X3)]

SNR <- 10 # signal-to-noise ratio
Y <- X1**1.5 + 2 * (X2**.5) + mu
sigma <- sqrt(var(Y)/SNR)
Y <- Y + rnorm(N,0,sigma)

# introduce some missing values
X1[sample(1:N,size=500)] <- NA
X4[sample(1:N,size=300)] <- NA

data <- data.frame(Y=Y,X1=X1,X2=X2,X3=X3,X4=X4,X5=X5,X6=X6)

# fit initial model
gbm1 <- gbm(Y~X1+X2+X3+X4+X5+X6, # formula
    data=data,                   # dataset
    var.monotone=c(0,0,0,0,0,0), # -1: monotone decrease,
                                 # +1: monotone increase,
                                 #  0: no monotone restrictions
    distribution="gaussian",     # see the help for other choices
    n.trees=1000,                # number of trees
    shrinkage=0.05,              # shrinkage or learning rate,
                                 # 0.001 to 0.1 usually work
    interaction.depth=3,         # 1: additive model, 2: two-way interactions, etc.
    bag.fraction = 0.5,          # subsampling fraction, 0.5 is probably best
    train.fraction = 0.5,        # fraction of data for training,
                                 # first train.fraction*N used for training
    n.minobsinnode = 10,         # minimum total weight needed in each node
    cv.folds = 3,                # do 3-fold cross-validation
    keep.data=TRUE,              # keep a copy of the dataset with the object
    verbose=FALSE)               # don't print out progress

gbm1
summary(gbm1)

Now I slightly change the response variable to make it binary.

Y[Y < mean(Y)] = 0  #My edit
Y[Y >= mean(Y)] = 1 #My edit
data <- data.frame(Y=Y,X1=X1,X2=X2,X3=X3,X4=X4,X5=X5,X6=X6)
fmla = as.formula(factor(Y)~X1+X2+X3+X4+X5+X6) #My edit

gbm2 <- gbm(fmla,                # formula
    data=data,                   # dataset
    distribution="bernoulli",    # My edit
    n.trees=1000,                # number of trees
    shrinkage=0.05,              # shrinkage or learning rate,
                                 # 0.001 to 0.1 usually work
    interaction.depth=3,         # 1: additive model, 2: two-way interactions, etc.
    bag.fraction = 0.5,          # subsampling fraction, 0.5 is probably best
    train.fraction = 0.5,        # fraction of data for training,
                                 # first train.fraction*N used for training
    n.minobsinnode = 10,         # minimum total weight needed in each node
    cv.folds = 3,                # do 3-fold cross-validation
    keep.data=TRUE,              # keep a copy of the dataset with the object
    verbose=FALSE)               # don't print out progress

gbm2

> gbm2
gbm(formula = fmla, distribution = "bernoulli", data = data, n.trees = 1000, interaction.depth = 3, n.minobsinnode = 10, shrinkage = 0.05, bag.fraction = 0.5, train.fraction = 0.5, cv.folds = 3, keep.data = TRUE, verbose = FALSE)
A gradient boosted model with bernoulli loss function.
1000 iterations were performed.
The best cross-validation iteration was .
The best test-set iteration was .
Error in 1:n.trees : argument of length 0

My question is: will binarizing the response have so much effect that it does not find anything useful in the predictors?

Thanks

--
Mary Kindall
Yorktown Heights, NY
USA
Re: [R] Why 'gbm' is not giving me error when I change the response from numeric to categorical?
> My question is, Is binarizing the response will have so much effect that it does not find anything useful in the predictors?

Yes. Dichotomizing throws away most of the information in the data, which is why you shouldn't do it.

This is a statistics question, not an R question, so any follow-up should be posted on a statistical list like stats.stackexchange.com, not here.

-- Bert

On Fri, Oct 4, 2013 at 12:16 PM, Mary Kindall mary.kind...@gmail.com wrote:

> This reproducible example is from the help of 'gbm' in R. I ran the following code in R, and works fine as long as the response is numeric. The problem starts when I convert the response from numeric to binary (0/1). It gives me an error.
>
> My question is, is converting the response from numeric to binary will have this much effect.
>
> Thanks
>
> --
> Mary Kindall
> Yorktown Heights, NY
> USA

--
Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374
Re: [R] Why 'gbm' is not giving me error when I change the response from numeric to categorical?
On Oct 4, 2013, at 2:16 PM, Mary Kindall mary.kind...@gmail.com wrote:

> This reproducible example is from the help of 'gbm' in R. I ran the following code in R, and works fine as long as the response is numeric. The problem starts when I convert the response from numeric to binary (0/1). It gives me an error.
>
> My question is, Is binarizing the response will have so much effect that it does not find anything useful in the predictors?
>
> Thanks

Sure, it's possible. See this page for a good overview of why you should not dichotomize continuous data:

http://biostat.mc.vanderbilt.edu/wiki/Main/CatContinuous

Regards,

Marc Schwartz
Re: [R] Why 'gbm' is not giving me error when I change the response from numeric to categorical?
On Oct 4, 2013, at 21:16, Mary Kindall wrote:

> Y[Y < mean(Y)] = 0  #My edit
> Y[Y >= mean(Y)] = 1 #My edit

I have no clue about gbm, but I don't think the above does what I think you think it does.

Y <- as.integer(Y >= mean(Y))

might be closer to the mark.

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes@cbs.dk Priv: pda...@gmail.com
Re: [R] Why 'gbm' is not giving me error when I change the response from numeric to categorical?
On Oct 4, 2013, at 2:35 PM, peter dalgaard pda...@gmail.com wrote:

> On Oct 4, 2013, at 21:16, Mary Kindall wrote:
>
>> Y[Y < mean(Y)] = 0  #My edit
>> Y[Y >= mean(Y)] = 1 #My edit
>
> I have no clue about gbm, but I don't think the above does what I think you think it does.
>
> Y <- as.integer(Y >= mean(Y))
>
> might be closer to the mark.

Good catch Peter! I didn't pay attention to that initially. Here is an example:

set.seed(1)
Y <- rnorm(10)

> Y
 [1] -0.6264538  0.1836433 -0.8356286  1.5952808  0.3295078 -0.8204684
 [7]  0.4874291  0.7383247  0.5757814 -0.3053884

> mean(Y)
[1] 0.1322028

Before changing Y:

> Y[Y < mean(Y)]
[1] -0.6264538 -0.8356286 -0.8204684 -0.3053884

> Y[Y >= mean(Y)]
[1] 0.1836433 1.5952808 0.3295078 0.4874291 0.7383247 0.5757814

However, the incantation that Mary is using, which calculates mean(Y) separately in each call, results in:

Y[Y < mean(Y)] = 0

> Y
 [1] 0.0000000 0.1836433 0.0000000 1.5952808 0.3295078 0.0000000
 [7] 0.4874291 0.7383247 0.5757814 0.0000000

# mean(Y) is no longer the original value from above
> mean(Y)
[1] 0.3909967

Thus:

Y[Y >= mean(Y)] = 1

> Y
 [1] 0.0000000 0.1836433 0.0000000 1.0000000 0.3295078 0.0000000
 [7] 1.0000000 1.0000000 1.0000000 0.0000000

Some of the values in Y do not change, because the threshold for modifying the values moved when the mean was recalculated after the first set of values in Y had been changed. As Peter noted, you don't end up with a dichotomous vector.

Using Peter's method on the original Y:

Y <- as.integer(Y >= mean(Y))

> Y
 [1] 0 1 0 1 1 0 1 1 1 0

That being said, the original viewpoint stands, which is to not do this due to loss of information.

Regards,

Marc Schwartz
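The point made in this message can be condensed into a short runnable sketch. The variable names m, Y_bin and Y_two_step are introduced here only for illustration; the seed and data are the ones used in the message above.

```r
set.seed(1)
Y <- rnorm(10)

# Compute the threshold once, then compare: every element becomes 0 or 1.
m <- mean(Y)
Y_bin <- as.integer(Y >= m)

# Two-step version from the original post: the second mean(Y) is computed
# on the already modified vector, so the threshold shifts.
Y_two_step <- Y
Y_two_step[Y_two_step < mean(Y_two_step)] <- 0
Y_two_step[Y_two_step >= mean(Y_two_step)] <- 1

all(Y_bin %in% c(0, 1))        # TRUE
all(Y_two_step %in% c(0, 1))   # FALSE: some original values survive
```

Storing the threshold in a variable before any assignment is the design choice that matters; whether to dichotomize at all is the separate statistical question raised earlier in the thread.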
[R] SSweibull() : problems with step factor and singular gradient
Dear John Nash

Thank you very much for your inputs on how to fit a non-linear growth curve without running into a singular gradient. I wasn't aware of the package nlmrt, which indeed seems to provide very helpful functions. I'll try to figure out how nlxb() can be applied to my data (I am not sure yet whether I can actually apply the same initial parameters to all of my plants) and also how plotting of the fitted curve works with nlxb(). The easy way of using predict(fit) did not work in this case.

In addition, I am still wondering whether there is in fact no way of controlling the step factor in SSweibull.

Best regards,
Aline Frank
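On the step-factor question: as far as I know, the "step factor ... reduced below 'minFactor'" message comes from nls() itself rather than from the self-starting model, and the lower limit is set through nls.control(). The sketch below uses the Chick.6 data from the SSweibull help page, not the poster's plant data, and the control values are arbitrary choices for illustration. It does not address a singular gradient that arises while the self-start routine computes initial values.

```r
# Example data from the SSweibull help page: growth of one chick.
Chick.6 <- subset(ChickWeight, (Chick == 6) & (Time > 0))

# Allow more iterations and a much smaller minimum step factor
# (values chosen arbitrarily for this sketch).
ctrl <- nls.control(maxiter = 200, minFactor = 1e-10)

fit <- nls(weight ~ SSweibull(Time, Asym, Drop, lrc, pwr),
           data = Chick.6, control = ctrl)
coef(fit)
```

A smaller minFactor only lets the iteration take smaller steps before giving up; it is not a remedy for a model that is poorly identified by the data.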
[R] Web Scraping
Hello everybody,

I just started using R. I'm presenting a poster for R day at Kennesaw State University and I really need some help with web scraping. I'm trying to extract used car data from www.cars.com, including the mileage, year, model, make, price, CARFAX availability and Technology package availability. I've done some research, and everything points to the XML package and the RCurl package. I also got my hands on a function that captures all the text in the web page and stores it as a huge character vector. I've never done data mining before, so reading the help documents for the packages I mentioned earlier is like reading Chinese. I would appreciate it if you could guide me through this process of data extraction.

Here's an example of what the data would look like:

Cost    Year  Mileage  Tech  CARFAX  Make  Model
$32000  1999  57,987   1     FREE    Audi  A4

Here's the link to the search:
http://www.cars.com/for-sale/searchresults.action?stkTyp=Utracktype=usedccmkId=20049AmbMkId=20049AmbMkNm=Audimake=AudiAmbMdNm=A4model=A4mdId=20596AmbMdId=20596rd=100zc=30062searchSource=QUICK_FORMenableSeo=1

I'm not expecting you to write the whole code for me, just some guidance on where to start and which functions would be useful in my situation. Thanks a lot anyway.

Regards,
M. Samir Anany
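As a possible starting point, here is a minimal sketch of the parse-then-query workflow with the XML package. The HTML fragment, its tag names and its class names are all invented for this illustration; the real page's structure has to be looked up in its source, and the site's terms of use should be checked before scraping it.

```r
library(XML)  # assumes the XML package is installed

# Made-up markup standing in for one search-result entry.
html <- '<html><body>
  <div class="vehicle">
    <span class="year">1999</span>
    <span class="make">Audi</span>
    <span class="model">A4</span>
    <span class="price">$32,000</span>
    <span class="mileage">57,987</span>
  </div>
</body></html>'

doc <- htmlParse(html, asText = TRUE)

# Helper (hypothetical): text of every <span> with the given class name.
grab <- function(cls) {
  xpathSApply(doc, sprintf("//span[@class='%s']", cls), xmlValue)
}

cars <- data.frame(
  Cost    = as.numeric(gsub("[$,]", "", grab("price"))),
  Year    = as.integer(grab("year")),
  Mileage = as.numeric(gsub(",", "", grab("mileage"))),
  Make    = grab("make"),
  Model   = grab("model"),
  stringsAsFactors = FALSE
)
cars
```

For a live page, the text would first be downloaded (for example with RCurl's getURL) and then passed to htmlParse in the same way; only the XPath expressions change.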
Re: [R] Subsetting Timestamped data
Hi,

Maybe this helps:

set.seed(45)
df1 <- data.frame(datetime=as.POSIXct("2011-05-25",tz="GMT")+0:200*30*60, value=sample(1:40,201,replace=TRUE), value2=sample(45:90,201,replace=TRUE))
df2 <- df1[ave(1:nrow(df1),as.Date(df1[,1]),FUN=length)==48,]
dim(df2)
#[1] 192   3

#or
library(plyr)
df3 <- df1[ddply(df1,.(as.Date(datetime)),mutate,Ldt=length(datetime)==48)$Ldt,]
identical(df3,df2)
#[1] TRUE

A.K.

----- Original Message -----
From: aj...@bath.ac.uk
To: r-help@r-project.org
Sent: Friday, October 4, 2013 11:03 AM
Subject: [R] Subsetting Timestamped data

Hi,

I have a data frame, data, containing two columns: one, the TimeStamp (formatted using data$TimeStamp <- as.POSIXct(as.character(data$TimeStamp), format = "%d/%m/%Y %H:%M")), and two, the data value.

The data frame has been read from a .csv file and should contain 48 values for each day of the year (values sampled at 30 minute intervals). However, there are only 15,948 observations, i.e. only approx. 332 days' worth of data. I would therefore like to remove any days that do not contain all 48 values.

My question: how would I go about doing this?

Many thanks,
-A.
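The same filter can also be written with table(), which makes the per-day counts visible before anything is dropped. This is only a sketch on synthetic data; the names day, n_per_day and complete_days are introduced here for illustration.

```r
# Synthetic half-hourly series: 4 complete days plus 9 readings of a 5th day.
datetime <- as.POSIXct("2011-05-25", tz = "GMT") + 0:200 * 30 * 60
df1 <- data.frame(datetime = datetime, value = seq_along(datetime))

# Calendar day of each observation (time zone given explicitly).
day <- as.Date(df1$datetime, tz = "GMT")

# Number of observations per day, then the days with exactly 48 readings.
n_per_day <- table(day)
complete_days <- names(n_per_day)[n_per_day == 48]

# Keep only the rows that belong to complete days.
df2 <- df1[as.character(day) %in% complete_days, ]
dim(df2)
```

Inspecting n_per_day first also shows whether the short days are missing a few readings or most of them, which may matter for deciding whether to drop or impute.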
Re: [R] String substitution
Wonderful! Thank you Arun!

Irene

Irene Ruberto

From: arun kirshna [via R] ml-node+s789695n4677558...@n4.nabble.com
Sent: Thursday, 3 October 2013 22:51
Subject: Re: String substitution

Hi,

Try:

dat$y <- as.character(dat$y)
dat1 <- dat
dat2 <- dat
library(stringr)
dat$y[NET] <- substr(word(dat$y[NET],2),1,1)
dat$y
#[1] "n"     "n"     "house" "n"     "tree"

#or
for(i in 1:length(NET)){dat1$y[NET[i]] <- "n"}
dat1$y
#[1] "n"     "n"     "house" "n"     "tree"

#or
dat2$y[NET] <- gsub(".*(n).*","\\1",dat2$y[NET])
dat2$y
#[1] "n"     "n"     "house" "n"     "tree"

A.K.

Hello,

I am trying to replace strings containing a certain word. I first identified the word (in this example "net") with grep, and then I need to replace those strings with "n". It should be very simple but I don't seem to find the solution.

Example:

x <- c(5:9)
y <- c("with net", "with nets", "house", "no nets", "tree")
dat <- as.data.frame(cbind(x, y))
NET <- grep("net", dat$y)

# I want y to become ("n", "n", "house", "n", "tree")
#
# I have tried several ways including the following but without success
#
for (i in 1:length(NET)) { dat$y[NET[i]] <- "n" }

Thank you for your help!
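A compact variant of the same fix, as a sketch: build the data frame without cbind() so that x stays numeric and y stays character, then assign through a logical index. In R versions before 4.0 the cbind() route turned y into a factor, which is why assigning the new value "n" failed in the original attempt.

```r
# Build the columns directly so x stays integer and y stays character.
dat <- data.frame(x = 5:9,
                  y = c("with net", "with nets", "house", "no nets", "tree"),
                  stringsAsFactors = FALSE)

# grepl() gives a logical index, so no loop over positions is needed.
dat$y[grepl("net", dat$y)] <- "n"
dat$y
```

With a logical index there is also no special case when nothing matches, whereas a loop over 1:length(NET) misbehaves when NET is empty.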
[R] function with loop that goes through columns of dataframes with different dimensions
Writing loops is the bane of my existence. I have this function, which works:

library(data.table)

rnd.data <- function(x){
  min.x <- min(x[,2])
  max.x <- max(x[,2])
  min.y <- min(x[,3])
  max.y <- max(x[,3])
  data.table(x = runif(34, min.x, max.x))[, y := runif(34, min.y, max.y)]
}

Its purpose is to simulate data within parameters that depend on the columns of the dataframe in question. The first data set I wrote it for had only 2 columns I wanted to simulate samples for. However, I have additional dataframes with different numbers of columns. Ideally I would write one function with a for loop that could compute samples for all the dataframes I want to input, as I need to simulate more than 1000 samples per dataframe.

I tried changing the beginning to read:

rnd2.data <- function(x){
  n <- dim(x)[2]
  for(i in 1:n){
    if(n > 3){

but then got stuck as to what to do next.

Any help would be greatly appreciated.

Thanks!
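One way to avoid the explicit loop, sketched under assumptions the post does not confirm: the input behaves like a plain data frame, the first column is an identifier that should be skipped (as in the working function), and every remaining column is numeric. The function name rnd_data and its arguments n and cols are hypothetical.

```r
# Simulate n uniform draws per column, each within that column's range.
# cols defaults to every column except the first.
rnd_data <- function(x, n = 34, cols = seq_len(ncol(x))[-1]) {
  x <- as.data.frame(x)
  sims <- lapply(x[cols], function(col) runif(n, min(col), max(col)))
  as.data.frame(sims)
}

set.seed(1)
d3 <- data.frame(id = 1:5, a = c(1, 4, 2, 8, 5), b = c(10, 30, 20, 50, 40))
d5 <- cbind(d3, e = c(0.1, 0.5, 0.3, 0.2, 0.4), f = 101:105)

sim3 <- rnd_data(d3)   # 34 rows, columns a and b
sim5 <- rnd_data(d5)   # 34 rows, columns a, b, e and f

# More than 1000 simulated samples for one data frame.
many <- replicate(1000, rnd_data(d5), simplify = FALSE)
```

Because lapply works column by column, the same function handles any number of columns, and replicate takes care of repeating the simulation.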
Re: [R] quote a column of a dataframe by its name
On 10/05/13 05:15, John Kane wrote:

> X[,names(X)[4]] works fine for me. I had never thought of doing this. Neat idea.

Perhaps I am being obtuse, but how would X[,names(X)[4]] differ from X[,4]?

cheers,
Rolf Turner
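A small sketch of the point in question, using a made-up data frame: looking a column up through names(X)[4] gives the same result as X[,4], and a name only starts to matter when it is stored and the column order later changes.

```r
# Made-up data frame for illustration.
X <- data.frame(a = 1:3, b = letters[1:3], c = c(TRUE, FALSE, TRUE),
                d = c(2.5, 3.5, 4.5))

# Going through names(X)[4] is just a detour to the same column.
identical(X[, names(X)[4]], X[, 4])   # TRUE

# A stored name keeps working after the columns are reordered,
# whereas position 4 now refers to a different column.
wanted <- "d"
X2 <- X[, c("d", "a", "b", "c")]
identical(X2[[wanted]], X[, 4])       # TRUE
identical(X2[, 4], X[, 4])            # FALSE
```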
[R] Cannot read XPT file using foreign package
OS X 10.8
R 3.0.1
foreign 0.8-55 (2013-09-02)

Colleagues,

I received a SAS XPT file that I cannot read using the foreign package. The command:

read.xport(FILENAME)

results in the following message:

Error in lookup.xport(file) : file not in SAS transfer format

I am able to read the file successfully using StatTransport, so it appears that the file is OK. When I examine the file using more, the first few lines look like this:

^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@04Oct13:11:15:5904Oct13:11:15:59 HEADER RECORD***MEMBER HEADER RECORD!!!016140 HEADER RECORD***DSCRPTR HEADER RECORD!!!00 SAS SAS SASDATA 6.06bsd4.2 ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@04Oct13:11:15:5904Oct13:11:15:59 HEADER RECORD***NAMESTR HEADER RECORD!!!05 ^@^A^@^@^@^H^@^ASubject Subject BEST ^@^L^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^A^@^@^@^H^@^BPeriod Period BEST^@^L^@^@^@^@^@^@ ^@^@^@^@^@^@^@^H^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@

Of course, I can use StatTransport to write the file to another format. However, I would like to understand why the foreign package is unable to process the file. Any help would be greatly appreciated.

Dennis

Dennis Fisher MD
P (The P Less Than Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com
Re: [R] Cannot read XPT file using foreign package
Duncan

I looked at support.sas.com/techsup/technote/ts140.pdf and it is a bit difficult to decipher. I then replaced the string ^@ in the file contents with !. There is some concordance with the sample text shown in support.sas.com/techsup/technote/ts140.pdf, but I don't know exactly how much concordance is expected. The time stamp in the file is today, so I assume that the file was created today.

You asked "why [I] think this is a file that follows the format". I did not make that assumption; I merely attempted to read an XPT file with read.xport and it failed. Could there be an issue with the version of SAS (which appears to be 6.06)? They are now up to version 9 (for Windows; I don't know the version # for UNIX).

Dennis

Dennis Fisher MD
P (The P Less Than Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com

On Oct 4, 2013, at 4:47 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote:

> On 13-10-04 6:50 PM, Dennis Fisher wrote:
>> OS X 10.8
>> R 3.0.1
>> foreign 0.8-55 (2013-09-02)
>>
>> Colleagues,
>>
>> I received a SAS XPT file that I cannot read using the foreign package. The command: read.xport(FILENAME) results in the following message:
>> Error in lookup.xport(file) : file not in SAS transfer format
>>
>> I am able to read the file successfully using StatTransport so it appears that the file is OK. When I examine the file using more, the first few lines look like this:
>>
>> !04Oct13:11:15:5904Oct13:11:15:59 HEADER RECORD***MEMBER HEADER RECORD!!!016140 HEADER RECORD***DSCRPTR HEADER RECORD!!!00 SAS SAS SASDATA 6.06bsd4.2 !04Oct13:11:15:5904Oct13:11:15:59 HEADER RECORD***NAMESTR HEADER RECORD!!!05 !^A!^H!^ASubject Subject BEST!^L! ! !^A!^H!^BPeriod Period BEST!^L! !^H!
>>
>> Of course, I can use StatTransport to write the file to another format. However, I would like to understand why the foreign package is unable to process the file.
>
> That file doesn't follow the documented format linked to from ?read.xport. You'll have to ask SAS why their documentation is incorrect, or ask yourself why you think this file is a file that follows that format when it doesn't.
>
> Duncan Murdoch
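A quick way to check a file before calling read.xport is to look at its first bytes. The sketch below assumes, from my reading of the TS-140 description, that a transport file written by the XPORT engine begins with a library header record whose text starts as shown in xport_header; that string and the helper name looks_like_xport are assumptions of this sketch, not something stated in the thread. In the dump quoted above the file begins with NUL bytes instead.

```r
# Assumed start of the first 80-byte record of an XPORT-engine file.
xport_header <- "HEADER RECORD*******LIBRARY HEADER RECORD!!!!!!!"

# Compare raw bytes rather than strings, because the file may contain NULs.
looks_like_xport <- function(path) {
  expected <- charToRaw(xport_header)
  bytes <- readBin(path, what = "raw", n = length(expected))
  identical(bytes, expected)
}

# Two small made-up files to exercise the check.
good <- tempfile(fileext = ".xpt")
writeBin(c(charToRaw(xport_header), charToRaw(strrep("0", 30))), good)

bad <- tempfile(fileext = ".xpt")
writeBin(as.raw(rep(0, 80)), bad)

looks_like_xport(good)
looks_like_xport(bad)
```

A file that fails this check may still be a valid SAS file of another kind, so the result only indicates whether read.xport is the right reader to try.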
[R] trying to compile R in win 7 (with Rtools)
hello all,

I am trying to compile R on Windows 7. A small part compiles, but the build does not move on from the 'base' directory to 'stats'. I installed Rtools as administrator and also opened the terminal (MS-DOS) as administrator.

If somebody can give me any tips, I thank you in advance.

cleber

# File LOG http://klebyn.ploud.com/arquivo_log/log

C:\Rsrc>
C:\Rsrc>
C:\Rsrc>tar -xf R-3.0.2.tar.gz

C:\Rsrc>where basename cat cmp comm cp cut date diff du echo expr gzip ls makeinfo
C:\Rtools\bin\basename.exe
C:\Rtools\bin\cat.exe
C:\Rtools\bin\cmp.exe
C:\Rtools\bin\comm.exe
C:\Rtools\bin\cp.exe
C:\Rtools\bin\cut.exe
C:\Rtools\bin\date.exe
C:\Rtools\bin\diff.exe
C:\Rtools\bin\du.exe
C:\Rtools\bin\echo.exe
C:\Rtools\bin\expr.exe
C:\Rtools\bin\gzip.exe
C:\Rtools\bin\ls.exe
C:\Rtools\bin\makeinfo.exe
C:\Program Files (x86)\MiKTeX 2.9\miktex\bin\makeinfo.exe

C:\Rsrc>where mkdir mv rm rsync sed sort texindex touch uniq
C:\Rtools\bin\mkdir.exe
C:\Rtools\bin\mv.exe
C:\Rtools\bin\rm.exe
C:\Rtools\bin\rsync.exe
C:\Rtools\bin\sed.exe
C:\Rtools\bin\sort.exe
C:\Windows\System32\sort.exe
C:\Rtools\bin\texindex.exe
C:\Program Files (x86)\MiKTeX 2.9\miktex\bin\texindex.exe
C:\Rtools\bin\touch.exe
C:\Rtools\bin\uniq.exe

C:\Rsrc>sort --version
sort (GNU coreutils) 8.15
Packaged by Cygwin (8.15-1)
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Mike Haertel and Paul Eggert.

C:\Rsrc>cd R-3.0.2\src\gnuwin32

C:\Rsrc\R-3.0.2\src\gnuwin32>make all recommended > compilaR.log

#
# http://klebyn.ploud.com/arquivo_log/log
#

connections.c: In function 'do_readbin':
connections.c:3759:8: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
connections.c:3761:8: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
connections.c:3769:4: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
connections.c:3784:4: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
connections.c:3788:4: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
pcre_exec.c: In function 'pcre_exec':
pcre_exec.c:7190:20: warning: 'match_partial' may be used uninitialized in this function [-Wuninitialized]
localtime.c: In function 'timesub.isra.2':
localtime.c:1407:5: warning: assuming signed overflow does not occur when assuming that (X + c) < X is always false [-Wstrict-overflow]
localtime.c:1411:8: warning: assuming signed overflow does not occur when assuming that (X - c) > X is always false [-Wstrict-overflow]
localtime.c: In function 'time2sub.constprop.10':
localtime.c:1566:8: warning: assuming signed overflow does not occur when assuming that (X + c) < X is always false [-Wstrict-overflow]
localtime.c:1581:5: warning: assuming signed overflow does not occur when assuming that (X + c) < X is always false [-Wstrict-overflow]
localtime.c:1593:9: warning: assuming signed overflow does not occur when assuming that (X + c) < X is always false [-Wstrict-overflow]
localtime.c:1599:8: warning: assuming signed overflow does not occur when assuming that (X - c) > X is always false [-Wstrict-overflow]
localtime.c:1619:5: warning: assuming signed overflow does not occur when assuming that (X - c) > X is always false [-Wstrict-overflow]
cannot create /tmp/R4428: directory nonexistent
mv: cannot stat `/tmp/R4428': No such file or directory
make[3]: *** [mkR1] Error 1
make[2]: *** [all] Error 2
make[1]: *** [R] Error 1
make: *** [all] Error 2

C:\Rsrc\R-3.0.2\src\gnuwin32>
Re: [R] Cannot read XPT file using foreign package
There are two different transport (portable) file types that SAS creates:

1. using Proc CPORT
2. using the XPORT engine in a LIBNAME statement.

That may not mean much to a non-SAS user, but people often use 'xpt' as a file extension for both approaches. If Proc CPORT was used to write the file, read.xport will not be able to read it; read.xport only reads files written using the XPORT engine. If you don't have access to SAS, you will need to get whoever created the file to re-create it.

Dan

Daniel Nordlund
Bothell, WA USA

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Dennis Fisher
Sent: Friday, October 04, 2013 5:01 PM
To: Duncan Murdoch; r-h...@stat.math.ethz.ch
Subject: Re: [R] Cannot read XPT file using foreign package

Duncan

I looked at support.sas.com/techsup/technote/ts140.pdf and it is a bit difficult to decipher. I then replaced the string ^@ in the file contents with !. There is some concordance with the sample text shown in support.sas.com/techsup/technote/ts140.pdf, but I don't know exactly how much concordance is expected. The time stamp in the file is today, so I assume that the file was created today.

You asked why [I] think this is a file that follows the format. I did not make that assumption; I merely attempted to read an XPT file with read.xport and it failed. Could there be an issue with the version of SAS (which appears to be 6.06)? They are now up to version 9 (for Windows; I don't know the version # for UNIX).

Dennis

Dennis Fisher MD
P (The P Less Than Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com

On Oct 4, 2013, at 4:47 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote:

On 13-10-04 6:50 PM, Dennis Fisher wrote:

OS X 10.8
R 3.0.1
foreign 0.8-55 (2013-09-02)

Colleagues,

I received a SAS XPT file that I cannot read using the foreign package. The command:

read.xport(FILENAME)

results in the following message:

Error in lookup.xport(file) : file not in SAS transfer format

I am able to read the file successfully using StatTransport, so it appears that the file is OK. When I examine the file using more, the first few lines look like this:

!04Oct13:11:15:5904Oct13:11:15:59
HEADER RECORD***MEMBER HEADER RECORD!!!016140
HEADER RECORD***DSCRPTR HEADER RECORD!!!00 SAS SAS SASDATA 6.06bsd4.2 !04Oct13:11:15:5904Oct13:11:15:59
HEADER RECORD***NAMESTR HEADER RECORD!!!05 !^A!^H!^ASubject Subject BEST!^L! ! !^A!^H!^BPeriod Period BEST!^L! !^H!

Of course, I can use StatTransport to write the file to another format. However, I would like to understand why the foreign package is unable to process the file.

That file doesn't follow the documented format linked to from ?read.xport. You'll have to ask SAS why their documentation is incorrect, or ask yourself why you think this file is a file that follows that format when it doesn't.

Duncan Murdoch
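The distinction Dan draws can be checked from R before calling read.xport. The following is a minimal sketch, not part of the foreign package: the function name guess_sas_transport is hypothetical, the XPORT prefix is the first record described in the TS-140 technote, and the "**COMPRESSED**" marker for Proc CPORT output is an assumption that should be verified against a known CPORT file.

```r
# Hypothetical helper (not in the foreign package): guess how a SAS
# transport file was written by inspecting its first 80-byte record.
guess_sas_transport <- function(path) {
  con <- file(path, "rb")
  on.exit(close(con))
  bytes <- readBin(con, what = "raw", n = 80L)

  # Replace NUL and other non-printable bytes with "." so that
  # rawToChar() does not fail on embedded NULs.
  ints <- as.integer(bytes)
  bytes[ints < 32L | ints > 126L] <- as.raw(46L)
  first <- rawToChar(bytes)

  if (startsWith(first, "HEADER RECORD*******LIBRARY HEADER RECORD")) {
    "xport"    # XPORT engine layout, the kind read.xport() expects
  } else if (startsWith(first, "**COMPRESSED**")) {
    "cport"    # assumed marker for Proc CPORT output
  } else {
    "unknown"
  }
}
```

Calling guess_sas_transport("myfile.xpt") (a placeholder file name) returns one of "xport", "cport" or "unknown". A result other than "xport" suggests the file should be re-created with the XPORT engine or converted with another tool.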
Re: [R] trying to compile R in win 7 (with Rtools)
Hi Cleber,

You need to set TMPDIR to a valid directory; the default /tmp/ does not work on Windows. From the cmd shell:

set TMPDIR=C:/TMP

for example, and then run

make all recommended

Cheers,

Josh

On Fri, Oct 4, 2013 at 5:03 PM, Cleber N.Borges kle...@yahoo.com.br wrote:

hello all,

I am trying to compile R on Win7. It compiles one small part, but the script doesn't move from the 'base' directory to 'stats'. I installed Rtools as administrator and opened the terminal (MS-DOS) as administrator too. If somebody can give me any tips, thanks in advance.

cleber

# File LOG
http://klebyn.ploud.com/arquivo_log/log

C:\Rsrc>tar -xf R-3.0.2.tar.gz

C:\Rsrc>where basename cat cmp comm cp cut date diff du echo expr gzip ls makeinfo
C:\Rtools\bin\basename.exe
C:\Rtools\bin\cat.exe
C:\Rtools\bin\cmp.exe
C:\Rtools\bin\comm.exe
C:\Rtools\bin\cp.exe
C:\Rtools\bin\cut.exe
C:\Rtools\bin\date.exe
C:\Rtools\bin\diff.exe
C:\Rtools\bin\du.exe
C:\Rtools\bin\echo.exe
C:\Rtools\bin\expr.exe
C:\Rtools\bin\gzip.exe
C:\Rtools\bin\ls.exe
C:\Rtools\bin\makeinfo.exe
C:\Program Files (x86)\MiKTeX 2.9\miktex\bin\makeinfo.exe

C:\Rsrc>where mkdir mv rm rsync sed sort texindex touch uniq
C:\Rtools\bin\mkdir.exe
C:\Rtools\bin\mv.exe
C:\Rtools\bin\rm.exe
C:\Rtools\bin\rsync.exe
C:\Rtools\bin\sed.exe
C:\Rtools\bin\sort.exe
C:\Windows\System32\sort.exe
C:\Rtools\bin\texindex.exe
C:\Program Files (x86)\MiKTeX 2.9\miktex\bin\texindex.exe
C:\Rtools\bin\touch.exe
C:\Rtools\bin\uniq.exe

C:\Rsrc>sort --version
sort (GNU coreutils) 8.15
Packaged by Cygwin (8.15-1)
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Mike Haertel and Paul Eggert.
Re: [R] trying to compile R in win 7 (with Rtools)
thanks.

I am working in the MS-DOS terminal. I thought that cygwin was not necessary...

In the cygwin terminal, when I type "where sh":

CLEBER@pinkfloyd /cygdrive/c
$ where sh
C:\cygwin\bin\sh.exe
C:\Rtools\bin\sh.exe

so I have two versions of sh, and the cygwin one will take priority...

I will run more tests and consider your suggestion of cygwin...

thanks
cleber

On 04/10/2013 21:58, Tambellini William wrote:

Hi Cleber,

It can't find /tmp, which does not exist on a standard win32 mount system. Are you sure you don't have to call "make all ..." from the cygwin bash (cygwin terminal) and not from the MS-DOS pseudo-terminal?

W.
Re: [R] trying to compile R in win 7 (with Rtools)
bingo! :-)

I got one step further!

my TMP environment variable is: %SystemRoot%\TEMP

thanks
cleber

On 04/10/2013 22:02, Joshua Wiley wrote:

Hi Cleber,

You need to set TMPDIR to a valid directory; the default /tmp/ does not work on Windows. From the cmd shell:

set TMPDIR=C:/TMP

for example, and then run

make all recommended

Cheers,

Josh
Re: [R] Web Scraping
Hi,

I have a short demo at https://gist.github.com/izahn/5785265 that might get you started.

Best,
Ista

On Fri, Oct 4, 2013 at 12:51 PM, Mohamed Anany melsa...@students.kennesaw.edu wrote:

Hello everybody,

I just started using R, and I'm presenting a poster for R day at Kennesaw State University. I really need some help with web scraping. I'm trying to extract used cars data from www.cars.com, including the mileage, year, model, make, price, CARFAX availability and Technology package availability.

I've done some research, and everything points to the XML package and the RCurl package. I also got my hands on a function that captures all the text in the web page and stores it as a huge character vector. I've never done data mining before, so when I read the help documents for the packages I mentioned earlier, it is like reading Chinese. I would appreciate it if you could guide me through this process of data extraction.

Here's an example of what the data would look like:

Cost    Year  Mileage  Tech  CARFAX  Make  Model
$32000  1999  57,987   1     FREE    Audi  A4

Here's the link to the search:
http://www.cars.com/for-sale/searchresults.action?stkTyp=Utracktype=usedccmkId=20049AmbMkId=20049AmbMkNm=Audimake=AudiAmbMdNm=A4model=A4mdId=20596AmbMdId=20596rd=100zc=30062searchSource=QUICK_FORMenableSeo=1

I'm not expecting you to write the whole code for me, just some guidance on where to start and which functions would be useful in my situation. Thanks a lot anyway.

Regards,
M. Samir Anany
Re: [R] function with loop that goes through columns of dataframes with different dimensions
Hi,

Maybe this helps:

library(matrixStats)
library(data.table)

set.seed(24)
dat1 <- as.data.frame(matrix(sample(1:50, 100, replace = TRUE), 10, 10))
colnames(dat1) <- paste0("Col", 1:ncol(dat1))

# x: data frame; n: number of values to simulate per column
# ColSub: column positions to use when ColIndex = TRUE
rnd.data1 <- function(x, n, ColSub, ColIndex = FALSE) {
  if (ColIndex) {
    index <- seq_len(ncol(x)) %in% ColSub
    # as.matrix() because colMins()/colMaxs() expect a matrix
    Mins1 <- colMins(as.matrix(x[index]))
    Maxs1 <- colMaxs(as.matrix(x[index]))
    res <- sapply(seq_along(Mins1), function(i) runif(n, Mins1[i], Maxs1[i]))
    colnames(res) <- colnames(x)[index]
    res <- data.table(res)
  } else {
    Mins1 <- colMins(as.matrix(x))
    Maxs1 <- colMaxs(as.matrix(x))
    res <- sapply(seq_along(Mins1), function(i) runif(n, Mins1[i], Maxs1[i]))
    colnames(res) <- colnames(x)
    res <- data.table(res)
  }
  res
}

rnd.data1(dat1, 3)
#       Col1      Col2      Col3     Col4      Col5     Col6     Col7      Col8
#1: 21.93711 17.314480  7.351077 28.05296  3.485837 27.97173 29.70568 19.273547
#2: 14.58832  5.026405 36.467826 43.04324 19.002031 29.12006 14.61867  4.809799
#3: 35.23000 17.353508 24.795010 18.65929 19.331303 16.85060 24.13479 49.966598
#       Col9    Col10
#1: 14.90463 40.22131
#2: 16.03714 22.42686
#3: 32.74977 35.68602

rnd.data1(dat1, 3, c(2, 5), TRUE)
#       Col2      Col5
#1: 26.87589 22.872162
#2: 19.78380  5.002566
#3: 37.43138 13.187147

rnd.data1(dat1, 3, c(2, 5, 8), TRUE)
#       Col2     Col5     Col8
#1: 39.63718 19.27199 49.86884
#2: 23.84264 14.19576 42.45117
#3: 41.13644 13.45054 36.48446

A.K.

----- Original Message -----
From: Katherine Bannar-Martin kbann...@hotmail.com
To: r-help@r-project.org
Cc:
Sent: Friday, October 4, 2013 5:23 PM
Subject: [R] function with loop that goes through columns of dataframes with different dimensions

Writing loops is the bane of my existence. I have this function, which works:

rnd.data <- function(x) {
  min.x <- min(x[,2])
  max.x <- max(x[,2])
  min.y <- min(x[,3])
  max.y <- max(x[,3])
  data.table(x = runif(34, min.x, max.x))[, y := runif(34, min.y, max.y)]
}

Its purpose is to simulate data within parameters that depend on the columns of the dataframe in question. The first data set I wrote it for had only 2 columns I wanted to simulate samples for. However, I have additional dataframes with different numbers of columns. Ideally I would write one function with a for loop that could compute samples for all the dataframes I want to input, as I need to simulate more than 1000 samples per dataframe.

I tried changing the beginning to read as:

rnd2.data <- function(x) {
  n <- dim(x)[2]
  for (i in 1:n) {
    if(n 3){

but then got stuck as to what to do next.

Any help would be greatly appreciated.

Thanks!
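For readers without matrixStats or data.table, the same idea can be sketched in base R only. This is an illustrative alternative, not the code posted in the thread: the helper name sim_uniform and its cols argument are hypothetical, and it assumes every selected column is numeric.

```r
# Hypothetical base-R helper: for each selected column, draw n uniform
# values between that column's observed minimum and maximum.
sim_uniform <- function(x, n, cols = names(x)) {
  sims <- lapply(x[cols], function(col) {
    rng <- range(col, na.rm = TRUE)   # c(min, max) of the column
    runif(n, min = rng[1], max = rng[2])
  })
  as.data.frame(sims)                 # one simulated column per input column
}

# Example data frame with an ID column that should not be simulated
set.seed(1)
dat <- data.frame(id = 1:5,
                  a  = c(2, 4, 6, 8, 10),
                  b  = c(0.1, 0.5, 0.3, 0.9, 0.7))
sim <- sim_uniform(dat, n = 1000, cols = c("a", "b"))
```

Because lapply walks over whatever columns are selected, the same function works for data frames of any width and no explicit for loop over column counts is needed.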
Re: [R] trying to compile R in win 7 (with Rtools)
Hi Cleber,

It can't find /tmp, which does not exist on a standard win32 mount system. Are you sure you don't have to call "make all ..." from the cygwin bash (cygwin terminal) and not from the MS-DOS pseudo-terminal?

W.

On 04/10/2013 17:03, Cleber N.Borges wrote:

hello all,

I am trying to compile R on Win7. It compiles one small part, but the script doesn't move from the 'base' directory to 'stats'. I installed Rtools as administrator and opened the terminal (MS-DOS) as administrator too. If somebody can give me any tips, thanks in advance.

cleber

# File LOG
http://klebyn.ploud.com/arquivo_log/log
[R] make a file from four individual files. but keys can be missing among files.
Hello,

I have a list of four files. Each file is a list of gene models, and each gene model has attributes in the file's columns. The sample_week-over-sample_week fold change value for each gene i in a file is in column 5. The gene model ID is in column 1. To make it more complicated, a gene i may not be found in all four files.

I've written R scripts to try to 1) write a function to iterate over each file in the working directory, 2) if a gene model, i, is missing in a file(x), paste an NA as the value for that key and 3) make one master file from the four files with column names as indicated below in the code.

function(g)(
  for( i in x){
    if( i == NULL) print( text(NA))
    else( print(x[i,5]))   #Fold-change value
  ))

files <- list.files( path=getwd(), pattern="*.txt", full.names=T, recursive=FALSE)
lapply( files, function(x) {
  x <- read.table( x, header=T) # load file
  # apply function
  out <- function(g)
  # write to file
  write.table( out, "myfile.txt", sep="\t", quote=F, row.names=F, col.names=T)
  colnames(out)[1] = "ca02"
  colnames(out)[2] = "ca24"
  colnames(out)[3] = "ca48"
  colnames(out)[4] = "ca812"
})
})

The code has no errors, but the file doesn't write to my working directory. I don't think the path is the problem either. I've seen the do.call function and think this may be the way to go. Also, if I'd like to add a third column to the master file from the four files, for example the BLAST description for gene model i in column 7, how should I handle that as well?

I'm kinda new to R but hit a wall when it comes to this level, as I have had no classroom training in writing code. Hopefully, I can break on through with your assistance.

Thanks much and have a nice day.

--
Franklin
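One common way to approach the task described above is an outer join on the gene ID, which fills in NA wherever a gene is absent from a file. The following is a minimal sketch under stated assumptions, not a reply from the thread: the helper name merge_fold_changes is hypothetical, each file is assumed to have a header row and be readable by read.table, and the gene ID and fold change are assumed to sit in columns 1 and 5 as in the question.

```r
# Hypothetical helper: pull the ID and fold-change columns from each file
# and combine them into one table with one fold-change column per file.
merge_fold_changes <- function(files, labels, id_col = 1, value_col = 5, ...) {
  stopifnot(length(files) == length(labels))
  tables <- Map(function(f, lab) {
    d <- read.table(f, header = TRUE, stringsAsFactors = FALSE, ...)
    out <- d[, c(id_col, value_col)]
    names(out) <- c("gene", lab)      # e.g. "gene", "ca02"
    out
  }, files, labels)
  # all = TRUE keeps genes that are missing from some files; gaps become NA
  Reduce(function(a, b) merge(a, b, by = "gene", all = TRUE), tables)
}

# Usage sketch (file names are whatever list.files() returns, in sorted order):
# files  <- list.files(pattern = "\\.txt$", full.names = TRUE)
# master <- merge_fold_changes(files, c("ca02", "ca24", "ca48", "ca812"))
# write.table(master, "master_fold_changes.tsv", sep = "\t",
#             quote = FALSE, row.names = FALSE)
```

Writing the result with a different extension than the inputs keeps the master file from being picked up as a fifth input on the next run. A description column could be added the same way, by merging in a two-column table of gene ID and description taken from whichever files contain it.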
Re: [R] trying to compile R in win 7 (with Rtools) ... tcl.h
stop because

*had a stone in the middle of the way*
*in the middle of the way had a stone*
(by vinicius de moraes)

so, one more help? somebody? :-)

thanks...
cleber

building package 'tcltk'
making init.d from init.c
making tcltk.d from tcltk.c
making tcltk_win.d from tcltk_win.c
gcc -I../../../../include -DNDEBUG -I ../../../../Tcl/include -DWin32 -O3 -Wall -std=gnu99 -mtune=core2 -c init.c -o init.o
In file included from init.c:22:0:
tcltk.h:23:17: fatal error: tcl.h: No such file or directory
compilation terminated.
make[4]: *** [init.o] Error 1
make[3]: *** [mksrc-win2] Error 1
make[2]: *** [all] Error 2
make[1]: *** [R] Error 1
make: *** [all] Error 2

On 04/10/2013 22:46, Cleber N.Borges wrote:

bingo! :-)

I got one pass to advanced!

my TMP environment variable is: %SystemRoot%\TEMP

thanks
cleber

On 04/10/2013 22:02, Joshua Wiley wrote:

Hi Cleber,

You need to set TMPDIR to a valid directory, the default /tmp/ does not work on Windows. From the cmd shell:

set TMPDIR=C:/TMP

for example and then run

make all recommended

Cheers,

Josh
Re: [R] trying to compile R in win 7 (with Rtools) ... tcl.h
Hi Cleber,

When you install Rtools, it asks you for the home directory of R, and there it puts two directories called src and Tcl. You need to copy those over to wherever you are building R. So for example, I have:

C:\usr\R\R-devel\Tcl

where I tar -xf R-devel into C:\usr\R\ and then copy the src and Tcl dirs from Rtools over into C:\usr\R\R-devel\

Also, when make all recommended finishes, don't forget to run

cd bitmap
make all

so you can save graphs as PNG, etc.

Cheers,

Josh

On Fri, Oct 4, 2013 at 7:29 PM, Cleber N.Borges kle...@yahoo.com.br wrote:

stop because

*had a stone in the middle of the way*
*in the middle of the way had a stone*
(by vinicius de moraes)

so, one more help? somebody? :-)

thanks...
cleber

building package 'tcltk'
making init.d from init.c
making tcltk.d from tcltk.c
making tcltk_win.d from tcltk_win.c
gcc -I../../../../include -DNDEBUG -I ../../../../Tcl/include -DWin32 -O3 -Wall -std=gnu99 -mtune=core2 -c init.c -o init.o
In file included from init.c:22:0:
tcltk.h:23:17: fatal error: tcl.h: No such file or directory
compilation terminated.
make[4]: *** [init.o] Error 1
make[3]: *** [mksrc-win2] Error 1
make[2]: *** [all] Error 2
make[1]: *** [R] Error 1
make: *** [all] Error 2

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://joshuawiley.com/
Senior Analyst - Elkhart Group Ltd.
http://elkhartgroup.com
Re: [R] try to subset zoo dataset
Hi,

Try:

library(zoo)
z2 <- read.zoo(text="Date WBK_Last WBK_1d_Close
2003-01-03 13.88506 14.08276
2003-01-06 14.11254 13.88506
2003-01-07 14.07033 14.11254
2003-01-14 14.24165 14.30967
2003-01-22 14.28913 14.30563
2003-01-29 13.95664 14.16483
2007-01-01 14.87033 15.11254
2007-01-07 14.34165 14.50967
2007-01-22 12.24913 14.80563
2008-01-29 12.95664 14.26483", sep="", header=TRUE, FUN=as.Date, format="%Y-%m-%d")

library(xts)
x1 <- as.xts(z2)

# all rows in 2007
x1["2007-01-01/2007-12-31"]
#           WBK_Last WBK_1d_Close
#2007-01-01 14.87033     15.11254
#2007-01-07 14.34165     14.50967
#2007-01-22 12.24913     14.80563

# everything from 2007-01-01 onward
x1["2007-01-01/"]
#           WBK_Last WBK_1d_Close
#2007-01-01 14.87033     15.11254
#2007-01-07 14.34165     14.50967
#2007-01-22 12.24913     14.80563
#2008-01-29 12.95664     14.26483

A.K.

Hi There,

I have a zoo dataset with more than 10 years of daily price data. However, I only want data from a specific date onward. This is what my data looks like:

           WBK_Last WBK_1d_Close
2003-01-03 13.88506     14.08276
2003-01-06 14.11254     13.88506
2003-01-07 14.07033     14.11254
2003-01-14 14.24165     14.30967
2003-01-22 14.28913     14.30563
2003-01-29 13.95664     14.16483

However, I only want data from 2007-01-01 onward. I know I can use prices[as.Date("2007-01-01")] to find the data for the starting date, but I can't subset the dataset from that date onward.

Thank you in advance. Any help will be appreciated.
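The same subset can also be taken without converting to xts, using zoo's window() method. This is a small sketch, not part of the posted reply: it assumes the zoo package is installed, and the object name prices and the four sample rows are illustrative values taken from the data shown in the thread.

```r
library(zoo)

# Small illustrative series indexed by Date (values from the thread's data)
prices <- zoo(
  cbind(WBK_Last     = c(13.88506, 13.95664, 14.87033, 12.95664),
        WBK_1d_Close = c(14.08276, 14.16483, 15.11254, 14.26483)),
  order.by = as.Date(c("2003-01-03", "2003-01-29", "2007-01-01", "2008-01-29"))
)

# Keep every observation dated 2007-01-01 or later
from_2007 <- window(prices, start = as.Date("2007-01-01"))
```

window() also accepts an end argument, so a closed date range can be selected in the same call.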