[R] [R-pkgs] Version 1.3.0 of apcluster package on CRAN
Dear colleagues,

This is to inform you that Version 1.3.0 of the R package apcluster was released on CRAN yesterday. We did a major extension and overhaul of the package. Most importantly, we added Leveraged Affinity Propagation in response to multiple users' requests. It should now be much easier to cluster large data sets. Apart from this extension, the interfaces to apcluster() and related functions have been made more comfortable and flexible.

For more details, see the following URLs:
http://www.bioinf.jku.at/software/apcluster/
http://cran.r-project.org/web/packages/apcluster/index.html

Best regards,
Ulrich

*Dr. Ulrich Bodenhofer*
Associate Professor
Institute of Bioinformatics
*Johannes Kepler University*
Altenberger Str. 69
4040 Linz, Austria
Tel. +43 732 2468 9552
Fax +43 732 2468 9511
bodenho...@bioinf.jku.at
http://www.bioinf.jku.at/

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem getting loess tricubic weights
Thank you Mr Gunter! I will look into it.

On Wed, Jan 9, 2013 at 11:59 AM, Bert Gunter gunter.ber...@gene.com wrote:
> As this does not seem to have been answered...
>
> I believe you may misunderstand how loess works. The tricube weights are part of the smoothing algorithm and change with each local fit; they are not fixed weights for observations, which is what the weights argument provides (and which initially multiply the tricube weights, IIRC). I suggest you consult ?predict.loess to get standard deviations of fitted values at existing or new points.
>
> -- Bert
>
> On Tue, Jan 8, 2013 at 12:57 AM, Joyce Lin joyceli...@gmail.com wrote:
>> Hi
>>
>> I am trying to get the tricube weights from the loess output, as I need to calculate an error function which requires the weights. I have used the following example from R:
>>
>> cars.lo <- loess(dist ~ speed, cars, span=0.5, degree=1, family="symmetric")
>>
>> Then I try to get the weights:
>>
>> cars.lo$weights
>> [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
>>
>> The results are all 1, so I don't think the tricube weights are set. May I know which other parameters I need to tweak to set the weights to tricube weights?
>>
>> Thank you.
>>
>> --
>> Best regards
>> Joyce Lin
>
> --
> Bert Gunter
> Genentech Nonclinical Biostatistics
> Internal Contact Info: Phone: 467-7374
> Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

--
Best regards
Joyce Lin
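To illustrate Bert's point that the tricube weights belong to each local fit rather than to the observations, the sketch below recomputes them by hand for one target point. It is an approximation under stated assumptions, not something read out of the fitted object: the helper name tricube_weights, the target x0 = 15, and the neighbourhood size floor(span * n) are all choices made for this illustration.

```r
# Hypothetical helper (not part of loess): tricube weights around one target x0.
# Assumption: the local neighbourhood holds q = floor(span * n) points.
tricube_weights <- function(x, x0, span = 0.5) {
  n <- length(x)
  q <- max(2, floor(span * n))   # assumed neighbourhood size
  d <- abs(x - x0)               # distance of every observation to the target
  h <- sort(d)[q]                # bandwidth: distance to the q-th nearest point
  ifelse(d < h, (1 - (d / h)^3)^3, 0)
}

# Weights that would apply to the local fit at speed = 15 in the cars data
w <- tricube_weights(cars$speed, x0 = 15, span = 0.5)
round(w, 3)
```

Calling the helper with a different x0 gives a different weight vector, which is consistent with cars.lo$weights (the fixed prior weights) being all ones.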
Re: [R] plot residuals per factor
Hi,

I forgot to mention:

levels(dat1$d)
#[1] "1" "2" "3" "4" "5"

Suppose I use different levels:

library(car)
dat1$d1<-recode(dat1$d,"1='A';2='B';3='C';4='D';5='E'")
levels(dat1$d1) # check the order of the levels
#[1] "A" "B" "C" "D" "E"
mypath<-file.path("/home/arun/Trial1",paste("catalin_",LETTERS[1:5],".jpg",sep="")) #change the file.path according to your system
for(i in seq_along(mypath)){
  jpeg(file=mypath[i])
  par(mfrow=c(2,2))
  line<-lm(y~x,data=dat1[as.numeric(dat1$d1)==i,])
  plot(line,which=1:4) # if you want only residual vs. fitted, change which=1
  #abline(0,0)
  dev.off()
}

In case you need to change the order of the levels:

dat1$d1<-factor(dat1$d1,levels=c("C","D","E","A","B"))
levels(dat1$d1)
#[1] "C" "D" "E" "A" "B"
mypath<-file.path("/home/arun/Trial1",paste("catalin_",LETTERS[c(3,4,5,1,2)],".jpg",sep=""))
for(i in seq_along(mypath)){
  jpeg(file=mypath[i])
  par(mfrow=c(2,2))
  line<-lm(y~x,data=dat1[as.numeric(dat1$d1)==i,])
  plot(line,which=1:4)
  #abline(0,0)
  dev.off()
}

A.K.

----- Original Message -----
From: catalin roibu catalinro...@gmail.com
To: r-help@r-project.org
Cc:
Sent: Tuesday, January 8, 2013 4:22 AM
Subject: [R] plot residuals per factor

Dear R-users,

I want to plot residuals vs fitted for multiple groups with ggplot2. I tried this code, but without success.

library(plyr)
models<-dlply(dat1,"d",function(df) mod<-lm(y~x,data=df)
ggplot(models,aes(.fitted,.resid), color=factor(d))+
geom_hline(yintercept=0,col="white",size=2)+
geom_point()+
geom_smooth(se=F)

--
Catalin-Constantin ROIBU
Forestry engineer, PhD
Forestry Faculty of Suceava
Str. Universitatii no. 13, Suceava, 720229, Romania
office phone +4 0230 52 29 78, ext. 531
mobile phone +4 0745 53 18 01
+4 0766 71 76 58
FAX: +4 0230 52 16 64
silvic.usv.ro
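For the original goal (one residuals-vs-fitted display for all groups), a possible route is to collect fitted values and residuals into a single data frame first. This is only a sketch: dat1 is simulated here because the real data were not posted, and the column names .fitted and .resid are chosen to match the names used in the question.

```r
# Simulated stand-in for dat1 (assumption: columns x, y and a 5-level factor d)
set.seed(1)
dat1 <- data.frame(x = rnorm(50), d = factor(rep(1:5, each = 10)))
dat1$y <- 2 * dat1$x + rnorm(50)

# Fit one lm per level of d and stack fitted values and residuals
res <- do.call(rbind, lapply(split(dat1, dat1$d), function(df) {
  fit <- lm(y ~ x, data = df)
  data.frame(d = df$d, .fitted = fitted(fit), .resid = resid(fit))
}))
head(res)

# With ggplot2 installed, res can be plotted directly (not run here):
# library(ggplot2)
# ggplot(res, aes(.fitted, .resid, colour = d)) +
#   geom_hline(yintercept = 0) + geom_point() + facet_wrap(~ d)
```

The point of the design is that ggplot() expects a data frame, not a list of model objects, so the per-group models are reduced to plain columns before plotting.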
[R] How to use 'glmnet' or 'lars' package to select features?
Hi all,

I am new to statistics. I want to do lasso feature selection on a bioinformatics data set. I know I can use the 'glmnet' or 'lars' package to do that. However, the glmnet() and lars() functions return a model object, and I don't know how to use this object to select features. What should I do next?

Thanks,
Zhong
[R] t-test behavior given that the null hypothesis is true
Dear all,

I observe a strange behavior of the p-values of the t-test under the null hypothesis. Specifically, I draw 2 samples of 3 individuals each from a normal distribution with mean 0 and variance 1. Then I calculate the p-value using the t-test (var.equal=TRUE, samples are independent). When I make a histogram of the p-values, I see that the bin of the smallest p-values consistently has a lower frequency. Is this a known behavior of the t-test, or is it some kind of bug or random number generation problem?

kind regards,
idaios
Re: [R] Applying a user-defined function
Hi Pradip,

I didn't check the mode at that time. It generated a matrix:
test1$newcols <- sapply()

You can do this:
test2<-data.frame(test1[,-7],test1$newcols)
str(test2)
#'data.frame': 51 obs. of 9 variables:
# $ ObtMj_P       : num 49.6 55 52.5 50.5 51.1 55.1 56.3 53.6 53.5 52.7 ...
# $ ObtMj_SE      : num 1.37 1.41 1.56 1.22 0.65 1.26 1.28 1.3 1.22 0.67 ...
# $ ExpPrevMed_P  : num 80 81.8 79.6 78 80.5 81.7 85 79.5 76.2 78.9 ...
# $ ExpPrevMed_SE : num 0.91 1.08 1.2 0.78 0.53 1.03 0.93 1.04 1.03 0.52 ...
# $ ParMon_P      : num 12.1 12.4 15.8 12.8 13 12.1 14.6 14.7 14.3 14.1 ...
# $ ParMon_SE     : num 0.68 0.9 1.08 0.72 0.41 0.72 0.77 0.97 1.13 0.45 ...
# $ ObtMj_P.1     : Factor w/ 5 levels "[42,48.7]","(48.7,50.9]",..: 2 5 3 2 3 5 5 4 4 3 ...
# $ ExpPrevMed_P.1: Factor w/ 5 levels "[76.2,79.2]",..: 2 3 2 1 2 3 5 2 1 1 ...
# $ ParMon_P.1    : Factor w/ 5 levels "[11.9,12.6]",..: 1 1 5 2 2 1 4 4 3 3 ...
levels(test2[,7])
#[1] "[42,48.7]" "(48.7,50.9]" "(50.9,52.8]" "(52.8,54.2]" "(54.2,58.7]"

Do you want to replace this with 1:5?

levels(test2[,8])
#[1] "[76.2,79.2]" "(79.2,80.5]" "(80.5,81.9]" "(81.9,83.5]" "(83.5,85]"
as.numeric(test2[,7])
#[1] 2 5 3 2 3 5 5 4 4 3 3 2 1 3 2 1 1 4 2 5 4 5 3 3 1 1 5 1 4 5 4 5 4 3 1 2 1 4
#[39] 4 5 2 1 2 1 1 5 3 4 3 2 2

A.K.

----- Original Message -----
From: Muhuri, Pradip (SAMHSA/CBHSQ) pradip.muh...@samhsa.hhs.gov
To: R help r-help@r-project.org
Cc:
Sent: Tuesday, January 8, 2013 10:06 PM
Subject: Re: [R] Applying a user-defined function

Hello List,

Last time, Arun's following solution worked to create 3 new columns (1,3,5). Now how would I tweak this function to create corresponding (additional) columns (7,8,9) of mode factor (levels = 1,2,3,4,5)?

Thanks for your continued support.
Pradip

### cut and paste from the reproducible example
CutQuintiles <- function( x) {
  cut (x,quantile (x, (0:5/5)),include.lowest=TRUE)
}
#apply the CutQuintile () on every odd-numbered columns of the test1 data frame
test1$newcols <- sapply(test1 [, seq (1,6,2)], CutQuintiles)
# name 3 new columns based on the odd-numbered columns
names(test1$newcols) <- paste (names(test1 [, seq (1,6,2)]), "_cat")

## Reproducible Example
test1 <- read.table (text="
State,ObtMj_P,ObtMj_SE,ExpPrevMed_P,ExpPrevMed_SE,ParMon_P,ParMon_SE
Alabama,49.60,1.37,80.00,0.91,12.10,0.68
Alaska,55.00,1.41,81.80,1.08,12.40,0.90
Arizona,52.50,1.56,79.60,1.20,15.80,1.08
Arkansas,50.50,1.22,78.00,0.78,12.80,0.72
California,51.10,0.65,80.50,0.53,13.00,0.41
Colorado,55.10,1.26,81.70,1.03,12.10,0.72
Connecticut,56.30,1.28,85.00,0.93,14.60,0.77
Delaware,53.60,1.30,79.50,1.04,14.70,0.97
District of Columbia,53.50,1.22,76.20,1.03,14.30,1.13
Florida,52.70,0.67,78.90,0.52,14.10,0.45
Georgia,52.50,1.15,79.30,1.02,15.90,0.98
Hawaii,49.40,1.33,83.80,1.12,16.00,1.06
Idaho,48.30,1.23,82.40,0.99,11.90,0.74
Illinois,52.70,0.63,81.00,0.46,13.60,0.40
Indiana,49.60,1.16,80.90,0.91,12.60,0.82
Iowa,46.30,1.37,82.10,1.01,13.60,0.87
Kansas,44.30,1.43,79.20,0.98,12.90,0.79
Kentucky,52.90,1.37,78.70,1.05,14.60,0.98
Louisiana,49.70,1.23,76.80,1.06,14.50,0.76
Maine,55.60,1.44,82.90,0.93,16.70,0.83
Maryland,53.90,1.46,83.60,0.95,14.00,0.80
Massachusetts,55.40,1.41,81.00,1.15,14.70,0.80
Michigan,52.40,0.62,80.50,0.47,15.00,0.43
Minnesota,51.50,1.20,84.40,0.87,14.40,0.86
Mississippi,43.20,1.14,76.60,0.91,12.30,0.78
Missouri,48.70,1.20,80.30,0.90,13.70,0.12
Montana,56.40,1.16,83.70,0.95,12.10,0.68
Nebraska,45.70,1.51,83.40,0.95,12.40,0.90
Nevada,54.20,1.17,80.60,1.07,15.80,1.08
New Hampshire,56.10,1.30,83.30,0.93,12.80,0.72
New Jersey,53.20,1.45,83.70,0.95,13.00,0.41
New Mexico,57.60,1.34,78.90,1.03,12.10,0.72
New York,53.70,0.67,82.60,0.48,14.60,0.77
North Carolina,52.20,1.26,81.90,0.84,14.70,0.97
North Dakota,48.60,1.34,84.20,0.88,14.30,1.13
Ohio,50.90,0.61,82.70,0.49,14.10,0.45
Oklahoma,47.20,1.42,78.80,1.33,15.90,0.98
Oregon,54.00,1.35,80.60,1.14,16.00,1.06
Pennsylvania,53.00,0.63,79.90,0.47,11.90,0.74
Rhode Island,57.20,1.20,79.50,1.02,13.60,0.40
South Carolina,50.50,1.21,79.50,0.95,12.60,0.82
South Dakota,43.40,1.30,81.70,1.05,13.60,0.87
Tennessee,48.90,1.35,78.40,1.35,12.90,0.79
Texas,48.70,0.62,79.00,0.48,14.60,0.98
Utah,42.00,1.49,85.00,0.93,14.50,0.76
Vermont,58.70,1.24,83.70,0.84,16.70,0.83
Virginia,51.80,1.18,82.00,1.04,14.00,0.80
Washington,53.50,1.39,84.10,0.96,14.70,0.80
West Virginia,52.80,1.07,79.80,0.93,15.00,0.43
Wisconsin,49.90,1.50,83.50,1.02,14.40,0.86
Wyoming,49.20,1.29,82.00,0.85,12.30,0.78
", sep=",", row.names='State', header=TRUE, as.is=TRUE)

# change names () to lower case
names (test1) <- tolower (names (test1))

#Write a cut/quantile function to apply on different columns of the data frame
CutQuintiles <- function( x) {
  cut (x,quantile (x, (0:5/5)),include.lowest=TRUE)
}

#apply the CutQuintile () on every odd-numbered columns of the test1 data frame
test1$newcols <- sapply(test1 [, seq (1,6,2)], CutQuintiles)

# name 3 new columns based on the odd-numbered columns
names(test1$newcols) <- paste
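One way to get the new columns as factors with levels 1 to 5, rather than a matrix, is to use lapply() instead of sapply() and to pass labels to cut(). The sketch below is only an illustration under assumptions: the small data frame is simulated, its column names are made up, and the "_cat" suffix follows the naming idea in the question.

```r
# Quintile cutter that labels the five bins 1..5 (labels = 1:5 is the tweak)
CutQuintiles <- function(x) {
  cut(x, quantile(x, 0:5 / 5), include.lowest = TRUE, labels = 1:5)
}

# Simulated stand-in for test1 (assumption: estimates in odd columns, SEs in even)
set.seed(42)
test1 <- data.frame(obtmj_p = runif(51, 40, 60), obtmj_se = runif(51),
                    expprevmed_p = runif(51, 75, 85), expprevmed_se = runif(51))

odd <- seq(1, ncol(test1), by = 2)
newcols <- lapply(test1[odd], CutQuintiles)   # lapply keeps each result a factor
names(newcols) <- paste0(names(test1)[odd], "_cat")
test2 <- cbind(test1, as.data.frame(newcols))
str(test2)
```

sapply() simplifies its result to a character matrix, which is why the factor structure was lost in the earlier attempt; lapply() returns a list that converts cleanly to data frame columns.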
Re: [R] plot x-axis DateTime NOT evenly spaced
Thanks! It was really helpful.

soichi

2013/1/7 arun smartpink...@yahoo.com
> Hi,
> Try this:
> dat1<-read.table(text="
> 1 2012-07-01 00:57:54 +0900 156
> 2 2012-07-01 01:07:41 +0900 587
> 3 2012-07-01 01:09:31 +0900 110
> 4 2012-07-01 01:18:42 +0900 551
> 5 2012-07-01 01:39:01 +0900 1219
> 6 2012-07-01 01:40:40 +0900 99
> ",sep="",header=FALSE,stringsAsFactors=FALSE)
> dat2<-data.frame(date=paste(dat1[,2],dat1[,3],paste0("+",dat1[,4]),sep=" "),Interval=dat1[,5])
> dat2$date<-as.POSIXct(dat2$date,format="%Y-%m-%d %H:%M:%S")
> library(xts)
> dat3<-xts(dat2[,-1],order.by=dat2[,1])
> plot(dat3)
> A.K.
>
> ----- Original Message -----
> From: ishi soichi soichi...@gmail.com
> To: r-help r-help@r-project.org
> Cc:
> Sent: Monday, January 7, 2013 3:55 AM
> Subject: [R] plot x-axis DateTime NOT evenly spaced
>
> R-64 latest
>
> Hi. I am trying to plot a set of csv data, which looks like
>
> head(interval)
>                        date inteval
> 1 2012-07-01 00:57:54 +0900     156
> 2 2012-07-01 01:07:41 +0900     587
> 3 2012-07-01 01:09:31 +0900     110
> 4 2012-07-01 01:18:42 +0900     551
> 5 2012-07-01 01:39:01 +0900    1219
> 6 2012-07-01 01:40:40 +0900      99
>
> As you can see, more than one event happens each day, and they are not evenly spaced. Obviously hours, minutes and seconds are important for the plot.
>
> I tried
>
> interval$date <- as.Date(interval$date, "%Y-%m-%d %H:%M:%S +0900")
>
> but this chops the time off.
>
> Could anyone show me how to plot data with x values as Date (or Time) objects?
>
> soichi
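The same plot can probably be produced in base R alone, because as.POSIXct() keeps the time of day that as.Date() discards. The sketch below retypes the six rows shown in the question; the "%z" conversion for the "+0900" offset and the "Asia/Tokyo" display time zone are assumptions about the platform and the data.

```r
# The six observations from the question, retyped as a data frame
interval <- data.frame(
  date = c("2012-07-01 00:57:54 +0900", "2012-07-01 01:07:41 +0900",
           "2012-07-01 01:09:31 +0900", "2012-07-01 01:18:42 +0900",
           "2012-07-01 01:39:01 +0900", "2012-07-01 01:40:40 +0900"),
  interval = c(156, 587, 110, 551, 1219, 99),
  stringsAsFactors = FALSE
)

# Parse date AND time; %z reads the numeric UTC offset (assumed to be supported)
interval$date <- as.POSIXct(interval$date,
                            format = "%Y-%m-%d %H:%M:%S %z", tz = "Asia/Tokyo")

# x positions now reflect the real, unevenly spaced time stamps
plot(interval$date, interval$interval, type = "b",
     xlab = "time", ylab = "interval (s)")
```

Because the x values are genuine date-time objects, the points are placed at their true times and the axis is labelled with clock times automatically.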
Re: [R] how to label two figures in the same chunk independently with knitr
Dear Yihui,

thanks a lot for your kind reply. Your solution is very elegant and versatile. However, there is a point that remains obscure to me, and I didn't manage to fully understand it after looking at the knitr manual and the graphics manual. The issue concerns the hook:

knit_hooks$set(par = function(before, options, envir) {
  if (before) par(mar = c(4, 4, .1, .1))
})

Why do you set par as a function? Moreover, below you write:

par(bg=rgb(runif(1), runif(1), runif(1)))

Does this mean that before = rgb(runif(1)), options = runif(1) and envir = runif(1)? And what does this produce? I don't understand what's going on; can you please help me or point me to some documentation?

thanks in advance for your kind help,
f.

On 8 January 2013 18:03, Yihui Xie x...@yihui.name wrote:
> All you mentioned are possible; knitr has very comprehensive support for figures in LaTeX, and what you want in this case is subfigures (\usepackage{subfig}); here is an example:
> https://github.com/yihui/knitr-examples/blob/master/067-graphics-options.Rnw
> (search for 'fig.subcap' for the relevant chunk)
>
> And here is a preview: http://i.imgur.com/4lKpw.png
>
> Regards,
> Yihui
> --
> Yihui Xie xieyi...@gmail.com
> Phone: 515-294-2465 Web: http://yihui.name
> Department of Statistics, Iowa State University
> 2215 Snedecor Hall, Ames, IA
>
> On Tue, Jan 8, 2013 at 4:17 AM, Francesco Sarracino f.sarrac...@gmail.com wrote:
>> Dear R helpers,
>>
>> I am using knitr to run analyses with R and edit my document with LaTeX. I am wondering whether there is a way to include 2 or more pictures per chunk and still be able to refer to them in the text independently, and whether it is possible to give them different captions. Let me give you an example.Rnw:
>>
>> \documentclass{article}
>> \title{Example}
>> \author{FS}
>> \begin{document}
>> \maketitle
>> I put some text here. I want to plot two charts in the same figure and label them independently.
>> <<stat, echo = FALSE, results = 'hide'>>=
>> ii <- 2000:2011
>> xx <- rnorm(12,0,1)
>> yy <- rnorm(12,0,1)
>> pm <- data.frame(ii,xx,yy)
>> @
>> Now I generate the two pictures and put them into the same chunk with the option out.width set to .49 so that knitr places the two charts side by side:
>> <<fig:example, echo = FALSE, out.width=".49\\linewidth", fig.cap="this is an example">>=
>> plot(ii,xx, type = "l")
>> plot(ii,yy, type = "l", lty = 2)
>> @
>> Finally, I want the reader to look at the figure on the left.
>> \end{document}
>>
>> How can I do this? If I refer to \ref{fig:example} I will get the number of the figure, but not of the chart on the left. Also, is it possible to have separate captions for each chart?
>>
>> Thanks in advance for your kind help,
>> f.
>>
>> --
>> Francesco Sarracino, Ph.D.
>> https://sites.google.com/site/fsarracino/

--
Francesco Sarracino, Ph.D.
https://sites.google.com/site/fsarracino/
[R] deparse substitute
Hi,

I'm writing a function that needs the input names (as character strings) as part of the output. With deparse(substitute( )) that works fine, until I replace all zeros with 0.001 (a log is calculated at some point):

tf <- function(input) { input[input==0] <- 0.001 ; deparse(substitute(input)) }
myguess <- 42
tf(myguess) # not "myguess", but "42"

Now when I extract the input names before replacing the zeros, this works:

tf <- function(input) { out <- deparse(substitute(input)) ; input[input==0] <- 0.001 ; out }
tf(myguess) # correct: "myguess"
myguess <- 0 ; tf(myguess) # ditto

While I did find a workaround, I'm still wondering why this happens. Any hints on where to start reading?

Thanks ahead,
Berry

PS: R version 2.15.1 (2012-06-22) -- "Roasted Marshmallows"
Windows 7 - Platform: x86_64-pc-mingw32/x64 (64-bit)
Re: [R] t-test behavior given that the null hypothesis is true
On 09-Jan-2013 08:50:46 Pavlos Pavlidis wrote:
> Dear all,
> I observer a strange behavior of the pvalues of the t-test under the null hypothesis. Specifically, I obtain 2 samples of 3 individuals each from a normal distribution of mean 0 and variance 1. Then, I calculate the pvalue using the t-test (var.equal=TRUE, samples are independent). When I make a histogram of pvalues I see that consistently the bin of the smallest pvalues has a lower frequency. Is this a known behavior of the t-test or it's a kind of bug/random number generation problem?
>
> kind regards,
> idaios

Using the following code, I did not observe the behaviour you describe. The histograms are consistent with a uniform distribution of the P-values, and the lowest bin for the P-values (when the code is run repeatedly) is not consistently lower (or higher, or anything else) than the other bins.

## My code:
N <- 1
Ps <- numeric(N)
for(i in (1:N)){
  X1 <- rnorm(3,0,1) ; X2 <- rnorm(3,0,1)
  Ps[i] <- t.test(X1,X2,var.equal=TRUE)$p.value
}
hist(Ps)

If you would post the code you used, the reason why you are observing this may become more evident!

Hoping this helps,
Ted.

-
E-Mail: (Ted Harding) ted.hard...@wlandres.net
Date: 09-Jan-2013 Time: 10:29:21
This message was sent by XFMail
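A compact variant of the simulation described above is sketched here so that the uniformity claim can be checked numerically as well as visually. The replicate count n_sim = 10000, the seed, and the bin width of 0.05 are assumptions chosen for this illustration, not values taken from the thread.

```r
set.seed(123)     # assumed seed, only for reproducibility
n_sim <- 10000    # assumed number of simulated experiments

# One equal-variance two-sample t-test per replicate, both samples N(0, 1), n = 3
p_vals <- replicate(n_sim,
                    t.test(rnorm(3), rnorm(3), var.equal = TRUE)$p.value)

# Bin counts for 20 equal-width bins; use hist(p_vals) to draw the picture
counts <- hist(p_vals, breaks = seq(0, 1, by = 0.05), plot = FALSE)$counts
counts

# Under the null hypothesis roughly 5% of p-values should fall below 0.05
mean(p_vals < 0.05)
```

If the first bin were systematically depleted, the last quantity would sit clearly below 0.05 across repeated runs with different seeds.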
[R] select partial name and full name columns
Hi,

I have the following function:

getDataFromDVFileCustom <- function (file, hasHeader = TRUE, separator = "\t") {
  DVdatatmp <- as.matrix(read.table(file, sep = "\t", fill = TRUE, comment.char = "#", as.is = TRUE, stringsAsFactors = FALSE, na.strings = "NA"))
  DVdatatmper <- as.matrix(DVdatatmp[ , c("datetime", grep("^_00060_3", colnames(DVdatatmp)))])
  retval <- as.data.frame(DVdatatmper, colClasses = c("character"), fill = TRUE, comment.char = "#", stringsAsFactors = FALSE)
  if (ncol(retval) == 2) {
    names(retval) <- c("dateTime", "value")
  } else if (ncol(retval) == 3) {
    names(retval) <- c("dateTime", "value", "code")
  }
  if (dateFormatCheck(retval$dateTime)) {
    retval$dateTime <- as.Date(retval$dateTime)
  } else {
    retval$dateTime <- as.Date(retval$dateTime, format = "%m/%d/%Y")
  }
  retval$value <- as.numeric(retval$value)
  return(retval)
}

The function gives me this error:

getDataFromDVFileCustom(file)
Error in as.matrix(DVdatatmp[, c("datetime", grep("^_00060_3", colnames(DVdatatmp)))]) :
  subscript out of bounds

I am trying to select only 3 columns: datetime and then two partial-name columns that end in 00060_3 and 00060_3_cd. Each file that I will be reading into the function has a different number of columns and a different prefix in front of 00060_3 and 00060_3_cd. I have searched online and tried the suggested solutions, but they did not work for my function and data.

What is the best way to select those 3 columns only?

Thank you.
Irucka Embry
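A possible way to pick the three columns is sketched below. Two things in the posted call look likely to cause the "subscript out of bounds" error: c("datetime", grep(...)) mixes a name with integer positions, so the positions are coerced to strings such as "3" that are not column names, and the "^" anchor only matches names that start with _00060_3 even though the real names carry a prefix. The data frame and its column names here are hypothetical stand-ins, since no file was posted.

```r
# Hypothetical stand-in for one parsed file (the prefix "X02_" is made up)
DVdata <- data.frame(
  agency_cd      = c("USGS", "USGS"),
  datetime       = c("2012-01-01", "2012-01-02"),
  X02_00060_3    = c(12.5, 13.1),
  X02_00060_3_cd = c("A", "A"),
  X03_00065_3    = c(1.2, 1.3),
  stringsAsFactors = FALSE
)

# value = TRUE makes grep() return names, so the selector is all character;
# "$" anchors the pattern at the END of the name, whatever the prefix is
keep <- c("datetime",
          grep("_00060_3(_cd)?$", names(DVdata), value = TRUE))
retval <- DVdata[, keep, drop = FALSE]
names(retval) <- c("dateTime", "value", "code")[seq_along(keep)]
retval
```

This assumes the file was read with its header row, so that the column names are available for matching in the first place.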
Re: [R] deparse substitute
On 13-01-09 5:03 AM, Berry Boessenkool wrote:
> Hi,
>
> I'm writing a function that needs the input names (as character strings) as part of the output. With deparse(substitute( )) that works fine, until I replace all zeros with 0.001 (log is calculated at some time):
>
> tf <- function(input) { input[input==0] <- 0.001 ; deparse(substitute(input)) }
> myguess <- 42
> tf(myguess) # not "myguess", but "42"
>
> Now when I extract the input names before replacing the zeros, this works:
>
> tf <- function(input) { out <- deparse(substitute(input)) ; input[input==0] <- 0.001 ; out }
> tf(myguess) # correct: "myguess"
> myguess <- 0 ; tf(myguess) # ditto
>
> While I did find a workaround, I'm still wondering why this happens. Any hints on where to start reading?

Probably the R Language definition, section 2.1.8.

The basic explanation for the behaviour you see is that deparse(substitute(input)) acts on the "input" promise object, looking at its expression slot. It's not doing any magic examination of the context in which it was originally defined. Once you modify it, it is no longer a promise, and so it has no expression slot.

Duncan Murdoch
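The behaviour discussed in this thread can be reproduced with the two variants side by side. The function names tf_late and tf_early are introduced here only to tell the two versions apart; the bodies follow the code in the question.

```r
# substitute() is called AFTER the argument has been modified:
# 'input' is then an ordinary local variable, so its value is deparsed
tf_late <- function(input) {
  input[input == 0] <- 0.001
  deparse(substitute(input))
}

# substitute() is called FIRST, while 'input' is still an unevaluated promise:
# the expression the caller typed is recovered
tf_early <- function(input) {
  out <- deparse(substitute(input))
  input[input == 0] <- 0.001
  out
}

myguess <- 42
tf_late(myguess)    # "42"
tf_early(myguess)   # "myguess"
```

Capturing the name on the first line of a function, before any assignment to the argument, is the usual defensive pattern.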
[R] R-Forge package check error. Package dependencies on linux platform
Dear All,

I've got an error in the R-Forge package check. When the package is checked on the Windows and Mac platforms there is no error, only one note regarding the maintainer. However, the check on the Linux platform fails with the error below. Because of this error I couldn't submit my package to CRAN. Please help me solve the problem.

BcDiag log file (check_x86_64_linux)

Sun Dec 30 16:15:16 2012: Checking package BcDiag (SVN revision 7) ...
* using log directory /mnt/building/build_2012-12-30-16-05/RF_PKG_CHECK/PKGS/BcDiag.Rcheck
* using R version 2.15.2 Patched (2012-12-14 r61333)
* using platform: x86_64-unknown-linux-gnu (64-bit)
* using session charset: UTF-8
* checking for file BcDiag/DESCRIPTION ... OK
* this is package BcDiag version 1.0
* checking CRAN incoming feasibility ... NOTE
Maintainer: Aregay Mengsteab
New submission
* checking package namespace information ... OK
* checking package dependencies ... ERROR
Packages required but not available: isa2 fabia
Packages suggested but not available for checking: isa2 fabia

Regards,
Michael
Re: [R] error in a abline loop
Hi

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of arun
> Sent: Tuesday, January 08, 2013 3:30 PM
> To: Elaine Kuo
> Cc: R help
> Subject: Re: [R] error in a abline loop
>
> HI Elaine,
>
> In the data you sent to me, it had 5 levels for skin_color.
> data1<-read.csv("skin_color.csv",sep="\t")
> data1$skin_color<-factor(data1$skin_color)
> levels(data1$skin_color)
> #[1] "1" "2" "3" "4" "5"
> mypath<-file.path("/home/arun/Trial1",paste("Elaine_",1:5,".jpg",sep="")) #change the file.path according to your system

Instead of multiple files you can create a multipage pdf:

pdf("myfile.pdf")
for(i in 1:5){
  # jpeg(file=mypath[i])
  plot(body_weight~body_length,data=data1[data1$skin_color==i,])
  line<-lm(body_weight~body_length,data=data1[data1$skin_color==i,])
  abline(line,col=c("yellow","blue","green","orange","red")[i],lwd=2)
}
dev.off()

see ?pdf for details

Regards
Petr

> # }
> #or
> lapply(seq_along(mypath),function(i) {
>   jpeg(file=mypath[i])
>   line<-lm(body_weight~body_length,data=data1[data1$skin_color==i,])
>   plot(body_weight~body_length,data=data1[data1$skin_color==i,])
>   abline(line,col=c("yellow","blue","green","orange","red")[i],lwd=2)
>   dev.off()
> })
> A.K.
>
> ----- Original Message -----
> From: Elaine Kuo elaine.kuo...@gmail.com
> To: arun smartpink...@yahoo.com
> Cc:
> Sent: Monday, January 7, 2013 9:48 PM
> Subject: Re: [R] error in a abline loop
>
> Hello arun
>
> Thank you always. Please kindly find the attached data for your reference.
>
> Elaine
>
> On Tue, Jan 8, 2013 at 10:00 AM, arun smartpink...@yahoo.com wrote:
>
> HI,
>
> A possible guess (with no data):
> for (i in 1:7) {
>   subs <- data$skin_color==levels(data$skin_color)[i]
>   line<-lm(body_weight~body_length, data=subset(data, subset=subs)) #closing parenthesis for lm( was missing
>   abline(line,col=c("yellow","chocolate1","darkorange2", "red3","saddlebrown","coral4","grey38")[i],lwd=2)
> }
> A.K.
> ----- Original Message -----
> From: Elaine Kuo elaine.kuo...@gmail.com
> To: r-help@r-project.org
> Cc:
> Sent: Monday, January 7, 2013 8:23 PM
> Subject: [R] error in a abline loop
>
> Hello
>
> I have data on body length and body weight of people of different skin colors. I tried to write code to plot body length and body weight according to the skin colors. (Thanks for Petr's advice so far.)
>
> A loop is used, but an error shows up in the following code. It says:
> unexpected '}' in "red3","red3","saddlebrown","coral4","chocolate4","darkblue","navy","grey38")[i],lwd=2) }
>
> Please kindly advise how to modify the code. Thank you.
>
> The code
>
> data <-read.csv("H:/skincolor.csv",header=T)
> # graph
> par(mai=c(1.03,1.03,0.4,0.4))
> plot(data$body_weight, data$body_length, xaxp=c(0,200,4), yaxp=c(0,200,4), type="p", pch=1,lwd=1.0, cex.lab=1.4, cex.axis=1.2, font.axis=2, cex=1.5, las=1, bty="l",col=c("yellow","chocolate1","darkorange2", "red3","saddlebrown","coral4","grey38")[as.numeric(data$skin_color)])
> #~ ~ ~ ##
> for (i in 1:7) {
> subs <- data$skin_color==levels(data$skin_color)[i]
> line<-lm(body_weight~body_length, data=subset(data, subset=subs),
> abline(line,col=c("yellow","chocolate1","darkorange2", "red3","saddlebrown","coral4","grey38")[i],lwd=2)
> }
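For reference, the loop from this thread can be written so that lm() is closed before abline() is called. The sketch below runs on simulated data, because the original CSV file is not available; the three skin_color levels, the colours, and the coefficients of the simulation are assumptions made for the illustration.

```r
# Simulated stand-in for the posted data (assumption: 3 groups, 20 rows each)
set.seed(1)
data1 <- data.frame(body_length = runif(60, 140, 200),
                    skin_color = factor(rep(1:3, each = 20)))
data1$body_weight <- 0.5 * data1$body_length + rnorm(60, sd = 5)

cols <- c("yellow", "blue", "red")
lev  <- levels(data1$skin_color)

# One scatter plot for all points, coloured by group
plot(body_weight ~ body_length, data = data1,
     col = cols[as.integer(data1$skin_color)])

# One regression line per group; lm() is complete before abline() uses it
fits <- lapply(seq_along(lev), function(i) {
  fit <- lm(body_weight ~ body_length,
            data = data1[data1$skin_color == lev[i], ])
  abline(fit, col = cols[i], lwd = 2)
  fit
})
```

Looping over seq_along(levels(...)) instead of a hard-coded 1:7 avoids fitting a model to an empty subset when the data have fewer groups than expected.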
[R] update.packages problem
I've updated to R-devel on my development machine and have lots of packages. The update.packages() script ends up with 33 failures, all due to out-of-order reloading. That is, if package "abc" depends on package "xyz", then the reinstall of abc fails with a message that "version of xyz is built before R 3.0.0: please re-install it".

So I ran it a second time, and got 32 failures. There should be a better way.
Re: [R] update.packages problem
On 09/01/2013 12:52, Terry Therneau wrote:
> I've updated to R-devel on my development machine and have lots of packages. The update.packages() script ends up with 33 failures, all due to out-of-order reloading. That is, if package "abc" depends on package "xyz", then the reinstall of abc fails with a message that "version of xyz is built before R 3.0.0: please re-install it".
>
> So I ran it a second time, and got 32 failures. There should be a better way.

There is, on the help page!

checkBuilt: If ‘TRUE’, a package built under an earlier minor version of R is considered to be ‘old’.

As the NEWS file says, right at the top

Packages need to have been installed under this version of R. (Pro tem, this is considered to be R-devel from April 2012.)

so my guess is that 'xyz' was not even installed under R-devel but 2.15.x.

We don't usually discuss development versions of R on R-help.

PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
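In code, the checkBuilt argument mentioned above would be passed roughly as follows. This is a sketch: the wrapper name refresh_packages is hypothetical, ask = FALSE is an added convenience rather than part of the advice, and the call itself is left commented out because it reinstalls packages from the configured repository.

```r
# Hypothetical wrapper: treat packages built under an earlier R as "old",
# so that update.packages() reinstalls them as well
refresh_packages <- function(...) {
  utils::update.packages(checkBuilt = TRUE, ask = FALSE, ...)
}

# Not run here (needs network access and write permission to the library):
# refresh_packages()
```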
Re: [R] problems regarding the latest version of the package BRugs
I think we need a much clearer statement of the problem. This link provides some suggestions on how to frame the problem:
https://github.com/hadley/devtools/wiki/Reproducibility

John Kane
Kingston ON Canada

> -----Original Message-----
> From: mcmoumi...@gmail.com
> Sent: Wed, 9 Jan 2013 08:55:25 +0530
> To: r-help@r-project.org
> Subject: [R] problems regarding the latest version of the package BRugs
>
> Respected Sir/Madam,
> I am a research scholar of Department of Statistics, University Of Calcutta. I had downloaded the latest version of BRugs, and installed it in R 2.15.1 both in 32 and 64 bits with the help of openBUGS 3.2.2. My problem is that one of the programmes which

What programme? An R package, or one you have written yourself? Please provide code and some sample data.

The easiest way to supply data is to use the dput() function. Example with your file named "testfile":
dput(testfile)
Then copy the output and paste it into your email. For large data sets, you can just supply a representative sample. Usually,
dput(head(testfile, 100))
will be sufficient.

> requires the package BRugs is giving me an error given below:
> Error in samplesSize(node) : node must be a scalar variable from the model
> However, other persons, who are running the programme using the earlier version of this package, can run the programme without any error. I am unable to find where the actual problem lies.
> I would be very much obliged if you kindly give me a solution of the problem.
> Thanking you.
> Moumita Chatterjee.
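As an illustration of the dput() advice, the sketch below builds a small made-up data frame named testfile (following the name used above), prints its dput() form, and rebuilds the object from that text, which is what a reader of the email would do after pasting it.

```r
# Made-up sample data standing in for the poster's real data set
testfile <- data.frame(id = 1:3, score = c(2.5, 3.1, 4.8))

# Text representation to paste into an email (first 100 rows at most)
dput(head(testfile, 100))

# What the recipient does: evaluate the pasted text to rebuild the object
pasted  <- paste(capture.output(dput(head(testfile, 100))), collapse = "\n")
rebuilt <- eval(parse(text = pasted))
identical(rebuilt, testfile)
```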
[R] Need an advise for bayesian estimate
Hi R bayesians,

I need advice on how to reconcile the different estimates obtained from a traditional glm (TG) and a Bayesian glm (BG), and the different results that bayesglm in R gives depending on the format of the response data and the prior specification. I'm not familiar with Bayesian estimation; my colleague asked me to look into this because the EPA from France reported quite different estimates for the following ethylene data, applying a Bayesian method using MCSIM. As seen below, glm gives the same results regardless of the response data format, i.e., two-column or binary, but bayesglm gives different results. The result from the French report is -91.78 + 8*lnC + 6.055*lnT, which lies between the bayesglm estimates obtained with prior.df=1 and prior.df=2 when applied to the binary data format in R.

My questions are as follows:

1. What is the advantage of using a Bayes estimate? Is it better for small samples?
2. How to resolve the different estimates depending on the format of the response data and the prior specs (e.g. prior.df)?
3. Should we use an interval estimate rather than a point estimate for BG?

Two-column format:

  cppm Tmin     lnC     lnT Death Number
1 1850  240 7.52294 5.48064     5      5
2 1637  240 7.40062 5.48064     4      5
3 1443  240 7.27448 5.48064     1      5
4 1021  240 6.92854 5.48064     0      5
5 4827   60 8.48198 4.09434     5      5
6 4202   60 8.34332 4.09434     1      5
7 4064   60 8.30992 4.09434     5      5
8 3966   60 8.28551 4.09434     2      5
9 3609   60 8.19119 4.09434     0      5

Binary data format:

   cppm Tmin     lnC     lnT resp
1  1850  240 7.52294 5.48064    1
2  1850  240 7.52294 5.48064    1
3  1850  240 7.52294 5.48064    1
4  1850  240 7.52294 5.48064    1
5  1850  240 7.52294 5.48064    1
6  1637  240 7.40062 5.48064    1
7  1637  240 7.40062 5.48064    1
8  1637  240 7.40062 5.48064    1
9  1637  240 7.40062 5.48064    1
10 1637  240 7.40062 5.48064    0
11 1443  240 7.27448 5.48064    1
12 1443  240 7.27448 5.48064    0
13 1443  240 7.27448 5.48064    0
14 1443  240 7.27448 5.48064    0

attach(ethylene)
DL <- cbind(Death, Alive = Number - Death)

Call: glm(formula = DL ~ lnC + lnT, family = binomial(link = probit), data = ethylene)

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -145.156     43.668  -3.324 0.000887 ***
lnC           12.972      3.918   3.311 0.000931 ***
lnT            9.122      2.736   3.335 0.000854 ***

Using binary data:

Call: glm(formula = resp ~ lnC + lnT, family = binomial(link = probit), data = ethylene.mod)

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -145.157     43.670  -3.324 0.000887 ***
lnC           12.972      3.918   3.311 0.000931 ***
lnT            9.122      2.736   3.334 0.000855 ***

Using bayesglm with two-column data:

summary(result3)

Call: bayesglm(formula = DL ~ lnC + lnT, family = binomial(link = probit), data = ethylene)

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -134.971     17.490  -7.717 1.19e-14 ***
lnC           12.060      1.570   7.680 1.59e-14 ***
lnT            8.485      1.095   7.751 9.11e-15 ***

Using bayesglm with binary data:

Call: bayesglm(formula = resp ~ lnC + lnT, family = binomial(link = probit), data = ethylene.mod)

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -98.477     26.919  -3.658 0.000254 ***
lnC            8.792      2.423   3.628 0.000286 ***
lnT            6.208      1.681   3.694 0.000221 ***
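The observation in the post above that plain glm gives the same fit for both response formats can be checked with a short sketch. It assumes the grouped data are as transcribed in the post and that the binary format is simply one 0/1 row per animal; the object names mirror the post, while the expansion step is an illustration rather than the poster's own code. bayesglm (from the arm package) is deliberately left out, since its prior is applied differently and is the subject of the question.

```r
# Grouped (two-column) data as transcribed from the post.
ethylene <- data.frame(
  cppm   = c(1850, 1637, 1443, 1021, 4827, 4202, 4064, 3966, 3609),
  Tmin   = c(240, 240, 240, 240, 60, 60, 60, 60, 60),
  Death  = c(5, 4, 1, 0, 5, 1, 5, 2, 0),
  Number = 5
)
ethylene$lnC <- log(ethylene$cppm)
ethylene$lnT <- log(ethylene$Tmin)

# Expand every group into one 0/1 row per animal (the "binary" format).
idx <- rep(seq_len(nrow(ethylene)), times = ethylene$Number)
ethylene.mod <- ethylene[idx, c("cppm", "Tmin", "lnC", "lnT")]
ethylene.mod$resp <- unlist(Map(function(d, n) rep(c(1, 0), c(d, n - d)),
                                ethylene$Death, ethylene$Number))

# Same probit model, two response formats.
fit_grouped <- glm(cbind(Death, Number - Death) ~ lnC + lnT,
                   family = binomial(link = "probit"), data = ethylene)
fit_binary  <- glm(resp ~ lnC + lnT,
                   family = binomial(link = "probit"), data = ethylene.mod)

print(round(cbind(grouped = coef(fit_grouped),
                  binary  = coef(fit_binary)), 3))
```

The two likelihoods differ only by a constant factor, so the maximum likelihood estimates should agree up to convergence tolerance, which is consistent with the glm output quoted in the post.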
Re: [R] how to count A, C, T, G in each row in a big data.frame?
forgot the data. this will count the characters; you can add logic with 'table' to count groups x - structure(list(name = c(Gga_rs10722041, Gga_rs10722249, Gga_rs10722565, Gga_rs10723082, Gga_rs10723993, Gga_rs10724555, Gga_rs10726238, Gga_rs10726461, Gga_rs10726774, Gga_rs10726967, Gga_rs10727581, Gga_rs10728004, Gga_rs10728156, Gga_rs10728177, Gga_rs10728373, Gga_rs10728585, Gga_rs10729598, Gga_rs10729643, Gga_rs10729685, Gga_rs10729827), chr = c(7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L), pos = c(11248993L, 20038370L, 16164457L, 38050527L, 20307106L, 13707090L, 12230458L, 36732967L, 2790856L, 1305785L, 29631963L, 13606593L, 13656397L, 2261611L, 32096703L, 13733153L, 16524147L, 558735L, 12514023L, 3619538L), strand = c(+, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +), X2353 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, AG, AG, AG, TT, CC, AG, CC, AA, GG, GG), X2409 = c(AA, CT, TT, CC, CT, CC, CC, TT, CC, GG, GG, AG, AG, TT, CC, AG, CC, AA, AG, GA), X2500 = c(GA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GG, GG, GG, GT, CT, GG, CC, AA, AA, AA), X2598 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, AA, AG, GG, TT, CC, AG, TC, AA, AA, AG), X2610 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GA, GA, GG, TT, CC, GA, CC, AA, AA, GA), X2300 = c(GA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GA, AA, AG, TT, TC, AA, TC, AA, AG, AA), X2507 = c(AG, TT, TT, CC, TT, CC, CC, TT, CC, GG, GG, GA, GG, TT, TC, GG, CC, AA, GA, AG), X2530 = c(AG, TC, TT, CC, TC, CC, CC, TT, CC, GG, AA, GG, GG, TT, CC, GG, CC, AA, AA, AA), X2327 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GA, GG, GG, TT, TC, GG, CC, AA, AA, AA), X2389 = c(AA, CC, TT, CC, CC, CC, CC, TT, CC, AG, GG, AG, GG, TT, TC, AG, CC, AA, AA, AA), X2408 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GA, GA, GG, TT, CC, GA, CC, AA, AA, AG), X2463 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GG, GG, GG, TT, CT, GG, CC, AA, AA, GA), X2420 = c(GA, TC, TT, CC, TC, CC, CC, TT, CC, GG, AG, GG, GG, TG, TT, GG, CT, 
AA, AA, AA), X2563 = c(GA, CC, TT, CC, TC, CC, CC, TT, CC, GG, GA, GG, GG, GT, TT, GG, CT, AA, AA, AA), X2462 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, AA, GG, GG, GT, TC, GG, CC, AA, AA, AA), X2292 = c(GA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GA, AA, GG, TG, TC, AA, TC, AA, AA, AA), X2405 = c(GA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GG, AG, GG, TG, TT, AA, CT, AA, AA, AA), X2543 = c(AA, TC, TT, CC, TC, CC, CC, TT, CC, GA, GA, GA, GG, TT, CT, GA, TT, AA, AA, GG), X2557 = c(AG, CT, TT, CC, CT, CC, CC, TT, CC, GG, AG, GA, GG, GT, CT, GA, CT, AA, AA, AG), X2583 = c(GA, CT, TT, CC, CT, CC, CC, TT, CC, GG, GA, GG, GG, GG, CT, GA, CT, AA, AA, AG), X2322 = c(AG, TT, TT, CC, TT, CC, CC, TT, CC, GG, GG, GG, GG, GT, TT, GG, CC, AA, AA, GA), X2535 = c(AA, TC, TT, CC, TT, CC, CC, TT, CC, GG, GA, GG, GG, TT, CC, GG, CC, AA, AA, AG), X2536 = c(GA, TC, TT, CC, TC, CC, CC, TT, CC, GG, GG, AG, GG, TT, TC, AG, TC, AA, AA, GA), X2581 = c(AG, CT, TT, CC, CT, CC, CC, TT, CC, GG, GG, GA, GG, TT, CC, GA, CT, AA, AA, AG), X2570 = c(AA, CT, TT, CC, CT, CC, CC, TT, CC, GG, GG, GG, GG, TT, TC, GG, CC, AA, AA, GG), X2476 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GG, GG, GG, GT, TC, AG, CC, AA, AA, AG), X2534 = c(GA, TC, TT, CC, TC, CC, CC, TT, CC, GG, GA, AG, GG, TG, CC, AG, TC, AA, AA, AA), X2280 = c(AA, TC, TT, CC, TC, CC, CC, TT, CC, GG, AG, AG, GG, TT, CC, GG, CC, AA, AA, AG), X2316 = c(AA, CC, TT, CC, CC, CC, CC, TT, CC, AG, AA, AA, AG, TT, TC, GG, CT, AA, GG, GG), X2339 = c(AA, CC, TT, CC, CC, CC, CC, TT, CC, GA, AA, GG, GG, GT, CT, GG, TT, AA, AA, AG), X2331 = c(AA, TC, TT, CC, TC, CC, CC, TT, CC, GG, GG, GG, GG, TT, CC, GG, CC, AA, AA, AG), X2343 = c(AA, TC, TT, CC, TC, CC, CC, TT, CC, GG, GG, GG, GG, TT, CT, GG, CC, AA, AA, GA), X2352 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, AA, GG, GG, TT, CC, GG, CC, AA, GA, AG), X2293 = c(GA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GA, AA, GG, TT, TC, AA, CT, AA, AA, AA), X2338 = c(GA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GG, AG, GG, TT, TC, AG, 
TC, AA, AA, GA), X2449 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, AG, AA, GG, TT, CC, AA, TC, AA, AA, GA), X2296 = c(GA, TT, TT, CC, TT, CC, CC, TT, CC, GA, GG, AG, GG, TG, TC, AG, CC, AA, AA, AA), X2453 = c(AG, TT, TT, CC, TT, CC, CC, TT, CC, AG, GG, GA, GG, GT, CT, GA, CT, AA, AA, GA), X2460 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, AG, GG, GG, GG, TG, CT, GG, CC, AA, AA, AA), X2474 = c(AA, TC, TT, CC, TC, CC, CC, TT, CC, GA, AG, AG, GG, TT, CC, AG, TC, AA, AA, GA), X2603 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, AG, AG, GG, TT, CC, AG, CC, AA, AA, GA), X2282 = c(GA, TC, TT, CC, TC, CC, CC, TT, CC, GG, GG, AA, GG, TT, TT,
Re: [R] t-test behavior given that the null hypothesis is true
Ah! You have assigned a parameter equal.var=TRUE, and equal.var is not a listed parameter for t.test(); see ?t.test:

  t.test(x, y = NULL,
         alternative = c("two.sided", "less", "greater"),
         mu = 0, paired = FALSE, var.equal = FALSE,
         conf.level = 0.95, ...)

Try it instead with var.equal=TRUE, i.e. in your code:

  for(i in 1:k){
    rv.t.pvalues[i] <- t.test(rv[i, 1:(c/2)], rv[i, (c/2+1):c],
      ##equal.var=TRUE, alternative="two.sided")$p.value
      var.equal=TRUE, alternative="two.sided")$p.value
  }

When I run your code with equal.var, I indeed repeatedly see the deficient bin for the lowest P-values that you observed. When I run your code with var.equal I do not see it.

The explanation is that, since equal.var is not a recognised parameter for t.test(), it has assumed the default value FALSE for var.equal, and has therefore (since it is a 2-sample test) adopted the Welch/Satterthwaite procedure:

  var.equal: a logical variable indicating whether to treat the two variances as being equal. If 'TRUE' then the pooled variance is used to estimate the variance otherwise the Welch (or Satterthwaite) approximation to the degrees of freedom is used.

This has the effect of somewhat adapting the test procedure to the data, so that extreme (i.e. small) values of P are even rarer than they should be.

With best wishes,
Ted.

On 09-Jan-2013 13:24:59 Pavlos Pavlidis wrote:
> Hi Ted, thanks for the reply. I use a similar code which you can see below:
>
>   k <- 1
>   c <- 6
>   rv <- array(NA, dim=c(k, c) )
>   for(i in 1:k){
>     rv[i,] <- rnorm(c, mean=0, sd=1)
>   }
>   rv.t.pvalues <- array(NA, k)
>   for(i in 1:k){
>     rv.t.pvalues[i] <- t.test(rv[i, 1:(c/2)], rv[i, (c/2+1):c], equal.var=TRUE, alternative="two.sided")$p.value
>   }
>   hist(rv.t.pvalues)
>
> The histogram is this one: http://tinyurl.com/histogram-rt-pvalues-pdf
>
> all the best
> idaios
>
> On Wed, Jan 9, 2013 at 12:29 PM, Ted Harding ted.hard...@wlandres.net wrote:
>> On 09-Jan-2013 08:50:46 Pavlos Pavlidis wrote:
>>> Dear all, I observe a strange behavior of the p-values of the t-test under the null hypothesis. Specifically, I obtain 2 samples of 3 individuals each from a normal distribution of mean 0 and variance 1. Then, I calculate the p-value using the t-test (var.equal=TRUE, samples are independent). When I make a histogram of p-values I see that consistently the bin of the smallest p-values has a lower frequency. Is this a known behavior of the t-test or is it a kind of bug/random number generation problem?
>>> kind regards, idaios
>>
>> Using the following code, I did not observe the behaviour you describe. The histograms are consistent with a uniform distribution of the P-values, and the lowest bin for the P-values (when the code is run repeatedly) is not consistently lower (or higher, or anything else) than the other bins.
>>
>>   ## My code:
>>   N <- 1
>>   Ps <- numeric(N)
>>   for(i in (1:N)){
>>     X1 <- rnorm(3,0,1) ; X2 <- rnorm(3,0,1)
>>     Ps[i] <- t.test(X1,X2,var.equal=TRUE)$p.value
>>   }
>>   hist(Ps)
>>
>> If you would post the code you used, the reason why you are observing this may become more evident!
>> Hoping this helps, Ted.

-
E-Mail: (Ted Harding) ted.hard...@wlandres.net
Date: 09-Jan-2013 Time: 14:51:04
This message was sent by XFMail
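Ted's diagnosis in the message above can be reproduced with a small sketch: a misspelled argument such as equal.var is swallowed by the ... of t.test() without any warning, so the call silently runs the default Welch test. The seed, the sample sizes and the number of simulated data sets (n_sim) are arbitrary choices for illustration, not values from the thread.

```r
set.seed(42)  # arbitrary seed, only so that the example is reproducible

x <- rnorm(3)
y <- rnorm(3)

p_typo   <- t.test(x, y, equal.var = TRUE)$p.value  # misspelled: absorbed by '...'
p_welch  <- t.test(x, y)$p.value                    # default, var.equal = FALSE
p_pooled <- t.test(x, y, var.equal = TRUE)$p.value  # pooled-variance test

print(c(typo = p_typo, welch = p_welch, pooled = p_pooled))

# Small simulation under the null hypothesis: share of p-values below 0.05.
n_sim <- 5000
sim <- replicate(n_sim, {
  a <- rnorm(3)
  b <- rnorm(3)
  c(welch  = t.test(a, b)$p.value,
    pooled = t.test(a, b, var.equal = TRUE)$p.value)
})
print(rowMeans(sim < 0.05))
```

With equal group sizes both tests use the same t statistic and differ only in the degrees of freedom, so the Welch p-value can never be smaller than the pooled one; that is why the lowest histogram bin comes out underfilled when the typo is present.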
Re: [R] how to count A, C, T, G in each row in a big data.frame?
If test is the structure, will

  test2 <- sapply(test[, -c(1:4)], function(x){table(t(x))})

do what you want?

On 09.01.2013, at 15:48, jim holtman wrote:
> forgot the data. this will count the characters; you can add logic with 'table' to count groups
>
> [snip]
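For the question in the subject line (counting A, C, T and G in each row), a minimal sketch along the lines discussed in this thread might look as follows. The tiny data frame geno is invented; it only assumes, as in the posted data, that the first four columns are annotation and the remaining columns hold two-letter genotype calls. The helper count_bases() is hypothetical.

```r
# Invented miniature of the posted data: 4 annotation columns, then genotypes.
geno <- data.frame(name   = c("snp1", "snp2", "snp3"),
                   chr    = 7L,
                   pos    = c(100L, 200L, 300L),
                   strand = "+",
                   X1     = c("AA", "CT", "GG"),
                   X2     = c("AG", "TT", "GA"),
                   X3     = c("AA", "CC", "GG"),
                   stringsAsFactors = FALSE)

bases <- c("A", "C", "G", "T")
calls <- as.matrix(geno[, -(1:4)])

# Hypothetical helper: split one row's genotypes into letters and count each base.
count_bases <- function(row) {
  chars <- strsplit(paste(row, collapse = ""), "", fixed = TRUE)[[1]]
  vapply(bases, function(b) sum(chars == b), integer(1))
}

counts <- t(apply(calls, 1, count_bases))
rownames(counts) <- geno$name
colnames(counts) <- bases
print(counts)
```

For a very large data frame the same idea applies unchanged; only the apply() step dominates the run time.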
Re: [R] Need an advise for bayesian estimate
Hi Kyong,

Even if it is not, as can be inferred from what you said, a homework or assignment related query (and the group has a clear policy against such requests), the questions you posed nonetheless have very little to do specifically with R. Instead, they are about statistics. In this respect, I would suggest that you read Data Analysis Using Regression and Multilevel/Hierarchical Models by Andrew Gelman and Jennifer Hill, which covers these issues.

Regards,
José

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of kyong park
Sent: 09 January 2013 14:38
To: R forum
Subject: [R] Need an advise for bayesian estimate

> [snip]
[R] Reminder: useR meetup group in Munich, Germany
Dear all,

this is a short reminder for the Meetup Munich useR group. Next week Wednesday (16th January 2013) we have our first meeting with two talks about Reporting and Reproducible Research with R and some ideas for an R certification.

More information at http://www.meetup.com/munich-useR-group/events/63749502/

Meet you next week
Markus

On Dec 21, 2012, at 3:28 PM, Markus Schmidberger mschmidber...@freenet.de wrote:
> Dear all, I would like to invite Munich (Germany) area R users for our first meeting: 16th January 2013. The group is aimed to bring together practitioners (from industry and academia) in order to exchange knowledge and experience in solving data analysis statistical problems by using R.
>
> More information about the group at: http://www.meetup.com/munich-useR-group/
> 1. Meeting: http://www.meetup.com/munich-useR-group/events/63749502/
>
> Merry Christmas, happy New Year and see you in 2013
> Markus
[R] Basic loop programming
Hi all, newbie question: I am trying to set up a very simple loop without succeeding. Let's say I have monthly observations of two variables for a year

- Sales_2012_01, Sales_2012_02, Sales_2012_03, (total sales for jan 2012, feb 2012, etc.)
- Customers_2012_01, Customers_2012_02, (total number of customers for jan 2012, etc.)

and I want to create new monthly variables in order to compute revenues per customer:

  Av_revenue_2012_01 = Sales_2012_01 / Customers_2012_01
  Av_revenue_2012_02 = Sales_2012_02 / Customers_2012_02
  ...

How can I proceed? In other programming languages I used just to write something like

  for (i in list(01,02, ..., 12) { Av_revenue_2012_'i' = Sales_2012_'i' / Customers_2012_'i' }

but in R it seems not to work like that. Further, and correct me if I am wrong, I cannot use a simple (i in 1:12) since I have a 0 digit in front of the single-digit months.

thanks in advance for your help
Re: [R] Basic loop programming
Hi

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Paolo Donatelli
> Sent: Wednesday, January 09, 2013 5:03 PM
> To: r-help@r-project.org
> Subject: [R] Basic loop programming
>
> Hi all, newbie question: I am trying to set up a very simple loop without succeeding. Let's say I have monthly observation of two variables for a year - Sales_2012_01, Sales_2012_02, Sales_2012_03, (total sales for jan 2012, feb 2012, etc.) - Customers_2012_01, Customers_2012_02, (total number of customers for jan 2012, etc.) and I want to create new monthly variables in order to compute revenues per customers:
>
>   Av_revenue_2012_01 = Sales_2012_01 / Customers_2012_01
>   Av_revenue_2012_02 = Sales_2012_02 / Customers_2012_02
>   ...
>
> how can I proceed? In other programming language I used just to write something like
>
>   for (i in list(01,02, ..., 12) { Av_revenue_2012_'i' = Sales_2012_'i' / Customers_2012_'i' }
>
> but in R it seems not to work like that. Further, and correct me if I am wrong, I cannot use simple (i in 1:12) since I have a 0 digit in front of the single-digit months.

Hm. Why do you want to do it in R if you prefer other languages? Did you find R by accident or are you prepared to use it in future? If you want to use it, it is just the right time to learn some basics.

Anyway, if you have 2 vectors, call them sales and customers, you can just do

  av.revenue <- sales/customers

Unless you provide more info about your data, e.g. by at least some of ?head, ?str or preferably ?dput, you will hardly get any suitable advice.

Regards
Petr

> thanks in advance for your help
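A minimal sketch of the vectorised approach suggested in the reply above, together with one way around the leading-zero concern from the question. All numbers are made up, and the names sales, customers and av.revenue follow Petr's wording. The second half assumes, hypothetically, that the data really do live in separate objects named like Sales_2012_01; sprintf("%02d", ...) then builds the zero-padded suffixes.

```r
# Hypothetical monthly totals for 2012 (made-up numbers).
sales     <- c(100, 120, 90, 110, 130, 125, 140, 135, 115, 105, 150, 160)
customers <- c(25, 30, 18, 22, 26, 25, 28, 27, 23, 21, 30, 32)

# Zero-padded month labels: "01", "02", ..., "12".
months <- sprintf("%02d", 1:12)
names(sales) <- names(customers) <- paste0("2012_", months)

# One vectorised division replaces the whole loop.
av.revenue <- sales / customers
print(av.revenue["2012_01"])

# If separate objects per month already exist, get() and assign() can build
# the names, although a single vector or data frame is usually easier.
Sales_2012_01 <- 100; Customers_2012_01 <- 25
Sales_2012_02 <- 120; Customers_2012_02 <- 30
for (m in months[1:2]) {
  assign(paste0("Av_revenue_2012_", m),
         get(paste0("Sales_2012_", m)) / get(paste0("Customers_2012_", m)))
}
print(Av_revenue_2012_02)
```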
Re: [R] Basic loop programming
Hi Paolo,

You say you have monthly observations of two variables, say Sales and Customers. Then, what you should have is something like this:

  Year  Month  Sales    Customer
  2012  Jan    ss_12.1  cc_12.1
  2012  Feb    ss_12.2  cc_12.2
  ...   ...    ...      ...
  2013  Jan    ss_13.1  cc_13.1
  2013  Feb    ss_13.2  cc_13.2
  ...   ...    ...      ...

where ss_YY.M and cc_YY.M are numerical values (the total sales and number of customers for year YY and month M, respectively). For example,

  Year  Month  Sales  Customer
  2012  Jan    100    25
  2012  Feb    120    30
  ...   ...    ...    ...

If this is the case, and you have the data in a data frame (say df), all you need to do to create a new column in your data frame with the average revenue is:

  df$Av_revenue <- df$Sales / df$Customer

You can omit df$ on the left-hand side of the instruction above if you want to create the object Av_revenue but not include it in the data frame. If I am not getting it right, would you please send us the first three or four lines of your data?

Regards,
José

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Paolo Donatelli
Sent: 09 January 2013 16:03
To: r-help@r-project.org
Subject: [R] Basic loop programming

> [snip]
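The data-frame layout described in José's reply can be sketched as follows. The three rows are invented for illustration (the first two reuse the example figures from the reply); df, Sales, Customer and Av_revenue are the names used in that message.

```r
# Long-format data: one row per month (illustrative values only).
df <- data.frame(Year     = c(2012, 2012, 2012),
                 Month    = c("Jan", "Feb", "Mar"),
                 Sales    = c(100, 120, 90),
                 Customer = c(25, 30, 18),
                 stringsAsFactors = FALSE)

# New column inside the data frame.
df$Av_revenue <- df$Sales / df$Customer
print(df)

# Stand-alone vector, leaving the data frame untouched.
Av_revenue <- with(df, Sales / Customer)
print(Av_revenue)
```

Keeping one row per month avoids having to construct variable names such as Sales_2012_01 at all, which is why no loop is needed.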
Re: [R] rJava Error
On Jun 27, 2012, at 12:16 AM, fabin.ittiachan wrote: Hi, I'm receiving an error when I am trying to install rJava. I have posted the error below. Your R was not compiled with --enable-R-shlib so you can't use JRI (see http://rforge.net/rJava). You can either disable JRI (if you don't need it) or have to use R compiled with shlib support. NB: stats-rosuda-devel mailing list is the proper list for rJava questions. Cheers, Simon RHive_0.0-6.tar.gz rJava_0.9-3.tar.gz RJDBC_0.2-0.tar.gz Rserve_0.6-8.tar.gz [root@localhost Package]# R CMD INSTALL rJava_0.9-3.tar.gz * installing to library ‘/usr/local/lib64/R/library’ * installing *source* package ‘rJava’ ... ** package ‘rJava’ successfully unpacked and MD5 sums checked checking for gcc... gcc -std=gnu99 checking for C compiler default output file name... a.out checking whether the C compiler works... yes checking whether we are cross compiling... no checking for suffix of executables... checking for suffix of object files... o checking whether we are using the GNU C compiler... yes checking whether gcc -std=gnu99 accepts -g... yes checking for gcc -std=gnu99 option to accept ISO C89... none needed checking how to run the C preprocessor... gcc -std=gnu99 -E checking for grep that handles long lines and -e... /bin/grep checking for egrep... /bin/grep -E checking for ANSI C header files... yes checking for sys/wait.h that is POSIX.1 compatible... yes checking for sys/types.h... yes checking for sys/stat.h... yes checking for stdlib.h... yes checking for string.h... yes checking for memory.h... yes checking for strings.h... yes checking for inttypes.h... yes checking for stdint.h... yes checking for unistd.h... yes checking for string.h... (cached) yes checking sys/time.h usability... yes checking sys/time.h presence... yes checking for sys/time.h... yes checking for unistd.h... (cached) yes checking for an ANSI C-conforming const... yes checking whether time.h and sys/time.h may both be included... 
yes configure: checking whether gcc -std=gnu99 supports static inline... yes checking whether setjmp.h is POSIX.1 compatible... yes checking whether sigsetjmp is declared... yes checking whether siglongjmp is declared... yes checking Java support in R... present: interpreter : '/usr/local/Java/jre/bin/java' archiver: '/usr/local/Java/jre/../bin/jar' compiler: '/usr/local/Java/jre/../bin/javac' header prep.: '/usr/local/Java/jre/../bin/javah' cpp flags : '-I/usr/local/Java/jre/../include -I/usr/local/Java/jre/../include/linux' java libs : '-L/usr/local/Java/jre/lib/amd64 -L/usr/local/Java/jre/lib/amd64/server -ljvm' checking whether JNI programs can be compiled... yes checking JNI data types... ok checking whether JRI should be compiled (autodetect)... yes checking whether debugging output should be enabled... no checking whether memory profiling is desired... no checking whether threads support is requested... no checking whether callbacks support is requested... no checking whether JNI cache support is requested... no checking whether JRI is requested... yes configure: creating ./config.status config.status: creating src/Makevars config.status: creating R/zzz.R config.status: creating src/config.h === configuring in jri (/tmp/RtmpUyYk4N/R.INSTALL6645c827c46/rJava/jri) configure: running /bin/sh ./configure '--prefix=/usr/local' --cache-file=/dev/null --srcdir=. checking build system type... x86_64-unknown-linux-gnu checking host system type... x86_64-unknown-linux-gnu checking for gcc... gcc -std=gnu99 checking for C compiler default output file name... a.out checking whether the C compiler works... yes checking whether we are cross compiling... no checking for suffix of executables... checking for suffix of object files... o checking whether we are using the GNU C compiler... yes checking whether gcc -std=gnu99 accepts -g... yes checking for gcc -std=gnu99 option to accept ISO C89... none needed checking how to run the C preprocessor... 
gcc -std=gnu99 -E checking for grep that handles long lines and -e... /bin/grep checking for egrep... /bin/grep -E checking for ANSI C header files... yes checking whether Java interpreter works... checking whether JNI programs can be compiled... yes checking whether JNI programs can be run... yes checking JNI data types... ok checking whether Rinterface.h exports R_CStackXXX variables... yes checking whether Rinterface.h exports R_SignalHandlers... yes configure: creating ./config.status config.status: creating src/Makefile config.status: creating Makefile config.status: creating run config.status: creating src/config.h ** libs gcc -std=gnu99 -I/usr/local/lib64/R/include -DNDEBUG -I. -I/usr/local/Java/jre/../include -I/usr/local/Java/jre/../include/linux -I/usr/local/include-fpic -g -O2 -c Rglue.c -o Rglue.o gcc -std=gnu99 -I/usr/local/lib64/R/include -DNDEBUG -I. -I/usr/local/Java/jre/../include
Re: [R] Java, rJava, and Windows x64
On Oct 29, 2012, at 11:17 AM, Robert Baer wrote:

When running [1] R version 2.15.1 (2012-06-22) x86_64-pc-mingw32, rJava fails. I have installed both the 32-bit and 64-bit versions of Java 7 update 9.

library(rJava)
Error : .onLoad failed in loadNamespace() for 'rJava', details:
  call: stop("No CurrentVersion entry in '", key, "'! Try re-installing Java and make sure R and Java have matching architectures.")
  error: object 'key' not found
Error: package/namespace load failed for ‘rJava’

It appears that rJava was not seeing the x64 Java. For clarity, I installed the 32-bit Java library second, and I imagined this might be the problem. The Java installer told me that it was already present, and the x64 library appeared to be working with the 64-bit IE9 browser. Indeed, after reinstalling Java x64, the rJava package loaded fine with the library(rJava) command in 64-bit R. rJava could STILL be loaded with library(rJava) within x86 R.

My question is, should the order of Java installation affect the ability of rJava to load under 64-bit R? Are there environment variables or registry settings that should be checked in such cases, or is it literally necessary to do a complete reinstall?

The registries are completely separate for 32-bit and 64-bit, so installing 32-bit Java doesn't affect 64-bit R and vice versa. rJava is simply checking the registry that it has access to, and that is the one corresponding to the R process (so 32-bit R will check the 32-bit registry and 64-bit R will check the 64-bit registry). It is looking for either the Software\JavaSoft\Java Runtime Environment or the Software\JavaSoft\Java Development Kit registry tree.

Cheers, Simon

PS: Please use the stats-rosuda-devel mailing list for rJava questions.
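The registry lookup described in the reply can be checked by hand from the running R process. The following is only a rough sketch, not rJava's actual code: the helper name java_registry_version() is made up for illustration, and it assumes that the CurrentVersion value sits directly under one of the two keys named above. It relies on utils::readRegistry(), which exists only in Windows builds of R, so the helper returns NULL elsewhere.

```r
# Hypothetical helper (not part of rJava): report the Java version that the
# current R process can see in the Windows registry.
java_registry_version <- function(view = "default") {
  # readRegistry() is only available on Windows builds of R
  if (.Platform$OS.type != "windows") return(NULL)

  # The two registry trees mentioned in the reply, under HKEY_LOCAL_MACHINE
  keys <- c("SOFTWARE\\JavaSoft\\Java Runtime Environment",
            "SOFTWARE\\JavaSoft\\Java Development Kit")

  for (key in keys) {
    entry <- tryCatch(
      utils::readRegistry(key, hive = "HLM", view = view),
      error = function(e) NULL   # key missing in this registry view
    )
    if (!is.null(entry$CurrentVersion)) return(entry$CurrentVersion)
  }
  NULL  # no CurrentVersion entry found
}

print(java_registry_version())
```

Running this once in 32-bit R and once in 64-bit R should show whether each architecture sees its own Java installation; a NULL result on Windows would be consistent with the "No CurrentVersion entry" error quoted above.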
[R] weighted factor analysis
Hello there, I am trying to use a weight variable in a factor analysis, but apparently the factanal command does not have a weight option. Is there any way to do this? Thanks for your suggestions, V
Re: [R] select partial name and full name columns
Hi Arun, thank-you for your suggestion. I made a mistake previously when I suggested that there was a prefix in front of 00060_3 possibly suggesting that it was a string of characters rather than numbers. The prefix in front of 00060_3 is actually two numbers, see the examples below: 01_00060_3 01_00060_3_cd 15_00060_3 15_00060_3_cd 02_00060_3 02_00060_3_cd How can the following code be modified to reflect the numerical rather than character prefix? dat1[,c(datetime,colnames(dat1)[grep(00060_3,colnames(dat1))])] Thank-you. Irucka Embry -Original Message- From: arun [smartpink...@yahoo.com] Sent: 1/9/2013 7:13:05 AM To: iruc...@mail2world.com Cc: r-help@r-project.org Subject: Re: [R] select partial name and full name columns Hi, May be this is creating the problem: set.seed(15) dat1-data.frame(A_00060_3=sample(1:10,5,replace=TRUE),B_00060_ 3_cd=sample(20:30,5,replace=TRUE),C_00060_3=sample(1:15,5,replace=TR UE),D_00060=sample(1:8,5,replace=TRUE),datetime=as.POSIXct(paste(rep(6/ 3/2011,5),c(0:00,0:30,0:35,0:40,0:45)),format=%m/%d/%Y %H:%M)) dat1[,c(datetime,grep(00060_3,colnames(dat1)))] #Error in `[.data.frame`(dat1, , c(datetime, grep(00060_3, colnames(dat1 : #undefined columns selected dat1[,c(datetime,colnames(dat1)[grep(00060_3,colnames(dat1))])] # datetime A_00060_3 B_00060_3_cd C_00060_3 #1 2011-06-03 00:00:00 7 30 2 #2 2011-06-03 00:30:00 2 2810 #3 2011-06-03 00:35:0010 22 8 #4 2011-06-03 00:40:00 7 2711 #5 2011-06-03 00:45:00 4 2913 A.K. 
- Original Message - From: Irucka Embry iruc...@mail2world.com To: r-help@r-project.org Cc: Sent: Wednesday, January 9, 2013 5:44 AM Subject: [R] select partial name and full name columns Hi, I have the following function: getDataFromDVFileCustom - function (file, hasHeader = TRUE, separator = \t) { DVdatatmp - as.matrix(read.table(file, sep = \t, fill = TRUE, comment.char = #, as.is = TRUE, stringsAsFactors = FALSE, na.strings = NA)) DVdatatmper - as.matrix(DVdatatmp[ , c(datetime, grep(^_00060_3, colnames(DVdatatmp)))]) retval - as.data.frame(DVdatatmper, colClasses = c(character), fill = TRUE, comment.char = #, stringsAsFactors = FALSE) if (ncol(retval) == 2) { names(retval) - c(dateTime, value) } else if (ncol(retval) == 3) { names(retval) - c(dateTime, value, code) } if (dateFormatCheck(retval$dateTime)) { retval$dateTime - as.Date(retval$dateTime) } else { retval$dateTime - as.Date(retval$dateTime, format = %m/%d/%Y) } retval$value - as.numeric(retval$value) return(retval) } The function gives me this error: getDataFromDVFileCustom(file) Error in as.matrix(DVdatatmp[, c(datetime, grep(^_00060_3, colnames(DVdatatmp)))]) : subscript out of bounds I am trying to only select 3 columns (datetime and then two partial name columns that end in 00060_3 and 00060_3_cd. Each file that I will be reading into the function has a different number of columns and a different prefix in front of 00060_3 and 00060_3_cd. I have searched online and tried those possible solutions, but they did not work for my function and data. What is the best way to select those 3 columns only? Thank-you. Irucka Embry __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
Re: [R] how to count A, C, T, G in each row in a big data.frame?
Sorry, you wanted rows, i wrote for columns #rows would be: test2-apply(test[,-c(1:4)],1,function(x){table(t(x))}) #find single values in a row sapply(test2,function(row){ allVars-paste(names(row),collapse=) u - unique(strsplit(allVars,)[[1]]) parts-sapply(names(row),function(x){u%in%strsplit(x,)[[1]]}) mat-parts%*%row rownames(mat)-u mat }) though i guess lists aren't ideal, but theres another answer as well i see. On 09.01.2013, at 15:23, Yao He wrote: Dear All I have a data.frame like that: structure(list(name = c(Gga_rs10722041, Gga_rs10722249, Gga_rs10722565, Gga_rs10723082, Gga_rs10723993, Gga_rs10724555, Gga_rs10726238, Gga_rs10726461, Gga_rs10726774, Gga_rs10726967, Gga_rs10727581, Gga_rs10728004, Gga_rs10728156, Gga_rs10728177, Gga_rs10728373, Gga_rs10728585, Gga_rs10729598, Gga_rs10729643, Gga_rs10729685, Gga_rs10729827), chr = c(7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L), pos = c(11248993L, 20038370L, 16164457L, 38050527L, 20307106L, 13707090L, 12230458L, 36732967L, 2790856L, 1305785L, 29631963L, 13606593L, 13656397L, 2261611L, 32096703L, 13733153L, 16524147L, 558735L, 12514023L, 3619538L), strand = c(+, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +), X2353 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, AG, AG, AG, TT, CC, AG, CC, AA, GG, GG), X2409 = c(AA, CT, TT, CC, CT, CC, CC, TT, CC, GG, GG, AG, AG, TT, CC, AG, CC, AA, AG, GA), X2500 = c(GA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GG, GG, GG, GT, CT, GG, CC, AA, AA, AA), X2598 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, AA, AG, GG, TT, CC, AG, TC, AA, AA, AG), X2610 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GA, GA, GG, TT, CC, GA, CC, AA, AA, GA), X2300 = c(GA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GA, AA, AG, TT, TC, AA, TC, AA, AG, AA), X2507 = c(AG, TT, TT, CC, TT, CC, CC, TT, CC, GG, GG, GA, GG, TT, TC, GG, CC, AA, GA, AG), X2530 = c(AG, TC, TT, CC, TC, CC, CC, TT, CC, GG, AA, GG, GG, TT, CC, GG, CC, AA, AA, AA), X2327 = c(AA, TT, TT, CC, TT, CC, CC, TT, 
CC, GG, GA, GG, GG, TT, TC, GG, CC, AA, AA, AA), X2389 = c(AA, CC, TT, CC, CC, CC, CC, TT, CC, AG, GG, AG, GG, TT, TC, AG, CC, AA, AA, AA), X2408 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GA, GA, GG, TT, CC, GA, CC, AA, AA, AG), X2463 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GG, GG, GG, TT, CT, GG, CC, AA, AA, GA), X2420 = c(GA, TC, TT, CC, TC, CC, CC, TT, CC, GG, AG, GG, GG, TG, TT, GG, CT, AA, AA, AA), X2563 = c(GA, CC, TT, CC, TC, CC, CC, TT, CC, GG, GA, GG, GG, GT, TT, GG, CT, AA, AA, AA), X2462 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, AA, GG, GG, GT, TC, GG, CC, AA, AA, AA), X2292 = c(GA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GA, AA, GG, TG, TC, AA, TC, AA, AA, AA), X2405 = c(GA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GG, AG, GG, TG, TT, AA, CT, AA, AA, AA), X2543 = c(AA, TC, TT, CC, TC, CC, CC, TT, CC, GA, GA, GA, GG, TT, CT, GA, TT, AA, AA, GG), X2557 = c(AG, CT, TT, CC, CT, CC, CC, TT, CC, GG, AG, GA, GG, GT, CT, GA, CT, AA, AA, AG), X2583 = c(GA, CT, TT, CC, CT, CC, CC, TT, CC, GG, GA, GG, GG, GG, CT, GA, CT, AA, AA, AG), X2322 = c(AG, TT, TT, CC, TT, CC, CC, TT, CC, GG, GG, GG, GG, GT, TT, GG, CC, AA, AA, GA), X2535 = c(AA, TC, TT, CC, TT, CC, CC, TT, CC, GG, GA, GG, GG, TT, CC, GG, CC, AA, AA, AG), X2536 = c(GA, TC, TT, CC, TC, CC, CC, TT, CC, GG, GG, AG, GG, TT, TC, AG, TC, AA, AA, GA), X2581 = c(AG, CT, TT, CC, CT, CC, CC, TT, CC, GG, GG, GA, GG, TT, CC, GA, CT, AA, AA, AG), X2570 = c(AA, CT, TT, CC, CT, CC, CC, TT, CC, GG, GG, GG, GG, TT, TC, GG, CC, AA, AA, GG), X2476 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GG, GG, GG, GT, TC, AG, CC, AA, AA, AG), X2534 = c(GA, TC, TT, CC, TC, CC, CC, TT, CC, GG, GA, AG, GG, TG, CC, AG, TC, AA, AA, AA), X2280 = c(AA, TC, TT, CC, TC, CC, CC, TT, CC, GG, AG, AG, GG, TT, CC, GG, CC, AA, AA, AG), X2316 = c(AA, CC, TT, CC, CC, CC, CC, TT, CC, AG, AA, AA, AG, TT, TC, GG, CT, AA, GG, GG), X2339 = c(AA, CC, TT, CC, CC, CC, CC, TT, CC, GA, AA, GG, GG, GT, CT, GG, TT, AA, AA, AG), X2331 = c(AA, TC, TT, CC, TC, CC, CC, 
TT, CC, GG, GG, GG, GG, TT, CC, GG, CC, AA, AA, AG), X2343 = c(AA, TC, TT, CC, TC, CC, CC, TT, CC, GG, GG, GG, GG, TT, CT, GG, CC, AA, AA, GA), X2352 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, AA, GG, GG, TT, CC, GG, CC, AA, GA, AG), X2293 = c(GA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GA, AA, GG, TT, TC, AA, CT, AA, AA, AA), X2338 = c(GA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GG, AG, GG, TT, TC, AG, TC, AA, AA, GA), X2449 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, AG, AA, GG, TT, CC, AA, TC, AA, AA, GA), X2296 = c(GA, TT, TT, CC, TT, CC, CC, TT, CC, GA, GG, AG, GG, TG, TC, AG, CC, AA, AA, AA), X2453 = c(AG, TT, TT, CC, TT,
Re: [R] Integrating Java, C++ and R
On Jan 4, 2013, at 11:41 AM, Dirk Eddelbuettel wrote:

On 4 January 2013 at 16:57, Suzen, Mehmet wrote:
| On 4 January 2013 11:36, Royden Fernandes roydens...@gmail.com wrote:
| Hi,
|
| I am able to integrate C++ and R through RInside library. However when I
|
| Questions regarding RInside should go to the rcpp-devel mailing list.
| http://lists.r-forge.r-project.org/mailman/listinfo/rcpp-devel

Very good. But to the OP's defence -- he posted there. But as he himself stated (in what you still quote here): The RInside integration of R and C++ works for him, but Java created trouble. So I recommended r-devel (not r-help) to seek help from someone with better Java understanding.

Agreed. It will require combined knowledge of Rcpp and Java, though.

What is testR()? Note that R requires a set of environment variables to be set up correctly in order to run, so did you start your program using R CMD java ...? Also, you will likely need to make sure that you disable stack limit checks, since Java may change the stack depending on the thread.

Another alternative would be to use JRI as a starting point, since it solves all the R/Java issues, and then call C++ code from there.

Cheers, Simon

Dirk
--
Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com
Re: [R] select partial name and full name columns
Hi Arun, thanks again for your assistance. Previously I did not read the files with the headers so I could not search for those prefixed names. I corrected my mistake and the code that you suggested does work. Irucka -Original Message- From: arun [smartpink...@yahoo.com] Sent: 1/9/2013 11:09:13 AM To: iruc...@mail2world.com Cc: r-help@r-project.org Subject: Re: [R] select partial name and full name columns Hi, You can use the same code: set.seed(15) dat1-data.frame(sample(1:10,5,replace=TRUE),sample(20:30,5,replace=TR UE),sample(1:15,5,replace=TRUE),sample(1:8,5,replace=TRUE),datetime=as.P OSIXct(paste(rep(6/3/2011,5),c(0:00,0:30,0:35,0:40,0:45)),fo rmat=%m/%d/%Y %H:%M)) colnames(dat1)[1:4]-c(01_00060_3,01_60_3_cd,15_6 0_3,15_00060) dat1 # 01_00060_3 01_60_3_cd 15_60_3 15_00060 #1 7 30 27 #2 2 28 104 #3 10 22 88 #4 7 27 112 #5 4 29 137 # datetime #1 2011-06-03 00:00:00 #2 2011-06-03 00:30:00 #3 2011-06-03 00:35:00 #4 2011-06-03 00:40:00 #5 2011-06-03 00:45:00 dat1[,c(datetime,colnames(dat1)[grep(00060_3,colnames(dat1))])] # datetime 01_00060_3 01_60_3_cd 15_60_3 #1 2011-06-03 00:00:00 7 30 2 #2 2011-06-03 00:30:00 2 28 10 #3 2011-06-03 00:35:00 10 22 8 #4 2011-06-03 00:40:00 7 27 11 #5 2011-06-03 00:45:00 4 29 13 A.K. From: Irucka Embry iruc...@mail2world.com To: smartpink...@yahoo.com Cc: r-help@r-project.org Sent: Wednesday, January 9, 2013 11:36 AM Subject: Re: [R] select partial name and full name columns Hi Arun, thank-you for your suggestion. I made a mistake previously when I suggested that there was a prefix in front of 00060_3 possibly suggesting that it was a string of characters rather than numbers. The prefix in front of 00060_3 is actually two numbers, see the examples below: 01_00060_3 01_00060_3_cd 15_00060_3 15_00060_3_cd 02_00060_3 02_00060_3_cd How can the following code be modified to reflect the numerical rather than character prefix? dat1[,c(datetime,colnames(dat1)[grep(00060_3,colnames(dat1))])] Thank-you. 
Irucka Embry -Original Message- From: arun [smartpink...@yahoo.com] Sent: 1/9/2013 7:13:05 AM To: iruc...@mail2world.com Cc: r-help@r-project.org Subject: Re: [R] select partial name and full name columns Hi, May be this is creating the problem: set.seed(15) dat1-data.frame(A_00060_3=sample(1:10,5,replace=TRUE),B_00060_000 03_cd=sample(20:30,5,replace=TRUE),C_00060_3=sample(1:15,5,replace=T RUE),D_00060=sample(1:8,5,replace=TRUE),datetime=as..POSIXct(paste(rep( 6/3/2011,5),c(0:00,0:30,0:35,0:40,0:45)),format=%m/%d/%Y %H:%M)) dat1[,c(datetime,grep(00060_3,colnames(dat1)))] #Error in `[.data.frame`(dat1, , c(datetime, grep(00060_3, colnames(dat1 : #undefined columns selected dat1[,c(datetime,colnames(dat1)[grep(00060_3,colnames(dat1))]) ] # datetime A_00060_3 B_00060_3_cd C_00060_3 #1 2011-06-03 00:00:00 7 30 2 #2 2011-06-03 00:30:00 2 2810 #3 2011-06-03 00:35:0010 22 8 #4 2011-06-03 00:40:00 7 2711 #5 2011-06-03 00:45:00 4 2913 A.K. - Original Message - From: Irucka Embry iruc...@mail2world.com To: r-help@r-project.org Cc: Sent: Wednesday, January 9, 2013 5:44 AM Subject: [R] select partial name and full name columns Hi, I have the following function: getDataFromDVFileCustom - function (file, hasHeader = TRUE, separator = \t) { DVdatatmp - as.matrix(read.table(file, sep = \t, fill = TRUE, comment.char = #, as.is = TRUE, stringsAsFactors = FALSE, na.strings = NA)) DVdatatmper - as.matrix(DVdatatmp[ , c(datetime, grep(^_00060_3, colnames(DVdatatmp)))]) retval - as.data.frame(DVdatatmper, colClasses = c(character), fill = TRUE, comment.char = #, stringsAsFactors = FALSE) if (ncol(retval) == 2) { names(retval) - c(dateTime, value) } else if (ncol(retval) == 3) { names(retval) - c(dateTime, value, code) } if (dateFormatCheck(retval$dateTime)) { retval$dateTime - as.Date(retval$dateTime) } else { retval$dateTime - as.Date(retval$dateTime, format = %m/%d/%Y) } retval$value - as.numeric(retval$value) return(retval) } The function gives me this error: 
getDataFromDVFileCustom(file) Error in as.matrix(DVdatatmp[, c(datetime, grep(^_00060_3, colnames(DVdatatmp)))]) :
[R] R encrypt/decrypt
Hello, I am working on a web system (php) that uses R in the backend, and we need some basic fast encryption/decryption for the underlying mysql database that can be used by both R AND php. It does not need to be top-of-the-line, but just provide some basic level of fast encryption/decryption. Any suggestions? Thank you, Ramiro
Re: [R] select partial name and full name columns
Hi,

Maybe this is creating the problem:

set.seed(15)
dat1<-data.frame(A_00060_3=sample(1:10,5,replace=TRUE),B_00060_3_cd=sample(20:30,5,replace=TRUE),C_00060_3=sample(1:15,5,replace=TRUE),D_00060=sample(1:8,5,replace=TRUE),datetime=as.POSIXct(paste(rep("6/3/2011",5),c("0:00","0:30","0:35","0:40","0:45")),format="%m/%d/%Y %H:%M"))

dat1[,c("datetime",grep("00060_3",colnames(dat1)))]
#Error in `[.data.frame`(dat1, , c("datetime", grep("00060_3", colnames(dat1)))) :
#  undefined columns selected

dat1[,c("datetime",colnames(dat1)[grep("00060_3",colnames(dat1))])]
#             datetime A_00060_3 B_00060_3_cd C_00060_3
#1 2011-06-03 00:00:00         7           30         2
#2 2011-06-03 00:30:00         2           28        10
#3 2011-06-03 00:35:00        10           22         8
#4 2011-06-03 00:40:00         7           27        11
#5 2011-06-03 00:45:00         4           29        13

A.K.

----- Original Message -----
From: Irucka Embry iruc...@mail2world.com
To: r-help@r-project.org
Cc:
Sent: Wednesday, January 9, 2013 5:44 AM
Subject: [R] select partial name and full name columns

Hi, I have the following function:

getDataFromDVFileCustom <- function (file, hasHeader = TRUE, separator = "\t") {
  DVdatatmp <- as.matrix(read.table(file, sep = "\t", fill = TRUE, comment.char = "#", as.is = TRUE, stringsAsFactors = FALSE, na.strings = "NA"))
  DVdatatmper <- as.matrix(DVdatatmp[ , c("datetime", grep("^_00060_3", colnames(DVdatatmp)))])
  retval <- as.data.frame(DVdatatmper, colClasses = c("character"), fill = TRUE, comment.char = "#", stringsAsFactors = FALSE)
  if (ncol(retval) == 2) {
    names(retval) <- c("dateTime", "value")
  } else if (ncol(retval) == 3) {
    names(retval) <- c("dateTime", "value", "code")
  }
  if (dateFormatCheck(retval$dateTime)) {
    retval$dateTime <- as.Date(retval$dateTime)
  } else {
    retval$dateTime <- as.Date(retval$dateTime, format = "%m/%d/%Y")
  }
  retval$value <- as.numeric(retval$value)
  return(retval)
}

The function gives me this error:

getDataFromDVFileCustom(file)
Error in as.matrix(DVdatatmp[, c("datetime", grep("^_00060_3", colnames(DVdatatmp)))]) :
  subscript out of bounds

I am trying to select only 3 columns (datetime and then two partial-name columns that end in 00060_3 and 00060_3_cd). Each file that I will be reading into the function has a different number of columns and a different prefix in front of 00060_3 and 00060_3_cd. I have searched online and tried those possible solutions, but they did not work for my function and data. What is the best way to select those 3 columns only?

Thank-you.
Irucka Embry
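The selection discussed in this thread can be sketched on a small made-up data frame. The column names and values below are assumptions chosen for illustration, not the poster's real file. Two details seem to matter: grep() returns integer positions by default, so combining them with "datetime" in c() turns them into the strings "1", "2", ... which are not column names; and the pattern "^_00060_3" anchors at the start of the name, so it cannot match a name that carries a prefix. Asking grep() for the names themselves (value = TRUE) and anchoring at the end of the name avoids both issues.

```r
# Made-up example data; check.names = FALSE keeps the names exactly as typed
dat1 <- data.frame(
  datetime        = as.POSIXct("2011-06-03 00:00", tz = "UTC") + c(0, 30, 35, 40, 45) * 60,
  "A_00060_3"     = c(7, 2, 10, 7, 4),
  "A_00060_3_cd"  = c("A", "A", "P", "A", "A"),
  "A_00010_3"     = c(30, 28, 22, 27, 29),
  "A_00060_1"     = c(2, 10, 8, 11, 13),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# value = TRUE returns the matching column names instead of their positions.
# The pattern keeps names ending in "_00060_3" or "_00060_3_cd", whatever the prefix.
keep <- grep("_00060_3(_cd)?$", names(dat1), value = TRUE)

sel <- dat1[, c("datetime", keep)]
print(names(sel))
```

Because the pattern is anchored only at the end, the same call works whether the prefix is made of letters or digits.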
[R] problem adding curve/abline
Hey, I'm stuck on something I have already done before (just with a different kind of database), and whatever I try, it doesn't work anymore. So thanks for your help.

Here is approximately what my data look like:

year  season  replicate  size  freq  weight
2000  summer  ch1        6     1     45
2000  summer  ch1        6.5   12    46
2000  summer  ch1        7     33    470

I have 2 years (2000 and 2001) and 2 seasons (winter and summer). I wanted to plot weight~size with 2 grouping variables (year and season), so here is my shortened script for that:

database$groups=paste(database$seizon,database2$year,sep=" ")
xyplot(database$weight~database2$size, groups=database$groups, par.settings=list(superpose.symbol=list(col=col.list,pch=c(21,16,21,16))), auto.key=list(corner=c(0.1,0.9),lines=F,points=T))

This works fine. The problem comes when I try to add 2 exponential curves to the data (one per season). I tried this:

summ=subset(database,seizon=="summer")
modsumm=nls(summ$weight~exp(a+b*summ$size), data=summ, start=list(a=0,b=0))
exposumm=curve(exp(0.05354+0.19872*x), from=0, to=22, add=T, lwd=1, col="blue",lty=1)

After I add plot.new() in front, the line either does not show up or shows up in the wrong place. I thought this might be because of the subset, so I wanted to do something like this:

modsumm=nls(weight~exp(a+b* size), data=engsAGG2[seizon=="summer"], start=list(a=0,b=0))

which returns: undefined columns selected

Thanks in advance for the reply.
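One possible reading of the problem above: xyplot() is a lattice (grid) function, while curve() belongs to base graphics, so a curve added with add=T is not drawn in the coordinate system of the lattice panel. Also, indexing a data frame with a single subscript, as in engsAGG2[seizon=="summer"], selects columns rather than rows, which would be consistent with the "undefined columns selected" message. The sketch below stays within base graphics and uses made-up data; the column names, the true coefficients and the noise level are assumptions. Within lattice, the usual route would be a panel function that calls panel.curve().

```r
set.seed(1)

# Made-up data standing in for the poster's table
database <- data.frame(
  season = rep(c("summer", "winter"), each = 33),
  size   = rep(seq(6, 22, by = 0.5), times = 2)
)
database$weight <- exp(0.05 + 0.2 * database$size) *
  exp(rnorm(nrow(database), sd = 0.1))

# Row subset: note that the model formula below uses bare column names
summ <- subset(database, season == "summer")

# Starting values from a log-linear fit are usually safer than a = 0, b = 0
init <- coef(lm(log(weight) ~ size, data = summ))
modsumm <- nls(weight ~ exp(a + b * size), data = summ,
               start = list(a = init[[1]], b = init[[2]]))
est <- coef(modsumm)

# Base-graphics scatter plot, then the fitted curve on the same axes
plot(database$size, database$weight,
     pch = ifelse(database$season == "summer", 16, 21),
     xlab = "size", ylab = "weight")
curve(exp(est[["a"]] + est[["b"]] * x), from = 6, to = 22,
      add = TRUE, lwd = 1, col = "blue", lty = 1)
```

Taking the coefficients from coef(modsumm) rather than typing them into curve() keeps the drawn line in step with the fitted model.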
Re: [R] Basic loop programming
Hi,

If you have more than one observation per month, you could do this:

dat1<-read.table(text="
Year Month Sales Customer
2011 Jan 150 35
2011 Jan 125 40
2011 Feb 130 45
2011 Feb 135 25
2012 Jan 100 25
2012 Jan 150 35
2012 Feb 118 45
2012 Feb 120 30
2012 Mar 130 43
2012 Mar 125 35
",sep="",header=TRUE,stringsAsFactors=FALSE)
res<-aggregate(.~Year+Month,data=dat1,mean)
within(res,{Avrev<-Sales/Customer})
#  Year Month Sales Customer    Avrev
#1 2011   Feb 132.5     35.0 3.785714
#2 2012   Feb 119.0     37.5 3.173333
#3 2011   Jan 137.5     37.5 3.666667
#4 2012   Jan 125.0     30.0 4.166667
#5 2012   Mar 127.5     39.0 3.269231

A.K.

----- Original Message -----
From: Paolo Donatelli donatellipa...@gmail.com
To: r-help@r-project.org
Cc:
Sent: Wednesday, January 9, 2013 11:02 AM
Subject: [R] Basic loop programming

Hi all, newbie question: I am trying to set up a very simple loop without succeeding. Let's say I have monthly observations of two variables for a year:

- Sales_2012_01, Sales_2012_02, Sales_2012_03, ... (total sales for Jan 2012, Feb 2012, etc.)
- Customers_2012_01, Customers_2012_02, ... (total number of customers for Jan 2012, etc.)

and I want to create new monthly variables in order to compute revenue per customer:

Av_revenue_2012_01 = Sales_2012_01 / Customers_2012_01
Av_revenue_2012_02 = Sales_2012_02 / Customers_2012_02
...

How can I proceed? In other programming languages I used to write something like

for (i in list(01,02, ..., 12) { Av_revenue_2012_'i' = Sales_2012_'i' / Customers_2012_'i' }

but in R it seems not to work like that. Further, and correct me if I am wrong, I cannot use a simple (i in 1:12) since I have a 0 digit in front of the single-digit months.

Thanks in advance for your help.
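For completeness, the loop the original poster describes can be written in R roughly as follows. This is only a sketch under stated assumptions: the twelve Sales_2012_xx and Customers_2012_xx variables are created here with made-up numbers, and sprintf("%02d", ...) is used to produce the zero-padded month labels. The names are built with paste0() and the variables are read and written with get() and assign().

```r
# Zero-padded month labels: "01", "02", ..., "12"
months <- sprintf("%02d", 1:12)

# Made-up input values, only so that the example runs on its own
sales_values    <- seq(100, 210, by = 10)
customer_values <- 21:32
for (i in seq_along(months)) {
  assign(paste0("Sales_2012_", months[i]), sales_values[i])
  assign(paste0("Customers_2012_", months[i]), customer_values[i])
}

# The loop from the question: build each variable name as a string
for (m in months) {
  sales     <- get(paste0("Sales_2012_", m))
  customers <- get(paste0("Customers_2012_", m))
  assign(paste0("Av_revenue_2012_", m), sales / customers)
}

# Often simpler: keep the months in one vector and divide element-wise
av_revenue <- sales_values / customer_values
names(av_revenue) <- months
print(av_revenue)
```

The vectorised form at the end, like the data-frame approach in the reply above, avoids creating dozens of separately named variables.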
Re: [R] Applying a user-defined function
Hi Pradip, Another way to get the results would be: res-cbind(test1,do.call(data.frame,lapply(test1[,seq(1,6,2)],CutQuintiles))) colnames(res)[7:9]-paste(newcols_,colnames(res)[7:9],) sapply(res,is.factor) # ObtMj_P ObtMj_SE ExpPrevMed_P # FALSE FALSE FALSE # ExpPrevMed_SE ParMon_P ParMon_SE # FALSE FALSE FALSE # newcols_ ObtMj_P newcols_ ExpPrevMed_P newcols_ ParMon_P # TRUE TRUE TRUE Hope it helps. A.K. - Original Message - From: Muhuri, Pradip (SAMHSA/CBHSQ) pradip.muh...@samhsa.hhs.gov To: R help r-help@r-project.org Cc: Sent: Tuesday, January 8, 2013 10:06 PM Subject: Re: [R] Applying a user-defined function Hello List, Last time, Arun's following solution worked to create 3 new columns (1,3,5). Now how would I tweak this function to create corresponding (additional) columns (7,8,9) of mode factor (levels = 1,2,3,4,5)? Thanks for your continued support. Pradip ### cut and paste from the reproducible example CutQuintiles - function( x) { cut (x,quantile (x, (0:5/5)),include.lowest=TRUE) } #apply the CutQuintile () on every odd-numbered columns of the test1 data frame test1$newcols - sapply(test1 [, seq (1,6,2)], CutQuintiles) # name 3 new columns based on the odd-numbered columns names(test1$newcols) - paste (names(test1 [, seq (1,6,2)]), _cat) ## Reproducible Example test1 - read.table (text= State,ObtMj_P,ObtMj_SE,ExpPrevMed_P,ExpPrevMed_SE,ParMon_P,ParMon_SE Alabama,49.60,1.37,80.00,0.91,12.10,0.68 Alaska,55.00,1.41,81.80,1.08,12.40,0.90 Arizona,52.50,1.56,79.60,1.20,15.80,1.08 Arkansas,50.50,1.22,78.00,0.78,12.80,0.72 California,51.10,0.65,80.50,0.53,13.00,0.41 Colorado,55.10,1.26,81.70,1.03,12.10,0.72 Connecticut,56.30,1.28,85.00,0.93,14.60,0.77 Delaware,53.60,1.30,79.50,1.04,14.70,0.97 District of Columbia,53.50,1.22,76.20,1.03,14.30,1.13 Florida,52.70,0.67,78.90,0.52,14.10,0.45 Georgia,52.50,1.15,79.30,1.02,15.90,0.98 Hawaii,49.40,1.33,83.80,1.12,16.00,1.06 Idaho,48.30,1.23,82.40,0.99,11.90,0.74 Illinois,52.70,0.63,81.00,0.46,13.60,0.40 
Indiana,49.60,1.16,80.90,0.91,12.60,0.82 Iowa,46.30,1.37,82.10,1.01,13.60,0.87 Kansas,44.30,1.43,79.20,0.98,12.90,0.79 Kentucky,52.90,1.37,78.70,1.05,14.60,0.98 Louisiana,49.70,1.23,76.80,1.06,14.50,0.76 Maine,55.60,1.44,82.90,0.93,16.70,0.83 Maryland,53.90,1.46,83.60,0.95,14.00,0.80 Massachusetts,55.40,1.41,81.00,1.15,14.70,0.80 Michigan,52.40,0.62,80.50,0.47,15.00,0.43 Minnesota,51.50,1.20,84.40,0.87,14.40,0.86 Mississippi,43.20,1.14,76.60,0.91,12.30,0.78 Missouri,48.70,1.20,80.30,0.90,13.70,0.12 Montana,56.40,1.16,83.70,0.95,12.10,0.68 Nebraska,45.70,1.51,83.40,0.95,12.40,0.90 Nevada,54.20,1.17,80.60,1.07,15.80,1.08 New Hampshire,56.10,1.30,83.30,0.93,12.80,0.72 New Jersey,53.20,1.45,83.70,0.95,13.00,0.41 New Mexico,57.60,1.34,78.90,1.03,12.10,0.72 New York,53.70,0.67,82.60,0.48,14.60,0.77 North Carolina,52.20,1.26,81.90,0.84,14.70,0.97 North Dakota,48.60,1.34,84.20,0.88,14.30,1.13 Ohio,50.90,0.61,82.70,0.49,14.10,0.45 Oklahoma,47.20,1.42,78.80,1.33,15.90,0.98 Oregon,54.00,1.35,80.60,1.14,16.00,1.06 Pennsylvania,53.00,0.63,79.90,0.47,11.90,0.74 Rhode Island,57.20,1.20,79.50,1.02,13.60,0.40 South Carolina,50.50,1.21,79.50,0.95,12.60,0.82 South Dakota,43.40,1.30,81.70,1.05,13.60,0.87 Tennessee,48.90,1.35,78.40,1.35,12.90,0.79 Texas,48.70,0.62,79.00,0.48,14.60,0.98 Utah,42.00,1.49,85.00,0.93,14.50,0.76 Vermont,58.70,1.24,83.70,0.84,16.70,0.83 Virginia,51.80,1.18,82.00,1.04,14.00,0.80 Washington,53.50,1.39,84.10,0.96,14.70,0.80 West Virginia,52.80,1.07,79.80,0.93,15.00,0.43 Wisconsin,49.90,1.50,83.50,1.02,14.40,0.86 Wyoming,49.20,1.29,82.00,0.85,12.30,0.78 , sep=,, row.names='State', header=TRUE, as.is=TRUE) # change names () to lower case names (test1) - tolower (names (test1)) #Write a cut/quantile function to apply on different columns of the data frame CutQuintiles - function( x) { cut (x,quantile (x, (0:5/5)),include.lowest=TRUE) } #apply the CutQuintile () on every odd-numbered columns of the test1 data frame test1$newcols - sapply(test1 [, seq (1,6,2)], 
CutQuintiles) # name 3 new columns based on the odd-numbered columns names(test1$newcols) - paste (names(test1 [, seq (1,6,2)]), _cat) dim (test1) options (width=100) test1
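A compact way to obtain three additional factor columns with levels 1 to 5 is sketched below. It is not the exact code from the thread: the data frame is a small made-up stand-in for test1, and passing labels = 1:5 to cut() is an assumption about what "levels = 1,2,3,4,5" should mean. lapply() is used instead of sapply() because sapply() simplifies its result to a matrix, and a matrix cannot hold factor columns.

```r
# Made-up stand-in for the test1 data frame (six numeric columns)
test1 <- data.frame(
  obtmj_p       = seq(42, 61, length.out = 20),
  obtmj_se      = rep(1.2, 20),
  expprevmed_p  = seq(76, 85.5, length.out = 20),
  expprevmed_se = rep(0.9, 20),
  parmon_p      = seq(11.9, 16.65, length.out = 20),
  parmon_se     = rep(0.8, 20)
)

# Quintile groups as a factor with levels "1" to "5"
CutQuintiles <- function(x) {
  cut(x, quantile(x, 0:5 / 5), include.lowest = TRUE, labels = 1:5)
}

# Apply to the odd-numbered columns; lapply() keeps each result a factor
odd <- seq(1, 6, 2)
newcols <- lapply(test1[, odd], CutQuintiles)
names(newcols) <- paste0(names(newcols), "_cat")

# Append the three factor columns as columns 7, 8 and 9
test1 <- cbind(test1, as.data.frame(newcols))
str(test1)
```

Note that cut() stops with an error when the quantile breaks are not unique, so columns with many tied values would need separate handling.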
Re: [R] select partial name and full name columns
Hi,

You can use the same code:

set.seed(15)
dat1 <- data.frame(sample(1:10,5,replace=TRUE), sample(20:30,5,replace=TRUE), sample(1:15,5,replace=TRUE), sample(1:8,5,replace=TRUE), datetime=as.POSIXct(paste(rep("6/3/2011",5), c("0:00","0:30","0:35","0:40","0:45")), format="%m/%d/%Y %H:%M"))
colnames(dat1)[1:4] <- c("01_00060_3","01_00060_3_cd","15_00060_3","15_00060")
dat1
#  01_00060_3 01_00060_3_cd 15_00060_3 15_00060
#1          7            30          2        7
#2          2            28         10        4
#3         10            22          8        8
#4          7            27         11        2
#5          4            29         13        7
#             datetime
#1 2011-06-03 00:00:00
#2 2011-06-03 00:30:00
#3 2011-06-03 00:35:00
#4 2011-06-03 00:40:00
#5 2011-06-03 00:45:00

dat1[,c("datetime",colnames(dat1)[grep("00060_3",colnames(dat1))])]
#             datetime 01_00060_3 01_00060_3_cd 15_00060_3
#1 2011-06-03 00:00:00          7            30          2
#2 2011-06-03 00:30:00          2            28         10
#3 2011-06-03 00:35:00         10            22          8
#4 2011-06-03 00:40:00          7            27         11
#5 2011-06-03 00:45:00          4            29         13

A.K.

From: Irucka Embry iruc...@mail2world.com
To: smartpink...@yahoo.com
Cc: r-help@r-project.org
Sent: Wednesday, January 9, 2013 11:36 AM
Subject: Re: [R] select partial name and full name columns

Hi Arun, thank you for your suggestion. I made a mistake previously when I said that there was a prefix in front of 00060_3, which may have suggested a string of characters rather than numbers. The prefix in front of 00060_3 is actually two digits; see the examples below:

01_00060_3 01_00060_3_cd
15_00060_3 15_00060_3_cd
02_00060_3 02_00060_3_cd

How can the following code be modified to reflect the numerical rather than character prefix?

dat1[,c("datetime",colnames(dat1)[grep("00060_3",colnames(dat1))])]

Thank you.

Irucka Embry

-----Original Message-----
From: arun [smartpink...@yahoo.com]
Sent: 1/9/2013 7:13:05 AM
To: iruc...@mail2world.com
Cc: r-help@r-project.org
Subject: Re: [R] select partial name and full name columns

Hi,

Maybe this is what is creating the problem:

set.seed(15)
dat1 <- data.frame(A_00060_3=sample(1:10,5,replace=TRUE), B_00060_3_cd=sample(20:30,5,replace=TRUE), C_00060_3=sample(1:15,5,replace=TRUE), D_00060=sample(1:8,5,replace=TRUE), datetime=as.POSIXct(paste(rep("6/3/2011",5), c("0:00","0:30","0:35","0:40","0:45")), format="%m/%d/%Y %H:%M"))

dat1[,c("datetime",grep("00060_3",colnames(dat1)))]
#Error in `[.data.frame`(dat1, , c("datetime", grep("00060_3", colnames(dat1)))) :
#undefined columns selected

dat1[,c("datetime",colnames(dat1)[grep("00060_3",colnames(dat1))])]
#             datetime A_00060_3 B_00060_3_cd C_00060_3
#1 2011-06-03 00:00:00         7           30         2
#2 2011-06-03 00:30:00         2           28        10
#3 2011-06-03 00:35:00        10           22         8
#4 2011-06-03 00:40:00         7           27        11
#5 2011-06-03 00:45:00         4           29        13

A.K.

----- Original Message -----
From: Irucka Embry iruc...@mail2world.com
To: r-help@r-project.org
Cc:
Sent: Wednesday, January 9, 2013 5:44 AM
Subject: [R] select partial name and full name columns

Hi, I have the following function:

getDataFromDVFileCustom <- function (file, hasHeader = TRUE, separator = "\t") {
  DVdatatmp <- as.matrix(read.table(file, sep = "\t", fill = TRUE, comment.char = "#", as.is = TRUE, stringsAsFactors = FALSE, na.strings = "NA"))
  DVdatatmper <- as.matrix(DVdatatmp[ , c("datetime", grep("^_00060_3", colnames(DVdatatmp)))])
  retval <- as.data.frame(DVdatatmper, colClasses = c("character"), fill = TRUE, comment.char = "#", stringsAsFactors = FALSE)
  if (ncol(retval) == 2) {
    names(retval) <- c("dateTime", "value")
  } else if (ncol(retval) == 3) {
    names(retval) <- c("dateTime", "value", "code")
  }
  if (dateFormatCheck(retval$dateTime)) {
    retval$dateTime <- as.Date(retval$dateTime)
  } else {
    retval$dateTime <- as.Date(retval$dateTime, format = "%m/%d/%Y")
  }
  retval$value <- as.numeric(retval$value)
  return(retval)
}

The function gives me this error:

getDataFromDVFileCustom(file)
Error in as.matrix(DVdatatmp[, c("datetime", grep("^_00060_3", colnames(DVdatatmp)))]) : subscript out of bounds

I am trying to select only 3 columns: datetime and then two partial-name columns that end in 00060_3 and 00060_3_cd. Each file that I will be reading into the function has a different number of columns and a different prefix in front of 00060_3 and 00060_3_cd. I have searched online and tried the possible solutions I found, but they did not work for my function and data. What is the best way
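For the two-digit prefix described in this thread, one option is to anchor the pattern and let grep() return names rather than positions. The following is only a sketch: the data frame, its values and the exact pattern are assumptions made for illustration, not taken from the poster's files.

```r
# Hypothetical data; only the column-name pattern follows the thread.
dat <- data.frame(a = 1:3, b = 4:6, c = 7:9, d = 10:12,
                  datetime = as.POSIXct("2011-06-03 00:00", tz = "UTC") +
                    c(0, 1800, 2100))
names(dat)[1:4] <- c("01_00060_3", "01_00060_3_cd", "15_00060", "15_00060_cd")

# value = TRUE makes grep() return the matching names, so they can be
# combined with "datetime" in a single character vector.
# Pattern: two digits, "_00060_3", then an optional "_cd" suffix.
keep <- grep("^[0-9]{2}_00060_3(_cd)?$", names(dat), value = TRUE)
sel  <- dat[, c("datetime", keep)]
names(sel)
```

Mixing the string "datetime" with the integer positions that grep() returns by default coerces everything to character, which is why the earlier attempt failed with "undefined columns selected".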
Re: [R] R encrypt/decrypt
Dear All,

I am wondering if there is a script in R or Python that can convert shape files to KML or KMZ files. I used a free online shp2kml.exe, but my locations all ended up in Africa, although I know they are in the USA.

Thanks,
Alemu
Re: [R] [solved] t-test behavior given that the null hypothesis is true
Hi Ted,

yes, this was the problem. Thank you very much.

best
idaios

On Wed, Jan 9, 2013 at 4:51 PM, Ted Harding ted.hard...@wlandres.net wrote:

> Ah! You have assigned a parameter equal.var=TRUE, and equal.var is not a listed parameter for t.test() -- see ?t.test:
>
>   t.test(x, y = NULL, alternative = c("two.sided", "less", "greater"), mu = 0, paired = FALSE, var.equal = FALSE, conf.level = 0.95, ...)
>
> Try it instead with var.equal=TRUE, i.e. in your code:
>
>   for(i in 1:k){
>     rv.t.pvalues[i] <- t.test(rv[i, 1:(c/2)], rv[i, (c/2+1):c],
>     ##equal.var=TRUE, alternative="two.sided")$p.value
>     var.equal=TRUE, alternative="two.sided")$p.value
>   }
>
> When I run your code with equal.var, I indeed repeatedly see the deficient bin for the lowest P-values that you observed. When I run your code with var.equal I do not see it.
>
> The explanation is that, since equal.var is not a recognised parameter for t.test(), it has assumed the default value FALSE for var.equal, and has therefore (since it is a 2-sample test) adopted the Welch/Satterthwaite procedure:
>
>   var.equal: a logical variable indicating whether to treat the two variances as being equal. If 'TRUE' then the pooled variance is used to estimate the variance otherwise the Welch (or Satterthwaite) approximation to the degrees of freedom is used.
>
> This has the effect of somewhat adapting the test procedure to the data, so that extreme (i.e. small) values of P are even rarer than they should be.
>
> With best wishes,
> Ted.
>
> On 09-Jan-2013 13:24:59 Pavlos Pavlidis wrote:
>> Hi Ted, thanks for the reply. I use similar code, which you can see below:
>>
>>   k <- 1
>>   c <- 6
>>   rv <- array(NA, dim=c(k, c))
>>   for(i in 1:k){ rv[i,] <- rnorm(c, mean=0, sd=1) }
>>   rv.t.pvalues <- array(NA, k)
>>   for(i in 1:k){
>>     rv.t.pvalues[i] <- t.test(rv[i, 1:(c/2)], rv[i, (c/2+1):c], equal.var=TRUE, alternative="two.sided")$p.value
>>   }
>>   hist(rv.t.pvalues)
>>
>> The histogram is this one: http://tinyurl.com/histogram-rt-pvalues-pdf
>>
>> all the best
>> idaios
>>
>> On Wed, Jan 9, 2013 at 12:29 PM, Ted Harding ted.hard...@wlandres.net wrote:
>>> On 09-Jan-2013 08:50:46 Pavlos Pavlidis wrote:
>>>> Dear all,
>>>> I observe a strange behavior of the p-values of the t-test under the null hypothesis. Specifically, I obtain 2 samples of 3 individuals each from a normal distribution of mean 0 and variance 1. Then, I calculate the p-value using the t-test (var.equal=TRUE, samples are independent). When I make a histogram of p-values I see that consistently the bin of the smallest p-values has a lower frequency. Is this a known behavior of the t-test or is it a kind of bug/random number generation problem?
>>>> kind regards, idaios
>>>
>>> Using the following code, I did not observe the behaviour you describe. The histograms are consistent with a uniform distribution of the P-values, and the lowest bin for the P-values (when the code is run repeatedly) is not consistently lower (or higher, or anything else) than the other bins.
>>>
>>>   ## My code:
>>>   N <- 1
>>>   Ps <- numeric(N)
>>>   for(i in (1:N)){
>>>     X1 <- rnorm(3,0,1) ; X2 <- rnorm(3,0,1)
>>>     Ps[i] <- t.test(X1,X2,var.equal=TRUE)$p.value
>>>   }
>>>   hist(Ps)
>>>
>>> If you would post the code you used, the reason why you are observing this may become more evident!
>>>
>>> Hoping this helps,
>>> Ted.
>>>
>>> E-Mail: (Ted Harding) ted.hard...@wlandres.net
>>> Date: 09-Jan-2013 Time: 10:29:21
>>> This message was sent by XFMail
>
> E-Mail: (Ted Harding) ted.hard...@wlandres.net
> Date: 09-Jan-2013 Time: 14:51:04
> This message was sent by XFMail
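Ted's explanation can be checked directly. The sketch below (the seed and the 5000 replications are arbitrary choices, not values from the thread) first shows that the misspelled argument is absorbed by `...` and gives the same p-value as the default Welch test, then compares rejection rates of the Welch and pooled tests under the null for two samples of size 3.

```r
set.seed(1)
x <- rnorm(3); y <- rnorm(3)

p_typo   <- t.test(x, y, equal.var = TRUE)$p.value  # misspelled: silently ignored
p_welch  <- t.test(x, y)$p.value                    # default, var.equal = FALSE
p_pooled <- t.test(x, y, var.equal = TRUE)$p.value  # pooled-variance test
c(typo = p_typo, welch = p_welch, pooled = p_pooled)

# Under the null hypothesis, record both p-values for many simulated pairs.
sim <- replicate(5000, {
  a <- rnorm(3); b <- rnorm(3)
  c(welch  = t.test(a, b)$p.value,
    pooled = t.test(a, b, var.equal = TRUE)$p.value)
})

# Share of p-values below 0.05 for each version of the test.
rowMeans(sim < 0.05)
```

With equal group sizes both tests use the same t statistic and differ only in the degrees of freedom, so the Welch p-value is never smaller than the pooled one. That is consistent with the thinner lowest bin in the original histogram.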
Re: [R] Using objects within functions in formulas
Hello,

Try the following. It uses the argument 'data' to pass the data.frame w2. In the function below, I've split the paste calls into two lines of code, because the first one changes the way the formula is put together.

test1 <- function(x2, y2, w2) {
  #print(str(w2))
  p1 <- paste("(1|", names(w2), ")", collapse = " + ", sep = "")
  p2 <- paste("y2 ~ x2 +", p1)
  form = as.formula(p2)
  m1 = glmer(form, data = w2)
  return(m1)
}

Hope this helps,

Rui Barradas

On 09-01-2013 16:53, Aidan MacNamara wrote:

> Dear all,
>
> I'm looking to create a formula within a function to pass to glmer() and I'm having a problem that the following example will illustrate:
>
>   library(lme4)
>   y1 = rnorm(10)
>   x1 = data.frame(x11=rnorm(10), x12=rnorm(10), x13=rnorm(10))
>   x1 = data.matrix(x1)
>   w1 = data.frame(w11=sample(1:3,10, replace=TRUE), w12=sample(1:3,10, replace=TRUE), w13=sample(1:3,10, replace=TRUE))
>   test1 <- function(x2, y2, w2) {
>     print(str(w2))
>     form = as.formula(paste("y2 ~ x2 +", paste("(1|w2$", names(w2), ")", collapse=" + ", sep="")))
>     m1 = glmer(form)
>     return(m1)
>   }
>   model1 = test1(x2=x1, y2=y1, w2=w1)
>
> As can be seen from the print statement within the function, the object w2 is present and is a data frame. However, the following error occurs:
>
>   Error in is.factor(x) : object 'w2' not found
>
> This can be rectified by making 'w2' global - defining it outside the function. I know there are issues with defining formulas and environment but I'm not sure why this problem is specific to 'w2' and not the other objects passed to the function.
>
> Any help would be appreciated.
>
> Aidan MacNamara
> EMBL-EBI
Re: [R] how to label two figures in the same chunk independently with knitr
Hi Francesco,

This is an advanced topic in knitr; it is called a chunk hook: http://yihui.name/knitr/hooks

Sorry for the confusion about the name par; you can call it anything, e.g. mypar:

knit_hooks$set(mypar = function(before, options, envir) {
  if (before) par(mar = c(4, 4, .1, .1))
})
opts_chunk$set(mypar = TRUE)

As for par(bg=rgb(runif(1), runif(1), runif(1))), it is nothing but a line of normal R code.

Regards,
Yihui
--
Yihui Xie xieyi...@gmail.com
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA

On Wed, Jan 9, 2013 at 2:59 AM, Francesco Sarracino f.sarrac...@gmail.com wrote:

> Dear Yihui,
>
> thanks a lot for your kind reply. Your solution is very elegant and versatile. However, there is a point that is obscure to me, and I didn't manage to fully understand it after looking at the knitr manual and the graphics manual. The issue concerns the hook:
>
>   knit_hooks$set(par = function(before, options, envir) {
>     if (before) par(mar = c(4, 4, .1, .1))
>   })
>
> Why do you set par as a function? Moreover, below you write:
>
>   par(bg=rgb(runif(1), runif(1), runif(1)))
>
> Does this mean that before = rgb(runif(1)), options = runif(1) and envir = runif(1)? And what does this produce? I don't understand what's going on; can you please help me or point me to some documentation?
>
> thanks in advance for your kind help,
> f.
Re: [R] Using objects within functions in formulas
On Jan 9, 2013, at 8:53 AM, Aidan MacNamara wrote:

> Dear all,
>
> I'm looking to create a formula within a function to pass to glmer() and I'm having a problem that the following example will illustrate:
>
>   library(lme4)
>   y1 = rnorm(10)
>   x1 = data.frame(x11=rnorm(10), x12=rnorm(10), x13=rnorm(10))
>   x1 = data.matrix(x1)
>   w1 = data.frame(w11=sample(1:3,10, replace=TRUE), w12=sample(1:3,10, replace=TRUE), w13=sample(1:3,10, replace=TRUE))
>   test1 <- function(x2, y2, w2) {
>     print(str(w2))
>     form = as.formula(paste("y2 ~ x2 +", paste("(1|w2$", names(w2), ")", collapse=" + ", sep="")))
>     m1 = glmer(form)
>     return(m1)
>   }
>   model1 = test1(x2=x1, y2=y1, w2=w1)
>
> As can be seen from the print statement within the function, the object w2 is present and is a data frame. However, the following error occurs:
>
>   Error in is.factor(x) : object 'w2' not found

Generally, regression functions in R expect to get one 'data' argument and build formulas using column names from that object.

test1 <- function(x2, y2, w2) {
  w3 <- cbind(w2, x2, x2)
  print(str(w3))
  form = as.formula(paste("y2 ~ x2 +", paste("(1|", names(w2), ")", collapse=" + ", sep="")))
  m1 = glmer(form, data=w3); print(summary(m1))
  return(m1)
}
model1 = test1(x2=x1, y2=y1, w2=w1)

> This can be rectified by making 'w2' global - defining it outside the function. I know there are issues with defining formulas and environment but I'm not sure why this problem is specific to 'w2' and not the other objects passed to the function.
>
> Any help would be appreciated.
>
> Aidan MacNamara
> EMBL-EBI

David Winsemius, MD
Alameda, CA, USA
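The general pattern from both replies (build the formula from column names and hand the data frame to the fitting function via 'data') can be sketched in base R. To keep the example free of extra packages, lm() stands in for glmer() here, which is a substitution and not what the thread used; fit_from_names and the simulated data are hypothetical.

```r
# Build response ~ p1 + p2 + ... from character names and fit with 'data'.
fit_from_names <- function(response, predictors, data) {
  # reformulate() assembles the formula from the term labels.
  form <- reformulate(predictors, response = response)
  # lm() is a stand-in for glmer(); the variables are looked up in 'data',
  # so nothing has to exist in the global environment.
  lm(form, data = data)
}

set.seed(42)
d <- data.frame(y = rnorm(10), x11 = rnorm(10), x12 = rnorm(10))
fit <- fit_from_names("y", c("x11", "x12"), d)
formula(fit)
```

Because every name in the formula is a column of the data frame, the fit no longer depends on where the function's arguments happen to be visible.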
[R] writing to .xlsx
Dear r helpers;

I'm interested in reading from and writing to large .xlsx files fairly regularly. (Why, the naysayers may ask - and the answer is basically colleagues and clients who prefer that format.)

I've tried out the XLConnect and xlsx libraries, but the Java implementation they use just takes too much RAM for the files I'm working with. gdata leverages perl and works really well for reading in those files, so half the problem is solved for me! I don't see anything in the documentation about writing .xlsx, though.

Is anyone aware of any libraries or clever solutions in R that would get the job done for me? I see a couple of packages on CPAN for writing an xlsx, so it's been done in perl; perhaps it would be easy to run that from R? I don't use perl myself (yet?). Looking for recommendations.

Best
Ben Caldwell
[R] Need help setting up a mirror
Hello.

I am trying to follow the instructions here (http://mirror.fcaglp.unlp.edu.ar/CRAN/) to set up a mirror at my company, as my developers are not allowed to go outside our firewall to download R packages. I am not having much success in getting this to work.

Here is how I configured my httpd server:

<VirtualHost *:80>
ServerName cran.xxx.xxx.xxx.org
RewriteEngine on
RewriteRule ^package=(.+) /web/packages/$1/index.html [R=seeother]
RewriteRule ^view=(.+) /web/views/$1.html [R=seeother]
DocumentRoot /ddd/ddd/ddd/ftp.ussg.iu.edu/CRAN
</VirtualHost>

Here is the directory structure of my Linux server for CRAN:

/opt/OSS/CRAN-Mirror/ftp.ussg.iu.edu/CRAN:
bin doc src web

I don't find a banner.shtml anywhere, or indeed anything that looks like your front page. I cannot get rsync to work, so I used wget. Here is the command I used:

/usr/bin/wget -r -P '/ddd/ddd/ddd/' 'http://ftp.ussg.iu.edu/CRAN/web/packages/available_packages_by_name.html'

Can you see what I am doing wrong?

Valerie
valerie duncan
Re: [R] writing to .xlsx
Can you use '.xls' format files? If so, XLConnect works pretty well for those. If you are using the '.xlsx' format (zip files internally), XLConnect takes much more CPU and memory to handle them.

On Wed, Jan 9, 2013 at 2:19 PM, Benjamin Caldwell btcaldw...@berkeley.edu wrote:

> Dear r helpers;
>
> I'm interested in reading from and writing to large .xlsx files fairly regularly. (Why, the naysayers may ask - and the answer is basically colleagues and clients who prefer that format.) I've tried out the XLConnect and xlsx libraries, but the Java implementation they use just takes too much RAM for the files I'm working with. gdata leverages perl and works really well for reading in those files, so half the problem is solved for me! I don't see anything in the documentation about writing .xlsx, though.
>
> Is anyone aware of any libraries or clever solutions in R that would get the job done for me? I see a couple of packages on CPAN for writing an xlsx, so it's been done in perl; perhaps it would be easy to run that from R? I don't use perl myself (yet?). Looking for recommendations.
>
> Best
> Ben Caldwell

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
Re: [R] R encrypt/decrypt
I suggest looking at mcrypt. There is a PHP module, and you could either call out from R to the mcrypt program or use libmcrypt and C calls. It supports AES and other standard things. There's no real saving of effort in using weaker ciphers, and you really don't want to be implementing the processing yourself.

-thomas

On Thu, Jan 10, 2013 at 6:59 AM, Ramiro Barrantes ram...@precisionbioassay.com wrote:

> Hello,
>
> I am working on a web system (php) that uses R in the backend, and we need some basic fast encryption/decryption for the underlying mysql database that can be used by both R AND php. It does not need to be top-of-the-line, but just provide some basic level of fast encryption/decryption. Any suggestions?
>
> Thank you,
> Ramiro

--
Thomas Lumley
Professor of Biostatistics
University of Auckland
Re: [R] weighted factor analysis
It depends on what sort of weights you have, but one approach is to construct a weighted covariance matrix and then run factanal() on it. That's what svyfactanal() in the survey package does.

The difficult part is the tests: you need to specify the sample size, and in the presence of weights it may not be clear what the right sample size is -- svyfactanal() has four options, and probably none of them is ideal.

-thomas

On Thu, Jan 10, 2013 at 5:36 AM, Virgile Capo-Chichi vcapochi...@gmail.com wrote:

> hello there,
>
> I am trying to use a weight variable in a factor analysis, but apparently the factanal command does not have a weight option. Any way to do this?
>
> Thanks for your suggestions,
> V

--
Thomas Lumley
Professor of Biostatistics
University of Auckland
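The first approach mentioned above can be sketched with base R only, using cov.wt() for the weighted covariance matrix and the covmat argument of factanal(). The data, the weights and the choice n.obs = n are all assumptions made for illustration; as noted above, the appropriate sample size under weighting is not obvious, and this sketch does not reproduce what svyfactanal() does for survey designs.

```r
set.seed(123)
n <- 300
f <- rnorm(n)                                   # one simulated latent factor
X <- sapply(1:6, function(j) 0.7 * f + rnorm(n, sd = 0.6))
colnames(X) <- paste0("v", 1:6)
w <- runif(n, 0.5, 2)                           # hypothetical observation weights

# Weighted covariance matrix; the weights are scaled to sum to 1 here.
wc <- cov.wt(X, wt = w / sum(w))

# factanal() takes the matrix via covmat; n.obs only affects the test
# statistic, and using n for it is an assumption, not a recommendation.
fa <- factanal(covmat = wc$cov, factors = 1, n.obs = n)
fa$loadings
```

The loadings come from the weighted covariance matrix alone; only the goodness-of-fit test depends on the value supplied for n.obs.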
[R] Sweave, Texshop, and sync with included Rnw file
Hello everyone.

I am in the process of writing a book in LaTeX with TeXShop, on Mac. This book contains a lot of R code, hence the need to use Sweave. I was able to compile Rnw files, and to sync back and forth between the pdf and the source Rnw.

My problem now is that the book is divided into chapters, and every chapter is in its own Rnw file. I can compile them from the main one (book.Rnw) using the directive \SweaveInput{chapter1.Rnw}. The problem is that, this way, I lose synchronization between the pdf and the source Rnw. If the text is in book.Rnw I can synchronize, but if the text is in one of the included files, it just doesn't work.

I am using the Sweave engine found on the following webpage: http://cameron.bracken.bz/synctex-with-sweavepgfsweave-in-texshoptexworks

Has anybody succeeded in synchronizing with included Rnw files?

Thanks,
Mic
[R] We've posted our 2013***R Courses*** by XLSolutions Corp at 9 USA Cities: San Francisco, New York, Washington DC, Houston, Boston, Las Vegas, Seattle, etc
Happy New Year!

The XLSolutions January-February 2013 R courses schedule is now available online for 9 USA cities, with 13 new courses:

*** Suggest a future course date/city

(1) R-PLUS: A Point-and-Click Approach to R
(2) S-PLUS / R: Programming Essentials
(3) R/S+ Fundamentals and Programming Techniques
(4) R/S-PLUS Functions by Example
(5) S/R-PLUS Programming 3: Advanced Techniques and Efficiencies
(6) R/S+ System: Advanced Programming
(7) R/S-PLUS Graphics: Essentials
(8) R/S-PLUS Graphics for SAS Users
(9) R/S-PLUS Graphical Techniques for Marketing Research
(10) Multivariate Statistical Methods in R/S-PLUS: Practical Research Applications
(11) Introduction to Applied Econometrics with R/S-PLUS
(12) Exploratory Analysis for Large and Complex Problems in R/S-PLUS
(13) Determining Power and Sample Size Using R/S-PLUS
(14) R/S-PLUS: Data Preparation for Data Mining
(15) Data Cleaning Techniques in R/S-PLUS
(16) R/S-PLUS: Applied Clustering Techniques

More on the website: http://www.xlsolutions-corp.com/courselistlisting.aspx

Ask for the group discount and reserve your seat now - early-bird rates. Payment is due after the class!

Email Sue Turner: sue at xlsolutions-corp.com
Phone: 206-686-1578

Please let us know if you and your colleagues are interested in this class, to take advantage of the group discount. Register now to secure your seat.

Cheers,
Elvis Miller, PhD
Manager Training
XLSolutions Corporation
206 686 1578
www.xlsolutions-corp.com
elvis at xlsolutions-corp.com
[R] Random Rectangles
Hi,

Just curious. Has anyone out there ever written a script to generate 100 random rectangles such as the ones shown on this page?

http://www2.math.umd.edu/~jlh/214/Random%20Rectangles.pdf

Thanks.
D.
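A minimal base R sketch of such a script follows. It does not try to reproduce the linked sheet: the grid size, the maximum side length and the fact that rectangles may overlap are all assumptions, and random_rectangles and draw_rectangles are hypothetical helper names.

```r
# Generate n rectangles with integer sides that fit inside a grid x grid square.
random_rectangles <- function(n = 100, grid = 40, max_side = 6) {
  w <- sample(1:max_side, n, replace = TRUE)
  h <- sample(1:max_side, n, replace = TRUE)
  # Lower-left corners are chosen so that each rectangle stays inside the grid.
  x0 <- floor(runif(n) * (grid - w + 1))
  y0 <- floor(runif(n) * (grid - h + 1))
  data.frame(id = seq_len(n), xleft = x0, ybottom = y0,
             xright = x0 + w, ytop = y0 + h, area = w * h)
}

# Draw the rectangles and label each one with its id.
draw_rectangles <- function(r, grid = 40) {
  plot(NA, xlim = c(0, grid), ylim = c(0, grid), asp = 1, xlab = "", ylab = "")
  rect(r$xleft, r$ybottom, r$xright, r$ytop, col = "grey80")
  text((r$xleft + r$xright) / 2, (r$ybottom + r$ytop) / 2, r$id, cex = 0.5)
}

set.seed(2013)
r <- random_rectangles()
head(r)
# draw_rectangles(r)   # uncomment to plot
```

Keeping the areas in the data frame makes it easy to compare the mean area of a hand-picked sample of rectangles with mean(r$area), which is the usual point of this classroom exercise.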
Re: [R] R encrypt/decrypt
On 9 January 2013 18:59, Ramiro Barrantes ram...@precisionbioassay.com wrote:

> I am working on a web system (php) that uses R in the backend, and we need some basic fast encryption/decryption for the underlying mysql database that can be used by both R AND php. It does not need to be top-of-the-line, but just provide some basic level of fast encryption/decryption. Any suggestions?

This sounds too generic, and it is not really an R-help question. I am not sure what you mean by "underlying mysql". Are you going to store encrypted data in the db? If it is about transport between the SQL and web servers: these servers can be configured to use SSL! What is your aim?

BTW: Maybe you should remove php and use R directly via Rook: http://cran.r-project.org/web/packages/Rook/index.html
Re: [R] writing to .xlsx
On Wed, Jan 9, 2013 at 2:19 PM, Benjamin Caldwell btcaldw...@berkeley.edu wrote:

> Dear r helpers;
>
> I'm interested in reading from and writing to large .xlsx files fairly regularly. (Why, the naysayers may ask - and the answer is basically colleagues and clients who prefer that format.) I've tried out the XLConnect and xlsx libraries, but the Java implementation they use just takes too much RAM for the files I'm working with. gdata leverages perl and works really well for reading in those files, so half the problem is solved for me! I don't see anything in the documentation about writing .xlsx, though.
>
> Is anyone aware of any libraries or clever solutions in R that would get the job done for me? I see a couple of packages on CPAN for writing an xlsx, so it's been done in perl; perhaps it would be easy to run that from R? I don't use perl myself (yet?). Looking for recommendations.
>
> Best
> Ben Caldwell

Check out http://rwiki.sciviews.org/doku.php?id=tips:data-io:ms_windows&s=excel and, in particular, the WriteXLS package, which can write Excel 2003 files (xls) using perl.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
[R] graphical distance matrix
Dear R-family,

I made a distance matrix of about 2000 stations. It is extremely hard to visualize the details of that matrix. I heard that there is a way in R to represent the details of a distance matrix graphically. More precisely, different sections of the distance matrix can be presented in different colors, with low values in light colors and high values in dark ones. Is there really a way of doing this?

thanks in advance

regards
elisa
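One base R way to get such a picture is image() with a light-to-dark palette. The sketch below uses 50 simulated stations in place of the poster's 2000 (the coordinates are made up), and reordering the stations by hierarchical clustering is an optional extra that tends to make block structure visible; plot_dist_matrix is a hypothetical helper name.

```r
set.seed(1)
coords <- cbind(x = runif(50), y = runif(50))   # hypothetical station coordinates
d <- as.matrix(dist(coords))                    # symmetric distance matrix

# Reorder stations so that nearby ones end up next to each other.
ord <- hclust(as.dist(d))$order

# 64 grey levels running from white (small distance) to black (large distance).
pal <- gray(seq(1, 0, length.out = 64))

plot_dist_matrix <- function(d, ord, pal) {
  image(seq_len(nrow(d)), seq_len(ncol(d)), d[ord, ord],
        col = pal, xlab = "station", ylab = "station")
}
# plot_dist_matrix(d, ord, pal)   # uncomment to draw
```

For a matrix of about 2000 stations the same call applies, although writing to a bitmap device such as png() may be more practical than drawing on screen.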
Re: [R] R encrypt/decrypt
Dear Suzen,

Thank you for your reply. What I meant was that some fields in the database will be encrypted (the data for those fields will be entered via a php web interface and then encrypted and stored in the mysql db), and then I will use R to read that database and do the appropriate post-processing, the results of which will then need to be encrypted and stored in the mysql db (with R, hopefully).

In other words, I have a shared mysql database with some encrypted fields, and I need R and php to both understand the encryption/decryption. I thought this would be an appropriate question for the group, as perhaps someone might know of an R encrypt/decrypt mechanism that also has a counterpart in php, or has suggestions about the situation.

Sorry for the confusion in my question.

Thank you,
Ramiro

From: mehmet.su...@gmail.com [mehmet.su...@gmail.com] on behalf of Suzen, Mehmet [msu...@gmail.com]
Sent: Wednesday, January 09, 2013 3:38 PM
To: Ramiro Barrantes
Cc: r-help@r-project.org
Subject: Re: [R] R encrypt/decrypt

On 9 January 2013 18:59, Ramiro Barrantes ram...@precisionbioassay.com wrote:

> I am working on a web system (php) that uses R in the backend, and we need some basic fast encryption/decryption for the underlying mysql database that can be used by both R AND php. It does not need to be top-of-the-line, but just provide some basic level of fast encryption/decryption. Any suggestions?

Sounds too generic. This is not really an R-help question. Not sure what do you mean by underlying mysql. Are you going to encrypt data into db? If it is about transport between sql and web servers: these servers can be configured to use SSL! What is your aim?

BTW: Maybe you should remove php and use R directly via Rook; http://cran.r-project.org/web/packages/Rook/index.html
Re: [R] Sweave, Texshop, and sync with included Rnw file
I believe RStudio has done a fairly good job in terms of the synchronization. If you have to stick to TeXShop, I do not have any ideas on how to make it work with Sweave child documents.

Regards,
Yihui
--
Yihui Xie xieyi...@gmail.com
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA

On Wed, Jan 9, 2013 at 2:25 PM, michele caseposta mic.c...@gmail.com wrote:

> Hello everyone.
>
> I am in the process of writing a book in Latex with Texshop, on Mac. This book contains a lot of R code, hence the need to use Sweave. I was able to compile Rnw files, and to sync back and forth from the pdf to the source Rnw.
>
> My problem now is that the book is divided in Chapters, and every chapter is in its own Rnw file. I can compile them from the main one (book.Rnw) using the directive \SweaveInput{chapter1.Rnw} The problem stands in the fact that like this I am missing synchronization between the pdf and the source Rnw. If part of text is in book.Rnw I can synchronize, but if the text is in one of the included files, it just doesn't work.
>
> I am using the sweave engine found in the following webpage: http://cameron.bracken.bz/synctex-with-sweavepgfsweave-in-texshoptexworks
>
> Has anybody succeeded in synchronizing with included Rnw files?
>
> Thanks,
> Mic
Re: [R] incrementation within ifelse
Damien,

You don't give an example of what your data frame looks like or what you want the new column to look like (given that example data), but I created an example data frame for z and wrote a few lines of code to add a new column. Check it out and see if it comes close to doing what you want.

first <- function(x) c(1, 1-(x[-1]==x[-length(x)]))
n <- 25
z <- data.frame(flagFoehn3_durr=sample(0:2, n, TRUE), Guetsch=sample(0:2, n, TRUE))
z$newColumn <- cumsum(first(z$flagFoehn3_durr==1) & z$flagFoehn3_durr==1)
z$newColumn[z$flagFoehn3_durr!=1] <- 0

Jean

On Tue, Jan 8, 2013 at 12:33 PM, Damien Pilloud damien.pill...@gmail.com wrote:

> Dear R-helper,
>
> I am working on a very large data frame and I am trying to add a new column and write in it under certain conditions. I have tried to use this code with the data frame p:
>
>   ID = 0
>   p[,"newColumn"] <- ifelse(p$flagFoehn3_durr == 1, ifelse(p$Guetsch == 0, ID <- ID ++ , ID), 0)
>
> What I am trying to do is to increment the ID when p$Guetsch == 0 and to put this result in the column. The problem is that ID does not increment itself. Another way is to use a for loop, as in this example:
>
>   ID = 0
>   for (s in 1:(nrow(z))){
>     z[s,"newColumn"] <- if (z$flagFoehn3_durr[s] == 1){
>       if(z$flagFoehn3_durr[s-1] == 0){
>         ID <- ID+1
>       }else{
>         ID
>       }
>     }else{
>       0
>     }
>   }
>
> This works perfectly, but the problem is that it will take me more than a month to run it. Is there a way to increment with the first code I used, or a way of running the second code faster? (I have more than 1 million rows.)
>
> Thanks!
>
> Cheers,
> Damien
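The cumsum() idea in Jean's reply can also be written as a small standalone function. This is a sketch under the assumption that a new ID should start wherever the flag is 1 and the previous value is not 1, which matches the loop in the question when the flag only takes the values 0 and 1; run_id is a hypothetical name.

```r
# ID for each run of consecutive 1s in a flag vector; 0 everywhere else.
run_id <- function(flag) {
  is_one <- flag == 1
  # A run starts where the flag is 1 and the previous element was not 1.
  starts <- is_one & !c(FALSE, head(is_one, -1))
  # cumsum() counts the runs seen so far; positions outside a run get 0.
  ifelse(is_one, cumsum(starts), 0)
}

flag <- c(0, 1, 1, 0, 2, 1, 0, 1, 1, 1)
run_id(flag)
# → 0 1 1 0 0 2 0 3 3 3
```

Every step is vectorised, so the running time grows linearly with the number of rows instead of paying for one data-frame assignment per row as the loop does.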
[R] How to estate the correlation between two autocorrelated variables
Dear R users,

In my data there are two variables, t1 and t2. For each observation of t1 and t2, two location indicators (x, y) are provided. The data format is:

#x y t1 t2

Since both t1 and t2 depend on x and y, t1 and t2 are autocorrelated variables. My question is how to calculate the correlation between t1 and t2 while taking into account the structure of the residual variance caused by x and y.

It seems that the gls function in the nlme package could be used for this purpose. However, I failed to figure out how to use the function for my data. I would appreciate your kind help in providing example code for the above data format. Please also let me know if there is any other, more suitable R package for the analysis.

Best regards,
Zhiqiu
Re: [R] Sweave, Texshop, and sync with included Rnw file
Hi Yihui, yes, RStudio works flawlessly with synchronization, but working with it I will lose all the features of a full-fledged tex editor, first of all BibDesk integration. Texworks can also do two-way sync in Rnw, and I tried to switch configurations but with no luck. In texworks I am using RScript with the options recommended in this document: http://www.math.montana.edu/~jimrc/classes/Rseminar/TexWorks.pdf The problem with texworks is the same: no integration with bibdesk On Jan 9, 2013, at 4:06 PM, Yihui Xie wrote: I believe RStudio has done a fairly good job in terms of the synchronization. If you have to stick to TeXShop, I do not have any ideas on how to make it work with Sweave child documents. Regards, Yihui -- Yihui Xie xieyi...@gmail.com Phone: 515-294-2465 Web: http://yihui.name Department of Statistics, Iowa State University 2215 Snedecor Hall, Ames, IA On Wed, Jan 9, 2013 at 2:25 PM, michele caseposta mic.c...@gmail.com wrote: Hello everyone. I am in the process of writing a book in Latex with Texshop, on Mac. This book contains a lot of R code, hence the need to use Sweave. I was able to compile Rnw files, and to sync back and forth from the pdf to the source Rnw. My problem now is that the book is divided in Chapters, and every chapter is in its own Rnw file. I can compile them from the main one (book.Rnw) using the directive \SweaveInput{chapter1.Rnw} The problem stands in the fact that like this I am missing synchronization between the pdf and the source Rnw. If part of text is in book.Rnw I can synchronize, but if the text is in one of the included files, it just doesn't work. I am using the sweave engine found in the following webpage: http://cameron.bracken.bz/synctex-with-sweavepgfsweave-in-texshoptexworks Has anybody succeeded in synchronizing with included Rnw files? 
Thanks, Mic
Re: [R] R encrypt/decrypt
Hello Ramiro, I am still not sure why do you need to encrypt/decrypt data in R. One can encrypt/decrypt data in the SQL server side. https://dev.mysql.com/doc/refman/5.5/en/encryption-functions.html If your concern is on the web traffic, again, sql servers supports SSL http://dev.mysql.com/doc/refman/5.1/en/ssl-connections.html I think RMySQL can connect via SSL. Also you may consider RCurl to talk to your php code. Best, -m On 9 January 2013 21:54, Ramiro Barrantes ram...@precisionbioassay.com wrote: Dear Suzen, Thank you for your reply. What I meant was that some fields in the database will be encrypted (the data for those fields will be entered via php web interface and then encrypted and stored on the mysql db), and then I will use R to read such database and do appropriate post-processing, which will then need to be encrypted and stored into the mysql db (with R hopefully). In other words, I have a shared mysql database with some encrypted fields, and I need R and php to both understand the encryption/decryption. I thought this would be an appropriate question for the group as perhaps someone might know of an R encrypt/decrypt mechanism that also has a counterpart on php or has suggestions about the situation. Sorry for the confusion in my question. Thank you, Ramiro From: mehmet.su...@gmail.com [mehmet.su...@gmail.com] on behalf of Suzen, Mehmet [msu...@gmail.com] Sent: Wednesday, January 09, 2013 3:38 PM To: Ramiro Barrantes Cc: r-help@r-project.org Subject: Re: [R] R encrypt/decrypt On 9 January 2013 18:59, Ramiro Barrantes ram...@precisionbioassay.com wrote: I am working on a web system (php) that uses R in the backend, and we need some basic fast encryption/decryption for the underlying mysql database that can be used by both R AND php. It does not need to be top-of-the-line, but just provide some basic level of fast encryption/decryption. Any suggestions? Sounds too generic. This is not really an R-help question. 
Not sure what you mean by underlying mysql. Are you going to encrypt the data stored in the db? If it is about transport between the SQL and web servers, these servers can be configured to use SSL. What is your aim? BTW: maybe you could drop php and use R directly via Rook: http://cran.r-project.org/web/packages/Rook/index.html
Re: [R] Basic loop programming
Yes, R is a different language, with different syntax and different built-in functions, so yes, it works differently. If you want to do it the same way in R as in that other language, you have to use a different method for constructing the variable names inside the loop. Here's an example that uses the get() and assign() functions to construct the variable names, essentially replacing your constructions like Customers_2012_'i'. Say I have four variables named s01, s02, c01, c02 (sales and customers for two months). Something like this should do it:

for (i in c('01','02')) { assign( paste0('r',i) , get(paste0('s',i))/get(paste0('c',i)) ) }

I should now have new variables r01 and r02. This is not tested, so hopefully I got all the parentheses matched. Of course, that looks cumbersome and ugly, and it is. There are other ways to store your data in R for which the code will be much friendlier. If you use i in 1:12 you are creating numbers, but your variable names use character strings, '01', '02', etc. So, no, you can't use i in 1:12 directly. But you can use i in 1:12 if you apply a formatting function to i to convert it to a character string with leading zeros. The formatC function is one such function; there are others. -Don

-- Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062

On 1/9/13 8:02 AM, Paolo Donatelli donatellipa...@gmail.com wrote:

Hi all, newbie question: I am trying to set up a very simple loop without succeeding. Let's say I have monthly observations of two variables for a year

- Sales_2012_01, Sales_2012_02, Sales_2012_03, (total sales for jan 2012, feb 2012, etc.)
- Customers_2012_01, Customers_2012_02, (total number of customers for jan 2012, etc.)

and I want to create new monthly variables in order to compute revenue per customer:

Av_revenue_2012_01 = Sales_2012_01 / Customers_2012_01
Av_revenue_2012_02 = Sales_2012_02 / Customers_2012_02
...

How can I proceed? In other programming languages I would just write something like

for (i in list(01,02, ..., 12)) { Av_revenue_2012_'i' = Sales_2012_'i' / Customers_2012_'i' }

but in R it seems not to work like that. Further, and correct me if I am wrong, I cannot use a simple (i in 1:12) since I have a 0 digit in front of the single-digit months. Thanks in advance for your help
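To make the suggestion above concrete, here is a minimal sketch of both routes: building the zero-padded names with sprintf() inside a 1:12-style loop, and the friendlier alternative of keeping the months in vectors. The numbers are made up for illustration, only two months are shown, and sprintf() is used here as one of the "other" formatting functions alongside formatC().

```r
# Hypothetical monthly totals standing in for the poster's variables
Sales_2012_01 <- 100
Sales_2012_02 <- 240
Customers_2012_01 <- 10
Customers_2012_02 <- 12

# Route 1: loop over numbers and pad them to two digits ("01", "02", ...)
for (i in 1:2) {
  mm <- sprintf("%02d", i)
  assign(paste0("Av_revenue_2012_", mm),
         get(paste0("Sales_2012_", mm)) / get(paste0("Customers_2012_", mm)))
}

# Route 2: keep the months in named vectors and divide once, no loop needed
months <- sprintf("2012_%02d", 1:2)
sales <- c(100, 240)
customers <- c(10, 12)
names(sales) <- months
names(customers) <- months
av_revenue <- sales / customers
```

With the second layout, av_revenue["2012_02"] picks out a single month, and extending to twelve months only means longer vectors.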
Re: [R] GLMM post- hoc comparisons
On 08/01/2013 at 12:40, Silvina Velez sve...@mendoza-conicet.gob.ar wrote:

Hi All, I have data about seed predation (SP) in fruits of three different colors (yellow, mottled, dark) and in two fruiting seasons (2007, 2008). I fitted a GLMM (lmer function, lme4 package) and the output showed that the interaction term (color:season) was significant, and some combinations of this interaction have a significant Pr(>|z|), but I don't think they are the right significant combinations, because when I look at the bwplot, these combinations seem to be very different from the other ones. So, I would like to know if there is any a posteriori test that gives the p-values for each combination of color:season, so that I can tell which combination(s) is/are really significant.

m1=lmer(SP ~ color + season:color +(1|Site:tree), data=datosfl, family=poisson)

   AIC   BIC logLik deviance
 178.3 196.6 -81.14    162.3

Random effects:
 Groups    Name        Variance Std.Dev.
 obsBR     (Intercept) 0.064324 0.25362
 Site:tree (Intercept) 0.266490 0.51623
Number of obs: 73, groups: obsBR, 73; Site:tree, 37

                  Estimate Std. Error z value Pr(>|z|)
(Intercept)         2.5089     0.2750   9.125   <2e-16 ***
colorM             -0.1140     0.3242  -0.352   0.7250
colorD             -0.6450     0.4178  -1.544   0.1227
Season2008         -0.7343     0.3104  -2.365   0.0180 *
colorM:Season2008   0.2505     0.4352   0.576   0.5648
colorD:Season2008   1.1445     0.5747   1.992   0.0464 *

Hi Silvina, What exactly do you mean by "which combination(s) is/are significant"? If you mean which combinations have significantly greater SP than the baseline combination (yellow:2007), the table that you have copied may be what you actually want. If you want to test other contrasts between color:season combinations, perhaps you can use the function testInteractions() from the package phia. For instance, testInteractions(m1) will give you a test of all the pairwise contrasts between color and season. You can also test simple main effects, or other specific contrasts, by adding further arguments (see the documentation and the package vignette). In any case, p-values calculated for mixed models must always be treated with care.

Helios De Rosario-Martinez Instituto de Biomecánica de Valencia Universidad Politécnica de Valencia • Edificio 9C Camino de Vera s/n • 46022 VALENCIA (ESPAÑA) Tel. +34 96 387 91 60 • Fax +34 96 387 91 69 www.ibv.org
Re: [R] Sweave, Texshop, and sync with included Rnw file
Hi

Perhaps you need to make a master file and call the chapter files from it, e.g. (just copying the relevant section from my master GClimate12.Rnw file):

latex preliminaries + R options + begin
% plots
\SweaveInput{GClimate12RX.Rnw}
% Soil 300
% 15
\SweaveInput{GClimate12SP.Rnw}
% proportions
% 16
\SweaveInput{GClimate12RS.Rnw}
% cumsum days
% 17
\SweaveInput{GClimate12RC.Rnw}
closing commands etc end document

This means that you require one setup page rather than individual ones, with the attached benefits for the TOC etc.

Regards Duncan

Duncan Mackay Department of Agronomy and Soil Science University of New England Armidale NSW 2351 Email: home: mac...@northnet.com.au

At 06:25 10/01/2013, you wrote:

Hello everyone. I am in the process of writing a book in Latex with Texshop, on Mac. This book contains a lot of R code, hence the need to use Sweave. I was able to compile Rnw files, and to sync back and forth from the pdf to the source Rnw. My problem now is that the book is divided in Chapters, and every chapter is in its own Rnw file. I can compile them from the main one (book.Rnw) using the directive \SweaveInput{chapter1.Rnw} The problem stands in the fact that like this I am missing synchronization between the pdf and the source Rnw. If part of text is in book.Rnw I can synchronize, but if the text is in one of the included files, it just doesn't work. I am using the sweave engine found in the following webpage: http://cameron.bracken.bz/synctex-with-sweavepgfsweave-in-texshoptexworks Has anybody succeeded in synchronizing with included Rnw files? Thanks, Mic
Re: [R] Using objects within functions in formulas
David Winsemius dwinsemius at comcast.net writes:

On Jan 9, 2013, at 8:53 AM, Aidan MacNamara wrote:

I'm looking to create a formula within a function to pass to glmer() and I'm having a problem that the following example will illustrate:

library(lme4)
y1 = rnorm(10)
x1 = data.frame(x11=rnorm(10), x12=rnorm(10), x13=rnorm(10))
x1 = data.matrix(x1)
w1 = data.frame(w11=sample(1:3,10, replace=TRUE), w12=sample(1:3,10, replace=TRUE), w13=sample(1:3,10, replace=TRUE))
test1 <- function(x2, y2, w2) {
  print(str(w2))
  form = as.formula(paste("y2 ~ x2 + ", paste("(1|w2$", names(w2), ")", collapse=" + ", sep="")))
  m1 = glmer(form)
  return(m1)
}
model1 = test1(x2=x1, y2=y1, w2=w1)

As can be seen from the print statement within the function, the object w2 is present and is a data frame. However, the following error occurs:

Error in is.factor(x) : object 'w2' not found

[snip David's solution to try to make gmane happy about the amount of quoted material]

This can be rectified by making 'w2' global - defining it outside the function. I know there are issues with defining formulas and environment but I'm not sure why this problem is specific to 'w2' and not the other objects passed to the function. Any help would be appreciated. Aidan MacNamara EMBL-EBI

I haven't had a chance to look at this, but I will try to get to it. It would help if you could post it on the Issues page of the lme4 github site, https://github.com/lme4/lme4/ . The bottom line is that dealing appropriately with all the different possible ways to assign and evaluate variables within formulas is trickier than I would like it to be. To the best of my knowledge I have solved most of these problems in the development version of lme4, but another test case will be useful. As long as there is a reasonable workaround I'm unlikely to put the effort into fixing the stable version of lme4 (sorry ...) Follow-ups to r-sig-mixed-mod...@r-project.org or (preferably) to the aforementioned Issues list.
Ben Bolker
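One workaround that is commonly used in this situation, offered here only as a sketch and not as the fix discussed in this thread, is to avoid referring to w2$... inside the formula at all: collect the response, predictors and grouping variables into a single data frame, build the formula from the column names, and pass the data frame through the data argument. The helper name build_model_inputs and the response column name y are made up for this illustration.

```r
set.seed(42)
y1 <- rnorm(10)
x1 <- data.matrix(data.frame(x11 = rnorm(10), x12 = rnorm(10), x13 = rnorm(10)))
w1 <- data.frame(w11 = sample(1:3, 10, replace = TRUE),
                 w12 = sample(1:3, 10, replace = TRUE),
                 w13 = sample(1:3, 10, replace = TRUE))

# Hypothetical helper: returns a formula plus the data frame it refers to
build_model_inputs <- function(x2, y2, w2) {
  # One data frame holding every variable the formula will mention
  dat <- data.frame(y = y2, x2, w2)
  fixed  <- paste(colnames(x2), collapse = " + ")
  random <- paste0("(1 | ", names(w2), ")", collapse = " + ")
  form <- as.formula(paste("y ~", fixed, "+", random))
  list(formula = form, data = dat)
}

inputs <- build_model_inputs(x1, y1, w1)
print(inputs$formula)

# With lme4 installed, the fit would then look something like:
# m1 <- lme4::lmer(inputs$formula, data = inputs$data)
```

Because every name in the formula is a column of the data frame, the model-fitting function no longer has to find w2 in the calling environment.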
Re: [R] writing to .xlsx
On Jan 9, 2013, at 2:45 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote: On Wed, Jan 9, 2013 at 2:19 PM, Benjamin Caldwell btcaldw...@berkeley.edu wrote: Dear r helpers; I'm interested in reading from and writing to large .xlsx files fairly regularly. (Why, the naysayers may ask - and the answer is basically colleagues and clients who prefer that format). I've tried out the XLConnect and xlsx libraries, but the java implementation they use just takes too much RAM for the files I'm working with. gdata leverages perl and works really well for reading in those files, so half the problem is solved for me! I don't see anything in the documentation about writing .xlsx, though. Is anyone aware of any libraries or clever solutions in R that would get the job done for me? I see a couple packages on CPAN for writing an xlsx, so it's been done in perl; perhaps it would be easy to run that from R? I don't use perl myself (yet?). Looking for recommendations. Best Ben Caldwell Check out http://rwiki.sciviews.org/doku.php?id=tips:data-io:ms_windowss=excel and in particular the WriteXLS package can write Excel 2003 files (xls) using perl. Thanks for the referral Gabor. If Benjamin needs the xlsx format due to the larger dimensions supported, WriteXLS, since it writes xls format files, would not likely be suitable. Otherwise, of course, current versions of Excel can open the older format. If Benjamin simply needs to dump larger (for some definition of larger) datasets externally in format that is compatible with Excel, he could write out CSV files that, of course, can then be opened in Excel. That presumes that he is not looking to do any other formatting of the worksheets or other similar functionality that is native to Excel. Regards, Marc Schwartz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to estate the correlation between two autocorrelated variables
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Zhiqiu Hu
Sent: Wednesday, January 09, 2013 1:45 PM
To: r-help@r-project.org
Subject: [R] How to estate the correlation between two autocorrelated variables

Dear R users, My data contain two variables, t1 and t2. For each observation of t1 and t2, two location indicators (x, y) are provided. The data format is #x y t1 t2 Since both t1 and t2 depend on x and y, t1 and t2 are autocorrelated variables. My question is how to calculate the correlation between t1 and t2 while taking into account the structure of the residual variance caused by x and y. It seems that the gls function in the nlme package might be usable for this purpose. However, I failed to figure out how to use the function with my data. I would appreciate example code for the above data format. Please also let me know if there is a more suitable R package for this analysis. Best regards, Zhiqiu

If you want the partial correlation between t1 and t2 given x and y, then look at the pcor() function in the ppcor package.

Hope this is helpful, Dan

Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204
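For readers who want to see what a partial correlation amounts to, here is a minimal base-R sketch that does not depend on ppcor: regress each of t1 and t2 on x and y, then correlate the residuals. The data are simulated for illustration, and the linear dependence on x and y is an assumption of the sketch; it does not model spatial autocorrelation in the way a gls() fit with a correlation structure would.

```r
set.seed(1)
n <- 200
# Simulated data in the poster's layout: x y t1 t2
dat <- data.frame(x = runif(n), y = runif(n))
dat$t1 <- 5 * dat$x - 3 * dat$y + rnorm(n)
dat$t2 <- 5 * dat$x - 3 * dat$y + rnorm(n)

# Ordinary correlation, inflated by the shared dependence on x and y
raw_cor <- cor(dat$t1, dat$t2)

# Partial correlation: correlate what is left after removing x and y
r1 <- resid(lm(t1 ~ x + y, data = dat))
r2 <- resid(lm(t2 ~ x + y, data = dat))
partial_cor <- cor(r1, r2)

print(c(raw = raw_cor, partial = partial_cor))
```

In this simulation t1 and t2 are related only through x and y, so the partial correlation should come out much closer to zero than the raw one.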
Re: [R] writing to .xlsx
Folks, Thanks for your input. I'm pretty comfortable with the options for writing to .xls; I'm interested in

1. Something that can write to .xlsx for the larger supported dimensions, as Marc guessed; but of course he's right that .csv would work very well if that was the main goal. I'm really looking for

2. something I can use to write a new sheet (and/or columns within existing sheets) of .xlsx workbooks so I can more easily work with colleagues who're using those workbooks. It's a bit unwieldy, but they like to use VB, whose shortcomings I don't have to go into here, and I'd like to get in and out of their workbooks without all the current cumbersome open -> export as .csv -> read in -> export as .csv -> copy into workbook.

The other option would be to convert them to using R, but so far no luck there! Thanks again

*Ben Caldwell* PhD Candidate University of California, Berkeley 130 Mulford Hall #3114 Berkeley, CA 94720 Office 223 Mulford Hall (510)859-3358

On Wed, Jan 9, 2013 at 2:03 PM, Marc Schwartz marc_schwa...@me.com wrote:

On Jan 9, 2013, at 2:45 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote:

On Wed, Jan 9, 2013 at 2:19 PM, Benjamin Caldwell btcaldw...@berkeley.edu wrote:

Dear r helpers; I'm interested in reading from and writing to large .xlsx files fairly regularly. (Why, the naysayers may ask - and the answer is basically colleagues and clients who prefer that format). I've tried out the XLConnect and xlsx libraries, but the java implementation they use just takes too much RAM for the files I'm working with. gdata leverages perl and works really well for reading in those files, so half the problem is solved for me! I don't see anything in the documentation about writing .xlsx, though. Is anyone aware of any libraries or clever solutions in R that would get the job done for me? I see a couple packages on CPAN for writing an xlsx, so it's been done in perl; perhaps it would be easy to run that from R? I don't use perl myself (yet?). Looking for recommendations. Best Ben Caldwell

Check out http://rwiki.sciviews.org/doku.php?id=tips:data-io:ms_windowss=excel and in particular the WriteXLS package can write Excel 2003 files (xls) using perl.

Thanks for the referral Gabor. If Benjamin needs the xlsx format due to the larger dimensions supported, WriteXLS, since it writes xls format files, would not likely be suitable. Otherwise, of course, current versions of Excel can open the older format. If Benjamin simply needs to dump larger (for some definition of larger) datasets externally in format that is compatible with Excel, he could write out CSV files that, of course, can then be opened in Excel. That presumes that he is not looking to do any other formatting of the worksheets or other similar functionality that is native to Excel. Regards, Marc Schwartz
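As a small illustration of the CSV route mentioned above (it covers only the "dump a large table" case, not adding sheets to an existing workbook), the sketch below writes a data frame with more rows than the 65,536-row limit of the old .xls format and reads it back. The data and the temporary file are made up for the example.

```r
n <- 70000  # more rows than a single .xls sheet can hold
big <- data.frame(id = seq_len(n), value = sqrt(seq_len(n)))

out <- tempfile(fileext = ".csv")
write.csv(big, out, row.names = FALSE)  # plain CSV, no Java or perl needed

# Read it back to confirm nothing was lost on the way
back <- read.csv(out)
print(dim(back))
```

Excel opens such a file directly, although any worksheet formatting or formulas would still have to be applied on the Excel side.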
Re: [R] how to count A, C, T, G in each row in a big data.frame?
In fact I want to calculate the gene frequency of each SNP. The key problems are that: 1. my data.frame is large ,about 50,000 rows. So it is so slow to split() it by row 2 .The allele in each SNP (each row) are different.Some are A/G, some are G/C. It is a little bit embarrassed for me to handle it. Thank you for your help 2013/1/9 jim holtman jholt...@gmail.com: forgot the data. this will count the characters; you can add logic with 'table' to count groups x - structure(list(name = c(Gga_rs10722041, Gga_rs10722249, Gga_rs10722565, Gga_rs10723082, Gga_rs10723993, Gga_rs10724555, Gga_rs10726238, Gga_rs10726461, Gga_rs10726774, Gga_rs10726967, Gga_rs10727581, Gga_rs10728004, Gga_rs10728156, Gga_rs10728177, Gga_rs10728373, Gga_rs10728585, Gga_rs10729598, Gga_rs10729643, Gga_rs10729685, Gga_rs10729827), chr = c(7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L), pos = c(11248993L, 20038370L, 16164457L, 38050527L, 20307106L, 13707090L, 12230458L, 36732967L, 2790856L, 1305785L, 29631963L, 13606593L, 13656397L, 2261611L, 32096703L, 13733153L, 16524147L, 558735L, 12514023L, 3619538L), strand = c(+, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +), X2353 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, AG, AG, AG, TT, CC, AG, CC, AA, GG, GG), X2409 = c(AA, CT, TT, CC, CT, CC, CC, TT, CC, GG, GG, AG, AG, TT, CC, AG, CC, AA, AG, GA), X2500 = c(GA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GG, GG, GG, GT, CT, GG, CC, AA, AA, AA), X2598 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, AA, AG, GG, TT, CC, AG, TC, AA, AA, AG), X2610 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GA, GA, GG, TT, CC, GA, CC, AA, AA, GA), X2300 = c(GA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GA, AA, AG, TT, TC, AA, TC, AA, AG, AA), X2507 = c(AG, TT, TT, CC, TT, CC, CC, TT, CC, GG, GG, GA, GG, TT, TC, GG, CC, AA, GA, AG), X2530 = c(AG, TC, TT, CC, TC, CC, CC, TT, CC, GG, AA, GG, GG, TT, CC, GG, CC, AA, AA, AA), X2327 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GA, GG, GG, TT, TC, GG, 
CC, AA, AA, AA), X2389 = c(AA, CC, TT, CC, CC, CC, CC, TT, CC, AG, GG, AG, GG, TT, TC, AG, CC, AA, AA, AA), X2408 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GA, GA, GG, TT, CC, GA, CC, AA, AA, AG), X2463 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GG, GG, GG, TT, CT, GG, CC, AA, AA, GA), X2420 = c(GA, TC, TT, CC, TC, CC, CC, TT, CC, GG, AG, GG, GG, TG, TT, GG, CT, AA, AA, AA), X2563 = c(GA, CC, TT, CC, TC, CC, CC, TT, CC, GG, GA, GG, GG, GT, TT, GG, CT, AA, AA, AA), X2462 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, AA, GG, GG, GT, TC, GG, CC, AA, AA, AA), X2292 = c(GA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GA, AA, GG, TG, TC, AA, TC, AA, AA, AA), X2405 = c(GA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GG, AG, GG, TG, TT, AA, CT, AA, AA, AA), X2543 = c(AA, TC, TT, CC, TC, CC, CC, TT, CC, GA, GA, GA, GG, TT, CT, GA, TT, AA, AA, GG), X2557 = c(AG, CT, TT, CC, CT, CC, CC, TT, CC, GG, AG, GA, GG, GT, CT, GA, CT, AA, AA, AG), X2583 = c(GA, CT, TT, CC, CT, CC, CC, TT, CC, GG, GA, GG, GG, GG, CT, GA, CT, AA, AA, AG), X2322 = c(AG, TT, TT, CC, TT, CC, CC, TT, CC, GG, GG, GG, GG, GT, TT, GG, CC, AA, AA, GA), X2535 = c(AA, TC, TT, CC, TT, CC, CC, TT, CC, GG, GA, GG, GG, TT, CC, GG, CC, AA, AA, AG), X2536 = c(GA, TC, TT, CC, TC, CC, CC, TT, CC, GG, GG, AG, GG, TT, TC, AG, TC, AA, AA, GA), X2581 = c(AG, CT, TT, CC, CT, CC, CC, TT, CC, GG, GG, GA, GG, TT, CC, GA, CT, AA, AA, AG), X2570 = c(AA, CT, TT, CC, CT, CC, CC, TT, CC, GG, GG, GG, GG, TT, TC, GG, CC, AA, AA, GG), X2476 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GG, GG, GG, GT, TC, AG, CC, AA, AA, AG), X2534 = c(GA, TC, TT, CC, TC, CC, CC, TT, CC, GG, GA, AG, GG, TG, CC, AG, TC, AA, AA, AA), X2280 = c(AA, TC, TT, CC, TC, CC, CC, TT, CC, GG, AG, AG, GG, TT, CC, GG, CC, AA, AA, AG), X2316 = c(AA, CC, TT, CC, CC, CC, CC, TT, CC, AG, AA, AA, AG, TT, TC, GG, CT, AA, GG, GG), X2339 = c(AA, CC, TT, CC, CC, CC, CC, TT, CC, GA, AA, GG, GG, GT, CT, GG, TT, AA, AA, AG), X2331 = c(AA, TC, TT, CC, TC, CC, CC, TT, CC, GG, GG, GG, GG, TT, CC, 
GG, CC, AA, AA, AG), X2343 = c(AA, TC, TT, CC, TC, CC, CC, TT, CC, GG, GG, GG, GG, TT, CT, GG, CC, AA, AA, GA), X2352 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, AA, GG, GG, TT, CC, GG, CC, AA, GA, AG), X2293 = c(GA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GA, AA, GG, TT, TC, AA, CT, AA, AA, AA), X2338 = c(GA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GG, AG, GG, TT, TC, AG, TC, AA, AA, GA), X2449 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, AG, AA, GG, TT, CC, AA, TC, AA, AA, GA), X2296 = c(GA, TT, TT, CC, TT, CC, CC, TT, CC, GA, GG, AG, GG, TG, TC, AG, CC, AA, AA, AA), X2453 = c(AG,
Re: [R] how to count A, C, T, G in each row in a big data.frame?
Thanks a lot. The problem is that I don't know how to handle the output list, as I want to calculate the frequency of A, G, T or C by row. Yao He

2013/1/10 Jessica Streicher j.streic...@micromata.de:

Sorry, you wanted rows, I wrote for columns.

#rows would be:
test2 <- apply(test[,-c(1:4)], 1, function(x){table(t(x))})
#find single values in a row
sapply(test2, function(row){
  allVars <- paste(names(row), collapse="")
  u <- unique(strsplit(allVars, "")[[1]])
  parts <- sapply(names(row), function(x){u %in% strsplit(x, "")[[1]]})
  mat <- parts %*% row
  rownames(mat) <- u
  mat
})

Though I guess lists aren't ideal, but there's another answer as well, I see.

On 09.01.2013, at 15:23, Yao He wrote:

Dear All, I have a data.frame like this:

[snip dput() output of the data.frame; the same data are quoted earlier in this thread]
Re: [R] how to count A, C, T, G in each row in a big data.frame?
Can you get what you need from the following, where 'd' is your data.frame, the first four columns of which are irrelevant to this problem?

dd <- d[,-(1:4)] ; table(rownames(dd)[row(dd)], unlist(dd))

        AA AG CC CT GA GG GT TC TG TT
  27412 29 10  0  0 13  1  0  0  0  0
  27413  0  0  4  9  0  0  0 12  0 28
  27414  0  0  0  0  0  0  0  0  0 53
  27415  0  0 53  0  0  0  0  0  0  0
  ...
  27430 46  3  0  0  2  2  0  0  0  0
  27431 19 15  0  0 15  4  0  0  0  0

table() is pretty quick.

Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Yao He
Sent: Wednesday, January 09, 2013 4:04 PM
To: jim holtman
Cc: R help
Subject: Re: [R] how to count A, C, T, G in each row in a big data.frame?

In fact I want to calculate the gene frequency of each SNP. The key problems are: 1. My data.frame is large, about 50,000 rows, so it is very slow to split() it by row. 2. The alleles in each SNP (each row) are different. Some are A/G, some are G/C. It is a bit awkward for me to handle. Thank you for your help

2013/1/9 jim holtman jholt...@gmail.com:

forgot the data. this will count the characters; you can add logic with 'table' to count groups

x <- [snip dput() output of the data.frame; the same data are quoted earlier in this thread]
Re: [R] how to count A, C, T, G in each row in a big data.frame?
It is really a good output. Maybe I could go on with this output. Every time I get your help I understand R a little better. The first four columns are indeed irrelevant; leaving them in was an oversight on my part.

2013/1/10 William Dunlap <wdun...@tibco.com>:

Can you get what you need from the following, where 'd' is your data.frame, the first four columns of which are irrelevant to this problem?

    dd <- d[,-(1:4)] ; table(rownames(dd)[row(dd)], unlist(dd))

table() is pretty quick.
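The table() idiom quoted above can be tried on a small stand-in data set. This is only a sketch: the SNP IDs and sample column names below are made up for illustration, and the genotype columns are assumed to be character vectors.

```r
# Toy stand-in for the genotype columns (hypothetical IDs and sample names)
dd <- data.frame(
  X1 = c("AA", "TT", "CC"),
  X2 = c("AG", "CT", "CC"),
  X3 = c("GA", "TT", "CC"),
  row.names = c("snp1", "snp2", "snp3"),
  stringsAsFactors = FALSE
)

# row(dd) holds the row index of every cell and unlist(dd) holds the cell
# values in the same column-major order, so the two line up cell by cell.
geno_counts <- table(rownames(dd)[row(dd)], unlist(dd))
print(geno_counts)
```

Each row of the result counts how often every genotype string occurs for that SNP, so every row sums to the number of sample columns.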
[R] SRS, Stratified, and Cluster sampling
Hi,

Has anyone done (or does anyone know of) any nice R activities that help introductory students (and teachers :) ) better understand the concepts of simple vs. stratified vs. cluster sampling? Any links?

David
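As a possible starting point for such an activity, here is a minimal base-R sketch that draws the three kinds of sample from one simulated population. Everything in it (the population, the classroom and grade variables, the sample sizes) is invented for illustration.

```r
set.seed(1)

# Hypothetical population: 200 students in 10 classrooms (clusters),
# split into two grade levels (strata)
pop <- data.frame(
  id        = 1:200,
  classroom = rep(1:10, each = 20),
  grade     = rep(c("junior", "senior"), times = c(120, 80)),
  score     = round(rnorm(200, mean = 70, sd = 10))
)

# Simple random sample: 20 students, each equally likely to be chosen
srs <- pop[sample(nrow(pop), 20), ]

# Stratified sample: 10 students drawn separately within each grade
strat <- do.call(rbind, lapply(split(pop, pop$grade),
                               function(s) s[sample(nrow(s), 10), ]))

# Cluster sample: choose 2 classrooms at random and keep everyone in them
chosen <- sample(unique(pop$classroom), 2)
clus   <- pop[pop$classroom %in% chosen, ]

# Compare the sample means of the three designs
sapply(list(srs = srs, stratified = strat, cluster = clus),
       function(s) mean(s$score))
```

Re-running the last part many times (for example with replicate()) lets students compare how much the three estimates vary from sample to sample.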
[R] piece-wise linear regression nls function
Windows 7, R 2.12

I am trying to run a piecewise linear regression with a single knot, i.e. a regression composed of two straight lines where the two lines intersect at an x value given by the variable knot. I wish to estimate the slope of both lines, the value of knot (the x value where the two lines intersect), and an intercept. I am using the nls code below, and get the following error message:

    Error in nls(FM ~ blow * BMIJS + bhi * sapply(BMIJS - knot, max, 0), start = list(knot = 25, :
      singular gradient

nls code:

    test <- nls(FM~blow*BMIJS+bhi*sapply(BMIJS-knot,max,0),start=list(knot=25,blow=1,bhi=1),data=FeMaleData)
    summary(test)

Greatly shortened version of my data (the full data set has 450 records):

            FM    BMIJS
    2   55.878 40.57273
    4   34.270 27.76939
    5   20.123 21.73818
    6   19.320 19.71203
    9   49.701 43.55356
    10  51.188 37.84742
    11  46.753 37.71003
    13  65.079 37.23438
    14  37.097 36.81806
    15  30.625 29.92783
    17  50.617 42.42754
    18  63.954 48.78709
    20  29.790 26.97648
    21  36.558 34.79373
    22  41.275 33.03063
    24  27.682 27.24508
    26  37.968 35.41399
    28  24.878 27.20250
    30  47.513 35.77961
    31  51.315 37.46032
    33  41.944 36.40212
    34  38.150 32.83818
    35  60.719 42.48594
    36  42.643 34.29355
    38  40.728 32.42817
    42  34.814 30.57573
    43  32.896 29.32912
    44  30.430 25.44183
    46  48.986 37.90910
    49  47.485 36.34642
    52  46.312 38.64647
    54  45.228 33.08783
    55  45.391 35.86965
    59  37.256 32.66507
    60  27.367 28.49880
    63  38.663 34.34131
    64  34.527 29.57858
    67  58.368 38.97266
    68  13.473 17.35397
    69  22.456 20.80958
    71  28.829 25.50056
    73  15.487 20.22202
    76  18.313 21.38991
    77  41.535 36.85707
    78  56.124 40.51978
    80  52.587 40.77256
    81  24.991 25.48543
    83  56.327 39.97214
    84  70.836 36.52915
    85  62.294 42.45244
    86  39.689 35.18527
    87  35.006 35.15136
    88  47.378 37.54779
    89  18.149 23.99236
    90  33.041 28.10476
    91  28.884 26.74443
    92  37.670 32.25230
    94  55.410 43.72364
    99  34.461 35.05930
    101 59.727 42.83035
    102 41.913 35.64677
    104 66.644 41.01642
    105 55.250 43.86426
    107 45.196 31.78370
    108 36.476 33.45537
    109 34.386 29.08402
    110 39.277 36.98500
    111 53.789 45.54654
    112 33.077 29.09559
    116 57.246 39.98031
    120 52.546 40.12191
    122 34.409 29.70977
    123 31.188 28.75295
    126 54.567 38.15226
    129 19.193 22.71878
    133 39.322 33.45712
    134 41.415 31.28980
    136 57.616 36.94016
    140 28.162 24.40219
    142 37.524 29.92673
    143 29.611 29.15452
    144 26.780 26.53462
    146 47.219 35.14919
    147 35.341 28.68955
    148 44.827 37.68317
    149 54.180 41.12226
    150 41.636 30.00930
    151 33.626 28.00164
    156 34.334 29.64970
    160 36.317 30.12031
    161 46.823 35.64603
    163 39.506 34.27740
    164 61.619 39.20019
    169 48.984 35.77558
    171 66.467 41.59008
    172 70.144 42.79996
    173 37.324 31.56521
    174 66.882 46.04938
    182 54.239 38.21065
    184 48.800 32.01630

Thanks,
John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
Re: [R] multiple versions of function
On Jan 9, 2013, at 1:00 PM, ivo welch wrote:

mea culpa.

    f <- function(...) {
        ## parse out the arguments and then do something with them
    }

    ## all of these should result in the same actions
    f(2,3)   ## interprets a to be first and b to be second
    f(a=2,b=3)
    f(b=3,a=2)
    f(data.frame(a=2,b=3))
    f(data.frame(b=3,a=1))

In the last two instances you are only passing a single object. I suppose you could construct the argument list with

    f <- function( a=NA, ...) { code }

But this works:

    f <- function(a=NA, b=NA)
        if( !is.list(a) ) { print(a); cat("\n"); print(b) } else {
            with(a, { print(a); cat("\n"); print(b) } ) }

There is some concern about using `with` inside functions, so maybe you would want to access the values with a[["a"]] and a[["b"]] instead.

Test output:

    f(2,3)
    [1] 2

    [1] 3
    f(a=2,b=3)
    [1] 2

    [1] 3
    f(b=3,a=2)
    [1] 2

    [1] 3
    f(data.frame(a=2,b=3))
    [1] 2

    [1] 3
    f(data.frame(b=3,a=1))
    [1] 1

    [1] 3

On Tue, Jan 8, 2013 at 8:00 AM, David Winsemius <dwinsem...@comcast.net> wrote:

On Jan 7, 2013, at 6:58 PM, ivo welch wrote:

hi david---can you give just a little more of an example? the function should work with call by order, call by name, and data frame whose columns are the names. /iaw

It is I who should be expecting you to provide an example.

-- David.

David Winsemius
Alameda, CA, USA
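The suggestion to index by name rather than rely on `with` could look roughly like the sketch below. It is one possible reading of that remark, not the code from the post; returning a named vector instead of printing is an assumption made here so the result is easy to check.

```r
f <- function(a = NA, b = NA) {
  # If the first argument is a data frame (or any list), pull the values
  # out by name; b must be extracted before a is overwritten.
  if (is.list(a)) {
    b <- a[["b"]]
    a <- a[["a"]]
  }
  c(a = a, b = b)
}

f(2, 3)                       # call by position
f(b = 3, a = 2)               # call by name, any order
f(data.frame(b = 3, a = 1))   # single data frame with columns a and b
```

Because the data frame columns are looked up by name, the column order inside the data frame does not matter.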
Re: [R] Sweave, Texshop, and sync with included Rnw file
On 13-01-09 3:25 PM, michele caseposta wrote:

Hello everyone. I am in the process of writing a book in LaTeX with TeXShop, on a Mac. This book contains a lot of R code, hence the need to use Sweave. I was able to compile Rnw files, and to sync back and forth between the pdf and the source Rnw. My problem now is that the book is divided into chapters, and every chapter is in its own Rnw file. I can compile them from the main one (book.Rnw) using the directive

    \SweaveInput{chapter1.Rnw}

The problem is that this way I lose synchronization between the pdf and the source Rnw. If the text is in book.Rnw I can synchronize, but if the text is in one of the included files, it just doesn't work. I am using the Sweave engine found on the following webpage: http://cameron.bracken.bz/synctex-with-sweavepgfsweave-in-texshoptexworks

Has anybody succeeded in synchronizing with included Rnw files?

This is a problem addressed by my patchDVI package, available on R-forge. You have a main file (which can be .tex or .Rnw), and put code at the start of each .Rnw file to indicate where to find it. Then you just run Sweave on one of the chapters, and it automatically produces the full document. The sample document here: http://www.umanitoba.ca/statistics/seminars/2011/3/4/duncan-murdoch-using-sweave-R/ includes an appendix describing how to set this up with TeXShop.

Duncan Murdoch
Re: [R] piece-wise linear regression nls function
On Jan 9, 2013, at 5:33 PM, John Sorkin wrote:

windows 7, R 2.12 I am trying to run a piecewise linear regression with a single knot, i.e. a regression composed of two straight lines where the two lines intersect at an x value given by the variable knot. I wish to estimate the slope of both lines, the value of knot, the x value where the two lines intersect, and an intercept. I am using the nls code below, and get the following error message:

    Error in nls(FM ~ blow * BMIJS + bhi * sapply(BMIJS - knot, max, 0), start = list(knot = 25, :
      singular gradient

nls code:

    test <- nls(FM~blow*BMIJS+bhi*sapply(BMIJS-knot,max,0),start=list(knot=25,blow=1,bhi=1),data=FeMaleData)
    summary(test)

I was surprised to see `sapply` inside a formula expression. I instead imagined that this might have been what was meant:

    test <- nls( FM ~ blow*BMIJS + bhi*pmax(BMIJS-knot,0) ,
                 start=list(knot=25,blow=1,bhi=1), data=FeMaleData)
    summary(test)

    Formula: FM ~ blow * BMIJS + bhi * pmax(BMIJS - knot, 0)

    Parameters:
         Estimate Std. Error t value Pr(>|t|)
    knot  21.4960     3.2095   6.698 1.39e-09 ***
    blow   0.8983     0.1264   7.106 2.02e-10 ***
    bhi    0.9551     0.1610   5.931 4.63e-08 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 5.638 on 97 degrees of freedom

    Number of iterations to convergence: 4
    Achieved convergence tolerance: 8.684e-09

I offer no particular opinion on whether this is sensible, only that it does not break the interpreter's understanding of function application. And the knot seems to be within the range of the values, albeit toward the left-hand edge:

    with(FeMaleData, plot(FM~BMIJS) )
    lines(seq(15, 50), predict(test, newdata=list(BMIJS=seq(15, 50)) ) )

-- David.
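The pmax() hinge term from the reply can be explored on synthetic data. The sketch below is not the fit from the thread: it uses made-up, noise-free data with a known knot, and instead of nls it swaps in a plain grid search over candidate knots with lm(), which is sometimes a useful fallback when nls reports a singular gradient. The slopes 0.9 and 1.0, the knot at 25, and the grid are all assumptions chosen for illustration.

```r
# Synthetic, noise-free data with a known knot (all values are made up)
x <- seq(15, 50, by = 0.5)
knot_true <- 25
y <- 0.9 * x + 1.0 * pmax(x - knot_true, 0)

# pmax() is the vectorised hinge; element by element it gives the same
# values as sapply(x - knot, max, 0)
hinge_pmax   <- pmax(x - knot_true, 0)
hinge_sapply <- sapply(x - knot_true, max, 0)

# Residual sum of squares of a no-intercept two-slope fit at a fixed knot k
rss_at <- function(k) {
  fit <- lm(y ~ 0 + x + pmax(x - k, 0))
  sum(resid(fit)^2)
}

# Grid search: keep the candidate knot with the smallest RSS
knots     <- seq(20, 30, by = 1)
rss       <- sapply(knots, rss_at)
best_knot <- knots[which.min(rss)]

best_fit <- lm(y ~ 0 + x + pmax(x - best_knot, 0))
coef(best_fit)
```

With real, noisy data the grid-search knot can also serve as a starting value for nls rather than as a final estimate.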
Re: [R] Random Rectangles
On 01/10/2013 07:37 AM, David Arnold wrote:

Hi, Just curious. Has anyone out there ever written a script to generate 100 random rectangles such as the ones shown on this page? http://www2.math.umd.edu/~jlh/214/Random%20Rectangles.pdf

Hi David,
There are a number of ways to generate random rectangles, for instance:

    # each row specifies the number of rows and columns of squares
    rr.df <- data.frame(nrow=sample(1:12,100,TRUE,prob=12:1),
     ncol=sample(1:12,100,TRUE,prob=12:1))

Then just plot the resulting rectangles:

    sqrect <- function(x0,y0,x1,y1) {
     nx <- x1-x0-1
     ny <- y1-y0-1
     for(x in 0:nx) {
      for(y in 0:ny) rect(x0+x,y0+y,x0+x+1,y0+y+1)
     }
    }

    rrPlot <- function(rrdf,div=1.3) {
     nrect <- dim(rrdf)[1]
     plotspace <- nrect/div
     plot(c(1,plotspace),c(1,plotspace),type="n",
      axes=FALSE,xlab="",ylab="",main="Random Rectangles")
     xpos <- ypos <- maxypos <- 1
     for(rectangle in 1:nrect) {
      if(xpos+rrdf[rectangle,1] > plotspace) {
       xpos <- 1
       ypos <- maxypos
       maxypos <- 1
      }
      sqrect(xpos,ypos,xpos+rrdf[rectangle,1],
       ypos+rrdf[rectangle,2])
      xpos <- xpos+rrdf[rectangle,1]+1
      if(ypos+rrdf[rectangle,2] > maxypos)
       maxypos <- ypos+rrdf[rectangle,2]+2
     }
    }

The example above does not do any sophisticated placing of the rectangles, but more importantly, shows that there are probably unstated constraints on the randomness of the rectangles.

Jim
Re: [R] graphical distance matrix
On 01/10/2013 07:50 AM, eliza botto wrote:

Dear R-family, I made a distance matrix of about 2000 stations. It is extremely hard to visualize the details of that matrix. I heard that there is a way in R to represent the details of a distance matrix graphically. More precisely, different sections of the distance matrix can be presented in different colors, with low values shown in light colors and high values in dark ones. Is there really a way of doing it? thanks in advance regards elisa

Hi elisa,
In the example for the function color.scale.lines (plotrix) you will find one method of coloring something (lines in this case) depending upon the distance from something else (the starting point). With judicious modification, I think it might do what you want.

Jim
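A different route from the plotrix example, using only base R, is to draw the distance matrix with image() and a light-to-dark grey ramp. The sketch below uses 20 made-up station coordinates; the station data, the number of shades, and the reordering by hierarchical clustering are all assumptions added for illustration.

```r
set.seed(42)

# 20 hypothetical stations with random (x, y) coordinates
stations <- matrix(runif(40), ncol = 2)
d <- as.matrix(dist(stations))

# Grey ramp running from white (small distances) to black (large distances)
shades <- gray(seq(1, 0, length.out = 64))

# Reorder stations by hierarchical clustering so that groups of nearby
# stations show up as light blocks along the diagonal
ord <- hclust(dist(stations))$order

image(seq_len(nrow(d)), seq_len(ncol(d)), d[ord, ord], col = shades,
      xlab = "station (clustered order)", ylab = "station (clustered order)",
      main = "Distance matrix")
```

For a matrix of about 2000 stations the same call should still work, although the individual cells become very small, so plotting sub-blocks of the reordered matrix may be easier to read.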
Re: [R] how to count A, C, T, G in each row in a big data.frame?
Hi arun

Then how could I split them and get a table of letter counts, such as:

    id AA AG CC CT GA GG GT TC TG TT

       id     A  T  C  G
    #1 27412 81  0  0 25
    #2 27413  0 77 29  0

Thanks

2013/1/10 arun <smartpink...@yahoo.com>:

Hi Yao,
You could also use:

    library(reshape2)
    dd <- dat1[,-(1:4)]
    res <- dcast(melt(within(dd,{id=row.names(dd)}),id.var="id"),id~value,length)
    head(res)
    #     id AA AG CC CT GA GG GT TC TG TT
    #1 27412 29 10  0  0 13  1  0  0  0  0
    #2 27413  0  0  4  9  0  0  0 12  0 28
    #3 27414  0  0  0  0  0  0  0  0  0 53
    #4 27415  0  0 53  0  0  0  0  0  0  0
    #5 27416  0  0  3  9  0  0  0 12  0 29
    #6 27417  0  0 53  0  0  0  0  0  0  0

    #Just for comparison:
    dat2 <- dat1[rep(row.names(dat1),2000),]
    nrow(dat2)
    #[1] 40000
    row.names(dat2) <- 1:40000
    dd <- dat2[,-(1:4)]
    system.time(res1 <- table(rownames(dd)[row(dd)], unlist(dd)))
    #   user  system elapsed
    #  5.840   0.104   5.954
    system.time(res2 <- dcast(melt(within(dd,{id=row.names(dd)}),id.var="id"),id~value,length))
    #   user  system elapsed
    #  3.100   0.064   3.167
    head(res1,3)
    #       AA AG CC CT GA GG GT TC TG TT
    #  1    29 10  0  0 13  1  0  0  0  0
    #  10    0  4  0  0  6 43  0  0  0  0
    #  100  19 15  0  0 15  4  0  0  0  0
    head(res2,3)
    #   id AA AG CC CT GA GG GT TC TG TT
    #1   1 29 10  0  0 13  1  0  0  0  0
    #2  10  0  4  0  0  6 43  0  0  0  0
    #3 100 19 15  0  0 15  4  0  0  0  0

A.K.

----- Original Message -----
From: Yao He <yao.h.1...@gmail.com>
To: R help <r-help@r-project.org>
Cc:
Sent: Wednesday, January 9, 2013 9:23 AM
Subject: [R] how to count A,C,T,G in each row in a big data.frame?
Dear All I have a data.frame like that: structure(list(name = c(Gga_rs10722041, Gga_rs10722249, Gga_rs10722565, Gga_rs10723082, Gga_rs10723993, Gga_rs10724555, Gga_rs10726238, Gga_rs10726461, Gga_rs10726774, Gga_rs10726967, Gga_rs10727581, Gga_rs10728004, Gga_rs10728156, Gga_rs10728177, Gga_rs10728373, Gga_rs10728585, Gga_rs10729598, Gga_rs10729643, Gga_rs10729685, Gga_rs10729827), chr = c(7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L), pos = c(11248993L, 20038370L, 16164457L, 38050527L, 20307106L, 13707090L, 12230458L, 36732967L, 2790856L, 1305785L, 29631963L, 13606593L, 13656397L, 2261611L, 32096703L, 13733153L, 16524147L, 558735L, 12514023L, 3619538L), strand = c(+, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +, +), X2353 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, AG, AG, AG, TT, CC, AG, CC, AA, GG, GG), X2409 = c(AA, CT, TT, CC, CT, CC, CC, TT, CC, GG, GG, AG, AG, TT, CC, AG, CC, AA, AG, GA), X2500 = c(GA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GG, GG, GG, GT, CT, GG, CC, AA, AA, AA), X2598 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, AA, AG, GG, TT, CC, AG, TC, AA, AA, AG), X2610 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GA, GA, GG, TT, CC, GA, CC, AA, AA, GA), X2300 = c(GA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GA, AA, AG, TT, TC, AA, TC, AA, AG, AA), X2507 = c(AG, TT, TT, CC, TT, CC, CC, TT, CC, GG, GG, GA, GG, TT, TC, GG, CC, AA, GA, AG), X2530 = c(AG, TC, TT, CC, TC, CC, CC, TT, CC, GG, AA, GG, GG, TT, CC, GG, CC, AA, AA, AA), X2327 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GA, GG, GG, TT, TC, GG, CC, AA, AA, AA), X2389 = c(AA, CC, TT, CC, CC, CC, CC, TT, CC, AG, GG, AG, GG, TT, TC, AG, CC, AA, AA, AA), X2408 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GA, GA, GG, TT, CC, GA, CC, AA, AA, AG), X2463 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GG, GG, GG, TT, CT, GG, CC, AA, AA, GA), X2420 = c(GA, TC, TT, CC, TC, CC, CC, TT, CC, GG, AG, GG, GG, TG, TT, GG, CT, AA, AA, AA), X2563 = c(GA, CC, TT, CC, TC, CC, CC, TT, CC, 
GG, GA, GG, GG, GT, TT, GG, CT, AA, AA, AA), X2462 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, AA, GG, GG, GT, TC, GG, CC, AA, AA, AA), X2292 = c(GA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GA, AA, GG, TG, TC, AA, TC, AA, AA, AA), X2405 = c(GA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GG, AG, GG, TG, TT, AA, CT, AA, AA, AA), X2543 = c(AA, TC, TT, CC, TC, CC, CC, TT, CC, GA, GA, GA, GG, TT, CT, GA, TT, AA, AA, GG), X2557 = c(AG, CT, TT, CC, CT, CC, CC, TT, CC, GG, AG, GA, GG, GT, CT, GA, CT, AA, AA, AG), X2583 = c(GA, CT, TT, CC, CT, CC, CC, TT, CC, GG, GA, GG, GG, GG, CT, GA, CT, AA, AA, AG), X2322 = c(AG, TT, TT, CC, TT, CC, CC, TT, CC, GG, GG, GG, GG, GT, TT, GG, CC, AA, AA, GA), X2535 = c(AA, TC, TT, CC, TT, CC, CC, TT, CC, GG, GA, GG, GG, TT, CC, GG, CC, AA, AA, AG), X2536 = c(GA, TC, TT, CC, TC, CC, CC, TT, CC, GG, GG, AG, GG, TT, TC, AG, TC, AA, AA, GA), X2581 = c(AG, CT, TT, CC, CT, CC, CC, TT, CC, GG, GG, GA, GG, TT, CC, GA, CT, AA, AA, AG), X2570 = c(AA, CT, TT, CC, CT, CC, CC, TT, CC, GG, GG, GG, GG, TT, TC, GG, CC, AA, AA, GG), X2476 = c(AA, TT, TT, CC, TT, CC, CC, TT, CC, GG, GG,
Re: [R] how to count A, C, T, G in each row in a big data.frame?
Here is one option (not the best, but does the job):

    foo <- function(x) table(factor(unlist(strsplit(as.character(x), "")), levels = c('A','C','G','T')))
    t(apply(d[, -c(1:4)], 1, foo))

What's wrong with Jim Holtman's solution?

HTH,
Jorge.-

On Thu, Jan 10, 2013 at 3:46 PM, Yao He wrote:

Hi arun

Then how could I split them and get a table of letter counts, such as:

    id AA AG CC CT GA GG GT TC TG TT

       id     A  T  C  G
    #1 27412 81  0  0 25
    #2 27413  0 77 29  0

Thanks
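The strsplit()-based approach can be checked on a toy example. This is only a sketch with made-up SNP IDs and sample columns; the final step, turning counts into per-row allele frequencies with prop.table(), is an addition aimed at the "gene frequency" goal mentioned earlier in the thread, not something taken from the posts.

```r
# Toy genotype columns (hypothetical IDs and sample names)
dd <- data.frame(
  X1 = c("AA", "TT", "CC"),
  X2 = c("AG", "CT", "CC"),
  X3 = c("GA", "TT", "CC"),
  row.names = c("snp1", "snp2", "snp3"),
  stringsAsFactors = FALSE
)

# Split every genotype in a row into single letters and count them;
# fixing the factor levels keeps the four columns even when a count is zero
count_bases <- function(x) {
  letters_in_row <- unlist(strsplit(as.character(x), ""))
  table(factor(letters_in_row, levels = c("A", "C", "G", "T")))
}

base_counts <- t(apply(dd, 1, count_bases))
base_counts

# Allele frequencies: each row divided by its own total
allele_freq <- prop.table(base_counts, 1)
allele_freq
```

apply() loops over rows in R code, so on 50,000 rows this may be noticeably slower than the single table() call shown earlier in the thread.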
[R] ./R: error while loading shared libraries
Hi,

I have installed R on Linux using a non-root account. I am getting this error when trying to use it:

    ./R: error while loading shared libraries: libRblas.so: cannot open shared object file: No such file or directory

Linux version I am using:

    Linux version 2.6.32-131.17.1.el6.x86_64 (mockbu...@x86-007.build.bos.redhat.com) (gcc version 4.4.5 20110214 (Red Hat 4.4.5-6) (GCC) ) #1 SMP Thu Sep 29 10:24:25 EDT 2011

Can someone help?

Regards,
Adam
[R] Determining sample size from power function
Hello,

I am trying to get the power function to report the sample size rather than the power. My goal is to input a variety of values for theta and have the power function report the corresponding sample sizes. I haven't had much luck trying to create my own function, something along the lines of:

    f <- function (x) {
      power(N=z,a=6,f=6,pi=.5,alpha=.1,t0=10,theta=(1/x),CIFev0=.476,CIFcr0=0))=0.8
      read(z)
    }

In the above example, I am trying to fix the power at 0.80 and solve for z, which is the sample size. I would like x to be a random sample of thetas, for instance:

    x=rnorm(30,.5,.2)

and then receive the 30 corresponding sample sizes.

Thank you!
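One general way to "invert" a power function is to hand the difference between power and target to uniroot() and solve for the sample size. The sketch below illustrates that idea only: power_z() is a stand-in (a normal-approximation two-sided z-test), not the power() function from the post, and the names n_for_power and thetas, the search interval, and the tolerance are all assumptions.

```r
# Stand-in power function (NOT the power() from the post): approximate
# power of a two-sided one-sample z-test for standardised effect theta
power_z <- function(n, theta, alpha = 0.1) {
  z_crit <- qnorm(1 - alpha / 2)
  pnorm(sqrt(n) * abs(theta) - z_crit) +
    pnorm(-sqrt(n) * abs(theta) - z_crit)
}

# Solve power_z(n) = target for n, then round up to a whole sample size
n_for_power <- function(theta, target = 0.8, alpha = 0.1) {
  root <- uniroot(function(n) power_z(n, theta, alpha) - target,
                  interval = c(1, 1e6), tol = 1e-8)
  ceiling(root$root)
}

# One sample size per theta
thetas   <- c(0.2, 0.35, 0.5)
n_needed <- sapply(thetas, n_for_power)
n_needed
```

uniroot() needs the power to be below the target at one end of the interval and above it at the other, so thetas very close to zero (or very large) would make the call fail; randomly drawn thetas may need to be filtered or the interval widened.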
Re: [R] multiple versions of function
mea culpa.

    f <- function(...) {
        ## parse out the arguments and then do something with them
    }

    ## all of these should result in the same actions
    f(2,3)   ## interprets a to be first and b to be second
    f(a=2,b=3)
    f(b=3,a=2)
    f(data.frame(a=2,b=3))
    f(data.frame(b=3,a=1))

On Tue, Jan 8, 2013 at 8:00 AM, David Winsemius <dwinsem...@comcast.net> wrote:

On Jan 7, 2013, at 6:58 PM, ivo welch wrote:

hi david---can you give just a little more of an example? the function should work with call by order, call by name, and data frame whose columns are the names. /iaw

It is I who should be expecting you to provide an example.

-- David.
[R] Interpreting Rasch models
I'm testing an education assessment (evaluating quantitative skills in biology students) for reliability. I used Cronbach's alpha, but got a really low alpha, which I think is likely due to my small number of questions (12) and the fact that the questions are of varying difficulty. After a lot of reading, it looks like Rasch models are probably a more appropriate tool for my question, so I've figured out how to do that analysis in R using the eRm package. But now I'm having a really hard time interpreting the output and finding resources to help. Here are the results of my RM estimation:

    Conditional log-likelihood: -394.9651
    Number of iterations: 25
    Number of parameters: 11

Can anyone help me interpret? I can post further output if you can help, but didn't want to send a giant email of results!

Thanks
[R] Parameter estimates for each observation (ordered choice)
I have several demographic variables with which I want to explain the ordered choice of individuals within a survey in an ordered choice (probit or logit, this is not important) framework. Standard ordered choice estimations of course just give me aggregate/average parameter estimates. For my task it would however be useful to estimate or extract hypothetical individual-level parameter estimates (betas) for a certain independent variable and each individual in the survey.

I have experimented with the hierarchical Bayes algorithms provided by the bayesm and ChoiceModelR packages. Correct me if I am wrong, but I think these techniques also require that individuals appear several times within a survey (thus it should be a panel) and are confronted with different choice situations, so that one can estimate the influence of certain attributes on the individuals' choices. In any case, ChoiceModelR and bayesm just provide multinomial choice models, while I am looking for an ordinal probit, and my data doesn't have any panel structure.

I also experimented with Bayesian inference, for example with the MCMCoprobit function in the MCMCpack package, but this function just simulates betas. As far as I know, I can't attribute them to particular individuals in the survey, which is what I would need.

I would be very glad if somebody could give me a hint; sometimes even a keyword is enough to google the correct solution! Thanks and best regards, AK

P.S.: The last thing I tried was Compound Hierarchical Ordered Probit (CHOPIT), because with that I am able to calculate individual cut-off points, which may allow me to calculate individual betas. But I haven't tried it extensively yet.
Re: [R] Determining sample size from power function
Dear Kaveh, Take a look at http://www.statmethods.net/stats/power.html HTH, Jorge.-

On Thu, Jan 10, 2013 at 3:21 PM, Kaveh Zakeri wrote:

Hello, I am trying to get the power function to report the sample size rather than the power. My goal is to input a variety of values for theta and have the power function report the corresponding sample sizes. I haven't had much luck trying to create my own function, something along the lines of:

    f <- function (x) {
      power(N=z,a=6,f=6,pi=.5,alpha=.1,t0=10,theta=(1/x),CIFev0=.476,CIFcr0=0)=0.8
      read(z)
    }

In the above example, I am trying to fix the power at 0.80 and solve for z, which is the sample size. I would like x to be a random sample of thetas, for instance x=rnorm(30,.5,.2), and then receive the 30 corresponding sample sizes. Thank you!
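When a power function only returns power for a given sample size, a common approach is to treat "power(N) - target" as a function of N and find its root numerically, then repeat that for each effect size. The sketch below shows the idea with `uniroot()`. It is an assumption-laden illustration: `power.t.test()` from the stats package stands in for the poster's `power()` function (whose package is not named in the thread), and `power_at`, `n_for_power`, and the grid of effect sizes are hypothetical.

```r
# Stand-in power function: two-sample t test from the stats package,
# NOT the competing-risks power() function from the original question.
power_at <- function(n, delta) {
  power.t.test(n = n, delta = delta, sd = 1, sig.level = 0.1)$power
}

# Sample size per group (rounded up) at which power reaches the target.
n_for_power <- function(delta, target = 0.8) {
  root <- uniroot(function(n) power_at(n, delta) - target,
                  interval = c(2, 1000), tol = 1e-8)$root
  ceiling(root)
}

# Hypothetical grid of effect sizes; any numeric vector could be used,
# as long as the search interval brackets the solution for every value.
deltas <- seq(0.2, 1, by = 0.2)
sizes <- sapply(deltas, n_for_power)
```

The same pattern should carry over to another power function by swapping the body of `power_at`, provided power increases with the sample size and the interval passed to `uniroot()` contains the answer. Very small effect sizes (which `rnorm(30, .5, .2)` can produce) need a wider interval or may have no practical solution.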