Re: [R] Loop question
Dear Sebastian,

The following will create the names:

    paste("sb", 1:5, sep = "")
    paste("sw", 1:5, sep = "")
    paste("Lw", 1:5, sep = "")
    paste("Lb", 1:5, sep = "")

Then you can easily combine and/or order them in R.

Hope this helps.

Ozgur

-
Ozgur ASAR
Research Assistant
Middle East Technical University, Department of Statistics
06531, Ankara, Turkey
Ph: 90-312-2105309
http://www.stat.metu.edu.tr/people/assistants/ozgur/
--
View this message in context: http://r.789695.n4.nabble.com/Loop-question-tp4631896p4631900.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] gsub/strsplit with multiple patterns/splits
There are many resources for learning regular expressions (e.g. http://gnosis.cx/publish/programming/regular_expressions.html). Once you understand the basics you will probably be able to refer to the ?regex help page for specific tools.

After you have waded through a tutorial, the following explanation should make more sense. The braces are extended regex syntax for repeating a pattern between some minimum and some maximum number of times. The pattern being repeated immediately precedes the repetition specification. In the first case of {0,1} the pattern being repeated is the comma, and in the second case it is any one of the characters in the square brackets (only a period in this case). Outside a bracketed set of characters, the period is a special pattern that matches any character. A common shorthand for "zero or one of something" is the ? symbol.

Also, please learn to provide quoting context for the majority of us who do not use Nabble.

---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

mdvaan mathijsdev...@gmail.com wrote:

Thanks! That works like a charm, but I am not sure if I fully understand the syntax. I looked at the gsub page but still couldn't figure it out. What does the pattern part (",{0,1} Inc[.]{0,1}") do? What do the 0 and 1 within the curly brackets refer to? Also, what if, for example, I would want to remove the word "Energy"?

Thank you very much in advance.

Math
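To make the quantifier explanation concrete, here is a minimal sketch. The company names are made up for illustration and are not the poster's data; the pattern is the one quoted in the question. It is only meant to show that {0,1} and ? behave the same way.

```r
# Hypothetical company names, invented for this illustration.
firms <- c("Acme, Inc.", "Acme Inc", "Acme Inc.", "Acme Energy")

# ",{0,1}" makes the comma optional; "[.]{0,1}" makes a literal period optional.
braces <- gsub(",{0,1} Inc[.]{0,1}", "", firms)

# "?" is shorthand for {0,1}, so this pattern is equivalent.
shorthand <- gsub(",? Inc[.]?", "", firms)

print(braces)
print(identical(braces, shorthand))
```

Strings that do not contain " Inc" are left untouched, because " Inc" itself is not optional in the pattern.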
Re: [R] Warning message: numerical expression has 1000 elements: only the first used
Hi,

Your mistake seems to be in

    sum(v[1:x])

You create x as a vector but you treat it as a single number. v[1:x] expects x to be a single number and uses only its first element, which is 1.

If I understand your query correctly, the following might handle your problem:

    sum.vec <- NULL
    for (x in 1:1000) {
      t <- rbinom(1000, 1, 0.5)
      v <- replace(t, t == 0, -1)
      sum.vec <- c(sum.vec, sum(v[1:x]))
    }

Best,
Ozgur

-
Ozgur ASAR
Research Assistant
Middle East Technical University, Department of Statistics
06531, Ankara, Turkey
Ph: 90-312-2105309
http://www.stat.metu.edu.tr/people/assistants/ozgur/
Re: [R] Loop question
Note that in R >= 2.15 you can also use paste0 for this operation more efficiently.

Michael

On Thu, May 31, 2012 at 1:58 AM, Özgür Asar oa...@metu.edu.tr wrote:

Dear Sebastian,

The following will create the names:

    paste("sb", 1:5, sep = "")
    paste("sw", 1:5, sep = "")
    paste("Lw", 1:5, sep = "")
    paste("Lb", 1:5, sep = "")

Then you can easily combine and/or order them in R.

Hope this helps.

Ozgur
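As a small sketch of the paste0 suggestion (the variable names below are my own, not from the thread), all four sets of names can be built in one call and checked against the paste(..., sep = "") form:

```r
prefixes <- c("sb", "sw", "Lw", "Lb")

# rep(..., each = 5) repeats every prefix five times; 1:5 is recycled to match.
with_paste  <- paste(rep(prefixes, each = 5), 1:5, sep = "")
with_paste0 <- paste0(rep(prefixes, each = 5), 1:5)

print(with_paste0)
print(identical(with_paste, with_paste0))
```

paste0 is simply paste with sep = "" fixed, so the two results are the same character vector.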
[R] Problem in a programming
Hello Everyone,

I am an MS student in the Department of Statistics, Biostatistics & Informatics at the University of Dhaka, Bangladesh. I am currently working on a thesis on D-optimal designs for regression models in biostatistics. In my thesis work I am having a problem with a particular program, in which I need to replace each element of a vector, one by one, with every element of another sequence (which is longer than the vector), then calculate a quantity from each of the resulting vectors and compare them. I have written the following program, but I think it is not running correctly. Can anyone please correct it?

    dopt <- function(beta)
    {
      set.covar <- seq(-1, 1, 0.001)
      n <- length(set.covar)
      xi <- sample(set.covar, 5)
      ## start of information loop
      infor <- function(xi)
      {
        mu <- exp(t(xi)*beta)/(1+exp(t(xi)*beta))
        mui <- mu%*%t(1-mu)
        xmui <- mui%*%t(xi)
        info <- xmui%*%xi
      }
      ## end of information loop
      info <- infor(xi)
      ## start of loop to replace every element of x (i loop)
      for(i in 1:5)
      {
        ## start of a loop to replace each element of xi with every element of set.covar (j loop)
        for(j in 1:n)
        {
          x <- xi
          xi <- replace(x, i, set.covar[j])
          infoi <- infor(xi)
          info <- ifelse(infoi > info, c(infoi, return(xi)), c(info, return(x)))
        }
        ## end of j loop
      }
      ## end of i loop
    }
    ## end of mainloop (dopt)

Please help me with the loop.

With best regards,
Farzana.
Re: [R] Sorting a data set
Hi Tony,

Try this:

    dataT <- data.frame(Var1=rep(c("Sole","Lack","ABD","Zad"),rep(5,4)),
                        Var2=rnorm(20,0.5), Var3=runif(20,0.4))
    dataT1 <- dataT[with(dataT, order(Var1,Var2,Var3)),]
    dataT1
       Var1        Var2      Var3
    12  ABD -0.19842353 0.4333720
    13  ABD  0.14050814 0.9194297
    11  ABD  1.07544531 0.4539302
    14  ABD  1.17039127 0.7840392
    15  ABD  1.23533897 0.5105670
    6  Lack -0.14460512 0.7106342
    10 Lack  0.36935316 0.6118821
    9  Lack  0.62868056 0.5915753
    ---

A.K.

- Original Message -
From: tony.anderson tony.ander...@noaa.gov
To: r-help@r-project.org
Sent: Wednesday, May 30, 2012 6:07 PM
Subject: [R] Sorting a data set

I am a novice user of R and am stumbling on how to order a dataset produced during my session. I have a 1863 row X 14 column dataset that I want to put out to a file. I want the output sorted by the first column and then by the second column, both in ascending order. The first column is character and the second is numeric (I hope). I used an as.numeric function to assign that variable. Is there a reason R would not accept 0 or 00 as a numeric value?

I have tried using the order function but the examples I have seen don't seem to translate for me. I tried something like this, assuming my dataset is called data:

    datanew <- data[order(var1, var2),]
    print(datanew)

This generates an "incorrect number of dimensions" error in the order function. I also tried listing all the variables in the parentheses.

Your help is appreciated.
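A minimal sketch of the same idea on a tiny made-up data frame (the column names `site` and `depth` are hypothetical). It also illustrates the side question: as.numeric accepts the strings "0" and "00" and turns both into 0. The key point is that order() is given the columns through the data frame, either as dat$site or inside with(dat, ...).

```r
# Hypothetical data; depth arrives as character, as it might from a text file.
dat <- data.frame(
  site  = c("Sole", "ABD", "Lack", "ABD", "Sole"),
  depth = c("10", "00", "5", "2", "0"),
  stringsAsFactors = FALSE
)

dat$depth <- as.numeric(dat$depth)   # "00" and "0" both become 0

# Sort by site, then by depth within site, both ascending.
sorted <- dat[order(dat$site, dat$depth), ]
print(sorted)
```

Writing order(site, depth) without the dat$ prefix or with() makes R look for free-standing objects called site and depth in the workspace rather than for columns of dat.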
Re: [R] gsub/strsplit with multiple patterns/splits
{0,1} means zero or one match.

Want to remove the word "Energy"?

    gsub("( Energy){0,1},{0,1} Inc[.]{0,1}", "", DF)

On Thu, May 31, 2012 at 11:45 AM, mdvaan mathijsdev...@gmail.com wrote:

Thanks! That works like a charm, but I am not sure if I fully understand the syntax. I looked at the gsub page but still couldn't figure it out. What does the pattern part (",{0,1} Inc[.]{0,1}") do? What do the 0 and 1 within the curly brackets refer to? Also, what if, for example, I would want to remove the word "Energy"?

Thank you very much in advance.

Math
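Here is a short sketch of what that pattern does on made-up names (not the poster's DF). One thing it shows is that the optional group only removes " Energy" when it is followed by " Inc"; a separate call is needed to drop the word everywhere.

```r
# Hypothetical names, invented for this illustration.
firms <- c("Acme Energy, Inc.", "Acme Energy Inc", "Acme Inc.", "Acme Energy")

# "( Energy){0,1}" is an optional group placed before the optional comma and " Inc".
cleaned <- gsub("( Energy){0,1},{0,1} Inc[.]{0,1}", "", firms)
print(cleaned)

# To remove " Energy" regardless of what follows, strip it in its own step.
no_energy <- gsub(" Energy", "", cleaned)
print(no_energy)
```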
Re: [R] Warning message: numerical expression has 1000 elements: only the first used
Ozgur -- No, this is not what the OP seems to be asking for (and it's bad code anyways -- I've mentioned the importance of pre-allocation to you before).

OP, if I understand you, you are looking for a cumsum() operation, something like:

    t <- rbinom(1000, 1, 0.5)
    t[t==0] <- (-1)
    cumsum(t)

Alternatively, noting that rbinom(, 1, ) gives only values of zero and one, it's even faster to do:

    cumsum(2*rbinom(1000,1,0.5)-1)

e.g.,

    set.seed(1)
    t <- rbinom(1000, 1, 0.5)
    t[t==0] <- (-1)
    a <- cumsum(t)

    set.seed(1)
    b <- cumsum(2*rbinom(1000,1,0.5)-1)

    identical(a,b) # TRUE

Best,
Michael

On Thu, May 31, 2012 at 2:08 AM, Özgür Asar oa...@metu.edu.tr wrote:

Hi,

Your mistake seems to be in

    sum(v[1:x])

You create x as a vector but you treat it as a single number. v[1:x] expects x to be a single number and uses only its first element, which is 1.

If I understand your query correctly, the following might handle your problem:

    sum.vec <- NULL
    for (x in 1:1000) {
      t <- rbinom(1000, 1, 0.5)
      v <- replace(t, t == 0, -1)
      sum.vec <- c(sum.vec, sum(v[1:x]))
    }

Best,
Ozgur
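To illustrate the two points made in this reply (cumsum and pre-allocation), here is a small sketch with names of my own choosing. It builds the same running sum twice: once with cumsum() and once with an explicit loop that fills a pre-allocated vector instead of growing one with c().

```r
set.seed(42)
steps <- 2 * rbinom(1000, 1, 0.5) - 1   # each step is +1 or -1

# Vectorised running sum.
walk_cumsum <- cumsum(steps)

# Loop version with pre-allocation: the result vector is created once
# at full length and filled in place.
walk_loop <- numeric(length(steps))
running <- 0
for (i in seq_along(steps)) {
  running <- running + steps[i]
  walk_loop[i] <- running
}

print(all.equal(walk_cumsum, walk_loop))
```

Growing a vector with c() inside a loop copies it on every iteration, which is why pre-allocation (or avoiding the loop altogether) is usually preferred.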
Re: [R] R learning
Learn by solving your own problem. Break down your [real or toy] problem into solvable subtasks. Find out how to solve these subtasks using R. Quick-R is a good reference for task-specific information: http://www.statmethods.net/

On Thu, May 31, 2012 at 5:57 AM, arun.gurubaramurugeshan arun.gurubaramuruges...@autozone.com wrote:

If you haven't already looked at "An Introduction to R", please follow this link: http://cran.r-project.org/doc/manuals/R-intro.pdf

There are several books which will teach you R; please look at online retailers like Amazon, Ebay etc. Searching online for a specific task will also help you to gather knowledge. What I mean is: search online for "summarize a data table in R". It will produce a lot of results, and you will find different people describing different ways to get the task done, which will help you to learn more R coding.

Hope this helps.

Thanks
Arun
Re: [R] mgcv: How to calculate a confidence interval of a ratio
Given that this is just s(x2) - s(x1), you can get the CI using type="lpmatrix" with predict.gam. Here's an example...

    library(mgcv)

    ## simulate some data
    dat <- gamSim(1,n=400,dist="poisson",scale=.25)

    ## fit log-linear model...
    b <- gam(y~s(x0)+s(x1)+s(x2)+s(x3),family=poisson,
             data=dat,method="REML")

    ## data at which predictions to be compared...
    pd <- data.frame(x0=c(.2,.3),x1=c(.5,.5),x2=c(.5,.5),
                     x3=c(.5,.5))

    ## log(E(y_1)/E(y_2)) = s(x_1) - s(x_2)
    Xp <- predict(b,newdata=pd,type="lpmatrix")

    ## ... Xp%*%coef(b) gives log(E(y_1)) and log(E(y_2)),
    ## so the required difference is computed as...
    diff <- (Xp[1,]-Xp[2,])
    dly <- t(diff)%*%coef(b) ## required log ratio (diff of logs)
    se.dly <- sqrt(t(diff)%*%vcov(b)%*%diff) ## corresponding s.e.
    dly + c(-2,2)*se.dly ## 95%CI

On 23/05/12 15:37, Gevin Brown wrote:

Dear R-Users,

Dr. Wood replied to a similar topic before, where confidence intervals were for a ratio of two treatments (https://stat.ethz.ch/pipermail/r-help/2011-June/282190.html). But my question is more complicated than that one. In my case, log(E(y)) = s(x), where y is a smooth function of x. What I want is the confidence interval of a ratio, log[E(y2)/E(y1)], given two fixed x values of interest. This is more complicated than two treatments, because those can be modeled as a binary variable, as Dr. Wood pointed out. I am wondering if mgcv has some embedded functions to calculate this quickly.

Best regards,
Gevin
2012-05-23

--
Simon Wood, Mathematical Science, University of Bath BA2 7AY UK
+44 (0)1225 386603
http://people.bath.ac.uk/sw283
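The same linear-algebra step can be sketched without mgcv. The example below is an assumption-laden stand-in, not the method from the reply: it uses a plain Poisson glm() on simulated data, with model.matrix() playing the role of the lpmatrix, and then exponentiates the interval to get it on the ratio scale. The data, the model and the variable names are all made up for illustration.

```r
set.seed(1)
x <- runif(200)
y <- rpois(200, lambda = exp(0.5 + 1.2 * x))   # simulated log-linear data
fit <- glm(y ~ x, family = poisson)

# Two covariate values to compare.
nd <- data.frame(x = c(0.2, 0.3))

# Rows of Xp map coefficients to the linear predictor at each point.
Xp <- model.matrix(~ x, data = nd)
d  <- Xp[1, ] - Xp[2, ]

log_ratio <- sum(d * coef(fit))                     # log(E(y_1) / E(y_2))
se        <- sqrt(drop(t(d) %*% vcov(fit) %*% d))   # its standard error

ci_log   <- log_ratio + c(-2, 2) * se   # approximate 95% CI on the log scale
ci_ratio <- exp(ci_log)                 # same interval on the ratio scale

print(ci_log)
print(ci_ratio)
```

Because the difference is a linear combination of the coefficients, its variance is d' V d with V the coefficient covariance matrix, which is exactly what the t(diff) %*% vcov(b) %*% diff line in the reply computes.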
[R] ignore NA column in a DF (for calculation) without removing them
Dear users,

At the moment I have a function which looks for the best correlation for each file in my correlation matrix. I'm working on a list.files. Here's the function:

    get.max.cor <- function(station, mat){
      mat[row(mat) == col(mat)] <- -Inf
      which( mat[station, ] == max(mat[station, ], na.rm=TRUE) )
    }

If I have a correlation matrix like this (no NA values):

    cor1 <- read.table(text="
              ST208     ST209     ST210     ST211     ST212
    ST208 1.000     0.8646358 0.8104837 0.8899451 0.7486417
    ST209 0.8646358 1.000     0.9335584 0.8392696 0.8676857
    ST210 0.8104837 0.9335584 1.000     0.8304132 0.9141465
    ST211 0.8899451 0.8392696 0.8304132 1.000     0.8064669
    ST212 0.7486417 0.8676857 0.9141465 0.8064669 1.000
    ", header=TRUE)

it works perfectly. If I have a correlation matrix with some NAs (but not only NAs) like this:

    cor2 <- read.table(text="
              ST208     ST209     ST210     ST211     ST212
    ST208 1.000     NA        0.9666491 0.9573701 0.9233598
    ST209 NA        1.000     0.9744054 0.9577192 0.9346706
    ST210 0.9666491 0.9744054 1.000     0.9460145 0.9582683
    ST211 0.9573701 0.9577192 0.9460145 1.000     NA
    ST212 0.9233598 0.9346706 0.9582683 NA        1.000
    ", header=TRUE)

it still works thanks to na.rm=TRUE. But when I have one file with no data, and so only NAs in its column, like this:

    cor3 <- read.table(text="
              ST208     ST209     ST210     ST211     ST212
    ST208 1.000     NA        0.8104837 0.8899451 0.7486417
    ST209 NA        NA        NA        NA        NA
    ST210 0.8104837 NA        1.000     0.8304132 0.9141465
    ST211 0.8899451 NA        0.8304132 1.000     0.8064669
    ST212 0.7486417 NA        0.9141465 0.8064669 1.000
    ", header=TRUE)

it doesn't work, of course, because there's no non-NA value and so no max correlation for this file. That's why I get this error: 0 (non-na) cases.

I tried to remove the NA columns, but as I'm working on a list.files, the number of files in the list and in the matrix would then no longer be the same. I searched the web but only found topics about removing NA columns. In my case, I would like to ignore these NA columns without removing them.

I would like to say to R: when you are looking for the highest correlation for each file in the correlation matrix, if you see a file with no correlation coefficient (a column of only NAs), don't do anything with it, keep it as it is and go to the next file (next column or row). I also tried adding else {NA} or else {NULL} to avoid this problem, but it still doesn't work.

Does somebody have an idea how to solve this problem? Thank you very much.

Best regards,
Geoffrey
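One possible way to get the behaviour described above is to test for an all-NA row before calling max(). The sketch below is only a suggestion under my own assumptions: the function name get_max_cor_safe is hypothetical, the matrix is a cut-down version of cor3, and the diagonal is set to NA rather than -Inf so that a station never matches itself.

```r
# Hypothetical NA-aware variant of get.max.cor (name and details are assumptions).
get_max_cor_safe <- function(station, mat) {
  mat <- as.matrix(mat)
  diag(mat) <- NA                          # never pick the station itself
  r <- mat[station, ]
  if (all(is.na(r))) return(NA_integer_)   # station has no usable correlations
  which(r == max(r, na.rm = TRUE))         # which() ignores the remaining NAs
}

st <- c("ST208", "ST209", "ST210", "ST211")
m <- matrix(c(1.00,   NA, 0.81, 0.89,
                NA,   NA,   NA,   NA,
              0.81,   NA, 1.00, 0.83,
              0.89,   NA, 0.83, 1.00),
            nrow = 4, byrow = TRUE, dimnames = list(st, st))

# One result per station; the all-NA station is kept and simply yields NA.
best <- lapply(setNames(st, st), get_max_cor_safe, mat = m)
print(best)
```

The list keeps one entry per station, so its length still matches the number of files even when some stations have no data.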
[R] Error - Could not resolve host: search.twitter.com;
I am trying to use the twitteR package but I am getting this error on all functions: "Could not resolve host: search.twitter.com" or "Could not resolve host: api.twitter.com". I think the issue is that R is not able to connect to the site. What should I do? Please help.
[R] Automated essay scoring by R
Hi,

I am a doctoral student and I want to carry out a study of automated essay scoring systems. The authors of some papers mention that certain automated essay scoring experiments used packages from the R open source software. I am new to R and would like to know which R package can be used for this purpose and how to use it. I would be grateful for any guidance.

Thank you very much for your help.

Regards,
Andrew Cheung
[R] maptools: using to sets of information (within two shape files) for one plot
Dear all,

I am using a shape file containing the information regarding the borders of the PARISHES of Austria. I created a plot with different colours for different percentages of child care institutions.

Now I would like to add the information of the COUNTY boundaries to this plot, which I have got in another shape file. I would like these boundaries to be shown with bold lines so that they can be seen well.

I tried this by adding another plot to the existing plot, but the second plot was not in the same place as the first plot, and I couldn't change that by using the commands fig or mar.

Could anybody help me with this? Thank you very much in advance.

Marion
Re: [R] Probably a good use for apply
On 05/31/2012 10:50 AM, LCOG1 wrote:

Hi all,

I have a data frame test.. that I would like to convert into a list, test_ below, but am unsure how to do this efficiently. I can do it in a for loop, but my data set is huge and it takes forever. Wondering how I can do this more efficiently. So again, how do I go from test.. to test_ below?

    #Data frame
    test.. <- data.frame(Apples = c(1,3,0,0,1), Pears = c(0,0,1,0,2),
                         Beans = c(1,2,1,0,0))

    #list - my desired outcome
    test_ <- list("1" = c("Apples","Beans"),
                  "2" = c("Apples","Apples","Apples","Beans","Beans"),
                  "3" = c("Pears","Beans"),
                  "4" = c(NULL),
                  "5" = c("Apples","Pears","Pears"))

Hi Josh,

How about this?

    test..
      Apples Pears Beans
    1      1     0     1
    2      3     0     2
    3      0     1     1
    4      0     0     0
    5      1     2     0

    indices2names <- function(x, xnames) return(rep(xnames, x))
    apply(as.matrix(test..), 1, indices2names, names(test..))
    [[1]]
    [1] "Apples" "Beans"

    [[2]]
    [1] "Apples" "Apples" "Apples" "Beans"  "Beans"

    [[3]]
    [1] "Pears" "Beans"

    [[4]]
    character(0)

    [[5]]
    [1] "Apples" "Pears"  "Pears"

Jim
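A variant of the same rep() idea, sketched here with lapply() over row indices. This is an alternative I am suggesting, not part of Jim's answer; the reason to consider it is that apply() only returns a list when the rows produce results of different lengths, and simplifies to a matrix when every row happens to give the same number of names.

```r
fruit <- data.frame(Apples = c(1, 3, 0, 0, 1),
                    Pears  = c(0, 0, 1, 0, 2),
                    Beans  = c(1, 2, 1, 0, 0))

# Convert once to a matrix so each row lookup is cheap.
m <- as.matrix(fruit)

# For each row, repeat every column name by the count in that row.
by_row <- lapply(seq_len(nrow(m)), function(i) rep(colnames(m), times = m[i, ]))
print(by_row)
```

lapply() always returns a list with one element per row, and a row of all zeros yields character(0).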
Re: [R] reading file in zip archive
Hi Phil

That's it. Thanks. I will have a read of the docs now and see if I can figure out why leaving the 'r'ead instruction out works. Seems counter-intuitive!

Best

Iain

From: Phil Spector spec...@stat.berkeley.edu
To: Iain Gallagher iaingallag...@btopenworld.com
Cc: r-help r-help@r-project.org
Sent: Thursday, 31 May 2012, 0:06
Subject: Re: [R] reading file in zip archive

Iain -
Do you see the same behaviour if you use

    z <- unz(pathToZip, 'x.txt')

instead of

    z <- unz(pathToZip, 'x.txt', 'r')

- Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spec...@stat.berkeley.edu

On Wed, 30 May 2012, Iain Gallagher wrote:

Hi Phil

Thanks, but this still doesn't work. Here's a reproducible example (I was wrapping my head around these functions before).

    x <- as.data.frame(cbind(rep('a',5), rep('b',5)))
    y <- as.data.frame(cbind(rep('c',5), rep('d',5)))
    write.table(x, 'x.txt', sep='\t', quote=FALSE)
    write.table(y, 'y.txt', sep='\t', quote=FALSE)
    zip('test.zip', files = c('x.txt', 'y.txt'))
    pathToZip <- paste(getwd(), '/test.zip', sep='')
    z <- unz(pathToZip, 'x.txt', 'r')
    zT <- read.table(z, header=FALSE, sep='\t')
    Error in read.table(z, header = FALSE, sep = "\t") :
      seek not enabled for this connection

As I said in my previous email, readLines fails as well. Rather strange really. Anyway, as before, any advice would be appreciated.

Best

Iain

From: Phil Spector spec...@stat.berkeley.edu
To: Iain Gallagher iaingallag...@btopenworld.com
Cc: r-help r-help@r-project.org
Sent: Wednesday, 30 May 2012, 20:16
Subject: Re: [R] reading file in zip archive

Iain -
Once you specify the file to unzip in the call to unz, there's no need to repeat the filename in read.table. Try:

    z <- unz(pathToZip, 'goCats.txt', 'r')
    zT <- read.table(z, header=TRUE, sep='\t')

(Although I can't reproduce the exact error which you saw.)

- Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spec...@stat.berkeley.edu

On Wed, 30 May 2012, Iain Gallagher wrote:

Hi List

I have a series of zip archives each containing several files. One of these files is called goCats.txt and I would like to read it into R from the archive. It's a simple tab delimited text file.

    pathToZip <- '/home/iain/Documents/Work/Results/bovineMacRNAData/deAnalysis/afInfection/commonNorm/twoHrs/af2 hrs.zip'
    z <- unz(pathToZip, 'goCats.txt', 'r')
    zT <- read.table(z, 'goCats.txt', header=T, sep='\t')
    Error in read.table(z, "goCats.txt", header = T, sep = "\t") :
      seek not enabled for this connection

The same error arises with readLines. Can anyone advise?

Best

iain

    sessionInfo()
    R version 2.15.0 (2012-03-30)
    Platform: x86_64-pc-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_GB.utf8       LC_NUMERIC=C
     [3] LC_TIME=en_GB.utf8        LC_COLLATE=en_GB.utf8
     [5] LC_MONETARY=en_GB.utf8    LC_MESSAGES=en_GB.utf8
     [7] LC_PAPER=C                LC_NAME=C
     [9] LC_ADDRESS=C              LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

    loaded via a namespace (and not attached):
    [1] tools_2.15.0
Re: [R] maptools: using to sets of information (within two shape files) for one plot
Please let us know at least what package/s you are using to read the data from shapefiles, and the code you are using.

The two data sets may be using different projections, so use summary(obj1) and summary(obj2) to describe their projection metadata and data extents (well, at least if they are Spatial* objects from the sp package). With sp/rgdal, or other packages, these can be reprojected if you know the original coordinate systems, so that plotting them together makes sense.

There are good resources and vignettes for sp and related tools that explain this, and a dedicated mailing list for data like these (R-Sig-Geo).

Cheers, Mike.

On Thu, May 31, 2012 at 7:32 PM, Marion Wenty marion.we...@gmail.com wrote:

Dear all,

I am using a shape file containing the information regarding the borders of the PARISHES of Austria. I created a plot with different colours for different percentages of child care institutions.

Now I would like to add the information of the COUNTY boundaries to this plot, which I have got in another shape file. I would like these boundaries to be shown with bold lines so that they can be seen well.

I tried this by adding another plot to the existing plot, but the second plot was not in the same place as the first plot, and I couldn't change that by using the commands fig or mar.

Could anybody help me with this? Thank you very much in advance.

Marion

--
Michael Sumner
Hobart, Australia
e-mail: mdsum...@gmail.com
Re: [R] Unable to run smoother in qplot() or ggplot() - complains about knots
I am guessing his data is at fixed time points or something similar. As a cheat you can apply a jitter function to the x values and it should draw...

    g <- ggplot(data, aes(x = jitter(x), y = y,...

J
[R] Transform counts into presence/absence
Hi,

I am looking for a very easy way to transform a column in a dataframe from counts (e.g. c(1,0,21,2,0,0,234,2,0)) into a binary form to get presence/absence values, e.g. c(1,0,1,1,0,0,1,1,0). Is there a simple built-in function? Or do I have to do it with a replacement function using IF x > 0 THEN 1 etc.?

/johannes
Re: [R] Transform counts into presence/absence
Just use the logical operators.

    Counts <- c(1,0,21,2,0,0,234,2,0)
    Counts > 0
    1 * (Counts > 0)

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25, 1070 Anderlecht, Belgium
+ 32 2 525 02 51
+ 32 54 43 61 85
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data. ~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey

-Original message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of Johannes Radinger
Sent: Thursday 31 May 2012 13:13
To: R-help@r-project.org
Subject: [R] Transform counts into presence/absence

Hi,

I am looking for a very easy way to transform a column in a dataframe from counts (e.g. c(1,0,21,2,0,0,234,2,0)) into a binary form to get presence/absence values, e.g. c(1,0,1,1,0,0,1,1,0). Is there a simple built-in function? Or do I have to do it with a replacement function using IF x > 0 THEN 1 etc.?

/johannes
Re: [R] Transform counts into presence/absence
On 05/31/2012 09:13 PM, Johannes Radinger wrote:

Hi,

I am looking for a very easy way to transform a column in a dataframe from counts (e.g. c(1,0,21,2,0,0,234,2,0)) into a binary form to get presence/absence values, e.g. c(1,0,1,1,0,0,1,1,0). Is there a simple built-in function? Or do I have to do it with a replacement function using IF x > 0 THEN 1 etc.?

Hi Johannes,

Probably the easiest is:

    testvec <- c(1,0,21,2,0,0,234,2,0)
    newvec <- ifelse(testvec > 0, 1, 0)

Jim
Re: [R] Transform counts into presence/absence
Hello, Try

    x <- c(1,0,21,2,0,0,234,2,0)
    as.integer(x != 0)

Hope this helps, Rui Barradas

On 31-05-2012 12:13, Johannes Radinger wrote:
> Hi, I am looking for a very easy way to transform a column in a dataframe from counts (e.g. c(1,0,21,2,0,0,234,2,0)) into a binary form to get presence/absence values, e.g. c(1,0,1,1,0,0,1,1,0). Is there a simple built-in function? Or do I have to do it with a replacement function using IF x > 0 THEN 1 etc.?
> /johannes
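The suggestions in this thread can be put side by side. The following is only a minimal sketch: the data frame `dat` and its column `n` are illustrative names that do not appear in the original posts, and an `NA` is added to show that missing counts stay missing.

```r
# Hypothetical data frame; 'dat' and 'n' are illustrative names.
dat <- data.frame(n = c(1, 0, 21, 2, 0, 0, 234, 2, NA))

# Three equivalent ways to turn counts into presence/absence.
dat$pa_logical <- as.integer(dat$n > 0)    # compare, then coerce TRUE/FALSE to 1/0
dat$pa_times   <- 1 * (dat$n > 0)          # arithmetic coerces the logical vector
dat$pa_ifelse  <- ifelse(dat$n > 0, 1, 0)  # explicit, but more verbose

dat
```

Note that `as.integer(x != 0)` would also flag negative values as present, whereas `x > 0` would not; for true counts the two agree.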
[R] How to create a floating bar plot
I would like to create a 'bar' plot with the following look -0 + -+-- * o oo where the positive and negative parts of the bar should have a different color. Is there any function/package supporting this kind of plot? Thanks a lot, Roberto
Re: [R] Transform counts into presence/absence
-------- Original message --------
Date: Thu, 31 May 2012 11:16:32
From: ONKELINX, Thierry thierry.onkel...@inbo.be
To: Johannes Radinger jradin...@gmx.at, R-help@r-project.org
Subject: RE: [R] Transform counts into presence/absence

> Just use the logical operators.

Of course, that simple :). Thank you!

> Counts <- c(1,0,21,2,0,0,234,2,0)
> Counts > 0
> 1 * (Counts > 0)
>
> ir. Thierry Onkelinx
> [snip]
[R] anova of lme objects (model1, model2) gives different results depending on order of models
Hello,

I understand that it's convention, when comparing two models using the anova function anova(model1, model2), to put the more complicated (for want of a better word) model as the second model. However, I'm using lme in the nlme package and I've found that the order of the models actually gives opposite results. I'm not sure if this is supposed to be the case or if I have missed something important, and I can't find anything in the Pinheiro and Bates book or in ?anova, or in Google for that matter, which unfortunately only returns results about ANOVA, which isn't much help. I'm using the latest version of R and nlme, just checked both. Here is the code and output:

    PHQmodel1=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal, random=~1|Case, na.action=na.omit)
    PHQmodel2=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal, random=~1|Case, na.action=na.omit,
                  correlation=corAR1(form=~Date|Case))
    anova(PHQmodel1, PHQmodel2) # accept model 2

              Model df      AIC      BIC    logLik   Test  L.Ratio p-value
    PHQmodel1     1  8 48784.57 48840.43 -24384.28
    PHQmodel2     2  9 48284.68 48347.51 -24133.34 1 vs 2 501.8926  <.0001

    PHQmodel1=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal, random=~1|Case, na.action=na.omit,
                  correlation=corAR1(form=~Date|Case))
    PHQmodel2=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal, random=~1|Case, na.action=na.omit)
    anova(PHQmodel1, PHQmodel2) # accept model 2

              Model df      AIC      BIC    logLik   Test  L.Ratio p-value
    PHQmodel1     1  9 48284.68 48347.51 -24133.34
    PHQmodel2     2  8 48784.57 48840.43 -24384.28 1 vs 2 501.8926  <.0001

In both cases I am led to accept model 2 even though they are opposite models. Is it really just that you have to put them in the right order? It just seems that if there were, say, four models you wouldn't necessarily be able to determine the correct order.
Many thanks, Chris Beeley, Institute of Mental Health, UK

...session info follows

    > sessionInfo()
    R version 2.15.0 (2012-03-30)
    Platform: i386-pc-mingw32/i386 (32-bit)

    locale:
    [1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252
    [3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
    [5] LC_TIME=English_United Kingdom.1252

    attached base packages:
    [1] grid stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] gridExtra_0.9 RColorBrewer_1.0-5 car_2.0-12 nnet_7.3-1 MASS_7.3-17
    [6] xtable_1.7-0 psych_1.2.4 languageR_1.4 nlme_3.1-104 ggplot2_0.9.1

    loaded via a namespace (and not attached):
    [1] colorspace_1.1-1 dichromat_1.2-4 digest_0.5.2 labeling_0.1 lattice_0.20-6 memoise_0.1
    [7] munsell_0.3 plyr_1.7.1 proto_0.3-9.2 reshape2_1.2.1 scales_0.2.1 stringr_0.6
    [13] tools_2.15.0

    > packageDescription("nlme")
    Package: nlme
    Version: 3.1-104
    Date: 2012-05-21
    Priority: recommended
    Title: Linear and Nonlinear Mixed Effects Models
    Authors@R: c(person("Jose", "Pinheiro", comment = "S version"), person("Douglas", "Bates", comment = "up to 2007"), person("Saikat", "DebRoy", comment = "up to 2002"), person("Deepayan", "Sarkar", comment = "up to 2005"), person("R-core", email = "r-c...@r-project.org", role = c("aut", "cre")))
    Author: Jose Pinheiro (S version), Douglas Bates (up to 2007), Saikat DebRoy (up to 2002), Deepayan Sarkar (up to 2005), the R Core team.
    Maintainer: R-core r-c...@r-project.org
    Description: Fit and compare Gaussian linear and nonlinear mixed-effects models.
    Depends: graphics, stats, R (>= 2.13)
    Imports: lattice
    Suggests: Hmisc, MASS
    LazyLoad: yes
    LazyData: yes
    License: GPL (>= 2)
    BugReports: http://bugs.r-project.org
    Packaged: 2012-05-23 07:28:59 UTC; ripley
    Repository: CRAN
    Date/Publication: 2012-05-23 07:37:45
    Built: R 2.15.0; x86_64-pc-mingw32; 2012-05-29 12:36:01 UTC; windows
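Both tables in the post report the same likelihood-ratio statistic (501.8926) and the same p-value, which suggests that the test itself does not depend on the order of the arguments and that only the row order changes. The sketch below checks this on the `Ovary` data set that ships with nlme; the two models are a commonly used example for this data set, not the poster's models, and the object names are illustrative.

```r
library(nlme)

# Random-effects model without and with an AR(1) within-group correlation.
fm1 <- lme(follicles ~ sin(2 * pi * Time) + cos(2 * pi * Time),
           data = Ovary, random = pdDiag(~ sin(2 * pi * Time)))
fm2 <- update(fm1, correlation = corAR1())

a12 <- anova(fm1, fm2)  # simpler model first
a21 <- anova(fm2, fm1)  # richer model first

# The test statistic and p-value sit in the second row of each table.
c(L.Ratio = a12$L.Ratio[2], p = a12[["p-value"]][2])
c(L.Ratio = a21$L.Ratio[2], p = a21[["p-value"]][2])

# AIC (or logLik) shows which model fits better, whatever the order.
AIC(fm1, fm2)
```

Read this way, a small p-value says only that the two models differ; which one to prefer follows from the AIC, BIC and logLik columns rather than from its position in the call.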
Re: [R] strucchange Fstats() example
Thanks a lot for your answer Achim, this helped a lot. I have done a lot of reading, following your recommendations, and I think I have a better idea of what I should use.

My dataset contains binary data on survival of the calf depending on the mass of the mother. We know that the probability of survival of the calf should vary according to the mass of the mother: 3 groups of mass are expected, with lower survival of the calves for small and large females and best survival of the calf for intermediate-sized females. I want to identify at which masses those changes in survival occur. I think the code I need to use in order to test what I want is something of this type:

    gmass <- gefp(Success ~ Mass, family=binomial, fit=glm, order.by=~Mass)

First question: what is the difference between testing the gmass model shown above and the gmass2 model below?

    gmass2 <- gefp(Success ~ 1, family=binomial, fit=glm, order.by=~Mass)

Second question, which I think is related to the first one: does it make sense to plot the gmass model with aggregate=FALSE, knowing that we have a single parameter in the model (Mass), and that this parameter is also the one we use as the order.by= parameter?

    plot(gmass, functional=meanL2BB, aggregate=FALSE)

I think the whole point around questions 1 and 2 is that I don't understand the interpretation of the intercept in the gmass model.

Third question: how to choose the proper functional? I have seen that you discuss that in your CSDA (2003, 2006) papers and, in the 2006 paper, you say: in situations where there is a shift in the parameters and then a second shift back, it is advantageous to aggregate over time using the range and . Which means, if I understand correctly, that rangeBB would be suited to the kind of test I want to perform. However, since I want to determine the timing of the peaks, I need my functional to produce a time series plot, for example like meanL2BB does. Do you think I can use the meanL2BB functional in my case, or should I compute a home-made functional which would use the range of the efp but with comp applied first and time after (is this possible)?

Fourth question: is it OK to only make a visual estimation of the breakpoints from the peaks seen on the graph after plotting the efp, or should I use the breakpoints() function to properly date the breakpoints? I'm not sure this breakpoints() function can be applied to binary data.

Fifth question: I have noticed that the p-values I obtain after performing, for example, sctest(gmass, functional=meanL2BB) are a bit different depending on whether I introduce family=binomial as an argument in my gefp() call. Should I use this argument, or is it used by default when you specify fit=glm?

Last question: you said in your previous message that I could look at maxstat_test() from package coin for an interesting nonparametric alternative, but I think this package does not allow estimation of more than one breakpoint?

Thanks heaps if you can help again with those issues,
Best, Geraldine

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Achim Zeileis
Sent: 30 May 2012 08:23
To: R-help@r-project.org
Subject: Re: [R] strucchange Fstats() example

On Tue, 29 May 2012, Mabille, Geraldine wrote:
> [snip]
> In the second example, the authors state the presence of at least two breakpoints. When plotting the F-statistics using the following code, we see indeed two peaks in the F-statistics, which coincide with the dates given by the authors (ca. 1973 and 1983), but when trying to add those breakpoints to the time series, only one is taken into account.

The breakpoints() method for Fstats objects can just extract a single breakpoint. The reason is that maximizing the F statistics is equivalent to minimizing the residual sum of squares of a model with a single breakpoint. If you want to estimate more than a single breakpoint, you need to minimize the corresponding segmented sums of squares. This can be done with the formula method of breakpoints(), see ?breakpoints. More specifically: in your example with breakpoints(fs, breaks = 2), the breaks argument is simply ignored. The method just does not have a breaks argument and it goes through ...

> We see that even though the F-statistics seem to show the existence of 2 breakpoints, only one is detected by the breakpoints() function. Does anyone know how this is possible? I'm totally new to strucchange so it might well be something obvious I'm missing here!

Please have a closer look at the package's documentation and the corresponding papers. See citation("strucchange") for the most important references and the corresponding manual pages for more details. For the breakpoints issue you should probably start by reading the CSDA paper.

> OTHER SIDE QUESTION: can strucchange be used if the y variable is binary?

Testing for breakpoints can be done with the
[R] Optimizing variables represented in a matrix
Dear R-list members,

I have a matrix with non-numeric variables in it and I have to optimize the variables of the matrix in a formula using the optim routine of the stats package. I know a matrix can only hold numeric data, so I would like to know how to store non-numeric variables inside a matrix. For example, the 3x3 matrix is

    0.05V1+V2   0.31V1      0.05V1
    0.31V1      0.3V1+V2    0.5V1
    0.05V1      0.5V1       0.1V1+V2

This matrix is only an example; the real matrix that I want to use is a 15x15 matrix. Here I would like to optimize the values of V1 and V2 using a formula. Could you please help me with how to represent the matrix in R?

Thanks in advance!
B.Nataraj
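One way to approach the question above is not to store symbols in the matrix at all, but to write a function that rebuilds the numeric matrix from the current values of V1 and V2 on every call. The sketch below assumes the 3x3 example can be written as V1 * A + V2 * I; the objective function and the target values are hypothetical, chosen only so that the example has a known answer.

```r
# Fixed coefficients that multiply V1 in the 3 x 3 example.
A <- matrix(c(0.05, 0.31, 0.05,
              0.31, 0.30, 0.50,
              0.05, 0.50, 0.10), nrow = 3, byrow = TRUE)

# Rebuild the numeric matrix for given parameters: par[1] is V1, par[2] is V2.
build_matrix <- function(par) {
  par[1] * A + par[2] * diag(nrow(A))
}

# Hypothetical objective: squared distance to a target built from V1 = 2, V2 = 3.
target    <- build_matrix(c(2, 3))
objective <- function(par) sum((build_matrix(par) - target)^2)

fit <- optim(par = c(1, 1), fn = objective, method = "BFGS")
fit$par
```

The same pattern should carry over to a 15x15 matrix: only `A` and the real objective change, while `optim` still sees a plain function of a numeric parameter vector.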
[R] Using RDF/OWL with R?
Hello,

Is there a convenient way to import RDF/OWL data into R? I'm interested in importing BioPAX/SBPAX data into R to make them available to a wider audience. One exciting application would be to use pathway data to explain differential microarray measurements by identifying upstream nodes that are likely involved in causing the differences. This could also be used to validate pathways or to estimate concentrations or kinetic parameters.

If no convenient method to import RDF/OWL exists, I would be happy to take the lead in creating a light-weight R package based on rJava and OpenRDF Sesame Rio that could query RDF/OWL data and turn the results into data frames.

Thanks! Take care, Oliver

-- Oliver Ruebenacker, Bioinformatics Consultant (http://www.knowomics.com/wiki/Oliver_Ruebenacker)
Knowomics, The Bioinformatics Network (http://www.knowomics.com)
SBPAX: Turning Bio Knowledge into Math Models (http://www.sbpax.org)
[R] Higher log-likelihood in null vs. fitted model
Two related questions.

First, I am fitting a model with a single predictor, and then a null model with only the intercept. In theory, the fitted model should have a higher log-likelihood than the null model, but that does not happen. See the output below. My first question is, how can this happen?

    > m
    Call: glm(formula = school ~ sv_conform, family = binomial, data = dat, weights = weight)

    Coefficients:
    (Intercept)   sv_conform
        -2.5430       0.2122

    Degrees of Freedom: 1488 Total (i.e. Null); 1487 Residual
    Null Deviance:     786.1
    Residual Deviance: 781.9    AIC: 764.4

    > null
    Call: glm(formula = school ~ 1, family = binomial, data = dat, weights = weight)

    Coefficients:
    (Intercept)
         -2.532

    Degrees of Freedom: 1488 Total (i.e. Null); 1488 Residual
    Null Deviance:     786.1
    Residual Deviance: 786.1    AIC: 761.9

    > logLik(m); logLik(null)
    'log Lik.' -380.1908 (df=2)
    'log Lik.' -379.9327 (df=1)

My second question grows out of the first. I ran the same two models on the same data in Stata and got identical coefficients. However, the log-likelihoods were different from the ones I got in R, and followed my expectations; that is, the null model has a lower log-likelihood than the fitted model. See the Stata model comparison below. So my question is, why do identical models fit in R and Stata have different log-likelihoods?

    Model |  Obs   ll(null)   ll(model)  df       AIC       BIC
    ------+----------------------------------------------------
     mod1 | 1489   -393.064   -390.9304   2  785.8608  796.4725
     null | 1489   -393.064   -393.064    1  788.1279  793.4338

Thanks in advance for any input or references.
Andrew Miles
[R] inverse binomial in R
Hello! I'm having some trouble trying to replicate in R a Stata function:

    invbinomial(n,k,p)
    Domain n: 1 to 1e+17
    Domain k: 0 to n - 1
    Domain p: 0 to 1 (exclusive)
    Range: 0 to 1
    Description: returns the inverse of the cumulative binomial; i.e., it returns the probability of success on one trial such that the probability of observing floor(k) or fewer successes in floor(n) trials is p.

I've found some hints on the web like http://rwiki.sciviews.org/doku.php?id=guides:tutorials:regression:table

I tried to replicate, using qbinom, the result obtained in Stata,

    invbinomial(10,5, 0.5)
    .54830584

but with no success.

Thank you, Cheers, Anna

Anna Freni Sterrantino, Department of Statistics, University of Bologna, Italy, via Belle Arti 41, 40124 BO.
Re: [R] Higher log-likelihood in null vs. fitted model
On 12-05-31 8:53 AM, Andrew Miles wrote:
> Two related questions. First, I am fitting a model with a single predictor, and then a null model with only the intercept. In theory, the fitted model should have a higher log-likelihood than the null model, but that does not happen. See the output below. My first question is, how can this happen?

I suspect you'll need to give sample data before anyone can really help with this.

> [snip]
> My second question grows out of the first. I ran the same two models on the same data in Stata and got identical coefficients. However, the log-likelihoods were different from the ones I got in R, and followed my expectations; that is, the null model has a lower log-likelihood than the fitted model. See the Stata model comparison below. So my question is, why do identical models fit in R and Stata have different log-likelihoods?

That's easier: they use different base measures. The likelihood is only defined up to a multiplicative constant, so the log likelihoods can have an arbitrary constant added to them and still be valid. But I would have expected both models to use the same base measure, so the differences in log-likelihood should match.

Duncan Murdoch

> [snip]
> Thanks in advance for any input or references. Andrew Miles
[R] Remove columns from dataframe based on their statistics
Hi,

I have a dataframe and want to remove the columns that contain the same value throughout (i.e. the variation of that column is 0). Is there an easier way than to calculate the statistics and then remove the columns by hand?

    A <- runif(100)
    B <- rep(1,100)
    C <- rep(2.42,100)
    D <- runif(100)
    df <- data.frame(A,B,C,D)
    # I want to conditionally remove columns B and C as they show no variation

/Johannes
Re: [R] How to create a floating bar plot
Hi Roberto,

The R Graph Gallery is an excellent resource for this kind of question. You can browse the thumbnails until you find something that leads you in the right direction, like maybe: http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=136

Sarah

On Thu, May 31, 2012 at 7:18 AM, Roberto Brunelli roby.brune...@gmail.com wrote:
> I would like to create a 'bar' plot with the following look - 0 + -+-- * o oo where the positive and negative parts of the bar should have a different color. Is there any function/package supporting this kind of plot? Thanks a lot, Roberto

-- Sarah Goslee http://www.functionaldiversity.org
Re: [R] inverse binomial in R
On 12-05-31 9:10 AM, anna freni sterrantino wrote:
> Hello! I'm having some trouble trying to replicate in R a Stata function invbinomial(n,k,p) [snip] I tried to replicate using qbinom the results obtained in invbinomial(10,5, 0.5) .54830584 but with no success.

I don't think base R has a function like that, though some contributed package probably does. If you're writing it yourself you'd need to use uniroot or some other solver, e.g.

    invbinomial <- function(n, k, p) {
      # solve pbinom(k, n, x) = p for the success probability x
      uniroot(function(x) pbinom(k, n, x) - p, c(0, 1))$root
    }
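As an alternative to a numerical solver, the binomial CDF can be written in terms of the beta distribution, P(X <= k) = 1 - pbeta(x, k + 1, n - k), so the success probability can be obtained directly from `qbeta`. This is a sketch under that identity, not Stata's implementation; the function name simply mirrors the Stata one for convenience.

```r
# Success probability x such that P(X <= floor(k)) = p for X ~ Binomial(floor(n), x).
invbinomial <- function(n, k, p) {
  n <- floor(n)
  k <- floor(k)
  qbeta(1 - p, k + 1, n - k)
}

x <- invbinomial(10, 5, 0.5)
x                 # compare with the Stata value quoted in the thread, .54830584
pbinom(5, 10, x)  # should be close to p = 0.5
```

`qbinom` does not help here because it inverts the CDF with respect to the number of successes, whereas the Stata function inverts it with respect to the success probability.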
Re: [R] How to create a floating bar plot
Please look at the likert function in the HH package. ?likert has many examples. Here is the code for your query:

    ## install.packages("HH") ## if necessary
    library(HH)
    twobar <- data.frame(neg=c(8,0,9,2), pos=c(12,9,0,4))
    likert(twobar)

Rich

On Thu, May 31, 2012 at 9:30 AM, Sarah Goslee sarah.gos...@gmail.com wrote:
> Hi Roberto, The R Graph Gallery is an excellent resource for this kind of question.
> [snip]
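If installing an extra package is not an option, a similar picture can be drawn with base graphics by plotting the negative and positive parts as two overlaid horizontal bar plots. This is a rough sketch, not a replacement for `likert`; the function name `floating_bars` and the colours are illustrative choices, and the data are the ones from the message above.

```r
twobar <- data.frame(neg = c(8, 0, 9, 2), pos = c(12, 9, 0, 4))

# Draw negative parts to the left of zero and positive parts to the right.
floating_bars <- function(neg, pos, col_neg = "firebrick", col_pos = "steelblue") {
  lim  <- c(-max(neg), max(pos))
  mids <- barplot(-neg, horiz = TRUE, xlim = lim, col = col_neg, border = NA)
  barplot(pos, horiz = TRUE, add = TRUE, col = col_pos, border = NA, axes = FALSE)
  abline(v = 0)    # zero line separating the two colours
  invisible(mids)  # bar midpoints, useful for adding labels
}

pdf(NULL)  # null device so the sketch runs without opening a window; omit to see the plot
mids <- floating_bars(twobar$neg, twobar$pos)
invisible(dev.off())
```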
Re: [R] Remove columns from dataframe based on their statistics
On Thu, May 31, 2012 at 8:27 AM, Johannes Radinger jradin...@gmx.at wrote:
> Hi, I have a dataframe and want to remove the columns that contain the same value throughout (i.e. the variation of that column is 0). Is there an easier way than to calculate the statistics and then remove the columns by hand?
> [snip]

You could try something like:

    for (i in seq(ncol(df), 1))
      if (length(unique(df[, i])) == 1) {
        df[, i] <- NULL
      }

or for just numeric values:

    for (i in seq(ncol(df), 1))
      if (all(mean(df[, i]) == df[, i])) {
        df[, i] <- NULL
      }

HTH, James
Re: [R] Higher log-likelihood in null vs. fitted model
Hi Duncan:

I don't know if the following can help, but I checked the code, and logLik defines the log-likelihood as (p - glmobject$aic/2), where p is glmobject$rank. So the reason for the log-likelihood being lower is that, in the null model, it ends up being (1 - glmobject$aic/2) and in the other one it ends up being (2 - glmobject$aic/2):

    2 - 764.4/2 = -380.2
    1 - 761.9/2 = -379.95 (close enough for govt work)

So that's where the numbers are coming from, but it really depends on how AIC is defined. Likelihoods should not involve degrees of freedom (at least not in a way that makes the likelihood lower, as in the above example), so maybe backing the log-likelihood out of the AIC is the issue? (AIC = -2 * logLik + 2p, so logLik = p - AIC/2.) AIC is a function of the likelihood but, as far as I know, likelihood is not a function of the AIC.

Thanks for any insight.

On Thu, May 31, 2012 at 9:26 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote:
> On 12-05-31 8:53 AM, Andrew Miles wrote:
> [snip]
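The relation described in this message can be checked directly. The sketch below uses simulated data (all names and values are hypothetical) with non-integer weights, as in the original models. It confirms that `logLik()` for a binomial glm equals `rank - aic/2`, so any oddity in the stored AIC carries over to the reported log-likelihood. Whether that is what happened in the original post cannot be verified without the data.

```r
set.seed(1)

# Hypothetical data: binary outcome, one predictor, non-integer weights.
n     <- 200
dat   <- data.frame(x = rnorm(n), w = runif(n, 0.5, 1.5))
dat$y <- rbinom(n, 1, plogis(-1 + 0.5 * dat$x))

# Non-integer weights make binomial() warn about "non-integer #successes".
m    <- suppressWarnings(glm(y ~ x, family = binomial, data = dat, weights = w))
null <- suppressWarnings(glm(y ~ 1, family = binomial, data = dat, weights = w))

# logLik() for a glm is derived from the stored AIC: rank - aic / 2.
c(logLik = as.numeric(logLik(m)), from_aic = m$rank - m$aic / 2)

# The deviances are computed separately and behave as expected for nested models.
c(null = deviance(null), fitted = deviance(m))
```

With weights like these, comparing deviances (786.1 versus 781.9 in the original output) is probably a safer guide than comparing the reported log-likelihoods.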
Re: [R] odfWeave fails to load
Uwe,

Just a user's perspective: there are too many packages that work only on the maintainer's box, and it would benefit the community if there were stricter standards for allowing people to post a package. Open systems like Ubuntu have a ratings system that allows users to review packages, so the few bad apples are properly labelled and can be avoided by the community.

Kind regards, Stephen B

-----Original Message-----
From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
Sent: Wednesday, May 30, 2012 4:23 AM
To: Bond, Stephen
Cc: r-help@r-project.org
Subject: Re: [R] odfWeave fails to load

See http://cran.r-project.org/web/checks/check_results_odfWeave.html which indicates the package has some problems. Hence CRAN does not make binaries available. Please contact the maintainer.

Best, Uwe Ligges

On 29.05.2012 16:23, stephenb wrote:
> R version 2.15.0 (2012-03-30) Copyright (C) 2012 The R Foundation for Statistical Computing ISBN 3-900051-07-0 Platform: i386-pc-mingw32/i386 (32-bit)
>
>     package 'survey' successfully unpacked and MD5 sums checked
>     package 'odfWeave.survey' successfully unpacked and MD5 sums checked
>     > library(odfWeave.survey)
>     Loading required package: odfWeave
>     Error: package 'odfWeave' could not be loaded
>
> Any ideas, anybody? I had odfSweave on 2.12, but no such thing on 2.15, just odfWeave.survey, and it won't load.
Re: [R] Remove columns from dataframe based on their statistics
Hi Johannes, Try

    df[, !apply(df, 2, function(x) sd(x, na.rm = TRUE) < 1e-10)]

HTH, Jorge.-

On Thu, May 31, 2012 at 9:27 AM, Johannes Radinger wrote:
> Hi, I have a dataframe and want to remove the columns that contain the same value throughout (i.e. the variation of that column is 0). Is there an easier way than to calculate the statistics and then remove the columns by hand?
> [snip]
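The answers in this thread can also be written without a loop and without converting the data frame to a matrix, which matters once non-numeric columns are present. A minimal sketch, using the example data from the question (a seed is added only to make it reproducible; `varies` and `df_kept` are illustrative names):

```r
set.seed(42)
df <- data.frame(A = runif(100), B = rep(1, 100), C = rep(2.42, 100), D = runif(100))

# TRUE for columns with more than one distinct non-missing value.
varies <- vapply(df, function(x) length(unique(x[!is.na(x)])) > 1, logical(1))

# drop = FALSE keeps a data frame even if only one column survives.
df_kept <- df[, varies, drop = FALSE]
names(df_kept)
```

Counting distinct values works for character and factor columns as well, whereas the `sd`-based test is limited to numeric columns but allows a tolerance for nearly constant values.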
Re: [R] gsub/strsplit with multiple patterns/splits
Thank you very much. This definitely helps me out.

Math

Jeff Newmiller wrote:
> There are many resources for learning regular expressions (e.g. http://gnosis.cx/publish/programming/regular_expressions.html). Once you understand the basics you will probably be able to refer to the ?regex help page for specific tools.
>
> After you have waded through a tutorial, the following explanation should make more sense. The braces are extended regex syntax for a repetition of a pattern by some minimum to some maximum number of times. The pattern immediately precedes the repetition specification. In the first case of {0,1} the pattern being repeated is the comma, and in the second case it is any of the characters in the square brackets (a period in this case). The period is a special "match any character" pattern when not part of a set of characters. A common shorthand for "zero or one" of something is a ? symbol.
>
> Also, please learn to provide quoting context for the majority of us who do not use Nabble.
>
> Jeff Newmiller
> Research Engineer (Solar/Batteries/Software/Embedded Controllers)
> Sent from my phone. Please excuse my brevity.
>
> mdvaan wrote:
>> Thanks! That works like a charm, but I am not sure if I fully understand the syntax. I looked at the gsub page but still couldn't figure it out. What does the pattern part (,{0,1} Inc[.]{0,1}) do? What do the 0 and 1 within the curly brackets refer to? Also, what if, for example, I would want to remove the word Energy? Thank you very much in advance.
>>
>> Math
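The repetition syntax explained in the quoted reply can be tried out on a few strings. The company names below are invented for illustration, and the last pattern is just one possible way to also drop the word Energy, not necessarily what was used earlier in the thread.

```r
# Hypothetical company names, made up for illustration.
firms <- c("Acme Energy, Inc.", "Borealis Inc", "Cobalt Energy Inc.", "Delta Corp")

# ",{0,1}" is an optional comma; "[.]{0,1}" is an optional literal period.
pattern_braces <- ",{0,1} Inc[.]{0,1}"
# The same pattern written with the "?" shorthand for "zero or one".
pattern_short <- ",? Inc[.]?"

cleaned       <- gsub(pattern_braces, "", firms)
cleaned_short <- gsub(pattern_short, "", firms)

# "|" separates alternatives, so this also removes " Energy".
no_energy <- gsub(",? Inc[.]?| Energy", "", firms)

print(cleaned)
print(no_energy)
```

Putting the period inside square brackets, as in [.], makes it a literal period rather than the "match any character" wildcard.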
[R] bigglm binomial negative fitted value
Hi there,

Since glm cannot handle factors very well, I tried to use bigglm like this:

    logit_model <- bigglm(responser ~ var1 + var2 + var3, data, chunksize = 1000, family = binomial(), weights = ~trial, sandwich = FALSE)
    fitted <- predict(logit_model, data)

Only var2 is a factor; var1 and var3 are numeric. I expected fitted to be a vector of values that fall in (0, 1). However, I get something like this:

    str(fitted)
     num [1:260617, 1] -0.0564 -0.0564 -0.1817 -0.1842 -0.1852 ...
     - attr(*, "dimnames")=List of 2
      ..$ : chr [1:260617] "1" "2" "3" "4" ...
      ..$ : NULL

Can anyone help with this case? Thank you in advance.

Best,
--Yue
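Negative "fitted" values like these are what a logistic model produces on the link (log-odds) scale. The sketch below uses base glm() on simulated data, not bigglm, to show the difference between the two scales. As far as I know, the predict method for bigglm objects also accepts a type argument, but that is an assumption worth checking in ?predict.bigglm; applying plogis() to the link-scale values should work either way.

```r
set.seed(1)
n <- 200
# Simulated data: one numeric predictor, one factor, a binary response.
dat <- data.frame(x = rnorm(n),
                  g = factor(sample(c("a", "b", "c"), n, replace = TRUE)))
dat$y <- rbinom(n, 1, plogis(-0.5 + 0.8 * dat$x))

fit <- glm(y ~ x + g, data = dat, family = binomial())

eta         <- predict(fit, newdata = dat)                    # default: link scale (log-odds)
p_from_link <- plogis(eta)                                    # inverse logit, values in (0, 1)
p_response  <- predict(fit, newdata = dat, type = "response") # probabilities directly

range(eta)
range(p_response)
```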
Re: [R] Automated essay scoring by R
Look at the CRAN Task View: Natural Language Processing

http://cran.r-project.org/web/views/NaturalLanguageProcessing.html

You may be talking about package lsa, which is described in the Task View along with a link to an article, "Investigating Unstructured Texts with Latent Semantic Analysis". The documentation for package lsa is located at

http://cran.r-project.org/web/packages/lsa/lsa.pdf

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Andrew Cheung
Sent: Thursday, May 31, 2012 3:40 AM
To: r-help@r-project.org; Andrew Cheung
Subject: [R] Automated essay scoring by R

Hi,

I am a doctoral student and I want to study automated essay scoring systems. The authors of some papers mention that their automated essay scoring experiments used an R package. I am new to R and would like to know which R package serves this purpose and how to use it. I would be grateful for your guidance.

Thank you very much for your help.

Regards,
Andrew Cheung
[R] Silencing the output of install.packages()
Hello!

Is there a way to suppress the output of 'install.packages()'? I have seen that the 'download.file' function has a 'quiet' option but I do not know how to use it.

Thanks for your help
Tejas Kale
IUCAA, Pune
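For what it is worth, install.packages() has a quiet argument of its own (default FALSE), so the download.file option does not need to be set by hand. The helper below is a hypothetical wrapper written for this note, not part of R: it installs only the packages that are not yet available and passes quiet = TRUE. How much output is actually suppressed may depend on platform and package type.

```r
# Hypothetical helper: install only what is missing, with reduced output.
install_quietly <- function(pkgs, ...) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  missing_pkgs <- pkgs[!installed]
  if (length(missing_pkgs) > 0) {
    utils::install.packages(missing_pkgs, quiet = TRUE, ...)
  }
  invisible(missing_pkgs)
}

# Usage (needs a configured CRAN mirror and network access):
# install_quietly("survey")
```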
Re: [R] Remove columns from dataframe based on their statistics
On Thu, May 31, 2012 at 8:52 AM, J Toll jct...@gmail.com wrote:
>
>     for (i in seq(ncol(df), 1))
>       if (length(unique(df[, i])) == 1) {
>         df[, i] <- NULL
>       }

Here's a similar method employing a more functional approach:

    df[, apply(df, 2, function(x) length(unique(x)) > 1)]

James
Re: [R] odfWeave fails to load
On 31.05.2012 15:59, Bond, Stephen wrote:
> Uwe,
>
> Just a user's perspective: there are too many packages that work only on the maintainer's box, and it would benefit the community if there were stricter standards for allowing people to post a package. Open systems like Ubuntu have a ratings system that allows users to review packages, so the few bad apples are properly labelled and can be avoided by the community.
>
> Kind regards

We know; that's why the package is already scheduled for archival (along with currently 30 others).

Best,
Uwe Ligges

> Stephen B
>
> -----Original Message-----
> From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
> Sent: Wednesday, May 30, 2012 4:23 AM
> To: Bond, Stephen
> Cc: r-help@r-project.org
> Subject: Re: [R] odfWeave fails to load
>
> See http://cran.r-project.org/web/checks/check_results_odfWeave.html which indicates the package has some problems. Hence CRAN does not make binaries available. Please contact the maintainer.
>
> Best,
> Uwe Ligges
>
> On 29.05.2012 16:23, stephenb wrote:
>> R version 2.15.0 (2012-03-30)
>> Copyright (C) 2012 The R Foundation for Statistical Computing
>> ISBN 3-900051-07-0
>> Platform: i386-pc-mingw32/i386 (32-bit)
>>
>> package 'survey' successfully unpacked and MD5 sums checked
>> package 'odfWeave.survey' successfully unpacked and MD5 sums checked
>>
>>     library(odfWeave.survey)
>>     Loading required package: odfWeave
>>     Error: package 'odfWeave' could not be loaded
>>
>> any ideas, anybody?? I had odfSweave on 2.12, but no such thing on 2.15, just odfWeave.survey, and it won't load.
Re: [R] cluster with mahalanobis distance
Use distance() in package ecodist to compute the Mahalanobis distance matrix and pass that to hclust().

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Maria Froes
Sent: Wednesday, May 30, 2012 6:42 PM
To: r-help@r-project.org
Subject: Re: [R] cluster with mahalanobis distance

How can I perform cluster analysis using the Mahalanobis distance instead of the Euclidean distance?

Thank you
Maria Froes
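As an alternative to the ecodist route suggested above, the same distance matrix can be sketched with base R only. The idea is that Mahalanobis distance equals Euclidean distance after the data have been "whitened" with the Cholesky factor of the covariance matrix. The data and the choice of average linkage with three clusters are made up for illustration.

```r
set.seed(123)
X <- matrix(rnorm(60), ncol = 3)     # made-up data: 20 observations, 3 variables
X[, 2] <- X[, 1] + 0.5 * X[, 2]      # induce some correlation

S <- cov(X)
# chol(S) gives an upper-triangular R with S = t(R) %*% R.
# Euclidean distances between rows of X %*% solve(R) are Mahalanobis distances on X.
Z <- X %*% solve(chol(S))
d_mahal <- dist(Z)

hc <- hclust(d_mahal, method = "average")
groups <- cutree(hc, k = 3)
table(groups)
```

Note that stats::mahalanobis() returns squared distances, so a spot check against it has to square the entries of d_mahal.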
Re: [R] reading file in zip archive
On May 31, 2012, at 6:11 AM, Iain Gallagher wrote:
> Hi Phil
>
> That's it. Thanks. Will have a read at the docs now and see if I can figure out why leaving the 'r'ead instruction out works. Seems counter-intuitive!

It says that unz uses binary mode. You were specifying text mode. See if open="rb" is any more successful.

--
David.

> Best
> Iain
>
> From: Phil Spector spec...@stat.berkeley.edu
> To: Iain Gallagher iaingallag...@btopenworld.com
> Cc: r-help r-help@r-project.org
> Sent: Thursday, 31 May 2012, 0:06
> Subject: Re: [R] reading file in zip archive
>
> Iain -
> Do you see the same behaviour if you use
>
>     z <- unz(pathToZip, 'x.txt')
>
> instead of
>
>     z <- unz(pathToZip, 'x.txt', 'r')
>
> - Phil Spector
> Statistical Computing Facility
> Department of Statistics
> UC Berkeley
> spec...@stat.berkeley.edu
>
> On Wed, 30 May 2012, Iain Gallagher wrote:
>
> Hi Phil
>
> Thanks, but this still doesn't work. Here's a reproducible example (was wrapping my head around these functions before).
>
>     x <- as.data.frame(cbind(rep('a',5), rep('b',5)))
>     y <- as.data.frame(cbind(rep('c',5), rep('d',5)))
>     write.table(x, 'x.txt', sep='\t', quote=FALSE)
>     write.table(y, 'y.txt', sep='\t', quote=FALSE)
>     zip('test.zip', files = c('x.txt', 'y.txt'))
>     pathToZip <- paste(getwd(), '/test.zip', sep='')
>     z <- unz(pathToZip, 'x.txt', 'r')
>     zT <- read.table(z, header=FALSE, sep='\t')
>
>     Error in read.table(z, header = FALSE, sep = "\t") :
>       seek not enabled for this connection
>
> As I said in my previous email, readLines fails as well. Rather strange really. Anyway, as before, any advice would be appreciated.
>
> Best
> Iain
>
> From: Phil Spector spec...@stat.berkeley.edu
> To: Iain Gallagher iaingallag...@btopenworld.com
> Cc: r-help r-help@r-project.org
> Sent: Wednesday, 30 May 2012, 20:16
> Subject: Re: [R] reading file in zip archive
>
> Iain -
> Once you specify the file to unzip in the call to unz, there's no need to repeat the filename in read.table. Try:
>
>     z <- unz(pathToZip, 'goCats.txt', 'r')
>     zT <- read.table(z, header=TRUE, sep='\t')
>
> (Although I can't reproduce the exact error which you saw.)
>
> - Phil Spector
> Statistical Computing Facility
> Department of Statistics
> UC Berkeley
> spec...@stat.berkeley.edu
>
> On Wed, 30 May 2012, Iain Gallagher wrote:
>
> Hi List
>
> I have a series of zip archives each containing several files. One of these files is called goCats.txt and I would like to read it into R from the archive. It's a simple tab delimited text file.
>
>     pathToZip <- '/home/iain/Documents/Work/Results/bovineMacRNAData/deAnalysis/afInfection/commonNorm/twoHrs/af2 hrs.zip'
>     z <- unz(pathToZip, 'goCats.txt', 'r')
>     zT <- read.table(z, 'goCats.txt', header=T, sep='\t')
>
>     Error in read.table(z, "goCats.txt", header = T, sep = "\t") :
>       seek not enabled for this connection
>
> The same error arises with readLines.
>
> Can anyone advise?
>
> Best
> iain
>
>     sessionInfo()
>     R version 2.15.0 (2012-03-30)
>     Platform: x86_64-pc-linux-gnu (64-bit)
>
>     locale:
>      [1] LC_CTYPE=en_GB.utf8       LC_NUMERIC=C
>      [3] LC_TIME=en_GB.utf8        LC_COLLATE=en_GB.utf8
>      [5] LC_MONETARY=en_GB.utf8    LC_MESSAGES=en_GB.utf8
>      [7] LC_PAPER=C                LC_NAME=C
>      [9] LC_ADDRESS=C              LC_TELEPHONE=C
>     [11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C
>
>     attached base packages:
>     [1] stats graphics grDevices utils datasets methods base
>
>     loaded via a namespace (and not attached):
>     [1] tools_2.15.0

David Winsemius, MD
West Hartford, CT
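The working variant from the thread, passing the unz() connection to read.table() without an explicit open mode, can be sketched end to end as follows. The temporary paths and the flags passed to the external zip program are choices made for this example; utils::zip() needs a zip executable on the system, so the sketch skips the reading step when none is found.

```r
# Build a small zip archive to read from (file names are made up).
tmp <- tempfile("zipdemo")
dir.create(tmp)
txt <- file.path(tmp, "x.txt")
write.table(data.frame(V1 = rep("a", 5), V2 = rep("b", 5)), txt,
            sep = "\t", quote = FALSE, row.names = FALSE)

zipfile <- file.path(tmp, "test.zip")
zip_ok <- FALSE
if (nzchar(Sys.which("zip"))) {
  # -j stores the file without its directory path, -q keeps zip quiet.
  zip_ok <- utils::zip(zipfile, files = txt, flags = "-j -q") == 0
}

if (zip_ok) {
  # No 'open' argument: read.table() opens and closes the connection itself.
  zT <- read.table(unz(zipfile, "x.txt"), header = TRUE, sep = "\t")
  print(zT)
}
```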
Re: [R] Silencing the output of install.packages()
On 31.05.2012 16:18, Tejas Kale wrote:
> Hello!
>
> Is there a way to suppress the output of 'install.packages()'? I have seen that the 'download.file' function has a 'quiet' option but I do not know how to use it.

I do not see any good reason to allow that. A user should see if software is being installed.

Uwe Ligges

> Thanks for your help
> Tejas Kale
> IUCAA, Pune
Re: [R] Remove columns from dataframe based on their statistics
Hi James, hi Jorge,

Thank you very much! I like the apply approach; it seems quite simple, and I get back the TRUE/FALSE vector which I can use for indexing the dataframe.

Now the question has come up whether one can implement an exception, e.g. select the columns this way except for the column named B. I have to think about this.

/Johannes

-------- Original Message --------
Date: Thu, 31 May 2012 09:20:27 -0500
From: J Toll jct...@gmail.com
To: Johannes Radinger jradin...@gmx.at
CC: R-help@r-project.org
Subject: Re: [R] Remove columns from dataframe based on their statistics

> On Thu, May 31, 2012 at 8:52 AM, J Toll jct...@gmail.com wrote:
>>
>>     for (i in seq(ncol(df), 1))
>>       if (length(unique(df[, i])) == 1) {
>>         df[, i] <- NULL
>>       }
>
> Here's a similar method employing a more functional approach:
>
>     df[, apply(df, 2, function(x) length(unique(x)) > 1)]
>
> James
[R] counting the data in different groups for each row
Dear R,

I have data like this:

I I D I D D D D D I D I D I D I D I D D D I D D I I I I I I I I D I D I D I I I D I I I D I D I D I D I 0 0 I I I I I I I I I D I D I D I D I I I D I I I D I D I D I D I I I D I I I I I

Now, for each row, I want to count the values in groups of two. For example, within each row, all "I I" pairs should be counted in one group, and all "D I" and "D D" pairs in another group. The result should be the count of each group for the respective row.

Regards
GRR
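One possible reading of this request can be sketched as below. Several things are assumptions, because the row boundaries of the posted data are not recoverable: each row is taken to be a vector of calls read in consecutive pairs, "I D" is counted together with "D I" and "D D", and anything else (such as "0 0") goes into a third group. The rows and the function name count_pairs are invented for illustration.

```r
# Made-up rows: each row is a sequence of calls read in consecutive pairs.
rows <- list(
  c("I", "I", "D", "I", "D", "D", "I", "I"),
  c("D", "I", "I", "I", "0", "0", "I", "I")
)

count_pairs <- function(calls) {
  stopifnot(length(calls) %% 2 == 0)
  # Paste the 1st with the 2nd element, the 3rd with the 4th, and so on.
  pairs <- paste(calls[c(TRUE, FALSE)], calls[c(FALSE, TRUE)])
  with_d <- c("D I", "I D", "D D")
  c(II    = sum(pairs == "I I"),
    D_any = sum(pairs %in% with_d),
    other = sum(!pairs %in% c("I I", with_d)))
}

# One row of counts per input row.
counts <- t(vapply(rows, count_pairs, integer(3)))
counts
```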
Re: [R] Optimizing variables represented in a matrix
On May 31, 2012, at 7:37 AM, nata...@orchidpharma.com wrote:
> Dear R-list members,
>
> I have a matrix with non-numeric variables in it and I have to optimize the variables of the matrix in a formula using optim routine of the stats4 package. I know the matrix can only take numeric data

Some of the things you think you know are not so:

    exvec <- c('0.05V1+V2', '0.31V1', '0.05V1', '0.31V1', '0.3V1+V2', '0.5V1', '0.05V1', '0.5V1', '0.1V1+V2')
    matrix(exvec, 3, 3)
         [,1]        [,2]       [,3]
    [1,] "0.05V1+V2" "0.31V1"   "0.05V1"
    [2,] "0.31V1"    "0.3V1+V2" "0.5V1"
    [3,] "0.05V1"    "0.5V1"    "0.1V1+V2"

> and so I would like to know how to store non-numeric variables inside a matrix. Say for example:
>
> The 3X3 matrix is
>
>     0.05V1+V2  0.31V1    0.05V1
>     0.31V1     0.3V1+V2  0.5V1
>     0.05V1     0.5V1     0.1V1+V2
>
> The matrix is only for an example and the real matrix that I want to use is a 15X15 matrix. Here I would like to optimize the values of V1 and V2 using a formula.

Whether that plan makes sense seems problematic, but that wasn't your question.

> Could you please help me how to go about to represent the matrix in R.

I'm guessing you have thoughts of evaluating these expressions. They are not valid R expressions, however. You have some further study to do.

--
David Winsemius, MD
West Hartford, CT
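To make the point about valid R expressions concrete, here is a minimal sketch in which the cells are rewritten with an explicit multiplication sign and evaluated for given values of V1 and V2. The function name eval_matrix and, in particular, the objective passed to optim() are invented for illustration; the thread never says what is to be optimized.

```r
# The cell formulas rewritten as valid R expressions (note the explicit "*").
expr_mat <- matrix(c("0.05*V1+V2", "0.31*V1",   "0.05*V1",
                     "0.31*V1",    "0.3*V1+V2", "0.5*V1",
                     "0.05*V1",    "0.5*V1",    "0.1*V1+V2"),
                   nrow = 3, byrow = TRUE)

# Evaluate every cell for v = c(V1, V2) and return a numeric matrix.
eval_matrix <- function(v) {
  env <- list(V1 = v[1], V2 = v[2])
  vals <- vapply(expr_mat, function(e) eval(parse(text = e), env), numeric(1))
  matrix(vals, nrow = nrow(expr_mat))
}

M <- eval_matrix(c(2, 3))
M

# Hypothetical objective, purely for illustration: make the trace equal 10.
objective <- function(v) (sum(diag(eval_matrix(v))) - 10)^2
fit <- optim(c(1, 1), objective)
fit$par
```

For a 15 x 15 matrix it may be simpler and faster to write a function that builds the numeric matrix directly from V1 and V2 instead of parsing strings on every evaluation.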
[R] one more piece of info on AIC
Just one other thing about the AIC issue: there is a line in glm.fit which is the following:

    aic = aic(y, n, mu, weights, dev) + 2 * rank

but I couldn't find the function aic, so I couldn't investigate further. It looks suspicious, though, because it seems to me like it should be

    aic = -2*likelihood + 2 * rank

If anyone could help me find the aic function, it's appreciated.
Re: [R] Remove columns from dataframe based on their statistics
Hi Johannes,

Here are two approaches to accomplish this:

    subset(df, select = -B)
    df[, colnames(df) != "B"]

HTH,
Jorge.-

On Thu, May 31, 2012 at 10:34 AM, Johannes Radinger wrote:
> Hi James, hi Jorge,
>
> Thank you very much! I like the apply approach; it seems quite simple, and I get back the TRUE/FALSE vector which I can use for indexing the dataframe.
>
> Now the question has come up whether one can implement an exception, e.g. select the columns this way except for the column named B. I have to think about this.
>
> /Johannes
>
> -------- Original Message --------
> Date: Thu, 31 May 2012 09:20:27 -0500
> From: J Toll jct...@gmail.com
> To: Johannes Radinger jradin...@gmx.at
> CC: R-help@r-project.org
> Subject: Re: [R] Remove columns from dataframe based on their statistics
>
>> On Thu, May 31, 2012 at 8:52 AM, J Toll jct...@gmail.com wrote:
>>>
>>>     for (i in seq(ncol(df), 1))
>>>       if (length(unique(df[, i])) == 1) {
>>>         df[, i] <- NULL
>>>       }
>>
>> Here's a similar method employing a more functional approach:
>>
>>     df[, apply(df, 2, function(x) length(unique(x)) > 1)]
>>
>> James
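If the goal is instead to exempt column B from the no-variation filter, so that it is kept even though it is constant, the logical vector can be combined with a name test. This is one reading of the question, and the names varies, protected and df_kept are made up for the sketch.

```r
set.seed(42)
df <- data.frame(A = runif(100), B = rep(1, 100),
                 C = rep(2.42, 100), D = runif(100))

# TRUE for columns with more than one distinct value.
varies <- vapply(df, function(x) length(unique(x)) > 1, logical(1))

# Columns exempt from the filter, whatever their content.
protected <- names(df) %in% c("B")

df_kept <- df[, varies | protected, drop = FALSE]
names(df_kept)
```

Swapping the condition to varies & !protected gives the opposite exception, dropping B regardless of its content.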
[R] please help! Extract the row to the new file by using if-statment
Dear all, I find some troubles about how to extact the row from csv. file by using if-statement condition. I want to extract the row if the rainfall is greater than the mean of rainfall and using the wrfta divided into 3 groups that's rainfall greater than mean - group A ( create file group A_rain) - groupB ( create file group B_rain) - groupC ( create file group C_rain) rainfall less than mean - group A ( create file group A_norain) - groupB ( create file group B_norain) - groupC ( create file group C_norain) my csv. file is .. Date wrfRH wrfsolarwrfwindspeedwrfrain wrftd wrfta 21/10/2010 92.97 22.11 53.27 0 1546.337861 61.00852664 22/10/2010 87.35 21.99 40.89 0 1300.408288 62.85352227 23/10/2010 88.38 21.71 28.04 0.011381.768284 54.80594493 24/10/2010 92.32 15.45 22.38 0.511113.90981 39.46573663 25/10/2010 93.42 21.59 35.50.52868.4895334 28.42952321 26/10/2010 93.38 20.15 42.58 0.071404.722837 40.29300856 27/10/2010 89 21.66 42.30 1060.444918 41.86858345 28/10/2010 NA NA NA NA 1109.596721 39.84995092 29/10/2010 84.521.66 37.80 1015.801383 34.11625725 30/10/2010 84.98 22 36.27 0 839.5041209 43.44047866 31/10/2010 84.422.433.44 0 742.5284832 45.81572847 1/11/2010 80.09 22.24 38.35 0 1157.99328 45.59035293 2/11/2010 84.41 21.69 36.19 0 1075.26719 51.66310159 3/11/2010 88.55 21.22 37.73 0 1163.286504 51.34179935 4/11/2010 90.58 2.8838.49 0.561022.03364 57.74352136 5/11/2010 95.172.4632.22 3.48 1065.735327 57.7734991 6/11/2010 95.211.18 27.55 0.841027.066675 54.40282225 7/11/2010 89.45 20.81 24.75 0 720.9881913 57.76270824 8/11/2010 85.82 20.96 28.63 0 790.5735604 37.96771725 9/11/2010 85.02 20.96 31.94 0 703.2993511 40.62208274 my script is . 
    # Import data
    wrfJJA_UTC06 <- read.csv("JJA_UTC06_ALL.csv", header = T, sep = ",")
    attach(wrfJJA_UTC06)

    if (wrfrain < a)
      groupA_norain <- new[wrfta >= 255 | wrfta <= 65, ]
      groupB_norain <- new[wrfta >= 65 & wrfta <= 180, ]
      groupC_norain <- new[wrfta >= 180 & wrfta <= 255, ]
    else
      groupA_rain <- new[wrfta >= 255 | wrfta <= 65, ]
      groupB_rain <- new[wrfta >= 65 & wrfta <= 180, ]
      groupC_rain <- new[wrfta >= 180 & wrfta <= 255, ]

    # save as ...
    write.csv(groupA_norain, "groupA_norain.csv")
    write.csv(groupB_norain, "groupB_norain.csv")
    ...

However, I get an error message. What is wrong?

    Warning message:
    In if (n_wrfrain < a) groupA_norain <- new[n_wrfta >= 255 | n_wrfta <= :
      the condition has length > 1 and only the first element will be used

My wrfrain data contains NA. What can I do? Please help!
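The warning arises because if() looks at a single TRUE/FALSE value, while wrfrain is a whole column. A vectorised sketch of the intended split is given below. It rests on several assumptions: the rows are invented, the group limits (65, 180, 255) are read off the question's script as best they can be recovered, rows with missing rainfall are simply dropped, and the files are written to a temporary directory.

```r
# A few made-up rows with the columns used in the question.
wrf <- data.frame(
  Date    = c("21/10/2010", "23/10/2010", "24/10/2010", "28/10/2010", "5/11/2010"),
  wrfrain = c(0, 0.01, 0.51, NA, 3.48),
  wrfta   = c(61.0, 200.5, 120.3, 39.8, 260.1)
)

rain_mean <- mean(wrf$wrfrain, na.rm = TRUE)

# Drop rows with missing rainfall, then label each remaining row.
ok <- wrf[!is.na(wrf$wrfrain), ]
ok$rain <- ifelse(ok$wrfrain > rain_mean, "rain", "norain")
# Assumed limits: B is (65, 180], C is (180, 255], everything else is A.
ok$group <- ifelse(ok$wrfta > 65 & ok$wrfta <= 180, "B",
            ifelse(ok$wrfta > 180 & ok$wrfta <= 255, "C", "A"))

# One data frame per group/rain combination, each written to its own file.
pieces <- split(ok, paste0("group", ok$group, "_", ok$rain))
out_dir <- tempdir()
for (nm in names(pieces)) {
  write.csv(pieces[[nm]], file.path(out_dir, paste0(nm, ".csv")), row.names = FALSE)
}
names(pieces)
```

Combinations that contain no rows do not appear in pieces, so no empty files are written for them.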
Re: [R] problem with ifelse
Hi,

On Thu, May 31, 2012 at 10:09 AM, Christopher Kelvin chris_kelvin2...@yahoo.com wrote:
> Hello Sarah,
> I hope I have understood you. All I seek is code from which I can obtain interval censoring without using the survival package. Can you come to my aid?

Probably, but you need to meet me halfway. What do your inputs look like? What do your desired outputs look like?

First, state them in plain English: "My input is a vector of numeric values. In my desired output, this sort of value is changed to this number."

Then provide that in R form: "Here's an example input. Here's what I want the output to look like."

I'm good at writing R code, but I'm not interested in wading through your non-working code to figure out what you meant.

Please read the posting guide, and please send your replies back to the whole list and not just me.

Sarah

> Thank you
> Chris
>
> ----- Original Message -----
> From: Sarah Goslee sarah.gos...@gmail.com
> To: Christopher Kelvin chris_kelvin2...@yahoo.com; r-help r-help@r-project.org
> Cc:
> Sent: Thursday, May 31, 2012 3:51 AM
> Subject: Re: [R] problem with ifelse
>
> Since your code has things like this:
>
>     z<-numeric(length(t ((
>
> either you have a serious problem with your email client or you need to reread some introductory material and take a hard look at your code. Also note that g() doesn't work, because it contains the statement return(m) but m is undefined within g().
>
> Meanwhile, you could provide what I asked: a statement of what you expect your code to produce given particular input. Otherwise how would we know if we've offered the right solution, since your function doesn't work? Using set.seed() would be a useful component of this reproducible example.
>
> Without having a working if poorly-written function to go by or a clear results statement, I'm not interested in trying to rewrite your code. But some thoughts:
>
> Here's a new version of f().
>
>     f2 <- function(c1, c2) {
>       r <- pmax(c1 + c2, c1 + 0.5)
>       cbind(c1, r)
>     }
>
> It looks like you expected f() to be able to take vectors, but in g() you only return one value. Is that a mistake, or what you wanted? Since you're also using cbind(), I assume it's a mistake.
>
> Again, there are lots of problems here that suggest that you are coming from some other programming language and have not taken the time to learn much about R's syntax. This is easily remedied by reading the introduction.
>
> Sarah
>
> On Wed, May 30, 2012 at 3:32 PM, Christopher Kelvin chris_kelvin2...@yahoo.com wrote:
>> Hello Sarah,
>> Thank you for your response. Below is the complete code. My desire is to obtain interval censored data through simulation and fit it to the Weibull distribution to estimate the parameters. I am actually not very sure the code is correct. You may try it and advise me on what to do, and also on its correctness, if time permits. Thank you
>>
>>     g<-function(c1,c2) {
>>     f<-function(c1,c2) {
>>     u<-c1
>>     h<-c1+c2
>>     k<-c1+0.5
>>     r<-numeric (length(c1))
>>     for(i in 1:length(r))
>>     r[i]<-max(h[i],k[i])
>>     return( cbind (u,r))}
>>     r1<-f(c1,c2)
>>     r2<-f(r1[2],r1[1])
>>     r3<-f(r2[2],r2[1])
>>     r4<-f(r3[2],r3[1])
>>     r5<-f(r4[2],r4[1])
>>     a<-(cbind(r1[1],r2[1],r3[1],r4[1],r5[1],r5[2]))
>>     return(m )}
>>     c1<-runif(1,0,1.5)
>>     c2<-runif(1,0,0.5)
>>     m<-g(c1,c2)
>>     tdata<-rweibull(25,0.8,1.5)
>>     v<-c(0,m,999)
>>     y<-function(t,v){
>>     z<-numeric(length(t ((
>>     s<-numeric(length(t ((
>>     for(i in 1:length(t)){
>>     for(j in 1:length(v-1)) {
>>     ifelse ((t[i]>v[j] & t <v[j+1] ),{z[i]<-v[j];s[i]<-v[j+1]},NA)}}
>>     return(cbind(z,s))}
>>     y(t,v)
>>
>> Chris Kelvin
>
> Hi,
>
> The error with ifelse() seems to be that you have no idea what ifelse() does. As far as I can tell, you tried to construct code that does something like this:
>
>     y <- function(tdata, v) {
>       z <- rep(NA, length(tdata))
>       s <- z
>       for (i in 1:length(tdata)) {
>         for (j in 1:length(v-1)) {
>           if (tdata[i] > v[j] & tdata[i] < v[j+1]) {
>             z[i] <- v[j]
>             s[i] <- v[j+1]
>           }
>         }
>       }
>       return(cbind(z, s))
>     }
>
> But what's with all the (( instead of ))? And are you certain that the logic in the if statement is correct?
>
> If you tell us what you expect the results to be for given input values, we can help with that part too. Including making this more Rish: the nested for-loop construct is entirely unnecessary here, but I'm disinclined to rewrite it unless I actually know what you're trying to achieve.
>
> Incidentally, your example is only nearly-reproducible, since we don't know what m is.
>
> Sarah
>
> On Wed, May 30, 2012 at 10:01 AM, Christopher Kelvin chris_kelvin2...@yahoo.com wrote:
>> Dear all,
>> The code below is used to generate interval censored data, but unfortunately there is an error with the ifelse which I am not able to rectify. Can somebody help correct it for me? Thank you
>>
>>     t<-rexp(20,0.2)
>>     v<-c(0,m,999)
>>     y<-function(t,v){
>>     z<-numeric(length(t ((
>>     s<-numeric(length(t
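The interval lookup that the nested loops attempt can be sketched without loops using findInterval(). This is only one interpretation of what the original poster wanted: the inspection times in m are invented, since m was never supplied in the thread, and findInterval() treats intervals as closed on the left, which differs from the strict inequalities only when an event time falls exactly on a breakpoint. Note also that 1:length(v-1) in the loop versions runs up to length(v), so v[j+1] is NA on the last pass; 1:(length(v) - 1) was presumably intended.

```r
set.seed(2012)
t_event <- rexp(20, 0.2)              # event times, as in the original post
m <- c(0.8, 1.9, 3.5, 6.0, 10.0)      # made-up inspection times
v <- c(0, m, 999)                     # interval breakpoints, sorted increasingly

# findInterval() returns j such that v[j] <= t < v[j + 1].
j <- findInterval(t_event, v)

# Left and right endpoints of the interval containing each event time.
censored <- cbind(z = v[j], s = v[j + 1])
head(cbind(t_event, censored))
```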
Re: [R] one more piece of info on AIC
On 2012-05-31 07:52, Mark Leeds wrote:
> Just one other thing about the AIC issue: there is a line in glm.fit which is the following:
>
>     aic = aic(y, n, mu, weights, dev) + 2 * rank
>
> but I couldn't find the function aic, so I couldn't investigate further. It looks suspicious, though, because it seems to me like it should be
>
>     aic = -2*likelihood + 2 * rank
>
> If anyone could help me find the aic function, it's appreciated.

Have a look at ?family to see that the family object used in glm.fit() is a list, most of whose elements are functions. The code in glm.fit() has a few lines extracting those functions, one of which is aic(). The 'gf <- Gamma()' example on the help page for family is informative, as is the annotated code for glm.fit in the sources.

Peter Ehlers
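The point about the family object can be checked on a small simulated Poisson fit. In this sketch the aic element of the family is called by hand, following the line quoted from glm.fit(), and compared with AIC(); the data are simulated and the variable names are made up. For the Poisson family the aic function returns minus twice the log-likelihood, so the two formulas in the question agree here; other families (those with a dispersion parameter, for instance) may add further terms.

```r
set.seed(7)
x <- rnorm(50)
y <- rpois(50, exp(0.5 + 0.3 * x))
fit <- glm(y ~ x, family = poisson())

fam <- family(fit)            # a list; fam$aic is the function glm.fit() calls as aic()

mu <- fitted(fit)
wt <- fit$prior.weights
n  <- rep(1, length(y))       # passed along, but not used by the Poisson aic function

manual_aic <- fam$aic(y, n, mu, wt, deviance(fit)) + 2 * fit$rank
from_loglik <- -2 * as.numeric(logLik(fit)) + 2 * fit$rank

c(manual = manual_aic, AIC = AIC(fit), from_loglik = from_loglik)
```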
Re: [R] community finding in a graph and heatplot
On Tue, May 29, 2012 at 1:16 AM, Aziz, Muhammad Fayez az...@illinois.edu wrote:
> Hi everyone,
>
> I am using the fastgreedy.community function to get the $merges matrix and the $modularity vector. This serves my purpose of testing the modularity of my graph. But I am greedy to plot the heat map and dendrogram based on the $merges dendrogram matrix. I know that heatplot does the graphics part, but I am not sure if the dendrogram generated by heatplot will match the one given by fastgreedy.community in all cases and that the heat map will represent the same clustering.

No, they are different. To plot fast-greedy results as a dendrogram, see this and the follow-ups:

http://lists.gnu.org/archive/html/igraph-help/2010-11/msg00059.html

Gabor

> Tell me if my apprehension is incorrect. Otherwise please let me know of any alternatives. Here is the code I am testing so far:
>
>     # http://igraph.sourceforge.net/doc/R/modularity.html
>     # http://igraph.sourceforge.net/doc/R/fastgreedy.community.html
>     # http://igraph.sourceforge.net/doc/R/graph.constructors.html
>
>     library(igraph)
>     library(made4)
>
>     g <- graph(c(1,2, 2,3, 3,1, 4,5)-1, , FALSE)
>     print(g)
>     ModuleInfo <- fastgreedy.community(g)
>     print(ModuleInfo)
>     heatplot(c(1,2, 2,3, 3,1, 4,5))
>
> Thanks
> Fayez
> Grad student
> UIUC IL, USA

--
Gabor Csardi csa...@rmki.kfki.hu
MTA KFKI RMKI
Re: [R] community finding in a graph and heatplot
Thank you so much, Gabor, for your reply. I had spotted your post earlier and it worked like a charm. Interestingly, I have just run into trouble with the statement

    dend <- igraph:::as.dendrogram.igraph.walktrap(fc)

Apparently the members are empty, as when I print(dend) it says

    'dendrogram' with 2 branches and members total, at height 93

while the error when using dend with dendrapply remains

    Error in `[[.dendrogram`(X, 2L) : attempt to set an attribute on NULL

Any ideas? My code looks like this:

    File2Open = paste(FilePath, "NetworkFiles\\net\\", NetPrefix, "", TPPostfix, ".net", sep = "")
    g <- read.graph(File2Open, format="pajek")
    g <- delete.isolates(g)
    g <- simplify(g)
    fgc <- fastgreedy.community(g, modularity=TRUE, weights = E(g)$weight)
    ModularityIndexfgc <- max(fgc$modularity) # fgc modularity
    ModularityIndexng <- modularity(g, membership, weights = E(g)$weight) # newman-girvan modularity
    dend <- igraph:::as.dendrogram.igraph.walktrap(fgc)
    png(filename = paste(FilePath, "Analysis\\Graphs\\EColiStressModuleHeatMap", NetPrefixAbbr, TPPostfix, ".png", sep = ""), width = 800, height = 800) # heat map is square
    adjMatrix = get.adjacency(g, attr="weight")
    DendNodeCounter <- 0 # counter for ColorGroupsOrdered
    ColorGroupsOrdered <- rep("red", vcount(g))
    dendrapply(dend, colLab) # modifies ColorGroupsOrdered

From: csardi.ga...@gmail.com [csardi.ga...@gmail.com] on behalf of Gábor Csárdi [csa...@rmki.kfki.hu]
Sent: Thursday, May 31, 2012 10:45 AM
To: Aziz, Muhammad Fayez
Cc: r-help@r-project.org
Subject: Re: [R] community finding in a graph and heatplot

> On Tue, May 29, 2012 at 1:16 AM, Aziz, Muhammad Fayez az...@illinois.edu wrote:
>> Hi everyone,
>>
>> I am using the fastgreedy.community function to get the $merges matrix and the $modularity vector. This serves my purpose of testing the modularity of my graph. But I am greedy to plot the heat map and dendrogram based on the $merges dendrogram matrix. I know that heatplot does the graphics part, but I am not sure if the dendrogram generated by heatplot will match the one given by fastgreedy.community in all cases and that the heat map will represent the same clustering.
>
> No, they are different. To plot fast-greedy results as a dendrogram, see this and the follow-ups:
>
> http://lists.gnu.org/archive/html/igraph-help/2010-11/msg00059.html
>
> Gabor
>
>> Tell me if my apprehension is incorrect. Otherwise please let me know of any alternatives. Here is the code I am testing so far:
>>
>>     # http://igraph.sourceforge.net/doc/R/modularity.html
>>     # http://igraph.sourceforge.net/doc/R/fastgreedy.community.html
>>     # http://igraph.sourceforge.net/doc/R/graph.constructors.html
>>
>>     library(igraph)
>>     library(made4)
>>
>>     g <- graph(c(1,2, 2,3, 3,1, 4,5)-1, , FALSE)
>>     print(g)
>>     ModuleInfo <- fastgreedy.community(g)
>>     print(ModuleInfo)
>>     heatplot(c(1,2, 2,3, 3,1, 4,5))
>>
>> Thanks
>> Fayez
>> Grad student
>> UIUC IL, USA
>
> --
> Gabor Csardi csa...@rmki.kfki.hu
> MTA KFKI RMKI
Re: [R] Optimizing variables represented in a matrix
If you want a helpful answer, you must describe your real problem MUCH better! This is way too confused.

Kjetil

On Thu, May 31, 2012 at 7:37 AM, nata...@orchidpharma.com wrote:

Dear R-list members, I have a matrix with non-numeric variables in it and I have to optimize the variables of the matrix in a formula using optim routine of the stats4 package. I know the matrix can only take numeric data and so I would like to know how to store non-numeric variables inside a matrix. Say for example: The 3X3 matrix is

0.05V1+V2   0.31V1      0.05V1
0.31V1      0.3V1+V2    0.5V1
0.05V1      0.5V1       0.1V1+V2

The matrix is only an example; the real matrix that I want to use is a 15X15 matrix. Here I would like to optimize the values of V1 and V2 using a formula. Could you please help me with how to represent the matrix in R? Thanks in advance! B.Nataraj
Re: [R] please help! Extract the row to the new file by using if-statment
Not sure whether I understand your data and objectives well enough but here is what I would do: To make my life easier, I used x as a variable name. I'm not using attach(). You can extract your data with something like y - x[x$wrfta= 255 | x$wrfta= 65 x$wrfrain == 0, ] y - y[!is.na(y[5]),] y Date wrfRH wrfsolar wrfwindspeed wrfrain wrftdwrfta 1 21/10/2010 92.9722.1153.27 0 1546.3379 61.00853 2 22/10/2010 87.3521.9940.89 0 1300.4083 62.85352 7 27/10/2010 89.0021.6642.30 0 1060.4449 41.86858 9 29/10/2010 84.5021.6637.80 0 1015.8014 34.11626 10 30/10/2010 84.9822.0036.27 0 839.5041 43.44048 11 31/10/2010 84.4022.4033.44 0 742.5285 45.81573 12 1/11/2010 80.0922.2438.35 0 1157.9933 45.59035 13 2/11/2010 84.4121.6936.19 0 1075.2672 51.66310 14 3/11/2010 88.5521.2237.73 0 1163.2865 51.34180 18 7/11/2010 89.4520.8124.75 0 720.9882 57.76271 19 8/11/2010 85.8220.9628.63 0 790.5736 37.96772 20 9/11/2010 85.0220.9631.94 0 703.2994 40.62208 Does that help? Rgds, Rainer By the way, it is better to provide the data in dput() format: x - structure(list(Date = structure(c(2L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 1L, 3L, 14L, 15L, 16L, 17L, 18L, 19L, 20L ), .Label = c(1/11/2010, 21/10/2010, 2/11/2010, 22/10/2010, 23/10/2010, 24/10/2010, 25/10/2010, 26/10/2010, 27/10/2010, 28/10/2010, 29/10/2010, 30/10/2010, 31/10/2010, 3/11/2010, 4/11/2010, 5/11/2010, 6/11/2010, 7/11/2010, 8/11/2010, 9/11/2010), class = factor), wrfRH = c(92.97, 87.35, 88.38, 92.32, 93.42, 93.38, 89, NA, 84.5, 84.98, 84.4, 80.09, 84.41, 88.55, 90.58, 95.17, 95.2, 89.45, 85.82, 85.02), wrfsolar = c(22.11, 21.99, 21.71, 15.45, 21.59, 20.15, 21.66, NA, 21.66, 22, 22.4, 22.24, 21.69, 21.22, 2.88, 2.46, 11.18, 20.81, 20.96, 20.96), wrfwindspeed = c(53.27, 40.89, 28.04, 22.38, 35.5, 42.58, 42.3, NA, 37.8, 36.27, 33.44, 38.35, 36.19, 37.73, 38.49, 32.22, 27.55, 24.75, 28.63, 31.94), wrfrain = c(0, 0, 0.01, 0.51, 0.52, 0.07, 0, NA, 0, 0, 0, 0, 0, 0, 0.56, 3.48, 0.84, 0, 0, 0), wrftd = c(1546.337861, 1300.408288, 
1381.768284, 1113.90981, 868.4895334, 1404.722837, 1060.444918, 1109.596721, 1015.801383, 839.5041209, 742.5284832, 1157.99328, 1075.26719, 1163.286504, 1022.03364, 1065.735327, 1027.066675, 720.9881913, 790.5735604, 703.2993511), wrfta = c(61.00852664, 62.85352227, 54.80594493, 39.46573663, 28.42952321, 40.29300856, 41.86858345, 39.84995092, 34.11625725, 43.44047866, 45.81572847, 45.59035293, 51.66310159, 51.34179935, 57.74352136, 57.7734991, 54.40282225, 57.76270824, 37.96771725, 40.62208274), cat = c(NA, NA, 1, 1, 1, 1, NA, NA, NA, NA, NA, NA, NA, NA, 1, 1, 1, NA, NA, NA)), .Names = c(Date, wrfRH, wrfsolar, wrfwindspeed, wrfrain, wrftd, wrfta, cat), row.names = c(NA, -20L), class = data.frame) On Thursday 31 May 2012 07:55:01 pigpigmeow wrote: Dear all, I find some troubles about how to extact the row from csv. file by using if-statement condition. I want to extract the row if the rainfall is greater than the mean of rainfall and using the wrfta divided into 3 groups that's rainfall greater than mean - group A ( create file group A_rain) - groupB ( create file group B_rain) - groupC ( create file group C_rain) rainfall less than mean - group A ( create file group A_norain) - groupB ( create file group B_norain) - groupC ( create file group C_norain) my csv. file is .. 
DatewrfRH wrfsolarwrfwindspeedwrfrain wrftd wrfta 21/10/201092.97 22.11 53.27 0 1546.337861 61.00852664 22/10/201087.35 21.99 40.89 0 1300.408288 62.85352227 23/10/201088.38 21.71 28.04 0.011381.768284 54.80594493 24/10/201092.32 15.45 22.38 0.511113.90981 39.46573663 25/10/201093.42 21.59 35.50.52868.4895334 28.42952321 26/10/201093.38 20.15 42.58 0.071404.722837 40.29300856 27/10/201089 21.66 42.30 1060.444918 41.86858345 28/10/2010NA NA NA NA 1109.596721 39.84995092 29/10/201084.521.66 37.80 1015.801383 34.11625725 30/10/201084.98 22 36.27 0 839.5041209 43.44047866 31/10/201084.422.433.44 0 742.5284832 45.81572847 1/11/2010 80.09 22.24 38.35 0 1157.99328 45.59035293 2/11/2010 84.41 21.69 36.19 0 1075.26719 51.66310159 3/11/2010 88.55 21.22 37.73 0 1163.286504 51.34179935 4/11/2010
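A minimal sketch of the split described in the question, for illustration only. It assumes (hypothetically) that the three wrfta groups A, B and C are equal-width bins produced by cut(); the thread never states the real cut points, and the output file names are made up. Only the first six rows of the posted data are used, with values rounded.

```r
# First six rows of the posted data (rounded); only the columns needed here
x <- data.frame(
  Date    = c("21/10/2010", "22/10/2010", "23/10/2010",
              "24/10/2010", "25/10/2010", "26/10/2010"),
  wrfrain = c(0, 0, 0.01, 0.51, 0.52, 0.07),
  wrfta   = c(61.01, 62.85, 54.81, 39.47, 28.43, 40.29)
)

# Rain class: rainfall above the mean, or not
rain_mean <- mean(x$wrfrain, na.rm = TRUE)
x$rain_class <- ifelse(x$wrfrain > rain_mean, "rain", "norain")

# ASSUMPTION: groups A/B/C are three equal-width bins of wrfta
x$ta_group <- cut(x$wrfta, breaks = 3, labels = c("A", "B", "C"))

# One data frame per (group, rain class) combination that actually occurs
parts <- split(x, list(x$ta_group, x$rain_class), drop = TRUE)

# Write each piece to its own CSV (hypothetical file names, temp directory)
out_dir <- tempdir()
out_files <- file.path(out_dir, paste0("group_", names(parts), ".csv"))
for (i in seq_along(parts)) {
  write.csv(parts[[i]], out_files[i], row.names = FALSE)
}
```

With real cut points, replace `breaks = 3` by a vector of thresholds, e.g. `breaks = c(-Inf, t1, t2, Inf)`, where `t1` and `t2` are whatever limits define the groups.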
[R] Quadrat counting with spatstat
I have photographs of plots that look like so: http://r.789695.n4.nabble.com/file/n4631960/Untitled.jpg I need to divide it up so each circle has an equal area surrounding it. So into 20 equal segments, each of which contains a circle. Quadratcount is not sufficient because if I divide it up into 36 equal quadrats, some quadrats do not contain one of the circles. I'm not even sure how to do it mathematically, let alone using R. Can anyone help?
[R] Forecast
#-
USDEUR.xts <- xts(USDEUR[, 2:3],   # columns with data
                  USDEUR$Date)     # column with date/time index
USDEUR_sample.xts <- window(USDEUR.xts, end = as.Date("2012-03-25"))
nobs <- length(USDEUR_sample.xts$Value)
arima111_2 <- arima(USDEUR_sample.xts$Value,  # variable
                    order = c(1, 1, 1),       # (p,d,q) parameters
                    xreg = 1:nobs)            # additional regressors - here: linear trend
fore_111 <- predict(arima111_2, n.ahead = 5,
                    newxreg = (nobs + 1):(nobs + 5))  # using xreg in model estimation
#-

The problem is that the forecast starts at 1184, but my sample ends at 169. I would like the forecast to start at 170, but I have no idea where to change this detail in the above code. Any ideas?
Re: [R] Remove columns from dataframe based on their statistics
Hi, I tweaked James's code a little bit to produce the same result:

for (i in seq(ncol(df), 1)) if (sd(df[, i]) == 0) { df[, i] <- NULL }

- Original Message -
From: J Toll jct...@gmail.com
To: Johannes Radinger jradin...@gmx.at
Cc: R-help@r-project.org
Sent: Thursday, May 31, 2012 9:52 AM
Subject: Re: [R] Remove columns from dataframe based on their statistics

On Thu, May 31, 2012 at 8:27 AM, Johannes Radinger jradin...@gmx.at wrote:

Hi, I have a dataframe and want to remove columns from it that are populated with a single repeated value over the whole column (the variation of that column is 0). Is there an easier way than to calculate the statistics and then remove them by hand?

A <- runif(100)
B <- rep(1,100)
C <- rep(2.42,100)
D <- runif(100)
df <- data.frame(A,B,C,D)
# want to conditionally remove columns B and C as they show no variation

You could try something like:

for (i in seq(ncol(df), 1)) if (length(unique(df[, i])) == 1) { df[, i] <- NULL }

or for just numeric values:

for (i in seq(ncol(df), 1)) if (all(mean(df[, i]) == df[, i])) { df[, i] <- NULL }

HTH, James
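For comparison, the same column filter can be written without an explicit loop. This is only a sketch of one alternative, not something posted in the thread; the names is_constant and df_reduced are made up here, and the example data frame mirrors the one in the question.

```r
set.seed(42)
df <- data.frame(A = runif(100), B = rep(1, 100),
                 C = rep(2.42, 100), D = runif(100))

# TRUE for every column that holds a single distinct value
is_constant <- vapply(df, function(col) length(unique(col)) == 1, logical(1))

# Keep the non-constant columns; drop = FALSE keeps a data frame
# even if only one column survives
df_reduced <- df[, !is_constant, drop = FALSE]
names(df_reduced)
```

Unlike the sd() test, the unique() test also works for character and factor columns, and it builds a new data frame instead of deleting columns while iterating.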
[R] density plots using density.lf, data.frame and sort.int errors
Dear R help group: I am attempting to produce a density plot from a list of 2 values using the density.lf function and would appreciate any help. I hope I have done my homework reading the documentation, but I still seem to be missing something basic. I read the data as a table using read.table with header=TRUE (I excluded 2000 values); when I call the object it appears to be there and I can see the values. This is what I get when calling density.lf:

density.lf(x=logN0, n=50, window="gaussian")
Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing = decreasing)) : undefined columns selected

I then assigned a column name (using colname so it is called first), then assigned it as a vector using assign (x, c (logN0)). The error I get is:

density.lf(x, n=50, window="gaussian")
Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) : 'x' must be atomic

The traceback produces:

traceback()
5: stop("'x' must be atomic")
4: sort.int(x, na.last = na.last, decreasing = decreasing, ...)
3: sort.default(x)
2: sort(x)
1: density.lf(x, n = 50, window = "gaussian")

Thanks in advance, Andrea

-- Andrea Sequeira Associate Professor Department of Biological Sciences Wellesley College, Wellesley MA 02481
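Both error messages are what one would expect if the object passed as x is still a data frame rather than a numeric vector. A small sketch of the distinction, using made-up numbers and base R's density() as a stand-in so that it runs without the locfit package; the column name logN0 is taken from the post, everything else is an assumption.

```r
# Made-up data standing in for the table read with read.table(header = TRUE)
dat <- data.frame(logN0 = c(2.1, 3.4, 2.9, 3.8, 3.1, 2.6))

is.atomic(dat)      # FALSE: a data frame is a list of columns

# Pull the column out as a plain numeric vector
v <- dat$logN0      # equivalently dat[["logN0"]] or dat[[1]]
is.atomic(v)        # TRUE

# Base R kernel density estimate on the vector (stand-in for density.lf)
d <- density(v, n = 50, kernel = "gaussian")
```

If locfit is installed, the extracted vector `v` is the kind of object to hand to `density.lf(v, n = 50, window = "gaussian")` in place of the data frame.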
Re: [R] anova of lme objects (model1, model2) gives different results depending on order of models
No, both yield the same result: reject the null hypothesis, which always corresponds to the restricted (smaller) model. albyn On Thu, May 31, 2012 at 12:47:30PM +0100, Chris Beeley wrote: Hello- I understand that it's convention, when comparing two models using the anova function anova(model1, model2), to put the more complicated (for want of a better word) model as the second model. However, I'm using lme in the nlme package and I've found that the order of the models actually gives opposite results. I'm not sure if this is supposed to be the case or if I have missed something important, and I can't find anything in the Pinheiro and Bates book or in ?anova, or in Google for that matter which unfortunately only returns results about ANOVA which isn't much help. I'm using the latest version of R and nlme, just checked both. Here is the code and output: PHQmodel1=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal, random=~1|Case, na.action=na.omit) PHQmodel2=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal, random=~1|Case, na.action=na.omit, + correlation=corAR1(form=~Date|Case)) anova(PHQmodel1, PHQmodel2) # accept model 2 Model df AIC BIClogLik Test L.Ratio p-value PHQmodel1 1 8 48784.57 48840.43 -24384.28 PHQmodel2 2 9 48284.68 48347.51 -24133.34 1 vs 2 501.8926 .0001 PHQmodel1=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal, random=~1|Case, na.action=na.omit, + correlation=corAR1(form=~Date|Case)) PHQmodel2=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal, random=~1|Case, na.action=na.omit) anova(PHQmodel1, PHQmodel2) # accept model 2 Model df AIC BIClogLik Test L.Ratio p-value PHQmodel1 1 9 48284.68 48347.51 -24133.34 PHQmodel2 2 8 48784.57 48840.43 -24384.28 1 vs 2 501.8926 .0001 In both cases I am led to accept model 2 even though they are opposite models. Is it really just that you have to put them in the right order? It just seems like if there were say four models you wouldn't necessarily be able to determine the correct order. 
Many thanks, Chris Beeley, Institute of Mental Health, UK ...session info follows sessionInfo() R version 2.15.0 (2012-03-30) Platform: i386-pc-mingw32/i386 (32-bit) locale: [1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252 [3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C [5] LC_TIME=English_United Kingdom.1252 attached base packages: [1] grid stats graphics grDevices utils datasets methods base other attached packages: [1] gridExtra_0.9 RColorBrewer_1.0-5 car_2.0-12 nnet_7.3-1 MASS_7.3-17 [6] xtable_1.7-0 psych_1.2.4languageR_1.4 nlme_3.1-104 ggplot2_0.9.1 loaded via a namespace (and not attached): [1] colorspace_1.1-1 dichromat_1.2-4 digest_0.5.2 labeling_0.1 lattice_0.20-6 memoise_0.1 [7] munsell_0.3 plyr_1.7.1 proto_0.3-9.2 reshape2_1.2.1 scales_0.2.1 stringr_0.6 [13] tools_2.15.0 packageDescription(nlme) Package: nlme Version: 3.1-104 Date: 2012-05-21 Priority: recommended Title: Linear and Nonlinear Mixed Effects Models Authors@R: c(person(Jose, Pinheiro, comment = S version), person(Douglas, Bates, comment = up to 2007), person(Saikat, DebRoy, comment = up to 2002), person(Deepayan, Sarkar, comment = up to 2005), person(R-core, email = r-c...@r-project.org, role = c(aut, cre))) Author: Jose Pinheiro (S version), Douglas Bates (up to 2007), Saikat DebRoy (up to 2002), Deepayan Sarkar (up to 2005), the R Core team. Maintainer: R-core r-c...@r-project.org Description: Fit and compare Gaussian linear and nonlinear mixed-effects models. 
Depends: graphics, stats, R (= 2.13) Imports: lattice Suggests: Hmisc, MASS LazyLoad: yes LazyData: yes License: GPL (= 2) BugReports: http://bugs.r-project.org Packaged: 2012-05-23 07:28:59 UTC; ripley Repository: CRAN Date/Publication: 2012-05-23 07:37:45 Built: R 2.15.0; x86_64-pc-mingw32; 2012-05-29 12:36:01 UTC; windows __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Albyn Jones Reed College jo...@reed.edu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
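The point that the order of the arguments does not change the test can be checked on a small example. This sketch uses the Orthodont data shipped with nlme and two nested random-effects structures, not the poster's data or the AR(1) correlation model, so it is an illustration under those assumptions only.

```r
library(nlme)

# Restricted model: random intercept only
m_small <- lme(distance ~ age, data = Orthodont, random = ~ 1 | Subject)
# Larger model: random intercept and random slope for age
m_big   <- lme(distance ~ age, data = Orthodont, random = ~ age | Subject)

a_fwd <- anova(m_small, m_big)   # smaller model first
a_rev <- anova(m_big, m_small)   # larger model first

# The likelihood ratio statistic sits in the second row in both tables
a_fwd$L.Ratio[2]
a_rev$L.Ratio[2]
```

The two tables list the models in a different order, but the statistic is the same: in both cases the null hypothesis is the model with fewer parameters, whichever position it was passed in.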
Re: [R] Optimizing variables represented in a matrix
Inline ...

On Thu, May 31, 2012 at 9:09 AM, Kjetil Halvorsen kjetilbrinchmannhalvor...@gmail.com wrote:

If you want a helpful answer, you must describe your real problem MUCH better! This is way too confused.

-- Absolutely! But we certainly can say:

Kjetil

On Thu, May 31, 2012 at 7:37 AM, nata...@orchidpharma.com wrote:

Dear R-list members, I have a matrix with non-numeric variables in it and I have to optimize the variables of the matrix in a formula using optim routine of the stats4 package. I know the matrix can only take numeric data

-- This statement is false. A matrix can contain data of only one type -- no mixing -- but the type can be non-numeric, character for instance. However, as Kjetil said, your post is basically incoherent, so it is unlikely that you'll get any help here unless you post something that makes some sense.

-- Bert

and so I would like to know how to store non-numeric variables inside a matrix. Say for example: The 3X3 matrix is

0.05V1+V2   0.31V1      0.05V1
0.31V1      0.3V1+V2    0.5V1
0.05V1      0.5V1       0.1V1+V2

The matrix is only for an example and the real matrix that I want to use is a 15X15 matrix ,here I would like to optimize the values of V1 and V2 using a formula. Could you please help me how to go about to represent the matrix in R. Thanks in advance! B.Nataraj

-- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
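One possible reading of the question is that the matrix entries are linear in V1 and V2, in which case no symbolic storage is needed: a function can rebuild the numeric matrix for any trial values, and optim() (which lives in the stats package) can search over them. The sketch below rests on that reading. The objective, the target matrix, and the names coef_mat, make_matrix and objective are all hypothetical, since the thread never says what is being optimized.

```r
# Coefficients of V1 read off the 3x3 example; V2 is added on the diagonal
coef_mat <- matrix(c(0.05, 0.31, 0.05,
                     0.31, 0.30, 0.50,
                     0.05, 0.50, 0.10), nrow = 3, byrow = TRUE)

# Build the numeric matrix for given v = c(V1, V2)
make_matrix <- function(v) v[1] * coef_mat + diag(v[2], nrow(coef_mat))

# HYPOTHETICAL objective: squared distance to a target matrix that was
# itself generated with V1 = 2, V2 = 3, so the answer is known in advance
target <- make_matrix(c(2, 3))
objective <- function(v) sum((make_matrix(v) - target)^2)

fit <- optim(par = c(1, 1), fn = objective, method = "BFGS")
fit$par   # estimates of V1 and V2
```

For the real 15X15 case the same pattern would apply: one numeric matrix of V1 coefficients, one rule for where V2 enters, and whatever objective the actual problem defines in place of the made-up one here.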
Re: [R] problem with ifelse
Hi, On Thu, May 31, 2012 at 12:49 PM, Christopher Kelvin chris_kelvin2...@yahoo.com wrote: Hello, Sorry for reply you directly but i feel that if i send to R they might not post because they may not understand what i am trying to say I want to provide you with the code below by using the survival package, that might make sense to you than all these codes i am confusing myself with. But treating me as your personal R guru is a less-optimal solution than asking the entire list. Someone else might understand your problem; I certainly don't. I have say some values represented by r which i record for the first time i am conducting my study for each individual, 15 people i have another set of data that i refer to as the stop time of my study represented by t. Now what i have is two set of data say [1 2], [3 4] that is numeric values. So you have actual measured data for start time and stop time? What do those look like? To get my desired output i require that the number of start time be equal to the number of end time. That seems unlikely. So at the end i need a data that has lower values in the left and upper values in the right. What do your recorded data look like, and what does your intended output look like? library(survival) p1-1.2;b-1.5;n-15 r-runif(n,min=0,max=b) t-rweibull(n,shape=p1,scale=b) w=Surv(r,t+r,type=interval2) So r and t+r are equivalent to your recorded data, and w is your desired output? Comments would be helpful. Then why can't you use this method to process your own data? 
Sarah Thank you Chris - Original Message - From: Sarah Goslee sarah.gos...@gmail.com To: Christopher Kelvin chris_kelvin2...@yahoo.com; r-help r-help@r-project.org Cc: Sent: Thursday, May 31, 2012 11:19 PM Subject: Re: [R] problem with ifelse Hi, On Thu, May 31, 2012 at 10:09 AM, Christopher Kelvin chris_kelvin2...@yahoo.com wrote: Hello Sarah, I hope i have understood you; All i seek to do is to get a code that i can obtain interval censoring from without using the survival package. Can you come to my aid? Probably, but you need to meet me halfway. What do your inputs look like? What do your desired outputs look like? First, state them in plain English: my input is a vector of numeric values. In my desired output, this sort of value is changed to this number. Then provide that in R form. Here's an example input. Here's what I want the output to look like. I'm good at writing R code, but I'm not interested in wading through your non-working code to figure out what you meant. Please read the posting guide, and please send your replies back to the whole list and not just me. Sarah Thank you Chris - Original Message - From: Sarah Goslee sarah.gos...@gmail.com To: Christopher Kelvin chris_kelvin2...@yahoo.com; r-help r-help@r-project.org Cc: Sent: Thursday, May 31, 2012 3:51 AM Subject: Re: [R] problem with ifelse Since your code has things like this: z-numeric(length(t (( either you have a serious problem with your email client or you need to reread some introductory material and take a hard look at your code. Also note that g() doesn't work, because it contains the statement return(m) but m is undefined within g(). Meanwhile, you could provide what I asked: a statement of what you expect your code to produce given particular input. Otherwise how would we know if we've offered the right solution, since your function doesn't work? Using set.seed() would be a useful component of this reproducible example. 
Without having a working if poorly-written function to go by or a clear results statement, I'm not interested in trying to rewrite your code. But some thoughts: Here's a new version of f(). f2 - function(c1, c2) { r - pmax(c1 + c2, c1 + 0.5) cbind(c1, r) } It looks like you expected f() to be able to take vectors, but in g() you only return one value. Is that a mistake, or what you wanted? Since you're also using cbind(), I assume it's a mistake. Again, there are lots of problems here that suggest that you are coming from some other programming language and have not taken the time to learn much about R's syntax. This is easily remedied by reading the introduction. Sarah On Wed, May 30, 2012 at 3:32 PM, Christopher Kelvin chris_kelvin2...@yahoo.com wrote: Hello Sarah, Thank you for your response. Below is the complete code. My desire is to obtain interval censored data through simulation to fit it on the weibull distribution to estimate the parameters. I am actually not very sure of the code correctness. You may try it and advice me on what to do and also about it correctness if time will permit you. Thank you g-function(c1,c2) { f-function(c1,c2) { u-c1 h-c1+c2 k-c1+0.5 r-numeric (length(c1)) for(i in 1:length(r)) r[i]-max(h[i],k[i]) return( cbind (u,r))} r1-f(c1,c2) r2-f(r1[2],r1[1]) r3-f(r2[2],r2[1])
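A quick check that the vectorized f2() suggested in this thread gives the same result as the element-by-element loop from the posted f(). The loop version below is a reconstruction from the garbled code (the assignment arrows were lost in the archive), so treat it as an assumption about what was intended; the simulated inputs reuse the parameter values from the survival example.

```r
# Loop version, reconstructed from the posted f()
f_loop <- function(c1, c2) {
  h <- c1 + c2
  k <- c1 + 0.5
  r <- numeric(length(c1))
  for (i in seq_along(r)) r[i] <- max(h[i], k[i])
  cbind(c1, r)
}

# Vectorized version: pmax() takes the element-wise maximum
f2 <- function(c1, c2) {
  r <- pmax(c1 + c2, c1 + 0.5)
  cbind(c1, r)
}

set.seed(1)
c1 <- runif(15, min = 0, max = 1.5)
c2 <- rweibull(15, shape = 1.2, scale = 1.5)

res_loop <- f_loop(c1, c2)
res_vec  <- f2(c1, c2)
```

Each row holds a left endpoint and a right endpoint that is at least 0.5 larger, which matches the stated goal of lower values on the left and upper values on the right.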
[R] How can I get this function to work?
Hello All, Can anyone tell help me understand why the function below doesn't work and how I can fix it? Below are some sample data, some code that works on individual rows of the data, and my attempt to translate that code into a function. My hope is to get the function working and then to apply it to the larger data frame using ddply() from the plyr package or possibly some other approach. As yet, I don't have much experience writing anonymous functions. I imagine I'm doing something that is obviously wrong, but I don't know what it is. Thanks, Paul Read in test data testData - structure(list(profile_key = structure(c(1L, 1L, 2L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 7L, 7L), .Label = c(001-001 , 001-002 , 001-003 , 001-004 , 001-005 , 001-006 , 001-007 ), class = factor), encounter_date = structure(c(9L, 10L, 11L, 12L, 13L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 4L, 7L, 7L), .Label = c( 2009-03-01 , 2009-03-22 , 2009-04-01 , 2010-03-01 , 2010-10-15 , 2010-11-15 , 2011-03-01 , 2011-03-14 , 2011-10-10 , 2011-10-24 , 2012-09-15 , 2012-10-05 , 2012-10-17 ), class = factor), raw = c( ordered kras testing on 10102010 results not yet available if patient has a mutation will start erbitux , received kras results on 10202010 test results indicate tumor is wild type ua protein positve erpr positive her2neu positve , will conduct kras mutation testing prior to initiation of therapy with erbitux , still need to order kras mutation testing , ordered kras testing waiting for results , kras test results pending note that patient was negative for lynch mutation , kras results still pending note that patient was negative for lynch mutation , kras mutated will not prescribe erbitux due to mutation , kras mutated therefore did not prescribe erbitux , kras wild , tumor is negative for mutation , tumor is wild type patient is eligible to receive eribtux , if patient kras result is wild type they will start erbitux several lines of material ordered kras mutation test 2011 results are 
still not available , kras results are in patient has the mutation , ordered kras mutation testing on 02152011 results came back negative several lines of material patient kras mutation test is negative will start erbitux , patient is kras negative started erbitux on 03012011 )), .Names = c(profile_key, encounter_date, raw), row.names = c(NA, -16L), class = data.frame) Convert text record to lowercase testData$raw - tolower(testData$raw) Remove punctuation and any multiple spaces testData$raw - gsub([[:punct:]], , testData$raw) testData$raw - gsub( +, , testData$raw) Select test row testRow - testData[13,] testRow Select terms +/- a specified number of words from kras Text - unlist(strsplit(testRow$raw, )) Target - grep(kras, Text) if (length(Target) == 0) {testRow$reduced - } else{ Length - length(Text) Keep - rep(NA, Length) Lower - ifelse(Target - 6 0, Target - 6, 1) Upper - ifelse(Target + 6 Length, Target + 6, Length) for(i in 1:length(Keep)){ for(j in 1:length(Lower)){ Keep[i][i %in% seq(Lower[j], Upper[j])] - i }} testRow$reduced - paste(Text[!is.na(Keep)], collapse= ) } testRow length(Text) length(Text[!is.na(Keep)]) Function for selecting words within specified range of a target term nearTerms - function(df, text, target, before, after, outvar){ Text - with(df, strsplit(text, )) Target - grep(target, Text) if (length(Target) == 0) {df$reduced - } else{ Length - length(Text) Keep - rep(NA, Length) Lower - ifelse(Target - before 0, Target - before, 1) Upper - ifelse(Target + after Length, Target + after, Length) for(i in 1:length(Keep)){ for(j in 1:length(Lower)){ Keep[i][i %in% seq(Lower[j], Upper[j])] - i }} df - transform(df, outvar = paste(Text[!is.na(Keep)], collapse= )) } } nearTerms(testRow, raw, kras, 6, 6) nearTerms(df = testRow, text = raw, target = kras, before = 6, after = 6) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide 
commented, minimal, self-contained, reproducible code.
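A possible rewrite of the word-window idea as a function that works on one text string at a time, which can then be applied over a column. This is a sketch, not a repair of the posted nearTerms(): the quoting in the archived code was lost, so it is hard to say exactly how that function failed. The name near_terms and the two-row example data frame are made up for illustration.

```r
# Keep the words within `before`/`after` positions of any word matching `target`
near_terms <- function(text, target, before = 6, after = 6) {
  # strsplit() returns a list; [[1]] extracts the word vector
  words <- strsplit(trimws(text), " +")[[1]]
  hits <- grep(target, words, fixed = TRUE)
  if (length(hits) == 0) return("")

  keep <- rep(FALSE, length(words))
  for (h in hits) {
    lo <- max(h - before, 1)              # clamp to the first word
    hi <- min(h + after, length(words))   # clamp to the last word
    keep[lo:hi] <- TRUE
  }
  paste(words[keep], collapse = " ")
}

# Apply to every row of a (made-up) data frame with a `raw` text column
notes <- data.frame(raw = c(" ordered kras mutation test results pending ",
                            " tumor is negative for mutation "),
                    stringsAsFactors = FALSE)
notes$reduced <- vapply(notes$raw, near_terms, character(1),
                        target = "kras", before = 1, after = 2,
                        USE.NAMES = FALSE)
```

Two differences from the posted attempt seem worth noting: the function returns its result explicitly instead of assigning to a local copy of the data frame, and the column and target are passed as values, so no `with()` or `transform()` is needed inside it.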
Re: [R] How can I get this function to work?
Well, good luck finding someone to wade through your code -- small,reproducible examples are requested for a reason -- but I will offer that I have no idea what you mean with your remark about anonymous functions, as the code you posted has none. -- Bert On Thu, May 31, 2012 at 10:38 AM, Paul Miller pjmiller...@yahoo.com wrote: Hello All, Can anyone tell help me understand why the function below doesn't work and how I can fix it? Below are some sample data, some code that works on individual rows of the data, and my attempt to translate that code into a function. My hope is to get the function working and then to apply it to the larger data frame using ddply() from the plyr package or possibly some other approach. As yet, I don't have much experience writing anonymous functions. I imagine I'm doing something that is obviously wrong, but I don't know what it is. Thanks, Paul Read in test data testData - structure(list(profile_key = structure(c(1L, 1L, 2L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 7L, 7L), .Label = c(001-001 , 001-002 , 001-003 , 001-004 , 001-005 , 001-006 , 001-007 ), class = factor), encounter_date = structure(c(9L, 10L, 11L, 12L, 13L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 4L, 7L, 7L), .Label = c( 2009-03-01 , 2009-03-22 , 2009-04-01 , 2010-03-01 , 2010-10-15 , 2010-11-15 , 2011-03-01 , 2011-03-14 , 2011-10-10 , 2011-10-24 , 2012-09-15 , 2012-10-05 , 2012-10-17 ), class = factor), raw = c( ordered kras testing on 10102010 results not yet available if patient has a mutation will start erbitux , received kras results on 10202010 test results indicate tumor is wild type ua protein positve erpr positive her2neu positve , will conduct kras mutation testing prior to initiation of therapy with erbitux , still need to order kras mutation testing , ordered kras testing waiting for results , kras test results pending note that patient was negative for lynch mutation , kras results still pending note that patient was negative for lynch mutation , kras mutated 
will not prescribe erbitux due to mutation , kras mutated therefore did not prescribe erbitux , kras wild , tumor is negative for mutation , tumor is wild type patient is eligible to receive eribtux , if patient kras result is wild type they will start erbitux several lines of material ordered kras mutation test 2011 results are still not available , kras results are in patient has the mutation , ordered kras mutation testing on 02152011 results came back negative several lines of material patient kras mutation test is negative will start erbitux , patient is kras negative started erbitux on 03012011 )), .Names = c(profile_key, encounter_date, raw), row.names = c(NA, -16L), class = data.frame) Convert text record to lowercase testData$raw - tolower(testData$raw) Remove punctuation and any multiple spaces testData$raw - gsub([[:punct:]], , testData$raw) testData$raw - gsub( +, , testData$raw) Select test row testRow - testData[13,] testRow Select terms +/- a specified number of words from kras Text - unlist(strsplit(testRow$raw, )) Target - grep(kras, Text) if (length(Target) == 0) {testRow$reduced - } else{ Length - length(Text) Keep - rep(NA, Length) Lower - ifelse(Target - 6 0, Target - 6, 1) Upper - ifelse(Target + 6 Length, Target + 6, Length) for(i in 1:length(Keep)){ for(j in 1:length(Lower)){ Keep[i][i %in% seq(Lower[j], Upper[j])] - i }} testRow$reduced - paste(Text[!is.na(Keep)], collapse= ) } testRow length(Text) length(Text[!is.na(Keep)]) Function for selecting words within specified range of a target term nearTerms - function(df, text, target, before, after, outvar){ Text - with(df, strsplit(text, )) Target - grep(target, Text) if (length(Target) == 0) {df$reduced - } else{ Length - length(Text) Keep - rep(NA, Length) Lower - ifelse(Target - before 0, Target - before, 1) Upper - ifelse(Target + after Length, Target + after, Length) for(i in 1:length(Keep)){ for(j in 1:length(Lower)){ Keep[i][i %in% seq(Lower[j], Upper[j])] - i }} df - transform(df, 
outvar = paste(Text[!is.na(Keep)], collapse= )) } } nearTerms(testRow, raw, kras, 6, 6) nearTerms(df = testRow, text = raw, target = kras, before = 6, after = 6) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm __ R-help@r-project.org mailing list
Re: [R] How can I get this function to work?
On Thu, May 31, 2012 at 1:54 PM, Bert Gunter gunter.ber...@gene.com wrote:
> Well, good luck finding someone to wade through your code -- small,
> reproducible examples are requested for a reason -- but I will offer
> that I have no idea what you mean with your remark about anonymous
> functions, as the code you posted has none.

That's exactly as far as I got, and for just the same reasons. I'll just add that if you're trying to make a function (the last thing) that does the same thing as the sample code above it, then you do rather need to include the same code in it. And if that's not what you're trying to do, well, see Bert's request for small reproducible example and clear explanation.

Sarah

> -- Bert
>
> On Thu, May 31, 2012 at 10:38 AM, Paul Miller pjmiller...@yahoo.com wrote:
>> Hello All,
>>
>> Can anyone help me understand why the function below doesn't work and
>> how I can fix it? Below are some sample data, some code that works on
>> individual rows of the data, and my attempt to translate that code into
>> a function. My hope is to get the function working and then to apply it
>> to the larger data frame using ddply() from the plyr package or possibly
>> some other approach.
>>
>> As yet, I don't have much experience writing anonymous functions. I
>> imagine I'm doing something that is obviously wrong, but I don't know
>> what it is.
>>
>> Thanks,
>> Paul
>>
>> # Read in test data
>> testData <- structure(list(
>>   profile_key = structure(c(1L, 1L, 2L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 7L, 7L),
>>     .Label = c("001-001", "001-002", "001-003", "001-004", "001-005", "001-006", "001-007"),
>>     class = "factor"),
>>   encounter_date = structure(c(9L, 10L, 11L, 12L, 13L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 4L, 7L, 7L),
>>     .Label = c("2009-03-01", "2009-03-22", "2009-04-01", "2010-03-01", "2010-10-15",
>>       "2010-11-15", "2011-03-01", "2011-03-14", "2011-10-10", "2011-10-24",
>>       "2012-09-15", "2012-10-05", "2012-10-17"),
>>     class = "factor"),
>>   raw = c(
>>     "ordered kras testing on 10102010 results not yet available if patient has a mutation will start erbitux",
>>     "received kras results on 10202010 test results indicate tumor is wild type ua protein positve erpr positive her2neu positve",
>>     "will conduct kras mutation testing prior to initiation of therapy with erbitux",
>>     "still need to order kras mutation testing",
>>     "ordered kras testing waiting for results",
>>     "kras test results pending note that patient was negative for lynch mutation",
>>     "kras results still pending note that patient was negative for lynch mutation",
>>     "kras mutated will not prescribe erbitux due to mutation",
>>     "kras mutated therefore did not prescribe erbitux",
>>     "kras wild",
>>     "tumor is negative for mutation",
>>     "tumor is wild type patient is eligible to receive eribtux",
>>     "if patient kras result is wild type they will start erbitux several lines of material ordered kras mutation test 2011 results are still not available",
>>     "kras results are in patient has the mutation",
>>     "ordered kras mutation testing on 02152011 results came back negative several lines of material patient kras mutation test is negative will start erbitux",
>>     "patient is kras negative started erbitux on 03012011")),
>>   .Names = c("profile_key", "encounter_date", "raw"),
>>   row.names = c(NA, -16L), class = "data.frame")
>>
>> # Convert text record to lowercase
>> testData$raw <- tolower(testData$raw)
>>
>> # Remove punctuation and any multiple spaces
>> testData$raw <- gsub("[[:punct:]]", "", testData$raw)
>> testData$raw <- gsub(" +", " ", testData$raw)
>>
>> # Select test row
>> testRow <- testData[13,]
>> testRow
>>
>> # Select terms +/- a specified number of words from kras
>> Text <- unlist(strsplit(testRow$raw, " "))
>> Target <- grep("kras", Text)
>> if (length(Target) == 0) {testRow$reduced <- ""} else{
>>   Length <- length(Text)
>>   Keep <- rep(NA, Length)
>>   Lower <- ifelse(Target - 6 > 0, Target - 6, 1)
>>   Upper <- ifelse(Target + 6 < Length, Target + 6, Length)
>>   for(i in 1:length(Keep)){
>>     for(j in 1:length(Lower)){
>>       Keep[i][i %in% seq(Lower[j], Upper[j])] <- i
>>     }}
>>   testRow$reduced <- paste(Text[!is.na(Keep)], collapse=" ")
>> }
>> testRow
>> length(Text)
>> length(Text[!is.na(Keep)])
>>
>> # Function for selecting words within specified range of a target term
>> nearTerms <- function(df, text, target, before, after, outvar){
>>   Text <- with(df, strsplit(text, " "))
>>   Target <- grep(target, Text)
>>   if (length(Target) == 0) {df$reduced <- ""} else{
>>     Length <- length(Text)
>>     Keep <- rep(NA, Length)
>>     Lower <- ifelse(Target - before > 0, Target - before, 1)
>>     Upper <- ifelse(Target + after < Length, Target + after, Length)
>>     for(i in 1:length(Keep)){
>>       for(j in 1:length(Lower)){
>>         Keep[i][i %in% seq(Lower[j], Upper[j])] <- i
>>       }}
>>     df <- transform(df, outvar = paste(Text[!is.na(Keep)], collapse=" "))
>>   }
>> }
>>
>> nearTerms(testRow, "raw", "kras", 6, 6)
>> nearTerms(df = testRow, text = "raw", target = "kras", before = 6, after = 6)
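Sarah's point, that the function has to contain the same steps as the row-level code, can be sketched as follows. This is only an illustration under stated assumptions, not Paul's function: the name near_terms and its arguments are hypothetical, and it takes a single character string rather than a data frame and a column name.

```r
# Hypothetical helper: keep the words that lie within `before`/`after`
# positions of any word matching `target` in one character string.
near_terms <- function(text, target, before = 6, after = 6) {
  words <- unlist(strsplit(text, " +"))
  hits <- grep(target, words)
  if (length(hits) == 0) return("")
  keep <- rep(FALSE, length(words))
  for (h in hits) {
    lo <- max(h - before, 1)             # clamp the window at the first word
    hi <- min(h + after, length(words))  # ... and at the last word
    keep[lo:hi] <- TRUE
  }
  paste(words[keep], collapse = " ")
}

x <- paste("if patient kras result is wild type they will start erbitux",
           "several lines of material ordered kras mutation test 2011",
           "results are still not available")
near_terms(x, "kras", before = 2, after = 2)

# Applied to a whole column (assuming a data frame `testData` with a
# character column `raw`):
# testData$reduced <- vapply(testData$raw, near_terms, character(1),
#                            target = "kras", USE.NAMES = FALSE)
```

Because the helper works on one string at a time, it can be applied per row with vapply() without needing ddply().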
[R] Incorporate a shapefile with an package animation
Hello,

I'm working with NetCDF files in an animation and am trying to superimpose a shapefile on the image as it generates the HTML pages. If I take out the lines for the shapefile, it works correctly; however, I'm having difficulty including the shapefile. If anyone has any ideas I would greatly appreciate the assistance.

R Version 2.15.0 (2012-03-30)
Platform i386-pc-mingw32/i386 (32-bit)

### Animation of EDEN real time uncorrected data.
library(animation)
library(ncdf)
library(fields)
library(maptools)

ENP_WCA <- readShapePoly(system.file("U:\\GIS_Data\\GISlayers\\Boundaries_Lines\\EVERareas_Dissolve.shp", package="maptools")[1], proj4String=CRS("+proj=utm +zone17 +datum=WGS84"))
# here an error is reported:
# Error in getinfo.shape(filen) : Error opening SHP file

# change to the working directory where the NetCDF file is stored
setwd("A:\\Work_Area\\Steve\\EDEN")
eden <- open.ncdf("Jan_9-28_2011_q1_rt.nc")
print(eden)
Stage <- get.var.ncdf(nc=eden, varid = "stage")

saveHTML({
  for (i in 1:90) {
    image.plot(Stage[, , i], zlim=c(-20,500), main = "EDEN Jan 10 - 28 2011\n Uncorrected Real Time Stage")
    plot(ENP_WCA, col="black")
  }
}, img.name = "Stage.img", imgdir = "EDEN_dir", htmlfile = "Eden_stage.html", autobrowse = FALSE, title = "TIME SALINITY PREDICTIONS", description = c("EDEN Real Time Uncorrected Data"))

Thanks for your attention.

Steve Friedman Ph. D.
Ecologist / Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034
steve_fried...@nps.gov
Office (305) 224 - 4282
Fax (305) 224 - 4147
Re: [R] How can I get this function to work?
I should have added, though: If you are writing R code you **must** learn to use R's debugging tools, which include: ?traceback ?debugger ?browser ?trace ?debug ?recover Then you do your own debugging instead of posting opaque code here and hoping that someone takes the bait. See the section on debugging in the R Language manual for a more complete discussion. Cheers, Bert
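As a small illustration of the debugging tools listed in this message, here is a sketch on a made-up function (scale_first is hypothetical and has nothing to do with Paul's code). The interactive calls are left as comments so the block also runs non-interactively; tryCatch() is used only to show the error message without stopping.

```r
# Hypothetical function with a deliberate problem: sort() fails on a list.
scale_first <- function(x) {
  s <- sort(x)
  s / s[1]
}

# Non-interactive check: capture the error message instead of stopping.
msg <- tryCatch(scale_first(list(3, 1, 2)),
                error = function(e) conditionMessage(e))
print(msg)

# In an interactive session one might instead try:
# scale_first(list(3, 1, 2)); traceback()  # call stack of the last error
# debug(scale_first)                       # step through on the next call
# undebug(scale_first)
# options(error = recover)                 # choose a frame to inspect on error
```

Placing browser() inside a function body is another option when only one spot needs inspecting.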
Re: [R] density plots using density.lf, data.frame and sort.int errors
My guess (unconfirmed) is that read.table() gives you a data frame but density.lf expects an atomic (= not a list = not a data frame) vector. Perhaps try

density.lf(x[,1])

to just send the column -- the drop behavior should make sure this is an atomic vector.

If that doesn't help, please do provide us with the output of dput(head(x, 30)) and the package the density.lf function comes from.

Best,
Michael

On Thu, May 31, 2012 at 11:35 AM, Andrea S Sequeira asequ...@wellesley.edu wrote:
> Dear R help group:
>
> I am attempting to produce a density plot from a list of 2 values using
> the density.lf function and would appreciate any help, I hope I have done
> my homework reading the documentation but I still seem to be missing
> something basic.
>
> I have read the data as a table using read.table, with header=TRUE (I
> excluded 2000 values), when calling the objects it appears to be there
> and I can see the values
>
> this is what I get when doing
>
> density.lf (x=logN0, n=50, window="gaussian")
> Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing = decreasing)) :
>   undefined columns selected
>
> then assigned a column name (using colname so it is called first), then
> assigned it as a vector using assign (x, c (logN0))
>
> the error I get is
>
> density.lf (x, n=50, window="gaussian")
> Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) :
>   'x' must be atomic
>
> the traceback produces:
>
> traceback ()
> 5: stop("'x' must be atomic")
> 4: sort.int(x, na.last = na.last, decreasing = decreasing, ...)
> 3: sort.default(x)
> 2: sort(x)
> 1: density.lf(x, n = 50, window = "gaussian")
>
> Thanks in advance,
> Andrea
>
> --
> Andrea Sequeira
> Associate Professor
> Department of Biological Sciences
> Wellesley College, Wellesley MA 02481
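The difference between a data frame and an atomic vector that Michael describes can be checked directly. The sketch below uses made-up values and the column name logN0 from the question; stats::density() is used purely as a stand-in, since density.lf() comes from a package that may not be installed.

```r
# Made-up data standing in for the values read with read.table().
x <- data.frame(logN0 = c(2.1, 3.4, 2.8, 3.9, 3.1))

is.atomic(x)        # a data frame is a list of columns, so not atomic
is.atomic(x[, 1])   # single-column indexing drops to a numeric vector
is.atomic(x$logN0)  # same column, extracted by name

# stats::density() is only a stand-in for density.lf() here; it likewise
# expects a numeric vector rather than a data frame.
d <- density(x[, 1], n = 50)
summary(d$y)
```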
Re: [R] Quadrat counting with spatstat
On Thu, May 31, 2012 at 11:23 AM, AMFTom the.quiet.r...@gmail.com wrote:
> I have photographs of plots that look like so:
>
> http://r.789695.n4.nabble.com/file/n4631960/Untitled.jpg
>
> I need to divide it up so each circle has an equal area surrounding it.
> So into 20 equal segments, each of which contains a circle. Quadratcount
> is not sufficient because if I divide it up into 36 equal quadrats, some
> quadrats do not contain one of the circles.

I must admit I found this a little confusing -- are you trying to divide into twenty segments or 36? Also, what package does quadratcount come from?

I'm guessing this might work better in an image processing/computer vision program than in R.

Best,
Michael

> I'm not even sure how to do it mathematically, let alone using R. Can
> anyone help?
Re: [R] Quadrat counting with spatstat
On May 31, 2012, at 2:26 PM, R. Michael Weylandt wrote:
> On Thu, May 31, 2012 at 11:23 AM, AMFTom the.quiet.r...@gmail.com wrote:
>> I have photographs of plots that look like so:
>>
>> http://r.789695.n4.nabble.com/file/n4631960/Untitled.jpg
>>
>> I need to divide it up so each circle has an equal area surrounding it.
>> So into 20 equal segments, each of which contains a circle. Quadratcount
>> is not sufficient because if I divide it up into 36 equal quadrats, some
>> quadrats do not contain one of the circles.
>
> I must admit I found this a little confusing -- are you trying to
> divide into twenty segments or 36? Also, what package does
> quadratcount come from?
>
> I'm guessing this might work better in an image processing/computer
> vision program than in R.

The solution[1] requires a higher level of intelligence than is typical in ordinary clustering mechanisms. Maybe some sort of symbolic geometry program exists somewhere? There are really two levels of symmetry that need to be processed to come up with an approach that satisfies both constraints (equi-area-partition and all-area-included). Agree it's not a statistical problem ... nor was it offered in a manner that lent itself to testing an algorithmic solution.

--
David.

[1] Which is too large to fit into the margins of this posting.

> Best,
> Michael
>
>> I'm not even sure how to do it mathematically, let alone using R. Can
>> anyone help?

David Winsemius, MD
West Hartford, CT
Re: [R] Quadrat counting with spatstat
1. Erect a solid, impermeable wall around the perimeter.
2. Put a very flexible membrane around each circle.
3. Add a drop of low viscosity, low surface tension liquid to each circle.
4. At some point, all circles will have expanded to completely fill the space.
5. The membranes will define your optimum solution.

Soap bubbles with micropipets to inflate them may work equally well.

Clint

Clint Bowman              INTERNET: cl...@ecy.wa.gov
Air Quality Modeler       INTERNET: cl...@math.utah.edu
Department of Ecology     VOICE:    (360) 407-6815
PO Box 47600              FAX:      (360) 407-7534
Olympia, WA 98504-7600

USPS:    PO Box 47600, Olympia, WA 98504-7600
Parcels: 300 Desmond Drive, Lacey, WA 98503-1274
[R] print.data.frame to string?
dear R experts---is there a function that prints a data frame to a string? cat() cannot handle lists, so I cannot write cat("your data frame is:\n", df, "\n").

regards,
/iaw

Ivo Welch (ivo.we...@gmail.com)
Re: [R] print.data.frame to string?
What do you mean by prints? You can use capture.output to get what would regularly be printed to the screen into a text vector, or use dput to get a version of an object that could be read back into another R session.

--
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com
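Both suggestions can be tried on a toy data frame. The sketch below is only an illustration (mydf and its columns are made up): capture.output() returns one character element per printed line, and collapsing those with newlines gives a single string that cat() accepts.

```r
mydf <- data.frame(id = 1:2, name = c("a", "b"))

# One character element per printed line (header plus two rows here).
lines <- capture.output(print(mydf))

# Collapse to a single string with embedded newlines, ready for cat().
txt <- paste(lines, collapse = "\n")
cat("your data frame is:\n", txt, "\n", sep = "")

# dput() output can be captured the same way and parsed back into an object.
code <- paste(capture.output(dput(mydf)), collapse = "\n")
isTRUE(all.equal(eval(parse(text = code)), mydf))
```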
Re: [R] print.data.frame to string?
capture.output(print(mydf))

note that df is a base function... best to not use it as a variable.
---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.
Re: [R] svychisq??
Hello forum,

I want to do a test of independence with svychisq, but I get an error. This is my code:

am18 <- read.spss("C:/Users/diana/Dropbox/Semestre 10/Tesis 10/Tesis Diana/AMcomuna18-29MAR2012.sav", use.value.labels=TRUE, max.value.labels=Inf, to.data.frame=TRUE)
b <- matrix(c(am18$N6_MANZANA), ncol=1)
c <- matrix(c(am18$PM1_1_PONDEMUESTRA), ncol=1)
d <- matrix(c(am18$M1_3_ESTRATO), ncol=1)
e <- matrix(c(rep(0.078,315)), ncol=1)
Muestra.comp <- svydesign(id=~b, strata=~d, nest=TRUE, weights=~c, data=am18, fpc=~e)
ocupacion <- matrix(c(am18$M1_19_OCUPACIONPRINCIPALACTUAL), ncol=1)
APES <- matrix(c(am18$M3_11_AUTOPERCEPCIONSALUDGENERAL), ncol=1)
tbl1 <- svytable(~ocupacion+APES, Muestra.comp)
summary(tbl1, statistic="Chisq")
Error in `[.data.frame`(design$variables, , as.character(rows)) :
  undefined columns selected

When I print am18, the end of the output says it is a data.frame (to.data.frame = TRUE), and I think that is why I get the error. I would appreciate help with this problem.
Re: [R] Probably a good use for apply
This is great, thank you. I think I am getting the hang of some of the apply functions. I am stuck again, however. I have the list test_ below and would like to apply the sample function using each element of each vector as the probability and return a TRUE or FALSE; ultimately I will sum the TRUEs by vector.

test_ <- list(a=c(.85,.10), b=c(.99,.05))

#Write a function to sample based on labor force participation rates to determine presence of workers in household
sampleWorker <- function(x) return(sample(c(TRUE,FALSE), x, replace = TRUE, prob = c(x, 1-x)))

IsWorker.Hh_ <- lapply(test_, sampleWorker)

I am doing something wrong with the setup because I am getting an error about specifying probabilities incorrectly. The result I am looking for is for IsWorker_ to be the following (assuming the .85 and .99 probabilities 'win' from each vector and the lower values do not):

IsWorker_
$a
[1] TRUE
$b
[1] TRUE

but ultimately I will need to sum the TRUEs for each vector:

IsWorker_
$a
[1] 1
$b
[1] 1

Thanks,
Josh
[R] Prediction of the lme part of a gamm model estimated with mgcv
Dear useRs,

I'm using the mgcv package to estimate a model with a smooth effect and a spatial covariance matrix. I use the command predict(model$lme) to obtain a prediction with the fixed and random parts of the model, but I also need this kind of prediction on a grid with different points than the ones used to estimate the model.

I read this:

http://r.789695.n4.nabble.com/mgcv-gamm-predict-to-reflect-random-s-effects-td3622738.html

which explains that the problem is "Basically gamm treats all random effects as 'part of the noise' in the model specification." From this discussion it seems that there is a way (maybe really difficult) to obtain a prediction from the lme part of a gamm model, but I really don't understand how. Can someone help me?
Re: [R] How to aggregate combinations
Thanks a lot, this is what I was looking for. All the best -- View this message in context: http://r.789695.n4.nabble.com/How-to-aggregate-combinations-tp4631867p4631980.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] time-series statistics collection
Hello,

I am trying to collect several global measures or statistics for time series, as well as R packages that can compute them. I have found several of them in papers and books, but the literature is so big that I am sure I am missing several of them.

- skewness
- kurtosis
- min
- max
- mean
- SD
- trend
- seasonality
- periodicity
- chaos (Lyapunov exponent) / largest Lyapunov exponent (I think this is the same statistic)
- serial correlation / auto-correlation (this is the same, if I am correct: Box-Pierce autocorrelation sum)
- higher-order autocorrelation
- nonlinearity (Terasvirta test)
- self similarity (Hurst exponent)
- mutual information sum

Are there any other statistics that I am missing? Maybe other useful tests? Or books/papers where I could find more? Also, are there any packages that can compute some or all of them?

Best,
PA
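Several of the simpler measures in the list need nothing beyond base R and the stats package. The sketch below is only an illustration on a simulated series; the skewness and kurtosis lines are plain moment estimators written out by hand, not a particular package's definition. For the nonlinearity test, terasvirta.test() in the tseries package is, as far as I know, one available implementation.

```r
set.seed(1)
x <- as.numeric(arima.sim(list(ar = 0.6), n = 200))  # simulated AR(1) series

m <- mean(x)
s <- sd(x)
z <- (x - m) / s

stats_x <- c(
  min      = min(x),
  max      = max(x),
  mean     = m,
  sd       = s,
  skewness = mean(z^3),                     # simple moment estimator
  kurtosis = mean(z^4),                     # about 3 for Gaussian data
  acf1     = acf(x, plot = FALSE)$acf[2]    # lag-1 autocorrelation
)
round(stats_x, 3)

# Box-Pierce portmanteau test on the first 10 autocorrelations.
Box.test(x, lag = 10, type = "Box-Pierce")
```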
Re: [R] Quadrat counting with spatstat
Hello,

There is more than one way to do it. I would divide space according to weighted distance.

Specify a distance function. Euclidean distance will give you boundaries that consist of ellipse segments. Manhattan distance will give you straight lines, which may be preferable.

Assign to every circle i a distance weight wd[i]. You can start with equal weights. Associate every point j in space with the circle i to which it has the smallest weighted distance distance(i,j)/wd[i]. This will divide space into segments, each containing one circle.

Use a set of points equally distributed over space (e.g. grid points or random) and calculate how the segment areas relate to each other by counting how many points fall into each segment.

Adjust the distance weights - the larger the distance weight, the larger the area around the circle - until the areas are equal enough for your purpose. Exploit symmetries by keeping distance weights equal that should be equal due to symmetry. You can run an optimization algorithm by using an evaluation function that is minimal for equal areas.

Hope this helps!

Take care
Oliver

--
Oliver Ruebenacker
Bioinformatics Consultant (http://www.knowomics.com/wiki/Oliver_Ruebenacker)
Knowomics, The Bioinformatics Network (http://www.knowomics.com)
SBPAX: Turning Bio Knowledge into Math Models (http://www.sbpax.org)
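The weighted-distance procedure Oliver describes can be sketched in a few lines. Everything specific here is an assumption made for illustration: four made-up circle centres in a unit square, Euclidean distance, a regular grid for counting, and a simple damped multiplicative weight update rather than a proper optimizer.

```r
# Made-up circle centres in a unit-square plot (hypothetical coordinates).
centers <- cbind(x = c(0.20, 0.50, 0.80, 0.35),
                 y = c(0.30, 0.70, 0.40, 0.85))
k <- nrow(centers)

# Evenly spaced sample points used to approximate areas by counting.
grid <- as.matrix(expand.grid(x = seq(0.01, 0.99, by = 0.02),
                              y = seq(0.01, 0.99, by = 0.02)))

# Assign every grid point to the centre with the smallest distance / weight.
assign_points <- function(w) {
  d <- sapply(seq_len(k), function(i) {
    sqrt((grid[, 1] - centers[i, 1])^2 + (grid[, 2] - centers[i, 2])^2) / w[i]
  })
  max.col(-d, ties.method = "first")
}

# Fraction of grid points (approximate area share) falling in each segment.
area_share <- function(w) tabulate(assign_points(w), nbins = k) / nrow(grid)

w <- rep(1, k)
share0 <- area_share(w)            # shares with equal weights
for (iter in 1:200) {
  share <- area_share(w)
  # Damped update: raise weights of undersized segments, lower oversized ones.
  w <- w * ((1 / k) / pmax(share, 1e-6))^0.25
  w <- w / mean(w)
}
share <- area_share(w)
round(rbind(equal_weights = share0, adjusted = share), 3)
```

With 20 circles the same code applies after replacing centers; a finer grid gives a better area estimate at the cost of run time.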
[R] splines and ns equation
Hi,

I am looking at the change in N concentration in plant roots over 4 time points and I have fit a spline to the data using ns and lme:

fit10 <- lme( N~ns(day, 3), data = rcn10G)

I may want to adjust the model a little bit, but for now, let's assume it's good. I get output for the fixed effects:

Fixed: N ~ ns(day, 3)
(Intercept) ns(day, 3)1 ns(day, 3)2 ns(day, 3)3
 1.15676524  0.14509171  0.04459627  0.09334428

and coefficients for each experimental unit in my experiment:

   (Intercept) ns(day, 3)1 ns(day, 3)2 ns(day, 3)3
24    1.050360 -0.42666159 -0.56290877 -0.10714407
13    1.104464 -0.30825350 -0.53311653 -0.05558150
31    1.147878 -0.14548512 -0.78673906 -0.07231781
46    1.177781 -0.22278380 -0.80278177 -0.02321460
15    1.144215 -0.04484519 -0.06084798  0.07633663
32    1.213007  0.00741061  0.03896933  0.15325849
23    1.274615  0.16477514  0.00872224  0.23128320
41    1.215626  0.57050767  0.11415467  0.10608867
43    1.134203  0.48070741  0.72112899  0.18108193
12    1.091422  0.39563632  1.01521528  0.22597459
21    1.100631  0.44589314  0.98526322  0.23535739
35    1.226980  0.82419937  0.39809568  0.16900841

NOW, I want to write a spline function where I can incorporate these coefficients to get the predicted N concentration value for each day. However, I am having trouble finding the right spline equation, since there are many forms on the internets. I know it won't be a simple one, but can someone direct me to the equation that would be best to use for ns?

Thanks a lot,
Ranae
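One way to avoid writing out the spline equation by hand is to let ns() evaluate its own basis at the days of interest and multiply by the coefficients. The sketch below is an illustration under assumptions: the data are simulated, and lm() stands in for lme() so that only the splines and stats packages are needed. With the lme fit, a row of the per-unit coefficient table would presumably take the place of coef(fit).

```r
library(splines)

set.seed(1)
day <- rep(seq(0, 28, by = 7), each = 4)                 # made-up sampling days
N   <- 1.1 + 0.01 * day + rnorm(length(day), sd = 0.05)  # made-up N concentrations

fit   <- lm(N ~ ns(day, 3))
basis <- ns(day, 3)     # same basis as in the formula; knots kept as attributes

new_day   <- c(3, 10, 21)
new_basis <- predict(basis, new_day)    # basis evaluated at the new days

# Prediction "by hand": intercept column plus basis columns, times coefficients.
pred_hand <- drop(cbind(1, new_basis) %*% coef(fit))

# The model's own prediction, for comparison.
pred_lm <- predict(fit, newdata = data.frame(day = new_day))
all.equal(unname(pred_hand), unname(pred_lm))
```

The knots that define the basis can be read from attr(basis, "knots") and attr(basis, "Boundary.knots").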
Re: [R] print.data.frame to string?
thanks, jeff. no, not capture.output(), but thanks for pointing me to it (I did not know it). capture.output flattens the data frame. I want the print.data.frame output, so that I can feed it to cat, and get reasonable newlines, too.

regards,
/iaw

Ivo Welch (ivo.we...@gmail.com)
J. Fred Weston Professor of Finance
Anderson School at UCLA, C519
http://www.ivo-welch.info/
Editor, Critical Finance Review, http://www.critical-finance-review.org/

On Thu, May 31, 2012 at 1:19 PM, Jeff Newmiller jdnew...@dcn.davis.ca.us wrote:
> capture.output(print(mydf))
>
> note that df is a base function... best to not use it as a variable.
>
> Sent from my phone. Please excuse my brevity.
>
> ivo welch ivo.we...@gmail.com wrote:
>> dear R experts---is there a function that prints a data frame to a string? cat() cannot handle lists, so I cannot write cat("your data frame is:\n", df, "\n").
>>
>> regards,
>> /iaw
>> Ivo Welch (ivo.we...@gmail.com)
Re: [R] Probably a good use for apply
Hi,

On Thu, May 31, 2012 at 1:08 PM, LCOG1 jr...@lcog.org wrote:
> This is great, thank you. I think I am getting the hang of some of the apply functions. I am stuck again, however. I have the list test_ below and would like to apply the sample function using each element of each vector as the probability, returning a TRUE or FALSE; ultimately I will sum the TRUEs by vector.
>
> test_ <- list(a=c(.85,.10), b=c(.99,.05))
> # Write a function to sample based on labor force participation rates to determine presence of workers in household
> sampleWorker <- function(x) return(sample(c(TRUE,FALSE), x, replace = TRUE, prob = c(x, 1-x)))

Your first problem is that sampleWorker() doesn't run with a single component of test_, so it can't possibly run in an apply statement.

Please reread ?sample - the second argument is the size of the desired sample, but what you are passing is a non-integer vector of length 2. What do you actually want this to be?

Then for prob, you're passing c(x, 1-x), but x is again a non-integer vector of length 2, so that results in a vector of length 4, which is longer than the number of options sample() is choosing from.

Do you perhaps want to pass only a single probability at a time? But even then you need to resolve the size problem.

Sarah

> IsWorker.Hh_ <- lapply(test_, sampleWorker)
>
> I am doing something wrong with the setup because I am getting an error about specifying probabilities incorrectly. The result I am looking for is for IsWorker_ to be (assuming the .85 and .99 probabilities 'win' from each vector and the lower values do not):
>
> IsWorker_
> $a
> [1] TRUE
> $b
> [1] TRUE
>
> but ultimately I will need to sum the TRUEs for each vector:
>
> IsWorker_
> $a
> [1] 1
> $b
> [1] 1
>
> Thanks
> Josh

-- Sarah Goslee http://www.functionaldiversity.org
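The two problems pointed out above (size and prob) can be illustrated with a small sketch. It assumes, as one possible reading of the question, that each probability in a vector should produce exactly one TRUE/FALSE draw; the names `draw_one` and `nWorkers_` are made up for this example.

```r
test_ <- list(a = c(.85, .10), b = c(.99, .05))

# One draw for a single probability p: size is 1, prob has length 2
draw_one <- function(p) {
  sample(c(TRUE, FALSE), size = 1, prob = c(p, 1 - p))
}

# Apply the single-probability draw to every element of a vector
sampleWorker <- function(x) {
  vapply(x, draw_one, logical(1))
}

set.seed(42)
IsWorker_ <- lapply(test_, sampleWorker)   # list of logical vectors
nWorkers_ <- lapply(IsWorker_, sum)        # number of TRUEs per vector

IsWorker_
nWorkers_
```

Because the draws are random, the result will not always be TRUE for the high probabilities and FALSE for the low ones. An equivalent vectorised form of the same idea is `runif(length(x)) < x`.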
[R] RScript.exe and map directory issue
Hi,

I'm trying to run a script file with Rscript.exe on Windows 7 from within Excel 2010, using the following code:

    Call Shell(rPath & "\Rscript.exe C:\Work\Latest\_Test.R", vbHide)

The good news is that the above code works perfectly, but if I add white space to my directory path, like:

    Call Shell(rPath & "\Rscript.exe C:\Work\Latest 1\_Test.R", vbHide)

then Rscript.exe doesn't run the file anymore. Could someone explain why this happens and how I can deal with it?

Kind regards,
Bert
Re: [R] Quadrat counting with spatstat
On Thu, May 31, 2012 at 08:23:02AM -0700, AMFTom wrote:
> I have photographs of plots that look like so:
> http://r.789695.n4.nabble.com/file/n4631960/Untitled.jpg
> I need to divide it up so each circle has an equal area surrounding it. So into 20 equal segments, each of which contains a circle. Quadratcount is not sufficient because if I divide it up into 36 equal quadrats, some quadrats do not contain one of the circles. I'm not even sure how to do it mathematically, let alone using R.

Hi.

Try the following.

    a <- rbind(
      c(-1, -1),
      c(-1,  1),
      c( 1,  1),
      c( 1, -1),
      c(-1, -1))
    plot(a, type="l")
    p <- rbind(
      c(0, 0),
      c(-0.6,  0.6),
      c(-0.6, -0.6),
      c( 0.6,  0.6),
      c( 0.6, -0.6))
    points(p, col=4, pch=20, cex=4)
    v <- sqrt(2/5)
    b <- rbind(
      c(-1,  0),
      c( 0, -1),
      c( 1,  0),
      c( 0,  1))
    for (i in 1:4) {
      lines(rbind(b[i, ], v*b[i, ]))
      lines(v*rbind(b[i, ], b[(i %% 4) + 1, ]))
    }

This divides a square into 5 equal regions. The area of the middle square is 2 v^2 = 4/5 and the area of each of the four remaining parts is 1 - v^2/2 = 4/5.

If the above is repeated in a 2 by 2 grid, we get a partition of a larger square into 20 equal regions. I did not check whether they contain the required points, since I do not know their exact coordinates, but they could.

Hope this helps.

Petr Savicky.
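The two area claims above (2 v^2 = 4/5 for the middle square and 1 - v^2/2 = 4/5 for each outer part) can be checked numerically. This is a small sketch using the shoelace formula; the helper `poly_area` and the vertex lists are written out here for illustration and are not part of the original reply.

```r
# Shoelace formula: area of a simple polygon given as a two-column matrix
# of vertices listed in order around the boundary
poly_area <- function(xy) {
  x <- xy[, 1]
  y <- xy[, 2]
  j <- c(2:nrow(xy), 1)            # index of the next vertex, wrapping around
  abs(sum(x * y[j] - x[j] * y)) / 2
}

v <- sqrt(2/5)

# Middle region: the rotated square with vertices at distance v on the axes
middle <- rbind(c(-v, 0), c(0, -v), c(v, 0), c(0, v))

# One outer region: the lower-left quadrant minus its share of the middle square
corner <- rbind(c(-1, 0), c(-1, -1), c(0, -1), c(0, -v), c(-v, 0))

poly_area(middle)   # compare with 2 * v^2
poly_area(corner)   # compare with 1 - v^2 / 2
```

By symmetry the other three outer regions have the same area as `corner`, so the five areas add up to the area 4 of the square [-1, 1] x [-1, 1].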
[R] Expressions returned by a Package function
Hello,

I have a function 'ewrap' (see below for definition). It takes 3 expressions and returns another expression, e.g.

    map <- ewrap({
      len <- length(r$addon)
      rhcollect(len, 1)
    })

becomes:

    expression({
        NULL
        result <- mapply(function(.index, k, r) {
            {
                len <- length(r$addon)
                rhcollect(len, 1)
            }
        }, 1:length(map.values), map.keys, map.values)
        NULL
    })
    attr(,"class")
    [1] "expression" "rhmr-map"

ewrap is defined in the GlobalEnv. In my package (Rhipe), a function rhwrap has the exact same definition:

    rhwrap <- function(co1=NULL,before=NULL,after=NULL){
      co <- substitute(co1); before=substitute(before)
      j <- as.expression(bquote({
        .(BE)
        result <- mapply(function(.index,k,r){
          .(CO)
        },1:length(map.values),map.keys,map.values)
        .(AF)
      },list(CO=co,BE=before,AF=after)))
      class(j) <- c(class(j),"rhmr-map")
      j
    }

but the following two are different:

    map <- ewrap({
      len <- length(r$addon)
      rhcollect(len, 1)
    })

and

    map2 <- rhwrap({
      len <- length(r$addon)
      rhcollect(len, 1)
    })

(because serialize(map,NULL) != serialize(map2,NULL))

I guess this is because both functions (ewrap and rhwrap) return the environment in which they are defined, and in the case of rhwrap this is the Rhipe package namespace/environment (I'm not sure what jargon I should use here).

So my questions are:
1. How do I inspect the extra information that rhwrap is adding to its return value?
2. How do I remove this, so that it behaves like ewrap?

Thanks in advance
Saptarshi

    ewrap <- function(co1=NULL,before=NULL,after=NULL){
      co <- substitute(co1); before=substitute(before)
      j <- as.expression(bquote({
        .(BE)
        result <- mapply(function(.index,k,r){
          .(CO)
        },1:length(map.values),map.keys,map.values)
        .(AF)
      },list(CO=co,BE=before,AF=after)))
      class(j) <- c(class(j),"rhmr-map")
      j
    }
Re: [R] bigglm binomial negative fitted value
On Fri, Jun 1, 2012 at 1:17 AM, Yue Guan pipeha...@gmail.com wrote:
> Hi, there
>
> Since glm cannot handle factors very well, I am trying to use bigglm like this:
>
> logit_model <- bigglm(responser~var1+var2+var3, data, chunksize=1000, family=binomial(), weights=~trial, sandwich=FALSE)
> fitted <- predict(logit_model, data)
>
> Only var2 is a factor; var1 and var3 are numeric. I expect fitted to be a vector of values that fall in (0,1). However, I get something like this:
>
> str(fitted)
>  num [1:260617, 1] -0.0564 -0.0564 -0.1817 -0.1842 -0.1852 ...
>  - attr(*, "dimnames")=List of 2
>   ..$ : chr [1:260617] "1" "2" "3" "4" ...
>   ..$ : NULL

As the help says, the default is predictions of the linear predictor. To get predictions of the probability, use type="response"

-thomas

-- Thomas Lumley Professor of Biostatistics University of Auckland
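The difference between the linear predictor and the probability scale can be seen in a small sketch. To stay self-contained it uses base R's `glm` on simulated data as a stand-in for `bigglm`; the data frame `d` and the variable names are invented for the example.

```r
# Simulated binary data (hypothetical, for illustration only)
set.seed(1)
d <- data.frame(x = rnorm(200))
d$y <- rbinom(200, 1, plogis(-0.5 + d$x))

fit <- glm(y ~ x, data = d, family = binomial())

# Default prediction: the linear predictor (log-odds), which can be negative
eta <- predict(fit, d)

# Probability scale: values strictly between 0 and 1
p <- predict(fit, d, type = "response")

range(eta)
range(p)

# For a logit link the two are related by the inverse logit
all.equal(unname(plogis(eta)), unname(p))
```

Negative values from the default prediction are therefore log-odds below zero, that is, probabilities below one half, rather than a sign of a broken fit.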
Re: [R] community finding in a graph and heatplot
On Thu, May 31, 2012 at 12:08 PM, Aziz, Muhammad Fayez az...@illinois.edu wrote:
> Thank you so much Gabor for your reply. I had spotted your post earlier and it worked like a charm. Interestingly, I have just run into trouble with the statement
>
> dend <- igraph:::as.dendrogram.igraph.walktrap(fgc)
>
> Apparently the members are empty, as when I print(dend) it says
>
> 'dendrogram' with 2 branches and members total, at height 93
>
> while the error from using dend with dendrapply remains
>
> Error in `[[.dendrogram`(X, 2L) : attempt to set an attribute on NULL
>
> Any ideas?

I would need to see fgc for this. Can you send it to me in private? Or send some self-contained example that generates the same error?

Gabor

> My code looks like this:
>
> File2Open = paste(FilePath, "NetworkFiles\\net\\", NetPrefix, , TPPostfix, ".net", sep = "")
> g <- read.graph(File2Open, format="pajek")
> g <- delete.isolates(g)
> g <- simplify(g)
> fgc <- fastgreedy.community(g, modularity=TRUE, weights = E(g)$weight)
> ModularityIndexfgc <- max(fgc$modularity) # fgc modularity
> ModularityIndexng <- modularity(g, membership, weights = E(g)$weight) # newman-girvan modularity
> dend <- igraph:::as.dendrogram.igraph.walktrap(fgc)
> png(filename = paste(FilePath, "Analysis\\Graphs\\EColiStressModuleHeatMap", NetPrefixAbbr, TPPostfix, ".png", sep = ""), width = 800, height = 800) # heat map is square
> adjMatrix = get.adjacency(g, attr="weight")
> DendNodeCounter <- 0 # counter for ColorGroupsOrdered
> ColorGroupsOrdered <- rep("red", vcount(g))
> dendrapply(dend, colLab) # modifies ColorGroupsOrdered
>
> From: csardi.ga...@gmail.com [csardi.ga...@gmail.com] on behalf of Gábor Csárdi [csa...@rmki.kfki.hu]
> Sent: Thursday, May 31, 2012 10:45 AM
> To: Aziz, Muhammad Fayez
> Cc: r-help@r-project.org
> Subject: Re: [R] community finding in a graph and heatplot
>
> On Tue, May 29, 2012 at 1:16 AM, Aziz, Muhammad Fayez az...@illinois.edu wrote:
>> Hi everyone,
>>
>> I am using the fastgreedy.community function to get the $merges matrix and the $modularity vector. This serves my purpose of testing the modularity of my graph. But I would also like to plot the heat map and dendrogram based on the $merges dendrogram matrix. I know that heatplot does the graphics part, but I am not sure whether the dendrogram generated by heatplot will match the one given by fastgreedy.community in all cases, and whether the heat map will represent the same clustering.
>
> No, they are different. To plot fast-greedy results as a dendrogram, see this and the follow-ups:
> http://lists.gnu.org/archive/html/igraph-help/2010-11/msg00059.html
>
> Gabor
>
>> Tell me if my apprehension is incorrect. Otherwise please let me know of any alternatives. Here is the code I am testing so far:
>>
>> # http://igraph.sourceforge.net/doc/R/modularity.html
>> # http://igraph.sourceforge.net/doc/R/fastgreedy.community.html
>> # http://igraph.sourceforge.net/doc/R/graph.constructors.html
>> library(igraph)
>> library(made4)
>> g <- graph(c(1,2, 2,3, 3,1, 4,5)-1, , FALSE)
>> print(g)
>> ModuleInfo <- fastgreedy.community(g)
>> print(ModuleInfo)
>> heatplot(c(1,2, 2,3, 3,1, 4,5))
>>
>> Thanks
>> Fayez
>> Grad student
>> UIUC IL, USA

-- Gabor Csardi csa...@rmki.kfki.hu MTA KFKI RMKI
[R] Repost: Expressions returned by GlobalEnv functions and package functions
Hello,

(Sorry for the repost, I am resending in plain text.)

I have a function 'ewrap' (see below for definition). It takes 3 expressions and returns another expression, e.g.

    map <- ewrap({
      len <- length(r$addon)
      rhcollect(len, 1)
    })

becomes:

    expression({
        NULL
        result <- mapply(function(.index, k, r) {
            {
                len <- length(r$addon)
                rhcollect(len, 1)
            }
        }, 1:length(map.values), map.keys, map.values)
        NULL
    })
    attr(,"class")
    [1] "expression" "rhmr-map"

ewrap is defined in the GlobalEnv. In my package (Rhipe), a function rhwrap has the exact same definition:

    rhwrap <- function(co1=NULL,before=NULL,after=NULL){
      co <- substitute(co1); before=substitute(before)
      j <- as.expression(bquote({
        .(BE)
        result <- mapply(function(.index,k,r){
          .(CO)
        },1:length(map.values),map.keys,map.values)
        .(AF)
      },list(CO=co,BE=before,AF=after)))
      class(j) <- c(class(j),"rhmr-map")
      j
    }

but the following two are different:

    map <- ewrap({
      len <- length(r$addon)
      rhcollect(len, 1)
    })

and

    map2 <- rhwrap({
      len <- length(r$addon)
      rhcollect(len, 1)
    })

(because serialize(map,NULL) != serialize(map2,NULL))

I guess this is because both functions (ewrap and rhwrap) return the environment in which they are defined, and in the case of rhwrap this is the Rhipe package namespace/environment (I'm not sure what jargon I should use here).

So my questions are:
1. How do I inspect the extra information that rhwrap is adding to its return value?
2. How do I remove this, so that it behaves like ewrap?

Thanks in advance
Saptarshi
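For the first question above (how to inspect what differs), a few base R tools may help. The sketch below is not taken from the thread and does not claim to identify the actual cause in Rhipe; it only illustrates one known way in which two otherwise identical unevaluated `function(...)` calls can differ, namely source references that the parser attaches when sources are kept. The names `e_src` and `e_nosrc` are made up for the example.

```r
# Parse the same code twice: once keeping source references, once not
e_src   <- parse(text = "function(.index, k, r) k", keep.source = TRUE)[[1]]
e_nosrc <- parse(text = "function(.index, k, r) k", keep.source = FALSE)[[1]]

# Both are unevaluated calls to `function`, so no environment is attached yet
class(e_src)
class(e_nosrc)

# The two calls nevertheless differ
identical(e_src, e_nosrc)

# Walk the elements of the call to see where: the source reference,
# when present, is stored as the last element of the call
lapply(as.list(e_src), class)
lapply(as.list(e_nosrc), class)

# Serialized forms differ as well
identical(serialize(e_src, NULL), serialize(e_nosrc, NULL))
```

Applying `as.list()`, `attributes()` and `identical()` element by element to `map` and `map2` in the same way is one possible route to locating the difference. Whether source references are kept is controlled by the `keep.source` and `keep.source.pkgs` options, so it may be worth checking these as a candidate explanation.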
Re: [R] svychisq??
On Fri, Jun 1, 2012 at 6:07 AM, Diana Marcela Martinez Ruiz dianamm...@hotmail.com wrote:
> Hello forum,
>
> I want to do a test of independence with svychisq, but I get an error. This is my code:
>
> am18 <- read.spss("C:/Users/diana/Dropbox/Semestre 10/Tesis 10/Tesis Diana/AMcomuna18-29MAR2012.sav", use.value.labels=TRUE, max.value.labels=Inf, to.data.frame=TRUE)
> b<-matrix(c(am18$N6_MANZANA),ncol=1)
> c<-matrix(c(am18$PM1_1_PONDEMUESTRA),ncol=1)
> d<-matrix(c(am18$M1_3_ESTRATO),ncol=1)
> e<-matrix(c(rep(0.078,315)),ncol=1)
> Muestra.comp<-svydesign(id=~b,strata =~d,nest=TRUE,weights=~c,data=am18, fpc=~e)
> ocupacion <-matrix(c(am18$M1_19_OCUPACIONPRINCIPALACTUAL),ncol=1)
> APES<-matrix(c(am18$M3_11_AUTOPERCEPCIONSALUDGENERAL),ncol=1)
> tbl1<-svytable(~ocupacion+APES,Muestra.comp)
> summary(tbl1, statistic="Chisq")
>
> Error in `[.data.frame`(design$variables, , as.character(rows)) : undefined columns selected
>
> When I print am18 at the end, it says it is a data.frame (to.data.frame = TRUE), and with that the error appears. I would appreciate help with this problem.

Variables that you refer to with a formula have to be in the design object. You don't need to turn the variables into matrices, so you could just do

    tbl1 <- svytable(~M1_19_OCUPACIONPRINCIPALACTUAL+M3_11_AUTOPERCEPCIONSALUDGENERAL, Muestra.comp)

or, if you want shorter names, create renamed variables in the design object:

    Muestra.comp <- update(Muestra.comp, ocupacion = M1_19_OCUPACIONPRINCIPALACTUAL, APES = M3_11_AUTOPERCEPCIONSALUDGENERAL)

-thomas

-- Thomas Lumley Professor of Biostatistics University of Auckland
Re: [R] RScript.exe and map directory issue
Not on Windows so I can't test, but I imagine you need to escape the space. Try this:

    Call Shell(rPath & "\Rscript.exe C:\Work\Latest\ 1\_Test.R", vbHide)

Michael

On Thu, May 31, 2012 at 4:40 PM, Bert Jacobs bert.jac...@figurestofacts.be wrote:
> Hi,
>
> I'm trying to run a script file with Rscript.exe on Windows 7 from within Excel 2010, using the following code:
>
> Call Shell(rPath & "\Rscript.exe C:\Work\Latest\_Test.R", vbHide)
>
> The good news is that the above code works perfectly, but if I add white space to my directory path, like:
>
> Call Shell(rPath & "\Rscript.exe C:\Work\Latest 1\_Test.R", vbHide)
>
> then Rscript.exe doesn't run the file anymore. Could someone explain why this happens and how I can deal with it?
>
> Kind regards,
> Bert
Re: [R] print.data.frame to string?
It will work if you paste a "\n" to the end of each line:

    a <- data.frame(x=runif(4), y=runif(4), z=runif(4))
    b <- capture.output(a)
    c <- paste(b, "\n", sep="")
    cat("Your data set is:\n", c, "\n")

-- David L Carlson Associate Professor of Anthropology Texas A&M University College Station, TX 77843-4352

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of ivo welch
Sent: Thursday, May 31, 2012 3:31 PM
To: Jeff Newmiller
Cc: r-help
Subject: Re: [R] print.data.frame to string?

> thanks, jeff. no, not capture.output(), but thanks for pointing me to it (I did not know it). capture.output flattens the data frame. I want the print.data.frame output, so that I can feed it to cat, and get reasonable newlines, too.
>
> regards,
> /iaw
>
> Ivo Welch (ivo.we...@gmail.com)
> J. Fred Weston Professor of Finance
> Anderson School at UCLA, C519
> http://www.ivo-welch.info/
> Editor, Critical Finance Review, http://www.critical-finance-review.org/
>
> On Thu, May 31, 2012 at 1:19 PM, Jeff Newmiller jdnew...@dcn.davis.ca.us wrote:
>> capture.output(print(mydf))
>>
>> note that df is a base function... best to not use it as a variable.
>>
>> Sent from my phone. Please excuse my brevity.
>>
>> ivo welch ivo.we...@gmail.com wrote:
>>> dear R experts---is there a function that prints a data frame to a string? cat() cannot handle lists, so I cannot write cat("your data frame is:\n", df, "\n").
>>>
>>> regards,
>>> /iaw
>>> Ivo Welch (ivo.we...@gmail.com)
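A variation on the approach above is to collapse the captured lines into a single string, which can then be passed to cat() or stored. This is a minimal sketch; the data frame `mydf` and the variable `txt` are placeholders invented for the example.

```r
mydf <- data.frame(x = 1:3, y = c("a", "b", "c"))

# capture.output() returns one character element per printed line;
# collapsing with "\n" yields a single string with the printed layout intact
txt <- paste(capture.output(print(mydf)), collapse = "\n")

# sep = "" avoids the extra space that cat() would otherwise insert
# between its arguments
cat("your data frame is:\n", txt, "\n", sep = "")
```

Compared with pasting "\n" onto each line, the collapsed string avoids the leading space that cat() adds before every element after the first when called with its default separator.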
Re: [R] RScript.exe and map directory issue
Thx Michael,

After some testing (with more luck than craftsmanship) it appears that the following code works:

    Call Shell(rPath & "\Rscript.exe ""C:\Work\Latest 1\_Test.R""", vbHide)

I'll also try your solution.

SY,
Bert

-----Original Message-----
From: R. Michael Weylandt [mailto:michael.weyla...@gmail.com]
Sent: 31 May 2012 23:51
To: Bert Jacobs
Cc: r-help@r-project.org
Subject: Re: [R] RScript.exe and map directory issue

> Not on Windows so I can't test, but I imagine you need to escape the space. Try this:
>
> Call Shell(rPath & "\Rscript.exe C:\Work\Latest\ 1\_Test.R", vbHide)
>
> Michael
>
> On Thu, May 31, 2012 at 4:40 PM, Bert Jacobs bert.jac...@figurestofacts.be wrote:
>> Hi,
>>
>> I'm trying to run a script file with Rscript.exe on Windows 7 from within Excel 2010, using the following code:
>>
>> Call Shell(rPath & "\Rscript.exe C:\Work\Latest\_Test.R", vbHide)
>>
>> The good news is that the above code works perfectly, but if I add white space to my directory path, like:
>>
>> Call Shell(rPath & "\Rscript.exe C:\Work\Latest 1\_Test.R", vbHide)
>>
>> then Rscript.exe doesn't run the file anymore. Could someone explain why this happens and how I can deal with it?
>>
>> Kind regards,
>> Bert
Re: [R] print.data.frame to string?
great. thanks. exactly what I wanted.

/iaw

Ivo Welch (ivo.we...@gmail.com)

On Thu, May 31, 2012 at 2:53 PM, David L Carlson dcarl...@tamu.edu wrote:
> a <- data.frame(x=runif(4), y=runif(4), z=runif(4))
> b <- capture.output(a)
> c <- paste(b, "\n", sep="")
> cat("Your data set is:\n", c, "\n")
Re: [R] bigglm binomial negative fitted value
Thank you very much. I did overlook something.

On Thu, May 31, 2012 at 5:20 PM, Thomas Lumley tlum...@uw.edu wrote:
> On Fri, Jun 1, 2012 at 1:17 AM, Yue Guan pipeha...@gmail.com wrote:
>> Hi, there
>>
>> Since glm cannot handle factors very well, I am trying to use bigglm like this:
>>
>> logit_model <- bigglm(responser~var1+var2+var3, data, chunksize=1000, family=binomial(), weights=~trial, sandwich=FALSE)
>> fitted <- predict(logit_model, data)
>>
>> Only var2 is a factor; var1 and var3 are numeric. I expect fitted to be a vector of values that fall in (0,1). However, I get something like this:
>>
>> str(fitted)
>>  num [1:260617, 1] -0.0564 -0.0564 -0.1817 -0.1842 -0.1852 ...
>>  - attr(*, "dimnames")=List of 2
>>   ..$ : chr [1:260617] "1" "2" "3" "4" ...
>>   ..$ : NULL
>
> As the help says, the default is predictions of the linear predictor. To get predictions of the probability, use type="response"
>
> -thomas
>
> -- Thomas Lumley Professor of Biostatistics University of Auckland