Re: [R] Date conversion problem using as.Date
Vegard Andersen <vegard.andersen at ism.uit.no> writes:

: Hello!
:
: My problem is that the Julian date behind my dates seems to be wrong. I
: will exemplify my problem.
:
: t1 <- "1998-11-20"
: t2 <- as.Date(t1)
: # Here t2 is correctly "1998-11-20", but
: date.mdy(t2)
: $month
: [1] 11
: $day
: [1] 19
: $year
: [1] 1988
:
: And indeed, if I write: fix(t2) then I get
: structure(10550, class = "Date").
: So the Julian date is 10550, which is 1988-11-19, not the
: correct 1998-11-20
:
: If I instead of as.Date use as.date, then things work ok. But I have
: not found out how to instruct as.date to handle dates from the 21st
: century.
:
: I hope that someone can help me, thanks in advance!

As already mentioned, the date class in survival uses 1960 as its origin:

R> as.date(0)
[1] 1Jan60

whereas the Date class uses 1970:

R> structure(0, class = "Date")
[1] "1970-01-01"

Regarding your other question, you can use a 4-digit year:

R> as.date("2Jan2001")
[1] 2Jan2001

or:

R> as.date.Date <- function(x) as.date(format(x), order = "ymd")
R> as.date.Date(t2)
[1] 20Nov98

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
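The origin mismatch described in this thread can be checked with base R alone. The following is a minimal sketch, not the date package itself: it assumes only that the date class counts days from 1960-01-01, as stated in the reply, and emulates that by passing an explicit origin to as.Date. The names n, misread, offset and fixed are illustrative.

```r
# Base R's Date class stores the number of days since 1970-01-01.
d <- as.Date("1998-11-20")
n <- as.numeric(d)                      # 10550

# Assumption: emulate a 1960-based day count by supplying the origin explicitly.
# Reading the 1970-based count against a 1960 origin reproduces the wrong date.
misread <- as.Date(n, origin = "1960-01-01")

# The two origins are 3653 days apart, so shifting by that offset fixes the count.
offset <- as.numeric(as.Date("1970-01-01") - as.Date("1960-01-01"))
fixed <- as.Date(n + offset, origin = "1960-01-01")

print(misread)   # 1988-11-19
print(fixed)     # 1998-11-20
```

Going through a character string, as the reply does with format(x), avoids the offset arithmetic altogether.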
Re: [R] Is a .R script file name available inside the script?
Darren Weber <darrenleeweber at gmail.com> writes:

: Hi,
:
: if we have a file called Rscript.R that contains the following, for example:
:
: x <- 1:100
: outfile = "Rscript.Rout"
: sink(outfile)
: print(x)
:
: and then we run
:
: source("Rscript.R")
:
: we get an output file called Rscript.Rout - great!
:
: Is there an internal variable, something like .Platform, that holds
: the script name when it is being executed? I would like to use that
: variable to define the output file name.

In R 2.0.1 try putting this in a file and sourcing it.

script.description <- function() eval.parent(quote(file), n = 3)
print(basename(script.description()))

If you are using R 2.1.0 (devel) then use this instead:

script.description <- function()
   showConnections()[as.character(eval.parent(quote(file), n = 3)), "description"]
print(basename(script.description()))
Re: [R] newbie question about beta distribution
faisal99 at inf.its-sby.edu writes:

: hi everyone,
: I'm still a newbie in statistics,
:
: I have a question about beta distribution, that is,
:
: On the ref/tutorials I've found on the net, why beta distribution always
: have value p(x) more than 1?

Consider the uniform distribution on the interval (0, 1/a), whose
probability density graph is a horizontal line at a. If a > 1 then
the probability density is greater than 1 at every point of its
support, showing that the density can indeed exceed 1.

: As I know, any probability density function always have value not more
: than 1?
:
: is there any one who can explain to me, I'm not statistics people, but I
: need to code that needing some of this distribution function.
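The uniform argument can be checked numerically. This is a small sketch using base R's dunif, dbeta and integrate; the value a = 4 and the beta shape parameters are arbitrary choices for illustration, not taken from the thread.

```r
# A uniform density on (0, 1/a) has height a, so it exceeds 1 whenever a > 1.
a <- 4
u <- dunif(0.1, min = 0, max = 1 / a)          # height of the density: 4

# A beta density can also rise above 1 (shape values chosen for illustration).
b <- dbeta(0.5, shape1 = 5, shape2 = 5)        # roughly 2.46

# What must equal 1 is the area under the curve, not its height.
area <- integrate(dbeta, lower = 0, upper = 1, shape1 = 5, shape2 = 5)$value

print(c(uniform = u, beta = b, area = area))
```

A density is a rate per unit of x, so only its integral over an interval is a probability and bounded by 1.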
Re: [R] Violin plot for discrete variables.
Witold Eryk Wolski <W.E.Wolski at ncl.ac.uk> writes:

: Dear Rgurus,
:
: To my knowledge the best way to visualize the distribution of a discrete
: variable X is
: plot(table(X))
:
: The problem which I have is the following. I have two discrete variables
: X and Y whose distributions I would like to compare. To overlay the
: distribution of Y with lines(table(Y)) gives not satisfying results.
: This is the same in case of using density or histogram.
:
: Hence, I am wondering if there is an equivalent of the vioplot function
: (package vioplot) for discrete variables
: which starts with a boxplot and then adds a rotated plot(table()) plot
: to each side of the box plot.
:
: Maybe I should ask it first: Does such a plot make any sense? If not
: are there better solutions?

You could try a barplot or a balloonplot:

tab <- table(stack(list(x1 = x1, x2 = x2))) # x1, x2 from Andy's post
barplot(t(tab), beside = TRUE)

library(gplots)
balloonplot(tab)

Although intended for comparing data to a theoretical distribution,
rootogram can compare two discrete distributions:

library(vcd)
rootogram(tab[,1], tab[,2])

Another possibility is to fit each distribution to a parametric form
using vcd::distplot as shown in the examples on its help page.
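For readers without the data from the earlier post, here is a self-contained sketch of the barplot suggestion. The two Poisson samples are hypothetical stand-ins for x1 and x2; only base R is used.

```r
set.seed(42)
# Hypothetical discrete samples standing in for x1 and x2.
x1 <- rpois(100, lambda = 3)
x2 <- rpois(100, lambda = 4)

# stack() gives a two-column data frame (values, ind); table() then
# cross-tabulates it: one row per observed value, one column per sample.
tab <- table(stack(list(x1 = x1, x2 = x2)))

# Side-by-side bars: one group per value, one bar per sample.
barplot(t(tab), beside = TRUE, legend.text = TRUE,
        xlab = "value", ylab = "count")
```

Because both samples share one table, values that occur in only one of them still get a (zero-height) bar in the other, which keeps the groups aligned.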
Re: [R] Read a dataset with different lengths
Xiyan Lon <xiyanlon at gmail.com> writes:

: Dear useR again,
: How can I read a dataset if lines in dataset did not have same
: elements (have different lengths), For example:
:
:  1  2, 4, 16, 1, 1, 3, 1, 1, 15, 5, 1, 1, 14, 1, 1
:  2  2, 13, 5, 1, 1, 3, 1, 1, 15, 5, 1, 1, 14, 1, 1
:  3  4, 5, 11, 1, 1, 6, 1, 1, 5, 14, 1, 1, 15, 1, 1
:  4  2, 5, 9, 1, 1, 14, 1, 1, 8, 16, 1, 1, 13, 1, 1
:  5  3, 7, 14, 1, 1, 14, 1, 1, 5, 21, 1, 1, 8, 1, 1
:  6  6, 3, 1, 12, 1, 1, 5, 8, 1, 1, 15, 1, 1
:  7  6, 3, 1, 11, 1, 1, 10, 7, 1, 1, 21, 1, 1
:  8 21, 20, 9, 1, 1, 6, 1, 1, 13, 10, 1, 1, 1
:  9  5, 7, 21, 1, 1, 13, 1, 1, 14, 2, 1, 1, 6, 1, 1
: 10  8, 14, 10, 1, 1, 5, 1, 1, 10, 5, 1, 1, 5, 1, 1
: 11  5, 20, 17, 1, 1, 19, 1, 1, 14, 7, 1, 1, 6, 1, 1
: 12  7, 4, 11, 1, 1, 2, 1, 1, 5, 13, 1, 1, 14, 1, 1
: 13  7, 14, 13, 1, 1, 6, 1, 1, 13, 16, 1, 1, 17, 1, 1
: 14  7, 14, 5, 1, 1, 5, 1, 1, 5, 17, 1, 1, 17, 1, 1
: 15  3, 9, 12, 1, 1, 18, 1, 1, 6, 1, 4, 1, 1
: 16  7, 10, 5, 1, 1, 12, 1, 1, 5, 17, 1, 1, 13, 1, 1
: 17 12, 8, 16, 1, 1, 5, 1, 1, 8, 10, 1, 1, 14, 1, 1
: 18  5, 11, 7, 1, 1, 5, 1, 1, 18, 13, 1, 1, 17, 1, 1
: 19  7, 13, 8, 1, 1, 14, 1, 1, 5, 17, 1, 1, 13, 1, 1
: 20  7, 18, 21, 1, 1, 16, 1, 1, 5, 17, 1, 1, 13, 1, 1
:
: I know that in BioC package rmutil have a function (read.list) to
: handle different lengths sets of lines but it did not work.
: library(rmutil)
: Error in library(rmutil) : 'rmutil' is not a valid package -- installed < 2.0.0?

rmutil can be found here:
http://popgen.unimaas.nl/~jlindsey/rcode.html

: Are there any others function to handle this.

nf <- count.fields(myfile, sep = ",")
z <- read.table(myfile, sep = ",", fill = TRUE,
                colClasses = rep("numeric", max(nf)))

If the first line is longest you can omit the colClasses argument and the
nf computation. The above returns a data frame with one line per row and
NAs at the end to fill it out as necessary.

If you need a list of rows without the NAs:

lapply(as.data.frame(t(data.matrix(z))), na.omit)
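A self-contained variant of the same idea, offered as a sketch rather than as the method from the reply: it fixes the number of columns through col.names (read.table documents that the column count is taken from col.names when that is longer than what the first lines suggest). The three-line input is hypothetical.

```r
# Hypothetical ragged input: rows with 3, 2 and 4 comma-separated fields.
txt <- c("1, 2, 3", "4, 5", "6, 7, 8, 9")

# Count the fields on every line to find the widest row.
con <- textConnection(txt)
nf <- count.fields(con, sep = ",")
close(con)

# Supplying max(nf) column names makes read.table allocate that many
# columns; fill = TRUE pads the shorter rows with NA.
z <- read.table(text = txt, sep = ",", fill = TRUE,
                col.names = paste0("V", seq_len(max(nf))))

# Back to a list of rows with the NA padding dropped.
rows <- lapply(seq_len(nrow(z)), function(i) {
  r <- unlist(z[i, ], use.names = FALSE)
  r[!is.na(r)]
})

print(z)
print(rows)
```

With a real file, the file name would replace both the text connection and the text = argument.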
Re: [R] Convex hull line coordinates..
achilleas.psomas at wsl.ch writes:

: Hello R-Helpers..
:
: I am still new in R and I have the following question..
: I am applying the function chull on a 2D dataset and have the convex hull
: nicely calculated and plotted.
: Do you know if there is a way to extract the coordinates of the line created
: from the connection of the chull data points..
: I have already tried with approx to linearly interpolate but it's not working
: correctly since the interpolated values sometimes fall inside the convex .
: Using the yleft or yright doesn't seem to help..
:
: Any suggestions?

1. First suggestion is not to post by following up on an unrelated thread,
since some people won't see it. E.g. try finding it on gmane. It's there,
but good luck finding it.

2. Second suggestion is an example which creates a matrix z whose columns
are the regression coefficients of the successive line segments. Note the
use of lm's subset= arg to simplify the code:

example(chull) # creates hpts and X and plots convex hull
z <- sapply(2:length(hpts), function(i)
   coef(lm(X[,2] ~ X[,1], subset = hpts[i-1:0])) )

# we can use z to display _full_ lines, on top of the line
# _segments_ that were displayed in example(chull):
for(i in 1:ncol(z)) abline(coef = z[,i], col = "red", lty = 2)
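Since each hull edge passes through exactly two vertices, the same coefficients can also be computed directly from the vertex coordinates. The following is a sketch under that observation, not the approach from the reply; the random data are hypothetical and the code assumes no edge is exactly vertical.

```r
set.seed(1)
X <- matrix(rnorm(200), ncol = 2)     # hypothetical 2D data

# chull() returns the row indices of the hull vertices, in order;
# repeating the first index closes the polygon.
hpts <- chull(X)
hpts <- c(hpts, hpts[1])
verts <- X[hpts, ]                    # coordinates of the hull vertices

# Slope and intercept of the line through each pair of consecutive vertices
# (assumes no edge is exactly vertical).
k <- nrow(verts)
slope <- diff(verts[, 2]) / diff(verts[, 1])
intercept <- verts[-k, 2] - slope * verts[-k, 1]
edges <- cbind(intercept, slope)

print(head(edges))
```

The matrix verts already holds the coordinates along the hull outline, which may be all that is needed if only the corner points are of interest.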
Re: [R] flatten a matrix and unflatten it
Bill Simpson <William.Simpson at drdc-rddc.gc.ca> writes:

: I want to flatten a matrix and unflatten it again. Please tell me how to
: do it.
:
: 1. given a matrix:
: x1 y1 z1
: x2 y2 z2
: ...
: xk yk zk
: convert it to a vector:
: x1, y1, z1, x2, y2, z2, ..., xk, yk, zk
:
: 2. given a vector:
: x1, y1, z1, x2, y2, z2, ..., xk, yk, zk
: convert it to a matrix
: x1 y1 z1
: x2 y2 z2
: ...
: xk yk zk
:
: It is known that the number of dimensions is 3.

myvector <- c(t(mymatrix))
mymatrix <- matrix(myvector, byrow = TRUE, nc = 3)

If column-wise is ok rather than row-wise as you show, then omit t() in
the first line and byrow = TRUE in the second.
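A quick round-trip check of both variants, as a sketch with a small made-up matrix:

```r
mymatrix <- matrix(1:6, ncol = 3, byrow = TRUE)   # rows: 1 2 3 / 4 5 6

# Row-wise: transpose first, because R stores matrices column by column.
rowwise <- c(t(mymatrix))                          # 1 2 3 4 5 6
back_row <- matrix(rowwise, byrow = TRUE, ncol = 3)

# Column-wise: no transpose and no byrow needed.
colwise <- c(mymatrix)                             # 1 4 2 5 3 6
back_col <- matrix(colwise, ncol = 3)

print(rowwise)
print(colwise)
```

Both back_row and back_col reproduce mymatrix; the only difference is the order of the elements in the intermediate vector.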
Re: [R] problem in textConnection function
Michael S <michael_shen at hotmail.com> writes:

: Dear all-helpers:
:
: I create one package, code like this:
: output <-
: function(x,y)
: {
: zz <- textConnection("foo","w")
: sink(zz)
: a <- 5
: b <- 6
: z <- a*b
: z
: e <- "spss"
: h <- c(1,2,3)
: ls()
: r <- c("s","p","s","s")
: p <- list(1:10)
: p
: sink()
: close(zz)
: x <- foo
: y <- foo
: # .C("output",as.character(x),as.character(y))
: }
:
: package making is ok, but when I use output in Rgui, none of object x
: or y can get the result what I expect (textConnection result), when I copy the
: code and paste on Rgui, it is ok. what should I do ?

This is a FAQ:

http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-is-the-output-not-printed-when-I-source_0028_0029-a-file_003f
Re: [R] I modify my question in textconnection output
Michael S <michael_shen at hotmail.com> writes:

: dear ALL-R-helper:
: I modify my question in textconnection output:
: I wrote one function in Rgui:
: output <- function(y){
: x <- textConnection("foo","w")
: sink(x)
: a <- 5
: b <- 6
: z <- a*b
: z
: e <- "spss"
: h <- c(1,2,3)
: ls()
: r <- c("s","p","s","s")
: p <- list(1:10)
: p
: y <- foo
: sink()
: close(x)
: return(y)
: }
:
: I want to get result is:
: y
:
: [1] "[1] 30"
: [2] " [1] \"a\" \"b\" \"c\" \"d\" \"e\" \"f\"
: \"foo\" \"g\" \"g.p\" \"h\" \"interp\" \"m\"
: \"mytest\""
: [3] "[14] \"output\" \"p\" \"r\" \"var1\" \"var2\" \"x\"
: \"y\" \"z\""
: [4] "[[1]]"
: [5] " [1] 1 2 3 4 5 6 7 8 9 10"
: [6] ""
:
: when I copy the command line within the function, and paste to RGui, result
: is ok. but when I use the output function, y show value of y object. I got
: result character(0)
:
: seem to me: I didn't get value of y within function

You have not defined foo within your function. If you have a foo outside
your function then that is being assigned to y. If you haven't a foo
anywhere then you should have received an error.

You might want to look at ?capture.output

y <- capture.output({
   x <- 1
   print(x)
})
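Putting the two replies together, a function version might look like the sketch below. It assumes the goal is simply to collect the printed output as a character vector; the function body is a cut-down, hypothetical version of the one in the post.

```r
output <- function() {
  capture.output({
    z <- 5 * 6
    print(z)          # explicit print: a bare `z` mid-function prints nothing
    p <- list(1:10)
    print(p)
  })
}

y <- output()
print(y)
```

Each element of y is one line of what would otherwise have appeared on the console, so y[1] holds the line for z and the remaining elements hold the printed list.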
Re: [R] Newbie: Matrix indexing
Pascal BLEUYARD <p.bleuyard at opgc.univ-bpclermont.fr> writes:

: Hi all,
:
: I need to compute some occurrence matrix: given a zero matrix and a set
: of paired indexes, I want to store the number of occurrences of each paired
: index in a matrix. The paired indexes are stored as an index matrix. I
: prefer not to use loops for performance purposes.
:
: Here follows a dummy example:
:
: occurence <- matrix(0, 2, 2); data
:      [,1] [,2]
: [1,]    0    0
: [2,]    0    0
:
: index <- matrix(1, 3, 2); index
:      [,1] [,2]
: [1,]    1    1
: [2,]    1    1
: [3,]    1    1
:
: occurence[index] <- occurence[index] + 1
:
: I was expecting the following result:
:
: occurence
:      [,1] [,2]
: [1,]    3    0
: [2,]    0    0
:
: I get instead:
:
: occurence
:      [,1] [,2]
: [1,]    1    0
: [2,]    0    0
:
: I guess that there is some hidden copy involved but I wanted to know if
: there is an efficient workaround (not using some loop structure). I thought
: factors could do the job but I didn't manage to use them for that problem.

Turn your index matrix into a data frame so you can use lapply on it.
Then convert each of the two columns to a two-level factor. Now you can
use table on the result:

table(lapply(as.data.frame(index), factor, lev = 1:2))
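A slightly richer worked example of the table() approach, as a sketch; the index matrix below is made up so that two different cells are hit.

```r
# Hypothetical paired indexes: (1,1) three times and (2,1) once.
index <- rbind(c(1, 1), c(1, 1), c(1, 1), c(2, 1))

# Each column becomes a factor with the full set of levels, so cells that
# never occur still appear in the table with a count of zero.
occurrence <- table(lapply(as.data.frame(index), factor, levels = 1:2))

print(occurrence)
```

The explicit levels are what keep the result 2 x 2; without them, table() would only show the index values that actually occur.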
Re: [R] r under linux: creating high quality bmp's for win users
Can you provide a link? I did a Google search and found something
on a Japanese site, but it turned out that the writer had made a mistake
and it linked to wmf2eps, not eps2wmf.

Christophe Pallier <pallier at lscp.ehess.fr> writes:

: Hello Christoph!
:
: In the past, I used a utility called eps2wmf.
: It only works under Windows though (maybe under Linux with wine?).
: I believe it is available on the CTAN (TeX archives).
:
: The nice thing is that wmf files are not bitmap and scale well.
:
: Christophe Pallier
:
: Christoph Lehmann wrote:
:
: > Hi
: > I produce graphics with R under linux, but my collaborators often use
: > windows and cannot import eps pics e.g. in msword
: >
: > what is the standard way to get e.g. bmp's with the same quality as
: > eps. going the way: creating eps, convert eps2bmp using 'convert'
: > doesn't yield good enough bmp's
: >
: > thanks for a short hint
: >
: > cheers
: > christoph
Re: [R] extracting numerical data from text field
Luis Tercero <luis.tercero at ebi-wasser.uni-karlsruhe.de> writes:

: I have imported a data frame that looks like this:
:
:     Measurement.Date.and.Time        Z.Average..nm.   PDI
: 572 Dienstag, 22. März 2005 11:05:59          366,4 0,468
: 573 Dienstag, 22. März 2005 11:09:30          353,4 0,532
: 574 Dienstag, 22. März 2005 11:12:59            343 0,428
: 575 Dienstag, 22. März 2005 11:16:28          354,1 0,433
: 576 Dienstag, 22. März 2005 11:19:59          341,9 0,349
: 577 Dienstag, 22. März 2005 11:23:29          334,9 0,429
: ...
:
: Would there be a way to extract the time in numerical form from the
: Measurement.Date.and.Time field? What I would like to do is a time
: series where, for example,
: Dienstag, 22. März 2005 11:05:59 is time=0 min
: Dienstag, 22. März 2005 11:09:30 is time=3.5 min, etc.
:
: Thank you in advance for your help.
:
: Luis

Make sure that you are in a German locale:

# this works on Windows XP. On other OS, "ge" code may differ.
Sys.setlocale("LC_TIME", "ge")

Then if DF is your data frame use strptime (see ?strptime for more on
the % codes):

dat <- strptime(DF[,1], "%A, %d. %B %Y %H:%M:%S")
dat - dat[1] # difference in time since the first date time
Re: [R] extracting numerical data from text field
Gabor Grothendieck <ggrothendieck at myway.com> writes:

: Luis Tercero <luis.tercero at ebi-wasser.uni-karlsruhe.de> writes:
:
: : I have imported a data frame that looks like this:
: :
: :     Measurement.Date.and.Time        Z.Average..nm.   PDI
: : 572 Dienstag, 22. März 2005 11:05:59          366,4 0,468
: : 573 Dienstag, 22. März 2005 11:09:30          353,4 0,532
: : 574 Dienstag, 22. März 2005 11:12:59            343 0,428
: : 575 Dienstag, 22. März 2005 11:16:28          354,1 0,433
: : 576 Dienstag, 22. März 2005 11:19:59          341,9 0,349
: : 577 Dienstag, 22. März 2005 11:23:29          334,9 0,429
: : ...
: :
: : Would there be a way to extract the time in numerical form from the
: : Measurement.Date.and.Time field? What I would like to do is a time
: : series where, for example,
: : Dienstag, 22. März 2005 11:05:59 is time=0 min
: : Dienstag, 22. März 2005 11:09:30 is time=3.5 min, etc.
: :
: : Thank you in advance for your help.
: :
: : Luis
:
: Make sure that you are in a German locale:
:
: # this works on Windows XP. On other OS, "ge" code may differ.
: Sys.setlocale("LC_TIME", "ge")
:
: Then if DF is your data frame use strptime (see ?strptime for more on
: the % codes):
:
: dat <- strptime(DF[,1], "%A, %d. %B %Y %H:%M:%S")
: dat - dat[1] # difference in time since the first date time

One other comment. I assumed your date time field is stored as character
in the data frame. If it's stored as a factor then you need to convert it
to character first using as.character. If it's already stored as a POSIXct
date time then all you have to do is subtract off the first one. (Note
that if you put the output of dput(DF) in your post then people will be
able to exactly recreate your data frame and then know what you have.)

Also, R News 4/1 has a table with lots of date time processing idioms.
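The subtraction step itself does not depend on the locale once the strings are parsed. A minimal sketch, assuming the timestamps have already been brought into an unambiguous form (the ISO-style strings below are a hypothetical stand-in for the parsed German dates), using difftime to force the unit to minutes:

```r
# Hypothetical, already-normalised versions of the first three timestamps.
stamps <- c("2005-03-22 11:05:59",
            "2005-03-22 11:09:30",
            "2005-03-22 11:12:59")
dat <- as.POSIXct(stamps, tz = "UTC")

# Elapsed time since the first measurement, as plain numbers in minutes.
# Fixing units avoids difftime picking seconds or hours automatically.
mins <- as.numeric(difftime(dat, dat[1], units = "mins"))

print(round(mins, 2))
```

The second value is 3 minutes 31 seconds, which is the "3.5 min" the original poster was after.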
Re: [R] Fold in R?
Seung Jun <jun at cc.gatech.edu> writes:

: Fold in Mathematica (or reduce in Python) works as follows:
:
: Fold[f, x, {a, b, c}] := f[f[f[x,a],b],c]
:
: That is, f is a binary operator, x is the initial value, and the results
: are cascaded along the list. I've found it useful for reducing lists
: when I only have a function that accepts two arguments (e.g., merge in R).
:
: Is there any R equivalent? I'm a newbie in R and having a hard time
: finding such one. Thank you.

You could define it yourself like this:

Fold <- function(f, x, L) { for(e in L) x <- f(x, e); x }

# example of its use
result <- Fold(sum, 0, 1:3) # result is 6

Note that merge.zoo in the zoo package does handle multiple arguments;
however, that is intended for merging time series along their times, in
case that is your application.
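Later versions of base R include Reduce(), which performs the same left fold. The sketch below compares it with a hand-written loop and applies it to the merge use case from the question; the three small data frames and the key column id are made up for illustration.

```r
# Hand-written left fold: cascade f along L, starting from x.
Fold <- function(f, x, L) {
  for (e in L) x <- f(x, e)
  x
}

r1 <- Fold(`+`, 0, 1:3)
r2 <- Reduce(`+`, 1:3, 0)       # base R: Reduce(f, x, init)

# Folding a two-argument function over a list of data frames.
df1 <- data.frame(id = 1:3, a = c(10, 20, 30))
df2 <- data.frame(id = 1:3, b = c("x", "y", "z"))
df3 <- data.frame(id = 1:3, c = c(TRUE, FALSE, TRUE))
merged <- Reduce(function(u, v) merge(u, v, by = "id"), list(df1, df2, df3))

print(c(r1, r2))
print(merged)
```

Without an init argument, Reduce() uses the first list element as the starting value, which is what the merge call relies on.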
Re: [R] Generating list of vector coordinates
If the odometer order in your post is essential then you could try this:

expand.grid(1:5, 1:4, 1:3)[,3:1]

If R's reverse odometer order is ok then you could simplify it to this:

expand.grid(1:3, 1:4, 1:5)

On Mon, 28 Mar 2005 15:20:46 -0800, Ronnen Levinson [EMAIL PROTECTED] wrote:

> Hi.
>
> Can anyone suggest a simple way to obtain in R a list of vector
> coordinates of the following form? The code below is Mathematica.
>
> In[5]:= Flatten[Table[{i,j,k},{i,3},{j,4},{k,5}], 2]
>
> Out[5]=
> {{1,1,1},{1,1,2},{1,1,3},{1,1,4},{1,1,5},
>  {1,2,1},{1,2,2},{1,2,3},{1,2,4},{1,2,5},
>  {1,3,1},{1,3,2},{1,3,3},{1,3,4},{1,3,5},
>  {1,4,1},{1,4,2},{1,4,3},{1,4,4},{1,4,5},
>  {2,1,1},{2,1,2},{2,1,3},{2,1,4},{2,1,5},
>  {2,2,1},{2,2,2},{2,2,3},{2,2,4},{2,2,5},
>  {2,3,1},{2,3,2},{2,3,3},{2,3,4},{2,3,5},
>  {2,4,1},{2,4,2},{2,4,3},{2,4,4},{2,4,5},
>  {3,1,1},{3,1,2},{3,1,3},{3,1,4},{3,1,5},
>  {3,2,1},{3,2,2},{3,2,3},{3,2,4},{3,2,5},
>  {3,3,1},{3,3,2},{3,3,3},{3,3,4},{3,3,5},
>  {3,4,1},{3,4,2},{3,4,3},{3,4,4},{3,4,5}}
>
> I've been futzing with apply(), outer(), and so on but haven't found an
> elegant solution.
>
> Thanks,
> Ronnen.
>
> P.S. E-mailed CCs of posted replies appreciated.
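The ordering claim can be checked on the first few rows. A short sketch using only expand.grid; the renamed columns i, j, k are an illustrative choice to match the Mathematica iterators.

```r
# expand.grid varies its FIRST argument fastest, so list the fastest index
# (k in 1:5) first and then reverse the columns to get (i, j, k) order.
g <- expand.grid(1:5, 1:4, 1:3)[, 3:1]
names(g) <- c("i", "j", "k")

print(head(g, 7))
print(nrow(g))       # 3 * 4 * 5 = 60 rows
```

Row 2 is (1, 1, 2) and row 6 is (1, 2, 1), matching the start of the Mathematica output quoted above.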
Re: [R] Annotation metadata kills help.search
This happened to me in R 2.1.0 (I forget which specific version, since
I now have March 27th) on Windows XP, which I traced to package dataRep.
Once I removed that package, help.search worked again.

On Mon, 28 Mar 2005 22:08:58 -0500, Gerard Tromp [EMAIL PROTECTED] wrote:

> Greetings!
>
> OS: Windows
> R 2.0.1
>
> Before anyone flames -- I tried to query this on the R searchable web site
> and using google and did not find anything applicable.
>
> As of about a week ago the help.search function dies when used in the
> simple help.search("something") usage.
>
> The error is
> Error in rbind(...) : number of columns of matrices must match (see arg 203)
>
> After some effort I have traced it down to the annotation packages.
> I installed GO, KEGG, mgu74[abc]v2 and hgu133plus2, all version 1.7.0.
>
> When I move these out of the library directory, help.search() functions
> correctly again.
>
> I have not tracked it any further -- just wanted to know if anyone else
> had noticed it.
>
> Gerard Tromp
Re: [R] Recall() and sapply()
I believe that the function that Recall executes is the function in which
Recall, itself, is evaluated -- not the function in which Recall appears.
In normal cases these are the same, but if you pass Recall to another
function then they are not the same. Here Recall is being passed to
sapply (which in turn is likely passing it to other functions). Because of
lazy evaluation Recall does not get evaluated until it is found within
sapply (or a function called by it or called by one called by it, etc.)
and at that point it's recalling the wrong function.

AFAICS one cannot pass Recall to another function. You could rewrite the
expression that uses sapply to use iteration instead or you could do it
as shown below. In this example, the use of f2 within sapply refers to the
inner f2, which does not change even if the name of the outer f2 does.

f2 <- function(n) {
   f2 <- function(n) if(length(n)>1) sapply(n,f2) else matrix(n,n,n)
   f2(n)
}
f3 <- f2
f2(1:3)
f3(1:3) # gives same result

On Wed, 30 Mar 2005 09:28:08 +0100, Robin Hankin [EMAIL PROTECTED] wrote:

> Hi.
>
> I'm having difficulty following the advice given in help(Recall).
> Consider the two following toy functions:
>
> f1 <- function(n){
>   if(length(n)>1){return(sapply(n,f1))}
>   matrix(n,n,n)
> }
>
> f2 <- function(n){
>   if(length(n)>1){return(sapply(n,Recall))}
>   matrix(n,n,n)
> }
>
> f1() works as desired (that is, f1(1:3), say, gives me a three element
> list whose i-th element is an i-by-i matrix whose elements are all i).
>
> But f2() doesn't.
>
> How do I modify either function to use Recall()? What exactly is
> Recall() calling here?
>
> --
> Robin Hankin
> Uncertainty Analyst
> Southampton Oceanography Centre
> European Way, Southampton SO14 3ZH, UK
> tel 023-8059-7743
Re: [R] Survey of moving window statistical functions - still looking for fast mad function
Jaroslaw's article was great. In fact, it was used as the basis for
rapply and some optimized special cases that will be included in the
R 2.1.0 version of zoo (which have been coded but not yet released).

Regarding numerically stable summation, check out the idea behind the
following, which I coincidentally am also considering for the zoo
implementation:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/393090

On Apr 1, 2005 8:07 PM, Vadim Ogranovich [EMAIL PROTECTED] wrote:

> Hi,
>
> First, let me thank Jaroslaw for making this survey. I find it quite
> illuminating.
>
> Now the questions:
>
> * the #1 solution below (based on cumsum) is numerically unstable.
>   Specifically, if you do the runmean on a positive vector you can easily
>   get negative numbers due to rounding errors. Does anyone see a
>   modification which is free of this deficiency?
> * is it possible to optimize the algorithm of the filter function,
>   solution #2 below, for the case of the rep(1/k,k) kernel?
>
> Thanks,
> Vadim
>
> From: Tuszynski, Jaroslaw W.
> Date: Sat 09 Oct 2004 - 06:30:32 EST
>
> Hi,
>
> Lately I ran into a problem: my R code is spending hours
> performing simple moving window statistical operations. As a result I
> searched the archives for alternative (faster) ways of performing mean,
> max, median and mad operations over a moving window (size 81) on a vector
> with about 30K points. I performed some timing for several ways that
> were suggested, and a few ways I came up with. The purpose of this email is
> to share some of my findings and ask for more suggestions (especially
> about a moving mad function).
>
> Sum over moving window can be done in many different ways. Here are
> some, sorted from the fastest to the slowest:
>
> 1. runmean = function(x, k) {
>      n = length(x)
>      y = x[ k:n ] - x[ c(1,1:(n-k)) ] # this is a difference from the previous cell
>      y[1] = sum(x[1:k]);              # find the first sum
>      y = cumsum(y)                    # apply precomputed differences
>      return(y/k)                      # return mean not sum
>    }
> 2. filter(x, rep(1/k,k), sides=2, circular=T) - (stats package)
> 3. kernapply(x, kernel("daniell", m), circular=T)
> 4. apply(embed(x,k), 1, mean)
> 5. mywinfun <- function(x, k, FUN=mean, ...) { # suggested in news group
>      n <- length(x)
>      A <- rep(x, length=k*(n+1))
>      dim(A) <- c(n+1, k)
>      sapply(split(A, row(A)), FUN, ...)[1:(n-k+1)]
>    }
> 6. rollFun(x, k, FUN=mean) - (fSeries package)
> 7. rollMean(x, k) - (fSeries package)
> 8. SimpleMeanLoop = function(x, k) {
>      n = length(x) # simple-minded loop used as a baseline
>      y = rep(0, n)
>      k = k%/%2;
>      for (i in (1+k):(n-k)) y[i] = mean(x[(i-k):(i+k)])
>    }
> 9. running(x, fun=mean, width=k) - (gtools package)
>
> Some of the above functions return results that are the same length as x
> and some return arrays with length n-k+1. The relative speeds (on a
> Windows machine) were as follows: 0.01, 0.09, 1.2, 8.1, 11.2, 13.4, 27.3,
> 63, 345. As one can see there are about 5 orders of magnitude between the
> fastest and the slowest.
>
> Maximum over moving window can be done as follows, in order of speed:
>
> 1. runmax = function(x, k) {
>      n = length(x)
>      y = rep(0, n)
>      m = k%/%2;
>      a = 0;
>      for (i in (1+m):(n-m)) {
>        if (a==y[i-1]) y[i] = max(x[(i-m):(i+m)]) # calculate max of the window
>        else y[i] = max(y[i-1], x[i+m]);          # max of the window is >= y[i-1]
>        a = x[i-m] # point that will be removed from the window
>      }
>      return(y)
>    }
> 2. apply(embed(x,k), 1, max)
> 3. SimpleMaxLoop(x, k) - similar to SimpleMeanLoop above
> 4. mywinfun(x, k, FUN=max) - see above
> 5. rollFun(x, k, FUN=max) - fSeries package
> 6. rollMax(x, k) - fSeries package
> 7. running(x, fun=max, width=k) - gtools package
>
> The relative speeds were: 0.01, 3, 3.4, 5.3, 7.5, 7.7, 15.3
>
> Median over moving window can be done as follows:
>
> 1. runmed(x, k) - from stats package
> 2. SimpleMedLoop(x, k) - similar to SimpleMeanLoop above
> 3. apply(embed(x,k), 1, median)
> 4. mywinfun(x, k, FUN=median) - see above
> 5. rollFun(x, k, FUN=median) - fSeries package
> 6. running(x, fun=max, width=k) - gtools package
>
> Speeds: 0.01, 3.4, 9, 15, 29, 165
>
> Mad over moving window can be done as
Re: [R] factor to numeric in data.frame
Try this:

data.matrix(df.f12)

On Apr 2, 2005 6:01 AM, Heinz Tuechler [EMAIL PROTECTED] wrote:

> Dear All,
>
> Assume I have a data.frame that contains also factors and I would like to
> get another data.frame containing the factors as numeric vectors, to apply
> functions like sapply(..., median) on them.
> I read the warning concerning as.numeric or unclass, but in my case this
> makes sense, because the factor levels are properly ordered.
> I can do it, if I write for each single column unclass(...), but I
> would like to use indexing, e.g. unclass(df[1:10]).
> Is that possible?
>
> Thanks,
> Heinz Tüchler
>
> ## Example:
> f1 <- factor(c(rep('c1-low',2),rep('c2-med',5),rep('c3-high',3)))
> f2 <- factor(c(rep('c1-low',5),rep('c2-low',3),rep('c3-low',2)))
> df.f12 <- data.frame(f1,f2) # data.frame containing factors
>
> ## this does work
> df.f12.num <- data.frame(unclass(df.f12[[1]]),unclass(df.f12[[2]]))
> df.f12.num
>
> ## this does not work
> df.f12.num <- data.frame(unclass(df.f12[[1:2]]))
> df.f12.num
>
> ## this does not work
> df.f12.num <- data.frame(unclass(df.f12[1:2]))
> df.f12.num
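Applied to the example data from the question, the suggestion works out as in the sketch below. The names m and med are illustrative; the medians are computed on the integer level codes, so they are only meaningful because the levels happen to sort in the intended order.

```r
f1 <- factor(c(rep("c1-low", 2), rep("c2-med", 5), rep("c3-high", 3)))
f2 <- factor(c(rep("c1-low", 5), rep("c2-low", 3), rep("c3-low", 2)))
df.f12 <- data.frame(f1, f2)

# data.matrix() replaces every factor column by its integer level codes.
m <- data.matrix(df.f12)

# Column-wise medians of the codes.
med <- apply(m, 2, median)

print(m)
print(med)
```

If a data frame rather than a matrix is wanted, as.data.frame(m) converts the result back.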
Re: [R] extract date
I just started using gmail, and one thing that I thought would be
annoying but sometimes is actually interesting are the ads at the right
hand side. They are keyed off the content of the email and in the case of
your post produced:

http://www.visibone.com/regular-expressions/?via=google120
http://www.regexpbuddy.com

The first one is advertising a javascript reference card (which I happen
to own and is excellent); but in any case, the contents of the regexp
part of the reference card are fully reproduced on the web page and
include dozens of examples of regexps that you could try. I haven't
explored the other web site.

Although I have not read it, there is a book called Mastering Regular
Expressions.

By the way, here is an alternative to calculating nd in Prof. Ripley's
post, just to give you something else to play with. I think I prefer his
solution but this one is arguably a bit simpler. The three portions
separated by the two bars are each deleted if they are present. gsub
causes it to repeatedly try them so that it does not stop after deleting
the first one:

nd <- gsub("Date: |.*, | ..:.*$", "", dates)

On Apr 5, 2005 7:22 AM, Petr Pikal [EMAIL PROTECTED] wrote:

> Dear Prof. Ripley
>
> Thank you for your answer. After some tests and errors I finished
> with a suitable extraction function which gives me a substantial
> increase in positive answers.
>
> Nevertheless I definitely need to gain more practice in regular
> expressions, but from the help page I can grasp only easy things.
>
> Is there any "Regular expressions for dummies" available?
>
> Best regards
> Petr Pikal
>
> On 5 Apr 2005 at 10:23, Prof Brian Ripley wrote:
>
> > On Tue, 5 Apr 2005, Petr Pikal wrote:
> >
> > > Dear all,
> > >
> > > please, is there any possibility how to extract a date from data
> > > which are like this:
> >
> > Yes, if you delimit all the possibilities.
> >
> > > Date: Sat, 21 Feb 04 10:25:43 GMT
> > > Date: 13 Feb 2004 13:54:22 -0600
> > > Date: Fri, 20 Feb 2004 17:00:48 +
> > > Date: Fri, 14 Jun 2002 16:22:27 -0400
> > > Date: Wed, 18 Feb 2004 08:53:56 -0500
> > > Date: 20 Feb 2004 02:18:58 -0600
> > > Date: Sun, 15 Feb 2004 16:01:19 +0800
> > >
> > > I used
> > >
> > > strptime(paste(substr(x,12,13), substr(x,15,17), substr(x,19,22),
> > > sep="-"), format="%d-%b-%Y")
> > >
> > > which suits lines 3:5 and 7 (such are the most common in my
> > > dataset) but obviously does not work with other lines.
> >
> > For those examples, in character vector 'dates' (without quotes):
> >
> > nd <- gsub("^[^0-9]*([0-9]+) ([A-Za-z]+) ([0-9]+).*", "\\1 \\2 \\3", dates)
> > strptime(nd, "%d %b %y")
> > [1] "2004-02-21" "2020-02-13" "2020-02-20" "2020-06-14" "2020-02-18"
> > [6] "2020-02-20" "2020-02-15"
> >
> > You should be able to amend the regexp for a wider range of forms, but
> > your first line is ambiguous (2004 or 2021?) so there are limits.
> >
> > > If there is no straightforward solution I can live with what I use
> > > now, but some automagical function like
> > >
> > > give.me.date.from.my.string.regardles.of.formating(x)
> > >
> > > would be great.
> >
> > It would be impossible: when Americans write 07/04/2004 they do not
> > mean April 7th.
> >
> > --
> > Brian D. Ripley, [EMAIL PROTECTED]
> > Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> > University of Oxford, Tel: +44 1865 272861 (self)
> > 1 South Parks Road, +44 1865 272866 (PA)
> > Oxford OX1 3TG, UK Fax: +44 1865 272595
>
> Petr Pikal
> [EMAIL PROTECTED]
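The two extraction patterns from this thread can be compared side by side on a few of the sample lines. This is only a sketch covering the extraction step; the vector dates holds three of the strings quoted above, and nd_capture and nd_delete are illustrative names.

```r
dates <- c("Date: Sat, 21 Feb 04 10:25:43 GMT",
           "Date: 13 Feb 2004 13:54:22 -0600",
           "Date: Fri, 14 Jun 2002 16:22:27 -0400")

# Capture approach: skip leading non-digits, then keep day, month and year.
nd_capture <- gsub("^[^0-9]*([0-9]+) ([A-Za-z]+) ([0-9]+).*",
                   "\\1 \\2 \\3", dates)

# Deletion approach: strip the "Date: " prefix, anything up to ", ",
# and everything from the time of day onwards.
nd_delete <- gsub("Date: |.*, | ..:.*$", "", dates)

print(nd_capture)
print(nd_delete)
```

Both give day, month name and year as written; the first line keeps its two-digit year, so a later strptime call would need %y for that line and %Y for the others.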
Re: [R] lists: removing elements, iterating over elements,
On Apr 5, 2005 1:36 PM, Paul Johnson wrote:
> I'm writing R code to calculate Hierarchical Social Entropy, a diversity
> index that Tucker Balch proposed. One article on this was published in
> Autonomous Robots in 2000. You can find that and others through his web
> page at Georgia Tech.
>
> http://www.cc.gatech.edu/~tucker/index2.html
>
> While I work on this, I realize (again) that I'm a C programmer
> masquerading in R, and it's really tricky working with R lists. Here are
> things that surprise me; I wonder what your experience/advice is.
>
> I need to calculate overlapping U-diametric clusters of a given radius.
> (Again, I apologize this looks so much like C.)
>
>     ## Returns a list of all U-diametric clusters of a given radius
>     ## Give an R distance matrix
>     ## Clusters may overlap. Clusters may be identical (redundant)
>     getUDClusters <- function(distmat, radius) {
>       mem <- list()
>       nItems <- dim(distmat)[1]
>       for (i in 1:nItems) {
>         mem[[i]] <- c(i)
>       }
>       for (m in 1:nItems) {
>         for (n in 1:nItems) {
>           if (m != n && (distmat[m, n] <= radius)) {
>             ## item is within radius, so add to collection m
>             mem[[m]] <- sort(c(mem[[m]], n))
>           }
>         }
>       }
>       return(mem)
>     }
>
> That generates the list, like this:
>
>     [[1]]
>     [1]  1  3  4  5  6  7  8  9 10
>     [[2]]
>     [1]  2  3  4 10
>     [[3]]
>     [1]  1  2  3  4  5  6  7  8 10
>     [[4]]
>     [1]  1  2  3  4 10
>     [[5]]
>     [1]  1  3  5  6  7  8  9 10
>     [[6]]
>     [1]  1  3  5  6  7  8  9 10
>     [[7]]
>     [1]  1  3  5  6  7  8  9 10
>     [[8]]
>     [1]  1  3  5  6  7  8  9 10
>     [[9]]
>     [1]  1  5  6  7  8  9 10
>     [[10]]
>     [1]  1  2  3  4  5  6  7  8  9 10
>
> The next task is to eliminate the redundant elements. unique() does not
> apply to lists, so I have to scan one by one.
>
>     cluslist <- getUDClusters(distmat, radius)
>
>     ## find redundant (same) clusters
>     redundantCluster <- c()
>     for (m in 1:(length(cluslist) - 1)) {
>       for (n in (m + 1):length(cluslist)) {
>         if (m != n && length(cluslist[[m]]) == length(cluslist[[n]])) {
>           if (all(cluslist[[m]] == cluslist[[n]])) {
>             redundantCluster <- c(redundantCluster, n)
>           }
>         }
>       }
>     }
>
>     ## make sure they are sorted in reverse order
>     if (length(redundantCluster) > 0) {
>       redundantCluster <- unique(sort(redundantCluster, decreasing = T))
>       ## remove redundant clusters (must do in reverse order to
>       ## preserve index of cluslist)
>       for (i in redundantCluster) cluslist[[i]] <- NULL
>     }
>
> Question: am I deleting the list elements properly?
>
> I do not find explicit documentation for R on how to remove elements
> from lists, but trial and error tells me
>
>     myList[[5]] <- NULL
>
> will remove the 5th element and then close up the hole caused by
> deletion of that element. That shuffles the index values, so I have to
> be careful in dropping elements. I must work from the back of the list
> to the front.
>
> Is there an easier or faster way to remove the redundant clusters?
>
> Now, the next question. After eliminating the redundant sets from the
> list, I need to calculate the total number of items present in the whole
> list, figure how many are in each subset (each list item) and do some
> calculations.
>
> I expected this would iterate over the members of the list, one step
> for each subcollection:
>
>     for (i in cluslist) { }
>
> but it does not. It iterates over the items within the subsets of the
> list cluslist. I mean, if cluslist has 5 sets, each with 10 elements,
> this for loop takes 50 steps, one for each individual item.
>
> I find this does what I want:
>
>     for (i in 1:length(cluslist))
>
> But I found out the hard way :)
>
> Oh, one more quirk that fooled me. Why does unique() applied to a
> distance matrix throw away the 0's? I think that's really bad!
>
>     x <- rnorm(5)
>     myDist <- dist(x, diag = T, upper = T)
>     myDist
>               1         2         3         4         5
>     1 0.0000000 1.2929976 1.6658710 2.6648003 0.5494918
>     2 1.2929976 0.0000000 0.3728735 1.3718027 0.7435058
>     3 1.6658710 0.3728735 0.0000000 0.9989292 1.1163793
>     4 2.6648003 1.3718027 0.9989292 0.0000000 2.1153085
>     5 0.5494918 0.7435058 1.1163793 2.1153085 0.0000000
>     unique(myDist)
>      [1] 1.2929976 1.6658710 2.6648003 0.5494918 0.3728735 1.3718027 0.7435058
>      [8] 0.9989292 1.1163793 2.1153085

If L is our list of vectors then the following gets the unique elements of L. I have assumed that the individual vectors are sorted (sort them first if not, via lapply(L, sort)) and that each element has a unique name (give it one if not, e.g. names(L) <- seq(L)).

The first line binds them together into rows. This will recycle to make them the same length and give you a warning, but that's ok since you only need to know whether they are the same or not. Now, unique applied to a matrix finds the unique rows, and in the last line we use the row.names from that to get the original unsorted lists.

    mat <- unique(do.call(rbind, L))
    L[row.names(mat)]

Regarding why the diagonal elements of a distance
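As a variation on the rbind/unique idea, the sketch below removes duplicate clusters by building one text key per element and calling duplicated() on the keys, which avoids the recycling warning. The toy list L and the names keys, uniq, dropped, sizes and total are assumptions for illustration. It also shows dropping several elements in one step with negative indexing, and looping over list positions with seq_along().

```r
# Toy clusters; elements 1 and 3 hold the same members, as do 2 and 4.
L <- list(c(1, 3, 5), c(2, 3), c(5, 3, 1), c(2, 3))
names(L) <- seq_along(L)

# One key per cluster: sorted members pasted into a single string.
keys <- vapply(L, function(v) paste(sort(v), collapse = ","), character(1))

# Keep the first occurrence of each key.
uniq <- L[!duplicated(keys)]

# Same result by dropping all redundant positions at once; no need to
# delete from the back. Guard against an empty index, because
# L[-integer(0)] would return an empty list.
redundant <- which(duplicated(keys))
dropped <- if (length(redundant) > 0) L[-redundant] else L

# Size of each remaining cluster and the overall item count.
sizes <- vapply(uniq, length, integer(1))
total <- sum(sizes)

# Looping by position gives one step per cluster.
for (i in seq_along(uniq)) {
  cat("cluster", names(uniq)[i], "has", length(uniq[[i]]), "items\n")
}
```

Negative indexing removes all listed positions from the original list in one operation, so the index shuffling that forces back-to-front deletion with `<- NULL` does not arise.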
Re: [R] Introduce a new function in a package?
Some other advantages of making your own package are:

- you can use help.search to search for your own functions even if you don't load the package

- if you can't even remember where your functions are (and I often can't) then you may not remember what they do either, and packaging them gives a convenient way to associate documentation. Once you have found your function you can use ? to get its documentation.

- you get to use 'R CMD check', which is very helpful

If you are doing it on Windows, the amount of software you need to download and install first may be a bit offputting, and you may need to sort out some path and latex problems, but it's probably worth it in the end if you do enough R development.

On Apr 6, 2005 10:55 AM, Don MacQueen wrote:
> Expressions in .Rprofile are executed *before* any previously saved
> global environment is loaded (i.e., before the .RData file in the
> current working directory is loaded, causing the message
> "[Previously saved workspace restored]" to appear).
>
> If you define a function in .Rprofile, and then later answer yes to the
> "Save workspace image?" question when you quit R, the function will
> exist in the saved workspace. When you next start R, the version that
> comes in from .Rprofile will be replaced by the version in the saved
> workspace, because the saved workspace is loaded after .Rprofile is
> executed.
>
> This means that if you decide to change the function in .Rprofile, your
> changes will immediately be lost when the previously saved workspace is
> loaded, since that has the previous version.
>
> So defining personal utility functions in .Rprofile is not very
> effective.
>
> Much, much better to create a package, and then require() that package
> in .Rprofile. And since creating a package is really very easy, I
> strongly recommend that option.
>
> Saving the functions in an image file and then attaching it is fine, but
> less convenient, in my opinion, since you have to keep track of where it
> is in the file system.
>
> -Don
>
> At 4:09 PM +0100 4/6/05, Jan T. Kim wrote:
>> On Wed, Apr 06, 2005 at 09:57:00AM -0400, Roger D. Peng wrote:
>>> I think the usual way is to create an R package for yourself and
>>> load it when you need it for whatever project.
>>>
>>> -roger
>>
>> Alternatively, one can also write the function in question into one's
>> ~/.Rprofile; then, it's automatically available in all R sessions.
>> To avoid confusion, make sure that you choose a unique name, i.e. one
>> that isn't used by any package, if possible.
>>
>> This method should be used only for functions intended to provide some
>> convenience in interactive sessions; code in scripts should not rely
>> on functions being provided by ~/.Rprofile. For scripting, an R package
>> is definitely preferred.
>>
>> Best regards, Jan
>>
>>> Luis Ridao Cruz wrote:
>>>> R-help,
>>>>
>>>> Sometimes I define functions I wish to have in any R session.
>>>> The obvious thing to do is copy-paste the code.
>>>> The thing is that sometimes I don't know where I have the
>>>> function code.
>>>>
>>>> My question is if somehow I could define a function and
>>>> introduce it (let's say in the 'base' package) so that it could
>>>> be used anytime I run a different R project.
>>>>
>>>> Thank you in advance
>>
>> --
>> Jan T. Kim
>> WWW: http://www.cmp.uea.ac.uk/people/jtk
>> *-= hierarchical systems are for files, not for humans =-*
>
> --
> Don MacQueen
> Environmental Protection Department
> Lawrence Livermore National Laboratory
> Livermore, CA, USA
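To make the "create a package, then require() it in .Rprofile" advice concrete, here is a minimal sketch using package.skeleton() from the utils package, which is not mentioned in the thread. The package name myutils and the function hello are hypothetical, and the skeleton is written to a temporary directory so the example does not touch any real project.

```r
# A personal utility function we want available in every session.
hello <- function() cat("hello from my own package\n")

# Write a package skeleton (DESCRIPTION, R/, man/) for it.
pkg_dir <- tempfile("pkgs")
dir.create(pkg_dir)
package.skeleton(name = "myutils", list = "hello",
                 environment = environment(), path = pkg_dir)

# The generated files can then be edited, checked and installed from a
# shell with:  R CMD check myutils   and   R CMD INSTALL myutils
created <- list.files(file.path(pkg_dir, "myutils"))

# After installation, a single line in ~/.Rprofile would load it, e.g.
#   if (interactive()) suppressWarnings(require("myutils", quietly = TRUE))
```

Because the function then lives in the installed package rather than in a saved workspace, editing and reinstalling the package is what updates it, which sidesteps the .RData masking problem described above.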
Re: [R] Is a .R script file name available inside the script?
It works for me. Suppose in.txt is a two line file with these two lines:

    file <- "Rscript.R"
    source(file)

and Rscript.R is a two line file with these two lines:

    script.description <- function() eval.parent(quote(file), n = 3)
    print(basename(script.description()))

Then here is the output on Windows:

    C:\Program Files\R\rw2001beta\bin> R --vanilla < in.txt

    R : Copyright 2004, The R Foundation for Statistical Computing
    [snip]
    > file <- "Rscript.R"
    > source(file)
    [1] "Rscript.R"

Note that the 'file' referred to in 'eval.parent' is not the variable that you called 'file' but an internal variable within the 'source' function that is also called 'file'. It has nothing to do with your 'file', which could very well have a different name. In fact you can just do this on Windows:

    echo source("Rscript.R") | R --vanilla

From: Darren Weber
> That is useful, when calling the script like this:
>
>     file <- "Rscript.R"
>     source(file)
>
> However, it does not work if we do this from the shell prompt:
>
>     $ R --vanilla < Rscript.R
>
> because the eval.parent statement attempts to access a base workspace
> that does not contain the file object/variable, as above. Is there a
> solution for this situation? Is the input script file an argument to R
> and therefore available in something like argv?
>
> On Mar 18, 2005 8:00 PM, Gabor Grothendieck wrote:
>> Darren Weber darrenleeweber at gmail.com writes:
>>
>> : Hi,
>> :
>> : if we have a file called Rscript.R that contains the following, for
>> : example:
>> :
>> : x <- 1:100
>> : outfile = "Rscript.Rout"
>> : sink(outfile)
>> : print(x)
>> :
>> : and then we run
>> :
>> : source("Rscript.R")
>> :
>> : we get an output file called Rscript.Rout - great!
>> :
>> : Is there an internal variable, something like .Platform, that holds
>> : the script name when it is being executed? I would like to use that
>> : variable to define the output file name.
>>
>> In R 2.0.1 try putting this in a file and sourcing it.
>>
>>     script.description <- function() eval.parent(quote(file), n = 3)
>>     print(basename(script.description()))
>>
>> If you are using R 2.1.0 (devel) then use this instead:
>>
>>     script.description <- function()
>>       showConnections()[as.character(eval.parent(quote(file), n = 3)), "description"]
>>     print(basename(script.description()))
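The eval.parent trick above depends on the internals of source(). For the shell-prompt case, a different approach (not the one used in this thread, and relying on the Rscript front end that ships with later versions of R) is to look for the --file= entry in commandArgs(). The helper name script_from_args is an assumption for illustration.

```r
# Return the base name of the script passed as --file=..., or NA if the
# session was not started with a script file (e.g. interactive use, or
# input redirected with "<").
script_from_args <- function(args = commandArgs(trailingOnly = FALSE)) {
  hit <- grep("^--file=", args, value = TRUE)
  if (length(hit) == 0) return(NA_character_)
  basename(sub("^--file=", "", hit[1]))
}

# Inside a script run as "Rscript Rscript.R" one could then derive the
# output file name from the script name:
me <- script_from_args()
outfile <- if (is.na(me)) "output.Rout" else paste0(me, "out")
```

Passing the argument vector explicitly keeps the helper easy to check without launching a separate R process, for example script_from_args(c("R", "--file=/tmp/Rscript.R")).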
Re: [R] How to do aggregate operations with non-scalar functions
On Apr 7, 2005 1:18 AM, Itay Furman wrote:
> On Tue, 5 Apr 2005, Gabor Grothendieck wrote:
>> On Apr 5, 2005 6:59 PM, Itay Furman wrote:
>>> Hi,
>>>
>>> I have a data set, the structure of which is something like this:
>>>
>>>     a <- rep(c("a", "b"), c(6,6))
>>>     x <- rep(c("x", "y", "z"), c(4,4,4))
>>>     df <- data.frame(a=a, x=x, r=rnorm(12))
>>>
>>> The true data set has 1 million rows. The factors a and x
>>> have about 70 levels each; combined together they subset
>>> 'df' into ~900 data frames.
>>> For each such subset I'd like to compute various statistics
>>> including quantiles, but I can't find an efficient way of
>>> [snip]
>>>
>>> I would like to end up with a data frame like this:
>>>
>>>       a x         0%        25%
>>>     1 a x -0.7727268  0.1693188
>>>     2 a y -0.3410671  0.1566322
>>>     3 b y -0.2914710 -0.2677410
>>>     4 b z -0.8502875 -0.6505710
>>> [snip]
>>
>> One can use
>>
>>     do.call("rbind", by(df, list(a = a, x = x), f))
>>
>> where f is the appropriate function.
>>
>> In this case f can be described in terms of df.quantile, which
>> is like quantile except it returns a one row data frame:
>>
>>     df.quantile <- function(x, p)
>>       as.data.frame(t(data.matrix(quantile(x, p))))
>>
>>     f <- function(df, p = c(0.25, 0.5))
>>       cbind(df[1, 1:2], df.quantile(df[, "r"], p))
>
> Thanks! Just what I wanted.
> A minor point is that for some reason the row numbers in the final
> data frame are not sequential (see below -- this is not a consequence
> of my changes).

These are the original row numbers of the first row of each combo of a and x. If z is the result of do.call you can always do this:

    row.names(z) <- 1:nrow(z)

if this is needed.
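Putting the pieces of this exchange together, the following is a small runnable sketch of the do.call/by approach on the 12-row toy data, including the row-name reset. The argument name d and the choice of probabilities c(0, 0.25) (to match the 0% and 25% columns in the question) are assumptions; combinations of a and x that do not occur produce NULL in by() and are skipped by rbind.

```r
set.seed(1)
a <- rep(c("a", "b"), c(6, 6))
x <- rep(c("x", "y", "z"), c(4, 4, 4))
df <- data.frame(a = a, x = x, r = rnorm(12))

# Like quantile(), but returns a one-row data frame.
df.quantile <- function(v, p) as.data.frame(t(quantile(v, p)))

# Per-subset summary: the group labels plus the requested quantiles.
f <- function(d, p = c(0, 0.25)) cbind(d[1, 1:2], df.quantile(d[["r"]], p))

# One row per observed (a, x) combination.
z <- do.call("rbind", by(df, list(a = df$a, x = df$x), f))

# Replace the inherited row names with a simple sequence.
row.names(z) <- seq_len(nrow(z))
z
```

For a million rows it may be worth timing this against alternatives such as tapply on the r column alone, since by() builds a full data frame for every subset.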
Re: [R] package
On Apr 7, 2005 8:43 AM, Gregory BENMENZER wrote:
> hello,
>
> I created a package with my functions, and I want to hide the code of
> some functions.
>
> Could you help me?
>
> Grégory

There was some discussion on the list that there is work being done on an R compiler. I don't know what the status is or whether it would indeed solve your problem, but you could try googling around for it, or maybe someone else on the list can provide more info.
Re: [R] NA in table with integer types
On Apr 8, 2005 9:05 AM, Paul Rathouz wrote:
> OK. Thanks. So, if you use table() on a factor that contains NA's, but
> for which NA is not a level, is there any way to get table to generate an
> entry for the NAs? For example, in below, even exclude=NULL will not give
> me an entry for NA on the factor y:
>
>     x <- c(1,2,3,3,NA)
>     y <- factor(x)
>     y
>     [1] 1    2    3    3    <NA>
>     Levels: 1 2 3
>     summary(y)
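The reply to this question is not preserved here, so the following is only a sketch of two ways to obtain an NA cell in current versions of R; neither is claimed to be what was suggested on the list at the time. Both the useNA argument of table() and addNA() are base R; the object names tab_useNA and tab_addNA are assumptions.

```r
x <- c(1, 2, 3, 3, NA)
y <- factor(x)

# Ask table() to report missing values when there are any.
tab_useNA <- table(y, useNA = "ifany")

# Or make NA an explicit level of the factor before tabulating.
tab_addNA <- table(addNA(y))

tab_useNA
tab_addNA
```

The second form changes the factor itself, which can be convenient when the NA level should also appear in later cross-tabulations.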
Re: [R] the difference between UseMethod and NextMehod?
ronggui 0034058 at fudan.edu.cn writes:

: Hi, useRs. I am studying R programming, but I cannot get the point about
: the difference between UseMethod and NextMethod. I have read the manual
: and tried to find the solution on the internet, but I still have not
: mastered it. Can anyone give me a guide? It would be better to show some
: examples.

One normally uses UseMethod within a generic function to dispatch the appropriate method, while NextMethod is normally used within the function so dispatched. An important difference is that UseMethod does not return, i.e. statements after UseMethod are not evaluated, whereas NextMethod does return.

Have a look at print and print.ts for examples of UseMethod and NextMethod, respectively. Just type the following at the R prompt:

    print
    print.ts
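A toy generic may make the distinction easier to see than print and print.ts. In this sketch the generic describe and the class name "special" are made up for illustration. The stop() after UseMethod is never reached, while the code after NextMethod() does run and can use its return value.

```r
# Generic: UseMethod dispatches and does not come back.
describe <- function(x, ...) {
  UseMethod("describe")
  stop("this line is never evaluated")
}

# Fallback method.
describe.default <- function(x, ...) "a generic object"

# Method for class "special": delegates to the next method (here the
# default), then continues with the returned value.
describe.special <- function(x, ...) {
  rest <- NextMethod()
  paste("a special kind of", sub("^a ", "", rest))
}

obj <- structure(1:3, class = "special")
res_special <- describe(obj)
res_default <- describe(1:3)
```

Calling describe(obj) goes generic, then describe.special, then describe.default via NextMethod, and control returns to describe.special to build the final string.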
Re: [R] How to change letters after space into capital letters
On Apr 11, 2005 6:22 AM, Wolfram Fischer wrote:
> What is the easiest way to change within a vector of strings
> each letter after a space into a capital letter?
>
> E.g.:
>     c("this is an element of the vector of strings", "second element")
> becomes:
>     c("This Is An Element Of The Vector Of Strings", "Second Element")
>
> My reason to try to do this is to get more readable abbreviations.
> (A suggestion would be to add an option to abbreviate() which changes
> letters after space to uppercase letters before executing the
> abbreviation algorithm.)

Look for the thread titled "String manipulation---mixed case" in the r-help archives.
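For readers without the archive at hand, one way to do this (not necessarily the one given in the referenced thread) is a Perl-style replacement with \U, which gsub supports when perl = TRUE. The function name cap_words is an assumption for illustration.

```r
# Upper-case the first letter of the string and every letter that
# follows whitespace. \\1 keeps the preceding space (or nothing at the
# start); \\U upper-cases the captured letter.
cap_words <- function(s) gsub("(^|\\s)(\\w)", "\\1\\U\\2", s, perl = TRUE)

x <- c("this is an element of the vector of strings", "second element")
capped <- cap_words(x)

# The capitalised strings can then be passed on to abbreviate().
abbr <- abbreviate(capped, minlength = 8)
```

Only the first character after each gap is changed, so existing capitals elsewhere in a word are left alone.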
Re: [R] Sweave and abbreviating output from R
On Apr 11, 2005 7:22 AM, Gavin Simpson wrote:
> Dear List,
>
> I'm using Sweave to produce a series of class handouts for a course I am
> running. The students in previous years have commented about wanting
> output within the handouts so they can see what to expect the output to
> look like. So Sweave is a godsend for producing this type of handout -
> with one exception: Is there a way to suppress some rows of printed
> output so as to save space in the printed documentation?
>
> E.g. rnorm(100) produces about 20 lines of output (depending on
> options(width)). I'd prefer something like:
>
>     rnorm(100)
>      [1]  0.527739021  0.185551107 -1.239195562  0.020991608 -1.225632520
>      [6] -1.000243373 -0.020180393  2.552180776 -1.719061533 -0.195024625
>     ...
>     [96] -0.744916379  0.863733400 -0.186667848  1.378236663 -0.499201046
>
> The actual application would be printing of output from summary()
> methods. Ideally it would be nice to ask for lines 1-10, 30-40, 100-102,
> for example, so you could print the first few lines of several sections
> of output.
>
> I'd like to automate this so I don't need to keep copying and pasting
> into the final tex source or forget to do it if I alter some previous
> part of the Sweave source.
>
> Has anyone tried to do this? Does anyone know of an automatic way of
> achieving the simple abbreviation or the more complicated version I
> described?
>
> Any thoughts on this?

Maybe you could use head(rnorm(100)) instead. Check ?head for other arguments.
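head() shortens the object rather than the printed output. If the goal is to keep selected lines of what print() or summary() would show, one possible sketch is to capture the printed lines with capture.output() and re-emit only some of them. The helper abbrev_output and its arguments n_head and n_tail are hypothetical names, not part of Sweave or base R.

```r
# Print x, keep the first n_head and last n_tail lines of the output,
# and mark the gap with "...".
abbrev_output <- function(x, n_head = 2, n_tail = 1) {
  out <- capture.output(print(x))
  n <- length(out)
  if (n <= n_head + n_tail) return(out)
  c(out[seq_len(n_head)], "...", out[seq(n - n_tail + 1, n)])
}

old <- options(width = 60)
set.seed(1)
short <- abbrev_output(rnorm(100))
options(old)

# In a Sweave chunk the selected lines would be written out with cat().
cat(short, sep = "\n")
```

For arbitrary line ranges such as 1-10 and 30-40, the same captured vector can be indexed directly, for example out[c(1:10, 30:40)].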
Re: [R] the difference between "x1" and x1
Try this:

    # test data
    set.seed(1)
    DF <- data.frame(x1 = rnorm(5), x2 = rnorm(5), x3 = rnorm(5))
    DF
    model.list <- c("x2", "x3")

    # transform
    for(v in model.list) DF[v] <- floor(DF[v])

On 4/20/06, Chad Reyhan Bhatti wrote:
> Hello,
>
> I am not sure what to write in the subject line, but I would like to take
> a character string that is a variable in a data frame and apply a function
> that takes a numeric argument to this character string.
>
> Here is a simplified example that would solve my problem.
> Imagine I have my data stored in a data frame.
>
>     x1 <- x2 <- x3 <- x4 <- x5 <- rnorm(20,0,1);
>     data <- as.data.frame(cbind(x1,x2,x3,x4,x5));
>
> I have a vector containing the variables of interest as such.
>
>     model.list <- c("x1","x3","x4");
>     model.list[1]
>     [1] "x1"
>
> I would like to loop through this vector and apply the floor() function to
> each variable. In the current form the elements of model.list do not
> represent the variables in the data frame.
>
>     floor(model.list[1])
>     Error in floor(model.list[1]) : Non-numeric argument to mathematical function
>     floor(eval(model.list[1]))
>     Error in floor(eval(model.list[1])) : Non-numeric argument to mathematical function
>     s <- expression(paste("floor(",model.list[1],")",sep=""))
>     s
>     expression(paste("floor(", model.list[1], ")", sep = ""))
>     eval(s)
>     [1] "floor(x1)"
>
> I have tried the obvious (to me) without success. Perhaps someone could
> suggest a solution and some tidbits for me to read up on about the how
> and why.
>
> Thanks,
>
> Chad R. Bhatti
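The underlying point is that model.list[1] is just the string "x1"; to reach the numbers, the string has to be used as an index into the data frame. The sketch below shows that with [[ ]] and also a loop-free variant of the transformation using lapply(). The name orig is an assumption used to keep an untouched copy for comparison.

```r
set.seed(1)
DF <- data.frame(x1 = rnorm(5), x2 = rnorm(5), x3 = rnorm(5))
orig <- DF
model.list <- c("x2", "x3")

# A column name selects the numeric column, which floor() accepts.
first_floored <- floor(DF[[model.list[1]]])

# Apply floor() to all listed columns at once, leaving the rest alone.
DF[model.list] <- lapply(DF[model.list], floor)
```

If a variable lives in the workspace rather than in a data frame, get("x1") plays the same role as DF[["x1"]] does here.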
Re: [R] Considering port of SAS application to R
R supports a number of databases, and if you only need to work with a small amount of data at once it should be readily do-able; however, R keeps objects in memory, and if you need large amounts at once then you could run into problems.

Note that S-Plus keeps objects on disk and has other features aimed at large data, and might be an alternative if R cannot handle the size and you want something based on the S language.

Since SAS was developed many years ago, when optimizing computer resources was more important than it is now, it might be difficult to find an alternative that matches it for performance with large data sets.

You probably want to quickly develop the core of your app in such a way that it has the main performance characteristics of the full app, so you can get an idea of whether it will work prior to spending the time on the full code.

Also note that R typically processes matrices faster than data frames and, in general, how you write your application may affect its performance.

On 4/21/06, Werner Wernersen wrote:
> Hi there!
>
> I am considering porting a SAS application to R and I would like to
> hear your opinion on whether you think this is possible and worthwhile.
> SAS is mainly used to do data management and then to do some
> aggregations and simple computations on the data and to output a
> modified data set. The main problem I see is the size of the data file.
> As I have no access to SAS yet I cannot give real details, but the SAS
> data file is about 7 gigabytes large. (It's only the basic SAS system
> without any additional modules.)
>
> What do you think, would a port to R be possible with reasonable effort?
> Is R able to handle that size of data? Or is R prepared to work together
> with some database system?
>
> Thanks for your thoughts!
>
> Best regards,
> Werner
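For the kind of aggregation described in the question, one way to stay within memory (a sketch under assumptions, not a tested recipe for a 7 GB file) is to read the data through an open connection a block of rows at a time and keep only running totals. The temporary CSV file, the column names g and v, and the chunk size of 30 rows are all made up for illustration.

```r
# Small stand-in for a large file: 100 rows, two groups.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(g = rep(c("a", "b"), 50), v = 1:100),
          tmp, row.names = FALSE)

totals <- numeric(0)              # running sum of v per group
con <- file(tmp, open = "r")
invisible(readLines(con, n = 1))  # skip the header line

repeat {
  # Each call continues where the previous one stopped; at end of file
  # read.csv signals an error, which we turn into NULL.
  chunk <- tryCatch(
    read.csv(con, header = FALSE, nrows = 30,
             col.names = c("g", "v"), stringsAsFactors = FALSE),
    error = function(e) NULL)
  if (is.null(chunk) || nrow(chunk) == 0) break

  part <- tapply(chunk$v, chunk$g, sum)
  for (k in names(part)) {
    totals[k] <- sum(totals[k], part[[k]], na.rm = TRUE)
  }
}
close(con)
unlink(tmp)
totals
```

Only one block and the small totals vector are in memory at any time. A database back end would be the other route mentioned above, with the grouping done in SQL before the result reaches R.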
Re: [R] Creat new column based on condition
Try:

    V1 <- matrix(c(10, 20, 30, 10, 10, 20), nc = 1)

    V2 <- 4 * (V1 == 10) + 6 * (V1 == 20) + 10 * (V1 == 30)

or

    V2 <- matrix(c(4, 6, 10)[V1/10], nc = 1)

On 4/21/06, Sachin J wrote:
> Hi,
>
> How can I accomplish this task in R?
>
>     V1
>     10
>     20
>     30
>     10
>     10
>     20
>
> Create a new column V2 such that:
>     If V1 = 10 then V2 = 4
>     If V1 = 20 then V2 = 6
>     If V1 = 30 then V2 = 10
>
> So the output looks like this:
>
>     V1  V2
>     10   4
>     20   6
>     30  10
>     10   4
>     10   4
>     20   6
>
> Thanks in advance.
>
> Sachin
Re: [R] Creat new column based on condition
    DF <- data.frame(V1 = c(10, 20, 30, 10, 10, 20))
    DF$V2 <- with(DF, 4 * (V1 == 10) + 6 * (V1 == 20) + 10 * (V1 == 30))
    DF$V3 <- c(4, 6, 10)[DF$V1/10]

or

    DF <- data.frame(V1 = c(10, 20, 30, 10, 10, 20))
    DF <- transform(DF,
        V2 = 4 * (V1 == 10) + 6 * (V1 == 20) + 10 * (V1 == 30),
        V3 = c(4, 6, 10)[V1/10])

On 4/21/06, Sachin J wrote:
> Hi Gabor,
>
> The first one works fine. Just out of curiosity, in the second solution:
> I don't want to create a matrix. I want to add a new column to the
> existing data frame (i.e. V2 based on the values in V1). Is there a way
> to do it?
>
> TIA
>
> Sachin
>
> [earlier messages in this thread snipped]
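Both solutions above rely on the particular values 10, 20 and 30. A slightly more general sketch, assuming an explicit lookup table, uses match() so that the codes need not be evenly spaced and unknown codes become NA. The vector names key and value are assumptions for illustration.

```r
DF <- data.frame(V1 = c(10, 20, 30, 10, 10, 20))

# Lookup table: V1 code -> V2 value.
key   <- c(10, 20, 30)
value <- c(4, 6, 10)

# match() gives the position of each V1 in 'key'; index 'value' by it.
DF$V2 <- value[match(DF$V1, key)]

# A code that is not in the table maps to NA rather than a wrong value.
unknown <- value[match(40, key)]
```

Adding or changing a mapping then only means editing the two lookup vectors.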
Re: [R] Creat new column based on condition
Here is a compact solution using approx:

    DF$V2 <- approx(c(10, 20, 30), c(4, 6, 10), DF$V1)$y

On 4/21/06, Gabor Grothendieck wrote:
>     DF <- data.frame(V1 = c(10, 20, 30, 10, 10, 20))
>     DF$V2 <- with(DF, 4 * (V1 == 10) + 6 * (V1 == 20) + 10 * (V1 == 30))
>     DF$V3 <- c(4, 6, 10)[DF$V1/10]
>
> or
>
>     DF <- data.frame(V1 = c(10, 20, 30, 10, 10, 20))
>     DF <- transform(DF,
>         V2 = 4 * (V1 == 10) + 6 * (V1 == 20) + 10 * (V1 == 30),
>         V3 = c(4, 6, 10)[V1/10])
>
> [earlier messages in this thread snipped]
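One thing to keep in mind with the approx version is that it interpolates linearly, so it agrees with the lookup only at the tabulated points. The short sketch below, with made-up probe values, shows what happens in between and outside the table.

```r
# Exact table points reproduce the mapping 10 -> 4, 20 -> 6, 30 -> 10.
at_points <- approx(c(10, 20, 30), c(4, 6, 10), xout = c(10, 20, 30))$y

# A value between table points is interpolated (15 lies halfway
# between 10 and 20, so the result lies halfway between 4 and 6).
between <- approx(c(10, 20, 30), c(4, 6, 10), xout = 15)$y

# Values outside the table range give NA with the default rule = 1.
outside <- approx(c(10, 20, 30), c(4, 6, 10), xout = 40)$y
```

If V1 can only ever take the three listed values this makes no difference; if other values may appear, an exact lookup may be the safer choice.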
Re: [R] Minor documentation issue
It is matched by the first argument, which is called "from", even though, as the documentation indicates under the explanation of the last form, it refers to the ending value. Note, for example, that seq(from = 3) gives 1:3 and not 3:1. Also the help file does say:

    The interpretation of the unnamed arguments of 'seq' is _not_ standard, ...

On 4/21/06, Vivek Satsangi wrote:
> (Sorry about the last email which was incomplete. I hit 'send'
> accidentally.)
>
> I looked at ?seq. One of the forms given under Usage is seq(from).
> This would be the form used if seq is called with only one argument.
> However, this should actually say seq(to). For example,
>
>     seq(1)
>     [1] 1
>     seq(3)
>     [1] 1 2 3
>
> Cheers,
>
> --
> Vivek Satsangi
> Rochester, NY USA
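A few one-liners illustrate the single-argument behaviour described above, together with a related edge case. seq_len() is a base R function not mentioned in the thread; it is included here as a commonly recommended alternative when the argument might be zero.

```r
one_arg   <- seq(3)         # single argument: counts from 1 up to it
named_arg <- seq(from = 3)  # naming it "from" does not change that
two_args  <- seq(3, 5)      # with two arguments, "from" is the start

# With a single argument of 0 the result counts down from 1 to 0.
zero_seq <- seq(0)

# seq_len() returns an empty vector for 0 instead.
zero_len <- seq_len(0)
```

The zero case matters mostly in loops of the form for (i in seq(n)), where n = 0 would otherwise still run the body twice.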
Re: [R] Reorganizing rows and columns
Use merge:

    # test data
    both <- list(
      structure(list(Doc = c(5, 9, 7, 5, 7, 9),
                     Query = c(1, 1, 1, 2, 2, 2),
                     Rank = c(1, 2, 3, 1, 2, 3)),
                .Names = c("Doc", "Query", "Rank"),
                class = "data.frame",
                row.names = c("1", "2", "3", "4", "5", "6")),
      structure(list(Doc = c(4, 5, 9, 8, 5, 7),
                     Query = c(1, 1, 1, 2, 2, 2),
                     Rank = c(1, 2, 3, 1, 2, 3)),
                .Names = c("Doc", "Query", "Rank"),
                class = "data.frame",
                row.names = c("1", "2", "3", "4", "5", "6")))

    merge(both[[1]], both[[2]], all = TRUE, by = 1:2)

On 4/23/06, Andrew Noyes wrote:
> I'm sure this is a simple task, but how to do it has escaped me. I have
> imported data from two separate files (each file contains the results
> from an information retrieval algorithm) organized into a list. They
> are organized by File, Query, and Rank (in that order):
>
>     [[1]]
>     Doc Query Rank
>       5     1    1
>       9     1    2
>       7     1    3
>       5     2    1
>       7     2    2
>       9     2    3
>
>     [[2]]
>     Doc Query Rank
>       4     1    1
>       5     1    2
>       9     1    3
>       8     2    1
>       5     2    2
>       7     2    3
>
> I need to rearrange the data so that it is sorted by Query and Document,
> with columns for rank1 and rank2 (from files 1 and 2, respectively). For
> example:
>
>     [[1]]
>     Doc Query Rank1 Rank2
>       4     1    NA     1
>       5     1     1     2
>       7     1     3    NA
>       9     1     2     3
>       5     2     1     2
>       7     2     2     3
>       8     2    NA     1
>       9     2     3    NA
>
> My goal is to perform a Spearman/Kendall test to check the correlation
> between the rankings. Any help would be appreciated.
>
> Andrew Noyes
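To carry the merge through to the stated goal, the sketch below names the two rank columns via the suffixes argument, sorts by Query and Doc, and computes a Spearman correlation on the documents ranked by both systems. The object names r1, r2 and m are assumptions, and the toy data are the two small tables from the question.

```r
r1 <- data.frame(Doc = c(5, 9, 7, 5, 7, 9),
                 Query = c(1, 1, 1, 2, 2, 2),
                 Rank = c(1, 2, 3, 1, 2, 3))
r2 <- data.frame(Doc = c(4, 5, 9, 8, 5, 7),
                 Query = c(1, 1, 1, 2, 2, 2),
                 Rank = c(1, 2, 3, 1, 2, 3))

# Full outer join on (Doc, Query); the clashing Rank columns become
# Rank1 and Rank2.
m <- merge(r1, r2, by = c("Doc", "Query"), all = TRUE,
           suffixes = c("1", "2"))

# Sort as requested: by query, then by document.
m <- m[order(m$Query, m$Doc), ]
row.names(m) <- NULL

# Rank correlation over documents that both systems returned.
rho <- cor(m$Rank1, m$Rank2, method = "spearman", use = "complete.obs")
```

With real data one would probably compute the correlation per query (for example with by() or split()) rather than pooling all queries as done here.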
Re: [R] arrange data for simple regression analysis
Here are a couple of possibilities using the builtin iris data set. Note that although the coefficients come out the same, the degrees of freedom, etc., would differ:

    n <- rep(1:3, 50)

    lm(Petal.Length ~ Petal.Width, iris, weight = n)

    Call:
    lm(formula = Petal.Length ~ Petal.Width, data = iris, weights = n)

    Coefficients:
    (Intercept)  Petal.Width
          1.057        2.262

    lm(Petal.Length ~ Petal.Width, iris[rep(1:nrow(iris), n),])

    Call:
    lm(formula = Petal.Length ~ Petal.Width, data = iris[rep(1:nrow(iris), n), ])

    Coefficients:
    (Intercept)  Petal.Width
          1.057        2.262

On 4/24/06, Tomás Revilla wrote:
> Hello,
>
> I want to arrange data from a table to perform a simple regression.
> All the examples I saw deal with paired data, e.g. 'x' and 'y' have the
> same dimensions (e.g. 5 values for x and 5 for y).
> But I have more than one 'y' for each 'x' value, e.g. the data file has
> x = 0, 30, 60, and 120 columns. And for each of them I have several
> replicate responses (e.g. individuals), not always the same number.
> After I read the data with read.table(), ending with 4 columns, what is
> next? How can I regress this against c(0, 30, 60, 120)?
>
>       0 -- n1 y values
>      30 -- n2 y values
>      60 -- n3 y values
>     120 -- n4 y values
>
> Thanks, Tomas
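A third possibility, closer to the layout described in the question, is to reshape the four unequal-length columns into one long table with stack() and regress on the numeric x value. This is only a sketch: the response values below are invented, and the names resp and long are assumptions. Data read with read.table() would arrive as a data frame padded with NA, in which case na.omit() after stacking would drop the padding.

```r
# Replicate responses for each x value; group sizes differ.
resp <- list("0"   = c(1.1, 0.9, 1.0),
             "30"  = c(2.2, 1.8),
             "60"  = c(3.1, 2.9, 3.0, 3.2),
             "120" = c(5.9, 6.1))

# Long format: one row per observation, 'ind' holds the column label.
long <- stack(resp)

# Turn the label back into a number to use as the predictor.
long$x <- as.numeric(as.character(long$ind))

# Ordinary regression of every individual response on x.
fit <- lm(values ~ x, data = long)
coef(fit)
```

Fitting on the individual observations keeps the residual degrees of freedom tied to the number of replicates, which is the point made above about the weighted and replicated fits differing.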
Re: [R] Sending an ESC command to the console from wihtin a script
    RSiteSearch("clear screen")

will locate Windows code to send a ctrl-L to the screen that you can modify.

On 4/24/06, Tolga Uzuner wrote:
> Hi,
>
> Is there a way to send an ESC command to the console from within a
> script window, without using the mouse?
>
> Thanks,
> Tolga
Re: [R] Handling large dataset dataframe
You just need the much smaller cross product matrix X'X and vector X'Y, so you can build those up as you read the data in chunks.

On 4/24/06, Sachin J wrote:
> Hi,
>
> I have a dataset consisting of 350,000 rows and 266 columns. Out of
> 266 columns, 250 are dummy variable columns. I am trying to read this
> data set into an R data frame object but am unable to do it due to
> memory size limitations (the object created is too large to handle in
> R). Is there a way to handle such a large dataset in R?
>
> My PC has 1GB of RAM and 55 GB of hard disk space, running Windows XP.
>
> Any pointers would be of great help.
>
> TIA
> Sachin
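The idea rests on the fact that the least-squares coefficients solve (X'X) b = X'y, and that both X'X and X'y are sums over rows, so they can be accumulated block by block. Here is a minimal in-memory sketch of that accumulation; the simulated data, the block size of 100 and the names xtx, xty and beta are assumptions, and in practice each block would come from reading the next piece of the file.

```r
set.seed(1)
n <- 300
p <- 3
X <- cbind(1, matrix(runif(n * p), n, p))      # design with intercept
y <- drop(X %*% c(1, 2, -1, 0.5)) + rnorm(n)

xtx <- matrix(0, p + 1, p + 1)   # running X'X
xty <- numeric(p + 1)            # running X'y

# Process the rows in blocks of 100, as if each block were read
# separately from disk.
for (rows in split(seq_len(n), ceiling(seq_len(n) / 100))) {
  Xc  <- X[rows, , drop = FALSE]
  xtx <- xtx + crossprod(Xc)
  xty <- xty + drop(crossprod(Xc, y[rows]))
}

# Solve the normal equations using only the small accumulated pieces.
beta <- solve(xtx, xty)
```

With 266 columns the accumulated matrix is only 267 by 267 (including an intercept), regardless of how many rows are read. Solving the normal equations directly is numerically less stable than the QR approach used by lm, so checking the result on a subset is sensible.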
Re: [R] Change the language of the labels in a graph
This works for me on my Windows XP system:

    Sys.putenv("LANGUAGE=FR"); Sys.setlocale("LC_ALL","FR")

On 4/24/06, Lapointe, Pierre wrote:
> Hello,
>
> How do you change the language of the labels in a graph?
>
> In this example, I want to get French labels by changing Sys.putenv.
> I should get Mai instead of May.
>
>     Sys.putenv("LANGUAGE=fr")
>     x <- as.Date(c("1jan1960", "2jan1960", "31mar1960", "30jul1960"), "%d%b%Y")
>     y <- 1:4
>     plot(x, y)
>
> Regards,
>
> Pierre Lapointe
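Date labels on an axis come from the LC_TIME part of the locale, so in principle only that category needs to change. The following sketch assumes a current version of R (where Sys.putenv has since been superseded by Sys.setenv) and a French locale installed under the name "fr_FR.UTF-8"; locale names are platform dependent (on Windows something like "French" is used instead), so the call may simply fail with a warning and leave the locale unchanged.

```r
# Remember the current setting so it can be restored afterwards.
old_time <- Sys.getlocale("LC_TIME")

# Try to switch date formatting to French; returns "" if unavailable.
res <- suppressWarnings(Sys.setlocale("LC_TIME", "fr_FR.UTF-8"))

# Month abbreviation as it would appear on a date axis.
label <- format(as.Date("1960-05-01"), "%b")

# Restore the original locale.
invisible(Sys.setlocale("LC_TIME", old_time))
```

Restoring the previous value matters in scripts, because the locale also affects how as.Date() and strptime() parse month names later on.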
Re: [R] Handling large dataset dataframe [Broadcast]
The other thing you could try after doing this is to sample some rows from your data and see whether the subset gives nearly the same answer as the entire data set.

On 4/24/06, Liaw, Andy [EMAIL PROTECTED] wrote:
> Here's a skeletal example. Embellish as needed:
>
>     p <- 5
>     n <- 300
>     set.seed(1)
>     dat <- cbind(rnorm(n), matrix(runif(n * p), n, p))
>     write.table(dat, file = "c:/temp/big.txt", row.names = FALSE, col.names = FALSE)
>     xtx <- matrix(0, p + 1, p + 1)
>     xty <- numeric(p + 1)
>     f <- file("c:/temp/big.txt", open = "r")
>     for (i in 1:3) {
>         x <- matrix(scan(f, nlines = 100), 100, p + 1, byrow = TRUE)
>         xtx <- xtx + crossprod(cbind(1, x[, -1]))
>         xty <- xty + crossprod(cbind(1, x[, -1]), x[, 1])
>     }
>     close(f)
>     solve(xtx, xty)
>     coef(lm.fit(cbind(1, dat[, -1]), dat[, 1]))  ## check result
>     unlink("c:/temp/big.txt")  ## clean up.
>
> Andy
>
> -----Original Message-----
> From: Sachin J [mailto:[EMAIL PROTECTED]
> Sent: Monday, April 24, 2006 5:09 PM
> To: Liaw, Andy; R-help@stat.math.ethz.ch
> Subject: RE: [R] Handling large dataset dataframe [Broadcast]
>
> Hi Andy:
> I searched through the R archive to find out how to handle a large data set using readLines and other related R functions. I couldn't find a single post that elaborates the process. Can you provide me with an example, or any pointers to postings that elaborate the process?
> Thanx in advance
> Sachin
>
> Liaw, Andy [EMAIL PROTECTED] wrote:
> > Instead of reading the entire data in at once, you read a chunk at a time, compute X'X and X'y on that chunk, and accumulate (i.e., add) them. There are examples in S Programming, taken from independent replies by the two authors to a post on S-news, if I remember correctly.
> > Andy
> >
> > From: Sachin J
> > > Gabor:
> > > Can you elaborate more?
> > > Thanx
> > > Sachin
> > >
> > > Gabor Grothendieck wrote:
> > > > You just need the much smaller cross product matrix X'X and vector X'Y so you can build those up as you read the data in in chunks.
> > > >
> > > > On 4/24/06, Sachin J wrote:
> > > > > Hi, I have a dataset consisting of 350,000 rows and 266 columns. Out of 266 columns 250 are dummy variable columns. I am trying to read this data set into R dataframe object but unable to do it due to memory size limitations (object size created is too large to handle in R). Is there a way to handle such a large dataset in R. My PC has 1GB of RAM, and 55 GB harddisk space running windows XP. Any pointers would be of great help.
> > > > > TIA
> > > > > Sachin
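The sampling check suggested at the top of this message can be sketched as follows. This is only an illustration on simulated data: the sizes n and p, the subset size of 500 and the true coefficients are assumptions made up for the example, not values from the thread.

```r
# Simulated stand-in for the large data set (assumed sizes, not the poster's data)
set.seed(1)
n <- 3000
p <- 5
X <- matrix(runif(n * p), n, p)
y <- drop(1 + X %*% seq_len(p) + rnorm(n))

# Fit on all rows
full <- coef(lm.fit(cbind(1, X), y))

# Fit on a random subset of rows
idx <- sample(n, 500)
sub <- coef(lm.fit(cbind(1, X[idx, ]), y[idx]))

# Compare the two sets of coefficients side by side
print(round(cbind(full, sub), 2))
print(max(abs(full - sub)))
```

If the subset estimates are close to the full-data estimates, working with a sample may be good enough for exploratory fitting.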
Re: [R] Add qoutation marks and combine values in a vector
Use dQuote. Assuming you have a data frame with the column as a factor:

    DF <- data.frame(x = letters)  # test data
    levels(DF$x) <- dQuote(levels(DF$x))

On 4/25/06, Jerry Pressnell [EMAIL PROTECTED] wrote:
> I wish to place quotation marks around each element of the following list:
>
>            X1
>     1 Label 1
>     2 Label 2
>     3 Label 3
>     4 Label 4
>
> and combine the values in the following format for use in another function:
>
>     c("Label 1","Label 2","Label 3","Label 4")
>
> Many thanks,
> Jerry
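A minimal sketch of the second half of the question, combining the quoted values into one string. It assumes a plain character vector rather than a factor, and it uses the q = FALSE argument of dQuote (available in recent versions of R, not in the R of 2006) to get plain ASCII quotes; the variable names are made up for the example.

```r
labs <- paste("Label", 1:4)                 # stand-in for the X1 column

quoted <- dQuote(labs, q = FALSE)           # "Label 1" etc. with plain double quotes
out <- paste0("c(", paste(quoted, collapse = ","), ")")
cat(out, "\n")

# The string parses back to the original vector
back <- eval(parse(text = out))
```

If the other function accepts a character vector directly, passing labs itself is simpler than building a string.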
Re: [R] R help
A similar question was just asked. See: http://tolstoy.newcastle.edu.au/R/help/06/04/25898.html

On 4/25/06, Erez [EMAIL PROTECTED] wrote:
> Hello,
> I'm working with large matrix data and I would like to know if there is any way to reduce its size, because even though I'm increasing the memory limit and I have 1 GB of memory, the program throws me out. Is there any way to use a smaller data type (such as bits) to reduce the size?
> Erez
Re: [R] by() and CrossTable()
At least for this case, I think you could get the effect without modifying CrossTable, like this:

    as.CrossTable <- function(x) structure(x, class = c("CrossTable", class(x)))
    print.CrossTable <- function(x) for(L in x) cat(L, "\n")
    by(warpbreaks, warpbreaks$tension, function(x)
        as.CrossTable(capture.output(CrossTable(x$wool, x$breaks > 30, format = "SPSS", fisher = TRUE))))

On 4/25/06, Marc Schwartz (via MN) [EMAIL PROTECTED] wrote:
> On Tue, 2006-04-25 at 11:07 -0400, Chuck Cleland wrote:
> > I am attempting to produce crosstabulations between two variables for subgroups defined by a third factor variable. I'm using by() and CrossTable() in package gmodels. I get the printing of the tables first and then a printing of each level of the INDICES. For example:
> >
> >     library(gmodels)
> >     by(warpbreaks, warpbreaks$tension, function(x){CrossTable(x$wool, x$breaks > 30, format = "SPSS", fisher = TRUE)})
> >
> > Is there a way to change this so that the CrossTable() output is labeled by the levels of the INDICES variable? I think this has to do with how CrossTable returns output, because the following does what I want:
> >
> >     by(warpbreaks, warpbreaks$tension, function(x){summary(lm(breaks ~ wool, data = x))})
> >
> > thanks,
> > Chuck
>
> Chuck,
>
> Thanks for your e-mail. Without digging deeper, I suspect that the problem here is that CrossTable() has embedded formatted output within the body of the function using cat(), as opposed to a two-step process of creating a results object, which then has a print method associated with it. This would be the case in the lm() example that you have, as well as many other functions in R.
>
> I had not anticipated this particular use of CrossTable(), since it was really focused on creating nicely formatted 2d tables using fixed-width fonts. That being said, I have had recent requests to enhance CrossTable()'s functionality to:
>
> 1. Be able to assign the results of the internal processing to an object and be able to assign that object without any other output. For example:
>
>     Results <- CrossTable(...)
>
> yielding no further output in the console.
>
> 2. Facilitate LaTeX markup of the CrossTable() formatted output for inclusion in LaTeX documents.
>
> Both of the above would require me to fundamentally alter CrossTable() to create a CrossTable class object, as opposed to the current embedded output. I would then create a print.CrossTable() method yielding the current output, as well as one to create LaTeX markup for that application. The LaTeX output would likely need to support the regular 'table' style as well as the 'ctable' and 'longtable' styles, the latter given the potential for long multi-page output.
>
> These changes should then support the type of use that you are attempting here. They are on my TODO list for CrossTable() (along with the inclusion of the measures of association recently discussed), and now that the dust has settled from some recent abstract submission deadlines I can get back to some of these things. I don't have a timeline yet, but will forge ahead with these enhancements.
>
> One possible suggestion for you in the interim, at least in terms of some nicely formatted n-way tables, is the ctab() function in the 'catspec' package by John Hendrickx. A possible example call would be:
>
>     ctab(warpbreaks$tension, warpbreaks$wool, warpbreaks$breaks > 30, type = c("n", "row", "column", "total"), addmargins = TRUE)
>
> Unlike CrossTable(), which is strictly 2d (though that may change in the future), ctab() directly supports the creation of n-way tables, with counts and percentages/proportions interleaved in the output. There are no statistical tests applied, and these would need to be done separately using by().
>
> Chuck, feel free to contact me offlist as other related issues arise or if you have other comments on this. Again, thanks for the e-mail.
>
> Best regards,
> Marc Schwartz
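The capture-and-reprint idea at the top of this message can be tried without gmodels. The sketch below uses a made-up function, noisy_summary(), as a stand-in for any function that prints with cat() instead of returning an object; the class name "captured" is likewise hypothetical.

```r
# Stand-in for a function that writes its result with cat() (hypothetical)
noisy_summary <- function(v) {
  cat("n =", length(v), "\n")
  cat("mean =", round(mean(v), 2), "\n")
  invisible(NULL)
}

# Wrap captured lines in a class so that by() can print them per group
as_captured <- function(lines) structure(lines, class = c("captured", class(lines)))
print.captured <- function(x, ...) {
  cat(unclass(x), sep = "\n")
  invisible(x)
}

res <- by(warpbreaks, warpbreaks$tension, function(d)
  as_captured(capture.output(noisy_summary(d$breaks))))

print(res)  # each block of output now follows its "warpbreaks$tension: ..." header
```

The output is captured as text, so anything the original function computed internally is still not available as an object; that would need the class-based redesign Marc describes.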
Re: [R] Questions to RDCOMClient
On 4/25/06, Dr. Michael Wolf [EMAIL PROTECTED] wrote:
> 3. RDCOMClient and Excel Manual
> ===
> Do you know a good overview of using Excel VBA code via RDCOMClient (e.g. sh$Select())? Are there people interested in working out such a paper? I could contribute some experiences from my work to such a project (e.g. deleting Excel shapes from R and copying new charts made by R to a special position in an Excel sheet).

Normally what I do is just create whatever spreadsheet I want in Excel with the Excel macro recorder turned on, then look at the macro output and translate that to RDCOMClient. There are some books on VBA programming in Excel (I don't have any myself but have taken one out from the library once) that could be helpful if the macro approach is not sufficient.
Re: [R] www.r-project.org
On 4/25/06, hadley wickham [EMAIL PROTECTED] wrote:
> > The R Web site is working fine. Even if it is not relifted from a long time, it is functional. So, this is the point... and it should remain, at least, as functional as it is.
>
> As an experienced user of the R website, this probably is true for you. However, there are a number of confusing problems for new users of the site:
>
> * how do you download R?
> * how do you bookmark a specific page?
> * what is that giant graphic on the home page?

* can't get to contributed docs directly from home page
Re: [R] www.r-project.org
It's not hard if you know what to do, but if you don't then it's a nuisance to figure it out every time.

On 4/25/06, Gavin Simpson [EMAIL PROTECTED] wrote:
> On Tue, 2006-04-25 at 14:09 -0400, Gabor Grothendieck wrote:
> > On 4/25/06, hadley wickham [EMAIL PROTECTED] wrote:
> > > > The R Web site is working fine. Even if it is not relifted from a long time, it is functional. So, this is the point... and it should remain, at least, as functional as it is.
> > >
> > > As an experienced user of the R website, this probably is true for you. However, there are a number of confusing problems for new users of the site:
> > >
> > > * how do you download R?
> > > * how do you bookmark a specific page?
> > > * what is that giant graphic on the home page?
> >
> > * can't get to contributed docs directly from home page
>
> Isn't Other > Contributed > Documentation sufficient? Usability guidelines for websites suggest that you should have as few top-level menu items as possible, say 5-6 max... OK, the R website is not like an insert_company_name.com type of website, but you wouldn't want to flood users with too many options up front.
>
> G
> --
> Gavin Simpson, ECRC & ENSIS, UCL Department of Geography, London, UK
Re: [R] www.r-project.org
On Windows, right-click the web page, choose Properties and copy the URL there.

On 4/25/06, Spencer Graves [EMAIL PROTECTED] wrote:
> see inline
>
> hadley wickham wrote:
> > > The R Web site is working fine.
> > <snip>
> > As an experienced user of the R website, this probably is true for you. However, there are a number of confusing problems for new users of the site:
> >
> > * how do you download R?
> > * how do you bookmark a specific page?
>
> *** If I find something with R Site Search on www.r-project.org, I can NOT just copy the web address into an email, because the address is still www.r-project.org. However, if I use RSiteSearch from within R, I get an honest address (like "http://finzi.psych.upenn.edu/R/Rhelp02a/archive/47417.html"), which I can then paste into an email like this. If it weren't too difficult to display the address for each item retrieved from the archives, it would make it easier to use R Site Search without opening R.
>
> Thanks to all the core R team, including Jonathan Baron, whose support of R Site Search has prevented me from tearing my hair out on many occasions (and I don't have much left to tear out). When people ask me questions about S-Plus, I often go to R Site Search, and then see if I can somehow use in S-Plus any R solution I find.
>
> Best Wishes,
> spencer graves
>
> p.s. I also have a strong preference for avoiding fancy features. I've been burned so many times by viruses and by software that never performed as advertised, for many unknown reasons, that I routinely check "no" when asked if I want to install Macromedia Flash, and I hope I won't have to install it to use a future version of www.r-project.org.
>
> > * what is that giant graphic on the home page?
> >
> > Hadley
Re: [R] new.frame()
Also, the proto and R.oo packages provide object-oriented ways of working with environments.

On 4/26/06, Prof Brian Ripley [EMAIL PROTECTED] wrote:
> ?new.env
> ?local
>
> should help you. R works with environments, basically a frame plus an enclosure.
>
> On Wed, 26 Apr 2006, Anna Whitfield wrote:
> > Hello,
> > I would like to know whether R has an equivalent of S-plus's new.frame(), which creates explicit frames in the evaluator and provides a locale for computations that can be shared among various functions.
> >
> > new.frame() in S-plus: http://www.uni-muenster.de/ZIV/Mitarbeiter/BennoSueselbeck/s-html/helpfiles/new.frame.html
> >
> > Thanks.
>
> --
> Brian D. Ripley, Professor of Applied Statistics, University of Oxford
> http://www.stats.ox.ac.uk/~ripley/
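A minimal base-R sketch of the two help pages mentioned above, new.env and local. The names shared, add_to_total, read_total and counter are made up for the example.

```r
# An explicit environment shared among several functions
shared <- new.env()
shared$total <- 0

add_to_total <- function(x, env = shared) {
  env$total <- env$total + x   # environments are modified in place, not copied
  invisible(env$total)
}
read_total <- function(env = shared) env$total

add_to_total(2)
add_to_total(5)
print(read_total())

# local() gives a function private state that persists between calls
counter <- local({
  n <- 0
  function() {
    n <<- n + 1
    n
  }
})
counter()
counter()
```

Because environments have reference semantics, passing shared to a function does not copy it, which is what makes it usable as a common workspace.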
Re: [R] environment
A third possibility is to use the proto package and define a proto object (an environment with special meaning for $) containing the two components .x and g, like this:

    library(proto)
    p <- proto(.x = 2, g = function(.) { print(.$.x); .$.x <- 3 })
    p$.x          # 2
    print(p$g())  # 2, 3
    p$.x          # 3

or you can write the print statement as with(., print(.x)).

On 26 Apr 2006 11:02:58 +0200, Peter Dalgaard [EMAIL PROTECTED] wrote:
> Romain Francois [EMAIL PROTECTED] writes:
> > Hi,
> > Consider the code:
> >
> >     g <- function(){
> >         print(.x)
> >         .x <- 3
> >     }
> >     f <- function(){
> >         environment(g) <- environment()
> >         .x <- 2
> >         g()
> >         .x
> >     }
> >     > f()
> >     [1] 2
> >     [1] 2
> >
> > I would like f() to return 3. How can I do that? Am I completely out of place? Doing that, I want to avoid passing .x as a parameter in f, because in real life .x is pretty big and g() is called over and over in a loop.
>
> If you want to assign into the environment of g, you'll need <<-, otherwise you assign to a local variable. Another possibility involves assign(..., parent.frame())

And a third possibility is: library(proto)
Re: [R] environment
Also, if you don't need to create child objects which override .x, and I don't think you do here, p could be further simplified to this (only the print statement has been changed):

    p <- proto(.x = 2, g = function(.) { print(.x); .$.x <- 3 })

On 4/26/06, Gabor Grothendieck [EMAIL PROTECTED] wrote:
> A third possibility is to use the proto package and define a proto object (an environment with special meaning for $) containing the two components .x and g, like this:
>
>     library(proto)
>     p <- proto(.x = 2, g = function(.) { print(.$.x); .$.x <- 3 })
>     p$.x          # 2
>     print(p$g())  # 2, 3
>     p$.x          # 3
>
> or you can write the print statement as with(., print(.x)).
>
> On 26 Apr 2006 11:02:58 +0200, Peter Dalgaard [EMAIL PROTECTED] wrote:
> > Romain Francois [EMAIL PROTECTED] writes:
> > > Hi,
> > > Consider the code:
> > >
> > >     g <- function(){
> > >         print(.x)
> > >         .x <- 3
> > >     }
> > >     f <- function(){
> > >         environment(g) <- environment()
> > >         .x <- 2
> > >         g()
> > >         .x
> > >     }
> > >     > f()
> > >     [1] 2
> > >     [1] 2
> > >
> > > I would like f() to return 3. How can I do that? Am I completely out of place? Doing that, I want to avoid passing .x as a parameter in f, because in real life .x is pretty big and g() is called over and over in a loop.
> >
> > If you want to assign into the environment of g, you'll need <<-, otherwise you assign to a local variable. Another possibility involves assign(..., parent.frame())
>
> And a third possibility is: library(proto)
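For comparison, the two base-R possibilities Peter Dalgaard mentions can be sketched like this. The function names g1, f1, g2 and f2 are invented for the example; the structure follows the code in the original question.

```r
# Possibility 1: superassignment, with g's enclosure set to f's frame
g1 <- function() {
  print(.x)
  .x <<- 3          # assigns in the enclosing environment, i.e. f1's frame
}
f1 <- function() {
  environment(g1) <- environment()   # local copy of g1 enclosed by this frame
  .x <- 2
  g1()
  .x
}

# Possibility 2: read and assign explicitly in the caller's frame
g2 <- function() {
  caller <- parent.frame()
  print(get(".x", envir = caller))
  assign(".x", 3, envir = caller)
}
f2 <- function() {
  .x <- 2
  g2()
  .x
}

r1 <- f1()   # prints 2, returns 3
r2 <- f2()   # prints 2, returns 3
```

In both variants .x is never passed as an argument, so a large object is not copied on each call.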
Re: [R] About regression and plot
On 4/26/06, Daniel Yang [EMAIL PROTECTED] wrote:
> --- I changed the format to text instead of html ---
>
> Dear R-help,
> This is my first R day. I want to ask some more beginner's questions.

Read the posting guide at the bottom of each post to r-help.

> Q1. How can I obtain the covariance matrix for the parameter estimates of a multiple regression? I checked ?lm but didn't get the information.

?vcov

> Q2. How can I see the old graphs in the graph window?

Assuming you are using Windows, create a plot, e.g. plot(1), change focus to the plot (i.e. left-click it), and choose Recording from the History menu. From that point on in the session (or until you turn it off), it will record your plots, and you can change focus to the plot window and use the PageUp and PageDn keys to move through them.

> Q3. Can R plot animated graphs? For example, I want to see the dynamic change of a 2D graph during a time period.

RSiteSearch("animation")
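As a small illustration of the answer to Q1, using the built-in mtcars data (the choice of data set and model is an assumption made for the example):

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)

V <- vcov(fit)     # covariance matrix of the coefficient estimates
print(V)

# The square roots of the diagonal are the standard errors from summary()
se <- sqrt(diag(V))
print(cbind(se, summary_se = coef(summary(fit))[, "Std. Error"]))
```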
Re: [R] Were to find appropriate functions for a given task in R
There is a reference sheet here:
http://www.rpad.org/Rpad/R-refcard.pdf
a function finder here:
http://biostat.mc.vanderbilt.edu/s/finder/find.html
and task views here:
http://cran.r-project.org/src/contrib/Views/
Also, RSiteSearch and help.search from within R can be helpful.

On 4/26/06, Albert Sorribas [EMAIL PROTECTED] wrote:
> This is a generic request concerning where to look for information on a precise procedure in R. I'm using R for teaching introductory statistics and my students are learning how to deal with it. However, I find it difficult to locate some of the procedures.
>
> For instance, for basic crosstabulation, it is obvious that basic functions such as table, ftable, and prop.table can be used. But there is a CrossTable function that is very useful. This is hidden in gmodels and gregmisc, as far as I've been able to explore the packages. However, there is no way (unless I sit down with r-help for hours) to be sure whether there is some other place in which a very useful function is hidden for table manipulation (for instance, controlling for other variables).
>
> This is only an example, but there are many more. Where to look for CIs for proportions? I can find it, but it is not easy. I understand R is more appropriate for difficult statistical procedures (glm and similar), BUT students need to start somewhere...
>
> My specific claim is about the need for a sort of guide in which the different procedures could be classified (and some redundancies could be deleted, by the way). Is there something similar around? Any project working on this? Any clue? If not, I would suggest starting some kind of easy reference based on the problem to solve. This could indicate where to look. The other day I found that package vcd has a function for testing the goodness-of-fit of a sample to binomial and other distributions, but this was VERY difficult to locate.
>
> Anyway, as usual, any indication will be very useful (especially for my students!!!)
>
> Albert Sorribas
> Professor of Statistics and Operational Research
> Departament de Ciències Mèdiques Bàsiques
> Universitat de Lleida
> Montserrat Roig 2
> 25008-Lleida (Espanya)
> web.udl.es/Biomath/Group
Re: [R] www.r-project.org
Maybe a separate web site that shows R off, or maybe just a pointer to the R Graph Gallery.

On 4/26/06, Duncan Murdoch [EMAIL PROTECTED] wrote:
> Romain Francois wrote:
> > Dear R users and developers,
> > My question is addressed to both of you, so I chose R-help to post it. Are there any plans to jazz up the main R website, http://www.r-project.org? Its look has been the same for a long time and is kind of sad compared to other statistical packages' websites. Of course, the comparison is not fair, since companies are paying web designers to draw lollipop websites ...
>
> There have been various suggestions along these lines (check the archives), but there are a number of constraints that make the problem difficult:
>
> - there are two web sites, www.r-project.org and cran.r-project.org, with different needs. In particular, CRAN must be very low tech because it is mirrored on very diverse sites (including local copies, e.g. on a CDROM).
> - There are a lot of busy people who need to edit these pages occasionally, so a stable, standard, simple setup is extremely desirable. That means simple HTML to be edited in a text editor, no special CMS.
>
> These requirements are quite hard to meet, so expect changes to the web sites to be very time consuming, and possibly rejected en masse in the end.
>
> Duncan Murdoch
>
> > My first idea was to organize some kind of web design contest. But I had a small talk with Friedrich Leisch about that, who said that I shouldn't expect too many competitors. So, what about creating a small team, creating a home page project and then proposing it to the core team? It goes without saying: the core team has the final word. What do you think? Who would like to play?
> > Romain
Re: [R] help using tapply
On 4/26/06, Dimitri Szerman [EMAIL PROTECTED] wrote:
> Dear R-mates,
>
> # Here's what I am trying to do. I have a dataset like this:
>
>     id = c(rep(1,8), rep(2,8))
>     dur1 <- c(17,18,19,18,24,19,24,24)
>     est1 <- c(rep(1,5), rep(2,3))
>     dur2 <- c(1,1,3,4,8,12,13,14)
>     est2 <- rep(1,8)
>     mydata = data.frame(id, estat = c(est1, est2), durat = c(dur1, dur2))
>
> # I want to have this:
>
>     id = c(rep(1,8), rep(2,8))
>     dur1 <- c(17,18,19,20,28,1,2,3)
>     est1 <- c(rep(1,5), rep(2,3))
>     dur2 <- c(1,2,3,4,12,13,14,15)
>     est2 <- rep(1,8)
>     mydata2 = data.frame(id, estat = c(est1, est2), durat = c(dur1, dur2))
>
> # What is happening here? I have a longitudinal dataset.
> # Individuals are observed 8 times, and each time each of them is in a certain state J (here, J = {1,2}).
> # Each observation is one unit of time away from the following one, except observations 4 and 5, which are 8 units of time away from each other.
> # So here we have individual 1 migrating from state 1 to state 2 at observation #6,
> # while individual 2 stays in state 1 as long as we can observe her.
> # I am interested in the spell (duration) of each state.
> # However, the durations are clearly mismeasured, and now I am trying to give some consistency to the data.
> # I am assuming that the first duration is correct. Departing from this, I wrote the following function:
>
>     d <- function(dur, est) {
>       if (sum(diff(est)) == 0) {        # for those who didn't change state
>         for (i in c(2:4)) dur[i] <- dur[i-1] + 1
>         dur[5] <- dur[4] + 8
>         for (i in c(6:8)) dur[i] <- dur[i-1] + 1
>       }
>       if (sum(diff(est)) != 0) {        # for those who changed state
>         j = which(diff(est) != 0) + 1   # j is when the change occurred
>         dur[j] = 1
>         k0 = which(c(1:8) < j)[-c(1)]
>         k1 = which(c(1:8) > j)
>         if (length(j) > 1) {
>           for (i in 1:(length(j)-1)) k2 = c(1:8)[c(1:8) > j[i] & c(1:8) < j[i+1]]
>           k = unique(c(k0, k1, k2))
>         }
>         k = unique(c(k0, k1))
>         k = k[!k %in% j]
>         if (5 %in% k) {
>           k = k[k != 5]
>           for (i in k[k < 5]) dur[i] = dur[i-1] + 1
>           dur[5] = dur[4] + 8
>           for (i in k[k > 5]) dur[i] = dur[i-1] + 1
>         } else {
>           for (i in k) dur[i] = dur[i-1] + 1
>         }
>       }
>       dur
>     }
>
> # Now, if I do
>     d(dur1, est1)
> # and
>     d(dur2, est2)
> # I get what I want, except for the fact that I couldn't do this for a large dataset.
> # So I decided to use tapply. But this gives me
>
>     new.durat <- tapply(mydata$durat, IND = mydata$id, FUN = d, est = mydata$estat)
>     mydata$new.durat <- unlist(new.durat)
>     mydata
>        id estat durat new.durat
>     1   1     1    17        17
>     2   1     1    18        18
>     3   1     1    19        19
>     4   1     1    18        20
>     5   1     1    24        28
>     6   1     2    19        29
>     7   1     2    24        30
>     8   1     2    24        31
>     9   2     1     1         1
>     10  2     1     1         2
>     11  2     1     3         3
>     12  2     1     4         4
>     13  2     1     8        12
>     14  2     1    12        13
>     15  2     1    13        14
>     16  2     1    14        15
>
> # which is not what I want. I can't figure out why, but when I use tapply,
> # the logical expression sum(diff(est)) == 0 turns out to be true for both individuals
> # (whereas we know this is true only for individual #2).
> # I am sorry for the long message. I will be very grateful for any help with this problem.

I didn't try to read all this carefully, but I think you want to tapply over the indices so you can use them in both columns:

    with(mydata, unlist(tapply(seq(id), id, function(i) d(durat[i], estat[i]))))

or use by:

    unlist(by(mydata, mydata$id, function(x) d(x$durat, x$estat)))
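The difference between the two calls can be seen on a smaller example. In the failing call, est = mydata$estat hands the whole column to every group, so diff() runs across individuals; in the poster's data the +1 jump at observation 6 and the -1 drop at observation 9 cancel, which is presumably why the sum came out as 0 for both ids. The sketch below uses a reduced data set and a made-up helper, changed(), in place of d().

```r
# Reduced version of the data: 2 individuals, 4 observations each
mydata <- data.frame(
  id    = rep(1:2, each = 4),
  estat = c(1, 1, 2, 2,  1, 1, 1, 1),
  durat = c(17, 18, 19, 18,  1, 1, 3, 4)
)

# Hypothetical per-individual function that needs both columns
changed <- function(dur, est) rep(sum(diff(est)) != 0, length(dur))

# Whole estat column passed to every group: the changes cancel out
wrong <- unlist(tapply(mydata$durat, mydata$id, changed, est = mydata$estat))

# tapply over row indices, then subset both columns with them
right <- with(mydata, unlist(tapply(seq_along(id), id,
                                    function(i) changed(durat[i], estat[i]))))

# Same result using by(), which hands each group's rows to the function
right_by <- unlist(by(mydata, mydata$id, function(x) changed(x$durat, x$estat)))

print(rbind(wrong, right))
```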
Re: [R] copying previously installed libraries to R 2.3.0
Note that the FAQ is really no different from batchfiles, since the FAQ does not address how to move or copy the packages and the batchfiles scripts do just that (with the safeguard that they will not overwrite any packages that are already there, so you could, for example, manually install a few packages and then move or copy the remaining ones over). I think there was some discussion that 2.3.0 might have moving/copying capability built into the R installer, but since that does not seem to have made it into 2.3.0, you can use the scripts as before or, alternatively, just re-install all your packages from scratch.

By the way, there were some comments about the advantage of keeping your packages in a separate library so that you can just update the library. The problem with that is that if you want to keep multiple versions of R on the same system, you will want each R version to have packages that run with that version of R. If you clobber the ones that run with 2.2.0, say, by overwriting them with 2.3.0 packages, then you can no longer use that library with 2.2.0. If you keep the packages in .../R/R-2/library then you can be sure that each R version has the right packages in its library without messing around.

On 4/26/06, Thomas Harte [EMAIL PROTECTED] wrote:
> the windoze faq that you refer to doesn't quite address the question that i asked, but thanks all the same.
>
> 2.8 What's the best way to upgrade?
> That's a matter of taste. For most people the best thing to do is to uninstall R (see the previous Q), install the new version, copy any installed packages to the library folder in the new installation, run update.packages() in the new R (`Update packages...' from the Packages menu, if you prefer) and then delete anything left of the old installation. Different versions of R are quite deliberately installed in parallel folders so you can keep old versions around if you wish.
> Upgrading from R 1.x.y to R 2.x.y is special as all the packages need to be reinstalled. Rather than copy them across, make a note of their names and re-install them from CRAN.
>
> Christos Hatzis [EMAIL PROTECTED] wrote:
> > See Windows FAQ 2.8 - works well.
> > -Christos
> >
> > -----Original Message-----
> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Thomas Harte
> > Sent: Wednesday, April 26, 2006 2:54 PM
> > To: r-help@stat.math.ethz.ch
> > Subject: [R] copying previously installed libraries to R 2.3.0
> >
> > hi all,
> > is there a new mechanism in R 2.3.0 for copying libraries from, say, R 2.2.1 to R 2.3.0? i ask because gabor grothendieck comments in his copydir.bat (from gabor's batchfiles at http://cran.r-project.org/contrib/extra/batchfiles/batchfiles_0.2-5.zip):
> >
> >     :: I personally upgraded my 2.1.0 to 2.2.0 this way so it seems ok until
> >     :: R replaces this with something better which is expected for 2.3.0.
> >
> > see also the posting below.
> > cheers, thomas.
> >
> > [R] copy contributed packages from R 2.2.0 to 2.2.1
> > From: Ronnie Babigumira
> > Date: Fri, 23 Dec 2005 15:58:36 +0100
> >
> > Hi Helli, this came up last week. Here are some of the replies posted:
> >
> > 1. In http://cran.r-project.org/contrib/extra/batchfiles/batchfiles_0.2-5.zip are two Windows XP batch files, movedir.bat and copydir.bat, which will move the packages (which is much faster and suitable if you don't need the old version of R any more) or copy the packages (which takes longer but preserves the old version).
> >
> > 2.
> >     x <- installed.packages()[,1]
> >     install.packages(x)
> >
> > 3. This is one reason we normally recommend that you install into a separate library. Then update.packages(checkBuilt = TRUE) is all that is needed. However,
> >
> >     foo <- installed.packages()
> >     as.vector(foo[is.na(foo[, "Priority"]), 1])
> >
> > will give you a character vector which you can feed to install.packages(), so it's not complex to do manually.
> >
> > 4. If the previous installation is still alive, fire it up and
> >
> >     pS <- packageStatus()
> >     pkgs <- pS$inst$Package[!pS$inst$Priority %in% c("base", "recommended")]
> >     save(pkgs, file = "foo")
> >
> > In the new installation,
> >
> >     load("foo")
> >     install.packages(pkgs)
> >
> > Helmut Kudrnovsky wrote:
> > > hi R-users,
> > > a few days ago R 2.2.1 came out. on my win xp i've installed R 2.2.0. over time i've installed a lot of contributed packages. my internet connection is not very fast. so my question: is it possible after installing R 2.2.1 to copy/paste the contributed packages from the C:\Programme\R221 to the C:\Programme\R2.2.1- location in the file system? or do i have to download and install the packages anew?
> > > greetings from snowy austria
> > > merry christmas
> > > helli
> > > system: R 2.2.0, win xp
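Replies 3 and 4 above boil down to recording the names of the add-on packages in the old installation and re-installing them in the new one. A minimal sketch, with the file name addons.rds and the use of tempdir() chosen only for the example, and the install step left commented out because it needs network access:

```r
# In the OLD installation: list packages that are neither base nor recommended
ip <- installed.packages()
addons <- rownames(ip)[is.na(ip[, "Priority"])]

# Save the names somewhere both installations can read (path is an assumption)
list_file <- file.path(tempdir(), "addons.rds")
saveRDS(addons, list_file)

# In the NEW installation: read the list back and install what is missing
wanted <- readRDS(list_file)
missing_pkgs <- setdiff(wanted, rownames(installed.packages()))
# install.packages(missing_pkgs)   # uncomment to actually download and install
```

Re-installing rather than copying guarantees that each package is built for the running version of R, at the cost of the download that the original poster wanted to avoid.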
Re: [R] problem with get command
ov$vn1 is not a variable. It is the result of applying the $ function to the ov and vn1 arguments. For example, using BOD, which is a data frame that comes with R, rather than

  get("BOD$Time")

use

  get("BOD")[["Time"]]

On 4/26/06, Thomas Davidoff [EMAIL PROTECTED] wrote:
> I don't understand what my error is in the following: I need to use the get command on a series of variables, but can't for some reason that I don't understand. Why am I told no such variable as ov$vn1 after getting a summary report on that very variable?
>
> summary(ov$vn1)
>     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.     NA's
>      1.0     25.0     81.0    468.1    450.0 159100.0   6050.0
> dvars <- paste("ov$dvn", 1:4, sep="")
> vars <- c("ov$vn1","ov$vn2","ov$vn3","ov$vn4")
> summary(get(vars[1]))
> Error in get(x, envir, mode, inherits) : variable "ov$vn1" was not found
> Execution halted
>
> Thomas Davidoff
> Assistant Professor
> Haas School of Business
> UC Berkeley
> Berkeley, CA 94618
> Phone: (510) 643-1425
> Fax: (510) 643-7357
> email: [EMAIL PROTECTED]
> web: http://faculty.haas.berkeley.edu/davidoff/
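A short sketch of the point above, using the built-in BOD data frame. The column-name vector cols is only an illustration of how a series of columns could be processed by name; it is not from the original post.

```r
# get() looks up an object by its name; "BOD$Time" is not an object name
bad <- tryCatch(get("BOD$Time"), error = function(e) conditionMessage(e))

# Fetch the data frame by name, then extract the column
time1 <- get("BOD")[["Time"]]

# Looping over several columns: index by column name instead of using get()
cols <- c("Time", "demand")
sums <- lapply(cols, function(nm) summary(BOD[[nm]]))
names(sums) <- cols
```

Here bad holds the error message from the failed lookup, while time1 is the same vector as BOD$Time.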
Re: [R] stl function
stl does this internally:

  x <- na.action(as.ts(x))

so

  stl(x, s.window, na.action = f)

is the same as

  stl(f(as.ts(x)), s.window)

e.g.

  nottem[25] <- NA   # nottem is a built-in data set in R
  stl(nottem, "per") # error
  stl(nottem, "per", na.action = na.contiguous)
  library(zoo)
  stl(nottem, "per", na.action = na.locf)
  stl(nottem, "per", na.action = na.approx)

Whether any of these makes sense is another matter.

On 4/26/06, Andrea Toreti [EMAIL PROTECTED] wrote:
> Hi,
> I have a monthly time series with missing values and I would like to use the stl function to identify seasonality. I tried all settings of na.action but the result is the same:
>
> stl(tm245, s.window=11, na.action=na.pass)
> Error in stl(tm245, s.window = 11, na.action = na.pass) : NA/NaN/Inf in foreign function call (arg 1)
>
> Can you help me?
> Thanks
> Andrea Toreti
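A self-contained version of the na.contiguous variant above, which needs only base R. The copy x is introduced here so the built-in nottem series is not masked; na.contiguous keeps the longest run of non-missing values, so the first two years are dropped in this case.

```r
x <- nottem          # local copy of the built-in monthly series (240 values)
x[25] <- NA          # introduce one missing value

# The default na.action is na.fail, so this call stops with an error
failed <- inherits(try(stl(x, "per"), silent = TRUE), "try-error")

# Keep only the longest stretch without NAs, then decompose
fit <- stl(x, "per", na.action = na.contiguous)
n_used <- nrow(fit$time.series)   # observations 26 to 240 remain
```

Whether discarding the shorter stretch is acceptable depends on the data; the zoo alternatives in the post (na.locf, na.approx) fill the gap instead.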
Re: [R] gmane?
They can be found here:

http://dir.gmane.org/gmane.comp.lang.r.announce
  Announcements about the development of the R Project for Statistical Computing and the availability of new code. (read-only)
http://dir.gmane.org/gmane.comp.lang.r.deal
  Learning Bayesian networks in R - the 'deal' package
http://dir.gmane.org/gmane.comp.lang.r.debian
  Discussion of the Debian port of the statistical software GNU R
http://dir.gmane.org/gmane.comp.lang.r.devel
  R language developers list
http://dir.gmane.org/gmane.comp.lang.r.general
  The `main' R mailing list, a language and environment for statistical computing and graphics.
http://dir.gmane.org/gmane.comp.lang.r.geo
  Discussion of geographical data in the statistical software GNU R
http://dir.gmane.org/gmane.comp.lang.r.gr
  R Special Interest Group on gRaphical models
http://dir.gmane.org/gmane.comp.lang.r.gui
  Discussion of the Graphical User Interface for the statistical software GNU R
http://dir.gmane.org/gmane.comp.lang.r.mac
  R Special Interest Group on Macintosh Development and Porting, both for MacOS 8.6 - 9.x and MacOS X
http://dir.gmane.org/gmane.comp.lang.r.r-metrics
  Mailing list for discussions relating to use of GNU R in 'finance', i.e. financial engineering, financial economics, empirical finance, computational finance, ...

On 4/27/06, Jose Quesada [EMAIL PROTECTED] wrote:
> Hi All,
> I recently found gmane http://gmane.org/
> It's a system to convert mail to news and back, with the nice property of keeping a searchable archive... Very convenient if you are subscribed to many lists and don't want to have your mail box cluttered.
> I use it to read several mailing lists already, but R is not available there. I wonder if the admins know about gmane and if they think it'd be a good idea to have R-help added there.
> Quoting from their site: To get a new mailing list added, use the subscription form. Almost any mailing list can be added. Just include subscription information. Mailing list archives can be imported into Gmane.
> What do you think?
> --
> Cheers, -Jose
> --
> Jose Quesada, PhD. [EMAIL PROTECTED]
> Dept. of Psychology http://www.andrew.cmu.edu/~jquesada
> Sussex University
> Brighton, UK
Re: [R] gmane?
Assuming you are aware that the dir pages at gmane provide the key links for each group, googling for:

  dir gmane r-help

gets it as the first hit.

On 4/27/06, Jose Quesada [EMAIL PROTECTED] wrote:
> Thanks all,
> It was very surprising that I couldn't find it. I searched news.gmane.org for r-help, and nothing popped up.
> -Jose
>
> On 4/27/06, Sundar Dorai-Raj [EMAIL PROTECTED] wrote:
> > R-help is already there. Go to http://www.r-project.org/ and click the Search link. There's a link to Gmane there. The relevant group for R-help is called gmane.comp.lang.r.general. I agree that Gmane is very useful.
> > --sundar
>
> --
> Cheers, -Jose
> --
> Jose Quesada, PhD. [EMAIL PROTECTED]
> Dept. of Psychology http://www.andrew.cmu.edu/~jquesada
> Sussex University
> Brighton, UK
Re: [R] State space AR models in R: some examples
Check out the sspir package and http://www.jstatsoft.org/index.php?vol=16

On 4/27/06, Pablo Almaraz [EMAIL PROTECTED] wrote:
> Hi all,
> Does anyone have an example of an autoregressive (AR) time-series model specified as a state space model in R? That is, I want to go beyond the locally linear (constant) model, and fit the following Gaussian AR state process model:
>
>   X[t] = a + (1+b)*X[t-1] + epsilon
>
> where the model for the observation process is
>
>   Y[t] = X[t] + tau
>
> I have information on the tau's (observation variance) for each observation in the time series, and it would be perfect to include this information during the fitting routine. I have actually coded this as WinBUGS code (pasted below), but I'm not quite sure it works as it should. I would be extremely grateful if anyone could submit an example of R code fitting the above problem. A Gibbs sampler for solving the problem would be great; I'm not sure whether the Kalman filter would work well with only 30 data points (?). Additional details, corrections and/or help would probably save my life, at least for a while.
> Thank you all
> Cheers
> Pablo
>
> WinBUGS state-space R code:
> ##
> model;
> {
>   # Parameters and priors
>   alpha ~ dnorm(0,0.01)            # Intrinsic rate of increase
>   b ~ dnorm(0,0.01)
>   beta[1] <- b-1                   # First-order density-dependence
>   sigma ~ dunif(0, 1000)           # State process SD
>   isigma2 <- pow(sigma, -2)        # State process 1/var
>   # isigma2 ~ dgamma(0.01,0.01)
>   # Initial state value
>   n.exp[1] ~ dnorm(n[1],tau[1])
>   # State process model
>   for(j in 1:(N-1)){
>     n.exp.mu[j+1] <- alpha + b*n.exp[j]   # First-order Gompertz model
>     n.exp[j+1] ~ dnorm(n.exp.mu[j+1], isigma2)
>   }
>   # Observation process model
>   for(j in 1:(N-1)){
>     n[j+1] ~ dnorm(n.exp[j+1],tau[j+1])
>   }
> }
> # Loge-transformed and standardized time-series data
> list(N=28,
>   n=c(-0.24645, 0.015312, 0.442262, -0.05879, -0.17308, -0.03778, 0.120961, -0.04383, 0.002507, 0.073278, -0.11684, 0.003657, -0.07375, 0.05006, -0.04489, -0.00826, -0.06713, 0.682228, 0.032058, -0.33254, -0.50432, 0.176914, 0.249793, 0.01672, -0.30581, -0.19617, 0.158579, 0.185296),
>   tau=c(2.38351, 2.351379, 49.12811, 10.01703, 11.68982, 3.846619, 1.999254, 1.6685, 3.011932, 5.661051, 168.2524, 1.581, 25.74985, 50.29332, 3.03117, 7.65013, 3.376606, 17.34871, 4.215985, 2.455294, 7.685724, 1.918054, 5.588953, 8.503541, 0.5666, 0.923611, 4.986243, 10.36613))
> # Inits for MC 1
> list(alpha = 0.5, b = -1, sigma = 0.5)
> # Inits for MC 2
> list(alpha = 1, b = 0.01, sigma = 1)
> # Inits for MC 3
> list(alpha = 0.01, b = 1, sigma = 0.01)
> ### End (not run)
> --
> Pablo Almaraz García
> Estación Biológica de Doñana (CSIC)
> Pabellón del Perú, Avda. Mª Luísa s/n
> E-41013, Sevilla
> SPAIN
> E-mail: almaraz[AT]ebd[DOT]csic[DOT]es
> webpage: http://www.almaraz.org
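For readers who want to see the model in plain R, here is a minimal sketch of a maximum-likelihood fit with a hand-written scalar Kalman filter. It is not the sspir interface and not the poster's WinBUGS model: the data are simulated, the names neg_loglik, phi (standing for 1 + b), q (state variance) and r (the known per-observation variances) are all assumptions made for illustration, and the initial state is simply centred on the first observation.

```r
set.seed(1)

# Simulate X[t] = a + phi * X[t-1] + eps,  eps ~ N(0, q)
#          Y[t] = X[t] + nu[t],            nu[t] ~ N(0, r[t]), r known
n <- 200
a_true <- 0.1; phi_true <- 0.7; q_true <- 0.25
r <- runif(n, 0.05, 0.5)                 # known observation variances
x <- numeric(n)
x[1] <- a_true / (1 - phi_true)          # start at the stationary mean
for (t in 2:n) x[t] <- a_true + phi_true * x[t - 1] + rnorm(1, 0, sqrt(q_true))
y <- x + rnorm(n, 0, sqrt(r))

# Negative log-likelihood from the prediction-error decomposition
neg_loglik <- function(par, y, r) {
  a <- par[1]; phi <- par[2]; q <- exp(par[3])   # log scale keeps q > 0
  m <- y[1]; P <- r[1]                           # initial state mean and variance
  ll <- 0
  for (t in 2:length(y)) {
    m_pred <- a + phi * m                        # predict the state
    P_pred <- phi^2 * P + q
    S <- P_pred + r[t]                           # innovation variance
    v <- y[t] - m_pred                           # innovation
    ll <- ll + dnorm(v, 0, sqrt(S), log = TRUE)
    K <- P_pred / S                              # Kalman gain
    m <- m_pred + K * v                          # update the state
    P <- (1 - K) * P_pred
  }
  -ll
}

fit <- optim(c(0, 0.5, 0), neg_loglik, y = y, r = r)
est <- c(a = fit$par[1], phi = fit$par[2], q = exp(fit$par[3]))
```

With real data, y would be the observed series and r the observation variances. Note that WinBUGS parameterises dnorm by precision, so values supplied as tau in a BUGS model would need converting with 1/tau before being used as variances here.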
Re: [R] scope of variable/object ?
You probably need to contact the developer of pamr, but short of investigating it, a workaround might be to put a copy of myd2 into the global environment, since it likely will at least look there, e.g. add this line to domat:

  assign("myd2", myd2, .GlobalEnv)

On 4/27/06, Tim Smith [EMAIL PROTECTED] wrote:
> Hi,
> I must be missing something here... Essentially, a short piece of code works if it's standalone, but doesn't work if it's divided into two functions. The code that works is:
>
> ### WORKS ###
> library(pamr)
> set.seed(120)
> x <- matrix(rnorm(1000*20),ncol=20)
> y <- sample(c(1:4),size=20,replace=TRUE)
> mydata <- list(x=x,y=y)
> mytrain <- pamr.train(mydata)
> new.scales <- pamr.adaptthresh(mytrain,ntries = 1)
>
> But if I split the lines into two functions, then I get an error message that reads:
> 'Error in pamr.train(data = myd2, threshold = threshold, threshold.scale = all.scales[i+ : object "myd2" not found'
> The code that doesn't work is:
>
> ### DOESN'T WORK ###
> library(pamr)
> domat <- function(myd){
>   myd2 <- myd
>   mytrain <- pamr.train(myd2)
>   new.scales <- pamr.adaptthresh(mytrain)
> }
> dom <- function(){
>   set.seed(120)
>   x <- matrix(rnorm(1000*20),ncol=20)
>   y <- sample(c(1:4),size=20,replace=TRUE)
>   myda <- list(x=x,y=y)
>   domat(myda)
> }
> dom()
>
> Did I do something really goofy? How can I find out what's happening?
> many thanks.
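The kind of failure described above can be reproduced without pamr. The sketch below is a guess at the mechanism, not pamr's actual code: fit() and refit() are hypothetical stand-ins for a function that stores its call and a second function that re-evaluates that call somewhere other than where the data lives.

```r
# Hypothetical fitting function that remembers how it was called
fit <- function(data) structure(list(call = match.call()), class = "toyfit")

# Hypothetical refit step: re-evaluates the stored argument in the
# global environment rather than in the caller's frame
refit <- function(obj) eval(obj$call$data, envir = globalenv())

wrapper <- function(d) {
  d2 <- d                # d2 exists only inside wrapper()
  refit(fit(d2))         # so looking up "d2" globally fails
}
res_fail <- tryCatch(wrapper(1:3), error = function(e) "not found")

wrapper_fixed <- function(d) {
  d2 <- d
  assign("d2", d2, globalenv())   # the workaround suggested above
  refit(fit(d2))
}
res_ok <- wrapper_fixed(1:3)
```

The assign() workaround leaves a copy of d2 in the global workspace, so it is best treated as a stopgap until the package evaluates its stored call in the right environment.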
Re: [R] Break into Parts
Try this:

  tapply(x, cut(x, 12), sd)

On 4/28/06, sumanta basak [EMAIL PROTECTED] wrote:
> Hi R-Experts,
> I have a vector of length 72. I want to break it into 12 parts and want to take the standard deviation of each group. Please help me in this regard.
> Thanks,
> Sumanta.
Re: [R] Break into Parts
Good point. Following Andy's comment,

  sd(matrix(sort(x), nc=12))

could also be used if you want them broken up by 6 smallest, next 6 smallest, etc., although there might be differences in the case of ties. Using tapply, here are a number of ways of breaking it up (the first three give the same answer as sd(matrix(x,nc=12))) while the others form the groups in different ways:

  tapply(x, gl(12, 6), sd)
  tapply(x, rep(1:12, each = 6), sd)
  tapply(sort(x), gl(12, 6), sd)
  tapply(x, rep(1:12, 6), sd)
  tapply(sort(x), rep(1:6, 12), sd)
  tapply(sort(x), rep(1:6, each = 12), sd)

On 4/28/06, Liaw, Andy [EMAIL PROTECTED] wrote:
> You didn't say _how_ you want the vector to be broken up, so you get two different answers from Uwe and Gabor. Uwe's answer groups every six elements into one group, in the order they appear in the vector (which, BTW, can be simplified to just sd(matrix(x, ncol=12))). Gabor's answer puts the smallest six into one group, the next smallest six into the second group and so on. You'll have to decide which is the one you want.
> Andy
>
> From: sumanta basak
> > Hi R-Experts,
> > I have a vector of length 72. I want to break it into 12 parts and want to take the standard deviation of each group. Please help me in this regard.
> > Thanks,
> > Sumanta.
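A small check of the groupings discussed in this thread, on made-up data. Two caveats for anyone running the thread's code today: sd() no longer works column by column on a matrix in current R, so apply() is used here instead, and cut(x, 12) splits the range of x into 12 equal-width intervals, which need not contain six values each.

```r
set.seed(42)
x <- rnorm(72)                       # made-up data, length 72

# Consecutive blocks of six, in the order the values appear
by_block <- tapply(x, gl(12, 6), sd)
by_col   <- apply(matrix(x, ncol = 12), 2, sd)    # same groups, as columns

# Six smallest, next six smallest, and so on
by_size  <- tapply(sort(x), gl(12, 6), sd)

# Twelve equal-width intervals over the range of x; group sizes vary,
# and an interval holding fewer than two values gives NA
by_width <- tapply(x, cut(x, 12), sd)
```

by_block and by_col agree exactly; by_size and by_width answer different questions, which is the point of Andy's comment.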
Re: [R] plot acf of several timeseries
Try this:

  lapply(names(tslist), function(nm) acf(tslist[[nm]], main = nm))

On 4/28/06, Ulf Mehlig [EMAIL PROTECTED] wrote:
> Hello r-help,
> I have a couple of time-series of different length and I would like to produce a simple overview plot showing the autocorrelation functions of the series. The time-series are stored in a dataframe like this:
>
> test.data
>    item year value
> 1   xxx 1961 -1.09
> 2   xxx 1962  0.21
> 3   xxx 1963 -0.81
> [trimmed]
> 8   yyy 1959  1.12
> 9   yyy 1960  1.44
> 10  yyy 1961 -1.97
> [trimmed]
>
> I transformed them to a list of ts-objects and did the plotting via lapply():
>
> tslist <- by(test.data, test.data$item, function(x) ts(x$value, start=min(x$year), end=max(x$year)) )
> par(mfcol=c(length(tslist), 1))
> lapply(tslist, acf)
>
> Is there a possibility to adapt the procedure so that the name of item ('xxx', 'yyy', ...) is printed as title of each acf plot? I am sure that there are better ways to produce this type of plot ... do you have suggestions?
> Many thanks,
> Ulf
> --
> Ulf Mehlig [EMAIL PROTECTED]
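A runnable sketch of the same idea. The data values are invented for illustration, and split() is used instead of by() so that tslist is a plain named list; the lapply() call is the one suggested above. pdf(NULL) opens a device that writes no file, and can be dropped when plotting on screen.

```r
# Invented data in the same layout as the poster's test.data
test.data <- data.frame(
  item  = rep(c("xxx", "yyy"), each = 12),
  year  = c(1961:1972, 1959:1970),
  value = c(sin(1:12), cos(1:12))
)

# One ts object per item, named after the item
tslist <- lapply(split(test.data, test.data$item),
                 function(d) ts(d$value, start = min(d$year)))

pdf(NULL)                                  # device without an output file
par(mfcol = c(length(tslist), 1))          # one panel per series
res <- lapply(names(tslist),
              function(nm) acf(tslist[[nm]], main = nm))
invisible(dev.off())
```

Iterating over the names rather than the list elements is what makes the item label available for the main title.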
Re: [R] copying previously installed libraries to R 2.3.0
A corrected version is now in batchfiles_0.2-8.zip in: http://cran.r-project.org/contrib/extra/batchfiles/ and will propagate to the mirrors shortly.

On 4/28/06, Xiaohua Dai [EMAIL PROTECTED] wrote:
> copydir.bat won't work for libraries such as clim.pact, haplo.stats, hier.part, pls.pcr, R.matlab, R.oo. It truncates the new directories to clim, haplo, hier, pls, R.
>
> On 4/26/06, Thomas Harte [EMAIL PROTECTED] wrote:
> > hi all,
> > is there a new mechanism in R 2.3.0 for copying libraries from, say, R 2.2.1 to R 2.3.0? i ask because gabor grothendieck comments in his copydir.bat (from gabor's batchfiles at: http://cran.r-project.org/contrib/extra/batchfiles/batchfiles_0.2-5.zip ):
> >   :: I personally upgraded my 2.1.0 to 2.2.0 this way so it seems ok until
> >   :: R replaces this with something better which is expected for 2.3.0.
> > see also the posting below.
> > cheers, thomas.
> >
> > [R] copy contributed packages from R 2.2.0 to 2.2.1
> > From: Ronnie Babigumira rb.glists
> > Date: Fri, 23 Dec 2005 15:58:36 +0100
> >
> > Hi Helli, this came up last week. Here are some of the replies posted:
> >
> > 1. In http://cran.r-project.org/contrib/extra/batchfiles/batchfiles_0.2-5.zip are two Windows XP batch files, movedir.bat and copydir.bat, which will move the packages (which is much faster and suitable if you don't need the old version of R any more) or copy the packages (which takes longer but preserves the old version).
> >
> > 2.
> >   x <- installed.packages()[, 1]
> >   install.packages(x)
> >
> > 3. This is one reason we normally recommend that you install into a separate library. Then update.packages(checkBuilt = TRUE) is all that is needed. However,
> >   foo <- installed.packages()
> >   as.vector(foo[is.na(foo[, "Priority"]), 1])
> > will give you a character vector which you can feed to install.packages(), so it's not complex to do manually.
> >
> > 4. If the previous installation is still alive, fire it up and
> >   pS <- packageStatus()
> >   pkgs <- pS$inst$Package[!pS$inst$Priority %in% c("base", "recommended")]
> >   save(pkgs, file = "foo")
> > In the new installation,
> >   load("foo")
> >   install.packages(pkgs)
> >
> > Helmut Kudrnovsky wrote:
> > > hi R-users,
> > > a few days ago R 2.2.1 came out. on my win xp i've installed R 2.2.0. over time i've installed a lot of contributed packages. my internet connection is not very fast. so my question: is it possible after installing R 2.2.1 to copy/paste the contributed packages from the C:\Programme\R221 to the C:\Programme\R2.2.1- location in the file system? or do i have to download and install the packages anew?
> > > greetings from snowy austria, merry christmas
> > > helli
> > > system R.2.2.0 win xp
Re: [R] aggregating columns in a data frame in different ways
Here are three possibilities:

1. aggregate on the columns that you want to sum and aggregate on the columns that you want to average and then merge them:

  By <- A[, 2, drop = FALSE]
  merge(aggregate(A[, 3, drop = FALSE], By, sum),
        aggregate(A[, 4, drop = FALSE], By, mean))

2. use by:

  f <- function(x) with(x, c(count = sum(count), value = mean(value)))
  do.call(rbind, by(A[, 3:4], A[, 2, drop = FALSE], f))

3. use summaryBy in the doBy package, picking off the appropriate columns in the output:

  library(doBy)
  summaryBy(. ~ type, A[, -1], FUN = c(sum, mean))[, c(1, 2, 5)]

On 4/28/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
> I would like to use aggregate() to combine statistics for several days in a data frame. My data frame looks similar to this:
>
>         date type count value
> 1 2006-04-01    A    10  99.6
> 2 2006-04-01    B     4  33.2
> 3 2006-04-02    A    22  43.2
> 4 2006-04-02    B     8  44.9
> 5 2006-04-03    A    12  12.4
> 6 2006-04-03    B    14  18.5
>
> ('date' is a factor, and my actual data frame has about 100 different 'types', not just two)
>
> I would like to sum up the 'counts' per 'type', and get an average of the 'values' per 'type'. In other words, I would like my results to look like this:
>
>   type count value
> 1    A    44  51.7
> 2    B    26  32.2
>
> The way I'm doing this now is to tear the table apart into its individual columns, then apply aggregate() to each column individually (using the 'type' column for the 'by' parameter), and finally put everything back together, like this:
>
> A.count = aggregate(A$count, list(type=A$type), sum)
> A.value = aggregate(A$value, list(type=A$type), mean)
> B = data.frame(type=A.count$type, count=A.count$x, value=A.value$x)
>
> My actual table is a bit more involved than in this simple example, however, so this becomes quite tedious. I am hoping that there is a simpler way of doing this, for example by providing different FUN parameters for each column to the aggregate() function. I would appreciate any suggestions.
> Thanks
> Klaus
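The first possibility above, run on the poster's six sample rows. Columns are selected by name here rather than by position, which is a small departure from the code in the reply.

```r
A <- data.frame(
  date  = rep(c("2006-04-01", "2006-04-02", "2006-04-03"), each = 2),
  type  = rep(c("A", "B"), times = 3),
  count = c(10, 4, 22, 8, 12, 14),
  value = c(99.6, 33.2, 43.2, 44.9, 12.4, 18.5)
)

# Grouping column kept as a one-column data frame so its name survives
By <- A[, "type", drop = FALSE]

# Sum the counts, average the values, then join the two results on 'type'
res <- merge(aggregate(A[, "count", drop = FALSE], By, sum),
             aggregate(A[, "value", drop = FALSE], By, mean))
```

res has one row per type, with count 44 and 26 and mean value about 51.7 and 32.2, matching the table the poster asked for.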
Re: [R] How to rep a matrix by row?
Try this, where DF is your data frame:

  DF[rep(seq(nrow(DF)), each = 3), ]

On 4/29/06, Jiantao Shi [EMAIL PROTECTED] wrote:
> Hi,
> I have a dataframe like this:
>
> Source Treat Drug Replicate
> control 0 A 1
> control 10 A 2
> control 30 A 3
> 10 A 1
>
> and I want to rep this dataframe 3 times by row, the resulting matrix as follows:
>
> Source Treat Drug Replicate
> control 0 A 1
> control 0 A 1
> control 0 A 1
> control 10 A 2
> control 10 A 2
> control 10 A 2
> control 30 A 3
> control 30 A 3
> control 30 A 3
> 10 A 1
> 10 A 1
> 10 A 1
>
> So is there an easy way to do this? Thanks.
> Jianao Shi
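A quick illustration of the indexing trick above on a three-row data frame. The values are modelled loosely on the question, and seq_len() is used in place of seq() only because it also behaves for a data frame with zero rows.

```r
DF <- data.frame(
  Source    = "control",
  Treat     = c(0, 10, 30),
  Drug      = "A",
  Replicate = 1:3
)

# Row indices 1,1,1,2,2,2,3,3,3 repeat each row three times in place
idx <- rep(seq_len(nrow(DF)), each = 3)
out <- DF[idx, ]
```

Using rep(..., times = 3) instead of each = 3 would stack three whole copies of the data frame one after another.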
Re: [R] Reshaping genetic data from long to wide
You might want to double check the list archives https://www.stat.math.ethz.ch/pipermail/r-help/ to see if your posts got through or not, just in case it's just some problem in displaying your own posts.

On 4/29/06, Farrel Buchinsky [EMAIL PROTECTED] wrote:
> Gabor Grothendieck ggrothendieck at gmail.com writes:
> > http://news.gmane.org/gmane.comp.lang.r.general or one of these: http://dir.gmane.org/gmane.comp.lang.r.general
>
> Yes, but when I hit "Post this article" it sends something to gMane (I think) but not to R-help@stat.math.ethz.ch
Re: [R] Package docs for CRAN
If your package is called mypkg you could create a mypkg-package.Rd file, e.g.

  library(dyn)
  library(help = dyn) # note that mypkg-package is listed
  package?dyn
  ?"dyn-package" # same

and you could add one or more vignettes, e.g.

  library(zoo)
  library(help = zoo) # note that the 2 vignettes are listed at end
  vignette("zoo")

On 4/30/06, William Asquith [EMAIL PROTECTED] wrote:
> CRAN et al.,
> I would like to add an extended introduction or other arbitrary sections to my package lmomco. I have been shipping inst/doc/Introduction.Rd. I would like to have this content inserted at the front of the PDF build for CRAN. The R-exts.pdf seems to be a little silent on this subject? For my purposes, I have been doing this
>
>   R CMD Rd2dvi --pdf --title="lmomco---version X" inst/doc/Introduction.Rd man/*.Rd
>
> but I don't get the correct header (description) or the index built as seen in the lmomco.pdf from CRAN. Further, is there any point in shipping a complete PDF build of the docs as in inst/doc/lmomco.pdf? Please advise on best practices for building the best docs that I can.
> William
Re: [R] Duration labels on plot axes
The times class in chron will give hours and minutes, e.g.

  library(times)
  plot(times(0:23/23), 0:23)

and you could modify chron:::axis.times for the others.

On 4/30/06, Duncan Murdoch [EMAIL PROTECTED] wrote:
> I have a variable containing a measurement of a duration in seconds which I would like to use in a plot, with the axes labelled in a format like %H:%M:%S (or possibly %Hh%Mm) if the duration is more than an hour, or just %M:%S if more than a minute, or just decimal seconds if short enough. Is there a class with an axis.* method defined that has behaviour something like this?
> Duncan Murdoch
Re: [R] general help on R and factor in R and a few simple comment from a newbie
On 4/30/06, Guojun Zhu [EMAIL PROTECTED] wrote:
> Hi. I am starting to learn R for a course project. I am a relatively OK C++ programmer. I found that R is very different. I have read An Introduction to R. I have to say it is not very newbie friendly. It does not explain many things clearly. And unfortunately, there is not much introductory material available online. I do not want to buy a book.

Enter R into google and you get the R home page. On the left pane of that, under Documentation, click on Other, and from there click on Contributed Documentation; there is a list of literally dozens of different introductions. Also google for

  zoonekynd R

for another online intro to R.

> For example, I found factor is quite a different concept. I cannot use it as a vector, which I can somehow think of as a 1-dimensional array. help(factor) does not help much to clear up the concept either. Also there are quite a few basic concepts, like the data structure of a model, etc., that are far from clear to me. Yet there is no general place I can look for the more general idea.
> help is a very interesting and useful function. However, I would say the content lacks some general idea. I used to learn Mathematica, which is also a high-level tool, from its help. It is very comprehensive, yet well organized, with some general ideas, some specific function explanations and some functions grouped by topic. For R's help, you get only the specific explanation for the particular function, and no more related things. I feel it is more like a reference for experienced users than something for a newbie. I know there should be some tricks in R with dense code for big work. But unfortunately, I could not find many places to learn them.
> Now for a specific question: I use read.csv to read some data from an Excel data file (about 30,000 lines of data). Some columns have empty data, so NA was read. But they were read in as a factor instead of a vector. I need to manipulate them later as a vector (for example standardizing by dividing by the standard deviation, or deriving a new column from two or more other columns). How do I convert it into a vector? Or maybe some functions already exist for factors?

Check out the na.strings= and possibly the as.is = TRUE arguments on read.table. Also the read.xls command from the gdata package may be helpful.
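A small sketch of the read.table arguments mentioned in the reply. The CSV text and the "." missing-value marker are invented for illustration; a stray non-numeric entry like this is one common reason a numeric column ends up as a factor. stringsAsFactors = TRUE is set explicitly because recent versions of R no longer convert strings to factors by default.

```r
csv <- "id,price\n1,10.5\n2,.\n3,12.0\n"   # invented file contents

# The "." entry makes the whole column non-numeric, so it becomes a factor
raw <- read.csv(text = csv, stringsAsFactors = TRUE)

# Declaring the missing-value markers lets the column be read as numeric
clean <- read.csv(text = csv, na.strings = c("NA", ".", ""), as.is = TRUE)

# An existing factor can be converted via its labels, not its integer codes
price_num <- suppressWarnings(as.numeric(as.character(raw$price)))

# Standardising by the standard deviation, ignoring missing values
clean$price_std <- clean$price / sd(clean$price, na.rm = TRUE)
```

Calling as.numeric() directly on a factor returns the internal level codes, which is why the as.character() step is needed.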
Re: [R] Duration labels on plot axes
That should have been:

  library(chron)
  plot(times(0:23/24), 0:23)

On 4/30/06, Gabor Grothendieck [EMAIL PROTECTED] wrote:
> The times class in chron will give hours and minutes, e.g.
>
>   library(times)
>   plot(times(0:23/23), 0:23)
>
> and you could modify chron:::axis.times for the others.
>
> On 4/30/06, Duncan Murdoch [EMAIL PROTECTED] wrote:
> > I have a variable containing a measurement of a duration in seconds which I would like to use in a plot, with the axes labelled in a format like %H:%M:%S (or possibly %Hh%Mm) if the duration is more than an hour, or just %M:%S if more than a minute, or just decimal seconds if short enough. Is there a class with an axis.* method defined that has behaviour something like this?
> > Duncan Murdoch
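For comparison, here is a base-R sketch that labels an axis directly instead of going through chron. The helper fmt_duration is a made-up name, not an existing function; it picks H:MM:SS, M:SS or plain seconds depending on the size of each value, roughly as described in the question.

```r
# Format durations given in seconds (hypothetical helper)
fmt_duration <- function(secs) {
  h <- secs %/% 3600
  m <- (secs %% 3600) %/% 60
  s <- secs %% 60
  s_whole <- floor(s)    # whole seconds for the clock-style labels
  ifelse(h > 0, sprintf("%d:%02d:%02d", h, m, s_whole),
         ifelse(m > 0, sprintf("%d:%02d", m, s_whole),
                sprintf("%g", s)))
}

labels <- fmt_duration(c(5.5, 75, 3725))

# Usage: suppress the default axis, then draw one with formatted labels
dur   <- c(30, 400, 1800, 3725, 7200)    # made-up durations in seconds
ticks <- pretty(dur)
plot(dur, seq_along(dur), xaxt = "n", xlab = "duration")
axis(1, at = ticks, labels = fmt_duration(ticks))
```

Formatting each tick independently can mix styles on one axis; choosing a single format from the largest value would be a reasonable alternative.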
Re: [R] Yet another help needed
Look at

  ?filter
  ?embed
  rollmean in the zoo package
  running in the gtools package
  runmean in the caTools package

The last one is probably the fastest.

On 4/30/06, Guojun Zhu [EMAIL PROTECTED] wrote:
> I have a big data.frame with about 20 columns and 60,000 rows to analyze. Let us say I have a column a. I want to generate a new column whose value should be the average of the 60 values of a before the current row. Let us say every row is time t_i. I need to calculate a(t(i-60)) + a(t(i-59)) + ... + a(t(i-1)). Also I want to get another number by running a regression of a(t(i-60)), ..., a(t(i-1)) on b(t(i-60)), ..., b(t(i-1)). Is there any simple, dense code for this?
> Thanks.
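A base-R sketch of both requests using stats::filter, with a window of k = 3 instead of 60 so the numbers are easy to check by hand. The vectors a and b are made up, and the explicit loop for the rolling regression is just one straightforward way to do it, not the fastest.

```r
a <- c(3, 5, 4, 8, 6, 9, 7, 10, 12, 11)   # made-up series
b <- 2 * a + 1                            # made-up regressor
k <- 3                                    # window length (60 in the question)

# One-sided moving average of the current and previous k - 1 values ...
ma <- as.numeric(stats::filter(a, rep(1 / k, k), sides = 1))
# ... shifted down one row so that row i holds the mean of a[(i-k):(i-1)]
prev_mean <- c(NA, head(ma, -1))

# Rolling regression of a on b over the previous k rows: slope per row
prev_slope <- vapply(seq_along(a), function(i) {
  if (i <= k) return(NA_real_)
  w <- (i - k):(i - 1)
  unname(coef(lm(a[w] ~ b[w]))[2])
}, numeric(1))
```

The first k entries of both results are NA because fewer than k earlier rows exist. For 60,000 rows the filter call is fast, while the lm() loop may be slow enough to justify one of the packaged functions listed above.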
Re: [R] table of means/medians across bins used for a histogram
My understanding is that you want to replace each rate with its average over the associated bin and then plot age against that. In that case try this:

  DF # test data
      age rate bin
  1 0.002 10.0   A
  2 0.045  0.1   B
  3 0.130 15.0   A
  4 0.150 34.0   D

  with(DF, plot(ave(rate, bin), age))

If instead they are stored in separate vectors age, rate and bin, we would have

  plot(ave(rate, bin), age)

On 4/30/06, lalitha viswanath [EMAIL PROTECTED] wrote:
> Hi
> I am trying to get a table of means of parameter 1 across BINS of parameter 2. I am working in proteomics and a sample of my data is as follows:
>
> cluster-age  clock-rate (evolutionary rate)  scopclass
> 0.002        10                              A
> 0.045        0.1                             B
> 0.13         15                              A
> 0.15         34                              D
>
> Scop class has only 9 distinct categories (A-I), whereas cluster-age and clock-rate are discrete variables greater than 0. I am trying to do two things with this kind of data, out of which I managed to accomplish one thanks to the documentation and pre-existing queries on the mailing lists.
> 1. Plot a histogram of the age distribution with scop class category superimposed on each bin. I managed to do this with barplot2.
> 2. Now I am trying to plot a scatter plot of the age v/s the clock-rate. However, to eliminate possible sampling errors, we are trying to get an average of the clock-rate for each of the bins used above, i.e. before plotting an x-y plot, I wish to compute the average clock-rate in each of the bins for the age and then plot an x-y plot of the age v/s clock rate.
> Can anyone point me to appropriate functions for the same? I am trying to work with prop.table, cut, break, etc. But I am not heading anywhere.
> Thanks
> Lalitha
Re: [R] table of means/medians across bins used for a histogram
Or perhaps a bit simpler, using the DF test data from my previous message (columns age, rate and bin):

    plot(age ~ ave(rate, bin), DF)
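A minimal sketch of what ave() does in the replies above, assuming a small hypothetical data frame shaped like the sample in the question (the column name rate_avg is made up for illustration):

```r
# Hypothetical test data with the columns described in the thread
DF <- data.frame(age  = c(0.002, 0.045, 0.130, 0.150),
                 rate = c(10, 0.1, 15, 34),
                 bin  = c("A", "B", "A", "D"))

# ave() returns a vector as long as its input in which every value is
# replaced by the mean of the group it belongs to
DF$rate_avg <- ave(DF$rate, DF$bin)
print(DF)

# age plotted against the per-bin average rate
plot(age ~ rate_avg, data = DF)
```

Because ave() keeps one value per row, the result can be plotted directly against any other column without a separate aggregation step.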
Re: [R] pulling items out of a lm() call
Try this:

    # test data
    fo <- y ~ female + I(age^2) + female:black + (age + education) * female

    # create a list of form list(y = as.name("z.y"), ...) for use with substitute
    L <- sapply(all.vars(fo), function(nm) as.name(paste("z", nm, sep = ".")))
    do.call("substitute", list(fo, L))

On 5/1/06, Andrew Gelman [EMAIL PROTECTED] wrote:
> I want to write a function to standardize regression predictors, which will require me to do some character-string manipulation to parse the variables in a call to lm() or glm().
> For example, consider the call lm (y ~ female + I(age^2) + female:black + (age + education)*female).
> I want to be able to parse this to pick out the input variables (female, age, black, education). Then I can transform these as appropriate (to get z.female, z.age, etc), feed them back into the lm() function, and go from there.
> Does anyone know an easy way to pull out the variables? I basically have to parse out the symbols +, :, *, and , but there's also the problem of handling parentheses and the I() operator.
> Thanks!
> Andrew
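The two steps in the reply above (extract the variable names, then rewrite the formula) can be sketched separately. This is only an illustration under the assumption that the response should be left alone and only predictors renamed; the names predictors, L and new_fo are made up here:

```r
fo <- y ~ female + I(age^2) + female:black + (age + education) * female

# all.vars() walks the parsed formula, so parentheses, I() and the
# operators never have to be handled as character strings
all.vars(fo)                      # every variable, response included
predictors <- all.vars(fo[[3]])   # right-hand side only
print(predictors)

# named list mapping each predictor to a z.-prefixed symbol
L <- sapply(predictors, function(nm) as.name(paste("z", nm, sep = ".")))

# substitute the new symbols into the formula
new_fo <- do.call("substitute", list(fo, L))
print(new_fo)
```

The standardized columns (z.female and so on) would still have to be created in the data before the rewritten formula is passed to lm().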
Re: [R] How to strip one term from a data.frame? + How to write long line in script?
Using the built-in data frame iris, which has 5 columns, regress Sepal.Length against all other variables except the last one:

    lm(Sepal.Length ~ ., iris[1:4])

On 5/1/06, Guojun Zhu [EMAIL PROTECTED] wrote:
> I need to run a regression with 14 normal variables and 20 dummy variables. All the data is in a huge data.frame df, but there are some extra intermediate items in the same data.frame too. It would be nice if I could strip off those terms and run lm().
> Also, is there a simple way to write the formula, for example, just specify the y term, and all other terms in the data.frame should be x_i? Or is there some kind of automatic way to build it, like using ls(df) coupled with some other functions?
> If I have to write it in the brute force way, I will need to write a really long line in the script. I am on Windows. I found the script does not work with a long line. It will not work either if I break it into a few lines. How to get rid of that?
> Thanks.
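As a sketch of the two ways to avoid typing a long formula, again on iris: drop the unwanted columns and use the dot, or build the formula from column names with reformulate(). The names keep, fit1, fo and fit2 are illustrative only.

```r
# 1. drop the unwanted column(s) by name and let "." expand to the rest
keep <- setdiff(names(iris), "Species")
fit1 <- lm(Sepal.Length ~ ., data = iris[keep])

# 2. build the formula programmatically from the column names
fo <- reformulate(setdiff(keep, "Sepal.Length"), response = "Sepal.Length")
print(fo)
fit2 <- lm(fo, data = iris)

# a long call can also be split across lines as long as each line ends
# where the expression is visibly incomplete (e.g. after an operator)
fit3 <- lm(Sepal.Length ~ Sepal.Width +
             Petal.Length +
             Petal.Width, data = iris)
```

All three calls fit the same model, so the coefficients should agree.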
Re: [R] efficiency in merging two data frames
Some functions that may be of help:

    ?aggregate.ts
    ?cbind
    ?merge

and in the zoo package:

    ?as.yearmon
    ?as.yearqtr
    ?aggregate.zoo
    ?merge.zoo

On 5/1/06, Guojun Zhu [EMAIL PROTECTED] wrote:
> I have two data sets about lots of companies' stock and fiscal data. One is monthly data with about 144,000 lines, and the other is quarterly with about 56,000. Each data set uses a different company code. I need to merge these two together. I read both as csv, plus another file with the corresponding firm codes.
> Now I have three data sets: return$PERMNO, account$GVKEY, and id, the data frame of the corresponding relation, which has both id$PERMNO and id$GVKEY. Also, I need to convert the return's month into quarter and finally merge the two data frames (return and account). I ended up writing a short program for this, but it runs very slowly, 15+ minutes. Is there a quick way to do it? Here is my original code:
>
>     id$fy=rep(0,length(id$PERMNO))
>     for (i in 1:length(id$PERMNO)) id$fy[[i]]<-account$FYR[id$GVKEY[[i]]==account$GVKEY][[1]]
>     return$GVKEY=rep(0,length(return$PERMNO))
>     return$fyy=rep(0,length(return$PERMNO))
>     return$fyq=rep(0,length(return$PERMNO))
>     for (i in i:length(return$PERMNO)) {
>       temp<-id$PERMNO==return$PERMNO[[i]];
>       tempmon<-id$fy[temp][[1]];
>       if (return$month[[i]]-tempmon) {
>         return$fyy[[i]]<-return$year[[i]];
>         return$fyq[[i]]<-4-(tempmon-return$month[[i]])%/%3;
>       } else{
>         return$fyy[[i]]<-return$year[[i]]+1;
>         return$fyq[[i]]<-(return$month[[i]]-tempmon-1)%/%3;
>       }
>       return$GVKEY[[i]]<-id$GVKEY[temp][[1]];
>     }
>     returnnew=merge(return,account,by.x-c(GVKEY,fyy,fyq),by.y-c(GVKEY,fyy,fyq))
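A rough sketch of how the row-by-row loop in the question could be replaced by vectorized lookups with match() followed by a single merge(). The tiny data frames below are hypothetical, and the quarter is a plain calendar quarter rather than the fiscal-year logic of the original code, so this shows the technique only:

```r
# hypothetical monthly returns, keyed by PERMNO
ret <- data.frame(PERMNO = c(1, 1, 2),
                  month  = c(1, 4, 7),
                  r      = c(0.1, 0.2, 0.3))

# hypothetical lookup table linking the two company codes
id <- data.frame(PERMNO = c(1, 2), GVKEY = c(100, 200))

# hypothetical quarterly accounting data, keyed by GVKEY
account <- data.frame(GVKEY   = c(100, 100, 200),
                      quarter = c(1, 2, 3),
                      assets  = c(10, 11, 20))

# one vectorized lookup instead of a loop over all rows
ret$GVKEY <- id$GVKEY[match(ret$PERMNO, id$PERMNO)]

# month to calendar quarter, computed for the whole column at once
ret$quarter <- (ret$month - 1) %/% 3 + 1

# a single merge on the shared keys
merged <- merge(ret, account, by = c("GVKEY", "quarter"))
print(merged)
```

With match() the cost of attaching GVKEY grows roughly linearly with the number of rows, whereas the loop rescans id for every row.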
Re: [R] How to specify function arguments that are used in different places
You could have a list of args for each one like this:

    # test data
    x <- list(data = c(1,3,5), points = c(2,4))

    myfunc <- function(x, plot.args = NULL, points.args = NULL) {
        do.call("plot", c(list(x$data), plot.args))
        do.call("points", c(list(x$points), points.args))
    }

    myfunc(x, plot.args = list(col = "red"), points.args = list(col = "blue"))

On 5/1/06, Gregor Gorjanc [EMAIL PROTECTED] wrote:
> Hello!
> The subject is not very clear, but I hope my question will be ;)
> I wrote a function which produces a plot, and I have problems with arguments. For the sake of example let us consider that my function looks like this:
>
>     myfunc <- function(x, points=FALSE, lines=FALSE, ...) {
>       ## x is an object that is being plotted
>       plot(x$plotData, ...)
>       ## one can also add some data on graph via points
>       points(x$pointsData, ...)
>       ## one can also add some data on graph via lines
>       lines(x$linesData, ...)
>     }
>
> My problem is in the ... argument. plot(), points() and lines() have so many possible arguments, which is very nice, but how can I deal with them in my scenario? For example, I might want to specify red color for plot, blue for points and green for lines. Is it possible to handle such a mixture without specifying a zillion arguments such as plotCol, pointsCol, linesCol etc.? Perhaps something like ~ points$...?
> Thanks!
> --
> Lep pozdrav / With regards,
> Gregor Gorjanc
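To see what do.call() actually receives in the reply above, here is a small sketch; build_args is a hypothetical helper introduced only for this illustration:

```r
# combine the positional data argument with an optional list of
# named graphical parameters
build_args <- function(first, extra = NULL) c(list(first), extra)

args <- build_args(c(1, 3, 5), list(col = "red", pch = 19))
str(args)

# equivalent to plot(c(1, 3, 5), col = "red", pch = 19)
do.call("plot", args)

# with no extras the list holds just the data argument
str(build_args(c(2, 4)))
```

Each plotting function gets its own list, so the same parameter name (col, say) can carry a different value for plot(), points() and lines() without colliding in a shared ... argument.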
Re: [R] table of means/medians across bins used for a histogram
I assume you want to discretize one column and then, for each level produced, calculate the mean of another column and plot those means against the levels.

Using the built-in iris data frame, discretize Sepal.Width, producing the SWfac factor, and calculate SLmean, the mean Sepal.Length for each level of that factor. Then plot using a custom x axis:

    SWfac <- cut(iris$Sepal.Width, seq(2, 4.4, .5))
    SLmean <- tapply(iris$Sepal.Length, SWfac, mean)
    plot(SLmean, xaxt = "n")
    axis(1, seq(SLmean), levels(SWfac))

On 5/1/06, lalitha viswanath [EMAIL PROTECTED] wrote:
> Hi
> I think I seem to have phrased my doubt incorrectly. I want an x-y plot of age v/s rate (the bin is irrelevant for this plot); only that instead of a simple x-y plot, I want a plot of average(rate) for each age interval.
> My ages vary from 0 to 0.7 and I want to divide them in groups of 0.02. So I want a plot of the following:
>
>     Age-intervals   Average rate in that interval
>     0-0.02          5
>     0.02-0.04       7
>     0.04-0.06       1
>     0.06-0.08       0
>     0.08-0.1        0.15
>
> Age intervals mentioned along the x-axis (like for a histogram) and rates plotted for each age interval.
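The same cut() plus tapply() pattern, sketched on age intervals of width 0.02 as in the follow-up question. The four data points are invented so that the interval means are easy to check by hand:

```r
# hypothetical ages and rates
age  <- c(0.005, 0.015, 0.03, 0.05)
rate <- c(4, 6, 7, 1)

# intervals of width 0.02; include.lowest keeps an age of exactly 0
bins <- cut(age, breaks = seq(0, 0.06, by = 0.02), include.lowest = TRUE)

# mean rate within each age interval
rate_mean <- tapply(rate, bins, mean)
print(rate_mean)

# means on the y axis, interval labels along the x axis
plot(rate_mean, xaxt = "n", xlab = "age interval", ylab = "mean rate")
axis(1, at = seq_along(rate_mean), labels = levels(bins))
```

For the real data the breaks would presumably run to 0.7, i.e. seq(0, 0.7, by = 0.02); intervals that contain no observations come back as NA from tapply().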
Re: [R] Adding elements in an array where I have missing data.
Here are a few alternatives:

    replace(a, is.na(a), 0) + b
    ifelse(is.na(a), 0, a) + b
    mapply(sum, a, b, MoreArgs = list(na.rm = TRUE))

On 5/1/06, John Kane [EMAIL PROTECTED] wrote:
> This is a simple question but I cannot seem to find the answer. I have two vectors but with missing data and I want to add them together with the NA's being ignored.
> Clearly I need to get the NA ignored. na.action? I have done some searching and cannot get na.action to help. This must be a common enough issue that the answer is staring me in the face but I just don't see it.
> Simple example:
>
>     a <- c(2, NA, 3)
>     b <- c(3, 4, 5)
>
> What I want is c <- a + b where c is (5, 4, 8). However I get c is (5, NA, 8).
> What am I missing? Or do I somehow need to recode the NA's as missing?
> Thanks
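A quick check that the alternatives above agree on the example from the question. The rowSums() variant is an extra option not mentioned in the thread:

```r
a <- c(2, NA, 3)
b <- c(3, 4, 5)

r1 <- replace(a, is.na(a), 0) + b
r2 <- ifelse(is.na(a), 0, a) + b
r3 <- mapply(sum, a, b, MoreArgs = list(na.rm = TRUE))

# not from the thread: bind the vectors as columns and sum across rows
r4 <- rowSums(cbind(a, b), na.rm = TRUE)

print(rbind(r1, r2, r3, r4))
```

Note that the first two only treat missing values in a; the mapply() and rowSums() forms ignore NAs in either vector.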
Re: [R] Pasting data into scan()
On 5/1/06, Murray Jorgensen [EMAIL PROTECTED] wrote:
> The file TENSILE.DAT from the Hand et al Handbook of Small Data Sets looks like this:
>
>     0.023 0.032 0.054 0.069 0.081 0.094
>     0.105 0.127 0.148 0.169 0.188 0.216
>     0.255 0.277 0.311 0.361 0.376 0.395
>     0.432 0.463 0.481 0.519 0.529 0.567
>     0.642 0.674 0.752 0.823 0.887 0.926
>
> except that my mail client has replaced the tab separators by blanks. If I paste this data into R 2.2.1 what I get is:
>
>     > strength <- scan()
>     1: 0.0230.0320.0540.0690.0810.094
>     1: 0.1050.1270.1480.1690.1880.216
>     Error in scan() : scan() expected 'a real', got '0.0230.0320.0540.0690.0810.094'
>     > 0.2550.2770.3110.3610.3760.395
>     Error: syntax error in "0.2550.2770"
>     > 0.4320.4630.4810.5190.5290.567
>     Error: syntax error in "0.4320.4630"
>     > 0.6420.6740.7520.8230.8870.926
>     Error: syntax error in "0.6420.6740"
>
> Aha! I thought, what I need is scan(sep = "\t"), but this generates the same error messages.

1. If your situation is that you have separators but don't know what they are, try this. It replaces all characters that don't appear in numbers with a space:

    L <- readLines("clipboard")
    L <- gsub("[^-0-9.]", " ", L)
    scan(textConnection(L))

2. If the separators are completely lost you may still be able to recover the data if you can assume that every number is of the form d.ddd where d is a digit. Just search for that pattern and replace it with itself and a space:

    L <- readLines("clipboard")
    L <- gsub("([0-9][.][0-9][0-9][0-9])", "\\1 ", L)
    scan(textConnection(L))

3. Doing a Google search for tensile.dat finds a data set that looks like yours. Try this:

    URL <- "http://statistics.byu.edu/resources/files/datasets/tensile.dat"
    scan(URL)
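Option 2 above can be tried without a clipboard by starting from a character vector. This sketch assumes, as the reply does, that every number has the form d.ddd; the variable names raw and spaced are illustrative:

```r
# the first two pasted lines from the question, separators lost
raw <- c("0.0230.0320.0540.0690.0810.094",
         "0.1050.1270.1480.1690.1880.216")

# put a space after every d.ddd group
spaced <- gsub("([0-9][.][0-9]{3})", "\\1 ", raw)
print(spaced)

# read the repaired text as numbers
con <- textConnection(spaced)
strength <- scan(con, quiet = TRUE)
close(con)
print(strength)
```

If any value had a different number of decimals, the pattern would split it in the wrong place, so this is only safe when the fixed-width assumption really holds.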
Re: [R] Still help needed on embeded regression
Using runmean from caTools, the first one below does it in under 1 second but will not handle NAs. The second one takes under 15 seconds and handles them by replacing them with linear approximations. Note that k must be odd.

    # 1
    library(caTools)
    set.seed(1)
    system.time({
        y <- rnorm(140001)
        x <- as.numeric(seq(y))
        k <- 61
        Mxy <- runmean(x * y, k)
        Mxx <- runmean(x * x, k)
        Mx <- runmean(x, k)
        My <- runmean(y, k)
        b <- (Mxy - Mx * My) / (Mxx - Mx * Mx)
        a <- My - b * Mx
    })

    # 2
    library(caTools)
    library(zoo)
    set.seed(1)
    system.time({
        y <- rnorm(14)
        x <- as.numeric(seq(y))
        x[100:200] <- NA
        x <- na.approx(zoo(x))
        y <- zoo(y)
        k <- 60
        Mxy <- runmean(x * y, k)
        Mxx <- runmean(x * x, k)
        Mx <- runmean(x, k)
        My <- runmean(y, k)
        b <- (Mxy - Mx * My) / (Mxx - Mx * Mx)
        a <- My - b * Mx
    })

On 5/1/06, Guojun Zhu [EMAIL PROTECTED] wrote:
> I basically have a long data.frame a, but I only need the columns x, y. Let us say the index of row is t. I need to produce a new column s_t as the linear regression coefficient of (x_(t-60),...,x_(t-1)) on (y_(t-60),...,y_(t-1)). The data is about 140,000 rows. I wrote some simple code for this which is super slow; it takes more than 2 hours on a 2.8GHz Intel Duo Core. My friend uses SAS and his code needs only a couple of minutes. I know there must be some more efficient way to write it. Can anyone help me on this? Here is the code. Also, one line produces a completely NA temp$y and the lm function failed on that. How to make it just produce an NA instead and keep running?
>
>     attach(return)
>     betat=rep(NA,length(RET))
>     for (i in 61:length(RET)){cat(i, );
>       if (year[[i]]=1995){
>         temp<-data.frame(y=RET[(i-60):(i-1)]-riskfree[(i-60):(i-1)],x=sprtrn[(i-60):(i-1)]-riskfree[(i-60):(i-1)])
>         betat[[i]]<-lm(y~x+1,na.action=na.exclude,temp)[[1]][[2]]
>         #if (i%%100==0) cat(i, );
>         return$vol.cap[[i]]=mean(VOL[(i-12):(i-1)],na.rm=TRUE)/return$cap[[i]]
>       }
>     }
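The idea behind the reply above is that a simple regression slope depends only on window means: b = (mean(xy) - mean(x) mean(y)) / (mean(x^2) - mean(x)^2) and a = mean(y) - b mean(x). The sketch below checks this against lm() on one window. It uses stats::filter() from base R as a stand-in for caTools::runmean(), on made-up data; run_mean is a hypothetical helper.

```r
set.seed(1)
n <- 200
k <- 61                         # odd window width
x <- rnorm(n)                   # hypothetical predictor
y <- 0.5 * x + rnorm(n)         # hypothetical response

# centred running mean of width k (NA near both ends)
run_mean <- function(v, k) as.numeric(stats::filter(v, rep(1 / k, k), sides = 2))

Mxy <- run_mean(x * y, k)
Mxx <- run_mean(x * x, k)
Mx  <- run_mean(x, k)
My  <- run_mean(y, k)

b <- (Mxy - Mx * My) / (Mxx - Mx * Mx)   # rolling slope
a <- My - b * Mx                         # rolling intercept

# compare with an explicit fit on the window centred at i
i <- 100
w <- (i - 30):(i + 30)
fit <- coef(lm(y[w] ~ x[w]))
print(c(intercept = a[i], slope = b[i]))
print(fit)
```

The windows here are centred on i, whereas the question asks for the trailing window t-60 to t-1, so the result vectors would still need to be shifted to line up with that definition.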
Re: [R] Still help needed on embeded regression
Try:

    runmean2 <- function(x, k) # k must be even
        (coredata(runmean(x, k-1)) * (k-1) + coredata(lag(x, -k/2, na.pad = TRUE)))/k

Also, in your code use matrices or vectors instead of data frames to avoid any overhead in using data frames.

On 5/2/06, Guojun Zhu [EMAIL PROTECTED] wrote:
> Sorry to bother you guys again. This is great. But this is for 61 numbers, and the second case will change 60 to 61. run* only accepts an odd window width. How to get around it with 60? Any suggestion? Thanks.
Re: [R] evaluation of expressions
Try this:

    e <- expression(glm(y ~ age))
    eval(e)

or this:

    chr <- "glm(y ~ age)"
    eval(parse(text = chr))

On 5/2/06, Andrew Gelman [EMAIL PROTECTED] wrote:
> Hi, all. I'm trying to automate some regression operations in R but am confused about how to evaluate expressions that are expressed as character strings. For example:
>
>     y <- ifelse (rnorm(10)0, 1, 0)
>     sex <- rnorm(10)
>     age <- rnorm(10)
>     test <- as.data.frame (cbind (y, sex, age))
>
>     # this works fine:
>     glm (y ~ sex + I(age^2), data=test, family=binomial(link=logit), subset=age1)
>
>     # but now I want to do it in two steps:
>     expr <- 'glm (y ~ sex + I(age^2), data=test, family=binomial(link=logit), subset=age1)'
>
> Given expr, defined above, how can I evaluate it? I played around with eval() and as.expression() but can't figure it out.
> Thanks.
> Andrew
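A self-contained sketch of both routes on made-up data (the data frame test below is hypothetical and simpler than the one in the question):

```r
set.seed(1)
test <- data.frame(y = rbinom(50, 1, 0.5), age = rnorm(50))

# route 1: the call is stored as text, parsed, then evaluated
chr  <- "glm(y ~ age, data = test, family = binomial)"
fit1 <- eval(parse(text = chr))

# route 2: the call is stored unevaluated as an expression
e    <- expression(glm(y ~ age, data = test, family = binomial))
fit2 <- eval(e[[1]])

print(coef(fit1))
print(coef(fit2))
```

parse(text = ) is what turns the character string into something eval() can run; as.expression() on a string only wraps the string itself, which may be why it did not help in the question.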
Re: [R] Adding elements in an array where I have missing data.
On 5/2/06, Berton Gunter [EMAIL PROTECTED] wrote:
> > Here are a few alternatives:
> >
> >     replace(a, is.na(a), 0) + b
> >     ifelse(is.na(a), 0, a) + b
> >     mapply(sum, a, b, MoreArgs = list(na.rm = TRUE))
>
> Well, Gabor, if you want to get fancy...
>
>     evalq({a[is.na(a)]<-0;a})+b

Note that the evalq can be omitted:

    { a[is.na(a)] <- 0; a } + b
Re: [R] Adding elements in an array where I have missing data.
But the evalq solution does change a:

    > a <- c(2, NA, 3)
    > b <- c(3, 4, 5)
    > evalq({a[is.na(a)]<-0;a})+b
    [1] 5 4 8
    > a
    [1] 2 0 3

If evalq were changed to local then it would not change a:

    > a <- c(2, NA, 3)
    > b <- c(3, 4, 5)
    > local({a[is.na(a)]<-0;a})+b
    [1] 5 4 8
    > a
    [1]  2 NA  3

Also, the replace, ifelse and mapply solutions do not change a.

On 5/2/06, Berton Gunter [EMAIL PROTECTED] wrote:
> Below.
>
> > -----Original Message-----
> > From: Gabor Grothendieck [mailto:[EMAIL PROTECTED]
> > Sent: Tuesday, May 02, 2006 10:42 AM
> > To: Berton Gunter
> > Cc: John Kane; R R-help
> > Subject: Re: [R] Adding elements in an array where I have missing data.
> >
> > Note that the evalq can be omitted:
> >
> >     { a[is.na] <- 0; a } + b
>
> No it can't. The idea is **not** to change the original a.
>
> -- Bert
Re: [R] Use predict.lm
Try this:

    # regression of Sepal.Length on cols 2 and 4 using first 100 rows
    iris.lm <- lm(Sepal.Length ~ ., iris[,c(1,2,4)], subset = 1:100)

    # now do it with next 50 rows
    predict(update(iris.lm, subset = 101:150))

    # double check - this gives same result as last line
    predict(lm(Sepal.Length ~ ., iris[,c(1,2,4)], subset = 101:150))

On 5/2/06, Jiang, Jincai (Institutional Securities Management) [EMAIL PROTECTED] wrote:
> Hi All,
> I created a two variable lm() model:
>
>     slm<-lm(y[1:3000,8]~y[1:3000,12]+y[1:3000,15])
>
> I made two predictions:
>
>     predict(slm,newdata=y[201:3200,])
>     predict(slm,newdata=y[601:3600,])
>
> There is no error message for either of these. The results are identical, and identical to slm$fitted as well. If this is not the right way to apply the model coefficients to a new set of inputs, what is the right way?
> Thank you
> Regards,
> Jincai Jiang
Re: [R] Time series plot
Try this (where you can replace textConnection(L) with the name of the file containing the data):

    L <- "01/02/1990 0.531 0.479
    01/03/1990 0.510 0.522
    01/06/1990 0.602 0.604"

    library(zoo)
    z <- read.zoo(textConnection(L), format = "%m/%d/%Y")
    plot(z, plot.type = "single")

This will give more info on zoo:

    library(zoo)
    vignette("zoo")
    library(help = zoo)

On 5/2/06, Jiang, Jincai (Institutional Securities Management) [EMAIL PROTECTED] wrote:
> I have some time series data like:
>
>     01/02/1990 0.531 0.479
>     01/03/1990 0.510 0.522
>     01/06/1990 0.602 0.604
>
> There are no weekends and holidays. How do I graph them in a single plot so that the x-axis is the dates and the y-axis is the time series?
> Thank you
> Regards,
> Jincai Jiang
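For comparison, a base R sketch of the same plot without the zoo package. This is an alternative technique, not what the reply above uses, and the column names date, s1 and s2 are made up:

```r
L <- "01/02/1990 0.531 0.479
01/03/1990 0.510 0.522
01/06/1990 0.602 0.604"

# read the three columns and convert the first to Date
d <- read.table(text = L, col.names = c("date", "s1", "s2"),
                stringsAsFactors = FALSE)
d$date <- as.Date(d$date, format = "%m/%d/%Y")

# both series on one set of axes, dates along the x axis
plot(d$date, d$s1, type = "l", ylim = range(d$s1, d$s2),
     xlab = "date", ylab = "value")
lines(d$date, d$s2, col = 2)
```

Because the x values are real dates, missing weekends and holidays simply show up as longer gaps between points.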
Re: [R] Use predict.lm
Sorry, I don't think that my earlier reply was what you wanted. Try this instead:

    # fit using first 100 points
    iris.lm <- lm(Sepal.Length ~ ., iris[1:100, c(1,2,4)])

    # predict using coefficients from above and variables from next 50 points
    predict(iris.lm, iris[101:150, c(1,2,4)])
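A sketch of why naming the variables in the formula matters: predict() looks up the formula's variable names in newdata, so a model written with real column names can be applied to new rows, and the result matches applying the coefficients by hand. The names train, new, p and manual are illustrative:

```r
train <- iris[1:100, c("Sepal.Length", "Sepal.Width", "Petal.Width")]
new   <- iris[101:150, c("Sepal.Width", "Petal.Width")]

# the formula refers to column names, not to subsets such as y[1:3000, 8]
fit <- lm(Sepal.Length ~ Sepal.Width + Petal.Width, data = train)

# predictions for the 50 new rows
p <- predict(fit, newdata = new)

# the same numbers computed by hand from the coefficients
manual <- drop(cbind(1, as.matrix(new)) %*% coef(fit))

print(head(cbind(predict = p, manual = manual)))
```

In the original question the formula terms were expressions like y[1:3000, 12]. predict() cannot find variables of that name in newdata, so it presumably falls back to the original data, which would explain why both predictions equalled the fitted values.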
Re: [R] Still help needed on embeded regression
I was assuming that this would be added to my example where the data is a zoo object so lag.zoo is being used. Try this:

  library(zoo)
  z <- zoo(11:15)
  z
   1  2  3  4  5
  11 12 13 14 15
  lag(z, -1, na.pad = TRUE)
   1  2  3  4  5
  NA 11 12 13 14
  ?lag.zoo

On 5/2/06, Guojun Zhu [EMAIL PROTECTED] wrote:

It does not work though. How does the lag work? I read the help and do not quite understand. Here is a test:

  y
   [1]  1  2  3  4  5  6  7  8  9 10
  coredata(lag(y, -1))
   [1]  1  2  3  4  5  6  7  8  9 10
  attr(,"tsp")
  [1]  2 11  1

--- Gabor Grothendieck [EMAIL PROTECTED] wrote:

Try

  runmean2 <- function(x, k) # k must be even
    (coredata(runmean(x, k-1)) * (k-1) + coredata(lag(x, -k/2, na.pad = TRUE)))/k

Also, in your code use matrices or vectors instead of data frames to avoid any overhead in using data frames.

On 5/2/06, Guojun Zhu [EMAIL PROTECTED] wrote:

Sorry to bother you guys again. This is great. But this is for 61 numbers, and the second case will change 60 to 61. run* only accepts an odd-numbered window. How can I get around it with 60? Any suggestion? Thanks.

--- Gabor Grothendieck [EMAIL PROTECTED] wrote:

Using runmean from caTools the first one below does it in under 1 second but will not handle NAs. The second one takes under 15 seconds and handles them by replacing them with linear approximations. Note that k must be odd.

  # 1
  library(caTools)
  set.seed(1)
  system.time({
    y <- rnorm(140001)
    x <- as.numeric(seq(y))
    k <- 61
    Mxy <- runmean(x * y, k)
    Mxx <- runmean(x * x, k)
    Mx <- runmean(x, k)
    My <- runmean(y, k)
    b <- (Mxy - Mx * My) / (Mxx - Mx * Mx)
    a <- My - b * Mx
  })

  # 2
  library(caTools)
  library(zoo)
  set.seed(1)
  system.time({
    y <- rnorm(14)
    x <- as.numeric(seq(y))
    x[100:200] <- NA
    x <- na.approx(zoo(x))
    y <- zoo(y)
    k <- 60
    Mxy <- runmean(x * y, k)
    Mxx <- runmean(x * x, k)
    Mx <- runmean(x, k)
    My <- runmean(y, k)
    b <- (Mxy - Mx * My) / (Mxx - Mx * Mx)
    a <- My - b * Mx
  })

On 5/1/06, Guojun Zhu [EMAIL PROTECTED] wrote:

I basically have a long data.frame a, but I only need three columns x,y. Let us say the index of row is t. I need to produce a new column s_t as the linear regression coefficient of (x_(t-60),...,x_(t-1)) on (y_(t-60),...,y_(t-1)). The data is about 140,000 rows. I wrote some simple code for this which is super slow; it takes more than 2 hours on a 2.8GHz Intel Duo Core. My friend uses SAS and his code needs only a couple of minutes. I know there must be some more efficient way to write it. Can anyone help me on this? Here is the code. Also, one line produces a completely NA temp$y and the lm function fails on that. How can I make it just produce an NA instead and keep running?

  attach(return)
  betat=rep(NA,length(RET))
  for (i in 61:length(RET)){
    cat(i, );
    if (year[[i]]=1995){
      temp<-data.frame(y=RET[(i-60):(i-1)]-riskfree[(i-60):(i-1)],x=sprtrn[(i-60):(i-1)]-riskfree[(i-60):(i-1)])
      betat[[i]]<-lm(y~x+1,na.action=na.exclude,temp)[[1]][[2]]
      #if (i%%100==0) cat(i, );
      return$vol.cap[[i]]=mean(VOL[(i-12):(i-1)],na.rm=TRUE)/return$cap[[i]]
    }
  }
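The running-mean identity used in the replies, b = (Mxy - Mx * My) / (Mxx - Mx * Mx), does not itself require an odd window; that restriction comes from the centered runmean. As a rough sketch only, the version below swaps in a trailing moving average from base R's stats::filter (sides = 1) instead of caTools, so a window of 60 can be used. The data are simulated, and the helper name trail_mean and the check index t0 are assumptions made for illustration.

```r
set.seed(1)
n <- 500
k <- 60                                   # an even window is fine here
x <- rnorm(n)
y <- 0.5 * x + rnorm(n)

# Trailing mean over the k points ending at t (NA while t < k);
# trail_mean is a hypothetical helper, not part of any package
trail_mean <- function(v, k) as.numeric(stats::filter(v, rep(1 / k, k), sides = 1))

Mx  <- trail_mean(x, k)
My  <- trail_mean(y, k)
Mxy <- trail_mean(x * y, k)
Mxx <- trail_mean(x * x, k)

# Slope of y on x within each window ending at t
b_end <- (Mxy - Mx * My) / (Mxx - Mx^2)

# Shift by one so that position t uses observations t-k .. t-1
b <- c(NA, b_end[-n])

# Spot check against lm() for a single window
t0 <- 200
idx <- (t0 - k):(t0 - 1)
b_lm <- unname(coef(lm(y[idx] ~ x[idx]))[2])
```

The first k entries of b are NA because no full window of earlier observations exists yet, which also gives the "produce an NA and keep running" behaviour asked for at the start of the series. Missing values inside the data would still need separate handling, for example na.approx as in the second reply.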
Re: [R] Listing Variables
Column names in iris that contain the string "Sepal":

  grep("Sepal", names(iris), value = TRUE)

On 5/3/06, Farrel Buchinsky [EMAIL PROTECTED] wrote:

How does one create a vector whose contents is the list of variables in a dataframe pertaining to a particular pattern? This is so simple but I cannot find a straightforward answer. I want to be able to pass the contents of that list to a for loop.

So let us assume that one has a dataframe whose name is Data. And let us assume one had the height of a group of people measured at various ages. It could be made up of vectors Data$PersonalID, Data$FirstName, Data$LastName, Data$Height.1, Data$Height.5, Data$Height.9, Data$Height.10, Data$Height.12, Data$Height.20 ... many, many more variables.

How would one create a vector of all the Height variable names? The simple workaround is to not bother creating the vector Data$Height.1 Data$Height.5 Data$Height.9 Data$Height.10 Data$Height.12 Data$Height.20 ... but rather just to use the sapply function. However, with some functions sapply will not work and it is necessary to supply each variable name to a function (see the thread "Repeating tdt function on thousands of variables").

This is such a core capability. I would like to see it in the R-Wiki but could not find it there.

--
Farrel Buchinsky, MD
Pediatric Otolaryngologist
Allegheny General Hospital
Pittsburgh, PA
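Applied to the layout described in the question, the grep() answer might look like the sketch below. The data frame is invented for illustration (only the column naming pattern comes from the question), and the anchored pattern "^Height\\." is one possible choice rather than the only one.

```r
# Hypothetical data frame echoing the question's column names
Data <- data.frame(PersonalID = 1:3,
                   FirstName  = c("A", "B", "C"),
                   Height.1   = c(75, 80, 78),
                   Height.5   = c(108, 112, 110),
                   Height.9   = c(130, 136, 133))

# Names that start with "Height." (the dot is escaped so it is literal)
height_vars <- grep("^Height\\.", names(Data), value = TRUE)

# Pass the names to a for loop, one column at a time
height_means <- numeric(0)
for (v in height_vars) {
  height_means[v] <- mean(Data[[v]])
}
```

Using value = TRUE returns the names themselves; without it grep() returns their positions, which works equally well for indexing Data.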
Re: [R] Math expressions in pie chart labels?
On 5/3/06, Uwe Ligges [EMAIL PROTECTED] wrote:

Johannes Graumann wrote:

On Tuesday 02 May 2006 23:33, Uwe Ligges wrote:

Then please read ?plotmath and use it:

  labels = expression( = 0.66, == 0.33, = -0.33, = -0.66)

  Error in lab != "" : comparison is not allowed for expressions
  In addition: Warning message:
  is.na() applied to non-(list or vector) in: is.na(lab <- labels[i])

I don't seem to be the only one having problems with this ;0)

Then please tell us the details, I just tried successfully:

  plot(1:10, xaxt="n")
  axis(1, at = c(1,3,5,7), labels = expression( = 0.66, == 0.33, = -0.33, = -0.66))

I think the discussion applies to pie:

  pie(c(1,3,5,7), labels =
  + expression( = 0.66, == 0.33, = -0.33, = -0.66))
  Error in lab != "" : comparison is not allowed for expressions
  In addition: Warning message:
  is.na() applied to non-(list or vector) in: is.na(lab <- labels[i])
Re: [R] sprintf question
Try this:

  do.call(sprintf, c("%9.2f\t%d\t%d\t%8.3f", as.list(v[iv])))

On 5/3/06, Paul Roebuck [EMAIL PROTECTED] wrote:

How would one go about getting sprintf to use the values of a vector without having to specify each argument individually?

  v <- c(1, 2, -1.197114, 0.1596687)
  iv <- c(3, 1, 2, 4)
  sprintf("%9.2f\t%d\t%d\t%8.3f", v[3], v[1], v[2], v[4])
  [1] "    -1.20\t1\t2\t   0.160"

Essentially, desired effect would be something like:

  sprintf("%9.2f\t%d\t%d\t%8.3f", v[iv]) # wish it worked

--
SIGSIG -- signature too long (core dumped)
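For reference, here is the do.call() idiom from the answer written out as a small self-contained sketch, next to the argument-by-argument call it replaces. The variable names fmt, s_manual and s_docall are introduced only for this illustration.

```r
v   <- c(1, 2, -1.197114, 0.1596687)
iv  <- c(3, 1, 2, 4)
fmt <- "%9.2f\t%d\t%d\t%8.3f"

# Spelled out argument by argument
s_manual <- sprintf(fmt, v[3], v[1], v[2], v[4])

# Same call built from a list: the format first, then the reordered values
s_docall <- do.call(sprintf, c(list(fmt), as.list(v[iv])))
```

do.call() treats each list element as one argument, so as.list(v[iv]) turns the reordered vector into four separate arguments. The %d conversions work here because the corresponding values are whole numbers.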
Re: [R] factor to real - best way to convert
You can use the as.is = TRUE arg to read.xls to get character data rather than factors.

On 5/3/06, Knut Krueger [EMAIL PROTECTED] wrote:

I have got a factor from read.xls:

  is(factor_value)
  [1] "factor" "oldClass"

  [288] -0.32 0.18 0.18 0.18 -0.32 0.18 0.68
  [295] 0.68 0.18
  43 Levels: -0.05 -0.13 -0.15 -0.18 -0.20 -0.26 ... 1.33

If I use the function as.real(factor_value) I get

  [271] 17 17 8 22 8 8 17 17 17 17 17 17 17 17 23 7 35 7
  [289] 23 23 23 7 23 35 35 23

So I used as.real(as.matrix(factor_value)). The result is as expected:

  [271] NA NA -0.35 0.15 -0.35 -0.35 NA NA NA
  [280] NA NA NA NA NA 0.18 -0.32 0.68 -0.32
  [289] 0.18 0.18 0.18 -0.32 0.18 0.68 0.68 0.18

OK, I found the way to convert by trial and error, but I do not understand why it works. I also found this hint in the fullref_manual:

  x <- as.numeric(levels(factor_value))[factor_value]

OK, much better, but I would not have been able to find this from the ?as.numeric help page. I do not really understand either version. Maybe somebody is able to write some hints for me.

With regards
Knut
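A short sketch of why the two conversions in the question behave differently, using a small made-up factor. as.numeric() is used in place of as.real(), which is no longer available in current versions of R; the variable names are illustrative.

```r
f <- factor(c("0.18", "-0.32", "0.68", "0.18"))

# A factor stores integer codes pointing into levels(f);
# as.numeric() on the factor returns those codes, not the values
codes <- as.numeric(f)

# Convert the (few) levels once, then index them by the factor's codes
x1 <- as.numeric(levels(f))[f]

# Equivalent, but converts every element: go through character first
x2 <- as.numeric(as.character(f))
```

The first form is the one quoted from the reference manual. It only converts the distinct levels, which can matter for long vectors with few levels, while the second form is often easier to read.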
Re: [R] Math expressions in pie chart labels?
As a workaround you could use pie3D in the plotrix package with height=0 and theta=pi, e.g.

  library(plotrix)
  pie3D(1:3, height = 0, theta = pi, labels = expression( = 1, == 2, = 3))

On 5/3/06, Johannes Graumann [EMAIL PROTECTED] wrote:

On Wednesday 03 May 2006 09:05, Uwe Ligges wrote:

Ah, I see, this happens in pie()'s line:

  if (!is.na(lab <- labels[i]) && lab != "") {

where lab is one element of the expression. I'd like to propose to change that line to

  if (!is.na(lab <- labels[i]) && nchar(lab) > 0) {

What's the canonical way of patching something like this in R? Redefining the function at the start of your script?

Joh
Re: [R] Aggregate?
Suppose we want to sum C over levels of A and that B is constant within levels of A. Then:

  DF <- data.frame(A = gl(2,2), B = gl(2,2), C = 1:4) # test data
  do.call(rbind, by(DF, DF$A, function(x) replace(x[1,], "C", sum(x$C))))

On 5/3/06, Guenther, Cameron [EMAIL PROTECTED] wrote:

Hello,

I have a data set with a grouping variable (TRIPID) and several other variables. TRIPID is repeated in some areas and I would like to use a function like aggregate to sum the variable UNITS according to TRIPID. However, I would also like to retain the other variables as they are in the data set with the new summed TRIPID.

So what I have is something like this:

  YEAR MONTH DAY CONTINUE SPL       AREA COUNTY DEPTH DEPUNIT GEAR GEAR2 TRAPS SOAKTIME UNITS FACTOR DISPOSIT NUMSETS TRIPST TRIPID
  1992 1     26  1        SP0073928 8    25     4     NA      100  NA    NA    NA       161   1      NA       NA      NA     02163399054
  1992 1     26  1        SP0073928 8    25     4     NA      100  NA    NA    NA       8     1      NA       NA      NA     02163399054
  1992 1     26  2        SP0004228 8    25     4     NA      100  NA    NA    NA       161   1      NA       NA      NA     02163399054
  1992 1     26  2        SP0004228 8    25     4     NA      100  NA    NA    NA       8     1      NA       NA      NA     02163399054
  1992 1     25  NA       SP0052652 8    25     4     NA      100  NA    NA    NA       85    1      NA       NA      NA     02163399057
  1992 1     26  NA       SP0037940 8    25     4     NA      100  NA    NA    NA       70    1      NA       NA      NA     02163399058
  1992 1     27  NA       SP0072357 8    25     4     NA      100  NA    NA    NA       15    1      NA       NA      NA     02163399059
  1992 1     27  NA       SP0072357 8    25     4     NA      100  NA    NA    NA       20    1      NA       NA      NA     02163399059
  1992 1     27  NA       SP0026324 8    25     4     NA      100  NA    NA    NA       8     1      NA       NA      NA     02163399060
  1992 1     28  1        SP0072357 8    25     4     NA      100  NA    NA    NA       200   1      NA       NA      NA     02163399062

And what I want is this:

  YEAR MONTH DAY CONTINUE SPL       AREA COUNTY DEPTH DEPUNIT GEAR GEAR2 TRAPS SOAKTIME UNITS FACTOR DISPOSIT NUMSETS TRIPST TRIPID
  1992 1     26  1        SP0073928 8    25     4     NA      100  NA    NA    NA       338   1      NA       NA      NA     02163399054
  1992 1     25  NA       SP0052652 8    25     4     NA      100  NA    NA    NA       85    1      NA       NA      NA     02163399057
  1992 1     26  NA       SP0037940 8    25     4     NA      100  NA    NA    NA       70    1      NA       NA      NA     02163399058
  1992 1     27  NA       SP0072357 8    25     4     NA      100  NA    NA    NA       35    1      NA       NA      NA     02163399059
  1992 1     27  NA       SP0026324 8    25     4     NA      100  NA    NA    NA       8     1      NA       NA      NA     02163399060
  1992 1     28  1        SP0072357 8    25     4     NA      100  NA    NA    NA       200   1      NA       NA      NA     02163399062

Does anyone know how to do this? Data file is attached.

Thanks in advance

Cameron Guenther, Ph.D.
Associate Research Scientist
FWC/FWRI, Marine Fisheries Research
100 8th Avenue S.E.
St. Petersburg, FL 33701
(727)896-8626 Ext. 4305
[EMAIL PROTECTED]
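The by() and rbind idiom from the answer, rewritten with the question's TRIPID and UNITS column names on a cut-down toy data set. This is a sketch under stated assumptions: the SPL column merely stands in for "all the other variables", the trip identifiers are shortened, and the helper name one_row is hypothetical.

```r
# Toy data: TRIPID and UNITS follow the question; SPL stands in for the
# other columns, whose first value per trip is the one that is kept
DF <- data.frame(TRIPID = c("054", "054", "057", "059", "059"),
                 SPL    = c("a", "a", "b", "c", "c"),
                 UNITS  = c(161, 8, 85, 15, 20),
                 stringsAsFactors = FALSE)

# For each trip keep the first row and overwrite UNITS with the trip total
one_row <- function(x) replace(x[1, ], "UNITS", sum(x$UNITS))
res <- do.call(rbind, by(DF, DF$TRIPID, one_row))
```

by() splits the data frame by trip and applies one_row to each piece, and do.call(rbind, ...) stacks the one-row results back into a single data frame. Where other columns differ within a trip, as CONTINUE and SPL do in the question's data, only the values from the first row of each trip survive.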
Re: [R] do.call in 2.3.0 vers 2.3.x
See: https://www.stat.math.ethz.ch/pipermail/r-devel/2006-May/037542.html

On 5/4/06, Dieter Menne [EMAIL PROTECTED] wrote:

Dear R-Core,

after switching to 2.3.0, all my trusted do.call constructs that worked in 2.2 and earlier fail. I noted that changes were introduced to do.call, but I could not find out how these relate to my problem. The following example works in 2.2 and earlier, but fails because rownames are partially NA. I can correct this by manually adding row names, but it's a bit of work to check this in all my code.

Dieter

--
  wby = by(warpbreaks[, 1:2], warpbreaks$tension, function(x) {
    data.frame(breaks=mean(x$breaks),var=var(x$breaks)) } )
  cd = do.call(rbind,wby)
  row.names(cd)
  cd

Output in 2.3.0

  row.names(cd)
  [1] NA NA1 NA2
  cd
  Error in data.frame(breaks = c(36.38889, 26.38889, 21.7), var = c(270.48693, :
    row names contain missing values

  platform       i386-pc-mingw32
  arch           i386
  os             mingw32
  system         i386, mingw32
  status
  major          2
  minor          3.0
  year           2006
  month          04
  day            24
  svn rev        37909
  language       R
  version.string Version 2.3.0 (2006-04-24)
Re: [R] combined multiple observations
Assuming DF is your data frame, try this:

  aggregate(DF[,0-1:2], DF[,1:2], sum)

On 5/4/06, YIHSU CHEN [EMAIL PROTECTED] wrote:

Dear R users:

I have a data frame as follows, where e1-e3 are indicator variables with value equal to 0 or 1.

  St County e1 e2 e3
  1  2      1  0  0
  1  2      0  1  0
  2  1      0  0  1
  2  2      1  0  0

What I would like to do is to combine observations with the same pair of St and County. For example, for St=1 and County=2, I would like to have the following:

  St County e1 e2 e3
  1  2      1  1  0

Since I have a total of more than 3 observations, any brute-force way seems not to be efficient. Does any of you have experience dealing with this? Thank you so much.

Yihsu Chen
The Johns Hopkins University
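The aggregate() call from the answer, run on the four rows shown in the question. This sketch writes the column selection as -(1:2), which selects the same columns as 0-1:2 in the reply, and adds a max-based variant as an assumption about what might be wanted if the same indicator can occur more than once within a pair.

```r
DF <- data.frame(St     = c(1, 1, 2, 2),
                 County = c(2, 2, 1, 2),
                 e1     = c(1, 0, 0, 1),
                 e2     = c(0, 1, 0, 0),
                 e3     = c(0, 0, 1, 0))

# Sum the indicator columns within each (St, County) pair
res <- aggregate(DF[, -(1:2)], by = DF[, 1:2], FUN = sum)

# If a pair could repeat an indicator, max keeps the result 0/1
res01 <- aggregate(DF[, -(1:2)], by = DF[, 1:2], FUN = max)

# The combined row for St = 1, County = 2
r <- res[res$St == 1 & res$County == 2, ]
```

With sum, a pair that carries the same indicator in two rows would yield 2 rather than 1, which is why the max variant is shown alongside it.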