Re: [R] Which is the best hardware?
On Mon, 5 Dec 2005, Kenneth Cabrera wrote:

> Hi R users:
>
> In your opinion and experience, which hardware configuration is the best
> to run R over LINUX? By best I mean best performance, and also cheapest
> (about U$ 2.000 for the whole basic system: motherboard + CPUs + RAM + HD).

I presume you don't need a display or keyboard.

Well, prices depend on where you are and the quality of components, e.g. power supplies. But as I am just buying a new system I have some idea. I would suggest an Athlon 64 X2 would be a good choice: that's a 64-bit system with de facto two processors which can be bought here with 2Gb RAM at well under your price.

> By the way, which LINUX distribution is the best to run R with high
> computing techniques (simulation, Bayesian, etc.) and huge databases?
> And in combination with what kind of (cheap) hardware?

They are all based on the same components: a distribution is just the packaging and installation tools. For performance it seems that systems based on gcc3 rather than gcc4 still have a small edge, but I would say local expertise is far more important.

(At one point in the committee for a large procurement I pointed out that a 10% difference between two systems was 2.5 months' worth of Moore's Law, and that covered the spread of benchmark results for all the contenders. So if you want better performance, just wait a few months.)

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
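The "2.5 months" figure in the message above can be checked with a short calculation. This is only a sketch: it assumes the common reading of Moore's Law as performance doubling every 18 months, which the message itself does not state, and the function name is made up for illustration.

```r
# Assumption (not stated in the message): performance doubles every 18 months.
doubling_months <- 18

# Months of Moore's-Law growth needed to gain a given speed-up factor.
months_for_speedup <- function(speedup, doubling = doubling_months) {
  doubling * log2(speedup)
}

months_for_speedup(1.10)  # roughly 2.5 months for a 10% difference
```

With a different doubling period the number scales proportionally, so the conclusion (a 10% gap is a few months of waiting) is not very sensitive to the assumption.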
[R] Help
Hi R-Users,

I apologize if this is too simple a question. I have a multivariate dataset with 7 independent variables and 1 dependent variable, and 248 data points. I want to do an out-of-sample forecast, first using 156 points. So I have to start from the 157th point and calculate the 157th y_hat value, and continue in this way up to the 248th data point. Can anyone tell me how I can do this with a for loop?

Thanks a lot in advance.

Thanks & Regards,
SUMANTA BASAK.
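One way the loop described in this question could look is sketched below. It is not from the thread and rests on several assumptions: the model is an ordinary linear regression fitted with lm(), the estimation window expands by one observation at each step, and the data are simulated stand-ins (`dat`, `x1` to `x7`, `y` are hypothetical names).

```r
# Simulated stand-in for the poster's data: 248 rows, 7 predictors, 1 response.
set.seed(1)
n <- 248
dat <- as.data.frame(matrix(rnorm(n * 7), ncol = 7))
names(dat) <- paste0("x", 1:7)
dat$y <- 1 + 0.5 * rowSums(dat[, 1:7]) + rnorm(n)

first_fit <- 156                 # observations used for the first fit
y_hat <- rep(NA_real_, n)        # one-step-ahead forecasts, NA before row 157

for (i in (first_fit + 1):n) {
  # Fit on rows 1..(i - 1), then predict row i (expanding window).
  fit <- lm(y ~ ., data = dat[1:(i - 1), ])
  y_hat[i] <- predict(fit, newdata = dat[i, ])
}

# Out-of-sample forecast errors for rows 157..248.
forecast_errors <- dat$y[(first_fit + 1):n] - y_hat[(first_fit + 1):n]
```

If the model should be fitted once on the first 156 points only, the lm() call can be moved in front of the loop, or the loop replaced by a single predict() call on rows 157 to 248.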
Re: [R] extracting p-values from lmer()
>>>>> "Renaud" == Renaud Lancelot [EMAIL PROTECTED]
>>>>>     on Tue, 6 Dec 2005 08:09:35 +0100 writes:

    Renaud> For example:
    Renaud> vc <- vcov(m1, useScale = FALSE)
    Renaud> b <- fixef(m1)
    Renaud> se <- sqrt(diag(vc))
    Renaud> z <- b / sqrt(diag(vc))
    Renaud> P <- 2 * (1 - pnorm(abs(z)))
    Renaud> cbind(b, se, z, P)
    Renaud>                      b          se         z P
    Renaud> (Intercept)  0.3596720 0.007023556  51.20939 0
    Renaud> x1           0.2941068 0.002371353 124.02487 0
    Renaud> x2          -0.9272545 0.010087717 -91.91917 0

I still see far too many uses of 1 - pdist(...), which in cases like the one above leads to complete loss of accuracy (1 - 1 = 0) -- well, actually the above case is too extreme to make any difference; but let me explain the general principle:

Though the loss is usually no problem for decision making based on P-values, it is unnecessary: one of the (extra) features of R is the arguments 'lower.tail' and 'log.p' of all the pdist() functions, which (in not yet quite all cases) allow one to avoid precision loss.

E.g.,

    > 1 - pnorm(c(6, 8, 10, 20))
    [1] 9.865877e-10 6.661338e-16 0.000000e+00 0.000000e+00
    > pnorm(c(6, 8, 10, 20), lower.tail=FALSE)
    [1] 9.865876e-10 6.220961e-16 7.619853e-24 2.753624e-89

BTW, example(pnorm) ends in two plots which show the advantage of using 'log.p' for additional precision gain, e.g. for log-likelihood computation.

Martin Maechler, ETH Zurich
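Applied to two-sided P-values, the advice above might look like the following sketch. The z values are chosen for illustration (the last one is the intercept's z from the quoted output), and the variable names are made up.

```r
z <- c(6, 10, 51.20939)

# Naive version: suffers from cancellation once pnorm(abs(z)) rounds to 1.
p_naive <- 2 * (1 - pnorm(abs(z)))

# Upper tail computed directly: keeps precision until the value underflows.
p_tail <- 2 * pnorm(abs(z), lower.tail = FALSE)

# On the log scale even the extreme case stays finite.
log_p <- log(2) + pnorm(abs(z), lower.tail = FALSE, log.p = TRUE)

cbind(z, p_naive, p_tail, log_p)
```

For z = 10 the naive form gives exactly 0 while the lower.tail form does not; for z above roughly 38 both give 0 and only the log.p form still carries information, which matches the remark that the quoted case is "too extreme to make any difference".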
[R] how to extract row col names from a matrix
Dear all,

I would like to extract the row names and column names from a named matrix, like:

    a <- matrix(1:6, 2)
    ro <- c("aa", "bb")
    co <- c("dd", "ee", "ff")
    dimnames(a) <- list(ro, co)
    a
       dd ee ff
    aa  1  3  5
    bb  2  4  6

From the above matrix a I would like to extract the row names separately, like rownames(a) = (aa, bb), and the column names separately, like colnames(a) = (dd, ee, ff).

Kindly suggest some good ways...

Thank you all.

with regards,
boopathy.

Thirumalai Shanmuha Boopathy,
Zimmer no : 1109,
Rütscher strasse 165,
52072 Aachen,
Germany.
Home zone : 0049 - 241 - 9813409
Mobile zone : 0049 - 176 - 23567867
[R] Writing a list to a file !
Hi All,

This may be trivial in R but I have been trying without any success. I have a list of 100 elements, each having a sub-list of different length. I would like to write the list to an ASCII file. I tried write.table(), after converting my list to a matrix. Now it looks like

    Robert c(90, 50, 30)
    John   c(91, 20, 25, 45)

How can I get rid of the c(...)? In my file, I would like to have

    Robert 90, 50, 30
    John   91, 20, 25, 45

Thanks in advance.

Regards,
Ezhil
[R] figure with inset
I am trying to plot a figure within a figure (an inset that shows a closeup of part of the data set). I have searched R-help and other sources but not found a solution.

What I would like to do is
(1) produce a plot
(2) specify a window that will be used for the next plot (in inches or using the coordinate system of the plot produced in (1))
(3) overlay a new plot in the window specified under (2)

The result would be:

    +----------------------+
    |                      |
    |   first plot         |
    |         +--------+   |
    |         | inset  |   |
    |         +--------+   |
    |                      |
    +----------------------+

Thank you for your help

Pascal
[R] how to draw continent boundary
Hi,

If I am plotting a world map like plot(lon, lat), how do I draw the continent boundaries in that plot? What is the command?

Many thanks

Regards,
Yogesh

--
Yogesh K. Tiwari,
Max-Planck Institute for Biogeochemistry,
Hans-Knoell Strasse 10,
D-07745 Jena,
Germany
Office : 0049 3641 576 376
Home   : 0049 3641 223 163
Fax    : 0049 3641 577 300
Cell   : 0049 1520 459 1008
e-mail : [EMAIL PROTECTED]
[R] how to get or store the intermediate values while running a function
Dear all,

While running a function I'm getting only the final output of the function. But if I would like to store or recover some values that are intermediate in the function's calculations, which command do I have to use for storing those values? I hope you understand.

For example:

    a <- function(a,b,c,d)
    {
      k=a+b
      l=c+d
      m=k+l
    }

In this example the function will return only the value of m. But I would like to extract the values of l and k also. Which command should I use for storing or extracting those intermediate values?

Thank you all.

with regards,
boopathy.

Thirumalai Shanmuha Boopathy,
Zimmer no : 1109,
Rütscher strasse 165,
52072 Aachen,
Germany.
Home zone : 0049 - 241 - 9813409
Mobile zone : 0049 - 176 - 23567867
Re: [R] figure with inset
[EMAIL PROTECTED] wrote:

> I am trying to plot a figure within a figure (an inset that shows a
> closeup of part of the data set). I have searched R-help and other
> sources but not found a solution.

See the examples on the grid package by Paul Murrell in R News.

Uwe Ligges

> What I would like to do is
> (1) produce a plot
> (2) specify a window that will be used for the next plot (in inches or
>     using the coordinate system of the plot produced in (1))
> (3) overlay a new plot in the window specified under (2)
>
> The result would be:
>
>     +----------------------+
>     |                      |
>     |   first plot         |
>     |         +--------+   |
>     |         | inset  |   |
>     |         +--------+   |
>     |                      |
>     +----------------------+
>
> Thank you for your help
>
> Pascal
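The reply points to the grid package. As an alternative that stays in base graphics (a swapped-in technique, not what the reply suggests), the three steps in the question can be sketched with par(fig = ..., new = TRUE). The data, the inset position and the zoom range are all made up for illustration.

```r
set.seed(42)
x <- sort(runif(200, 0, 10))
y <- sin(x) + rnorm(200, sd = 0.1)

pdf(NULL)  # null device so the sketch runs unattended; omit when working interactively

# (1) the main plot
plot(x, y, type = "l", main = "first plot")

# (2) the window for the inset, as fractions of the device: c(x1, x2, y1, y2);
#     new = TRUE makes the next plot() draw on top instead of clearing the page
op <- par(fig = c(0.55, 0.95, 0.55, 0.92), new = TRUE, mar = c(2, 2, 1, 1))

# (3) the closeup, drawn inside that window
zoom <- x > 2 & x < 3
plot(x[zoom], y[zoom], type = "l", xlab = "", ylab = "")

par(op)
dev.off()
```

The fig values are in normalized device coordinates, not in the user coordinates of the first plot; converting from user coordinates would need grconvertX() and grconvertY(), which this sketch leaves out.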
[R] about partial correlation
Hello everyone,

My name is Vangelis and I want to ask a question about partial correlation. I have used the command pcor.shrink to evaluate the partial correlations of a data.frame, but the problem is that in the output I cannot see whether these correlations are significant or not. Is there any command which can show me if these correlations are significant at the 95% level or another level?

Thank you very much.

Kind regards,
Vangelis
Re: [R] how to extract row col names from a matrix
You mean something like the following?

    cat("("); cat(rownames(a), sep=","); cat(")")
    cat("("); cat(colnames(a), sep=","); cat(")")

Best regards,
Kristel

shanmuha boopathy wrote:
> Dear all,
>
> I would like to extract the row names and column names from a named
> matrix, like:
>
>     a <- matrix(1:6, 2)
>     ro <- c("aa", "bb")
>     co <- c("dd", "ee", "ff")
>     dimnames(a) <- list(ro, co)
>     a
>        dd ee ff
>     aa  1  3  5
>     bb  2  4  6
>
> From the above matrix a I would like to extract the row names
> separately, like rownames(a) = (aa, bb), and the column names
> separately, like colnames(a) = (dd, ee, ff).
>
> Kindly suggest some good ways...

--
Kristel Joossens        Ph.D. Student
Research Center ORSTAT  K.U. Leuven
Naamsestraat 69         Tel: +32 16 326929
3000 Leuven, Belgium    Fax: +32 16 326732
E-mail: [EMAIL PROTECTED]
http://www.econ.kuleuven.be/public/ndbae49
Re: [R] Writing a list to a file !
as.numeric? E.g.

    R> res
    [1] "90" "50" "30"
    R> as.numeric(res)
    [1] 90 50 30

A Ezhil wrote:
> Hi All,
>
> This may be trivial in R but I have been trying without any success. I
> have a list of 100 elements, each having a sub-list of different
> length. I would like to write the list to an ASCII file. I tried
> write.table(), after converting my list to a matrix. Now it looks like
>
>     Robert c(90, 50, 30)
>     John   c(91, 20, 25, 45)
>
> How can I get rid of the c(...)? In my file, I would like to have
>
>     Robert 90, 50, 30
>     John   91, 20, 25, 45
>
> Thanks in advance.
>
> Regards,
> Ezhil

--
Kristel Joossens        Ph.D. Student
Research Center ORSTAT  K.U. Leuven
Naamsestraat 69         Tel: +32 16 326929
3000 Leuven, Belgium    Fax: +32 16 326732
E-mail: [EMAIL PROTECTED]
http://www.econ.kuleuven.be/public/ndbae49
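A different route to the file layout asked for (not the as.numeric() suggestion above) is to build each output line as a string and skip the matrix conversion altogether. The following is a minimal sketch; the list `scores` and the file name are hypothetical stand-ins for the poster's data.

```r
# Hypothetical stand-in for the poster's list of 100 elements.
scores <- list(Robert = c(90, 50, 30), John = c(91, 20, 25, 45))

out_file <- tempfile(fileext = ".txt")

# One line per element: the name, a space, then the values joined by ", ".
out_lines <- vapply(names(scores),
                    function(nm) paste(nm, paste(scores[[nm]], collapse = ", ")),
                    character(1))
writeLines(out_lines, out_file)

readLines(out_file)
```

Because every line is assembled with paste(), elements of different length need no padding, which is where the matrix conversion ran into trouble.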
Re: [R] how to draw continent boundary
Yogesh K. Tiwari wrote:

> Hi,
>
> If I am plotting a world map like plot(lon, lat), how do I draw the
> continent boundaries in that plot? What is the command?

See, e.g., the packages maps, mapdata, and mapproj as well as the task view "Spatial" on CRAN.

Uwe Ligges

> Many thanks
>
> Regards,
> Yogesh
Re: [R] how to extract row col names from a matrix
shanmuha boopathy wrote:

> a <- matrix(1:6, 2)
> ro <- c("aa", "bb")
> co <- c("dd", "ee", "ff")
> dimnames(a) <- list(ro, co)

(Not sure I fully understand the question), but:

    rn = rownames(a);
    cn = colnames(a);
Re: [R] how to get or store the intermediate values while running a function
shanmuha boopathy wrote:

> a <- function(a,b,c,d)
> {
>   k=a+b
>   l=c+d
>   m=k+l
> }
>
> In this example the function will return only the value of m. But I
> would like to extract the values of l and k also. Which command should
> I use for storing or extracting those intermediate values?

may I suggest, inside your function

    res = c(k, l, m);
    return(res);

# also ... read some intro docs !
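A variation on the suggestion above, sketched here under the assumption that named access to the intermediate values is wanted, is to return a list instead of a vector. The function name `f` and the input values are made up.

```r
f <- function(a, b, c, d) {
  k <- a + b
  l <- c + d
  m <- k + l
  # Return all three values by name instead of only the last one.
  list(k = k, l = l, m = m)
}

out <- f(1, 2, 3, 4)
out$k  # 3
out$l  # 7
out$m  # 10
```

A list also keeps working when the intermediate values have different types or lengths, where c() would coerce or flatten them.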
Re: [R] about partial correlation
maybe the function pcor.confint() from package 'GeneNT' could be of help.

Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://www.med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm

----- Original Message -----
From: Vangelis Panagiotaras [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Sent: Tuesday, December 06, 2005 11:09 AM
Subject: [R] about partial correlation

> Hello everyone,
>
> My name is Vangelis and I want to ask a question about partial
> correlation. I have used the command pcor.shrink to evaluate the
> partial correlations of a data.frame, but the problem is that in the
> output I cannot see whether these correlations are significant or not.
> Is there any command which can show me if these correlations are
> significant at the 95% level or another level?
>
> Thank you very much.
>
> Kind regards,
> Vangelis
Re: [R] how to draw continent boundary
Have you looked in the maps package?

Yogesh K. Tiwari wrote:
> Hi,
>
> If I am plotting a world map like plot(lon, lat), how do I draw the
> continent boundaries in that plot? What is the command?
>
> Many thanks
>
> Regards,
> Yogesh
[R] Help on a matrix task
Hello,

Being new to R, I am completely stuck with the following problem. Please help me find a general solution to the following matrix task.

Given:

    N <- 4
    input_mat <- matrix(c(1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0,
                          1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0,
                          1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0,
                          1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0), ncol=N)
    combin_mat <- matrix(c(1, 2, 1, 3, 1, 4, 2, 3, 2, 4, 3, 4), ncol=choose(N,2))

Find the indices of the rows in input_mat whose elements, at the two positions given by each column of combin_mat, are both equal to 1.

So, for the first column of combin_mat (1,2) the answer should be 1,2,3,4: these rows of input_mat have 1 as both the first and the second element. For the second column of combin_mat (1,3) the answer should be 1,2,5,6; for the third column of combin_mat (1,4) the answer should be 1,3,5,7; and so on.

input_mat is the matrix of binary representations of the integers 2^N - 1 down to 0 in descending order; here N = 4, so 15, 14, ..., 0. combin_mat is the matrix of all combinations of N elements taken 2 at a time.
[R] about partial correlation (again)
Hello everyone,

I tried to install the library GeneNT in order to use the command pcor.confint, because I want to construct confidence intervals for partial correlations, but among other requirements the specific library needs the library Graph, which I don't have and cannot find at this site. Is there any other site from which I can download this library?

Thanks

Kind regards,
Vangelis
[R] R is GNU S, not C.... [was how to get or store .....]
>>>>> "vincent" == vincent [EMAIL PROTECTED]
>>>>>     on Tue, 06 Dec 2005 11:09:36 +0100 writes:

    vincent> shanmuha boopathy wrote:
    >> a <- function(a,b,c,d)
    >> {
    >>   k=a+b
    >>   l=c+d
    >>   m=k+l
    >> }
    >> in this example the function will return only the value of m.
    >> But I would like to extract the values of l and k also. Which
    >> command should I use for storing or extracting those
    >> intermediate values?

    vincent> may I suggest, inside your function
    vincent> res = c(k, l, m);
    vincent> return(res);

please, please, these trailing ";" are *so* ugly. This is GNU S, not C (or matlab)!

{and I have another chain of arguments why "<-" is so much more expressive than "=", but I'll be happy already if you could drop these ugly empty statements at the end of your lines...}

    vincent> # also ... read some intro docs !
Re: [R] Help on a matrix task
Here is one possible solution:

    for(cr in seq(1, dim(combin_mat)[2])) {
      W = which(input_mat[,combin_mat[1,cr]] == 1 &
                input_mat[,combin_mat[2,cr]] == 1)
      cat("Combination", cr, "(", combin_mat[,cr], ") :", W, "\n")
    }

JeeBee.

---

Full program:

    N = 4
    input_numbers = seq((2^N)-1, 0, -1)

    # convert to binary matrix
    input_mat = NULL
    for(i in seq(N-1,0,-1)) {
      new_col = input_numbers %% 2
      input_mat = cbind(new_col, input_mat)
      input_numbers = (input_numbers - new_col) / 2
    }
    colnames(input_mat) = NULL

    library(gtools)
    combin_mat = t(combinations(n=N, r=2, v=1:N, set=TRUE, repeats.allowed=FALSE))

    for(cr in seq(1, dim(combin_mat)[2])) {
      W = which(input_mat[,combin_mat[1,cr]] == 1 &
                input_mat[,combin_mat[2,cr]] == 1)
      cat("Combination", cr, "(", combin_mat[,cr], ") :", W, "\n")
    }
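The same task can also be sketched without the gtools dependency, using combn() from base R and collecting the results in a list instead of printing them. This is an alternative formulation rather than the poster's code, and the name `hits` is made up.

```r
N <- 4

# Rows are the binary representations of 2^N - 1, ..., 0, most significant bit first.
input_mat <- sapply((N - 1):0, function(bit) (((2^N - 1):0) %/% 2^bit) %% 2)

# All pairs of column indices, one pair per column.
combin_mat <- combn(N, 2)

# For each pair, the rows where both selected columns equal 1.
hits <- lapply(seq_len(ncol(combin_mat)), function(j) {
  which(rowSums(input_mat[, combin_mat[, j], drop = FALSE]) == 2)
})
names(hits) <- apply(combin_mat, 2, paste, collapse = ",")

hits[["1,2"]]  # rows 1 2 3 4
hits[["1,3"]]  # rows 1 2 5 6
hits[["1,4"]]  # rows 1 3 5 7
```

Testing rowSums() == 2 relies on the entries being 0 or 1; for general data the explicit comparison with == 1 and & from the loop above is the safer form.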
Re: [R] about partial correlation (again)
>>>>> "Vangelis" == Vangelis Panagiotaras [EMAIL PROTECTED]
>>>>>     on Tue, 6 Dec 2005 13:39:33 +0200 writes:

    Vangelis> Hello everyone
    Vangelis> I tried to install the library GeneNT in order to use the command
    Vangelis> pcor.confint because I want to construct confidence intervals for
    Vangelis> partial correlations but among other requirements the specific library
    Vangelis> needs the library Graph which I don't have and cannot find at
    Vangelis> this site. Is there any other site from which I can download this library?
    Vangelis> Thanks

Oh my! "library": 4 times in only 2 sentences !!

Maybe you really should install the fortunes **PACKAGE** and look at the result of
    fortune(yikes !)

It's the GeneNT *package* and the 'graph' *package*,

    Vangelis> Kind regards
    Vangelis> Vangelis

    Vangelis> [[alternative HTML version deleted]]

    Vangelis> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Yes, please do: it will tell you why HTMLified e-mails are not liked on our mailing lists.
Re: [R] R is GNU S, not C.... [was how to get or store .....]
Martin Maechler wrote:

> please, please, these trailing ";" are *so* ugly. This is GNU S, not C
> (or matlab)!
>
> {and I have another chain of arguments why "<-" is so much more
> expressive than "=", but I'll be happy already if you could drop these
> ugly empty statements at the end of your lines...}

By the way, does anybody know if there is an "R tidy" or some similar project to automatically reformat (and possibly check) R code, besides what Emacs does?

Best,
Philippe Grosjean
Re: [R] R is GNU S, not C.... [was how to get or store .....]
Yes, it drives me mad too when people use "=" instead of "<-" for assignment and suppress spaces in a naive attempt to save space. As an example, compare

    o=fn(x=1,y=10,z=1)

with

    o <- fn( x=1, y=10, z=1 )

Regards,
Adai

On Tue, 2005-12-06 at 13:43 +0100, Martin Maechler wrote:

> please, please, these trailing ";" are *so* ugly. This is GNU S, not C
> (or matlab)!
>
> {and I have another chain of arguments why "<-" is so much more
> expressive than "=", but I'll be happy already if you could drop these
> ugly empty statements at the end of your lines...}
[R] O-ring statistic in R?
Hi,

Thorsten Wiegand used in his paper

    Wiegand T., and K. A. Moloney 2004. Rings, circles and null-models
    for point pattern analysis in ecology. Oikos 104: 209-229

a statistic he called the O-ring statistic, which is similar to Ripley's K, only that it uses rings instead of circles.

http://www.oesa.ufz.de/towi/towi_programita.html#ring

Is this statistic included in one of the packages in R?

Thanks,
Rainer

--
Rainer M. Krug, Dipl. Phys. (Germany), MSc Conservation Biology (UCT)
Department of Conservation Ecology
University of Stellenbosch
Matieland 7602
South Africa
Tel:   +27 - (0)72 808 2975 (w)
Fax:   +27 - (0)21 808 3304
Cell:  +27 - (0)83 9479 042
email: [EMAIL PROTECTED]
       [EMAIL PROTECTED]
[R] merging with aggregating
Dear List,

I have two data.frames of the following form:

A:
     n  V1  V2
     1  12   0
     2  10   8
     3   3   8
     4   8   4
     6   7   3
     7  12   0
     8   1   0
     9  18   0
    10   1   0
    13   2   0

B:
     n  V1  V2
     1   0   2
     2   0   3
     3   1   9
     4  12   8
     5   2   9
     6   2   9
     8   2   0
    10   4   1
    11   7   1
    12   0   1

Now I want to merge those frames into one data.frame, summing up the columns V1 and V2 but not the column n. So the result in this example would be:

AB:
     n  V1  V2
     1  12   2
     2  10  11
     3   4  17
     4  20  12
     5   2   9
     6   9  12
     7  12   0
     8   3   0
     9  18   0
    10   5   1
    11   7   1
    12   0   1
    13   2   0

So columns V1 and V2 are the sums over A and B, while n keeps its old value. Notice that A and B do not contain the same values of n.

I don't have a clue how to start here. Any hint is welcome.

Thanks

Dubravko Dolic
Munich
Germany
[R] array of lists? is this the best way to do it?
[Q.] How do I create an array of lists, or structures, in the most elegant way?

There have been questions about this in the past but none too recently. I want to know if the following looks OK to you guys or if there is a better way to create an array of lists:

    # PREAMBLE ... JUST TO GET THINGS GOING
    makeList <- function(data, anythingElse) {
      rval <- list( data = data, anythingElse = anythingElse )
      class(rval) <- "myListOfArbitraryThings"
      return(rval)
    }

    # make up some arbitrary data
    payload <- list( as.matrix(cbind(1,1:3)),
                     10:15,
                     data.frame(cbind(x=1, y=1:10),
                                fac=sample(LETTERS[1:3], 10, repl=TRUE)) )

    # HERE'S THE ARRAY-CONSTRUCTION PART THAT I WANT CRITIQUED:
    n <- 3                    # number of lists in the array of lists
    v <- vector("list", n)    # <--- IS THIS THE BEST WAY TO CREATE AN ARRAY OF LISTS?

    # fill the array with essentially arbitrary stuff:
    for (i in 1:n) v[[i]] <- makeList(payload[[i]], i)

Thanks,
Jack.
[R] urgent
Hello R Users,

I have two sets of values:

    x <- c(7, 7, 8, 9, 15, 17, 18)
    y <- c(7, 8, 9, 15, 17, 19, 20, 20, 25, 23, 22)

I am able to create a multiple histogram using multhist(), but I am not able to control the 'xlim', i.e. the x axis is showing 7.5, 13, 18, 23.

1st: on what basis is this calculated?
2nd: I want it to be like 7 8 9 15 17 and so on.

Can anyone help me?

With Regards,
Subhabrata Pal
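For the second point, one option that does not use multhist() at all is to count each distinct value and draw a grouped bar chart with base graphics, so that the axis labels are the values themselves. This is a sketch of that swapped-in technique, not an explanation of how multhist() chooses its labels.

```r
x <- c(7, 7, 8, 9, 15, 17, 18)
y <- c(7, 8, 9, 15, 17, 19, 20, 20, 25, 23, 22)

# Every distinct value that occurs in either sample, in increasing order.
vals <- sort(unique(c(x, y)))

# Counts per value for each sample: one row per sample, one column per value.
counts <- rbind(x = tabulate(match(x, vals), nbins = length(vals)),
                y = tabulate(match(y, vals), nbins = length(vals)))
colnames(counts) <- vals

pdf(NULL)  # null device so the sketch runs unattended; omit when working interactively
barplot(counts, beside = TRUE, legend.text = TRUE,
        xlab = "value", ylab = "count")
dev.off()
```

Note that the bars are equally spaced, so the gap between 9 and 15 is not drawn to scale; this is a bar chart of counts rather than a histogram with a numeric axis.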
[R] reading in data with variable length
I have very large csv files (up to 1GB each of ASCII text). I'd like to be able to read them directly into R. The problem I am having is with the variable length of the data in each record.

Here's a (simplified) example:

    $ cat foo.csv
    Name,Start Month,Data
    Foo,10,-0.5615,2.3065,0.1589,-0.3649,1.5955
    Bar,21,0.0880,0.5733,0.0081,2.0253,-0.7602,0.7765,0.2810,1.8546,0.2696,0.3316,0.1565,-0.4847,-0.1325,0.0454,-1.2114

The records consist of rows with some fixed comma-separated fields (e.g. the Name and Start Month fields in the above), and then the data follow as a variable-length list of comma-separated values until a new line is encountered.

Now I can use e.g.

    fileName="foo.csv"
    ta <- read.csv(fileName, header=F, skip=1, sep=",", dec=".", fill=T)

which does the job nicely:

       V1 V2      V3     V4     V5      V6      V7     V8    V9    V10    V11    V12    V13     V14     V15    V16     V17
    1 Foo 10 -0.5615 2.3065 0.1589 -0.3649  1.5955     NA    NA     NA     NA     NA     NA      NA      NA     NA      NA
    2 Bar 21  0.0880 0.5733 0.0081  2.0253 -0.7602 0.7765 0.281 1.8546 0.2696 0.3316 0.1565 -0.4847 -0.1325 0.0454 -1.2114

but the problem is that with files on the order of 1GB this either crunches forever or runs out of memory trying ... plus having all those NAs isn't too pretty to look at.

(I have a MATLAB version that can read this stuff into an array of cells in about 3 minutes.)

I really want a fast way to read the data part into a list; that way I can access data in the array of lists containing the records by doing something like ta[[i]]$data.

Ideas?

Thanks,
Jack.
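The list layout asked for at the end (ta[[i]]$data) can be sketched by reading raw lines and splitting them, which avoids the NA padding. The sketch below writes a shortened copy of the example to a temporary file so that it is self-contained; the field names `name` and `start_month` are made up, and whether this is fast enough for a 1GB file has not been tested here.

```r
# A shortened copy of the example file, written to a temporary location.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Name,Start Month,Data",
             "Foo,10,-0.5615,2.3065,0.1589,-0.3649,1.5955",
             "Bar,21,0.0880,0.5733,0.0081"),
           csv_path)

raw <- readLines(csv_path)[-1]             # drop the header line
fields <- strsplit(raw, ",", fixed = TRUE)  # one character vector per record

# One list per record: two fixed fields, then the variable-length data.
ta <- lapply(fields, function(f) {
  list(name        = f[1],
       start_month = as.integer(f[2]),
       data        = as.numeric(f[-(1:2)]))
})

ta[[1]]$data
```

For very large files, readLines() can be called on an open connection with its n argument to process the file in chunks instead of holding every line in memory at once.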
Re: [R] array of lists? is this the best way to do it?
On 12/6/05, John McHenry [EMAIL PROTECTED] wrote:

> [Q.] How do I create an array of lists, or structures, in the most
> elegant way?
>
> There have been questions about this in the past but none too
> recently. I want to know if the following looks OK to you guys or if
> there is a better way to create an array of lists:
>
>     # PREAMBLE ... JUST TO GET THINGS GOING
>     makeList <- function(data, anythingElse) {
>       rval <- list( data = data, anythingElse = anythingElse )
>       class(rval) <- "myListOfArbitraryThings"
>       return(rval)
>     }
>
>     # make up some arbitrary data
>     payload <- list( as.matrix(cbind(1,1:3)),
>                      10:15,
>                      data.frame(cbind(x=1, y=1:10),
>                                 fac=sample(LETTERS[1:3], 10, repl=TRUE)) )
>
>     # HERE'S THE ARRAY-CONSTRUCTION PART THAT I WANT CRITIQUED:
>     n <- 3                    # number of lists in the array of lists
>     v <- vector("list", n)    # <--- IS THIS THE BEST WAY TO CREATE AN ARRAY OF LISTS?
>
>     # fill the array with essentially arbitrary stuff:
>     for (i in 1:n) v[[i]] <- makeList(payload[[i]], i)

You could use lapply to avoid having to set up the empty list:

    lapply(1:n, function(i) makeList(payload[[i]], i)) # untested
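The lapply() suggestion is marked as untested in the reply; the sketch below checks it against the original loop on a simplified payload. The payload contents are made up, and Map() is added as a further base-R variant that is not mentioned in the thread.

```r
makeList <- function(data, anythingElse) {
  structure(list(data = data, anythingElse = anythingElse),
            class = "myListOfArbitraryThings")
}

# Simplified stand-in for the poster's payload.
payload <- list(cbind(1, 1:3), 10:15, letters[1:4])

# Original approach: preallocate, then fill in a loop.
v_loop <- vector("list", length(payload))
for (i in seq_along(payload)) v_loop[[i]] <- makeList(payload[[i]], i)

# Suggested approach: let lapply() build the list.
v_lapply <- lapply(seq_along(payload), function(i) makeList(payload[[i]], i))

# Variant: Map() walks over the payload and the index in parallel.
v_map <- Map(makeList, payload, seq_along(payload))

identical(v_loop, v_lapply)
```

seq_along(payload) is used instead of 1:n so that an empty payload yields an empty result rather than iterating over c(1, 0).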
Re: [R] merging with aggregating
On Tue, 2005-12-06 at 14:22 +0100, Dubravko Dolic wrote:

> Dear List,
>
> I have two data.frames of the following form:
>
> A:
>      n  V1  V2
>      1  12   0
>      2  10   8
>      3   3   8
>      4   8   4
>      6   7   3
>      7  12   0
>      8   1   0
>      9  18   0
>     10   1   0
>     13   2   0
>
> B:
>      n  V1  V2
>      1   0   2
>      2   0   3
>      3   1   9
>      4  12   8
>      5   2   9
>      6   2   9
>      8   2   0
>     10   4   1
>     11   7   1
>     12   0   1
>
> Now I want to merge those frames into one data.frame, summing up the
> columns V1 and V2 but not the column n. So the result in this example
> would be:
>
> AB:
>      n  V1  V2
>      1  12   2
>      2  10  11
>      3   4  17
>      4  20  12
>      5   2   9
>      6   9  12
>      7  12   0
>      8   3   0
>      9  18   0
>     10   5   1
>     11   7   1
>     12   0   1
>     13   2   0
>
> So columns V1 and V2 are the sums over A and B, while n keeps its old
> value. Notice that A and B do not contain the same values of n.
>
> I don't have a clue how to start here. Any hint is welcome.
>
> Thanks

There might be a somewhat easier way, but here is one approach:

    # Use merge() to join A and B on 'n'
    # Set all = TRUE to include non-matched rows
    C <- merge(A, B, by = "n", all = TRUE)

    > C
        n V1.x V2.x V1.y V2.y
    1   1   12    0    0    2
    2   2   10    8    0    3
    3   3    3    8    1    9
    4   4    8    4   12    8
    5   5   NA   NA    2    9
    6   6    7    3    2    9
    7   7   12    0   NA   NA
    8   8    1    0    2    0
    9   9   18    0   NA   NA
    10 10    1    0    4    1
    11 11   NA   NA    7    1
    12 12   NA   NA    0    1
    13 13    2    0   NA   NA

    # Now get the rowSums() for the V1/V2 column pairs
    # and create a new dataframe from the results
    AB <- data.frame(n = C$n,
                     V1 = rowSums(C[, c(2, 4)], na.rm = TRUE),
                     V2 = rowSums(C[, c(3, 5)], na.rm = TRUE))

    > AB
        n V1 V2
    1   1 12  2
    2   2 10 11
    3   3  4 17
    4   4 20 12
    5   5  2  9
    6   6  9 12
    7   7 12  0
    8   8  3  0
    9   9 18  0
    10 10  5  1
    11 11  7  1
    12 12  0  1
    13 13  2  0

See ?merge and ?rowSums for more information.

HTH,
Marc Schwartz
Re: [R] R is GNU S, not C.... [was how to get or store .....]
Martin Maechler wrote:
> please, please, these trailing ; are *so* ugly. This is GNU S, not C (or matlab) !
> but I'll be happy already if you could drop these ugly empty statements at the end of your lines...

May I disagree? I find missing ; at the end of lines *so* ugly. Ugly or not ugly depends on the observer's eyes. From my programmer's point of view, I prefer to mark the end of lines clearly. In many languages it is safer to do it this way, and I thank the R developers for permitting it (in my opinion, it should even be mandatory). (By the way, marking the end of lines with a unique symbol also makes the job easier for any subsequent processing.) And yes, I'm also a C programmer ;-)

> {and I have another chain of arguments why <- is so more expressive than =

Why <- seems better than = is also quite mysterious to me. There was a discussion about this point recently, I think. I believe in 99% of cases it is more for historical reasons (and perhaps also for some snob reasons). I am not at all an R programmer with 20 years of experience, but I have written several hundred lines of R in the last 6 months, and until today I have not had any problem using = instead of <-. But I'll read your chain of arguments with interest.
Re: [R] merging with aggregating
Hi all,

the moment you hit the 'send' button you know the answer...

I arrived at a solution similar to the one given by Marc. But maybe there is a better one? Especially because this operation is done in a for-loop during which R gets new data from a database, so I eventually sum up 16 data.frames.

Dubro

-----Original Message-----
From: Marc Schwartz [mailto:[EMAIL PROTECTED]
Sent: Tuesday, 6 December 2005 15:11
To: Dubravko Dolic
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] merging with aggregating
Re: [R] R is GNU S, not C.... [was how to get or store .....]
> By the way, does anybody know if there is an "R tidy" or some similar project to automatically reformat (and possibly check) R code, besides what Emacs does?

See the appropriate section in `Writing R Extensions' (3.1 `Tidying R code').

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
[R] Stack overflow error while creating package
Hi all,

I am trying to build a package in R (ver 2.1.0, on a PC). I am able to run package.skeleton successfully and populate the different environments. However, when I attempt to invoke the build (R CMD BUILD), I get an error which says something like

    protect(): Stack Overflow

I would appreciate it if anyone could suggest a way to get around this error message and help me build the package.

thanks in advance,
manohar
Re: [R] merging with aggregating
    m1 <- cbind( n=c(1,2,3,4,6,7,8,9,10,13),
                 v1=c(12,10,3,8,7,12,1,18,1,2),
                 v2=c(0,8,8,4,3,0,0,0,0,0) )
    m2 <- cbind( n=c(1,2,3,4,5,6,8,10,11,12),
                 v1=c(0,0,1,12,2,2,2,4,7,0),
                 v2=c(2,3,9,8,9,9,0,1,1,1) )

    m.all <- merge(m1, m2, by="n", all=T)

        n v1.x v2.x v1.y v2.y
    1   1   12    0    0    2
    2   2   10    8    0    3
    3   3    3    8    1    9
    4   4    8    4   12    8
    5   5   NA   NA    2    9
    6   6    7    3    2    9
    7   7   12    0   NA   NA
    8   8    1    0    2    0
    9   9   18    0   NA   NA
    10 10    1    0    4    1
    11 11   NA   NA    7    1
    12 12   NA   NA    0    1
    13 13    2    0   NA   NA

Then, depending on how many such columns there are, you have a number of ways of aggregating this dataset. One such way is

    cbind( n=m.all[ , "n"],
           v1=rowSums( m.all[ , grep( "^v1", colnames(m.all) ) ], na.rm=T ),
           v2=rowSums( m.all[ , grep( "^v2", colnames(m.all) ) ], na.rm=T ) )

        n v1 v2
    1   1 12  2
    2   2 10 11
    3   3  4 17
    4   4 20 12
    5   5  2  9
    6   6  9 12
    7   7 12  0
    8   8  3  0
    9   9 18  0
    10 10  5  1
    11 11  7  1
    12 12  0  1
    13 13  2  0

Regards, Adai
Re: [R] urgent
1) The R-help mailing list is run entirely by volunteers, so requests such as "urgent" may sound rude.

2) Use an informative subject line please!

3) Please state which package multhist comes from.

4) Please show your call to multhist.

5) multhist does _histograms_ by aggregating points within certain intervals. In your case, you simply want a plot of your raw data. You can use barplot directly via

    multi.barplot <- function( mylist, ... ){
      u  <- unique( unlist( mylist ) )
      tb <- t(sapply( mylist, function(v) table(factor(v, levels=u)) ) )
      barplot( tb, beside=TRUE, ... )
      return(tb)
    }

    x <- c(7, 7, 8, 9, 15, 17, 18)
    y <- c(7, 8, 9, 15, 17, 19, 20, 20, 25, 23, 22)
    z <- c(8, 9, 9, 9, 31)

    multi.barplot( list(x, y, z), col=1:3 )
    legend( "topright", legend=c("one", "two", "three"), fill=1:3 )

Regards, Adai

On Tue, 2005-12-06 at 15:32 +0530, Subhabrata wrote:
> Hello R Users,
>
> I have two sets of values
>
> x <- c(7, 7, 8, 9, 15, 17, 18)
> y <- c(7, 8, 9, 15, 17, 19, 20, 20, 25, 23, 22)
>
> I am able to create a multi histogram using multhist(), but I am not able to control the 'xlim', i.e. the x axis is showing 7.5, 13, 18, 23.
>
> 1st, on what basis is it calculated?
> 2nd, I want it to be like 7 8 9 15 17 and so on.
>
> Can anyone help me?
>
> With Regards
> Subhabrata Pal
Re: [R] R is GNU S, not C.... [was how to get or store .....]
I consistently use ; at the end of every line of my R code and find it much neater than statements without a terminator. As for <- and =, if I were the author I would rather take the first notation as a sign of passing by reference and the latter as passing by value.

Xiaofan Li
DAMTP, University of Cambridge, CB3 0WA, UK
Tel +44 7886 614030, Email [EMAIL PROTECTED]

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: 06 December 2005 14:16
To: rHelp
Subject: Re: [R] R is GNU S, not C [was how to get or store .]
Re: [R] reading in data with variable length
Using a file() connection in conjunction with readLines() and strsplit() should do it. I would try to count the number of lines in the file first, create a list with that many components, and then fill it in.

I believe the array of cells in Matlab is sort of equivalent to a list in R, but that's beyond my knowledge of Matlab...

Andy

From: John McHenry
> I have very large csv files (up to 1GB each of ASCII text). I'd like to be able to read them directly into R. The problem I am having is with the variable length of the data in each record.
>
> Here's a (simplified) example:
>
> $ cat foo.csv
> Name,Start Month,Data
> Foo,10,-0.5615,2.3065,0.1589,-0.3649,1.5955
> Bar,21,0.0880,0.5733,0.0081,2.0253,-0.7602,0.7765,0.2810,1.8546,0.2696,0.3316,0.1565,-0.4847,-0.1325,0.0454,-1.2114
>
> The records consist of rows with some set comma-separated fields (e.g. the "Name" and "Start Month" fields in the above) and then the data follow as a variable-length list of comma-separated values until a new line is encountered.
>
> Now I can use e.g.
>
> fileName="foo.csv"
> ta<-read.csv(fileName, header=F, skip=1, sep=",", dec=".", fill=T)
>
> which does the job nicely:
>
>    V1 V2      V3     V4     V5      V6      V7     V8    V9    V10    V11    V12    V13     V14     V15    V16     V17
> 1 Foo 10 -0.5615 2.3065 0.1589 -0.3649  1.5955     NA    NA     NA     NA     NA     NA      NA      NA     NA      NA
> 2 Bar 21  0.0880 0.5733 0.0081  2.0253 -0.7602 0.7765 0.281 1.8546 0.2696 0.3316 0.1565 -0.4847 -0.1325 0.0454 -1.2114
>
> but the problem is that with files on the order of 1GB this either crunches for ever or runs out of memory trying ... plus having all those NAs isn't too pretty to look at.
>
> (I have a MATLAB version that can read this stuff into an array of cells in about 3 minutes).
>
> I really want a fast way to read the data part into a list; that way I can access data in the array of lists containing the records by doing something like ta[[i]]$data.
>
> Ideas?
>
> Thanks,
>
> Jack.
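A rough sketch of the file() + readLines() + strsplit() idea from Andy's reply. The function name readRecords(), the chunk size, and the field names name/startMonth/data are assumptions made for illustration, and instead of counting lines first it reads the connection in chunks and appends per chunk. It has not been timed on a 1GB file.

```r
# Write a small stand-in for foo.csv to a temporary file
tmp <- tempfile(fileext = ".csv")
writeLines(c("Name,Start Month,Data",
             "Foo,10,-0.5615,2.3065,0.1589,-0.3649,1.5955",
             "Bar,21,0.0880,0.5733,0.0081"), tmp)

# Hypothetical helper: read variable-length records into a list of lists
readRecords <- function(path, chunk = 10000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  readLines(con, n = 1L)                      # skip the header line
  out <- list()
  repeat {
    lines <- readLines(con, n = chunk)        # next block of records
    if (length(lines) == 0L) break
    fields <- strsplit(lines, ",", fixed = TRUE)
    recs <- lapply(fields, function(f)
      list(name       = f[1],
           startMonth = as.integer(f[2]),
           data       = as.numeric(f[-(1:2)])))
    out <- c(out, recs)                       # append once per chunk, not per line
  }
  out
}

ta <- readRecords(tmp)
ta[[2]]$data   # the variable-length data part of the second record
```

Each record keeps only as many values as it actually has, so there is no NA padding as with read.csv(fill = TRUE).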
Re: [R] R is GNU S, not C.... [was how to get or store .....]
======= 2005-12-06 22:16:17 you wrote: =======

> Why <- seems better than = is also quite mysterious to me. There was a discussion about this point recently, I think. I believe in 99% of cases it is more for historical reasons (and perhaps also for some snob reasons). I am not at all an R programmer with 20 years of experience, but I have written several hundred lines of R in the last 6 months, and until today I have not had any problem using = instead of <-.

I think it is NOT just for historical reasons. See the following example:

    > rm(x)
    > mean(x=1:10)
    [1] 5.5
    > x
    Error: object "x" not found
    > mean(x<-1:10)
    [1] 5.5
    > x
     [1]  1  2  3  4  5  6  7  8  9 10

> But I'll read your chain of arguments with interest.

2005-12-06

--
Department of Sociology
Fudan University

My new mail address is [EMAIL PROTECTED]
Blog: http://sociology.yculblog.com
Re: [R] R is GNU S, not C.... [was how to get or store .....]
On Tue, Dec 06, 2005 at 03:16:17PM +0100, [EMAIL PROTECTED] wrote:
> Martin Maechler wrote:
> > please, please, these trailing ; are *so* ugly. This is GNU S, not C (or matlab) !
> > but I'll be happy already if you could drop these ugly empty statements at the end of your lines...
>
> May I disagree ? I find missing ; at end of lines *so* ugly. Ugly/not ugly depends on our observer's eyes. From my programmer point of view, I prefer to mark clearly the end of the lines. In many languages, it's safer to do it this way, and I thank the R developers to permit it.

I agree with this view -- I prefer an explicit statement terminator to a whitespace which terminates if termination is possible, too.

> (in my opinion, it should even be mandatory). (By the way, marking the end of lines with a unique symbol makes also the job easier for the following treatment.) And yes, I'm also a C programmer ;-)
>
> > {and I have another chain of arguments why <- is so more expressive than =
>
> Why <- seems better than = is also quite mysterious for me. There was a discussion about this point recently I think. I believe in 99% of cases it's more for historical reason (and perhaps also for some snob reasons). I am not at all a 20 years experienced R programmer, but I have written several hundreds of R lines those 6 last months, and until today didn't get any problem using = instead of <-.

As far as plain, stand-alone assignment statements are concerned, = and <- are equivalent.

Given the diversity of coding styles that are permitted by R, consistently using one style is, in practice, perhaps more relevant than finding out what the best style is. There is a draft R Coding Convention available at

    http://www.maths.lth.se/help/R/RCC/

which may be useful for finding a style that is good because it is widely used and therefore familiar to a large number of readers.

Best regards, Jan

--
 +- Jan T. Kim -------------------------------------------------------+
 |             email: [EMAIL PROTECTED]                               |
 |             WWW:   http://www.cmp.uea.ac.uk/people/jtk             |
 *-----=<  hierarchical systems are for files, not for humans  >=-----*
Re: [R] reading in data with variable length
I should have mentioned that I already tried the readLines() approach:

    > ta<-readLines("foo.csv")
    > ptm<-proc.time()
    > f<-character(length(ta))
    > for (k in 2:length(ta)) { f[k-1]<-(strsplit(ta[k],",")[[1]])[3] }  # <- PARSING EACH LINE AT THIS LEVEL IS WHERE THE REAL INEFFICIENCY IS
    > (proc.time()-ptm)[3]
    [1] 102.75

on a 62M file, so I'm guessing that on my 1GB files this will be about

    > (102.75*(1000/61))/60
    [1] 28.07377

minutes... which is way, way too long.

I'm new to R but I'm kind of surprised that this problem isn't well known (I couldn't find anything after a long hunt).

As I mentioned, MATLAB does it using textread, which makes a call to its dll dataread. The data are read using something like:

    [name, startMonth, data]=textread(fileName,'%s%n%[^\n]', 'delimiter',',', 'bufsize', 100, 'headerlines',1);

which is kind of fscanf-like. "data" in the above is then a cell array with each cell being the variable-length data.
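A sketch of one way to avoid calling strsplit() once per line in an R-level loop: split off the two fixed fields with vectorised sub() calls, roughly mirroring the '%s%n%[^\n]' pattern in the MATLAB call, and then split the remaining data strings in a single strsplit() call. The in-memory vector stands in for readLines("foo.csv"); whether this is fast enough for 1GB files is untested here.

```r
# Stand-in for readLines("foo.csv")
ta <- c("Name,Start Month,Data",
        "Foo,10,-0.5615,2.3065,0.1589,-0.3649,1.5955",
        "Bar,21,0.0880,0.5733,0.0081")
body <- ta[-1]                                   # drop the header line

# Vectorised extraction of the two fixed fields and the remainder
name       <- sub("^([^,]*),.*$", "\\1", body)
startMonth <- as.integer(sub("^[^,]*,([^,]*),.*$", "\\1", body))
rest       <- sub("^[^,]*,[^,]*,", "", body)     # everything after the 2nd comma

# One strsplit() call over all records, then convert each piece to numeric
data <- lapply(strsplit(rest, ",", fixed = TRUE), as.numeric)

# Assemble the list of records so that ta.list[[i]]$data works as requested
ta.list <- lapply(seq_along(body), function(i)
  list(name = name[i], startMonth = startMonth[i], data = data[[i]]))
ta.list[[1]]$data
```

The names name, startMonth, data and ta.list are illustrative; only the ta[[i]]$data access pattern comes from the original question.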
Re: [R] R is GNU S, not C.... [was how to get or store .....]
[EMAIL PROTECTED] wrote:
> Martin Maechler wrote:
> > {and I have another chain of arguments why <- is so more expressive than =
>
> Why <- seems better than = is also quite mysterious for me. There was a discussion about this point recently I think. I believe in 99% of cases it's more for historical reason (and perhaps also for some snob reasons). I am not at all a 20 years experienced R programmer, but I have written several hundreds of R lines those 6 last months, and until today didn't get any problem using = instead of <-. But I'll read your chain of arguments with interest.

Well, I'll have to disagree a bit. While I don't care so much about trailing ";" (as long as it does not become mandatory), I don't like the use of "=" for assignment, and that's definitely NOT for snob reasons, whatever those are. I just think code is *much* easier to read if assignment is distinguished from argument settings.

Peter Ehlers
Re: [R] urgent
I don't have an answer to your query, but I do have three suggestions:

1. Use a sensible subject line. This may be urgent to you, but I doubt that it is to anyone else.

2. Do indicate what package contains multhist(). I have no idea (nor do I know what a 'multi histogram' is).

3. Don't send HTML mail.

People are very willing to help, but you do have to make it easy to do so.

Peter Ehlers

Subhabrata wrote:
> Hello R Users,
>
> I have two sets of values
>
> x <- c(7, 7, 8, 9, 15, 17, 18)
> y <- c(7, 8, 9, 15, 17, 19, 20, 20, 25, 23, 22)
>
> I am able to create a multi histogram using multhist(), but I am not able to control the 'xlim', i.e. the x axis is showing 7.5, 13, 18, 23.
>
> 1st, on what basis is it calculated?
> 2nd, I want it to be like 7 8 9 15 17 and so on.
>
> Can anyone help me?
>
> With Regards
> Subhabrata Pal
Re: [R] figure with inset
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Uwe Ligges
> Sent: Tuesday, December 06, 2005 1:54 AM
> To: [EMAIL PROTECTED]
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] figure with inset
>
> [EMAIL PROTECTED] wrote:
> > I am trying to plot a figure within a figure (an inset that shows a closeup of part of the data set). I have searched R-help and other sources but not found a solution.
>
> See the examples on the grid package by Paul Murrell in R News.
>
> Uwe Ligges

1. Nice posting -- your little diagram makes your question crystal clear.

2. Murrell's new book, R GRAPHICS, is a comprehensive resource on grid, if you decide you want to do more with it.

3. See also ?par ... new=TRUE for a (less flexible, but perhaps adequate for your needs) way to do this in R's traditional graphics system.

Cheers, Bert

> > What I would like to do is
> > (1) produce a plot
> > (2) specify a window that will be used for the next plot (in inches or using the coordinate system of the plot produced in (1))
> > (3) overlay a new plot in the window specified under (2)
> >
> > The result would be:
> >
> > +----------------------+
> > |                      |
> > |   first plot         |
> > |        +--------+    |
> > |        | inset  |    |
> > |        +--------+    |
> > |                      |
> > +----------------------+
> >
> > Thank you for your help
> >
> > Pascal
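A minimal sketch of the traditional-graphics route mentioned in point 3, using par(fig = ..., new = TRUE). The data, the inset position, and the zoom range are all made up for illustration; pdf(NULL) is used only so that the sketch runs without opening a window.

```r
pdf(NULL)                          # null device; replace with your usual device

set.seed(1)
x <- rnorm(200)
y <- x + rnorm(200, sd = 0.3)

# (1) produce the main plot
plot(x, y, main = "first plot")

# (2) specify the inset window in normalised figure coordinates
#     c(left, right, bottom, top); new = TRUE keeps the first plot on the page
op <- par(fig = c(0.55, 0.95, 0.15, 0.55), new = TRUE, mar = c(2, 2, 1, 1))

# (3) overlay a close-up of part of the data in that window
zoom <- x > 0 & x < 1
plot(x[zoom], y[zoom], xlab = "", ylab = "", cex.axis = 0.7)

par(op)                            # restore the previous graphics parameters
dev.off()
```

Here the window is given as fractions of the figure region rather than in inches or in the coordinates of the first plot; the grid package, as suggested above, gives finer control over both.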
[R] R formatting
While mucking about with semicolons and line endings I wrote this little piece of mildly obfuscated R code:

    f1=function(n){
      x =  1
          ---
           n
      return(x)
    }

[best viewed with a fixed-width font]

f1(1) does indeed return 1/1.

Baz
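A short sketch of why the function above behaves as it does, assuming the layout puts "x = 1", "---" and "n" on separate lines: the parser sees a complete assignment on the first line, and the dashes on the next line become unary minus signs applied to n in a second, separate expression whose value is discarded.

```r
# The three lines of the "fraction", as the parser sees them
exprs <- parse(text = "x = 1\n---\nn")
length(exprs)                     # number of separate expressions found

# The second expression is just n negated three times
second <- eval(exprs[[2]], list(n = 5))

# The full function: x is set to 1 and returned, whatever n is
f1 <- function(n) {
  x = 1
  ---
  n
  return(x)
}
f1(4)
```

So f1() returns 1 for any n, which only coincides with 1/n when n is 1.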
Re: [R] merging with aggregating
On Tue, 2005-12-06 at 15:19 +0100, Dubravko Dolic wrote:
> Hi all,
>
> the moment you hit the 'send' button you know the answer...
>
> I approached a solution similar to this one given by Marc. But maybe there is a better one? Even because this operation is done in a for-loop during which R gets new data from a database. So I sum up 16 data.frames eventually.
>
> Dubro

<SNIP>

OK, so here is one possible approach to a more generic solution:

    # Preallocate a list with 16 elements
    DF.List <- replicate(16, list(numeric(0)))

DF.List looks like:

    > head(DF.List)
    [[1]]
    numeric(0)

    [[2]]
    numeric(0)

    [[3]]
    numeric(0)

    [[4]]
    numeric(0)

    ...

    # Do your loop here, placing the actual results
    # of your queries into DF.List[[i]]. I am just using
    # random samples here for the example.
    # NOTE: I am making the assumption in this example
    # that each resultant DF will have the same structure.
    for (i in 1:16)
    {
      DF.List[[i]] <- data.frame(n = sample(20, 10),
                                 V1 = sample(20, 10),
                                 V2 = sample(0:10, 10))
    }

    # Now rbind() the data frames together
    DF.All <- do.call("rbind", DF.List)

    # Now use aggregate() to get the sums of V1 and V2
    # by 'n'.
    DF.Sums <- aggregate(DF.All[, c("V1", "V2")],
                         list(n = DF.All$n), sum)

    > DF.Sums
        n  V1 V2
    1   1 161 65
    2   2  86 67
    3   3  72 28
    4   4  59 31
    5   5 101 48
    6   6  68 41
    7   7  75 34
    8   8  73 30
    9   9  59 26
    10 10  80 16
    11 11 127 44
    12 12 111 78
    13 13 111 38
    14 14  69 28
    15 15  71 26
    16 16  90 51
    17 17  50 36
    18 18  48 41
    19 19  92 38
    20 20  71 22

Does that get closer to what you need?

HTH,

Marc Schwartz
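As a check, the rbind() + aggregate() approach can be run on the two data frames A and B from the start of this thread instead of random samples. This is only a sketch with the thread's example values typed in by hand; with 16 query results the list would simply have 16 elements.

```r
# The two example data frames from the original question
A <- data.frame(n  = c(1, 2, 3, 4, 6, 7, 8, 9, 10, 13),
                V1 = c(12, 10, 3, 8, 7, 12, 1, 18, 1, 2),
                V2 = c(0, 8, 8, 4, 3, 0, 0, 0, 0, 0))
B <- data.frame(n  = c(1, 2, 3, 4, 5, 6, 8, 10, 11, 12),
                V1 = c(0, 0, 1, 12, 2, 2, 2, 4, 7, 0),
                V2 = c(2, 3, 9, 8, 9, 9, 0, 1, 1, 1))

# Stack all frames, then sum V1 and V2 within each value of n
DF.All <- do.call("rbind", list(A, B))
AB <- aggregate(DF.All[, c("V1", "V2")], list(n = DF.All$n), sum)
AB
```

Rows of n that occur in only one frame are carried through unchanged, so no NA handling is needed, unlike the merge() + rowSums() route.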
Re: [R] R is GNU S, not C.... [was how to get or store .....]
I don't put in extraneous ';' because I maybe get a blister on my little finger.

I suspect that those who find the semi-colons ugly in R do not find them ugly in C. I think the reason there would be a visceral reaction in R but not in C is that there is a danger when using them in R that they really mean something.

We get questions on R-help often enough about why code like:

    if(x > 0) y <- 4
    else y <- 4.5e23

doesn't work. If people habitually used semi-colons, those sorts of questions would probably multiply.

Patrick Burns
[EMAIL PROTECTED]
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")

Xiaofan Li wrote:
> I consistently use ; at every end of my R code and have found it much more neat than those sentences without an end; for <- and =, if I were the author I would rather take the first representation as a sign of passing-by-reference while the latter by value.
>
> Xiaofan Li
> DAMTP, University of Cambridge, CB3 0WA, UK
> Tel +44 7886 614030, Email [EMAIL PROTECTED]
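The point about if/else can be checked without running any of the branches by asking the parser whether a piece of code is syntactically valid. In this sketch the helper name parses() and the condition x > 0 are assumptions made for illustration.

```r
# Hypothetical helper: TRUE if the text is syntactically valid R
parses <- function(code) {
  !inherits(try(parse(text = code), silent = TRUE), "try-error")
}

# else on the same line as the end of the if statement
one.line  <- parses("if (x > 0) y <- 4 else y <- 4.5e23")

# else on a new line at top level: the if statement is already complete
two.lines <- parses("if (x > 0) y <- 4\nelse y <- 4.5e23")

# the same two lines inside braces: the parser keeps reading
in.braces <- parses("{\nif (x > 0) y <- 4\nelse y <- 4.5e23\n}")

# a trailing semicolon ends the if statement even inside braces
semicolon <- parses("{\nif (x > 0) y <- 4;\nelse y <- 4.5e23\n}")

c(one.line = one.line, two.lines = two.lines,
  in.braces = in.braces, semicolon = semicolon)
```

The last case is the one a habitual semicolon user would run into: the ';' is not decoration, it terminates the statement, so the following else has nothing to attach to.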
Re: [R] R is GNU S, not C.... [was how to get or store .....]
Jan T. Kim wrote:
> There is a draft R Coding Convention available at
>
>     http://www.maths.lth.se/help/R/RCC/
>
> which may be useful for finding a style that is good because it is widely used and therefore familiar to a large number of readers.

However, as the author Henrik Bengtsson points out, these guidelines are "ours" and not the R developers'. Perhaps a definitive style guide published by R core that ensured consistency among new base code would be a helpful addition. I personally find the above style guide extremely useful when multiple programmers work on the same project, and would welcome a formal endorsement or revision by the R developers. (And despite Henrik's elegant guide, I too leave off the semicolons at the end of the lines.)

--Robert
Re: [R] Stack overflow error while creating package
>>>>> "manohar" == manohar [EMAIL PROTECTED]
>>>>>     on Tue, 6 Dec 2005 06:39:27 -0800 (PST) writes:

    manohar> Hi all,
    manohar> I am trying to build a package in R (ver 2.1.0, on a
    manohar> PC).

which I interpret to mean that you are running Windows, right?

    manohar> I am able to run package.skeleton successfully
    manohar> and populate the different environments.

    manohar> However, when I attempt to invoke the build (R CMD
    manohar> BUILD), i get an error which says something like

    manohar> protect(): Stack Overflow

The NEWS for the current 'R 2.2.1 beta' (-> http://stat.ethz.ch/R-manual/R-patched/NEWS ) has had a very prominent entry at the beginning (for many weeks now),

    USER-VISIBLE CHANGES

    o   options("expressions") has been reduced to 1000: the limit
        of 5000 introduced in 2.1.0 was liable to give crashes from
        C stack overflow.

(and actually, the crashes seemed to happen particularly often on Windows)

    manohar> I would appreciate if anyone could suggest a way to
    manohar> get around this error message and help me build the
    manohar> package.

You can download the pretty new precompiled R-patched (as of today "R beta") versions for Windows from your nearest CRAN mirror, newest via

    Precompiled -> Windows -> base

and "r-patched snapshot build".

Regards,
Martin Maechler, ETH Zurich
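For reference, the option named in the NEWS entry can be inspected and changed from within R as sketched below. Whether lowering it by hand avoids the poster's build error under 2.1.0 is not established in this thread; the advice above is to move to a newer build.

```r
# Current limit on the depth of nested expressions
getOption("expressions")

# Set the limit to the value mentioned in the NEWS entry;
# options() returns the previous settings so they can be restored
old <- options(expressions = 1000)
current <- getOption("expressions")

# Restore whatever was in effect before
options(old)
```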
Re: [R] plot and factors
Your first question seems relatively simple:

    data.fall <- subset(data, semester == 'fall')
    data.spring <- subset(data, semester == 'spring')
    data.summer <- subset(data, semester == 'summer')
    plot(value1 ~ year, data = data.fall)
    lines(value1 ~ year, data = data.spring)
    lines(value1 ~ year, data = data.summer)

As for the second question, perhaps the paste command is the way to go:

    datayearsem <- paste(data$year, data$semester, sep = '.')

On Fri, 2005-02-12 at 06:40 -0600, Jason Miller wrote:
> Dear R-helpers,
>
> I'm relatively new to R and trying to jump in feet first. I've been able to learn a lot from on-line and printed documentation, but here's one question to which I can't find an answer. Well, it's a question with a couple of parts. Thanks in advance for any direction (partial or complete) that anyone can provide.
>
> I have a data frame with three columns: Year, Semester, value1. I want to treat Year and Semester as factors; there are many years, and there are three semesters (Fall, Spring, Summer).
>
> First, I would like to be able to plot the data in this frame as Year v. value with one curve for each factor. I have been unable to do this. Is there any built-in R functionality that makes this easy, or do I need to build this by hand (e.g., using the techniques in FAQ 5.11 or 5.29)?
>
> Second, I would like to be able to plot the values against a doubly labeled axis that uses Year and Semester (three Semester ticks per Year). Is there a relatively straightforward way to do this? (What's happening, of course, is that I'd like to treat Year+Semester as a single factor for the purpose of marking the axis, but I'm not sure how to do that, either.)
>
> Again, thanks for whatever pointers people can share.
>
> Jason
>
> Jason E. Miller, Ph.D.
> Associate Professor of Mathematics
> Truman State University
> Kirksville, MO
> http://pyrite.truman.edu/~millerj/
> 660.785.7430
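The two requests in this thread (one curve per semester, and Year plus Semester treated as a single factor) can be sketched as follows. This is only an illustration under stated assumptions: the data frame `d` and its values are made up, and the use of `tapply()`, `matplot()` and `interaction()` is one possible approach, not the one proposed by either poster.

```r
# Hypothetical data standing in for the poster's data frame
d <- data.frame(
  Year     = rep(2001:2004, each = 3),
  Semester = factor(rep(c("Spring", "Summer", "Fall"), times = 4),
                    levels = c("Spring", "Summer", "Fall")),
  value1   = c(10, 4, 12, 11, 5, 13, 13, 6, 15, 14, 6, 16)
)

# Reshape to one row per year and one column per semester
wide <- tapply(d$value1, list(d$Year, d$Semester), mean)

# Draw one curve per semester against Year
matplot(as.numeric(rownames(wide)), wide, type = "b", lty = 1,
        pch = 1:3, col = 1:3, xlab = "Year", ylab = "value1")
legend("topleft", legend = colnames(wide), lty = 1, pch = 1:3, col = 1:3)

# Combine Year and Semester into a single factor, ordered by year first
d$YearSem <- interaction(d$Year, d$Semester, lex.order = TRUE)
levels(d$YearSem)[1:3]
```

The combined factor can then supply tick labels, for instance by plotting against `as.integer(d$YearSem)` and passing `levels(d$YearSem)` to `axis()`.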
[R] strange behavior of loess() predict()
Dear all,

I tried local regression with the following data. These data are a part of a bigger dataset for which loess is no problem. However, the plot shows extreme values, and looking into the fits reveals very extreme values (up to 2 !) although the original data are

    summary(cbind(x,y))
           x               y
     Min.   :1.800   Min.   :2.000
     1st Qu.:2.550   1st Qu.:2.750
     Median :2.800   Median :3.000
     Mean   :2.779   Mean   :3.093
     3rd Qu.:3.050   3rd Qu.:3.450
     Max.   :4.000   Max.   :4.000

As you can see below, the difference lies in the line

    predict(mod, data.frame(x=X), se=TRUE)   # strange values
    predict(mod, x=X, se=TRUE)               # plausible values

What is the difference whether predict() is called via data.frame(x=X) or just x=X?

Here are the data + R-code. It can be reproduced.

--- snip ---
    # data
    x <- c(3.4,2.8,2.6,2.2,2.0,2.8,2.6,2.6,2.8,4.0,2.4,2.8,3.0,3.6,3.2,2.8,3.2,2.4,2.2,1.8,2.8,2.0,3.6,2.6,2.8,3.2,3.0,2.6)
    y <- c(3.0,2.6,2.8,2.6,3.0,4.0,3.6,2.4,3.0,4.0,2.4,3.4,3.0,3.2,2.8,3.4,3.4,3.8,3.8,3.6,3.2,2.4,3.8,3.0,3.0,2.0,2.6,2.8)
    par(mfrow=c(2,1))
    # normal plot
    plot(x,y)
    lines(lowess(x,y))
    # loess part
    mod <- loess(y ~ x, span=.5, degree=1)
    X <- seq(min(x), max(x), length=50)
    fit <- predict(mod, data.frame(x=X), se=TRUE)
    zv <- qnorm((1 + .95)/2)
    lower <- fit$fit - zv*fit$se
    upper <- fit$fit + zv*fit$se
    plot(x, y, ylim=range(y, lower, upper))
    lines(X, fit$fit)
    # strange values in fit
    fit
    # here is the difference!!
    predict(mod, data.frame(x=X), se=TRUE)
    predict(mod, x=X, se=TRUE)
--- end of snip ---

I assume there is some reason for this, but I do not understand it.

Merci,
best regards
leo gürtler
[R] R newbie...
Hello, I'm a new user...

I have a function:

    calculate <- function(x,y) { z <- x + y }

I would like to use the result (z) with another function:

    recalculate <- function(...) { a <- z^2 }

But R says that z does not exist... How can I use z in another function?

Thank you for your answer...

--
David
Re: [R] R formatting
Barry Rowlingson [EMAIL PROTECTED] writes:

> While mucking about with semicolons and line endings I wrote this little piece of mildly obfuscated R code:
>
>     f1=function(n){
>       x =  1
>           ---
>            n
>       return(x)
>     }
>
> [best viewed with a proportionally-spaced font]
>
> f1(1) does indeed return 1/1.

It doesn't calculate it though... ;-)

--
   O__       Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])                             FAX: (+45) 35327907
Re: [R] R is GNU S, not C.... [was how to get or store .....]
>>>>> "P" == P Ehlers [EMAIL PROTECTED]
>>>>>     on Tue, 06 Dec 2005 08:35:07 -0700 writes:

    P> [EMAIL PROTECTED] wrote:
    >> Martin Maechler wrote:
    >>> please, please, these trailing ; are *so* ugly. This is GNU S, not C (or matlab)!
    >>> but I'll be happy already if you could drop these ugly empty statements at the end of your lines...
    >>
    >> May I disagree? I find missing ; at end of lines *so* ugly. Ugly/not ugly depends on the observer's eyes. From my programmer's point of view, I prefer to mark clearly the end of the lines. In many languages, it's safer to do it this way, and I thank the R developers for permitting it (in my opinion, it should even be mandatory). (By the way, marking the end of lines with a unique symbol also makes the job easier for subsequent processing.) And yes, I'm also a C programmer ;-)
    >>
    >>> {and I have another chain of arguments why <- is so more expressive than =
    >>
    >> Why <- seems better than = is also quite mysterious to me. There was a discussion about this point recently, I think. I believe in 99% of cases it's more for historical reasons (and perhaps also for some snob reasons). I am not at all a programmer with 20 years of R experience, but I have written several hundred lines of R these last 6 months, and until today didn't get any problem using = instead of <-. But I'll read your chain of arguments with interest.

    P> Well, I'll have to disagree a bit. While I don't care so much about trailing ; (as long as it does not become mandatory), I don't like the use of = for assignment, and that's definitely NOT for snob reasons, whatever those are. I just think code is *much* easier to read if assignment is distinguished from argument settings.

Thank you, Peter. Indeed, this is exactly the main one of my arguments: since = is used quite often in S for argument setting in function calls, *additionally* using <- for assignment is more expressive.

Also, e.g., a2ps (a nice 'ASCII' to PostScript converter) comes {at least on Debian Linux} preconfigured for R, and uses nice typesetting for <-; similarly for ESS. OTOH, it's pretty hard to correctly mark up and differentiate those = which are assignments from those which are function argument settings.

    P> Peter Ehlers

[But really, I'm more concerned and quite a bit disappointed by the diehard ; lovers]

Martin Maechler
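The distinction drawn in this message between assignment and argument setting can be illustrated with a small sketch. The wrapper function `demo` is a made-up name used only to keep the example self-contained; `median()` is just a convenient function whose first argument happens to be called `x`.

```r
demo <- function() {
  # '=' inside a call names the argument x of median(); nothing is assigned
  m1 <- median(x = 1:10)
  has_x <- exists("x", envir = environment(), inherits = FALSE)

  # '<-' inside a call assigns y in the calling frame, then passes its value
  m2 <- median(y <- 1:10)
  has_y <- exists("y", envir = environment(), inherits = FALSE)

  list(m1 = m1, m2 = m2, has_x = has_x, has_y = has_y)
}

res <- demo()
str(res)
```

Both calls compute the same median, but only the second leaves a variable behind, which is the readability point made above: `=` in a call reads as "set this argument", `<-` as "assign".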
Re: [R] Stack overflow error while creating package
I don't think this is C stack overflow. His R is so old the message means `protection stack overflow'.

The first action (as described in the posting guide) is indeed to update R, though.

On Tue, 6 Dec 2005, Martin Maechler wrote:

> >>>>> "manohar" == manohar [EMAIL PROTECTED]
> >>>>>     on Tue, 6 Dec 2005 06:39:27 -0800 (PST) writes:
>
>     manohar> Hi all,
>     manohar> I am trying to build a package in R (ver 2.1.0, on a PC).
>
> which I interpret that you are running Windows, right?
>
>     manohar> I am able to run package.skeleton successfully and populate the different environments.
>     manohar> However, when I attempt to invoke the build (R CMD BUILD), i get an error which says something like
>     manohar>     protect(): Stack Overflow
>
> The NEWS for the current 'R 2.2.1 beta' (-> http://stat.ethz.ch/R-manual/R-patched/NEWS ) has had a very prominent entry at the beginning (for many weeks now),
>
>     USER-VISIBLE CHANGES
>
>      o  options(expressions) has been reduced to 1000: the limit of 5000 introduced in 2.1.0 was liable to give crashes from C stack overflow.
>
> (and actually, the crashes seemed to happen particularly often on Windows)
>
>     manohar> I would appreciate if anyone could suggest a way to get around this error message and help me build the package.
>
> You can download the pretty new precompiled R-patched (as of today "R beta") versions for windows from your nearest CRAN mirror, newest via Precompiled -> Windows -> base and "r-patched snapshot build"
>
> Regards,
> Martin Maechler, ETH Zurich

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
[R] xyplot question
Dear R users,

I have a question regarding the use of xyplot in the lattice package. I have two factors (each with two levels), and I'd like to change the order of the panels in a 2x2 panel layout from the default alphabetic order that R uses based on the names of the factor levels.

My approach is (in principle)

    xyplot(y~x|Factor1+Factor2)

Let's assume my factor levels for Factor1 are A and B, and for Factor2 they're C and D, respectively. Now the default arrangement of my panels would be (from bottom top left to bottom right): BC,CA,BD,AD

What I'd like to have is BD,AC,BC,AD.

Can anyone tell me how to solve this problem easily? I've read that using perm.cond and/or index.cond could solve this problem, but couldn't find an appropriate example, unfortunately...

Thank you very much for your help!

Regards,
Christoph
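As a hedged illustration of the `index.cond` argument mentioned in the question: the sketch below uses made-up data and simply reverses the level order of both conditioning factors. The exact arrangement the poster wants is not recoverable from the message, so the permutations shown are an assumption and would need adjusting.

```r
library(lattice)  # recommended package, normally installed with R

# Hypothetical data: two 2-level factors and a numeric x/y pair
set.seed(1)
d <- expand.grid(Factor1 = c("A", "B"), Factor2 = c("C", "D"), rep = 1:10)
d$x <- rnorm(nrow(d))
d$y <- d$x + rnorm(nrow(d))

# Default: panels follow the order of the factor levels (A, B and C, D)
p_default <- xyplot(y ~ x | Factor1 + Factor2, data = d)

# index.cond takes one permutation per conditioning variable;
# here both level orders are reversed (B, A and D, C)
p_reordered <- xyplot(y ~ x | Factor1 + Factor2, data = d,
                      index.cond = list(c(2, 1), c(2, 1)))

# Alternative: fix the order in the data by setting the levels explicitly
d$Factor1 <- factor(d$Factor1, levels = c("B", "A"))
levels(d$Factor1)
```

Printing `p_reordered` draws the panels. Setting the factor levels in the data has the advantage that every later plot and table uses the same order.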
Re: [R] R newbie...
First of all, you might try reading the manual.

Second, you might try something like this:

    calculate <- function(x,y) {
      z <- x + y
      z
    }

    recalculate <- function(z) {
      a <- z^2
      a
    }

    z <- calculate(x, y)
    recalculate(z)

You need to return some value from your functions, and you need to assign that value to a variable.

Sarah

--
Sarah Goslee
http://www.stringpage.com
Re: [R] R newbie...
Thank you for your answer.

And what if my first function gives 2 results:

    calculate <- function(x,y) {
      a <- x + y
      b <- x - y
    }

How can I use both a and b in a new function?

2005/12/6, Sarah Goslee [EMAIL PROTECTED]:
> First of all, you might try reading the manual.
>
> Second, you might try something like this:
>
>     calculate <- function(x,y) {
>       z <- x + y
>       z
>     }
>
>     recalculate <- function(z) {
>       a <- z^2
>       a
>     }
>
>     z <- calculate(x, y)
>     recalculate(z)
>
> You need to return some value from your functions, and you need to assign that value to a variable.
>
> Sarah
>
> --
> Sarah Goslee
> http://www.stringpage.com

--
David
Re: [R] R newbie...
Here you just define the functions:

    R> calculate <- function(x,y){z <- x + y}
    R> recalculate <- function(z){a <- z^2}

You should then run the functions, taking z as the output of the first function and z as the input for the next function:

    R> calculate <- function(x,y){z <- x + y}
    R> recalculate <- function(z){a <- z^2}
    R> z <- calculate(1,2)
    R> a <- recalculate(z)
    R> z
    [1] 3
    R> a
    [1] 9

Good luck,
Kristel

David Hajage wrote:
> Hello, I'm a new user...
>
> I have a function:
>
>     calculate <- function(x,y) { z <- x + y }
>
> I would like to use the result (z) with another function:
>
>     recalculate <- function(...) { a <- z^2 }
>
> But R says that z does not exist... How can I use z in another function?
>
> Thank you for your answer...
>
> --
> David

--
Kristel Joossens        Ph.D. Student
Research Center ORSTAT  K.U. Leuven
Naamsestraat 69         Tel: +32 16 326929
3000 Leuven, Belgium    Fax: +32 16 326732
E-mail: [EMAIL PROTECTED]
http://www.econ.kuleuven.be/public/ndbae49
Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
Re: [R] strange behavior of loess() predict()
On Tue, 2005-12-06 at 18:09 +0100, Leo Gürtler wrote:
> Dear altogether,
<snip>
> # here is the difference!!
> predict(mod, data.frame(x=X), se=TRUE)
> predict(mod, x=X, se=TRUE)
>
> --- end of snip ---
>
> I assume this has some reason but I do not understand this reason.
> Merci,

Not sure if this is the reason, but there is no argument x in predict.loess, and:

    a <- predict(mod, se = TRUE)

gives you the same results as:

    b <- predict(mod, x=X, se=TRUE)

so the x argument appears to be being passed on/in the ... arguments and ignored? As such, you have no newdata, so mod$x is used.

Now, when you do:

    c <- predict(mod, data.frame(x=X), se=TRUE)

you have used an un-named argument in position 2. R takes this to be what you want to use for newdata and so works with this data rather than the one in mod$x as in the first case:

    # now named second argument - gets ignored as in a and b
    d <- predict(mod, x = data.frame(x=X), se=TRUE)
    all.equal(a, b) # TRUE
    all.equal(a, c) # FALSE
    all.equal(a, d) # TRUE
    # this time we assign X to x by using (), the result is used as newdata
    e <- predict(mod, (x=X), se=TRUE)
    all.equal(c, e) # TRUE

If in doubt, name your arguments and check the help! ?predict.loess would have quickly shown you where the problem lay.

HTH

G

> best regards
>
> leo gürtler

--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson                     [T] +44 (0)20 7679 5522
ENSIS Research Fellow             [F] +44 (0)20 7679 7565
ENSIS Ltd. & ECRC                 [E] gavin.simpsonATNOSPAMucl.ac.uk
UCL Department of Geography       [W] http://www.ucl.ac.uk/~ucfagls/cv/
26 Bedford Way                    [W] http://www.ucl.ac.uk/~ucfagls/
London. WC1H 0AP.
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
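The argument-matching behaviour discussed in this thread can be reproduced in a few lines. This sketch deliberately uses the built-in `cars` data rather than the original poster's vectors (an assumption made to keep the example small and well behaved); the variable names `cars.lo` and `new_speed` are illustrative only.

```r
cars.lo <- loess(dist ~ speed, data = cars)
new_speed <- seq(5, 25, by = 1)

# No newdata: fitted values at the 50 original observations
a <- predict(cars.lo)

# 'x' matches no formal argument of predict.loess, so it is swallowed
# by '...' and ignored; the result is the same as 'a'
b <- predict(cars.lo, x = new_speed)

# Naming the argument newdata gives predictions at the 21 new points
c_new <- predict(cars.lo, newdata = data.frame(speed = new_speed))

length(a)
isTRUE(all.equal(a, b))
length(c_new)
```

Naming `newdata` explicitly, with a data frame whose column name matches the predictor in the formula, avoids the silent mismatch.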
Re: [R] reading in data with variable length
Could you time these and see how each of these do:

    # 1
    ta.split <- strsplit(ta, split = ",")
    ta.num <- lapply(ta.split, function(x) as.numeric(x[-(1:2)]))

    # 2
    ta0 <- sub("^[^,]*,[^,]*,", "", ta)
    ta.num <- lapply(ta0, function(x) scan(text = x, sep = ",", quiet = TRUE))

    # 3 - loop version of #1
    n <- length(ta)
    ta.split <- strsplit(ta, split = ",")
    ta.num <- vector("list", n)
    for(i in 1:n) ta.num[[i]] <- as.numeric(ta.split[[i]][-(1:2)])

    # 4 - loop version of #2
    n <- length(ta)
    ta0 <- sub("^[^,]*,[^,]*,", "", ta)
    ta.num <- vector("list", n)
    for(i in 1:n) ta.num[[i]] <- scan(text = ta0[i], sep = ",", quiet = TRUE)

On 12/6/05, John McHenry [EMAIL PROTECTED] wrote:
> I should have mentioned that I already tried the readLines() approach:
>
>     ta <- readLines("foo.csv")
>     ptm <- proc.time()
>     f <- character(length(ta))
>     for (k in 2:length(ta)) { f[k-1] <- (strsplit(ta[k], ",")[[1]])[3] }  # <- PARSING EACH LINE AT THIS LEVEL IS WHERE THE REAL INEFFICIENCY IS
>     (proc.time() - ptm)[3]
>     [1] 102.75
>
> on a 62M file, so I'm guessing that on my 1GB files this will be about
>
>     (102.75*(1000/61))/60
>     [1] 28.07377
>
> minutes...which is way, way too long.
>
> I'm new to R but I'm kind of surprised that this problem isn't well known (couldn't find anything after a long hunt).
>
> As I mentioned, MATLAB does it using textread, which makes a call to its dll dataread. The data are read using something like:
>
>     [name, startMonth, data]=textread(fileName,'%s%n%[^\n]', 'delimiter',',', 'bufsize', 100, 'headerlines',1);
>
> which is kind of fscanf-like. "data" in the above is then a cell array with each cell being the variable-length data.
>
> Liaw, Andy [EMAIL PROTECTED] wrote:
>> Using a file() connection in conjunction with readLines() and strsplit() should do it. I would try to count the number of lines in the file first, and create a list with that many components, then fill it in.
>>
>> I believe the array of cells in Matlab is sort of equivalent to a list in R, but that's beyond my knowledge of Matlab...
>>
>> Andy
>>
>> From: John McHenry
>>> I have very large csv files (up to 1GB each of ASCII text). I'd like to be able to read them directly in to R. The problem I am having is with the variable length of the data in each record.
>>>
>>> Here's a (simplified) example:
>>>
>>>     $ cat foo.csv
>>>     Name,Start Month,Data
>>>     Foo,10,-0.5615,2.3065,0.1589,-0.3649,1.5955
>>>     Bar,21,0.0880,0.5733,0.0081,2.0253,-0.7602,0.7765,0.2810,1.8546,0.2696,0.3316,0.1565,-0.4847,-0.1325,0.0454,-1.2114
>>>
>>> The records consist of rows with some set comma-separated fields (e.g. the Name and Start Month fields in the above) and then the data follow as a variable-length list of comma-separated values until a new line is encountered.
>>>
>>> Now I can use e.g.
>>>
>>>     fileName="foo.csv"
>>>     ta<-read.csv(fileName, header=F, skip=1, sep=",", dec=".", fill=T)
>>>
>>> which does the job nicely:
>>>
>>>        V1 V2      V3     V4     V5      V6      V7     V8    V9    V10    V11    V12    V13     V14     V15    V16     V17
>>>     1 Foo 10 -0.5615 2.3065 0.1589 -0.3649  1.5955     NA    NA     NA     NA     NA     NA      NA      NA     NA      NA
>>>     2 Bar 21  0.0880 0.5733 0.0081  2.0253 -0.7602 0.7765 0.281 1.8546 0.2696 0.3316 0.1565 -0.4847 -0.1325 0.0454 -1.2114
>>>
>>> but the problem is with files on the order of 1GB this either crunches for ever or runs out of memory trying ... plus having all those NAs isn't too pretty to look at.
>>>
>>> (I have a MATLAB version that can read this stuff into an array of cells in about 3 minutes).
>>>
>>> I really want a fast way to read the data part into a list; that way I can access data in the array of lists containing the records by doing something like ta[[i]]$data.
>>>
>>> Ideas?
>>>
>>> Thanks,
>>>
>>> Jack.
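The readLines() plus strsplit() approach discussed in this thread can be put together as a small self-contained sketch. The temporary file, the shortened "Bar" record, and the field names `name`, `startMonth` and `data` are assumptions chosen to mirror the poster's description; no claim is made about speed on 1GB files.

```r
# Write a small variable-length file to a temporary location
tmp <- tempfile(fileext = ".csv")
writeLines(c("Name,Start Month,Data",
             "Foo,10,-0.5615,2.3065,0.1589,-0.3649,1.5955",
             "Bar,21,0.0880,0.5733,0.0081"), tmp)

ta <- readLines(tmp)[-1]                    # drop the header line
fields <- strsplit(ta, ",", fixed = TRUE)   # split all lines in one call

# One list element per record; the data part keeps its own length
records <- lapply(fields, function(f) {
  list(name       = f[1],
       startMonth = as.numeric(f[2]),
       data       = as.numeric(f[-(1:2)]))
})
names(records) <- vapply(records, function(r) r$name, character(1))
unlink(tmp)

records$Bar$data
```

Splitting all lines in a single vectorised `strsplit()` call, rather than inside a loop, is the main difference from the timed loop quoted above; `fixed = TRUE` avoids regular-expression matching for the separator.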
Re: [R] R newbie...
    calculate <- function(x,y) {
      a <- x + y
      b <- x - y
      list(a=a, b=b)
    }

    myresult <- calculate(x, y)
    myresult$a
    myresult$b

Please at least read the Introduction to R at http://www.r-project.org/
It covers all of this very basic material.

Sarah

--
Sarah Goslee
http://www.stringpage.com
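A minimal sketch of how a second function might consume both returned values, following the list-based approach above. The element names `sum` and `diff` and the body of `recalculate` are hypothetical, chosen only for illustration.

```r
# Return several results at once by bundling them in a named list
calculate <- function(x, y) {
  list(sum = x + y, diff = x - y)
}

# The second function receives the whole list and picks out what it needs
recalculate <- function(res) {
  res$sum^2 + res$diff^2
}

out <- calculate(5, 3)
recalculate(out)
```

Passing the list as a single argument keeps `recalculate` independent of any variables that happen to exist in the workspace.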
Re: [R] R newbie...
David Hajage wrote:
> Hello, I'm a new user...
>
> I have a function:
>
>     calculate <- function(x,y) {
>       z <- x + y

      # insert:
      z

>     }
>
> I would like to use the result (z) with another function:
>
>     recalculate <- function(...) {

      # change the argument list from (...) to (z), so the value is passed in

>       a <- z^2

      # insert:
      a

>     }

Type:

    recalculate(calculate(3,4))

Please read "An Introduction to R" as well as the posting guide!

Uwe Ligges

> But R says that z does not exist... How can I use z in another function?
>
> Thank you for your answer...
>
> --
> David
Re: [R] R formatting
Peter Dalgaard wrote:
> It doesn't calculate it though... ;-)

My previous example is a bit ugly - this one looks nicer:

    f1=function(n){
           -1
      x =  ---
            n
      return(x)
    }

And it returns f1(1) as -1/1 and f1(-1) as -1/-1 as well.
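Why these "fractions" behave as they do comes down to how R splits statements at line ends. The sketch below assumes the multi-line layout of the two functions (the archive flattened the original line breaks) and renames the second one `f2`, a hypothetical name, so both can be compared side by side.

```r
# First version: 'x = 1' is a complete statement; the next two lines
# parse as the separate expression ---n, whose value is discarded
f1 <- function(n) {
  x = 1
      ---
      n
  return(x)
}

# Second version: '-1' is a complete statement on its own;
# 'x = ---' is incomplete, so it continues onto the next line: x = ---n
f2 <- function(n) {
      -1
  x = ---
      n
  return(x)
}

c(f1(1), f1(5), f2(1), f2(-1), f2(2))
```

Since `---n` is three unary minuses, `f2(n)` is simply `-n`: it agrees with -1/n only for n equal to 1 or -1, which is exactly the pair of cases quoted in the message.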
Re: [R] R is GNU S, not C.... [was how to get or store .....]
From: Martin Maechler
> >>>>> "P" == P Ehlers [EMAIL PROTECTED]
> >>>>>     on Tue, 06 Dec 2005 08:35:07 -0700 writes:
>
>     P> [EMAIL PROTECTED] wrote:
>     >> Martin Maechler wrote:
>     >>> please, please, these trailing ; are *so* ugly. This is GNU S, not C (or matlab)!
>     >>> but I'll be happy already if you could drop these ugly empty statements at the end of your lines...
>     >>
>     >> May I disagree? I find missing ; at end of lines *so* ugly. Ugly/not ugly depends on the observer's eyes. From my programmer's point of view, I prefer to mark clearly the end of the lines. In many languages, it's safer to do it this way, and I thank the R developers for permitting it (in my opinion, it should even be mandatory). (By the way, marking the end of lines with a unique symbol also makes the job easier for subsequent processing.) And yes, I'm also a C programmer ;-)
>     >>
>     >>> {and I have another chain of arguments why <- is so more expressive than =
>     >>
>     >> Why <- seems better than = is also quite mysterious to me. There was a discussion about this point recently, I think. I believe in 99% of cases it's more for historical reasons (and perhaps also for some snob reasons). I am not at all a programmer with 20 years of R experience, but I have written several hundred lines of R these last 6 months, and until today didn't get any problem using = instead of <-. But I'll read your chain of arguments with interest.
>
>     P> Well, I'll have to disagree a bit. While I don't care so much about trailing ; (as long as it does not become mandatory), I don't like the use of = for assignment, and that's definitely NOT for snob reasons, whatever those are. I just think code is *much* easier to read if assignment is distinguished from argument settings.
>
> Thank you, Peter. Indeed, this is exactly the main one of my arguments: since = is used quite often in S for argument setting in function calls, *additionally* using <- for assignment is more expressive.
>
> Also, e.g., a2ps (a nice 'ASCII' to PostScript converter) comes {at least on Debian Linux} preconfigured for R, and uses nice typesetting for <-; similarly for ESS. OTOH, it's pretty hard to correctly mark up and differentiate those = which are assignments from those which are function argument settings.
>
>     P> Peter Ehlers
>
> [But really, I'm more concerned and quite a bit disappointed by the diehard ; lovers]
>
> Martin Maechler

Matlab also allows both with and without ;, but I guess most people learn quickly what the preferred way is: without ;, Matlab prints the output of commands, including assignments; e.g., if you assign a 1e5-row matrix to something and don't terminate the line with ;, Matlab will print that matrix to the console.

Personally, having the extraneous ; doesn't bother me nearly as much as not indenting the code properly or not leaving spaces around operators. I don't use them, because I seldom have difficulty knowing when a statement is supposed to end (given the code is properly indented). Those who use Python would know quite well, too, I guess. For those who insist on having ;, I guess they will never get the point of something like Python (or even Fortran...).

Andy
Re: [R] reading in data with variable length
On 06-Dec-05 John McHenry wrote:
> I have very large csv files (up to 1GB each of ASCII text). I'd like to be able to read them directly in to R. The problem I am having is with the variable length of the data in each record.
>
> Here's a (simplified) example:
>
>     $ cat foo.csv
>     Name,Start Month,Data
>     Foo,10,-0.5615,2.3065,0.1589,-0.3649,1.5955
>     Bar,21,0.0880,0.5733,0.0081,2.0253,-0.7602,0.7765,0.2810,1.8546,0.2696,0.3316,0.1565,-0.4847,-0.1325,0.0454,-1.2114
>
> The records consist of rows with some set comma-separated fields (e.g. the Name and Start Month fields in the above) and then the data follow as a variable-length list of comma-separated values until a new line is encountered.

While you may well get a good R solution from the experts, in such a situation (as in so many) I would be tempted to pre-process the file with 'awk' (installed by default on Unix/Linux systems, available also for Windows).

The following will give you a CSV file with a constant number of fields per line. While this does not eliminate the NAs which you apparently find unsightly, it should be a fast and clean way of doing the basic job, since it is a line-by-line operation in two passes, so there should be no question of choking the system (unless you run out of HD space as a result of creating the second file).

Two passes, on the lines of

Pass 1:

    cat foo.csv | awk '
      BEGIN{FS=","; n=0}
      {m=NF; if(m>n){n=m}}
      END{print n}
    '

which gives you the maximum number of fields in any line. Suppose (for example) that this number is 37. Then

Pass 2:

    cat foo.csv | awk -v maxF=37 '
      BEGIN{FS=","; OFS=","}
      {if(NF<maxF){$maxF=""}}
      {print $0}
    ' > newfoo.csv

Tiny example:

1) See foo.csv

    cat foo.csv
    1
    1,2
    1,2,3
    1,2,3,4
    1,2

2) Pass 1:

    cat foo.csv | awk '
      BEGIN{FS=","; n=0}
      {m=NF; if(m>n){n=m}}
      END{print n}
    '
    4

3) So we need 4 fields per line. With maxF=4, Pass 2:

    cat foo.csv | awk -v maxF=4 '
      BEGIN{FS=","; OFS=","}
      {if(NF<maxF){$maxF=""}}
      {print $0}
    ' > newfoo.csv

4) See newfoo.csv

    cat newfoo.csv
    1,,,
    1,2,,
    1,2,3,
    1,2,3,4
    1,2,,

So you now have a CSV file with a constant number of fields per line. This doesn't make it into lists, though.

Hoping this helps,
Ted.

> Now I can use e.g.
>
>     fileName="foo.csv"
>     ta<-read.csv(fileName, header=F, skip=1, sep=",", dec=".", fill=T)
>
> which does the job nicely:
>
>        V1 V2      V3     V4     V5      V6      V7     V8    V9    V10    V11    V12    V13     V14     V15    V16     V17
>     1 Foo 10 -0.5615 2.3065 0.1589 -0.3649  1.5955     NA    NA     NA     NA     NA     NA      NA      NA     NA      NA
>     2 Bar 21  0.0880 0.5733 0.0081  2.0253 -0.7602 0.7765 0.281 1.8546 0.2696 0.3316 0.1565 -0.4847 -0.1325 0.0454 -1.2114
>
> but the problem is with files on the order of 1GB this either crunches for ever or runs out of memory trying ... plus having all those NAs isn't too pretty to look at.
>
> (I have a MATLAB version that can read this stuff into an array of cells in about 3 minutes).
>
> I really want a fast way to read the data part into a list; that way I can access data in the array of lists containing the records by doing something like ta[[i]]$data.
>
> Ideas?
>
> Thanks,
>
> Jack.

E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 06-Dec-05   Time: 18:08:54
-- XFMail --
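For comparison, the same two-step idea (find the maximum field count, then pad every row to that width) can be sketched inside R with `count.fields()` and the `col.names` argument of `read.csv()`. This is an illustrative alternative, not part of the awk recipe above; the temporary file reproduces the tiny example, and the column names `V1`..`V4` are arbitrary.

```r
# Recreate the tiny example file
tmp <- tempfile(fileext = ".csv")
writeLines(c("1", "1,2", "1,2,3", "1,2,3,4", "1,2"), tmp)

# Step 1: number of fields on each line, and the maximum
n_fields <- count.fields(tmp, sep = ",")
max_f <- max(n_fields)

# Step 2: supplying col.names up front makes read.csv pad short rows with NA
padded <- read.csv(tmp, header = FALSE, fill = TRUE,
                   col.names = paste0("V", seq_len(max_f)))
unlink(tmp)

padded
```

Like the awk version, this yields a rectangular table with NA padding, so it does not address the memory concern raised for 1GB files.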
Re: [R] R is GNU S, not C.... [was how to get or store .....]
I consistently use ; at the end of every line of my R code and find it much neater than statements without a terminator. As for <- and =, if I were the author of the language I would rather take the first as a sign of passing by reference and the latter of passing by value.

Xiaofan

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of McGehee, Robert
Sent: 06 December 2005 16:32
To: Jan T. Kim; rHelp
Subject: Re: [R] R is GNU S, not C [was how to get or store .]

Jan T. Kim wrote:
> There is a draft R Coding Convention available at
> http://www.maths.lth.se/help/R/RCC/
> which may be useful for finding a style that is good because it is widely used and therefore familiar to a large number of readers.

However, as the author Henrik Bengtsson points out, these guidelines are the authors' own and not the R developers'. Perhaps a definitive style guide published by R core that ensured consistency among new base code would be a helpful addition.

I personally find the above style guide extremely useful when multiple programmers work on the same project, and would welcome a formal endorsement or revision by the R developers. (And despite Henrik's elegant guide, I too leave off the semicolons at the end of the lines.)

--Robert
Re: [R] reading in data with variable length
Thanks for the awk scripts, Ted.

There are reasons (read: political!) why R needs to be able to read the files in directly. But, sure, I agree, why not just awk the durned thing.

Just to be clear: the problem with the NAs is not so much that they are unsightly as that they take up too much RAM. With 1GB files it's easy to rapidly run out of space.

[EMAIL PROTECTED] wrote:
> On 06-Dec-05 John McHenry wrote:
>> I have very large csv files (up to 1GB each of ASCII text). I'd like to be able to read them directly in to R. The problem I am having is with the variable length of the data in each record.
>>
>> Here's a (simplified) example:
>>
>>     $ cat foo.csv
>>     Name,Start Month,Data
>>     Foo,10,-0.5615,2.3065,0.1589,-0.3649,1.5955
>>     Bar,21,0.0880,0.5733,0.0081,2.0253,-0.7602,0.7765,0.2810,1.8546,0.2696,0.3316,0.1565,-0.4847,-0.1325,0.0454,-1.2114
>>
>> The records consist of rows with some set comma-separated fields (e.g. the Name and Start Month fields in the above) and then the data follow as a variable-length list of comma-separated values until a new line is encountered.
>
> While you may well get a good R solution from the experts, in such a situation (as in so many) I would be tempted to pre-process the file with 'awk' (installed by default on Unix/Linux systems, available also for Windows).
>
> The following will give you a CSV file with a constant number of fields per line. While this does not eliminate the NAs which you apparently find unsightly, it should be a fast and clean way of doing the basic job, since it is a line-by-line operation in two passes, so there should be no question of choking the system (unless you run out of HD space as a result of creating the second file).
>
> Two passes, on the lines of
>
> Pass 1:
>
>     cat foo.csv | awk '
>       BEGIN{FS=","; n=0}
>       {m=NF; if(m>n){n=m}}
>       END{print n}
>     '
>
> which gives you the maximum number of fields in any line. Suppose (for example) that this number is 37. Then
>
> Pass 2:
>
>     cat foo.csv | awk -v maxF=37 '
>       BEGIN{FS=","; OFS=","}
>       {if(NF<maxF){$maxF=""}}
>       {print $0}
>     ' > newfoo.csv
>
> Tiny example:
>
> 1) See foo.csv
>
>     cat foo.csv
>     1
>     1,2
>     1,2,3
>     1,2,3,4
>     1,2
>
> 2) Pass 1:
>
>     cat foo.csv | awk '
>       BEGIN{FS=","; n=0}
>       {m=NF; if(m>n){n=m}}
>       END{print n}
>     '
>     4
>
> 3) So we need 4 fields per line. With maxF=4, Pass 2:
>
>     cat foo.csv | awk -v maxF=4 '
>       BEGIN{FS=","; OFS=","}
>       {if(NF<maxF){$maxF=""}}
>       {print $0}
>     ' > newfoo.csv
>
> 4) See newfoo.csv
>
>     cat newfoo.csv
>     1,,,
>     1,2,,
>     1,2,3,
>     1,2,3,4
>     1,2,,
>
> So you now have a CSV file with a constant number of fields per line. This doesn't make it into lists, though.
>
> Hoping this helps,
> Ted.
>
>> Now I can use e.g.
>>
>>     fileName="foo.csv"
>>     ta<-read.csv(fileName, header=F, skip=1, sep=",", dec=".", fill=T)
>>
>> which does the job nicely:
>>
>>        V1 V2      V3     V4     V5      V6      V7     V8    V9    V10    V11    V12    V13     V14     V15    V16     V17
>>     1 Foo 10 -0.5615 2.3065 0.1589 -0.3649  1.5955     NA    NA     NA     NA     NA     NA      NA      NA     NA      NA
>>     2 Bar 21  0.0880 0.5733 0.0081  2.0253 -0.7602 0.7765 0.281 1.8546 0.2696 0.3316 0.1565 -0.4847 -0.1325 0.0454 -1.2114
>>
>> but the problem is with files on the order of 1GB this either crunches for ever or runs out of memory trying ... plus having all those NAs isn't too pretty to look at.
>>
>> (I have a MATLAB version that can read this stuff into an array of cells in about 3 minutes).
>>
>> I really want a fast way to read the data part into a list; that way I can access data in the array of lists containing the records by doing something like ta[[i]]$data.
>>
>> Ideas?
>>
>> Thanks,
>>
>> Jack.
>
> E-Mail: (Ted Harding)
> Fax-to-email: +44 (0)870 094 0861
> Date: 06-Dec-05   Time: 18:08:54
> -- XFMail --
[R] how to keep the dropped term at each step when calling step?
Hi all:

I am using the R function step to perform model selection in the backward direction. I'd like to automatically keep the dropped term at each step, so I wrote a filter function for the keep argument. However, the filter function cannot change the value of an external variable and so doesn't work well. Can anybody help? Thank you in advance!

Regards,
Riyan Cheng

P.S. R code:

    ###
    example(lm)
    lm1 <- lm(Fertility ~ ., data = swiss)
    tm <- attr(lm1$terms, "term.labels")
    kp <- function(obj, aic) {
      x <- attr(obj$terms, "term.labels")
      y <- setdiff(tm, x)
      if (length(y) == 0) y = NULL
      tm <- x
      AIC <- aic
      list(n = length(x), dropped = y, AIC = aic)
    }
    g <- step(lm1, keep = kp, k = 10)
    g$keep
    ##
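One possible way around the problem described above, offered as a sketch rather than as the list's answer: inside the keep function, `tm <- x` only creates a local copy, whereas the superassignment operator `<<-` updates the variable in the enclosing environment. The names `prev_terms`, `cur` and `gone` are made up for this illustration. The last line relies on the `anova` component that `step()` returns, which already lists the term removed at each step.

```r
# Full model on the built-in swiss data, as in the post.
lm1 <- lm(Fertility ~ ., data = swiss)

# Term labels of the previously visited model; updated by kp() below.
prev_terms <- attr(terms(lm1), "term.labels")

kp <- function(obj, aic) {
  cur <- attr(terms(obj), "term.labels")
  gone <- setdiff(prev_terms, cur)
  # `<<-` assigns to prev_terms in the enclosing environment, so the next
  # call compares against the model from this step.
  prev_terms <<- cur
  list(n = length(cur),
       dropped = if (length(gone) > 0) paste(gone, collapse = ",") else NA_character_,
       AIC = aic)
}

g <- step(lm1, direction = "backward", keep = kp, k = 10, trace = 0)
g$keep        # one column per model visited
g$anova$Step  # step() also records the dropped terms itself
```

If only the names of the dropped terms are needed, reading `g$anova$Step` may be enough and avoids the side effect altogether.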
Re: [R] R newbie...
On 06-Dec-05 David Hajage wrote:
> Hello, I'm a new user... I have a function:
>
>     calculate <- function(x,y) { z <- x + y }
>
> I would like to use the result (z) with another function:
>
>     recalculate <- function(...) { a <- z^2 }
>
> But R says that z does not exist... How can I use z in another function? Thank you for your answer...

With 'calculate' as written, z is internal to 'calculate' and is not visible from outside (and the internal assignment to z will not affect the value of a variable also called z outside the function).

The simplest way to extract the calculated value is to return it from the function and assign it to z outside the function:

    calculate <- function(x,y) { return(x + y) }
    z <- calculate(x,y)

and then say

    a <- recalculate(z)

where, again, you need to get a out of the function, so

    recalculate <- function(z) { return(z^2) }

While it is possible to change the values of external variables from within functions, this is not a recommended way to proceed, since it depends on the named variable inside the function meaning the same as the variable with the same name outside the function. Since the purpose of defining functions is to have something which is re-usable in different contexts, it is generally desirable to make function definitions independent of the environment from which they may be called.

Hoping this helps,
Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 Date: 06-Dec-05 Time: 18:25:53 -- XFMail --
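The advice above, put together as a small runnable sketch. The input values 2 and 3 are arbitrary, and the only assumption beyond the thread is that `recalculate` takes its input as an explicit argument.

```r
# Each function returns its result; nothing is shared through global names.
calculate <- function(x, y) {
  x + y          # the value of the last expression is returned
}

recalculate <- function(z) {
  z^2
}

z <- calculate(2, 3)   # z now exists in the calling environment
a <- recalculate(z)
a
```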
[R] Coefficient of association for 2x2 contingency tables
Hi, Found no measure of association or correlation for 2x2 contingency tables in fullrefman.pdf or google. Can someone point to a package that implements such calculations? Thanx. -- Alexandre Santos Aguiar - consultoria para pesquisa em saúde - R Botucatu, 591 cj 81 tel 11-9320-2046 fax 11-5549-8760 www.spsconsultoria.com
Re: [R] R newbie...
Return something that can hold more than one value, e.g.:

    calculate <- function(x, y) {
      list(a=x+y, b=x-y)
    }

David Hajage wrote:
> Thank you for your answer. And what if my first function gives 2 results:
>
>     calculate <- function(x,y) {
>       a <- x + y
>       b <- x - y
>     }
>
> How can I use both a and b in a new function?
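A short usage sketch for the list-returning version above. The second function, `combine`, is hypothetical and only shows how the two components can be picked out by name.

```r
calculate <- function(x, y) {
  list(a = x + y, b = x - y)   # both results travel together in one list
}

# Hypothetical consumer: uses both components of the returned list.
combine <- function(res) {
  res$a * res$b
}

res <- calculate(5, 3)
res$a
res$b
out <- combine(res)
out
```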
Re: [R] R is GNU S, not C.... [was how to get or store .....]
ronggui wrote:
> I think it is NOT just for historical reasons. See the following example:
>
>     > rm(x)
>     > mean(x=1:10)
>     [1] 5.5
>     > x
>     Error: object x not found

x is an argument local to mean(); did you expect another answer?

>     > mean(x<-1:10)
>     [1] 5.5
>     > x
>      [1]  1  2  3  4  5  6  7  8  9 10

What is the goal of this example? Here, with <-, the global variable x is also created (as a side effect, intended or not). Did the writer really want that??? I thought there were other specific statements especially intended for global assignment, e.g. <<-.

If this example was intended to prove <- is better than = ... I'm not really convinced!
Re: [R] R is GNU S, not C.... [was how to get or store .....]
On 06-Dec-05 Martin Maechler wrote:
> [But really, I'm more concerned and quite a bit disappointed by the diehard ; lovers]
> Martin Maechler

Well, while not die-hard, I will put in my own little reason for often using ; at the end of lines which don't need them. Basically, this is done to protect me from myself (so in fact is quite a strong reason).

I tend to develop extended R code in a side-window, using a text editor (vim) in that window, and cut-and-pasting the chunks of R code from that window into the R window. This usually means that I have a lot of short lines, since it is easier when developing code to work with the commands one per line, as they are easier to find and less likely to be corrected erroneously. Finally, when I am content that the code does the job, I then put several short lines into one longer one.

For example (a function to do with sampling with probability proportional to weights); first, as written line-by-line:

    myfunction <- function(X,n1,n2,n3,WTS){
      N1<-n1;
      N2<-n1+n2;
      N3<-n1+n2+n3;
    # first selection
      pii<-WTS/sum(WTS);
      alpha<-N2;
      Pi<-alpha*pii;
      r<-runif(N3);
      ix<-sort(which(r<=Pi));
    # second selection
      ix0<-(1:N3);
      ix3<-ix0[-ix];
      ix20<-ix0[ix];
      W<-WTS[ix];
      pii<-W/sum(W);
      Pi<-N1*pii;
      r<-runif(length(Pi));
      ix10<-sort(which(r<=Pi));
      ix1<-ix20[ix10];
      ix2<-ix20[-ix10];
    # return the results
      list(X1=X[ix1],X2=X[ix2],X3=X[ix3],ix1=ix1,ix2=ix2,ix3=ix3)
    }

Having got that function right, with 'vim' in command mode successive lines are readily brought up to the current line by simply pressing J, which is very fast. This, in the above case, then results in

    MARselect<-function(X,n1,n2,n3,WTS){
      N1<-n1; N2<-n1+n2; N3<-n1+n2+n3;
    # first selection
      pii<-WTS/sum(WTS); alpha<-N2; Pi<-alpha*pii;
      r<-runif(N3); ix<-sort(which(r<=Pi));
    # second selection
      ix0<-(1:N3); ix3<-ix0[-ix]; ix20<-ix0[ix];
      W<-WTS[ix]; pii<-W/sum(W); Pi<-N1*pii;
      r<-runif(length(Pi)); ix10<-sort(which(r<=Pi));
      ix1<-ix20[ix10]; ix2<-ix20[-ix10];
    # return the results
      list(X1=X[ix1],X2=X[ix2],X3=X[ix3],ix1=ix1,ix2=ix2,ix3=ix3)
    }

The greater readability of the first relative to the second is obvious. The compactness of the second relative to the first is evident. Obtaining the second from the first by repeated J is very quick.

BUT -- if I had not put the ; at the ends of the lines in the strung-out version (which is easy to do as you type in the line in the first place), then it would be much more trouble to get the second version, and very easy to get it wrong!

Also, being long used to programming in C and octave/matlab, putting ; at the end of a command is an easy reflex, and of course does no harm at all to an R command.

Not that I'm trying to encourage others to do the same as I do -- as I said, it's a self-protective habit -- but equally, if people (e.g. me) may find it useful, I don't think it should be discouraged either -- especially on aesthetic grounds!

Just my little bit ...

Best wishes,
Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 Date: 06-Dec-05 Time: 19:02:23 -- XFMail --
Re: [R] R is GNU S, not C.... [was how to get or store .....]
Patrick Burns wrote:
> We get questions on R-help often enough about why code like:
>
>     if(x < 0) y <- 4
>     else y <- 4.5e23
>
> doesn't work. If people habitually used semi-colons, those sorts of questions would probably multiply.

I wrote "end of line" in my first message, but in fact I meant "end of statement". By the way, there will always be more ways to get things wrong than to get them right ... with or without semi-colons ;-)
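Why that kind of code fails at top level can be checked without typing it at the prompt. The sketch below is an illustration added here, not code from the thread: it hands two versions of an if/else to `parse()`, which reads text the way the top-level parser does. The variable names and the test value `x <- -1` are arbitrary.

```r
# At top level the statement `if (x < 0) y <- 4` is complete at the end of
# its line, so an `else` starting the next line is a syntax error.
bad  <- "if (x < 0) y <- 4\nelse y <- 4.5e23"

# Keeping `else` on the same line as the closing brace avoids the problem.
good <- "if (x < 0) {\n  y <- 4\n} else {\n  y <- 4.5e23\n}"

bad_result <- tryCatch(parse(text = bad), error = function(e) "syntax error")
good_expr  <- parse(text = good)

x <- -1
eval(good_expr)   # runs the well-formed version; assigns y
y
```

Inside a function body or any other pair of braces the first form parses fine, which is part of why the behaviour surprises people.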
Re: [R] merging with aggregating
Here's a solution that uses aggregate(), as suggested in the subject of this thread.

    m1 <- cbind( n=c(1,2,3,4,6,7,8,9,10,13),
                 v1=c(12,10,3,8,7,12,1,18,1,2),
                 v2=c(0,8,8,4,3,0,0,0,0,0) )
    m2 <- cbind( n=c(1,2,3,4,5,6,8,10,11,12),
                 v1=c(0,0,1,12,2,2,2,4,7,0),
                 v2=c(2,3,9,8,9,9,0,1,1,1) )
    tt <- as.data.frame(rbind(m1,m2))
    aggregate(list(v1=tt$v1,v2=tt$v2),by=list(n=tt$n),sum)

        n v1 v2
    1   1 12  2
    2   2 10 11
    3   3  4 17
    4   4 20 12
    5   5  2  9
    6   6  9 12
    7   7 12  0
    8   8  3  0
    9   9 18  0
    10 10  5  1
    11 11  7  1
    12 12  0  1
    13 13  2  0

Cheers, Pierre

Adaikalavan Ramasamy offered the following remark on 12/06/05 04:40...
>     m1 <- cbind( n=c(1,2,3,4,6,7,8,9,10,13),
>                  v1=c(12,10,3,8,7,12,1,18,1,2),
>                  v2=c(0,8,8,4,3,0,0,0,0,0) )
>     m2 <- cbind( n=c(1,2,3,4,5,6,8,10,11,12),
>                  v1=c(0,0,1,12,2,2,2,4,7,0),
>                  v2=c(2,3,9,8,9,9,0,1,1,1) )
>     m.all <- merge(m1, m2, by="n", all=T)
>
>         n v1.x v2.x v1.y v2.y
>     1   1   12    0    0    2
>     2   2   10    8    0    3
>     3   3    3    8    1    9
>     4   4    8    4   12    8
>     5   5   NA   NA    2    9
>     6   6    7    3    2    9
>     7   7   12    0   NA   NA
>     8   8    1    0    2    0
>     9   9   18    0   NA   NA
>     10 10    1    0    4    1
>     11 11   NA   NA    7    1
>     12 12   NA   NA    0    1
>     13 13    2    0   NA   NA
>
> Then, depending on how many such columns there are, you have a number of ways of aggregating this dataset. One such way is
>
>     cbind( n=m.all[ , "n"],
>            v1=rowSums( m.all[ , grep( "^v1", colnames(m.all) ) ], na.rm=T ),
>            v2=rowSums( m.all[ , grep( "^v2", colnames(m.all) ) ], na.rm=T ) )
>
>         n v1 v2
>     1   1 12  2
>     2   2 10 11
>     3   3  4 17
>     4   4 20 12
>     5   5  2  9
>     6   6  9 12
>     7   7 12  0
>     8   8  3  0
>     9   9 18  0
>     10 10  5  1
>     11 11  7  1
>     12 12  0  1
>     13 13  2  0
>
> Regards, Adai
>
> On Tue, 2005-12-06 at 14:22 +0100, Dubravko Dolic wrote:
>> Dear List,
>>
>> I have two data.frames of the following form:
>>
>> A:
>>
>>      n V1 V2
>>      1 12  0
>>      2 10  8
>>      3  3  8
>>      4  8  4
>>      6  7  3
>>      7 12  0
>>      8  1  0
>>      9 18  0
>>     10  1  0
>>     13  2  0
>>
>> B:
>>
>>      n V1 V2
>>      1  0  2
>>      2  0  3
>>      3  1  9
>>      4 12  8
>>      5  2  9
>>      6  2  9
>>      8  2  0
>>     10  4  1
>>     11  7  1
>>     12  0  1
>>
>> Now I want to merge those frames into one data.frame, summing up the columns V1 and V2 but not the column n. So the result in this example would be:
>>
>> AB:
>>
>>      n V1 V2
>>      1 12  2
>>      2 10 11
>>      3  4 17
>>      4 20 12
>>      5  2  9
>>      6  9 12
>>      7 12  0
>>      8  3  0
>>      9 18  0
>>     10  5  1
>>     11  7  1
>>     12  0  1
>>     13  2  0
>>
>> So columns V1 and V2 are the sum of A and B, while n has its old value. Notice that there are different rows in n of A and B. I don't have a clue how to start here. Any hint is welcome.
>>
>> Thanks
>> Dubravko Dolic
>> Munich, Germany

--
Pierre Kleiber, Ph.D          Email: [EMAIL PROTECTED]
Fishery Biologist             Tel: 808 983-5399 / (hm) 808 737-7544
NOAA Fisheries Service - Honolulu Laboratory   Fax: 808 983-2902
2570 Dole St., Honolulu, HI 96822-2396

God could have told Moses about galaxies and mitochondria and all. But behold... It was good enough for government work.
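A third route to the same table, added here as a sketch and not taken from the thread: base R's `rowsum()` sums the rows of a matrix within groups, so stacking the two matrices and grouping on `n` gives the totals directly. The object names `both`, `sums` and `result` are made up for the illustration.

```r
m1 <- cbind(n  = c(1, 2, 3, 4, 6, 7, 8, 9, 10, 13),
            v1 = c(12, 10, 3, 8, 7, 12, 1, 18, 1, 2),
            v2 = c(0, 8, 8, 4, 3, 0, 0, 0, 0, 0))
m2 <- cbind(n  = c(1, 2, 3, 4, 5, 6, 8, 10, 11, 12),
            v1 = c(0, 0, 1, 12, 2, 2, 2, 4, 7, 0),
            v2 = c(2, 3, 9, 8, 9, 9, 0, 1, 1, 1))

# Stack the two matrices, then sum v1 and v2 within each value of n.
both <- rbind(m1, m2)
sums <- rowsum(both[, c("v1", "v2")], group = both[, "n"])

# rowsum() keeps the group values as row names; put them back as a column.
result <- cbind(n = as.numeric(rownames(sums)), sums)
result
```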
Re: [R] R is GNU S, not C.... [was how to get or store .....]
Xiaofan Li wrote:
> I consistently use ; at the end of every line of my R code and have found it much neater than statements without an end; as for <- and =, if I were the author I would rather take the first representation as a sign of passing by reference and the latter as passing by value.

The problem with doing this is that it can be misleading. For example, you might think the following code does something different than what it does:

    x <- 1
    + 2 ;

which gives a result that might surprise you:

    > x <- 1
    > + 2 ;
    [1] 2
    > x
    [1] 1

You can argue that R's rules for marking the end of statements are rather bizarre and that they should be different, but they aren't, and you shouldn't use a style of coding that suggests that they are.

Duncan Murdoch

> Xiaofan
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of McGehee, Robert
> Sent: 06 December 2005 16:32
> To: Jan T. Kim; rHelp
> Subject: Re: [R] R is GNU S, not C [was how to get or store .]
>
> Jan T. Kim wrote:
>> There is a draft R Coding Convention available at http://www.maths.lth.se/help/R/RCC/ which may be useful for finding a style that is good because it is widely used and therefore familiar to a large number of readers.
>
> However, as the author Henrik Bengtsson points out, these guidelines are "ours and not the R-developers'". Perhaps a definitive style guide published by R core that ensured consistency among new base code would be a helpful addition. I personally find the above style guide extremely useful when multiple programmers work on the same project, and would welcome a formal endorsement or revision by the R developers. (And despite Henrik's elegant guide, I too leave off the semicolons at the end of the lines.)
>
> --Robert
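The point about where R ends a statement can be checked by asking the parser how many expressions it sees. This is an illustrative sketch added here, using only base R's `parse()` and `eval()`. The string mirrors a two-line fragment in which the first line is already a complete statement and the second begins with `+`.

```r
# Two physical lines; the trailing ";" does not join them.
src <- "x <- 1\n+ 2 ;"

exprs <- parse(text = src)
length(exprs)          # how many separate expressions the parser found

# Evaluating them in order: the first assigns 1 to x, the second is just `+2`.
last_value <- eval(exprs)
x
last_value
```

A line break ends a statement whenever the text so far is syntactically complete, so a continuation line has to leave the operator at the end of the first line.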
[R] Course***R/Splus Fundamentals and Programming Techniques, December 2005
Happy Holidays!

XLSolutions Corporation (www.xlsolutions-corp.com) is proud to announce the 2-day course "R/S-plus Fundamentals and Programming Techniques": www.xlsolutions-corp.com/Rfund.htm

*** Seattle -- January 9th-10th, 2006
*** San Francisco -- January 16th-17th, 2006
*** Atlanta -- January 19th-20th, 2006
*** New York -- January 26th-27th, 2006
*** Boston -- January 30th-31st, 2006

Reserve your seat now at the early bird rates! Payment due AFTER the class.

Course Description:

This two-day beginner to intermediate R/S-plus course focuses on a broad spectrum of topics, from reading raw data to a comparison of R and S. We will learn the essentials of data manipulation, graphical visualization and R/S-plus programming. We will explore statistical data analysis tools, including graphics with data sets, and see how to enhance your plots, build your own packages (libraries) and connect via ODBC, etc. We will perform some statistical modeling and fit linear regression models. Participants are encouraged to bring data for interactive sessions.

The course follows this outline:

- An Overview of R and S
- Data Manipulation and Graphics
- Using Lattice Graphics
- A Comparison of R and S-Plus
- How can R Complement SAS?
- Writing Functions
- Avoiding Loops
- Vectorization
- Statistical Modeling
- Project Management
- Techniques for Effective use of R and S
- Enhancing Plots
- Using High-level Plotting Functions
- Building and Distributing Packages (libraries)
- Connecting: ODBC, Rweb, Orca via sockets and via Rjava

Email us for group discounts. Email Sue Turner: [EMAIL PROTECTED] Phone: 206-686-1578 Visit us: www.xlsolutions-corp.com/training.htm

Please let us know if you and your colleagues are interested in this class, to take advantage of the group discount. Register now to secure your seat! Interested in the R/Splus Advanced course? Email us.

Cheers,
Elvis Miller, PhD
Manager Training
XLSolutions Corporation
206 686 1578
www.xlsolutions-corp.com
[EMAIL PROTECTED]
[R] Matrix of dummy variables from a factor
What is a simple way to convert a factor into a matrix of dummy variables?

    fm <- lm(y ~ f)

where f is a factor, takes care of this in the estimation. I'd like to save the result of expanding f into a matrix for later use.

Thanks.

Charles

--
Charles H. Franklin
Professor, Political Science
University of Wisconsin, Madison
[EMAIL PROTECTED] [EMAIL PROTECTED]
608-263-2022 (voice) 608-265-2663 (fax)
Re: [R] reading in data with variable length
Everything has slowed down with #1 and #3 by about 50%. Can't do #2 or #4:

    > ta.num <- lapply(ta0, scan, sep = ",")
    Error in file(file, "r") : unable to open connection

scan seems to want a file or a connection ...

Gabor Grothendieck [EMAIL PROTECTED] wrote:
> Could you time these and see how each of these do:
>
>     # 1
>     ta.split <- strsplit(ta, split = ",")
>     ta.num <- lapply(ta.split, function(x) as.numeric(x[-(1:2)]))
>
>     # 2
>     ta0 <- sub("^[^,]*,[^.]*,", "", ta)
>     ta.num <- lapply(ta0, scan, sep = ",")
>
>     # 3 - loop version of #1
>     n <- length(ta)
>     ta.split <- strsplit(ta, split = ",")
>     ta.num <- list(length = n)
>     for(i in 1:n) ta.num[[i]] <- as.numeric(ta.split[[i]][-(1:2)])
>
>     # 4 - loop version of #2
>     n <- length(ta)
>     ta0 <- sub("^[^,]*,[^.]*,", "", ta)
>     ta.num <- list(length = n)
>     for(i in 1:n) ta.num[[i]] <- scan(t0[[i])
>
> On 12/6/05, John McHenry wrote:
>> I should have mentioned that I already tried the readLines() approach:
>>
>>     > ta<-readLines("foo.csv")
>>     > ptm<-proc.time()
>>     > f<-character(length(ta))
>>     > for (k in 2:length(ta)) { f[k-1]<-(strsplit(ta[k],",")[[1]])[3] }
>>     # <- PARSING EACH LINE AT THIS LEVEL IS WHERE THE REAL INEFFICIENCY IS
>>     > (proc.time()-ptm)[3]
>>     [1] 102.75
>>
>> on a 62M file, so I'm guessing that on my 1GB files this will be about
>>
>>     > (102.75*(1000/61))/60
>>     [1] 28.07377
>>
>> minutes ... which is way, way too long. I'm new to R but I'm kind of surprised that this problem isn't well known (couldn't find anything after a long hunt).
>>
>> As I mentioned, MATLAB does it using textread, which makes a call to its dll dataread. The data are read using something like:
>>
>>     [name, startMonth, data]=textread(fileName,'%s%n%[^\n]', 'delimiter',',', 'bufsize', 100, 'headerlines',1);
>>
>> which is kind of fscanf-like. data in the above is then a cell array with each cell being the variable-length data.
>>
>> Liaw, Andy wrote:
>>> Using a file() connection in conjunction with readLines() and strsplit() should do it. I would try to count the number of lines in the file first, and create a list with that many components, then fill it in. I believe the array of cells in Matlab is sort of equivalent to a list in R, but that's beyond my knowledge of Matlab...
>>>
>>> Andy
>>>
>>> From: John McHenry
>>>> I have very large csv files (up to 1GB each of ASCII text). I'd like to be able to read them directly in to R. The problem I am having is with the variable length of the data in each record. Here's a (simplified) example:
>>>>
>>>>     $ cat foo.csv
>>>>     Name,Start Month,Data
>>>>     Foo,10,-0.5615,2.3065,0.1589,-0.3649,1.5955
>>>>     Bar,21,0.0880,0.5733,0.0081,2.0253,-0.7602,0.7765,0.2810,1.8546,0.2696,0.3316,0.1565,-0.4847,-0.1325,0.0454,-1.2114
>>>>
>>>> The records consist of rows with some set comma-separated fields (e.g. the Name and Start Month fields in the above) and then the data follow as a variable-length list of comma-separated values until a new line is encountered.
>>>>
>>>> Now I can use e.g.
>>>>
>>>>     fileName="foo.csv"
>>>>     ta<-read.csv(fileName, header=F, skip=1, sep=",", dec=".", fill=T)
>>>>
>>>> which does the job nicely:
>>>>
>>>>     V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17
>>>>     1 Foo 10 -0.5615 2.3065 0.1589 -0.3649 1.5955 NA NA NA NA NA NA NA NA NA NA
>>>>     2 Bar 21 0.0880 0.5733 0.0081 2.0253 -0.7602 0.7765 0.281 1.8546 0.2696 0.3316 0.1565 -0.4847 -0.1325 0.0454 -1.2114
>>>>
>>>> but the problem is that with files on the order of 1GB this either crunches for ever or runs out of memory trying ... plus having all those NAs isn't too pretty to look at. (I have a MATLAB version that can read this stuff into an array of cells in about 3 minutes.)
>>>>
>>>> I really want a fast way to read the data part into a list; that way I can access data in the array of lists containing the records by doing something like ta[[i]]$data.
>>>>
>>>> Ideas?
>>>>
>>>> Thanks,
>>>> Jack.
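A sketch of the connection-based approach suggested in this thread: read the file in chunks through a `file()` connection, split each line once, and store one list per record so that `records[[i]]$data` works. This is an illustration only. The sample file, the chunk size of 1000 and the field names `name`, `startMonth` and `data` are assumptions, and no claim is made about how it performs on 1GB inputs.

```r
# Small stand-in for foo.csv (made-up file; first line is the header).
path <- tempfile(fileext = ".csv")
writeLines(c("Name,Start Month,Data",
             "Foo,10,-0.5615,2.3065,0.1589,-0.3649,1.5955",
             "Bar,21,0.0880,0.5733,0.0081"), path)

con <- file(path, open = "r")
invisible(readLines(con, n = 1))           # skip the header line

records <- list()
repeat {
  chunk <- readLines(con, n = 1000)        # chunk size is an arbitrary choice
  if (length(chunk) == 0) break
  parts <- strsplit(chunk, ",", fixed = TRUE)
  recs <- lapply(parts, function(p) {
    list(name = p[1],
         startMonth = as.numeric(p[2]),
         data = as.numeric(p[-(1:2)]))     # everything after the two fixed fields
  })
  records <- c(records, recs)
}
close(con)

records[[2]]$data
```

Using `fixed = TRUE` in `strsplit()` avoids regular-expression matching on the separator, which is usually the cheaper option for a plain comma.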
Re: [R] Coefficient of association for 2x2 contingency tables
The CoCo bundle might contain various measures of association...

Søren

From: [EMAIL PROTECTED] on behalf of Alexandre Santos Aguiar
Sent: Tue 06-12-2005 19:48
To: r-help
Subject: [R] Coefficient of association for 2x2 contingency tables

> Hi, Found no measure of association or correlation for 2x2 contingency tables in fullrefman.pdf or google. Can someone point to a package that implements such calculations? Thanx.
> -- Alexandre Santos Aguiar - consultoria para pesquisa em saúde - R Botucatu, 591 cj 81 tel 11-9320-2046 fax 11-5549-8760 www.spsconsultoria.com
Re: [R] xyplot question
On 12/6/05, Christoph Scherber [EMAIL PROTECTED] wrote:
> Dear R users, I have a question regarding the use of xyplot in the lattice() package. I have two factors (each with two levels), and I'd like to change the order of the panels in a 2x2 panel layout from the default alphabetic order that R uses based on the names of the factor levels. My approach is (in principle)
>
>     xyplot(y~x|Factor1+Factor2)
>
> Let's assume my factor levels for Factor1 are A and B, and for Factor2 they're C and D, respectively. Now the default arrangement of my panels would be (from bottom top left

I assume you mean 'top left'

> to bottom right): BC,CA,BD,AD

No it won't, unless you meant

    xyplot(y~x|Factor2+Factor1)

Instead of describing your problem 'in principle' (which can be very confusing when you make a mistake), please do as the posting guide asks and give a reproducible example. Anyone trying to answer you will have to come up with an example anyway, and since it's your problem, it might as well be you.

> What I'd like to have is BD,AC,BC,AD.

This is impossible if you have two conditioning factors (whichever way you count, the combination following BD has to have at least one of B and D in it). If you want to lose the 2-factor structure, create an interaction, after which you can reorder its levels any way you want, e.g.

    d <- data.frame(f1 = sample(gl(2, 10, labels = LETTERS[1:2])),
                    f2 = sample(gl(2, 10, labels = LETTERS[3:4])),
                    x = rnorm(20), y = rnorm(20))
    xyplot(y ~ x | f1:f2, d)[c(1, 2, 4, 3)]

which is a shortcut for

    xyplot(y ~ x | f1:f2, d, index.cond = list(c(1, 2, 4, 3)))

-Deepayan

> Can anyone tell me how to solve this problem easily? I've read that using perm.cond and/or index.cond could solve this problem, but couldn't find an appropriate example, unfortunately...
[R] Constructing a transition matrix
Hi,

I would like to construct a transition matrix from a data frame with annual transitions of marked plants.

    plant <- c(1:6)
    class <- c("seed","seed","seed","veg","rep","rep")
    fate <- c("dead","veg","veg","rep","rep","veg")
    trans <- data.frame(plant, class, fate)

      plant class fate
    1     1  seed dead
    2     2  seed  veg
    3     3  seed  veg
    4     4   veg  rep
    5     5   rep  rep
    6     6   rep  veg

I have been using SQL queries to do this, but I would like to construct the matrix in R since I plan to resample transitions using

    trans[sample(nrow(trans), 6, replace=T), ]

I know I can get the original size vector using table()

    data.matrix(table(trans$class))
         [,1]
    rep     2
    seed    3
    veg     1

but I don't know how to get counts of each class-fate combination where fate does NOT equal dead

    seed veg = 2
    veg rep = 1
    rep rep = 1
    rep veg = 1

or how to divide the class-fate count by the original class count in the size vector to get survival probabilities

    seed veg = 2 / 3 seed = 0.67
    veg rep = 1 / 1 veg = 1
    rep rep = 1 / 2 rep = 0.5
    rep veg = 1 / 2 rep = 0.5

or construct the square matrix with rows and columns in the same developmental sequence, like dev <- c("seed","veg","rep").

         seed veg rep
    seed 0    0   0
    veg  0.67 0   0.5
    rep  0    1   0.5

Any help or suggestions would be appreciated.

Thanks,
Chris Stubben
--
Los Alamos National Lab BioScience Division MS M888 Los Alamos, NM 87545
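One way the matrix asked for above might be built in base R, offered as a sketch and not as an answer from the list: fix the factor levels in developmental order, cross-tabulate fate by class with `table()`, and divide each column by its class total with `prop.table()`. The names `counts` and `A` are made up here. Columns are the starting class and rows are the fate, matching the layout in the question.

```r
trans <- data.frame(plant = 1:6,
                    class = c("seed", "seed", "seed", "veg", "rep", "rep"),
                    fate  = c("dead", "veg", "veg", "rep", "rep", "veg"))

dev <- c("seed", "veg", "rep")              # developmental order

# Fix the level order so rows and columns come out in the same sequence.
trans$class <- factor(trans$class, levels = dev)
trans$fate  <- factor(trans$fate,  levels = c(dev, "dead"))

# Counts of each class-fate combination (rows = fate, columns = class).
counts <- table(fate = trans$fate, class = trans$class)

# Divide each column by its class total (dead plants stay in the denominator),
# then drop the "dead" row to leave the square matrix.
A <- prop.table(counts, margin = 2)[dev, dev]
round(A, 2)
```

For the resampling step, the same lines from `table()` onward can be wrapped in a function and applied to `trans[sample(nrow(trans), replace = TRUE), ]`. A class that happens to be absent from a resample would give a column of NaN, which would need handling.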
Re: [R] Writing a list to a file !
Here is a function I use to send lists to ASCII files.

    list2ascii <- function(x, file=paste(deparse(substitute(x)),".txt",sep="")) {
      # MHP July 7, 2004
      # R or S function to write an R list to an ASCII file.
      # This can be used to create files for those who want to use
      # a spreadsheet or other program on the data.
      #
      tmp.wid = getOption("width")   # save current width
      options(width=10000)           # increase output width
      sink(file)                     # redirect output to file
      print(x)                       # print the object
      sink()                         # cancel redirection
      options(width=tmp.wid)         # restore linewidth
      return(invisible(NULL))        # return (nothing) from function
    }

I hope it's helpful. To write it to a file that can be read by R, I would suggest using dput instead.

Regards,
Mike Prager

A Ezhil wrote:
> Hi All,
>
> This may be trivial in R but I have been trying without any success. I have a list of 100 elements, each having a sub-list of different length. I would like to write the list to an ASCII file. I tried with write.table(), after converting my list to a matrix. Now it looks like
>
>     Robert c(90, 50, 30)
>     John   c(91, 20, 25, 45)
>
> How can I get rid of the c(...)? In my file, I would like to have
>
>     Robert 90, 50, 30
>     John   91, 20, 25, 45
>
> Thanks in advance.
>
> Regards,
> Ezhil
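For the specific layout asked for in the question (a name followed by comma-separated values, without the `c(...)` wrapper), a more direct sketch is to build each line with `paste()` and write them with `writeLines()`. This is an added illustration. The two-element list and the temporary output file are made up.

```r
# Hypothetical list with elements of different lengths.
scores <- list(Robert = c(90, 50, 30), John = c(91, 20, 25, 45))

# One text line per element: the name, then the values joined by ", ".
values <- sapply(scores, paste, collapse = ", ")
lines_out <- paste(names(scores), values)

out <- tempfile(fileext = ".txt")
writeLines(lines_out, out)
readLines(out)
```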
Re: [R] Coefficient of association for 2x2 contingency tables
On Tue, 2005-12-06 at 16:48 -0200, Alexandre Santos Aguiar wrote:
> Hi, Found no measure of association or correlation for 2x2 contingency tables in fullrefman.pdf or google. Can someone point to a package that implements such calculations? Thanx.

Alexandre,

See the assocstats() function in the 'vcd' package on CRAN.

HTH,
Marc Schwartz
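For a 2x2 table specifically, two common measures can also be computed by hand in base R. The sketch below is an added illustration with made-up counts. It is not a description of what any package does internally. It computes the phi coefficient and the odds ratio, and cross-checks phi against the uncorrected chi-squared statistic from `chisq.test()`.

```r
# Hypothetical 2x2 table of counts (filled column by column).
tab <- matrix(c(10, 5, 3, 12), nrow = 2,
              dimnames = list(exposure = c("yes", "no"),
                              outcome  = c("yes", "no")))

n11 <- tab[1, 1]; n12 <- tab[1, 2]
n21 <- tab[2, 1]; n22 <- tab[2, 2]

# Phi coefficient: (ad - bc) / sqrt(product of the four marginal totals).
phi <- (n11 * n22 - n12 * n21) /
  sqrt(prod(rowSums(tab)) * prod(colSums(tab)))

# Odds ratio (cross-product ratio).
odds_ratio <- (n11 * n22) / (n12 * n21)

# For a 2x2 table, phi^2 equals the uncorrected chi-squared statistic over n.
chi2 <- unname(chisq.test(tab, correct = FALSE)$statistic)
c(phi = phi, odds_ratio = odds_ratio, check = sqrt(chi2 / sum(tab)))
```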
Re: [R] figure with inset
Thanks for all your help. For documentation (I'm probably not the last one searching R-help for a solution to this problem), here is what I believe works best for the intended purpose:

    x<-0:10; y<-x^4;
    library(gridBase)
    X11(width=8,height=8)
    # produce outer (main) plot
    plot(x,y,xaxs="i",yaxs="i")
    vp <- baseViewports()
    pushViewport(vp$inner,vp$figure,vp$plot)
    # push viewport that will contain the inset
    pushViewport(viewport(x=0.1,y=0.9,width=.5,height=.5,just=c("left","top")))
    # now either define viewport to contain the whole inset figure
    par(fig=gridFIG(),new=T)   # or gridPLT()
    # ...or just the plotting area (coordinate system)
    par(plt=gridPLT(),new=T)
    # draw frame around selected area (for illustration only)
    grid.rect(gp=gpar(lwd=3,col="red"))
    # plot inset figure
    plot(x,y,xaxs="i",yaxs="i",xlab="",ylab="")
    # pop all viewports from stack
    popViewport(4)

Pascal Niklaus

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Uwe Ligges
> Sent: Tuesday, December 06, 2005 1:54 AM
> To: [EMAIL PROTECTED]
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] figure with inset
>
> [EMAIL PROTECTED] wrote:
>> I am trying to plot a figure within a figure (an inset that shows a closeup of part of the data set). I have searched R-help and other sources but not found a solution.
>
> See the examples on the grid package by Paul Murrell in R News.
>
> Uwe Ligges

1. Nice posting -- your little diagram makes your question crystal clear.
2. Murrell's new book, R GRAPHICS, is a comprehensive resource on grid, if you decide you want to do more with it.
3. See also ?par ... new=TRUE for a (less flexible, but perhaps adequate for your needs) way to do this in R's traditional graphics system.

Cheers,
Bert

>> What I would like to do is
>> (1) produce a plot
>> (2) specify a window that will be used for the next plot (in inches or using the coordinate system of the plot produced in (1))
>> (3) overlay a new plot in the window specified under (2)
>>
>> The result would be:
>>
>>     +--------------------+
>>     |                    |
>>     |  first plot        |
>>     |     +-------+      |
>>     |     | inset |      |
>>     |     +-------+      |
>>     |                    |
>>     +--------------------+
>>
>> Thank you for your help
>> Pascal
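The traditional-graphics route mentioned in point 3 can be sketched as follows. This is an added illustration using only base graphics. The `fig` coordinates, the margins and the choice of which points go into the inset are arbitrary, and the plot is written to a temporary PDF so the example does not need an interactive display.

```r
out <- tempfile(fileext = ".pdf")
pdf(out, width = 6, height = 6)

x <- 0:10
y <- x^4

# (1) main plot
plot(x, y, type = "b")

# (2) choose the inset region as fractions of the device (x1, x2, y1, y2);
#     new = TRUE keeps the next high-level plot from clearing the page.
op <- par(fig = c(0.15, 0.60, 0.45, 0.90), new = TRUE, mar = c(2, 2, 1, 1))

# (3) inset: close-up of the first few points
plot(x[1:4], y[1:4], type = "b", xlab = "", ylab = "")

par(op)      # restore the previous graphics settings
dev.off()
```

Unlike the gridBase approach, the inset position here is given in device fractions rather than in the coordinate system of the main plot.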
Re: [R] R is GNU S, not C.... [was how to get or store .....]
From: [EMAIL PROTECTED]
> ronggui wrote:
>> I think it is NOT just for historical reasons. See the following example:
>>
>>     > rm(x)
>>     > mean(x=1:10)
>>     [1] 5.5
>>     > x
>>     Error: object x not found
>
> x is an argument local to mean(); did you expect another answer?
>
>>     > mean(x<-1:10)
>>     [1] 5.5
>>     > x
>>      [1]  1  2  3  4  5  6  7  8  9 10
>
> What is the goal of this example?

I believe it's to show why <- is to be preferred over = for assignment...

> Here, with <-, the global variable x is also created (as a side effect, intended or not). Did the writer really want that???

Very much so, I believe.

> I thought there were other specific statements especially intended for global assignment, e.g. <<-.

You need to distinguish assignment in a function _call_ and assignment in a function _definition_. They ain't the same.

> If this example was intended to prove <- is better than = ... I'm not really convinced!

In that case, let's try another one (which is one big reason I stopped using = for assignment):

    > long.comp <- function(n) {
    +     Sys.sleep(n)
    +     n
    + }
    > result = long.comp(30)
    > system.time(result = long.comp(30))
    Error in system.time(result = long.comp(30)) :
            unused argument(s) (result ...)
    > system.time(result <- long.comp(30))
    [1]  0.00  0.00 30.05    NA    NA
    > str(result)
     num 30

Cheers,
Andy
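The same contrast can be reproduced in a script, with the error caught instead of stopping the session. This is an added sketch. The function name `slow_id`, the 0.1-second delay and the variable names are made up. Inside a call, `name = value` is read as a named argument, while `name <- value` is an expression that is evaluated in the caller and performs the assignment there.

```r
slow_id <- function(n) {
  Sys.sleep(n)   # stand-in for a long computation
  n
}

# `=` inside the call is treated as an argument named res_eq, which
# system.time() does not have, so the call fails.
err <- tryCatch(system.time(res_eq = slow_id(0.1)),
                error = function(e) conditionMessage(e))

# `<-` inside the call is an ordinary expression: it is timed, and the
# assignment happens in the calling environment.
timing <- system.time(res_arrow <- slow_id(0.1))

err
res_arrow
```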
Re: [R] plot and factors
On 12/2/05, Jason Miller [EMAIL PROTECTED] wrote:

  Read R-helpers,

  I'm relatively new to R and trying to jump in feet first. I've been able to learn a lot from on-line and printed documentation, but here's one question to which I can't find an answer. Well, it's a question with a couple parts. Thanks in advance for any direction (partial or complete) that anyone can provide.

  I have a data frame with three columns: Year, Semester, value1. I want to treat Year and Semester as factors; there are many years, and there are three semesters (Fall, Spring, Summer).

  First, I would like to be able to plot the data in this frame as Year v. value with one curve for each factor. I have been unable to do this. Is there any built-in R functionality that makes this easy, or do I need to build this by hand (e.g., using the techniques in FAQ 5.11 or 5.29)?

  Second, I would like to be able to plot the values against a doubly labeled axis that uses Year and Semester (three Semester ticks per Year). Is there a relatively straightforward way to do this? (What's happening, of course, is that I'd like to treat Year+Semester as a single factor for the purpose of marking the axis, but I'm not sure how to do that, either.)

Here are some possibilities using the lattice package:

  library(lattice)
  d <- data.frame(Year = factor(rep(2000:2004, each = 3)),
                  Semester = gl(3, 1, 15, labels = c("Fall", "Spring", "Summer")),
                  val = sort(rnorm(15)))

  xyplot(val ~ Year, d, groups = Semester, type = 'o', auto.key = TRUE)
  xyplot(val ~ Year:Semester, d, scales = list(x = list(rot = 90)))
  dotplot(Year:Semester ~ val, d)
  dotplot(Semester ~ val | Year, d, layout = c(1, 5))

HTH,
-Deepayan
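For the second part of the question (treating Year+Semester as a single factor for the axis), a base-graphics sketch along the following lines may also do. It is not from the thread: the column name YS, the seed and the use of interaction() with lex.order = TRUE are choices made for this illustration, and pdf(NULL) only keeps the example from opening a window.

```r
set.seed(1)                                   # arbitrary seed, for reproducibility only
d <- data.frame(Year = factor(rep(2000:2004, each = 3)),
                Semester = gl(3, 1, 15, labels = c("Fall", "Spring", "Summer")),
                val = sort(rnorm(15)))

# One combined factor; lex.order = TRUE keeps Year as the slowest-varying part,
# so the levels run 2000.Fall, 2000.Spring, 2000.Summer, 2001.Fall, ...
d$YS <- interaction(d$Year, d$Semester, lex.order = TRUE)

pdf(NULL)                                     # null device; drop this interactively
plot(as.integer(d$YS), d$val, type = "o", xaxt = "n", xlab = "", ylab = "val")
axis(1, at = seq_along(levels(d$YS)), labels = levels(d$YS), las = 2)  # one tick per Year.Semester
dev.off()
```

The formula term Year:Semester used in the lattice calls builds essentially the same combined factor on the fly.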
Re: [R] Matrix of dummy variables from a factor
See ?model.matrix.

HTH,
Andy

From: Charles H. Franklin

  What is a simple way to convert a factor into a matrix of dummy variables?

    fm <- lm(y ~ f)

  where f is a factor takes care of this in the estimation. I'd like to save the result of expanding f into a matrix for later use.

  Thanks.

  Charles

  --
  Charles H. Franklin
  Professor, Political Science
  University of Wisconsin, Madison
  [EMAIL PROTECTED] [EMAIL PROTECTED]
  608-263-2022 (voice) 608-265-2663 (fax)
Re: [R] R is GNU S, not C.... [was how to get or store .....]
On 12/6/05, Ted Harding [EMAIL PROTECTED] wrote:

On 06-Dec-05 Martin Maechler wrote:

  [But really, I'm more concerned and quite a bit disappointed by the diehard ";" lovers]

  Martin Maechler

Well, while not die-hard, I will put in my own little reason for often using ";" at the end of lines which don't need them.

Basically, this is done to protect me from myself (so in fact is quite a strong reason).

I tend to develop extended R code in a side-window, using a text editor (vim) in that window, and cut-and-pasting the chunks of R code from that window into the R window.

This usually means that I have a lot of short lines, since it is easier when developing code to work with the commands one per line, as they are easier to find and less likely to be corrected erroneously.

Finally, when I am content that the code does the job I then put several short lines into one longer one.

For example (a function to do with sampling with probability proportional to weights); first, as written line-by-line:

  myfunction <- function(X,n1,n2,n3,WTS){
    N1<-n1;
    N2<-n1+n2;
    N3<-n1+n2+n3;
    # first selection
    pii<-WTS/sum(WTS);
    alpha<-N2;
    Pi<-alpha*pii;
    r<-runif(N3);
    ix<-sort(which(r<=Pi));
    # second selection
    ix0<-(1:N3);
    ix3<-ix0[-ix];
    ix20<-ix0[ix];
    W<-WTS[ix];
    pii<-W/sum(W);
    Pi<-N1*pii;
    r<-runif(length(Pi));
    ix10<-sort(which(r<=Pi));
    ix1<-ix20[ix10];
    ix2<-ix20[-ix10];
    # return the results
    list(X1=X[ix1],X2=X[ix2],X3=X[ix3],ix1=ix1,ix2=ix2,ix3=ix3)
  }

Having got that function right, with 'vim' in command mode successive lines are readily brought up to the current line by simply pressing J, which is very fast.
This, in the above case, then results in

  MARselect<-function(X,n1,n2,n3,WTS){
    N1<-n1; N2<-n1+n2; N3<-n1+n2+n3;
    # first selection
    pii<-WTS/sum(WTS); alpha<-N2; Pi<-alpha*pii;
    r<-runif(N3); ix<-sort(which(r<=Pi));
    # second selection
    ix0<-(1:N3); ix3<-ix0[-ix]; ix20<-ix0[ix];
    W<-WTS[ix]; pii<-W/sum(W); Pi<-N1*pii;
    r<-runif(length(Pi)); ix10<-sort(which(r<=Pi));
    ix1<-ix20[ix10]; ix2<-ix20[-ix10];
    # return the results
    list(X1=X[ix1],X2=X[ix2],X3=X[ix3],ix1=ix1,ix2=ix2,ix3=ix3)
  }

The greater readability of the first relative to the second is obvious. The compactness of the second relative to the first is evident. Obtaining the second from the first by repeated J is very quick.

I'm curious: exactly what purpose does this 'compactness' serve? The file size doesn't decrease, since you are replacing newlines by semicolons. It does not improve readability. So why do it at all?

-Deepayan
Re: [R] reading in data with variable length
A slight variation on one of Gabor's ideas might work:

  ## simulate a data file:
  n <- 2e5
  minF <- 20
  maxF <- 30
  f <- file("test.csv", open="w")
  invisible(replicate(n, writeLines(paste(runif(sample(minF:maxF, 1)), collapse=","), f)))
  close(f)

  f <- file("test.csv", open="r")
  system.time(dat <- replicate(n, scan(f, nlines=1, sep=",")))
  close(f)

The above code creates a file around 270MB. It took around 46 seconds on my 1GB laptop to read the data into dat. The corresponding strsplit(readLines()) solution took over a minute, and another 23 seconds to run lapply(dat, as.numeric).

Andy

-----Original Message-----
From: John McHenry [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 06, 2005 3:05 PM
To: Gabor Grothendieck
Cc: Liaw, Andy; r-help@stat.math.ethz.ch
Subject: Re: [R] reading in data with variable length

Everything has slowed down with #1 and #3 by about 50%. Can't do #2 and #4:

  > ta.num <- lapply(ta0, scan, sep = ",")
  Error in file(file, "r") : unable to open connection

scan seems to want a file or a connection ...

Gabor Grothendieck [EMAIL PROTECTED] wrote:

Could you time these and see how each of these do:

  # 1
  ta.split <- strsplit(ta, split = ",")
  ta.num <- lapply(ta.split, function(x) as.numeric(x[-(1:2)]))

  # 2
  ta0 <- sub("^[^,]*,[^,]*,", "", ta)
  ta.num <- lapply(ta0, scan, sep = ",")

  # 3 - loop version of #1
  n <- length(ta)
  ta.split <- strsplit(ta, split = ",")
  ta.num <- vector("list", n)
  for(i in 1:n) ta.num[[i]] <- as.numeric(ta.split[[i]][-(1:2)])

  # 4 - loop version of #2
  n <- length(ta)
  ta0 <- sub("^[^,]*,[^,]*,", "", ta)
  ta.num <- vector("list", n)
  for(i in 1:n) ta.num[[i]] <- scan(ta0[[i]])

On 12/6/05, John McHenry wrote:

I should have mentioned that I already tried the readLines() approach:

  > ta <- readLines("foo.csv")
  > ptm <- proc.time()
  > f <- character(length(ta))
  > for (k in 2:length(ta)) { f[k-1] <- (strsplit(ta[k], ",")[[1]])[3] }
  # <- PARSING EACH LINE AT THIS LEVEL IS WHERE THE REAL INEFFICIENCY IS
  > (proc.time()-ptm)[3]
  [1] 102.75

on a 62M file, so I'm guessing that on my 1GB files this will be about

  > (102.75*(1000/61))/60
  [1] 28.07377
minutes...which is way, way too long.

I'm new to R but I'm kind of surprised that this problem isn't well known (couldn't find anything after a long hunt).

As I mentioned, MATLAB does it using textread, which makes a call to its dll dataread. The data are read using something like:

  [name, startMonth, data]=textread(fileName,'%s%n%[^\n]', 'delimiter',',', 'bufsize', 100, 'headerlines',1);

which is kind of fscanf-like. "data" in the above is then a cell array with each cell being the variable-length data.

Liaw, Andy wrote:

Using a file() connection in conjunction with readLines() and strsplit() should do it. I would try to count the number of lines in the file first, and create a list with that many components, then fill it in.

I believe the array of cells in Matlab is sort of equivalent to a list in R, but that's beyond my knowledge of Matlab...

Andy

From: John McHenry

I have very large csv files (up to 1GB each of ASCII text). I'd like to be able to read them directly in to R. The problem I am having is with the variable length of the data in each record.

Here's a (simplified) example:

  $ cat foo.csv
  Name,Start Month,Data
  Foo,10,-0.5615,2.3065,0.1589,-0.3649,1.5955
  Bar,21,0.0880,0.5733,0.0081,2.0253,-0.7602,0.7765,0.2810,1.8546,0.2696,0.3316,0.1565,-0.4847,-0.1325,0.0454,-1.2114

The records consist of rows with some set comma-separated fields (e.g. the "Name" and "Start Month" fields in the above) and then the data follow as a variable-length list of comma-separated values until a new line is encountered.

Now I can use e.g.
  fileName="foo.csv"
  ta<-read.csv(fileName, header=F, skip=1, sep=",", dec=".", fill=T)

which does the job nicely:

     V1 V2      V3     V4     V5      V6      V7     V8    V9    V10
  1 Foo 10 -0.5615 2.3065 0.1589 -0.3649  1.5955     NA    NA     NA
  2 Bar 21  0.0880 0.5733 0.0081  2.0253 -0.7602 0.7765 0.281 1.8546
       V11    V12    V13     V14     V15    V16     V17
  1     NA     NA     NA      NA      NA     NA      NA
  2 0.2696 0.3316 0.1565 -0.4847 -0.1325 0.0454 -1.2114

but the problem is with files on the order of 1GB this either crunches for ever or runs out of memory trying ... plus having all those NAs isn't too pretty to look at.

(I have a MATLAB version that can read this stuff into an array of cells in about 3 minutes).

I really want a fast way to read the data part into a list; that way I can access data in the array of lists containing the records by doing something like ta[[i]]$data.

Ideas?

Thanks,

Jack.
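To make the target structure concrete, here is a small sketch on a shortened copy of foo.csv. It is not one of the timed solutions above and says nothing about speed on 1GB files. The strsplit() part follows Gabor's #1; the scan(text = ...) part assumes a current R, where scan() has a text argument, which avoids the "unable to open connection" error John hit when a string was passed as the file. The names txt, body and recs are made up for the illustration.

```r
# Shortened stand-in for foo.csv (header plus two variable-length records)
txt <- c("Name,Start Month,Data",
         "Foo,10,-0.5615,2.3065,0.1589,-0.3649,1.5955",
         "Bar,21,0.0880,0.5733,0.0081")
body <- txt[-1]                                   # drop the header line

# strsplit() route: one list component per record, so recs[[i]]$data works
parts <- strsplit(body, ",", fixed = TRUE)        # fixed = TRUE: no regex needed
recs <- lapply(parts, function(p)
  list(name = p[1],
       startMonth = as.numeric(p[2]),
       data = as.numeric(p[-(1:2)])))

# scan() route: strip the two leading fields, then parse each remainder as text
ta0 <- sub("^[^,]*,[^,]*,", "", body)
ta.num <- lapply(ta0, function(s) scan(text = s, sep = ",", quiet = TRUE))
```

With this layout recs[[2]]$data is the numeric vector for "Bar", and no NA padding is needed.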
Re: [R] Matrix of dummy variables from a factor
Dear Charles,

Try model.matrix(~f)[,-1].

Regards,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Charles H. Franklin
Sent: Tuesday, December 06, 2005 3:03 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Matrix of dummy variables from a factor

  What is a simple way to convert a factor into a matrix of dummy variables?

    fm <- lm(y ~ f)

  where f is a factor takes care of this in the estimation. I'd like to save the result of expanding f into a matrix for later use.

  Thanks.

  Charles

  --
  Charles H. Franklin
  Professor, Political Science
  University of Wisconsin, Madison
  [EMAIL PROTECTED] [EMAIL PROTECTED]
  608-263-2022 (voice) 608-265-2663 (fax)
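As a quick illustration of what John's suggestion returns (a sketch on a made-up three-level factor, assuming the default treatment contrasts are in effect), compare it with the no-intercept form, which keeps one indicator column per level:

```r
f <- factor(c("a", "b", "c", "a", "b"))   # toy factor, not from the thread

# Intercept dropped afterwards: k - 1 columns, first level "a" is the baseline
X_treat <- model.matrix(~ f)[, -1]

# Intercept removed in the formula: k columns, one 0/1 indicator per level
X_full <- model.matrix(~ f - 1)
```

X_treat has columns fb and fc, X_full has fa, fb and fc, and each row of X_full sums to 1.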
Re: [R] R is GNU S, not C.... [was how to get or store .....]
On 06-Dec-05 Deepayan Sarkar wrote:

  [...]

  The greater readability of the first relative to the second is obvious. The compactness of the second relative to the first is evident. Obtaining the second from the first by repeated J is very quick.

  I'm curious: exactly what purpose does this 'compactness' serve? The file size doesn't decrease, since you are replacing newlines by semicolons. It does not improve readability. So why do it at all?

  -Deepayan

You are taking a more abstract and logical view than I do!

a) It is more compact in the sense that the same amount of code takes fewer lines.

b) My editing window is typically about 56 lines tall. Once I have got the code working as I want, I can compact it onto fewer lines, thereby leaving more space for further code all of which will be visible in the same screen space.

c) Since the compacted code is already OK, I don't need to be able to read it so readily -- it is enough that I can see it when I need to refer to it.

It is all a matter of layout, perception and psychology: by experience I have found that this way of working improves my accuracy and speed, and my overview of the problem (and hence my ability to see solutions). This may or may not be valid for anyone else; but as far as I'm concerned it is (along with the J trick when using vim) a cogent (if personal) reason for putting ";" at the ends of commands. Which was the original point.

Best wishes,
Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 06-Dec-05 Time: 21:01:30
-- XFMail --
Re: [R] Matrix of dummy variables from a factor
Charles H. Franklin wrote:

  What is a simple way to convert a factor into a matrix of dummy variables?

    fm <- lm(y ~ f)

model.matrix(y~f)

Uwe Ligges

  where f is a factor takes care of this in the estimation. I'd like to save the result of expanding f into a matrix for later use.

  Thanks.

  Charles
Re: [R] strange behavior of loess() predict()
Gavin Simpson wrote:

Dear list,

I am very sorry for being inaccurate in my question. But re-reading the predict.loess help page does not provide a solution. As long as predict is used on a new dataset based on this dataset, the strange values remain and can be reproduced. Adding a new element to both vectors (at the beginning, e.g. 1 for each vector) results in plausible values - but not in every case. Even switching x and y is sufficient (i.e. x as predictor and y as dependent variable).

So my question is: Is it normal - or under which conditions does it take place - that predict.loess predicts values that are almost 2/max(y) ~ 5000 times higher than expected?

best,
leo gürtler

On Tue, 2005-12-06 at 18:09 +0100, Leo Gürtler wrote:

  Dear altogether,

  <snip>

  # here is the difference!!
  predict(mod, data.frame(x=X), se=TRUE)
  predict(mod, x=X, se=TRUE)

  --- end of snip ---

  I assume this has some reason but I do not understand this reason. Merci,

Not sure if this is the reason, but there is no argument x in predict.loess, and:

  a <- predict(mod, se = TRUE)

gives you the same results as:

  b <- predict(mod, x=X, se=TRUE)

so the x argument appears to be being passed on/in the ... arguments and ignored? As such, you have no newdata, so mod$x is used.

Now, when you do:

  c <- predict(mod, data.frame(x=X), se=TRUE)

You have used an un-named argument in position 2. R takes this to be what you want to use for newdata and so works with this data rather than the one in mod$x as in the first case:

  # now named second argument - gets ignored as in a and b
  d <- predict(mod, x = data.frame(x=X), se=TRUE)

  all.equal(a, b) # TRUE
  all.equal(a, c) # FALSE
  all.equal(a, d) # TRUE

  # this time we assign X to x by using (), the result is used as newdata
  e <- predict(mod, (x=X), se=TRUE)
  all.equal(c, e) # TRUE

If in doubt, name your arguments and check the help! ?predict.loess would have quickly shown you where the problem lay.
HTH

G

  best regards

  leo gürtler

  --
  email: [EMAIL PROTECTED]
  www: http://www.anicca-vijja.de/
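Gavin's point about naming the argument can be reproduced on a built-in data set. The sketch below uses the cars data and the default loess settings purely as stand-ins, since Leo's own data are not shown in the thread; the object names are made up.

```r
mod <- loess(dist ~ speed, data = cars)   # stand-in model on a built-in data set
X <- c(10, 15, 20)                        # new speeds to predict at

p_fit <- predict(mod, se = TRUE)          # no newdata: fitted values at the original data
p_ignored <- predict(mod, x = X, se = TRUE)   # `x` is not an argument of predict.loess;
                                              # it falls into ... and is ignored
p_new <- predict(mod, newdata = data.frame(speed = X), se = TRUE)  # what was intended
```

Here p_ignored has one fitted value per row of cars, exactly like p_fit, while p_new has three values, one per element of X. Note that the column in newdata must carry the predictor's name from the model formula (speed here).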
Re: [R] Matrix of dummy variables from a factor
But note: There are (almost?) no situations in R where the dummy variables coding is needed. The coding is (almost?) always handled properly by the modeling functions themselves.

Question: Can someone provide a straightforward example where the dummy variable coding **is** explicitly needed?

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA

"The business of the statistician is to catalyze the scientific learning process." - George E. P. Box

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Liaw, Andy
Sent: Tuesday, December 06, 2005 12:30 PM
To: 'Charles H. Franklin'; r-help@stat.math.ethz.ch
Subject: Re: [R] Matrix of dummy variables from a factor

See ?model.matrix.

HTH,
Andy

From: Charles H. Franklin

  What is a simple way to convert a factor into a matrix of dummy variables?

    fm <- lm(y ~ f)

  where f is a factor takes care of this in the estimation. I'd like to save the result of expanding f into a matrix for later use.

  Thanks.

  Charles

  --
  Charles H. Franklin
  Professor, Political Science
  University of Wisconsin, Madison
  [EMAIL PROTECTED] [EMAIL PROTECTED]
  608-263-2022 (voice) 608-265-2663 (fax)