[R] acf lag1 value
Hi R,

I have a doubt.

  x = c(4,5,6,3,2,4,5)
  acf(x, plot=F, lag.max=1)

  Autocorrelations of series 'x', by lag

      0     1
  1.000 0.182

But if I actually calculate the autocorrelation at lag 1, I get

  cor(x[-1], x[-length(x)])
  [1] 0.1921538

Even in Excel I get the value 0.1921538. So I want to know what the 'acf' function is calculating here.

Thanks in advance,
Shubha Karanth

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] multiple text placements and expressions revisited
Hi all,

I asked something like this earlier but decided that a proper minimal example might be helpful ;0)

Why does this work with regard to the expression (substitution):

  require(stats)
  plot(cars)
  text(5,120,labels=substitute(i^{z+phantom()}*"("*a*" AMU)",list(i="yx",z=2,a=0)))
  text(c(5,5),c(115,110),labels=c("One","Two"))

But adding this (using a vector of expressions/substitutions) fails to print the expression correctly:

  text(c(5,5),c(105,100),labels=c(substitute(i^{z+phantom()}*"("*a*" AMU)",list(i="yx",z=2,a=0)),"Three"))

This is a bug, no?

Joh
Re: [R] acf lag1 value
Please re-check your time-series books. The acf at lag 1 is _not_ the correlation between x and lag(x). For one thing, the variance of x is computed from the whole series, and not from the series with either the first or last value removed -- there is also the question of the divisor. See MASS p.390 for the formulae used.

On Thu, 17 Jan 2008, Shubha Vishwanath Karanth wrote:

> Hi R, I have doubt.
>   x= c(4,5,6,3,2,4,5)
>   acf(x,plot=F,lag.max=1)
>   Autocorrelations of series 'x', by lag
>       0     1
>   1.000 0.182
> But if I actually calculate the autocorrelation at lag1 I get,

Not the right formula.

>   cor(x[-1],x[-length(x)])
>   [1] 0.1921538
> Even in excel I get 0.1921538 value. So, I want to know what the 'acf'
> function is calculating here
> Thanks in advance, Shubha Karanth

-- 
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
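
The point about the whole-series mean and the divisor can be checked by hand. What follows is a minimal sketch of the usual sample autocorrelation estimator, not a copy of the code inside acf(); the names d, c0, c1 and r1 are illustrative and do not come from the thread. Both the lag-0 and lag-1 autocovariances are taken around the mean of the full series and divided by n, so the divisor cancels in the ratio.

```r
x <- c(4, 5, 6, 3, 2, 4, 5)
n <- length(x)

d  <- x - mean(x)               # deviations from the mean of the WHOLE series
c0 <- sum(d^2) / n              # lag-0 autocovariance (divisor n)
c1 <- sum(d[-1] * d[-n]) / n    # lag-1 autocovariance, same mean, same divisor
r1 <- c1 / c0                   # lag-1 autocorrelation

round(r1, 3)                    # 0.182, matching the acf() output in the thread

# Value reported by acf() itself, for comparison
r_acf <- acf(x, plot = FALSE, lag.max = 1)$acf[2]

# The Pearson correlation of the two shifted sub-series uses separate means
# and variances for x[-1] and x[-n], hence the different value
r_cor <- cor(x[-1], x[-n])
```

The two numbers differ because cor() re-centres and re-scales each six-point sub-series separately, whereas the acf estimator treats the seven points as one series.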
Re: [R] multiple text placements and expressions revisited
On Thu, 17 Jan 2008, Johannes Graumann wrote:

> [...]
> But adding this (using a vector of expressions/substitutions) fails to
> print the expression correctly:
>
>   text(c(5,5),c(105,100),labels=c(substitute(i^{z+phantom()}*"("*a*" AMU)",list(i="yx",z=2,a=0)),"Three"))
>
> This is a bug, no?

Yes, but not where you appear to think it is. 'labels' is not an expression: check it by typeof(). Using expression() in place of c() will give what I think you intended.

-- 
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford
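
The typeof() check suggested above can be sketched as follows. The label text is reconstructed from the quoted call, so treat the exact strings as an assumption; the variable names lab, mixed and labs are illustrative. as.expression() is used here as one way to get an expression vector out of an already substituted call, which is in the spirit of "expression() in place of c()" but is not necessarily the exact call the poster ended up with.

```r
# A single substituted call: a language object that text() can render
lab <- substitute(i^{z + phantom()} * "(" * a * " AMU)",
                  list(i = "yx", z = 2, a = 0))
typeof(lab)      # "language"

# c() on a call and a string does not build an expression vector;
# it falls back to a plain list
mixed <- c(lab, "Three")
typeof(mixed)    # "list"

# Converting the list gives an expression vector with one element per label
labs <- as.expression(mixed)
typeof(labs)     # "expression"
length(labs)     # 2
```

Passing labs as the labels argument of text(), with two x and two y positions, should then draw the math annotation and the plain string as separate labels.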
Re: [R] multiple text placements and expressions revisited
Thanks for your help! Works like a charm now - I can even append to an expression object as if it were plain 'c()' ...

Joh

Prof Brian Ripley wrote:

> Yes, but not where you appear to think it is. 'labels' is not an
> expression: check it by typeof(). Using expression() in place of c()
> will give what I think you intended.
Re: [R] things that are difficult/impossible to do in SAS or SPSS but simple in R
I wonder if those who complain about SAS as a programming environment have discovered SAS/IML, which provides a programming environment akin to Matlab and is more than capable (at least for those problems which can be treated with a matrix-like approach). As someone who uses both SAS and R: graphical output is so much easier in R, but for handling large 'messy' datasets SAS wins hands down...

Cheers
Rob

Dr Rob Robinson, Senior Population Biologist
British Trust for Ornithology, The Nunnery, Thetford, Norfolk, IP24 2PU
W: http://www.bto.org

-----Original Message-----
From: Jeffrey J. Hallman
Sent: 16 January 2008 22:38
Subject: Re: [R] things that are difficult/impossible to do in SAS or SPSS but simple in R

> SAS has no facilities for date arithmetic and no easy way to build it
> yourself. In fact, that's the biggest problem with SAS: it stinks as a
> programming environment, so it's always much more difficult than it
> should be to do something new. As soon as you get away from the canned
> procs and have to write something of your own, SAS falls down.
>
> I don't know enough about SPSS to comment.
> --
> Jeff
Re: [R] odfWeave and xtable
Yes, but we lose the possibility of using a 'simple' text editor. For example, I like to use emacs+ess to edit and evaluate R code, and to write my report in the same editor. I like the idea that the input format could be written with a simple text editor, and the output format chosen afterwards. I know that the syntax of the input file could be too difficult for newbie users (that's why odfWeave is very useful), but this can be bypassed with a specific editor like kile for latex...

2008/1/16, Max Kuhn wrote:

> 2008/1/16 David Hajage wrote:
>> Ps : I would like to know if there is an R project to include all
>> existing format outputs (latex with Sweave, odf with odfWeave, html
>> with rWeaveHTML) and all the wonderful work of their author in a same
>> package or in a same project. All of these use a very similar syntax
>> (<<foo>>= R code @), but there is anyway a lot of work to rewrite the
>> R code to make a file written for Sweave work with odfWeave. A latex
>> file can be convert to latex or rtf, but it depends on external
>> programs not very easy to use. For example, I would imagine an input
>> similar to the syntax of help files (.Rd), but the R code could be
>> include (<<foo>>= R code @), and a
>
> When developing odfWeave, I did run into issues that could not be taken
> into account with Sweave. That is why odfWeave calls Sweave instead of
> using Sweave directly (as you would with the R2HTML package). It would
> be nice to have a more uniform/expanded interface in Sweave that can
> more naturally accommodate other types of markup (beyond writing
> drivers).
>
>> Sweave-like function could replace R code with results and convert the
>> file into latex, html or odf.
>
> I *think* that odfWeave with OpenOffice/NeoOffice can do that. I think
> I saw somewhere that newer versions of OO support LaTeX, but clearly I
> haven't looked into it. I do know that using OO as a conversion tool to
> rtf, doc, pdf and html works very well.
>
> --
> Max
[R] aaMI
Hi,

I am new to the R language. I want to use the aaMI package, which calculates the amino acid mutual interaction for a given protein sequence. I have installed the package, but when I run the program it gives me the error "could not find function aaMI". Can anyone tell me what the problem might be?
Re: [R] using table function
Ricardo Perrone wrote:

> Hi,
> How to join two large vectors ordered, where one has the variable's
> levels and another has the frequencies, in way similar to that showing
> by table function in R console? and considering this two vectors how to
> use summary function to produce statistical informations like mean, sd,
> min, max and quartile?
> Thanks
> Ricardo

Hi Ricardo,

I'll guess that you mean two vectors like this:

  V1:  1  2  3  4  5
  V2: 10 21 32 45  5

  cat(formatC(V1,width=8),"\n",formatC(V2,width=8),"\n",sep="")

Then you might want:

  datavec<-rep(V1,V2)

and perform the summaries on that vector.

Jim
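
Jim's suggestion can be run end to end as below. This is a small sketch using the level and frequency values guessed in his reply; wmean is an illustrative name added here to show that the expanded vector and the frequency-weighted formula agree.

```r
V1 <- 1:5                        # levels of the variable
V2 <- c(10, 21, 32, 45, 5)       # frequency of each level

# Expand to one observation per count: ten 1s, twenty-one 2s, and so on
datavec <- rep(V1, times = V2)

table(datavec)                   # recovers the frequencies in V2
summary(datavec)                 # min, quartiles, median, mean, max
sd(datavec)                      # standard deviation of the expanded data

# The same mean without expanding, as a cross-check
wmean <- sum(V1 * V2) / sum(V2)
```

For very large counts the expanded vector may be wasteful, in which case weighted formulas like the one for wmean avoid building it at all.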
Re: [R] aaMI
> i am new to R language. I want to use aaMI package which calculates the
> amino acid mutual interaction for a given protein sequence. I had
> installed the package but when i run the program it gives me the error
> could not find function aaMI. can anyone tell me what might be the
> problem..

Did you type library(aaMI) or require(aaMI) to load the package? (Package names are case-sensitive.)

Regards,
Richie.

Mathematical Sciences Unit
HSL
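
The difference between installing and loading can be sketched like this. The helper name load_if_installed is hypothetical and not part of any package; it only wraps the base functions requireNamespace() and library(). Installing puts the package on disk once, while loading has to happen in every new session before its functions can be found.

```r
# Attach a package if it is installed; report instead of failing if it is not
load_if_installed <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    # character.only = TRUE lets library() take the name from a variable
    library(pkg, character.only = TRUE)
    TRUE
  } else {
    message("Package '", pkg, "' is not installed in this library path")
    FALSE
  }
}

ok_stats <- load_if_installed("stats")   # shipped with R, so this succeeds
ok_aami  <- load_if_installed("aaMI")    # TRUE only where aaMI is installed
```

Note the exact spelling "aaMI": asking for "aaMi" would look for a different package name and fail.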
[R] Intro to R :: London :: 6-7/03/2008
Mango Solutions are pleased to announce the above course in London as part of our schedule for Q1 2008.

---
Introduction to R and R Programming - 6-7th March 2008
---

* Who Should Attend?

This is a course suitable for beginners and improvers in the R language and is ideal for people wanting an all-round introduction to R.

* Course Goals

- To allow attendees to understand the technology behind the R package
- Improve attendees' programming style and confidence
- To enable users to access a wide range of available functionality
- To enable attendees to program in R within their own environment
- To understand how to embed R routines within other applications

* Course Outline

1. Introduction to the R language and the R community
2. The R Environment
3. R data objects
4. Using R functions
5. The apply family of functions
6. Writing R functions
7. Standard Graphics
8. Advanced Graphics
9. R Statistics

The cost of these courses is £900 for commercial attendees and £500 for academic attendees.

Should your organization have more than 3 possible attendees, why not talk to us about hosting a customized and focused course delivered at your premises?

Details of further courses in alternative locations are available at http://www.mango-solutions.com/services/rtraining/r_schedule.html

Should you want to book a place on this course or have any questions please contact [EMAIL PROTECTED]

-- 
Mango Solutions
data analysis that delivers
Tel: +44(0) 1249 467 467
Fax: +44(0) 1249 467 468
Mob: +44(0) 7813 526 123
Re: [R] Using a data frame to create a legend
> If I understand your question:
>
>   x <- rnorm(100)
>   plot(x)
>   legend("topright", capture.output(t(t(summary(x)))))

Thank you for your help, but I'm afraid that is not what I meant.

As an example of what I am trying to say, imagine I had grouped a whole bunch of people into 4 age ranges and then plotted time (0-24h) against their heart rates (can't think of a better example). So each of the four age ranges is represented by one line in the plot (i.e. the plot has 4 lines). But say I had also calculated the area under each curve, the average of each curve and the expected time at which their heart rate was lowest for each curve, and that I wanted to express this information on the graph in a table that lined up with the legend. So the legend now has a column for age range, AUC, mean, E(Tmin) and finally a column depicting what each age range is represented by on the graph (the col and pch).

Unfortunately, even though I have this working in S-Plus, I haven't achieved quite the same result in R. Any help would be appreciated.
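
One possible way to approach this in R, offered only as a sketch since the thread does not record a solution: format each row of the summary data frame into a fixed-width string and pass those strings to legend(). The data frame stats_df and every value in it are made up purely for illustration, and the column names are assumptions.

```r
# Illustrative (invented) per-group summaries; replace with real results
stats_df <- data.frame(
  age  = c("18-30", "31-45", "46-60", "61+"),
  auc  = c(1650.2, 1702.5, 1755.0, 1801.3),
  mean = c(68.8, 70.9, 73.1, 75.1),
  tmin = c(3.5, 4.0, 4.2, 4.5),
  stringsAsFactors = FALSE
)

# Pad every column to a common width so the rows line up as a table
labels <- paste(
  format(stats_df$age),                                      # left-justified text
  formatC(stats_df$auc,  format = "f", digits = 1, width = 8),
  formatC(stats_df$mean, format = "f", digits = 1, width = 6),
  formatC(stats_df$tmin, format = "f", digits = 1, width = 5)
)
labels
```

After drawing the four curves, a call such as legend("topright", legend = labels, col = 1:4, pch = 1:4, lty = 1) would put one row per age range next to its line and symbol. The columns only align visually with a monospaced font, for example after par(family = "mono").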
Re: [R] aaMI
Hi Navish,

did you run require(aaMI) ?

Cheers
Andrew

On Thu, Jan 17, 2008 at 02:17:12AM -0800, navish wrote:

> hi i am new to R language. I want to use aaMI package which calculates
> the amino acid mutual interaction for a given protein sequence. I had
> installed the package but when i run the program it gives me the error
> could not find function aaMI. can anyone tell me what might be the
> problem..

-- 
Andrew Robinson
Department of Mathematics and Statistics
University of Melbourne, VIC 3010 Australia
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/
Re: [R] acf lag1 value
Thank you Professor...

Shubha Karanth | Amba Research
www.ambaresearch.com

-----Original Message-----
From: Prof Brian Ripley
Sent: Thursday, January 17, 2008 2:27 PM
To: Shubha Vishwanath Karanth
Subject: Re: [R] acf lag1 value

> Please re-check your time-series books. The acf at lag 1 is _not_ the
> correlation between x and lag(x). For one thing, the variance of x is
> computed from the whole series, and not from the series with either the
> first or last value removed -- there is also the question of the
> divisor. See MASS p.390 for the formulae used.
Re: [R] aaMI
Dear Richie,

Thank you. Yes, it works after loading the package. Can you please tell me where the input file should be located when it is read in for the aaMI function?

Richard Cotton wrote:

> Did you type library(aaMi) or require(aaMi) to load the package?
>
> Regards, Richie.
Re: [R] using an element of an array as a new object
Thank you for the replies. assign() works.

  for (i in 1:7)
    assign(filesBox[i,1],
           read.table(paste(dir2, filesBox[i,1], sep=""), header = FALSE))
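
For comparison, the same reading loop can keep the tables in one named list instead of creating seven separate objects with assign(). This is only a sketch of an alternative, not something proposed in the thread; dir2 and the file names below are stand-ins created in a temporary directory so that the example is self-contained.

```r
# Stand-in setup: write two small files so there is something to read
dir2  <- tempdir()                        # hypothetical data directory
files <- c("a.txt", "b.txt")              # hypothetical file names
for (f in files) {
  write.table(data.frame(x = 1:3, y = c(2.5, 3.5, 4.5)),
              file = file.path(dir2, f),
              row.names = FALSE, col.names = FALSE)
}

# Read every file into one list element, named after the file
tables <- lapply(files, function(f) {
  read.table(file.path(dir2, f), header = FALSE)
})
names(tables) <- files

tables[["a.txt"]]                         # access by the original file name
```

A list keeps the workspace tidy and lets later steps loop over all tables with lapply() or sapply() without calling get() on constructed names.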
Re: [R] aaMI
> Thank you. yes it works after loading the package. can you please tell
> me what should be the location of the file during uploading for aaMI
> function.

You can use an absolute path to a file, e.g.

  c:/source/project/myrproject/myfile.r

Or a relative path from the current working directory. The current working directory can be found using

  getwd()

Then you can navigate to the directory where your file lies using standard folder navigation commands ('..' for parent directory, etc.), e.g.

  ../data directory/mydatafile.csv

You might also want to read the 'Introduction to R' and 'R Data Import/Export' manuals.

http://cran.r-project.org/doc/manuals/R-intro.pdf
http://cran.r-project.org/doc/manuals/R-data.pdf

Regards,
Richie.

Mathematical Sciences Unit
HSL
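
The path handling described above can be tried out with a few base functions. This is a minimal sketch; the directory and file names are the illustrative ones from the message and are not expected to exist on any particular machine.

```r
wd <- getwd()                    # current working directory, as a string

# Build a relative path piece by piece; file.path() joins with "/"
rel <- file.path("..", "data directory", "mydatafile.csv")
rel                              # "../data directory/mydatafile.csv"

# Resolve it against the working directory without requiring the file to exist
abs_path <- normalizePath(rel, mustWork = FALSE)

# Check before reading, so a wrong location gives a clear answer
found <- file.exists(rel)
```

If found is FALSE, either pass an absolute path or change the working directory with setwd() before reading the file.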
Re: [R] things that are difficult/impossible to do in SAS or SPSS but simple in R
Good morning,

I use SAS and R/S-Plus as my primary tools so I have a lot of experience with these programs. By far and away, SAS is superior for handling the messy datasets, but also the very large ones. I work at times with datasets in the hundreds of thousands (and on occasion, millions) of records. SAS, and especially PROC SQL, are invaluable for this. But once I get to datasets manageable for R/S-Plus, then I ship to these tools for the programming and graphics. This seems to work great.

Walt Paczkowski
Data Analytics Corp.

-----Original Message-----
From: Rob Robinson
Sent: Jan 17, 2008 4:31 AM
Subject: Re: [R] things that are difficult/impossible to do in SAS or SPSS but simple in R

> As someone who uses both SAS and R - graphical output is so much easier
> in R, but for handling large 'messy' datasets SAS wins hands down...
Re: [R] odfWeave and xtable
David,

The value of odfWeave is not limited to newbie users. It is vastly useful for researchers in fields that do not accept LaTeX for journal paper submission (for example, sociology, demography).

Best,
Shige

On Jan 17, 2008 5:46 PM, David Hajage wrote:

> I know that the syntax of the input file could be too difficult for
> newbie users (that's why odfWeave is very usefull), but this can be
> bypass a specific editor like kile for latex...
Re: [R] things that are difficult/impossible to do in SAS or SPSS but simple in R
Rob Robinson wrote:

> I wonder if those who complain about SAS as a programming environment
> have discovered SAS/IML which provides a programming environment akin
> to Matlab which is more than capable (at least for those problems which
> can be treated with a matrix like approach). As someone who uses both
> SAS and R - graphical output is so much easier in R, but for handling
> large 'messy' datasets SAS wins hands down...
> Cheers
> Rob

My understanding is that PROC IML is disconnected from the rest of the SAS language, e.g., you can't have a loop in which PROC GENMOD is called or datasets are merged. If that's the case, IML is not very competitive in my view.

Frank Harrell
Re: [R] things that are difficult/impossible to do in SAS or SPSS but simple in R
Walter Paczkowski wrote:

> Good morning,
> I use SAS and R/S-Plus as my primary tools so I have a lot of
> experience with these programs. By far and away, SAS is superior for
> handling the messy datasets, but also the very large ones. I work at
> times with datasets in the hundreds of thousands (and on occasion,
> millions) of records. SAS, and especially PROC SQL, are invaluable for
> this. But once I get to datasets manageable for R/S-Plus, then I ship
> to these tools for the programming and graphics. This seems to work
> great.
> Walt Paczkowski
> Data Analytics Corp.

Previously I used SAS for 23 years and now R/S-Plus for 17. SAS is effective for large datasets (in my work 500,000 subjects) but except for that, R is far superior to SAS for data management and manipulation. Just four of the reasons are that you can

- merge data frames multiple ways and compare the results
- deal with arrays (lists) of datasets using high-level operators
- easily do complex calculations on serial data, such as finding the highest blood pressure per subject that is measured before something else is measured
- sense the type of a variable (character, factor, date, discrete numeric, continuous numeric, etc.) while analyzing it, and tailor the analysis to the type of variable

http://biostat.mc.vanderbilt.edu/twiki/pub/Main/RS/sintro.pdf has a large section on data manipulation in S.

Frank
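
The serial-data case mentioned in the message (highest blood pressure per subject measured before something else is measured) can be sketched in base R as follows. The data frames bp and event, their column names and all values are invented for illustration; this is one way to do it, not the approach of the linked document.

```r
# Toy serial measurements: several blood pressure readings per subject
bp <- data.frame(
  id  = c(1, 1, 1, 2, 2, 2),
  day = c(1, 5, 9, 2, 4, 8),
  sbp = c(120, 150, 135, 110, 160, 140)
)

# The "something else": one event day per subject
event <- data.frame(id = c(1, 2), event_day = c(6, 3))

# Attach each subject's event day to every reading
m <- merge(bp, event, by = "id")

# Keep only readings taken before the event
before <- m[m$day < m$event_day, ]

# Highest pre-event reading per subject
max_before <- aggregate(sbp ~ id, data = before, FUN = max)
max_before
```

Subject 1 has readings on days 1 and 5 before its event on day 6, so the result is 150; subject 2 has only the day-2 reading before day 3, giving 110.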
Re: [R] Rmpi on Linux x86_64 GNU/Linux
Brian,

On 16 January 2008 at 11:26, Brian O'Gorman wrote:
| I'm having trouble with R CMD INSTALL Rmpi_0.5-5.tar.gz
| --configure-args=~/lam
|
| lam is installed locally.
| lamboot -d (or lamboot-d and also recon) works. make -k check from the
| lamtest suite passes all tests.
| Is this a problem with the -fPIC compiler as in the message? Should it
| be modified in the Makefile?
| Any help or comments are appreciated, thanks.

Rmpi has been in Debian, and hence on more than ten architectures, including several 64-bit ones, for a few years now. While that doesn't benefit you directly if you're not on a Debian system (or on a derivative like Ubuntu), you could look at the configuration we use. The .diff.gz files contain the file debian/rules, a Makefile that governs how we call configure etc.

For Rmpi, we just call 'R CMD INSTALL' as usual. So your problem may well be with your local LAM library. See http://packages.debian.org/lam4 for the LAM configuration. We still use LAM 7.1.2.

I do recall, however, that I had difficulties building Rmpi (locally at work) with LAM 7.1.3 and 7.1.4 before we switched to Open MPI. With current Open MPI and Rmpi packages, everything just works for me.

Hope this helps, ping me offline if you have questions.

Regards, Dirk

--
Three out of two people have difficulties with fractions.
Re: [R] exact method in coxph
> The help says that the exact method is computationally demanding, but even after days the computation won't finish. Also, if I include a frailty term, the exact method gives me results in no time. Is my setup incorrect?

Assume that at some particular time point there are k deaths and n subjects at risk. The exact partial likelihood calculation for that time point involves an average of "n choose k" terms. This gets very big very fast. For instance, 20 events on day x among 100 subjects involves 100!/(20! 80!), roughly 5e20, terms.

I stopped ever using the exact likelihood when I realized that

- a particular coxph model was taking a long time, so I did a back-of-the-envelope approximation and found that the expected compute time was several years
- for all the cases that were small enough to finish, the Efron approximation was very close.

When there are penalized terms such as pspline() or frailty(), the code only chooses between the Breslow and Efron approximations. The code should have issued an error message when you specified the exact method -- this is an oversight that I will fix. Since the relevant line of code in coxpenal.fit is

  if (method=='efron')

your frailty fit was done using the Breslow approximation.

Terry Therneau
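The size of the count above is easy to check in R. This is only a back-of-the-envelope sketch: the throughput figure `terms_per_sec` is an arbitrary assumption chosen for illustration, not a measurement of coxph.

```r
n <- 100   # subjects at risk at the tied event time
k <- 20    # deaths at that time

# Number of terms in the exact partial likelihood at this time point
terms <- choose(n, k)
terms   # about 5.36e20

# Assumed (hypothetical) evaluation rate: one billion terms per second
terms_per_sec <- 1e9
years <- terms / terms_per_sec / (3600 * 24 * 365)
years
```

Even at that optimistic rate the single tied time point would take many thousands of years, which is consistent with the advice to rely on the Efron approximation when ties are heavy.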
[R] non-plot plotting
I really do not know how else to title this ...

I want to draw something like the attached png with R and would like to poll you on how to start: make an empty plot first, then position the character string with 'text', and then draw the lines?

Joh

[attachment: sequence.png]
Re: [R] things that are difficult/impossible to do in SAS or SPSS but simple in R
> Previously I used SAS for 23 years and now R/S-Plus for 17. SAS is effective for large datasets (in my work 500,000 subjects), but except for that, R is far superior to SAS for data management and manipulation. Just four of the reasons are that you can
>
> - merge data frames multiple ways and compare the results
> - deal with arrays (lists) of datasets using high-level operators
> - easily do complex calculations on serial data, such as finding the highest blood pressure per subject that is measured before something else is measured
> - sense the type of a variable (character, factor, date, discrete numeric, continuous numeric, etc.) while analyzing it, and tailor the analysis to the type of variable

And one more:

* you can trust that R will do the correct thing with missing values (propagate them by default)

Hadley

--
http://had.co.nz/
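A short sketch of what "propagate them by default" means in practice. The vector `sbp` is made up for the example.

```r
# Hypothetical measurements with one missing value
sbp <- c(120, NA, 135)

m_default <- mean(sbp)                 # NA: the missing value propagates
m_narm    <- mean(sbp, na.rm = TRUE)   # 127.5: missing values dropped on request
above     <- sbp > 130                 # FALSE NA TRUE: comparisons propagate NA too
```

The design choice is that dropping missing values has to be asked for explicitly (`na.rm = TRUE`), so a summary is not silently computed on fewer observations than expected.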
Re: [R] color ranges on a 2D plot
There must be a better way but this will do it for you.

x <- runif(100, 0, 1)
y <- runif(100, 0, 1)
z <- data.frame(x, y)
plot(subset(z, z$y >= .5), col="red", ylim=c(min(z$y), max(z$y)), pch=16)
points(subset(z, z$y <= .49), col="blue", pch=15)

--- dxc13 [EMAIL PROTECTED] wrote:
> useR's,
>
> I am trying to color the points on a scatter plot (code below) with two colors: red for values 0.5 - 1.0 and blue for 0.0 - .49. Does anyone know an easy way to do this?
>
> x <- runif(100, 0, 1)
> y <- runif(100, 0, 1)
> plot(y ~ x, pch=16)
>
> Thanks,
> dxc13
> --
> View this message in context: http://www.nabble.com/color-ranges-on-a-2D-plot-tp14893457p14893457.html
[R] Converting plots to ggplot2
Hello Hadley,

I am trying to reproduce the following with ggplot:

a <- seq(0, 360, 5)*pi/180 ; a
ac <- sin(a + (45*pi/180)) + 1 ; ac
plot(a, ac, type='b', xaxt = "n")
axis(1, at=seq(0,6,1), labels=round(seq(0,6,1)*180/pi),1)
abline(v=c(45*pi/180, 225*pi/180))

I can get the basic plot:

p <- qplot(a, ac, geom=c('point', 'line')) ; p

but cannot seem to add the vertical reference lines:

# representing NE and SW compass points
p + geom_vline(intercept=45*pi/180)
p + geom_vline(intercept=225*pi/180)

nor find a reference to manipulating the axis labels (still searching the news archives though).

Also, I would like to add additional curves to the same plot with the sequence 'asc' generated by:

s <- seq(5, 45, 10)*pi/180 ; s
asc <- lapply(s, function(x) x*cos(ac) + x*sin(ac)) ; asc

Suggestions?

Thanx, DaveT.

sessionInfo()
R version 2.6.1 (2007-11-26)
i386-pc-mingw32

locale:
LC_COLLATE=English_Canada.1252;LC_CTYPE=English_Canada.1252;LC_MONETARY=English_Canada.1252;LC_NUMERIC=C;LC_TIME=English_Canada.1252

attached base packages:
[1] datasets  tcltk     utils     stats     graphics  grDevices splines   grid
[9] methods   base

other attached packages:
[1] svGUI_0.9-5        svViews_0.9-5      svIO_0.9-5         svMisc_0.9-5
[5] R2HTML_1.58        ggplot2_0.5.2      RColorBrewer_0.2-3 MASS_7.2-34
[9] proto_0.3-7        reshape_0.7.4

loaded via a namespace (and not attached):
[1] lattice_0.14-17

Sys.info()[c(1:3,5)]
sysname   release   version                       machine
Windows   NT 5.1    (build 2600) Service Pack 2   x86

*
Silviculture Data Analyst
Ontario Forest Research Institute
Ontario Ministry of Natural Resources
[EMAIL PROTECTED]
http://ofri.mnr.gov.on.ca
Re: [R] color ranges on a 2D plot
[EMAIL PROTECTED] wrote on 17.01.2008 15:08:40:

> There must be a better way but this will do it for you.
>
> x <- runif(100, 0, 1)
> y <- runif(100, 0, 1)
> z <- data.frame(x, y)
> plot(subset(z, z$y >= .5), col="red", ylim=c(min(z$y), max(z$y)), pch=16)
> points(subset(z, z$y <= .49), col="blue", pch=15)

Another option is to index a vector of colors:

colvec <- c("blue", "red")
plot(z, col=colvec[(z$y >= .5)+1])

Regards
Petr
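Petr's indexing trick works because a logical value becomes 0 or 1 when 1 is added to it, so `(y >= .5) + 1` is 1 for "blue" and 2 for "red". A self-contained sketch follows; `set.seed()` and the `ifelse()` variant are additions for illustration and were not part of the thread.

```r
set.seed(1)                      # only so the example is reproducible
x <- runif(100, 0, 1)
y <- runif(100, 0, 1)

colvec <- c("blue", "red")
pt_col <- colvec[(y >= 0.5) + 1] # FALSE + 1 = 1 -> "blue", TRUE + 1 = 2 -> "red"

# Equivalent and arguably easier to read
pt_col2 <- ifelse(y >= 0.5, "red", "blue")

plot(x, y, col = pt_col, pch = 16)
```

Unlike the two-subset approach, a single cutoff colors every point, including any value that falls between 0.49 and 0.5.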
[R] vector generation
Dear Contributors:

I have the following vector:

Z
526 723 110 1110 34 778 614 249 14

I want to generate a vector containing the ratios of every value to every value of the Z vector. I mean a vector containing the values of 526/723, 526/110, and so on, then 723/723, 723/110, and so on for each element. Is this doable in a simple way?

Thanks in advance again,

Juan Pablo Fededa
Re: [R] Converting plots to ggplot2
Thompson, David (MNR) wrote:
> Hello Hadley,
>
> I am trying to reproduce the following with ggplot:
>
> a <- seq(0, 360, 5)*pi/180 ; a
> ac <- sin(a + (45*pi/180)) + 1 ; ac
> plot(a, ac, type='b', xaxt = "n")
> axis(1, at=seq(0,6,1), labels=round(seq(0,6,1)*180/pi),1)
> abline(v=c(45*pi/180, 225*pi/180))
>
> I can get the basic plot:
>
> p <- qplot(a, ac, geom=c('point', 'line')) ; p
>
> but cannot seem to add the vertical reference lines:
>
> # representing NE and SW compass points
> p + geom_vline(intercept=45*pi/180)
> p + geom_vline(intercept=225*pi/180)

You should add the two lines together in one expression:

p + geom_vline(intercept=45*pi/180) + geom_vline(intercept=225*pi/180)

> nor find a reference to manipulating the axis labels (still searching the news archives though).

last_plot() + scale_x_continuous(name="x axis") + scale_y_continuous(name="y axis")

Ciao,
domenico

> Also, I would like to add additional curves to the same plot with the sequence 'asc' generated by:
>
> s <- seq(5, 45, 10)*pi/180 ; s
> asc <- lapply(s, function(x) x*cos(ac) + x*sin(ac)) ; asc
>
> Suggestions?
>
> Thanx, DaveT.
[R] [R-pkgs] New version of Epi package out (1.0.7)
A new major upgrade of the Epi package for epidemiological data analysis has been put on CRAN; it is now at version 1.0.7.

It contains an entirely new way of representing follow-up data on multiple timescales and multiple states; see the function Lexis(). It also includes a lot of other useful tools for epidemiological analysis.

See more on the package homepage, www.biostat.ku.dk/~bxc/Epi

Note also that there will be a course 28 May - 2 June in Estonia; see www.biostat.ku.dk/~bxc/SPE.

Bendix Carstensen
Epi package maintainer
__
Bendix Carstensen
Senior Statistician
Steno Diabetes Center
Niels Steensens Vej 2-4
DK-2820 Gentofte
Denmark
+45 44 43 87 38 (direct)
+45 30 75 87 38 (mobile)
[EMAIL PROTECTED]
http://www.biostat.ku.dk/~bxc
___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages
Re: [R] Grouping data
You might want to have a look at the recode function in the car package. By the way, I think you meant 26-35, not 25-35.

=== Example

xx <- data.frame(age=c(25, 33, 22, 19, 21, 30, 32, 31), edu=c(2, 5, 3, 1, 3, 4, 4, 1))
library(car)
aa <- recode(xx$age, "18:25='A'; 26:35='B'") ; aa
table(xx$edu, aa)

===

--- K. Elo [EMAIL PROTECTED] wrote:
> Hi,
>
> I am quite new to R (but like it very much!), so please excuse me if this is too simple a question.
>
> I have a large data frame consisting of data from a survey. There is, for example, information about age and education (a numeric value from 1-9). Now I would like to extract the total count of each type of education within different age groups (e.g. from 18 to 25, from 25 to 35 etc.). How could I achieve this? (I have been thinking about using 'subset', but if there are better ideas they are welcome :) )
>
> An example might clarify my point. Let's assume the following data:
>
> #  age edu
> 1  25  2
> 2  33  5
> 3  22  3
> 4  19  1
> 5  21  3
> 6  30  4
> 7  32  4
> 8  31  1
>
> What I want to have is:
>
> edu  18-25  25-35  ...
> 1    1      1
> 2    1      0
> 3    2      0
> 4    0      2
> 5    0      1
>
> Thanks in advance, kind regards,
> Kimmo
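For readers without the car package, the same cross-tabulation can be sketched in base R with cut() and table(). The break points and labels below are one possible reading of the age groups (18-25 and 26-35) and are assumptions, not part of the original reply.

```r
xx <- data.frame(age = c(25, 33, 22, 19, 21, 30, 32, 31),
                 edu = c(2, 5, 3, 1, 3, 4, 4, 1))

# cut() uses right-closed intervals by default: (17, 25] and (25, 35]
xx$age_group <- cut(xx$age, breaks = c(17, 25, 35),
                    labels = c("18-25", "26-35"))

tab <- table(edu = xx$edu, age_group = xx$age_group)
tab
```

With more age bands, only the `breaks` and `labels` vectors need to grow; the call to `table()` stays the same.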
Re: [R] Help with Error
> d <- read.table("C:\\rep.csv", head=TRUE, sep=",")
> pie(d$Votes,
> +   labels=d$Name,
> +   main="Class Rep Results\n(Final Results)")
>
> Error: Error in pie(d$votes, labels = d$name, main = "Class Rep Results\n(Final Results)") : 'x' values must be positive.

The first input to the pie function represents the size of each pie slice, so its values must be non-negative. Take a look at what you have stored in the data frame d; it is possible that the data has been read into R incorrectly. Type d and str(d) at the command line to see this.

Also, you might want to use read.csv to read in your data if it is in comma-separated value format.

Regards,
Richie.

Mathematical Sciences Unit
HSL
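The checks Richie suggests can be sketched as follows. The inline text stands in for rep.csv and assumes the file really is comma-separated; the actual file contents are not shown in the thread, so treat this as an illustration only.

```r
# Hypothetical stand-in for rep.csv (assumed comma-separated)
csv_text <- "Name,Votes
John,300
Sean,222
Andy,467
Sinead,740
David,124
James,641
William,380"

d <- read.csv(text = csv_text)
str(d)   # Votes should show up as a numeric (integer) column

# pie() needs a numeric vector with no negative or missing values
stopifnot(is.numeric(d$Votes), all(d$Votes >= 0))
pie(d$Votes, labels = d$Name, main = "Class Rep Results\n(Final Results)")

# Column names are case-sensitive: a misspelled name silently gives NULL,
# and passing NULL to pie() produces the same "'x' values must be positive" error
is.null(d$votes)
```

If str(d) shows a single column instead of two, the separator passed to read.table() probably does not match the file.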
Re: [R] replace numbers in a column conditional on their value
Splendid, thanks for your quick response.

[EMAIL PROTECTED] wrote:
>> I have a data frame column in which I would like to replace some of the numbers dependent on their value.
>>
>> data frame = zz
>>
>> AveExpr   t          P.Value       FC
>> 7.481964  7.323950   1.778503e-04  2.218760
>> 7.585783  12.233056  6.679776e-06  2.155867
>> 6.953215  6.996525   2.353705e-04  1.685733
>> 7.647513  8.099859   9.512639e-05  1.674742
>> 7.285446  7.558675   1.463732e-04  1.584071
>> 6.405605  3.344031   1.276812e-02  1.541569
>>
>> I would like to replace the values in column 'FC' which are > 2 with their squared value. If I do this, however, I get a warning but it does the sum correctly.
>>
>> Warning message: number of items to replace is not a multiple of replacement length in: zz[, 4][zz[, 4] > 2] <- zz[, 4]^2
>
> Try
>
> zz$FC[zz$FC > 2] <- (zz$FC[zz$FC > 2])^2
>
> Regards,
> Richie.
>
> Mathematical Sciences Unit
> HSL
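A small sketch of why the warning appears and why the suggested form avoids it. The comparison operator was lost in the archived message, so `> 2` is an assumption here; only the FC column from the post is reproduced.

```r
zz <- data.frame(FC = c(2.218760, 2.155867, 1.685733,
                        1.674742, 1.584071, 1.541569))

idx <- zz$FC > 2          # assumed condition; TRUE for the first two rows only

# zz$FC[idx] <- zz$FC^2 would offer 6 replacement values for 2 slots, hence
# the warning. R then uses the first 2 values, which happens to be right here
# only because the selected rows come first.

# Subsetting both sides with the same index keeps the lengths equal
zz$FC[idx] <- zz$FC[idx]^2
zz
```

Storing the logical index once (`idx`) also guarantees that both sides of the assignment refer to the same rows.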
Re: [R] things that are difficult/impossible to do in SAS or SPSS but simple in R
Several people have mentioned large, messy data sets. I am curious as to in what way messy data sets are messy. (I am also curious about what SAS does that helps one deal with them, but perhaps that's asking too much.)

Thanks.

-Ben

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Paul Gilbert
> Sent: Thursday, January 17, 2008 11:39 AM
> To: [EMAIL PROTECTED]
> Subject: Re: [R] things that are difficult/impossible to do in SAS or SPSS but simple in R
>
> The argument for SAS (and Stata) when working with large datasets comes up fairly often. I have not had much experience in this area, but have been pleasantly surprised using R in combination with an SQL interface, in situations with modestly large, messy datasets. I certainly would appreciate comments on the relative merits from anyone that has more experience in this area.
>
> Paul Gilbert
Re: [R] things that are difficult/impossible to do in SAS or SPSS but simple in R
The argument for SAS (and Stata) when working with large datasets comes up fairly often. I have not had much experience in this area, but have been pleasantly surprised using R in combination with an SQL interface, in situations with modestly large, messy datasets. I certainly would appreciate comments on the relative merits from anyone that has more experience in this area.

Paul Gilbert

Walter Paczkowski wrote:
> Good morning,
>
> I use SAS and R/S-Plus as my primary tools, so I have a lot of experience with these programs. By far and away, SAS is superior for handling the messy datasets, but also the very large ones. I work at times with datasets in the hundreds of thousands (and on occasion, millions) of records. SAS, and especially PROC SQL, are invaluable for this. But once I get to datasets manageable for R/S-Plus, then I ship to these tools for the programming and graphics. This seems to work great.
>
> Walt Paczkowski
> Data Analytics Corp.

This email may contain privileged and/or confidential information, and the Bank of Canada does not waive any related rights. Any distribution, use, or copying of this email or the information it contains by other than the intended recipient is unauthorized. If you received this email in error please delete it immediately from your system and notify the sender promptly by email that you have done so.
[R] Help with Error
Hi,

I am having trouble with an error I keep getting. I am just trying to create a simple pie chart from a small table. Hope someone can help. I am new to R.

Table:

Name     Votes
John     300
Sean     222
Andy     467
Sinead   740
David    124
James    641
William  380

Commands:

d <- read.table("C:\\rep.csv", head=TRUE, sep=",")
pie(d$Votes,
+   labels=d$Name,
+   main="Class Rep Results\n(Final Results)")

Error:

Error in pie(d$votes, labels = d$name, main = "Class Rep Results\n(Final Results)") : 'x' values must be positive.

Hope to hear from someone soon.

Best Regards,
John.
--
View this message in context: http://www.nabble.com/Help-with-Error-tp14923519p14923519.html
Re: [R] vector generation
See ?outer

outer(Z, Z, function(x,y) x/y)

Gabor

On Thu, Jan 17, 2008 at 01:24:33PM -0300, Juan Pablo Fededa wrote:
> Dear Contributors:
>
> I have the following vector:
>
> Z
> 526 723 110 1110 34 778 614 249 14
>
> I want to generate a vector containing the ratios of every value to every value of the Z vector. I mean a vector containing the values of 526/723, 526/110, and so on, then 723/723, 723/110, and so on for each element. Is this doable in a simple way?
>
> Thanks in advance again,
>
> Juan Pablo Fededa

--
Csardi Gabor [EMAIL PROTECTED] UNIL DGM
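A runnable version of the suggestion, using the values from the question. outer() returns a matrix; flattening it to a vector in the order described in the question (all ratios for 526 first, then for 723, and so on) is an extra step added here for illustration.

```r
Z <- c(526, 723, 110, 1110, 34, 778, 614, 249, 14)

# ratios[i, j] is Z[i] / Z[j]; "/" can be passed directly as the function
ratios <- outer(Z, Z, "/")

# as.vector() reads a matrix column by column, so transpose first to get
# 526/526, 526/723, 526/110, ..., 723/526, 723/723, ...
ratio_vec <- as.vector(t(ratios))
```

The diagonal holds the self-ratios (all 1); these can be dropped with `ratios[row(ratios) != col(ratios)]` if they are not wanted, although that indexing returns the remaining values in column order.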
Re: [R] things that are difficult/impossible to do in SAS or SPSS but simple in R
Would it be possible for Matthew (the original poster) to tell us what he ended up with for his final talk, please?

Thanks,
Erin

On Jan 17, 2008 10:45 AM, Wittner, Ben, Ph.D. [EMAIL PROTECTED] wrote:
> Several people have mentioned large, messy data sets. I am curious as to in what way messy data sets are messy. (I am also curious about what SAS does that helps one deal with them, but perhaps that's asking too much.)
>
> Thanks.
>
> -Ben
Re: [R] things that are difficult/impossible to do in SAS or SPSS but simple in R
Happy families are all alike; every unhappy family is unhappy in its own way. Leo Tolstoy and every messy data is messy in its own way - it's easy to define the characteristics of a clean dataset (rows are observations, columns are variables, columns contain values of consistent types). If you start to look at real life data you'll see every way you can imagine data being messy (and many that you can't)! Hadley On Jan 17, 2008 11:45 AM, Wittner, Ben, Ph.D. [EMAIL PROTECTED] wrote: Several people have mentioned large, messy data sets. I am curious as to in what way messy data sets are messy. (I am also curious about what SAS does that helps one deal with them, but perhaps that's asking too much.) Thanks. -Ben -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Paul Gilbert Sent: Thursday, January 17, 2008 11:39 AM To: [EMAIL PROTECTED] Subject: Re: [R] things that are difficult/impossible to do in SAS or SPSSbut simple in R The argument for SAS (and Stata) when working with large dataset comes up fairly often. I have not had much experience in this area, but have been pleasantly surprised using R in combination with an SQL interface, in situations with modestly large, messy datasets. I certainly would appreciate comments on the relative merits from anyone that has more experience in this area. Paul Gilbert Walter Paczkowski wrote: Good morning, I use SAS and R/S-Plus as my primary tools so I have a lot of experience with these programs. By far and away, SAS is superior for handling the messy datasets, but also the very large ones. I work at times with datasets in the hundreds of thousands (and on occasion, millions) of records. SAS, and especially PROC SQL, are invaluable for this. But once I get to datasets manageable for R/S-Plus, then I ship to these tools for the programming and graphics. This seems to work great. Walt Paczkowski Data Analytics Corp. 
-Original Message- From: Rob Robinson [EMAIL PROTECTED] Sent: Jan 17, 2008 4:31 AM To: [EMAIL PROTECTED] Subject: Re: [R] things that are difficult/impossible to do in SAS or SPSSbut simple in R I wonder if those who complain about SAS as a programming environment have discovered SAS/IML which provides a programming environment akin to Matlab which is more than capable (at least for those problems which can be treated with a matrix like approach). As someone who uses both SAS and R - graphical output is so much easier in R, but for handling large 'messy' datasets SAS wins hands down... Cheers Rob *** Want to know about Britain's birds? Try www.bto.org/birdfacts *** Dr Rob Robinson, Senior Population Biologist British Trust for Ornithology, The Nunnery, Thetford, Norfolk, IP24 2PU Ph: +44 (0)1842 750050 E: [EMAIL PROTECTED] Fx: +44 (0)1842 750030 W: http://www.bto.org How can anyone be enlightened, when truth is so poorly lit = -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jeffrey J. Hallman Sent: 16 January 2008 22:38 To: [EMAIL PROTECTED] Subject: Re: [R] things that are difficult/impossible to do in SAS or SPSSbut simple in R SAS has no facilities for date arithmetic and no easy way to build it yourself. In fact, that's the biggest problem with SAS: it stinks as a programming environment, so it's always much more difficult than it should be to do something new. As soon as you get away from the canned procs and have to write something of your own, SAS falls down. I don't know enough about SPSS to comment. -- Jeff __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
Re: [R] as.integer question
Thanks to all! This is really helpful!

Sincerely,
Erin

On Jan 17, 2008 12:00 PM, Marc Schwartz [EMAIL PROTECTED] wrote:

Erin Hodgess wrote:

Hi R People:

I'm reading Statistical Computing with R, by Maria Rizzo, and it's really good. Anyhow, I have a question about something in there.

    u <- runif(5)
    u
    [1] 0.1177041 0.4271790 0.4601597 0.2204846 0.4051473

    # in the book
    sum(as.integer(u > 0.4))
    [1] 3

    # what I would do
    sum(u > 0.4)
    [1] 3

Is one way better than the other, please?

Thanks,
Erin

There is additional coercion overhead in the first approach, since as.integer() is called separately:

    set.seed(1)
    Vec <- sample(c(TRUE, FALSE), 100, replace = TRUE)

    system.time(sum(Vec))
       user  system elapsed
      0.004   0.000   0.025

    system.time(sum(as.integer(Vec)))
       user  system elapsed
      0.013   0.019   0.050

To paraphrase a financial quote: A microsecond here, a microsecond there and pretty soon you are talking about a serious amount of time... ;-)

HTH,
Marc Schwartz

--
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: [EMAIL PROTECTED]
Re: [R] things that are difficult/impossible to do in SAS or SPSS but simple in R
Clint Bowman wrote:

So how does SAS compare with one of the specialty languages such as perl? I've found the combination of perl and R to work quite satisfactorily (as long as I don't confuse the syntax and functions available in each.)

Now that the topic has drifted off the subject of what stats things you can do in R that you can't do in SAS, I feel I can chime in on a couple of things that are impossible or difficult to do in SAS:

1. Look at the source code. Not impossible with SAS, you'll just need to get a job with them, and then probably get some security clearance. Or disassemble and reverse-engineer the binary, which is probably illegal. Impossible? Near enough.

2. Give a copy to your students. Again, not impossible with SAS, you'll just have to copy the CDs, write down the license code, and let them accidentally fall into your student's handbag.

Of course if you can show that SAS has the power of a Turing machine then nothing computable is impossible...

Barry
Re: [R] things that are difficult/impossible to do in SAS or SPSS but simple in R
Thanks Hadley. Witty! Profound! Concise! I think this is definitely a Fortunes candidate. -- Bert Gunter Genentech Nonclinical Statistics -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of hadley wickham Sent: Thursday, January 17, 2008 9:56 AM To: Wittner, Ben, Ph.D. Cc: [EMAIL PROTECTED] Subject: Re: [R] things that are difficult/impossible to do in SAS orSPSSbut simple in R Happy families are all alike; every unhappy family is unhappy in its own way. Leo Tolstoy and every messy data is messy in its own way - it's easy to define the characteristics of a clean dataset (rows are observations, columns are variables, columns contain values of consistent types). If you start to look at real life data you'll see every way you can imagine data being messy (and many that you can't)! Hadley On Jan 17, 2008 11:45 AM, Wittner, Ben, Ph.D. [EMAIL PROTECTED] wrote: Several people have mentioned large, messy data sets. I am curious as to in what way messy data sets are messy. (I am also curious about what SAS does that helps one deal with them, but perhaps that's asking too much.) Thanks. -Ben -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Paul Gilbert Sent: Thursday, January 17, 2008 11:39 AM To: [EMAIL PROTECTED] Subject: Re: [R] things that are difficult/impossible to do in SAS or SPSSbut simple in R The argument for SAS (and Stata) when working with large dataset comes up fairly often. I have not had much experience in this area, but have been pleasantly surprised using R in combination with an SQL interface, in situations with modestly large, messy datasets. I certainly would appreciate comments on the relative merits from anyone that has more experience in this area. Paul Gilbert Walter Paczkowski wrote: Good morning, I use SAS and R/S-Plus as my primary tools so I have a lot of experience with these programs. By far and away, SAS is superior for handling the messy datasets, but also the very large ones. 
I work at times with datasets in the hundreds of thousands (and on occasion, millions) of records. SAS, and especially PROC SQL, are invaluable for this. But once I get to datasets manageable for R/S-Plus, then I ship to these tools for the programming and graphics. This seems to work great. Walt Paczkowski Data Analytics Corp. -Original Message- From: Rob Robinson [EMAIL PROTECTED] Sent: Jan 17, 2008 4:31 AM To: [EMAIL PROTECTED] Subject: Re: [R] things that are difficult/impossible to do in SAS or SPSSbut simple in R I wonder if those who complain about SAS as a programming environment have discovered SAS/IML which provides a programming environment akin to Matlab which is more than capable (at least for those problems which can be treated with a matrix like approach). As someone who uses both SAS and R - graphical output is so much easier in R, but for handling large 'messy' datasets SAS wins hands down... Cheers Rob *** Want to know about Britain's birds? Try www.bto.org/birdfacts *** Dr Rob Robinson, Senior Population Biologist British Trust for Ornithology, The Nunnery, Thetford, Norfolk, IP24 2PU Ph: +44 (0)1842 750050 E: [EMAIL PROTECTED] Fx: +44 (0)1842 750030 W: http://www.bto.org How can anyone be enlightened, when truth is so poorly lit = -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jeffrey J. Hallman Sent: 16 January 2008 22:38 To: [EMAIL PROTECTED] Subject: Re: [R] things that are difficult/impossible to do in SAS or SPSSbut simple in R SAS has no facilities for date arithmetic and no easy way to build it yourself. In fact, that's the biggest problem with SAS: it stinks as a programming environment, so it's always much more difficult than it should be to do something new. As soon as you get away from the canned procs and have to write something of your own, SAS falls down. I don't know enough about SPSS to comment. 
-- Jeff
Re: [R] Converting plots to ggplot2
On Jan 17, 2008 9:53 AM, Thompson, David (MNR) [EMAIL PROTECTED] wrote:

Hello Hadley,

I am trying to reproduce the following with ggplot:

    a <- seq(0, 360, 5)*pi/180 ; a
    ac <- sin(a + (45*pi/180)) + 1 ; ac
    plot(a, ac, type='b', xaxt = "n")
    axis(1, at=seq(0,6,1), labels=round(seq(0,6,1)*180/pi),1)
    abline(v=c(45*pi/180, 225*pi/180))

I can get the basic plot:

    p <- qplot(a, ac, geom=c('point', 'line')) ; p

but cannot seem to add the vertical reference lines:

    # representing NE and SW compass points
    p + geom_vline(intercept=45*pi/180)
    p + geom_vline(intercept=225*pi/180)

nor find a reference to manipulating the axes labels (still searching the news archives though).

Also, I would like to add additional curves to the same plot with the sequence 'asc' generated by:

    s <- seq(5, 45, 10)*pi/180 ; s
    asc <- lapply(s, function(x) x*cos(ac) + x*sin(ac)) ; asc

Try this:

    df <- data.frame(s, asc)
    p + geom_path(aes(x=s, y=asc), data=df)

I think Domenico answered your other questions (thanks Domenico!)

Hadley

--
http://had.co.nz/
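For readers on a current ggplot2, the base-graphics plot from the question can be sketched roughly as follows. This is not the answer given in the thread: it assumes a recent ggplot2, where the vertical-line argument is named xintercept rather than the intercept used in the 2008 code above, and the data frame name d and the vector ref are made up for the illustration.

```r
# Data from the question: angles in radians and the shifted sine curve.
a  <- seq(0, 360, 5) * pi / 180
ac <- sin(a + 45 * pi / 180) + 1
d  <- data.frame(a = a, ac = ac)   # 'd' is a name chosen for this sketch

# NE and SW compass points, in radians ('ref' is also just for this sketch).
ref <- c(45, 225) * pi / 180

# Only draw if ggplot2 is installed; the data preparation above is base R.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(d, aes(x = a, y = ac)) +
    geom_point() +
    geom_line() +
    geom_vline(xintercept = ref) +
    # Tick positions stay in radians; labels are shown in degrees.
    scale_x_continuous(breaks = 0:6,
                       labels = as.character(round(0:6 * 180 / pi)))
  print(p)
}
```

The axis relabelling mirrors the axis() call in the question: breaks are placed at 0 to 6 radians and labelled with the rounded degree values.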
Re: [R] Help with Error
Thank you very much! It now works correctly!

Much Appreciated,
John.

hoogeebear wrote:

Hi,

I am having trouble with an error I keep getting. I am just trying to create a simple pie chart from a small table. Hope someone can help. I am new to R.

Table:

    Name     Votes
    John     300
    Sean     222
    Andy     467
    Sinead   740
    David    124
    James    641
    William  380

Commands:

    d <- read.table("C:\\rep.csv", head=TRUE, sep=",")
    pie(d$Votes,
        labels=d$Name,
        main="Class Rep Results\n(Final Results)")

Error:

    Error in pie(d$votes, labels = d$name, main = "Class Rep Results\n(Final Results)") : 'x' values must be positive.

Hope to hear from someone soon

Best Regards,
John.

--
View this message in context: http://www.nabble.com/Help-with-Error-tp14923519p14925064.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] as.integer question
Erin Hodgess wrote:

Hi R People:

I'm reading Statistical Computing with R, by Maria Rizzo, and it's really good. Anyhow, I have a question about something in there.

    u <- runif(5)
    u
    [1] 0.1177041 0.4271790 0.4601597 0.2204846 0.4051473

    # in the book
    sum(as.integer(u > 0.4))
    [1] 3

    # what I would do
    sum(u > 0.4)
    [1] 3

Is one way better than the other, please?

Thanks,
Erin

There is additional coercion overhead in the first approach, since as.integer() is called separately:

    set.seed(1)
    Vec <- sample(c(TRUE, FALSE), 100, replace = TRUE)

    system.time(sum(Vec))
       user  system elapsed
      0.004   0.000   0.025

    system.time(sum(as.integer(Vec)))
       user  system elapsed
      0.013   0.019   0.050

To paraphrase a financial quote: A microsecond here, a microsecond there and pretty soon you are talking about a serious amount of time... ;-)

HTH,
Marc Schwartz
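The comparison above can be re-run along these lines. This is only a sketch: the vector length and the repeat count are arbitrary choices made here, not values from the thread, and no timings are quoted because they depend on the machine and the R version.

```r
# sum() on a logical vector counts the TRUE values directly,
# so the explicit as.integer() step does not change the result.
set.seed(1)
u <- runif(1e6)                        # size chosen for illustration only

n_direct  <- sum(u > 0.4)              # logical summed as-is
n_coerced <- sum(as.integer(u > 0.4))  # explicit coercion first

identical(n_direct, n_coerced)         # the two counts are the same integer

# Repeat each call a few times so the difference is measurable.
t_direct  <- system.time(for (i in 1:20) sum(u > 0.4))
t_coerced <- system.time(for (i in 1:20) sum(as.integer(u > 0.4)))
cat("direct  elapsed:", t_direct[["elapsed"]], "\n")
cat("coerced elapsed:", t_coerced[["elapsed"]], "\n")
```

Either form gives the same count; the shorter one simply skips allocating an intermediate integer vector.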
[R] Change R localization
Dear R-helpers, A student in my Department reports that his R is in Japanese: I lived in Japan and have a US-purchased mac with the Japanese input/ reader enabled, which must be where the Japanese came from. When I uninstalled R and reinstalled it last time (I think I completely uninstalled it, but some folders may have survived?), I installed with the Japanese option turned off. sessionInfo();localeToCharset() R version 2.5.1 (2007-06-27) powerpc-apple-darwin8.9.1 locale: ja_JP.UTF-8/ja_JP.UTF-8/ja_JP.UTF-8/C/ja_JP.UTF-8/ja_JP.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods [7] base other attached packages: lattice faraway reshape car 0.15-11 1.0.2 0.8.0 1.2-7 [1] UTF-8 EUC-JP What should he do to get to locale C? I believe that he isn't subscribed to this list, so please include a cc to him. Thanks, _ Professor Michael Kubovy University of Virginia Department of Psychology USPS: P.O.Box 400400Charlottesville, VA 22904-4400 Parcels:Room 102Gilmer Hall McCormick RoadCharlottesville, VA 22903 Office:B011+1-434-982-4729 Lab:B019+1-434-982-4751 Fax:+1-434-982-4766 WWW:http://www.people.virginia.edu/~mk9y/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] things that are difficult/impossible to do in SAS or SPSS but simple in R
Wittner, Ben, Ph.D. wrote:

Several people have mentioned large, messy data sets. I am curious as to in what way messy data sets are messy. (I am also curious about what SAS does that helps one deal with them, but perhaps that's asking too much.)

One aspect is that in the SAS culture (e.g. pharma industry), data are only allowed to get messy in ways that people know how to handle with SAS. Other data are not statistical data sets...

Typically, people like the flexibility of the DATA step in SAS; this allows things like having input data where the records have different formats depending on a code in column 1-3. Once data have been converted to rectangular data sets, there is very little you can do more conveniently in SAS than in R, the main exception could be things that truly require sequential processing beyond cumsum and cumprod. You can of course do that sort of thing in R with an explicit loop over data frame rows, but it does get slow.

On the other hand, SAS is not well suited for massively irregular data, e.g. with images inside. Not that this is an area where R shines particularly brightly, but at least it is possible to get a handle on things there.

--
Peter Dalgaard
Øster Farimagsgade 5, Entr.B
Dept. of Biostatistics, PO Box 2099, 1014 Cph. K
University of Copenhagen, Denmark
Ph: (+45) 35327918
([EMAIL PROTECTED])
FAX: (+45) 35327907
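To make the point about sequential processing concrete, here is a small sketch. The first part shows a running total done with cumsum() and with an explicit loop; the second part uses a made-up rule (capped_total and its cap argument are hypothetical, not from the thread) where each value depends on the previous result in a way cumsum() alone does not express.

```r
set.seed(42)
x <- round(runif(10, 0, 5))

# Running total: vectorised, and as an explicit loop for comparison.
run_vec  <- cumsum(x)
run_loop <- numeric(length(x))
total <- 0
for (i in seq_along(x)) {
  total <- total + x[i]
  run_loop[i] <- total
}
all.equal(run_vec, run_loop)

# Hypothetical rule needing sequential processing: the running total is
# reset to zero whenever adding the next value would push it above a cap.
capped_total <- function(x, cap) {
  out <- numeric(length(x))
  total <- 0
  for (i in seq_along(x)) {
    total <- total + x[i]
    if (total > cap) total <- 0
    out[i] <- total
  }
  out
}

capped_total(c(3, 4, 5, 1, 2), cap = 6)   # 3 0 5 6 0
```

On a few values the loop is instant; the slowness mentioned above shows up when the same pattern is applied row by row to a large data frame.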
[R] Any tools for working with US 2000 census data?
I've been given the job of extracting some data from the United States 2000 census (files at http://www2.census.gov/census_2000/datasets/Summary_File_2/Maryland/all_Maryland.zip 52M). I'm only interested in Census Block Groups (CBGs) located within Baltimore City, Maryland. Additionally, I just have to extract certain data fields. I think I'll be using Summary File 2.

This is my first experience working with US census data. I wasn't successful finding anything using RSiteSearch, although there were some packages with data extracted from the US 2000 census. Are there any pre-constructed tools in R for working with this data? Does the US 2000 census data itself come packaged in R? If there are no R tools, I'd welcome any suggestions on working with this data from anyone experienced with it.

Thanks for your advice and suggestions for me.

-Kevin

Kevin Zembower
Internet Services Group manager
Center for Communication Programs
Bloomberg School of Public Health
Johns Hopkins University
111 Market Place, Suite 310
Baltimore, Maryland 21202
410-659-6139
[R] 'simulate.p.value' for goodness of fit
R Help on 'chisq.test' states that if 'simulate.p.value' is 'TRUE', the p-value is computed by Monte Carlo simulation with 'B' replicates.

In the contingency table case this is done by random sampling from the set of all contingency tables with given marginals, and works only if the marginals are positive...

In the goodness-of-fit case this is done by random sampling from the discrete distribution specified by 'p', each sample being of size 'n = sum(x)'.

The last paragraph suggests that in the goodness-of-fit case, if p gives the expected probability for each cell, this random sampling is multinomial. Unfortunately, as the following examples reveal, the sampling model is neither multinomial nor hypergeometric - at least when it is applied to a 4-fold table. This is rather sad as some people assume that R's chisq.test function can perform a Monte Carlo test of X-squared, employing a multinomial model - in other words, assuming that your data are a single random sample.

### Example 1.

    x=matrix(c(1,2,3,4),nc=2)

    # To begin with, let us apply the large-sample approximations
    chisq.test(x,correct=TRUE)$p.value
    [1] 0.6726038
    Warning message:
    Chi-squared approximation may be incorrect in: chisq.test(x, correct = TRUE)

    chisq.test(x,correct=FALSE)$p.value
    [1] 0.7781597
    Warning message:
    Chi-squared approximation may be incorrect in: chisq.test(x, correct = FALSE)

    # So let us apply a 2-tailed test of O.R.=1, using a hypergeometric model
    fisher.test(x)$p.value
    [1] 1

    # This should also apply a hypergeometric model
    chisq.test(x,simulate.p.value=TRUE,B=50)$p.value
    [1] 1

    # Now we work out the expected probability for each cell
    p=outer(c(sum(x[1,]),sum(x[2,])),c(sum(x[,1]),sum(x[,2])))/sum(x)^2

    # But this applies a hypergeometric model, presumably because p is not scalar
    chisq.test(x,p=p,simulate.p.value=TRUE,B=50)$p.value
    [1] 1

    # This seems to do something different,
    # at any rate it is much slower, and needs more memory
    chisq.test(x[1:4],p=p[1:4],simulate.p.value=TRUE,B=1)$p.value
    [1] 1
    # Which would appear to be using the same model as above

    # Now let us apply an X2 test using a multinomial model
    # (The code for this x2.test function is in Appendix 1, below.)
    x2.test(x,R=20)
    with cc P =  0.7316812
    conventional-P =  0.8838786
    mid-P =  0.8423058

    # All of these P-values are higher than those given by the Chi-squared approximation,
    # but they certainly do not equal 1.
    # But is this an artefact of our very small sample?

### Example 2.

    # Let us try a larger sample
    x=matrix(c(56,35,23,42),nc=2)

    # First we apply the asymptotic model
    chisq.test(x,correct=TRUE)$p.value
    [1] 0.00425
    chisq.test(x,correct=FALSE)$p.value
    [1] 0.001276595

    # Now for the hypergeometric (fixed margin totals model)
    fisher.test(x)$p.value
    [1] 0.001931078
    chisq.test(x,simulate.p.value=TRUE,B=50)$p.value
    [1] 0.001913996
    p=outer(c(sum(x[1,]),sum(x[2,])),c(sum(x[,1]),sum(x[,2])))/sum(x)^2
    chisq.test(x,p=p,simulate.p.value=TRUE,B=50)$p.value
    [1] 0.001891996

    # Next comes what we had hoped to be a multinomial test
    chisq.test(x[1:4],p=p[1:4],simulate.p.value=TRUE,B=1)$p.value
    [1] 0.01639836
    # This is obviously not the same hypergeometric model as used for a
    # chi-squared test.
    # The P-value is about 10x that of the approximate tests (above)
    # or the exact tests (below).

    x2.test(x,R=20)
    with cc P =  0.002059990
    conventional-P =  0.001184994
    mid-P =  0.001172494

    # Whatever that chi-squared test model IS, it is certainly not multinomial!
    # Could it possibly be Poisson and, if so, why???

Appendix 1:

    # We have used these functions to do a 2x2 multinomial test of X2:
    x2=function(y,cc=FALSE){
      y=y*1.;n=sum(y);C=cc*n/2
      a=y[1];b=y[2];c=y[3];d=y[4]
      ab=a+b;cd=c+d;ac=a+c;bd=b+d
      D=ab*cd*ac*bd
      if(D==0)x2=NA else x2=n*(abs(a*d-b*c)-C)^2/D
      x2}

    x2.test=function(x,R=5000){
      n=sum(x)
      p=outer(c(sum(x[1,]),sum(x[2,])),c(sum(x[,1]),sum(x[,2])))/n/n
      Q=sort(apply(rmultinom(R,n,p),2,x2))
      q=x2(x,cc=TRUE)
      pl=rank(c(q,Q),ties.method='max')[1]/(length(Q)+1)
      pe=sum(c(q,Q)==q)/(length(Q)+1);pu=1-pl+pe
      cat('with cc P = ',pu,'\n')
      q=x2(x)
      pl=rank(c(q,Q),ties.method='max')[1]/(length(Q)+1)
      pe=sum(c(q,Q)==q)/(length(Q)+1);pu=1-pl+pe
      cat('conventional-P = ',pu,'\n')
      cat('mid-P = ',pu-pe/2,'\n')}

Bob
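One way to probe the Example 2 discrepancy without any simulation is to compare the two calls on their test statistic and degrees of freedom. The sketch below relies only on documented behaviour of chisq.test(): a matrix with at least two rows and columns is treated as a contingency table, while a vector together with p gives a goodness-of-fit test. The object names (tab, gof, stat, dfs, pval) are chosen here for illustration.

```r
# Larger table from Example 2 and the cell probabilities implied by its margins.
x <- matrix(c(56, 35, 23, 42), ncol = 2)
p <- outer(rowSums(x), colSums(x)) / sum(x)^2

# Contingency-table test (margins estimated from the data).
tab <- chisq.test(x, correct = FALSE)

# Goodness-of-fit test on the same four counts, with p treated as known.
gof <- chisq.test(as.vector(x), p = as.vector(p))

stat <- c(contingency = unname(tab$statistic), gof = unname(gof$statistic))
dfs  <- c(contingency = unname(tab$parameter), gof = unname(gof$parameter))
pval <- c(contingency = tab$p.value, gof = gof$p.value)
print(rbind(stat, dfs, pval))
```

If the two statistics agree while the degrees of freedom differ (1 for the table, 3 for the four-cell goodness-of-fit test), the larger goodness-of-fit p-value would follow from the reference distribution rather than from the sampling scheme. That is a reading of this sketch, not something established in the post itself.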
Re: [R] things that are difficult/impossible to do in SAS or SPSS but simple in R
So how does SAS compare with one of the specialty languages such as perl. I've found the combination of perl and R to work quite satisfactorily (as long as I don't confuse the syntax and functions available in each.) Clint Clint BowmanINTERNET: [EMAIL PROTECTED] Air Dispersion Modeler INTERNET: [EMAIL PROTECTED] Air Quality Program VOICE: (360) 407-6815 Department of Ecology FAX:(360) 407-7534 USPS: PO Box 47600, Olympia, WA 98504-7600 Parcels:300 Desmond Drive, Lacey, WA 98503-1274 On Thu, 17 Jan 2008, Walter Paczkowski wrote: Good morning, I use SAS and R/S-Plus as my primary tools so I have a lot of experience with these programs. By far and away, SAS is superior for handling the messy datasets, but also the very large ones. I work at times with datasets in the hundreds of thousands (and on occasion, millions) of records. SAS, and especially PROC SQL, are invaluable for this. But once I get to datasets manageable for R/S-Plus, then I ship to these tools for the programming and graphics. This seems to work great. Walt Paczkowski Data Analytics Corp. -Original Message- From: Rob Robinson [EMAIL PROTECTED] Sent: Jan 17, 2008 4:31 AM To: [EMAIL PROTECTED] Subject: Re: [R] things that are difficult/impossible to do in SAS or SPSSbut simple in R I wonder if those who complain about SAS as a programming environment have discovered SAS/IML which provides a programming environment akin to Matlab which is more than capable (at least for those problems which can be treated with a matrix like approach). As someone who uses both SAS and R - graphical output is so much easier in R, but for handling large 'messy' datasets SAS wins hands down... Cheers Rob *** Want to know about Britain's birds? 
Try www.bto.org/birdfacts *** Dr Rob Robinson, Senior Population Biologist British Trust for Ornithology, The Nunnery, Thetford, Norfolk, IP24 2PU Ph: +44 (0)1842 750050 E: [EMAIL PROTECTED] Fx: +44 (0)1842 750030 W: http://www.bto.org How can anyone be enlightened, when truth is so poorly lit = -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jeffrey J. Hallman Sent: 16 January 2008 22:38 To: [EMAIL PROTECTED] Subject: Re: [R] things that are difficult/impossible to do in SAS or SPSSbut simple in R SAS has no facilities for date arithmetic and no easy way to build it yourself. In fact, that's the biggest problem with SAS: it stinks as a programming environment, so it's always much more difficult than it should be to do something new. As soon as you get away from the canned procs and have to write something of your own, SAS falls down. I don't know enough about SPSS to comment. -- Jeff
Re: [R] non-plot plotting
On Jan 17, 2008, at 9:44 AM, Johannes Graumann wrote:

I really do not know how else to title this ... I want to draw something like the attached png with R and would like to poll you on how to start ... make an empty plot first and then start positioning the character string by 'text' and then drawing the lines ...

That sounds about right. I would probably do some computations first to determine how much space I would need etc. A key thing I think would be to choose a good coordinate system, that would make placing the text and lines easier. I would also probably write minifunctions, that place a whole y plus its lines etc.

Now, if you want kerning on the letters, that probably gets trickier. But as long as you are willing to assume a fixed horizontal space for each letter, I see no problems. (In your case, for instance, I would probably choose the x dimensions to go from 0 to 13, and place the letters each at the coordinates 1 through 12, centered).

Joh

sequence.png

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
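The approach described above (empty plot, fixed-width letter slots, then lines) might be started along these lines. Everything specific here is an assumption: the attached sequence.png is not available, so the twelve letters, the tick marks between them, and the helper name draw_sequence are invented for the sketch.

```r
# Hypothetical helper: puts one letter per unit on the x axis and draws a
# short vertical tick between neighbouring letters. Returns the x positions.
draw_sequence <- function(chars) {
  n   <- length(chars)
  pos <- seq_len(n)

  # Empty plot with a simple coordinate system: x runs from 0 to n + 1.
  plot.new()
  plot.window(xlim = c(0, n + 1), ylim = c(0, 2))

  # Letters centred on the integer positions 1..n.
  text(x = pos, y = 1, labels = chars, cex = 2)

  # Ticks halfway between adjacent letters (placement chosen for illustration).
  if (n > 1) {
    mid <- pos[-n] + 0.5
    segments(x0 = mid, y0 = 0.6, x1 = mid, y1 = 1.4)
  }
  invisible(pos)
}

# Twelve made-up letters, so x spans 0 to 13 with letters at 1 through 12.
pos <- draw_sequence(strsplit("ACDEFGHIKLMN", "")[[1]])
```

Returning the positions makes it easy to add further annotations later with more text() or segments() calls in the same coordinate system.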
Re: [R] Help with Error
It looks fine to me. Try str(d) and check to be sure that Votes is a numeric value or integer value. I ran this code with no problem.

    x <- "Name Votes
    John 300
    Sean 222
    Andy 467
    Sinead 740
    David 124
    James 641
    William 380"

    d <- read.table(textConnection(x), header=TRUE, as.is=TRUE); d
    pie(d$Votes, labels=d$Name, main="Class Rep Results\n(Final Results)")

--- hoogeebear [EMAIL PROTECTED] wrote:

Hi,

I am having trouble with an error I keep getting. I am just trying to create a simple pie chart from a small table. Hope someone can help. I am new to R.

Table:

    Name     Votes
    John     300
    Sean     222
    Andy     467
    Sinead   740
    David    124
    James    641
    William  380

Commands:

    d <- read.table("C:\\rep.csv", head=TRUE, sep=",")
    pie(d$Votes,
        labels=d$Name,
        main="Class Rep Results\n(Final Results)")

Error:

    Error in pie(d$votes, labels = d$name, main = "Class Rep Results\n(Final Results)") : 'x' values must be positive.

Hope to hear from someone soon

Best Regards,
John.

--
View this message in context: http://www.nabble.com/Help-with-Error-tp14923519p14923519.html
Sent from the R help mailing list archive at Nabble.com.
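The thread does not say what the eventual fix was. One thing worth checking, offered here as a guess rather than the confirmed cause, is that the quoted error message refers to d$votes and d$name in lower case while the table header uses Votes and Name. The sketch below shows what the suggested str(d) check reveals and how a lower-case column name behaves; it uses read.table(text = ...) for the in-memory data.

```r
# Same data as in the reply above, read from an in-memory string.
x <- "Name Votes
John 300
Sean 222
Andy 467
Sinead 740
David 124
James 641
William 380"
d <- read.table(text = x, header = TRUE, as.is = TRUE)

str(d)                 # Votes should be reported as an integer column

# Column names are case-sensitive: the lower-case name selects nothing.
is.null(d$votes)       # TRUE
is.numeric(d$Votes)    # TRUE

# pie() rejects a non-numeric first argument before drawing anything,
# so the error can be captured without opening a plot.
msg <- tryCatch(pie(d$votes, labels = d$name),
                error = function(e) conditionMessage(e))
msg
```

If the file was read with a different separator than it actually uses, str(d) would also show it, typically as a single column instead of two.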
Re: [R] Any tools for working with US 2000 census data?
On Thursday 17 January 2008 10:55:03 am Zembower, Kevin wrote:

I've been given the job of extracting some data from the United States 2000 census (files at http://www2.census.gov/census_2000/datasets/Summary_File_2/Maryland/all_Maryland.zip 52M). I'm only interested in Census Block Groups (CBGs) located within Baltimore City, Maryland. Additionally, I just have to extract certain data fields. I think I'll be using Summary File 2.

This is my first experience working with US census data. I wasn't successful finding anything using RSiteSearch, although there were some packages with data extracted from the US 2000 census. Are there any pre-constructed tools in R for working with this data? Does the US 2000 census data itself come packaged in R? If there are no R tools, I'd welcome any suggestions on working with this data from anyone experienced with it.

Thanks for your advice and suggestions for me.

-Kevin

Have a look at the PostGIS and GeoServer projects. I recall that there is an excellent tutorial on making sense of the TIGER data with PostGIS.

Cheers,

--
Dylan Beaudette
Soil Resource Laboratory
http://casoilresource.lawr.ucdavis.edu/
University of California at Davis
530.754.7341
Re: [R] 'simulate.p.value' for goodness of fit
I'm afraid I can't follow your examples, but you seem to me to be mixing contingency table tests and goodness of fit tests in a somewhat incoherent fashion.

Note that your ``x2()'' function does a contingency table test, and not a goodness of fit test. Note that in chisq.test(), if ``x'' is a matrix, then the ``p'' argument is ignored.

For a goodness of fit test, the sampling in chisq.test ***is*** multinomial. It uses the sample() function, as a quick glance at the code would have told you.

I haven't the time to plough through your code and figure out what you're driving at, but I think that part of your problem could be the degrees of freedom. Under the contingency table test the degrees of freedom are 1; under the goodness of fit test the degrees of freedom are 3. (The vector of probabilities is *known* under the g.o.f. test, not estimated.)

cheers, Rolf Turner

On 18/01/2008, at 7:59 AM, Bob wrote:

R Help on 'chisq.test' states that if 'simulate.p.value' is 'TRUE', the p-value is computed by Monte Carlo simulation with 'B' replicates. In the contingency table case this is done by random sampling from the set of all contingency tables with given marginals, and works only if the marginals are positive... In the goodness-of-fit case this is done by random sampling from the discrete distribution specified by 'p', each sample being of size 'n = sum(x)'.

The last paragraph suggests that in the goodness-of-fit case, if p gives the expected probability for each cell, this random sampling is multinomial. Unfortunately, as the following examples reveal, the sampling model is neither multinomial nor hypergeometric - at least when it is applied to a 4-fold table. This is rather sad as some people assume that R's chisq.test function can perform a Monte Carlo test of X-squared, employing a multinomial model - in other words, assuming that your data are a single random sample.

### Example 1.
x=matrix(c(1,2,3,4),nc=2)
# To begin with, let us apply the large-sample approximations
chisq.test(x,correct=TRUE)$p.value
[1] 0.6726038
Warning message: Chi-squared approximation may be incorrect in: chisq.test(x, correct = TRUE)
chisq.test(x,correct=FALSE)$p.value
[1] 0.7781597
Warning message: Chi-squared approximation may be incorrect in: chisq.test(x, correct = FALSE)
# So let us apply a 2-tailed test of O.R.=1, using a hypergeometric model
fisher.test(x)$p.value
[1] 1
# This should also apply a hypergeometric model
chisq.test(x,simulate.p.value=TRUE,B=50)$p.value
[1] 1
# Now we work out the expected probability for each cell
p=outer(c(sum(x[1,]),sum(x[2,])),c(sum(x[,1]),sum(x[,2])))/sum(x)^2
# But this applies a hypergeometric model, presumably because p is not scalar
chisq.test(x,p=p,simulate.p.value=TRUE,B=50)$p.value
[1] 1
# This seems to do something different,
# at any rate it is much slower, and needs more memory
chisq.test(x[1:4],p=p[1:4],simulate.p.value=TRUE,B=1)$p.value
[1] 1
# Which would appear to be using the same model as above
# Now let us apply an X2 test using a multinomial model
# (The code for this x2.test function is in Appendix 1, below.)
x2.test(x,R=20)
with cc P = 0.7316812
conventional-P = 0.8838786
mid-P = 0.8423058
# All of these P-values are higher than those given by the Chi-squared approximation,
# but they certainly do not equal 1.
# But is this an artefact of our very small sample?

### Example 2.
# Let us try a larger sample
x=matrix(c(56,35,23,42),nc=2)
# First we apply the asymptotic model
chisq.test(x,correct=TRUE)$p.value
[1] 0.00425
chisq.test(x,correct=FALSE)$p.value
[1] 0.001276595
# Now for the hypergeometric (fixed margin totals model)
fisher.test(x)$p.value
[1] 0.001931078
chisq.test(x,simulate.p.value=TRUE,B=50)$p.value
[1] 0.001913996
p=outer(c(sum(x[1,]),sum(x[2,])),c(sum(x[,1]),sum(x[,2])))/sum(x)^2
chisq.test(x,p=p,simulate.p.value=TRUE,B=50)$p.value
[1] 0.001891996
# Next comes what we had hoped to be a multinomial test
chisq.test(x[1:4],p=p[1:4],simulate.p.value=TRUE,B=1)$p.value
[1] 0.01639836
# This is obviously not the same hypergeometric model as used for a
# chi-squared test.
# The P-value is about 10x of the approximate tests (above)
# or the exact tests (below).
x2.test(x,R=20)
with cc P = 0.002059990
conventional-P = 0.001184994
mid-P = 0.001172494
# Whatever that chi-squared test model IS, it is certainly not multinomial!
# Could it possibly be Poisson and, if so, why???

Appendix 1:
# We have used these functions to do a 2x2 multinomial test of X2:
x2=function(y,cc=FALSE){
y=y*1.;n=sum(y);C=cc*n/2
a=y[1];b=y[2];c=y[3];d=y[4]
ab=a+b;cd=c+d;ac=a+c;bd=b+d
D=ab*cd*ac*bd
if(D==0)x2=NA else x2=n*(abs(a*d-b*c)-C)^2/D
x2}
x2.test=function(x,R=5000){
n=sum(x)
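The degrees-of-freedom point made in the reply can be checked directly. The sketch below reuses the 2x2 table from Example 2 and passes the same counts once as a matrix (contingency-table test) and once as a plain vector together with p (goodness-of-fit test). The variable names are mine, and no simulation is involved; it only compares the two asymptotic tests.

```r
x <- matrix(c(56, 35, 23, 42), ncol = 2)

# Cell probabilities implied by the observed margins (independence model)
p <- outer(rowSums(x), colSums(x)) / sum(x)^2

# Matrix input: contingency-table test, margins treated as estimated
ct <- chisq.test(x, correct = FALSE)

# Vector input with p: goodness-of-fit test, p treated as known
gof <- chisq.test(as.vector(x), p = as.vector(p))

ct$parameter    # df = 1
gof$parameter   # df = 3

# The X-squared statistic is the same; only the reference distribution differs
same_stat <- isTRUE(all.equal(unname(ct$statistic), unname(gof$statistic)))
c(contingency = ct$p.value, goodness_of_fit = gof$p.value)
```

Because the statistic is identical while the degrees of freedom go from 1 to 3, the goodness-of-fit p-value is necessarily larger, which is consistent with the roughly tenfold difference reported in Example 2.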
[R] nlme: Variogram.gls error with grouping factor
I am using gls to fit a linear model with spatially-autocorrelated errors. My first step is to fit a simple model, and inspect a semivariogram of residuals. The following example gives this error:

Error in FUN(X[[1L]], ...) : unused argument(s) (method = "euclidean")

#Example
library(nlme)
x <- runif(60, 0, 1) # location in x
y <- runif(60, 0, 1) # location in y
f <- factor(rep(c("A", "B", "C"), each = 20)) # grouping factor
pred <- runif(60, 0, 1) # predictor variable
resp <- pred + rnorm(60, 0, 0.5) # response variable
dat <- data.frame(x, y, f, pred, resp) # make data frame
rm(x, y, f, pred, resp) # remove variables
m1 <- gls(resp ~ pred*f, dat) # simple model assuming independent errors
Vm1 <- Variogram(m1, form = ~ x + y, metric = "euclidean") # works fine
Vgm1 <- Variogram(m1, form = ~ x + y | f, metric = "euclidean") # gives error
#End

Running the example in ?Variogram.gls gives the same error:

fm1 <- gls(weight ~ Time * Diet, BodyWeight)
Variogram(fm1, form = ~ Time | Rat)[1:10,]

Any ideas most appreciated.

Dan Bebber
Re: [R] Loading only particular columns from csv file...
Thank you very much... That was helpful..

On Jan 15, 2008 12:58 AM, Charles C. Berry [EMAIL PROTECTED] wrote:

On Mon, 14 Jan 2008, Marko Milicic wrote:

Dear all, I'm trying to process HUGE datasets with R. It's very fast, but I would like to optimize it a bit more, by focusing on one column at a time. Say the file is 1GB big and has 100 columns. In order to prevent out of memory problems I need to load one column at a time. The only problem is that read.table doesn't support this feature. Is there some trick which will do the magic?

There is a unix utility called 'cut' that enables stuff like

columns.1.3.5.to.7 <- read.table( pipe( "cut -f1,3,5-7 myfile" ) )

and if you have numeric data only, using scan() directly will save some space.

HTH, Chuck

Thank you in advance.

Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED] UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901
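As an alternative to the 'cut' pipe, and not something proposed in the thread: read.table() itself can skip columns when the corresponding entry of colClasses is the string "NULL". The sketch below writes a tiny tab-separated file to a temporary location (the file and column names are invented for illustration) and reads back only the first and third columns.

```r
# Build a small throwaway file; names and values are purely illustrative
tmp <- tempfile(fileext = ".tsv")
write.table(data.frame(id = 1:3,
                       label = c("a", "b", "c"),
                       value = c(2.5, 3.5, 4.5)),
            tmp, sep = "\t", row.names = FALSE, quote = FALSE)

# "NULL" (as a string) tells read.table to drop that column while reading
keep <- c("integer", "NULL", "numeric")
sel  <- read.table(tmp, header = TRUE, sep = "\t", colClasses = keep)

names(sel)   # only the columns that were kept
unlink(tmp)
```

The whole file is still scanned, so this saves memory rather than reading time; for a 100-column file the colClasses vector needs one entry per column.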
Re: [R] things that are difficult/impossible to do in SAS or SPSS but simple in R
Max Kuhn wrote:

Factors have huge benefits over character data in SAS. For a series of regulatory filings, I had miles of SAS code to compute KxK tables where all the cells must show up. For example, if one of the levels of one of the variables was never observed, the corresponding row or column would not show up in proc freq. The basic way around this was to get all possible combinations of the variables and assign each cell to have a row count of 0.0001. Then you would merge this data with the real counts. The missing row/columns would show up since they had data, but it was below the printing threshold of proc freq. Hopefully, they have added a feature to do this.

On 18/1/08 4:44 AM, Peter Dalgaard wrote:

I could have sworn that this was a fluke and that it would work if you put a user-defined format on the classification variable, but no go. I can't find anything that does this, neither in PROC FREQ nor PROC TABULATE.

I believe the CLASSDATA option in PROC TABULATE lets you specify which values will show up in the table, including unobserved values.
http://support.sas.com/onlinedoc/913/getDoc/en/proc.hlp/a002473736.htm#a003069171
I'm not aware of any way to do this in PROC FREQ, though.

-- James Reilly
Department of Statistics, University of Auckland
Private Bag 92019, Auckland, New Zealand
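For comparison, here is a small sketch of the R behaviour that Max Kuhn's remark about factors alludes to: table() keeps every declared level of a factor, so unobserved cells appear with a count of zero. The data and level names are invented for illustration.

```r
# Two classification variables; "mid" and "C" are declared but never observed
severity <- factor(c("low", "high", "low", "high"),
                   levels = c("low", "mid", "high"))
site     <- factor(c("A", "A", "B", "B"),
                   levels = c("A", "B", "C"))

tab <- table(severity, site)
tab          # full 3 x 3 table, with zero rows and columns included
dim(tab)
```

No padding with tiny dummy counts is needed; the levels attribute alone determines the table's shape.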
[R] how to specify a particular contrast
Hi, I am running a simple one-way ANOVA with an independent factor variable treat (3 levels: a, b and c) and a response variable y. I want to test a linear relationship of the response among the 3 levels of the variable treat (ordered a-b-c). I used glht() from the multcomp package. Later I found out I need to exclude the situation where the responses at the 3 levels of treat are equal. I can do separate contrasts to test them separately:

obj <- aov(y~treat, data=dat)
### testing a=b=c
summary(glht(obj, linfct = mcp(treat=c('a-b=0','a-c=0','b-c=0'))), test=Ftest())
### testing linear relationship among a, b and c
summary(glht(obj, linfct = mcp(treat=c('a+c-2*b=0'))), test=Ftest())

Is there any way to build one contrast that tests both at the same time, i.e. just generate one single p value? The ultimate purpose is to test the linear relationship among the 3 levels of the variable treat. Or am I asking something that is not sensible to do?

Thanks
John Zhang
[R] Res: vector generation
Hi Juan, it is not so elegant, but it works fine. I know that our colleagues can do it in a single line.

z <- c(526,723,110,1110,34,778,614,249,14)
v1 <- NULL
v2 <- NULL
for (i in 1:(length(z)-1)) {
  for (j in i:length(z)) {
    v1 <- rbind(v1, z[i])
    v2 <- rbind(v2, z[j])
  }
}
df <- data.frame(cbind(v1=v1, v2=v2))
names(df) <- c("v1","v2")
df$ratio <- df$v1/df$v2

Kind regards, Miltinho
Brazil

----- Original message -----
From: Juan Pablo Fededa [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Thursday, 17 January 2008 13:24:33
Subject: [R] vector generation

Dear Contributors: I have the next vector:

Z
526 723 110 1110 34 778 614 249 14

I want to generate a vector containing the ratios of all the values versus all the values of the z vector. I mean a vector containing the values of 526/723, 526/110, and so on, 723/723, 723/110, and so on, and so on. Is this doable in a simple way?

Thanks in advance again,
Juan Pablo Fededa
Re: [R] things that are difficult/impossible to do in SAS or SPSS but simple in R
James Reilly wrote:

Max Kuhn wrote:

Factors have huge benefits over character data in SAS. For a series of regulatory filings, I had miles of SAS code to compute KxK tables where all the cells must show up. For example, if one of the levels of one of the variables was never observed, the corresponding row or column would not show up in proc freq. The basic way around this was to get all possible combinations of the variables and assign each cell to have a row count of 0.0001. Then you would merge this data with the real counts. The missing row/columns would show up since they had data, but it was below the printing threshold of proc freq. Hopefully, they have added a feature to do this.

On 18/1/08 4:44 AM, Peter Dalgaard wrote:

I could have sworn that this was a fluke and that it would work if you put a user-defined format on the classification variable, but no go. I can't find anything that does this, neither in PROC FREQ nor PROC TABULATE.

I believe the CLASSDATA option in PROC TABULATE lets you specify which values will show up in the table, including unobserved values.
http://support.sas.com/onlinedoc/913/getDoc/en/proc.hlp/a002473736.htm#a003069171
I'm not aware of any way to do this in PROC FREQ, though.

You can specify the COMPLETETYPES option in PROC MEANS or PROC SUMMARY to include output rows for empty cells in a cross-classification/crosstabulation - but you won't get a nicely formatted table - you'll have to do that yourself, or wrestle with PROC TABULATE. See http://support.sas.com/onlinedoc/913/getDoc/en/proc.hlp/a000146729.htm - it is a new feature in Version 9.x of SAS, I think?

Tim C
Re: [R] non-plot plotting
Johannes Graumann wrote:

I really do not know how else to title this ... I want to draw something like the attached png with R and would like to poll you on how to start ... make an empty plot first and then start positioning the character string by 'text' and then drawing the lines ...

Joh

Johannes, Try this (PDF of the output attached):

# Open the initial plot window and set
# up a coordinate system based upon
# placement of the letters centered
# on integer 'x' values 1:12
plot(1, xlim = c(0, 13), ylim = c(0, 4), type = "n", ann = FALSE, axes = FALSE)

# Create the vector of letters
Vec <- c("T", "V", "F", "S", "Q", "A", "Q", "L", "C", "A", "L", "K")

# Plot them
text(1:12, 2, Vec, cex = 2, font = 2)

# Get the height of the letters for spacing
# See ?strheight
height <- strheight("T", cex = 2) * .8

# Set a default width for the boxes
# around the letter
width <- 0.5

# Set the values for 'Y's
Y <- 10:2
X <- 3:11

# Loop over the Y's and using plotmath
# plot the values and subscripts in bold
# See ?plotmath and ?bquote
# While looping, do the colored segments
for (i in 1:9) {
  text(X[i], 2 + (height * 1.6), bquote(bold(Y[.(Y[i])])), col = "red", font = 2)
  x <- c(X[i] - width, X[i] - width, X[i] + 0.3)
  y <- c(2, 2 + height, 2 + height)
  lines(x, y, col = "red", lwd = 2)
}

# Same here now for the 'b's
b <- c(2:6, 8, 9, 11)
X <- b
for (i in 1:8) {
  text(X[i], 2 - (height * 1.6), bquote(bold(b[.(b[i])])), col = "blue", font = 2)
  x <- c(X[i] - 0.3, X[i] + width, X[i] + width)
  y <- c(2 - height, 2 - height, 2)
  lines(x, y, col = "blue", lwd = 2)
}

The above should provide the basic approach. You can then adjust/fine tune spacing, etc. as you require.

HTH, Marc Schwartz
Re: [R] how to specify a particular contrast
I don't quite follow what you are trying to do, but the second contrast has a few interpretations with the same meaning in your case:

1) are the 2-1 and 3-2 differences equal
2) lack of fit of a linear trend
3) is there a quadratic response

If you declare your factor to be ordered then the default contrasts will be poly()nomials.

Ross Darnell

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of array chip
Sent: Friday, 18 January 2008 10:22 AM
To: [EMAIL PROTECTED]
Subject: [R] how to specify a particular contrast

Hi, I am running a simple one-way ANOVA with an independent factor variable treat (3 levels: a, b and c) and a response variable y. I want to test a linear relationship of the response among the 3 levels of the variable treat (ordered a-b-c). I used glht() from the multcomp package. Later I found out I need to exclude the situation where the responses at the 3 levels of treat are equal. I can do separate contrasts to test them separately:

obj <- aov(y~treat, data=dat)
### testing a=b=c
summary(glht(obj, linfct = mcp(treat=c('a-b=0','a-c=0','b-c=0'))), test=Ftest())
### testing linear relationship among a, b and c
summary(glht(obj, linfct = mcp(treat=c('a+c-2*b=0'))), test=Ftest())

Is there any way to build one contrast that tests both at the same time, i.e. just generate one single p value? The ultimate purpose is to test the linear relationship among the 3 levels of the variable treat. Or am I asking something that is not sensible to do?

Thanks
John Zhang
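Ross Darnell's remark about polynomial contrasts can be made concrete with contr.poly(), which is what R uses by default for ordered factors. The sketch below only inspects the contrast matrix for three equally spaced levels; the rescaling by sqrt(2) and sqrt(6) is mine, done to make the integer pattern visible.

```r
cp <- contr.poly(3)        # columns ".L" (linear) and ".Q" (quadratic)

# Rescale the orthonormal columns to integer coefficients for levels a, b, c
lin  <- unname(cp[, ".L"] * sqrt(2))   # proportional to c - a
quad <- unname(cp[, ".Q"] * sqrt(6))   # proportional to a - 2*b + c

round(lin)
round(quad)
```

The quadratic column matches the a+c-2*b contrast from the original post, so that contrast measures departure from a straight line, while the linear trend itself corresponds to c-a.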
Re: [R] Probability weights with density estimation
Charles C. Berry [EMAIL PROTECTED] wrote in news:[EMAIL PROTECTED]:

On Wed, 16 Jan 2008, David Winsemius wrote:

I am a physician examining an NHANES dataset available at the NCHS website: http://www.cdc.gov/nchs/about/major/nhanes/nhanes2005-2006/demo_d.xpt

[snip]

TC.ran <- exp(rnorm(400,1.5,.3))
HDL.ran <- exp(rnorm(400,.4,.3))
f1 <- kde2d(HDL.ran,TC.ran,n=25,lims=c(0,4,2,10))
contour(f1$x,f1$y,f1$z,ylim=c(0,8),xlim=c(0,3),ylab="TC mmol/L", xlab="HDL mmol/L")
lines(f1$x,5*f1$x) # iso-ratio lines
lines(f1$x,4*f1$x)
lines(f1$x,3*f1$x)

Two questions: Is there a 2d density estimation function that has provision for probability weights (or inverse sampling probabilities)?

[snip]

It looks like you can use bkde2D from the KernSmooth package. You might look at the function sqlocpoly in surveyNG, which uses the KernSmooth package, for details.

The prospect of setting up an SQL database was rather daunting and I continued my search. There were references in the sql.. functions' documentation that they were providing the functions in package Locfit. Finding locfit() provided the weighting options I needed. This is what I came up with:

tc.hdl.fit <- with(small.nh.chol, locfit(~LBDHDDSI+LBDTCSI, weights=WTMEC2YR, xlim=c(0,0,4,10)))
plot(tc.hdl.fit) # gives warnings but does work
title(main="Weighted", xlab="HDL", ylab="TC")
# add labels _after_ plotting.
# never could figure out how to get plot() to accept xlab or ylab
# when passing the locfit object to it.
with(tc.hdl.fit, lines(x,x*4))

-- Thanks; and thank you, Andy Liaw, for helpful earlier posts;
David Winsemius
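To show what "probability weights" mean for a 2-d density estimate, here is a base-R sketch of a weighted product-Gaussian kernel estimator. It is not what locfit() does internally (locfit fits local likelihood models), and the function name wkde2d, the bandwidths and the simulated weights are all assumptions made for illustration; the simulated HDL and TC values follow the recipe in the quoted post.

```r
set.seed(1)
n   <- 400
hdl <- exp(rnorm(n, 0.4, 0.3))   # simulated HDL, as in the quoted post
tc  <- exp(rnorm(n, 1.5, 0.3))   # simulated TC
w   <- runif(n, 1, 10)           # made-up sampling weights

# Weighted estimate: f(gx, gy) = sum_i w_i * K(gx - x_i) * K(gy - y_i)
wkde2d <- function(x, y, w, gx, gy, hx, hy) {
  w  <- w / sum(w)                                   # normalise the weights
  kx <- outer(gx, x, function(g, xi) dnorm(g, mean = xi, sd = hx))
  ky <- outer(gy, y, function(g, yi) dnorm(g, mean = yi, sd = hy))
  kx %*% (w * t(ky))                                 # length(gx) x length(gy)
}

gx <- seq(-1, 6, length.out = 141)   # grid for HDL
gy <- seq(0, 16, length.out = 161)   # grid for TC
z  <- wkde2d(hdl, tc, w, gx, gy, hx = 0.15, hy = 0.4)

# Sanity check: the estimate should integrate to roughly 1 over the grid
total <- sum(z) * diff(gx)[1] * diff(gy)[1]
total
# contour(gx, gy, z, xlab = "HDL mmol/L", ylab = "TC mmol/L")
```

With equal weights this reduces to an ordinary product-kernel estimate, so observations with larger sampling weights simply pull more of the density towards themselves.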
Re: [R] Res: vector generation
If I understand you correctly, use outer() -- the for loops suggested below are not the R way (nor do they seem to fully address the "and so on" part of your request):

ratios <- outer(z,z,"/")

This produces a matrix, the first column of which is z/z[1], the second of which is z/z[2], and so forth. Since a matrix is just a vector with a dim attribute, you're done; but if you want to show it as a vector, just use, e.g.

as.vector(outer(z,z,"/"))

Bert Gunter
Genentech Nonclinical Statistics

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Milton Cezar Ribeiro
Sent: Thursday, January 17, 2008 3:04 PM
To: Juan Pablo Fededa; [EMAIL PROTECTED]
Subject: [R] Res: vector generation

Hi Juan, it is not so elegant, but it works fine. I know that our colleagues can do it in a single line.

z <- c(526,723,110,1110,34,778,614,249,14)
v1 <- NULL
v2 <- NULL
for (i in 1:(length(z)-1)) {
  for (j in i:length(z)) {
    v1 <- rbind(v1, z[i])
    v2 <- rbind(v2, z[j])
  }
}
df <- data.frame(cbind(v1=v1, v2=v2))
names(df) <- c("v1","v2")
df$ratio <- df$v1/df$v2

Kind regards, Miltinho
Brazil

----- Original message -----
From: Juan Pablo Fededa [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Thursday, 17 January 2008 13:24:33
Subject: [R] vector generation

Dear Contributors: I have the next vector:

Z
526 723 110 1110 34 778 614 249 14

I want to generate a vector containing the ratios of all the values versus all the values of the z vector. I mean a vector containing the values of 526/723, 526/110, and so on, 723/723, 723/110, and so on, and so on. Is this doable in a simple way?

Thanks in advance again,
Juan Pablo Fededa
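A short sketch of how the outer() result in Bert Gunter's reply lines up with the ratios asked for, using the z vector from the thread. The dimnames and the pair count for the nested loops are added here for illustration only.

```r
z <- c(526, 723, 110, 1110, 34, 778, 614, 249, 14)

ratios <- outer(z, z, "/")            # ratios[i, j] is z[i] / z[j]
dimnames(ratios) <- list(num = z, den = z)

ratios["526", "723"]                  # the 526/723 ratio from the question
dim(ratios)                           # all 9 x 9 = 81 combinations

# The nested loops in the earlier reply only visit pairs with i <= j, i < 9
loop_pairs <- sum(sapply(1:(length(z) - 1), function(i) length(i:length(z))))
loop_pairs

as_vec <- as.vector(ratios)           # column-major: z/z[1], then z/z[2], ...
```

If the trivial ratios of a value to itself are not wanted, they sit on the diagonal and can be dropped with ratios[row(ratios) != col(ratios)].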
[R] Need to install RDCOM
Hi, Where do I find the link to install RDCOM? I need to use RExcel for my project. Thanks a lot! Uma
Re: [R] how to specify a particular contrast
I didn't set the variable treat as an ordered factor, just an ordinary factor. In my example, the 3 levels of treat are equally spaced in the order a, b, c. So my understanding is that a contrast of the form a+c-2*b=0 tests a linear trend of the response variable among the 3 levels of treat. However, this linear trend also includes the special situation where a=b=c, i.e. the response is the same at all 3 levels, which is contrary to the linear trend conclusion and is what I want to exclude. Hope my explanation helps.

Thanks

--- Ross Darnell [EMAIL PROTECTED] wrote:

I don't quite follow what you are trying to do, but the second contrast has a few interpretations with the same meaning in your case:

1) are the 2-1 and 3-2 differences equal
2) lack of fit of a linear trend
3) is there a quadratic response

If you declare your factor to be ordered then the default contrasts will be poly()nomials.

Ross Darnell

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of array chip
Sent: Friday, 18 January 2008 10:22 AM
To: [EMAIL PROTECTED]
Subject: [R] how to specify a particular contrast

Hi, I am running a simple one-way ANOVA with an independent factor variable treat (3 levels: a, b and c) and a response variable y. I want to test a linear relationship of the response among the 3 levels of the variable treat (ordered a-b-c). I used glht() from the multcomp package. Later I found out I need to exclude the situation where the responses at the 3 levels of treat are equal. I can do separate contrasts to test them separately:

obj <- aov(y~treat, data=dat)
### testing a=b=c
summary(glht(obj, linfct = mcp(treat=c('a-b=0','a-c=0','b-c=0'))), test=Ftest())
### testing linear relationship among a, b and c
summary(glht(obj, linfct = mcp(treat=c('a+c-2*b=0'))), test=Ftest())

Is there any way to build one contrast that tests both at the same time, i.e. just generate one single p value? The ultimate purpose is to test the linear relationship among the 3 levels of the variable treat. Or am I asking something that is not sensible to do?

Thanks
John Zhang