Re: [R] Data Set
On Sun, 2007-07-22 at 21:51 -0700, Stephen Tucker wrote:
> It turns out that '-' and ' ' (space) are not valid variable names.

They are valid names; the problem is that they aren't very convenient to use, as the OP discovered, because they need to be quoted.

Note that if you use something like read.csv or read.table, R will correct these problem variable names for you when you import the data. If, for example, you read in this file:

  Mydata,S-sharif,A site
  1,45,34
  2,66,45
  3,79,56

using read.csv, you get easy-to-use names:

  dat <- read.csv("temp.csv")
  dat
    Mydata S.sharif A.site
  1      1       45     34
  2      2       66     45
  3      3       79     56

You can turn off this safety checking using the argument check.names = FALSE.

G
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Gavin Simpson                 [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Data Set
My bad... corrections (semantic and otherwise) are always appreciated. I'm still learning too.

I also forgot the alternative of using make.names() instead of manually assigning 'more convenient' names.

  input <- "Mydata,S-sharif,A site
  1,45,34
  2,66,45
  3,79,56"
  dat <- read.csv(textConnection(input), check.names=FALSE)
  dat
    Mydata S-sharif A site
  1      1       45     34
  2      2       66     45
  3      3       79     56
  names(dat)
  [1] "Mydata"   "S-sharif" "A site"
  names(dat) <- make.names(names(dat))
  names(dat)
  [1] "Mydata"   "S.sharif" "A.site"

In the case of the data set Monsoon, I don't know how it was created originally, but it may be convenient to reassign the names with

  names(Monsoon) <- make.names(names(Monsoon))
[R] Data Set
Hi Sir,

I have made a data set with rainfall from 23 stations. When I use the attach function to access individual stations, the following error occurs:

  attach(data)
  S.Sharif    # S.Sharif is the station name, which has 50 data values
  Error: object S.Sharif not found

How can I solve this problem?

Thank you.
Regards
-- 
AMINA SHAHZADI
Department of Statistics
GC University Lahore, Pakistan.
Email: [EMAIL PROTECTED]
Re: [R] Data Set
On Sun, 2007-07-22 at 03:25 -0700, amna khan wrote:
> Hi Sir
> I have made a data set having 23 stations of rainfall. When I use the
> attach function to approach individual stations then the following
> error occurs.
>
>   attach(data)
>   S.Sharif    # S.Sharif is the station name which has 50 data values
>   Error: object S.Sharif not found
>
> Now how to solve this problem.

Then you don't have a column named exactly S.Sharif in your object data. What do str(data) and names(data) tell you about the columns in your data set?

If looking at these doesn't help you, post the output from str(data) and names(data) and someone might be able to help.

You should always check that R has imported the data in the way you expect; just because you think there is something in there called S.Sharif doesn't mean R sees it that way.

You also seem to have included the R-Help email address twice in the To: header of your email. Once is sufficient.

G
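The check suggested above can be sketched as follows. The object dat is a hypothetical stand-in (the poster's real data were not included in this message), and the name "S-Sharif" is only an assumption about how such a mismatch might look:

```r
# Hypothetical stand-in for the poster's object; names and values are
# made up for illustration only.
dat <- list(`S-Sharif` = c(23.6, 93.5, 36.3),
            Peshawar   = c(54.4, 27.7))

# The names exactly as R stores them
names(dat)

# Is there an element called exactly "S.Sharif"?
"S.Sharif" %in% names(dat)

# Look for near matches to see what the name really is
close_matches <- grep("Sharif", names(dat), value = TRUE)
close_matches

# Overview of every element, with its type and length
str(dat)
```

Comparing the output of names() with what you typed usually shows the difference (a hyphen, a space, or different capitalisation) straight away.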
Re: [R] Data Set
Sir, the station name S.Sharif exists in the data, but the "not found" error still occurs. Please help in this regard.

-- 
AMINA SHAHZADI
Re: [R] Data Set
Could you post the output from str(data)? Perhaps that will give us a clue.
Re: [R] Data Set
On Sun, 2007-07-22 at 12:09 -0700, amna khan wrote:
> Sir the station name S.Sharif exists in the data but still the error is
> occurring of being not found. Please help in this regard.

If you take the time to do what I asked and actually post the results of typing the following into your R session:

  str(data)

and send the output to the list, then we will be able to help. Did you read /all/ of my email? I did ask you to do this.

HTH

G
Re: [R] Data Set
It turns out that '-' and ' ' (space) are not valid variable names. You can get around that in two ways:

==
  names(Monsoon)[2] <- "S.Sharif"
  names(Monsoon)[8] <- "Islamabad.AP"
  attach(Monsoon)
  S.Sharif
  Islamabad.AP
  detach(Monsoon)

and do the same for other variable names that contain '-' or ' ' characters.
==

The other way is to enclose the names in backticks (``). For instance:

  attach(Monsoon)
  `S-Sharif`
  `Islamabad AP`
  detach(Monsoon)

Here is my example in which it works:

  x <- list(1:5, 6:8)
  names(x) <- c("S-Sharif", "Peshawar")
  str(x)
  List of 2
   $ S-Sharif: int [1:5] 1 2 3 4 5
   $ Peshawar: int [1:3] 6 7 8
  attach(x)
  `S-Sharif`
  [1] 1 2 3 4 5
  detach(x)

--- amna khan [EMAIL PROTECTED] wrote:
> Yes Sir, I am sending you the clue for the data.
>
>   str(Monsoon)
>   List of 23
>    $ Dir           : num [1:40] 72.4 60.7 52.1 ...
>    $ S-Sharif      : num [1:55] 23.6 93.5 36.3 ...
>    $ Peshawar      : num [1:57] 54.4 27.7 ...
>    $ Kakul         : num [1:54] 50.3 116.1 ...
>    $ Balakot       : num [1:47] 218.2 76.5 ...
>    $ Parachinar    : num [1:40] 41.4 37.6 62.2 ...
>    $ Kohat         : num [1:53] 50.8 93.2 94.5 ...
>    $ Islamabad AP  : num [1:48] 140.2 69.3 ...
>    $ Murree        : num [1:47] 130.0 131.3 74.4 ...
>    $ Islamabad SRRC: num [1:24] 172.2 82.3 150.1 ...
>    $ Mian Wali     : num [1:48] 80.5 48.5 56.6 43.2 ...
>    $ Jhelum        : num [1:57] 111.8 82.3 53.8 94.7 ...
>    $ Sialkot       : num [1:55] 62.7 126.0 90.7 ...
>    $ D-I Khan      : num [1:57] 24.9 40.6 34.3 ...
>    $ Faisalabad    : num [1:56] 79.2 43.9 55.4 ...
>    $ Lahore        : num [1:60] 32.5 81.5 28.7 ...
>
> When I attach the data file and access the site S-Sharif or D-I Khan or
> Mian Wali, then error messages occur. Please help in this regard.
>
> Thank You
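A third option, not raised in the thread, is to skip attach() altogether and index the list by name. The sketch below assumes a small stand-in for Monsoon, built from the first few values shown in the str() output; the real object has 23 stations with many more values each:

```r
# Small stand-in for the Monsoon list (an assumption for illustration;
# only the first few values from the posted str() output are used).
Monsoon <- list(Dir            = c(72.4, 60.7, 52.1),
                `S-Sharif`     = c(23.6, 93.5, 36.3),
                `Islamabad AP` = c(140.2, 69.3))

# Index by name as a string: no attach() and no renaming needed
Monsoon[["S-Sharif"]]
mean(Monsoon[["Islamabad AP"]])

# with() evaluates an expression inside the list; backticks quote the name
with(Monsoon, range(`S-Sharif`))

# One summary value per station, whatever the names look like
station_means <- sapply(Monsoon, mean)
station_means
```

Because the name is passed as an ordinary string to [[ ]], hyphens and spaces cause no trouble, and nothing is left on the search path afterwards.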
[R] R data set size limit
Hi -

What is the limit (rows and columns) on the size of a data set that R will process?

Thanks.
Abhijit

Dr. Abhijit Roy
Citi - Global Consumer Group - Business Analytics and Methods
O: 91 80 4041 6398
Fax: 91 80 2211 0827
Re: [R] R data set size limit
On Tue, 26 Jun 2007, Roy, Abhijit wrote:
> Hi - What is the limit (rows and columns) on the size of a data set that
> R will process?

2^31-1 for each (in a data frame, that number of elements for a matrix). See ?Memory-limits

Most likely your computer imposes lower limits.

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
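The figure 2^31-1 is the largest value of R's integer type, and the practical limit is usually memory. A rough sketch of both points follows; the dimensions rows and cols are arbitrary examples, not numbers from the thread:

```r
# Largest value of R's integer type
.Machine$integer.max
.Machine$integer.max == 2^31 - 1

# Rough memory needed for a numeric matrix: 8 bytes per double.
# rows and cols are arbitrary example dimensions.
rows <- 1e6
cols <- 20
approx_mib <- rows * cols * 8 / 1024^2   # size in MiB, ignoring the object header
approx_mib

# object.size() reports the size of an object that already exists
x <- matrix(0, nrow = 1000, ncol = 20)
object.size(x)
```

Doing this arithmetic before reading a file gives a quick idea of whether the data will fit in memory at all.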
[R] Re : R data set size limit
The data set limit depends on your hardware capabilities.

Justin BEM
Student Statistician-Economist Engineer
BP 294 Yaoundé.
Tel (00237)9597295.

----- Original Message -----
From: Roy, Abhijit [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Sent: Tuesday, 26 June 2007, 13:09:40
Subject: [R] R data set size limit
Re: [R] data set size question
If you need to analyze something bigger than memory can hold, one option is the biglm package, which will fit linear regression models (and a lot of different analyses can be restructured as linear regression models) on blocks of data so that the entire dataset is not in memory all at the same time.

I tested it out with a database with over 23 million rows and it worked great. It computed the exact same answers (to about 7 decimal places; I didn't bother to look beyond that) as a couple of other methods used for the same values.

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Carl Hauser
Sent: Tuesday, June 13, 2006 9:22 PM
To: r-help@stat.math.ethz.ch
Subject: [R] data set size question
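The block-by-block fitting described above might look like the sketch below. It assumes the biglm package is installed (the code does nothing useful otherwise), uses simulated data, and splits it into three chunks purely for illustration; in real use each chunk would be read from a file or database in turn:

```r
# Sketch only: simulated data, three in-memory chunks standing in for
# blocks that would normally be read from disk one at a time.
if (requireNamespace("biglm", quietly = TRUE)) {
  set.seed(1)                       # arbitrary seed
  n <- 30000
  dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
  dat$y <- 1 + 2 * dat$x1 - 3 * dat$x2 + rnorm(n)

  chunks <- split(dat, rep(1:3, each = n / 3))

  # Fit on the first chunk, then fold in the remaining chunks
  fit <- biglm::biglm(y ~ x1 + x2, data = chunks[[1]])
  for (chunk in chunks[-1]) {
    fit <- update(fit, chunk)
  }

  # Compare with an ordinary lm() fit on all the data at once
  full <- lm(y ~ x1 + x2, data = dat)
  print(coef(fit))
  print(coef(full))
} else {
  message("biglm is not installed; skipping the example")
}
```

Only one chunk needs to be in memory at a time, which is the point of the approach; the comparison with lm() is there just to check that the two routes agree on a data set small enough to fit both ways.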
[R] data set size question
Hi there,

I'm very new to R and am only in the beginning stages of investigating it for possible use. A document by John Maindonald at the r-project website entitled "Using R for Data Analysis and Graphics: Introduction, Code and Commentary" contains the following paragraph:

"The R system may struggle to handle very large data sets. Depending on available computer memory, the processing of a data set containing one hundred thousand observations and perhaps twenty variables may press the limits of what R can easily handle."

This document was written in 2004.

My questions are:

Is this still the case? If so, has anyone come up with creative solutions to mitigate these limitations? If you work with large data sets in R, what have your experiences been?

From what I've seen so far, R seems to have enormous potential and capabilities. I routinely work with data sets of several hundred thousand to several million. It would be unfortunate if such potential and capabilities were not realized because of (effective) data set size limitations.

Please tell me it ain't so.

Thanks for any help or suggestions.

Carl
Re: [R] data set size question
The restriction is that objects are kept in memory, so if you have sufficient memory and your OS lets you access it then you should be ok. S-Plus is a commercial package similar to R, but it stores its objects in files and can handle larger data sets if you run into trouble.

Given that R is free and, once downloaded, can be installed on Windows in a minute or so (I assume it's just as easy on other OSes), just install it, generate some test data and see if you have any problems. E.g. I had no trouble running the following on my PC:

  n <- 10
  p <- 20
  x <- matrix(rnorm(n * p), n)
  colnames(x) <- letters[1:p]
  # regress column a against the rest
  x.lm <- lm(a ~ ., as.data.frame(x))
  plot(x.lm)  # click mouse to advance to successive plots
  summary(x.lm)
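For the size quoted from Maindonald's document (one hundred thousand observations, twenty variables), a quick check along the same lines might look like the sketch below. The seed and the use of object.size() and system.time() are additions for illustration, and timings will vary by machine:

```r
set.seed(42)          # arbitrary seed, for reproducibility
n <- 100000           # observations, the figure from the quoted passage
p <- 20               # variables
x <- matrix(rnorm(n * p), n)
colnames(x) <- letters[1:p]

# Memory taken by the matrix itself
print(object.size(x), units = "Mb")

# Time a regression of column a on the other 19 columns
timing <- system.time(fit <- lm(a ~ ., data = as.data.frame(x)))
timing["elapsed"]     # seconds; depends on the machine

length(coef(fit))     # intercept plus 19 slopes
```

If this runs comfortably, the data set size in question is unlikely to be a problem on that machine; increasing n shows where memory starts to become the constraint.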
[R] Data set for loglinear analysis
Dear users,

I need to perform a loglinear analysis of a real data set for a course project. I need a real data set with contingency tables of at least 3 dimensions, each with more than 2 levels.

Thanks
Joe Warfield
Re: [R] Data set for loglinear analysis
Warfield Jr., Joseph D. wrote:
> Dear users
> I need to perform a loglinear analysis of a real data set for a course
> project. I need a real data set with contingency tables in at least 3
> dimensional, each with more than 2 levels.

Do

  data(package="datasets")

and look. Maybe

  data(UCBAdmissions)

Kjetil

-- 
Kjetil Halvorsen.
Peace is the most effective weapon of mass construction.
  -- Mahdi Elmandjra
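As a starting point, a loglinear model for the suggested UCBAdmissions table can be fitted as a Poisson GLM. This is a minimal sketch with one possible model (all two-way interactions), not a recommended analysis. Note that UCBAdmissions is a 2 x 2 x 6 table, so two of its factors have only two levels and it may not meet the "more than 2 levels" requirement:

```r
# UCBAdmissions ships with R: Admit x Gender x Dept
dim(UCBAdmissions)

# Long format: one row per cell, with the count in Freq
adm <- as.data.frame(UCBAdmissions)

# Loglinear model with all two-way interactions, fitted as a Poisson GLM
fit <- glm(Freq ~ (Admit + Gender + Dept)^2, family = poisson, data = adm)

deviance(fit)      # likelihood-ratio statistic against the saturated model
df.residual(fit)   # 24 cells minus 19 parameters
```

The residual deviance and its degrees of freedom test whether the three-way interaction can be dropped; other models are obtained by changing the formula.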
Re: [R] Data Set
You could be really classical and use the iris data! Have a look at:

http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/DataSets

The titanic dataset is a real classic one! However, it depends very much on what you want to study.

Anne

----- Original Message -----
From: Talita Leite [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Sent: Tuesday, January 11, 2005 12:08 AM
Subject: [R] Data Set
[R] Data Set
Hi everybody,

I'm studying descriptive statistics using R and I have an important piece of work to do on it. I need some help choosing a good data set to apply those statistics to. Does anybody know a good data set I could work with?

Thanks,

Talita Perciano Costa Leite
Undergraduate student in Computer Science
Universidade Federal de Alagoas - UFAL
Departamento de Tecnologia da Informação - TCI
Construção de Conhecimento por Agrupamento de Dados - CoCADa
Re: [R] Data Set
Much of the documentation, including manuals, help files and other contributed descriptions, includes sample data sets. What kinds of applications and techniques most interest you? That, together with the posting guide (http://www.R-project.org/posting-guide.html, especially the search at www.r-project.org), might lead you to suitable examples.

Hope this helps.
spencer graves
[R] Data Set
Hi,

I'll try to be more specific in asking my question. I want to apply some functions like mean(), median(), var(), sd(), mad(), quantile(), kurtosis(), skewness() and make some graphics like boxplot, barplot, histogram, stars... In order to do that I need a simple data set, simple but interesting.

Thanks,

Talita Perciano Costa Leite
RE: [R] Data Set
There are a few packages on CRAN that are collections of data sets, some from intro textbooks. You might find some of them suitable. There are also datasets that come with R. Type data() at the R prompt to see a list.

Andy
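As one possibility, the functions and plots asked about can be tried on the iris data that ships with R. The sketch below is only an illustration. Base R has no skewness() or kurtosis(), so simple moment-based versions are written out by hand here; add-on packages provide their own, which may use slightly different definitions:

```r
x <- iris$Sepal.Length   # iris ships with R

# Location and spread
mean(x); median(x); var(x); sd(x); mad(x)
quantile(x)

# Moment-based skewness and excess kurtosis, computed by hand
m    <- mean(x)
s2   <- mean((x - m)^2)
skew <- mean((x - m)^3) / s2^1.5
kurt <- mean((x - m)^4) / s2^2 - 3
skew; kurt

# Graphics
boxplot(Sepal.Length ~ Species, data = iris)
hist(x)
barplot(table(iris$Species))
stars(iris[1:10, 1:4])   # first ten flowers, four measurements each
```

Typing data() lists the other data sets available, and any numeric column can be substituted for x.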