Re: [R] SQL vs R
So, some feedback.

Have installed MariaDB 10.0.10 on the Linux box. That speeded things up. Changed from InnoDB/XtraDB to Aria. That speeded up loading of the data. Have installed MariaDB on the iMac. That speeded things up more. Tried to tune MariaDB's config. Didn't speed things up much, except for the query buffer. Figured out replication (from the Linux box to the iMac). This slowed loading down somewhat.

Played with the SQL. Speeded things up significantly. Played with the indexes. Did not speed things up much. Found what I could do in data.table that was faster than SQL and did that. Obvious increase in speed. My R processing time came down from 35 to 6 1/2 minutes.

Removed all large tables before saving (and once the raw data was no longer required). That reduced RData from 150MB to 7KB. Pushed the table and image generation into a second R file. This takes 4 seconds. The corresponding LyX/LaTeX/Beamer/knitr runs in 12 seconds. Installed RStudio. Nice.

Adding new SQL queries adds between 30 and 90 seconds in the input R file, next to nothing to the presentation generation. I could not care less how long the input takes, even hours, as long as I can save the analysis results and not the data into the RData.

el

PS: Ordered a MacPro :-)-O. Will report back.

on 2014-05-06, 15:40 Peter Crowther said the following:
The dataset is not large by database standards. Even in mySQL - not known for its speed at multi-row querying - the queries you describe should complete within a few seconds on even moderately recent hardware if your indexes are reasonable. What are your performance criteria for processing these queries, and how have you / your team optimised the relational database storage? Cheers, - Peter -- Peter Crowther, Director, Melandra Limited

On 6 May 2014 15:32, Dr Eberhard Lisse e...@lisse.na wrote:
Exactly, which is why I am looking for something faster :-)-O el

on 2014-05-06, 15:21 David R Forrest said the following:
It sounds as if your underlying MySQL database is too slow for your purposes. Whatever you layer on top of it will be constrained by the underlying database. To speed up the process significantly, you may need to do work on the database backend part of the process. Dave

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
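The feedback message above describes removing the large tables before saving, so that the RData file holds only the analysis results. A minimal sketch of that idea in base R follows. The object names (`raw_data`, `counts`) and the file name are hypothetical stand-ins, not taken from the poster's actual scripts.

```r
# Hypothetical stand-in for the large table pulled from the database.
raw_data <- data.frame(f = sample(letters[1:3], 1e5, replace = TRUE))

# Small analysis result derived from the raw data (counts per value of f).
counts <- as.data.frame(table(f = raw_data$f))

# Drop the raw data once it is no longer required, then save only the result.
rm(raw_data)
out_file <- file.path(tempdir(), "results.RData")
save(counts, file = out_file)

# A second script (e.g. the one generating tables and images) can then
# load the small file without touching the raw data.
env <- new.env()
load(out_file, envir = env)
ls(env)
```

Using `save()` with an explicit list of objects, rather than saving the whole workspace, is one way to make sure nothing large slips into the file by accident.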
Re: [R] SQL vs R
Thank you.

My requirements are that simple. One table, 11 fields, of which 3 are interesting, 30 million records, growing daily by between 30. And, yes I have spent an enormous amount of time reading these things, but for someone not dealing with this professionally and/or on a daily basis, the documents don't help much.

el

on 2014-05-04, 05:26 Jeff Newmiller said the following:
?table ?aggregate Also, packages plyr, data.table, and dplyr. You might consider reading [1], but if your interests are really as simple as your examples then the table function should be sufficient. That function is discussed in the Introduction to R document that you really should have read before posting here. [1] http://www.jstatsoft.org/v40/i01/ [...]
Re: [R] SQL vs R
In what format is this growing data stored? CSV? SQL? Log textfile? You say you don't want to use sqldf, but you haven't said what you do want to use.

--- Jeff Newmiller, Research Engineer (Solar/Batteries/Software/Embedded Controllers), DCN: jdnew...@dcn.davis.ca.us --- Sent from my phone. Please excuse my brevity.

On May 6, 2014 1:16:12 AM PDT, Eberhard Lisse nos...@lisse.na wrote:
Thank you. My requirements are that simple. One table, 11 fields, of which 3 are interesting, 30 Million records, growing daily by between 30. And, yes I have spent an enormous amount of time reading these things, but for someone not dealing with this professionally and/or on a daily basis, the documents don't help much. el

on 2014-05-04, 05:26 Jeff Newmiller said the following:
?table ?aggregate Also, packages plyr, data.table, and dplyr. You might consider reading [1], but if your interests are really as simple as your examples then the table function should be sufficient. That function is discussed in the Introduction to R document that you really should have read before posting here. [1] http://www.jstatsoft.org/v40/i01/ [...]
Re: [R] SQL vs R
Jeff

It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a dataframe it saves to 180MB. I work from the dataframe. But, it's not only a size issue it's also a speed issue and hence I don't care what I am going to use, as long as it is fast.

sqldf is easy to understand for me but it takes ages. If alternatives were roughly similar in speed I would remain with sqldf. dplyr sounds faster, and promising, but the intrinsic stuff is way beyond me (elderly Gynaecologist) on the learning curve...

el

on 2014-05-06, 09:41 Jeff Newmiller said the following:
In what format is this growing data stored? CSV? SQL? Log textfile? You say you don't want to use sqldf, but you haven't said what you do want to use.
Re: [R] SQL vs R
Hi,

Yes dplyr syntax is quite equivalent to SQL, although it is faster. Another alternative you could consider is to use *data.table* which has a syntax very similar to the way you select subset within a data.frame and in terms of performance is faster (a bit) than sqldf. You can get some idea of how to work with it here: http://stackoverflow.com/questions/1727772/quickly-reading-very-large-tables-as-dataframes-in-r

Regards, Carlos Ortega www.qualityexcellence.es

2014-05-06 11:12 GMT+02:00 Dr Eberhard Lisse e...@lisse.na:
Jeff It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a dataframe it saves to 180MB. I work from the dataframe. But, it's not only a size issue it's also a speed issue and hence I don't care what I am going to use, as long as it is fast. sqldf is easy to understand for me but it takes ages. If alternatives were roughly similar in speed I would remain with sqldf. dplyr sounds faster, and promising, but the intrinsic stuff is way beyond me (elderly Gynaecologist) on the learning curve... el

on 2014-05-06, 09:41 Jeff Newmiller said the following:
In what format is this growing data stored? CSV? SQL? Log textfile? You say you don't want to use sqldf, but you haven't said what you do want to use.
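As a rough illustration of the data.table suggestion in the message above, the two queries from the original question might be written as follows. This is only a sketch on toy data, assuming the data.table package is installed. The names `b`, `c`, `d` and `f` are the placeholders from the question, and the toy values are invented.

```r
library(data.table)

# Toy table using the placeholder names from the original question.
b <- data.table(
  c = c("d", "x", "d", "d", "y"),
  f = c("q", "p", "q", "r", "p")
)

# SELECT COUNT(*) FROM b WHERE c = 'd'
# Inside [ ], `c` refers to the column c; .N is the number of selected rows.
n_d <- b[c == "d", .N]

# SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f
# keyby groups by f and returns the groups in sorted order.
counts_by_f <- b[, .N, keyby = f]
```

The `DT[i, j, by]` form maps loosely onto SQL's WHERE, SELECT and GROUP BY, which may make it easier to approach for someone coming from sqldf.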
Re: [R] SQL vs R
On Tue, 6 May 2014 10:12:50 +0100 Dr Eberhard Lisse e...@lisse.na wrote:
Jeff It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a dataframe it saves to 180MB. I work from the dataframe. But, it's not only a size issue it's also a speed issue and hence I don't care what I am going to use, as long as it is fast. sqldf is easy to understand for me but it takes ages. If alternatives were roughly similar in speed I would remain with sqldf. dplyr sounds faster, and promising, but the intrinsic stuff is way beyond me (elderly Gynaecologist) on the learning curve... el

on 2014-05-06, 09:41 Jeff Newmiller said the following:
In what format is this growing data stored? CSV? SQL? Log textfile? You say you don't want to use sqldf, but you haven't said what you do want to use.

It seems like you are trying to extract a (relatively) small data set from a much larger SQL database. Why not do the SQL stuff in the database and the analysis (stats, graphics...) in R? Maybe use a make table query to grab the data of interest, and then import the whole table into R for the analysis? (Disclaimer: my ignorance of SQL is not far off total)

HTH D.
Re: [R] SQL vs R
David,

this is quite slow :-)-O

el

on 2014-05-06, 10:55 David McPearson said the following:
[...] It seems like you are trying to extract a (relatively) small data set from a much larger SQL database. Why not do the SQL stuff in the database and the analysis (stats, graphics...) in R? Maybe use a make table query to grab the data of interest, and then import the whole table into R for the analysis? (Disclaimer: my ignorance of SQL is not far off total) HTH D. [...]

-- Dr. Eberhard W. Lisse, Obstetrician Gynaecologist (Saar), e...@lisse.na, Telephone: +264 81 124 6733 (cell), PO Box 8421, Bachbrecht, Namibia
Re: [R] SQL vs R
On Tue, May 6, 2014 at 5:12 AM, Dr Eberhard Lisse e...@lisse.na wrote:
Jeff It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a dataframe it saves to 180MB. I work from the dataframe. But, it's not only a size issue it's also a speed issue and hence I don't care what I am going to use, as long as it is fast. sqldf is easy to understand for me but it takes ages. If alternatives were roughly similar in speed I would remain with sqldf. dplyr sounds faster, and promising, but the intrinsic stuff is way beyond me (elderly Gynaecologist) on the learning curve...

You can create indices in sqldf and that can speed up processing substantially for certain operations. See examples 4h and 4i on the sqldf home page: http://sqldf.googlecode.com. Also note that sqldf supports not only the default SQLite backend but also MySQL, h2 and postgresql. See ?sqldf for info on using sqldf with MySQL and the others.
Re: [R] SQL vs R
Thanks, tried all of that, too slow.

el

on 2014-05-06, 12:00 Gabor Grothendieck said the following:
On Tue, May 6, 2014 at 5:12 AM, Dr Eberhard Lisse e...@lisse.na wrote:
Jeff It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a dataframe it saves to 180MB. I work from the dataframe. But, it's not only a size issue it's also a speed issue and hence I don't care what I am going to use, as long as it is fast. sqldf is easy to understand for me but it takes ages. If alternatives were roughly similar in speed I would remain with sqldf. dplyr sounds faster, and promising, but the intrinsic stuff is way beyond me (elderly Gynaecologist) on the learning curve...

You can create indices in sqldf and that can speed up processing substantially for certain operations. See examples 4h and 4i on the sqldf home page: http://sqldf.googlecode.com. Also note that sqldf supports not only the default SQLite backend but also MySQL, h2 and postgresql. See ?sqldf for info on using sqldf with MySQL and the others.

-- Dr. Eberhard W. Lisse, Obstetrician Gynaecologist (Saar), e...@lisse.na, Telephone: +264 81 124 6733 (cell), PO Box 8421, Bachbrecht, Namibia
Re: [R] SQL vs R
It sounds as if your underlying MySQL database is too slow for your purposes. Whatever you layer on top of it will be constrained by the underlying database. To speed up the process significantly, you may need to do work on the database backend part of the process.

Dave

On May 6, 2014, at 7:08 AM, Dr Eberhard Lisse e...@lisse.na wrote:
Thanks, tried all of that, too slow. el

on 2014-05-06, 12:00 Gabor Grothendieck said the following:
On Tue, May 6, 2014 at 5:12 AM, Dr Eberhard Lisse e...@lisse.na wrote:
Jeff It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a dataframe it saves to 180MB. I work from the dataframe. But, it's not only a size issue it's also a speed issue and hence I don't care what I am going to use, as long as it is fast. sqldf is easy to understand for me but it takes ages. If alternatives were roughly similar in speed I would remain with sqldf. dplyr sounds faster, and promising, but the intrinsic stuff is way beyond me (elderly Gynaecologist) on the learning curve...

You can create indices in sqldf and that can speed up processing substantially for certain operations. See examples 4h and 4i on the sqldf home page: http://sqldf.googlecode.com. Also note that sqldf supports not only the default SQLite backend but also MySQL, h2 and postgresql. See ?sqldf for info on using sqldf with MySQL and the others.

-- Dr. David Forrest d...@vims.edu 804-684-7900w 757-968-5509h 804-413-7125c #240 Andrews Hall Virginia Institute of Marine Science Route 1208, Greate Road PO Box 1346 Gloucester Point, VA, 23062-1346
Re: [R] SQL vs R
Exactly, which is why I am looking for something faster :-)-O

el

on 2014-05-06, 15:21 David R Forrest said the following:
It sounds as if your underlying MySQL database is too slow for your purposes. Whatever you layer on top of it will be constrained by the underlying database. To speed up the process significantly, you may need to do work on the database backend part of the process. Dave
Re: [R] SQL vs R
The dataset is not large by database standards. Even in mySQL - not known for its speed at multi-row querying - the queries you describe should complete within a few seconds on even moderately recent hardware if your indexes are reasonable.

What are your performance criteria for processing these queries, and how have you / your team optimised the relational database storage?

Cheers, - Peter -- Peter Crowther, Director, Melandra Limited

On 6 May 2014 15:32, Dr Eberhard Lisse e...@lisse.na wrote:
Exactly, which is why I am looking for something faster :-)-O el

on 2014-05-06, 15:21 David R Forrest said the following:
It sounds as if your underlying MySQL database is too slow for your purposes. Whatever you layer on top of it will be constrained by the underlying database. To speed up the process significantly, you may need to do work on the database backend part of the process. Dave
Re: [R] SQL vs R
I believe this discussion should be taken offlist as it no longer seems to be concerned with R.

-- Bert Gunter
Bert Gunter Genentech Nonclinical Biostatistics (650) 467-7374
Data is not information. Information is not knowledge. And knowledge is certainly not wisdom. H. Gilbert Welch

On Tue, May 6, 2014 at 7:40 AM, Peter Crowther peter.crowt...@melandra.com wrote:
The dataset is not large by database standards. Even in mySQL - not known for its speed at multi-row querying - the queries you describe should complete within a few seconds on even moderately recent hardware if your indexes are reasonable. What are your performance criteria for processing these queries, and how have you / your team optimised the relational database storage? Cheers, - Peter -- Peter Crowther, Director, Melandra Limited

On 6 May 2014 15:32, Dr Eberhard Lisse e...@lisse.na wrote:
Exactly, which is why I am looking for something faster :-)-O el

on 2014-05-06, 15:21 David R Forrest said the following:
It sounds as if your underlying MySQL database is too slow for your purposes. Whatever you layer on top of it will be constrained by the underlying database. To speed up the process significantly, you may need to do work on the database backend part of the process. Dave
Re: [R] SQL vs R
On Wed, May 7, 2014 at 2:21 AM, David R Forrest d...@vims.edu wrote:
It sounds as if your underlying MySQL database is too slow for your purposes. Whatever you layer on top of it will be constrained by the underlying database. To speed up the process significantly, you may need to do work on the database backend part of the process.

You might try MonetDB and its R interface -- it is fast for aggregation operations, and either the current version or the upcoming version has dplyr support.

-thomas

-- Thomas Lumley Professor of Biostatistics University of Auckland
Re: [R] SQL vs R
On May 5, 2014, at 11:44 AM, Dr Eberhard Lisse wrote:
I do not wish to prolong this metadiscussion but I remain confused by your advice:

1) You don't understand what I asked (ie would have to parse two simple SQL statements)

Correct ... at least for me. I could have guessed at what that statement might have meant, but why should I need to guess? Why not use a shared natural language rather than restricting your audience to the more limited group that understands both languages?

2) The Original Post is understood enough, however, to point me to the Introduction to R (where I have not found something to help me)

That means my guess would have been wrong, since like Jeff Newmiller, I thought a simple call to `table` would have succeeded. (Section 5.10 although the desire for ordering suggested by my guess regarding

3) My name is not Pete.

I'm actually not sure who Pete was. It's a local expression of astonishment directed, not at you, but at Satish. That was a prelude to my effort to explain to Satish why the other respondents to the list have not seen fit to be more expansive in their responses. I thought Satish's comment was gratuitous (and likewise unhelpful).

If you don't want to help me, don't.

Several people are trying to help. You are remaining obdurate in failing to explain what is desired in natural language or posting an example in R code with desired output, as well as in failing to heed multiple other bits of advice in the Posting Guide. The accepted practice in responses is to include any context that might further the conversation. To my mind that would have required that you include the original request:

How do I do something like this without using sqldf? a <- sqldf("SELECT COUNT(*) FROM b WHERE c = 'd'") or e <- sqldf("SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f")

How does the earlier suggestion to look at the 'table' function fail to address the first alternative? (It appears it might satisfy the second one as well.)

-- David.

Nobody is forcing you to reply. el

On 2014-05-04, 06:56 , David Winsemius wrote:
On May 3, 2014, at 9:10 PM, Satish Anupindi Rao wrote:
By making the effort to learn R?? very constructive and not condescending at all. We, lesser beings, are indebted to you, sir.

For Pete's sake. The OP didn't even express his original request in natural language or offer a working example. Those of us who are not regular SQL users would have needed to parse out the SQL code in order to figure out what was intended. (My guess is that it would have been quite easy to solve if those were what were offered.) But making the effort to divine the intent didn't seem justified by the level of courtesy offered by the questioner.

David Winsemius Alameda, CA, USA
Re: [R] SQL vs R
On Fri, May 2, 2014 at 5:23 PM, Dr Eberhard Lisse nos...@lisse.na wrote:
Hi, How do I do something like this without using sqldf? a <- sqldf("SELECT COUNT(*) FROM b WHERE c = 'd'") or e <- sqldf("SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f")

In the examples section at the bottom of ?sqldf are a number of SQL statements and the corresponding R statements.

-- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
Re: [R] SQL vs R
Thank you very much, Mr Arkell.

el

On 2014-05-03, 07:11 , Bert Gunter wrote:
By making the effort to learn R? See e.g. the Introduction to R tutorial that ships with R.
-- Bert
Bert Gunter Genentech Nonclinical Biostatistics (650) 467-7374
Data is not information. Information is not knowledge. And knowledge is certainly not wisdom. H. Gilbert Welch

On Fri, May 2, 2014 at 2:23 PM, Dr Eberhard Lisse nos...@lisse.na wrote:
Hi, How do I do something like this without using sqldf? a <- sqldf("SELECT COUNT(*) FROM b WHERE c = 'd'") or e <- sqldf("SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f") greetings, el
Re: [R] SQL vs R
Thanks, will try to figure this out :-)-O

el

On 2014-05-03, 06:40 , Carlos Ortega wrote:
Hi, With the new package dplyr you can create queries with a syntax equivalent to SQL, like the one you need. You can find examples of how to apply it here: http://martinsbioblogg.wordpress.com/2014/03/26/using-r-quickly-calculating-summary-statistics-with-dplyr/ http://martinsbioblogg.wordpress.com/2014/03/27/more-fun-with-and/ Regards, Carlos.

2014-05-02 23:23 GMT+02:00 Dr Eberhard Lisse nos...@lisse.na:
Hi, How do I do something like this without using sqldf? a <- sqldf("SELECT COUNT(*) FROM b WHERE c = 'd'") or e <- sqldf("SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f") greetings, el

-- Saludos, Carlos Ortega www.qualityexcellence.es
Re: [R] SQL vs R
On 04/05/14 00:05, Dr Eberhard W Lisse wrote:
Thank you very much, Mr Arkell.

I don't get it. Can anyone explain the (joke? allusion?) ?

cheers, Rolf Turner

On 2014-05-03, 07:11 , Bert Gunter wrote:
By making the effort to learn R? See e.g. the Introduction to R tutorial that ships with R.
-- Bert
Bert Gunter Genentech Nonclinical Biostatistics (650) 467-7374
Data is not information. Information is not knowledge. And knowledge is certainly not wisdom. H. Gilbert Welch

On Fri, May 2, 2014 at 2:23 PM, Dr Eberhard Lisse nos...@lisse.na wrote:
Hi, How do I do something like this without using sqldf? a <- sqldf("SELECT COUNT(*) FROM b WHERE c = 'd'") or e <- sqldf("SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f")
Re: [R] SQL vs R
On Sat, May 3, 2014 at 5:42 PM, Rolf Turner r.tur...@auckland.ac.nz wrote:
On 04/05/14 00:05, Dr Eberhard W Lisse wrote:
Thank you very much, Mr Arkell.

I don't get it. Can anyone explain the (joke? allusion?) ?

I believe it's a moderately offensive reply from someone who feels unfairly dismissed, derived from British pop culture. But someone who's actually British could better explain, I'm sure.

Personally, I'm not sure how much work someone who appears to have not read the posting guide should really expect the list to do on his behalf. But snarky replies to reasonable requests to read the documentation are easier than doing one's own work.

Sarah

cheers, Rolf Turner

On 2014-05-03, 07:11 , Bert Gunter wrote:
By making the effort to learn R? See e.g. the Introduction to R tutorial that ships with R.
-- Bert
Bert Gunter Genentech Nonclinical Biostatistics (650) 467-7374
Data is not information. Information is not knowledge. And knowledge is certainly not wisdom. H. Gilbert Welch

On Fri, May 2, 2014 at 2:23 PM, Dr Eberhard Lisse nos...@lisse.na wrote:
Hi, How do I do something like this without using sqldf? a <- sqldf("SELECT COUNT(*) FROM b WHERE c = 'd'") or e <- sqldf("SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f")

-- Sarah Goslee http://www.functionaldiversity.org
Re: [R] SQL vs R
Google Pressdram :-)-O

el

On 2014-05-03, 23:42 , Rolf Turner wrote:
On 04/05/14 00:05, Dr Eberhard W Lisse wrote:
Thank you very much, Mr Arkell.

I don't get it. Can anyone explain the (joke? allusion?) ? cheers, Rolf Turner
Re: [R] SQL vs R
On 04/05/14 10:16, Dr Eberhard W Lisse wrote:
Google Pressdram :-)-O el

On 2014-05-03, 23:42 , Rolf Turner wrote:
On 04/05/14 00:05, Dr Eberhard W Lisse wrote:
Thank you very much, Mr Arkell.

I don't get it. Can anyone explain the (joke? allusion?) ?

Thank you.

cheers, Rolf Turner
Re: [R] SQL vs R
On 04/05/14 09:58, Sarah Goslee wrote:
SNIP
Personally, I'm not sure how much work someone who appears to have not read the posting guide should really expect the list to do on his behalf. But snarky replies to reasonable requests to read the documentation are easier than doing one's own work.

Well put.

cheers, Rolf Turner
Re: [R] SQL vs R
By making the effort to learn R?? very constructive and not condescending at all. We, lesser beings, are indebted to you, sir.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Bert Gunter
Sent: Saturday, May 03, 2014 1:12 AM
To: Dr Eberhard Lisse
Cc: r
Subject: Re: [R] SQL vs R

By making the effort to learn R? See e.g. the Introduction to R tutorial that ships with R.
-- Bert
Bert Gunter Genentech Nonclinical Biostatistics (650) 467-7374
Data is not information. Information is not knowledge. And knowledge is certainly not wisdom. H. Gilbert Welch

On Fri, May 2, 2014 at 2:23 PM, Dr Eberhard Lisse nos...@lisse.na wrote:
Hi, How do I do something like this without using sqldf? a <- sqldf("SELECT COUNT(*) FROM b WHERE c = 'd'") or e <- sqldf("SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f") greetings, el
Re: [R] SQL vs R
?table
?aggregate

Also, packages plyr, data.table, and dplyr. You might consider reading [1], but if your interests are really as simple as your examples then the table function should be sufficient. That function is discussed in the Introduction to R document that you really should have read before posting here.

[1] http://www.jstatsoft.org/v40/i01/

---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

On May 2, 2014 2:23:13 PM PDT, Dr Eberhard Lisse nos...@lisse.na wrote:
> Hi,
>
> How do I do something like this without using sqldf?
>
>     a <- sqldf("SELECT COUNT(*) FROM b WHERE c = 'd'")
>
> or
>
>     e <- sqldf("SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f")
>
> greetings, el
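A minimal base-R sketch of the two queries from the original post, using the table and aggregate functions suggested in the reply above. The contents of the data frame b are invented for illustration (the post names only b, c, d and f), so treat this as one possible reading of the question and not as the poster's actual data or a definitive answer.

```r
# Hypothetical stand-in for the poster's data frame `b`.
# The column names c and f come from the SQL; the values are made up.
b <- data.frame(
  c = c("d", "x", "d", "y", "d"),
  f = c("u", "v", "u", "w", "v"),
  stringsAsFactors = FALSE
)

# SELECT COUNT(*) FROM b WHERE c = 'd'
# Count the rows where column c equals "d"; NA values are not counted.
a <- sum(b$c == "d", na.rm = TRUE)

# SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f
# table() tabulates the values of f in sorted order. Note that it drops
# NA by default (see the useNA argument), unlike a SQL GROUP BY on NULLs.
e <- as.data.frame(table(f = b$f), responseName = "count",
                   stringsAsFactors = FALSE)

# The same grouping with aggregate(): count the rows of c within each f.
# The count column keeps the name c here.
e2 <- aggregate(c ~ f, data = b, FUN = length)

print(a)
print(e)
print(e2)
```

Here table is the shorter route for plain counts; aggregate becomes more useful once the summary is something other than a row count (for example FUN = mean on a numeric column).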
Re: [R] SQL vs R
On May 3, 2014, at 9:10 PM, Satish Anupindi Rao wrote:

> By making the effort to learn R?? very constructive and not condescending at all. We, lesser beings, are indebted to you, sir.

For Pete's sake. The OP didn't even express his original request in natural language or offer a working example. Those of us who are not regular SQL users would have needed to parse out the SQL code in order to figure out what was intended. (My guess is that it would have been quite easy to solve if those were what were offered.) But making the effort to divine the intent didn't seem justified by the level of courtesy offered by the questioner.

-- 
David.

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Bert Gunter
> Sent: Saturday, May 03, 2014 1:12 AM
> To: Dr Eberhard Lisse
> Cc: r
> Subject: Re: [R] SQL vs R
>
> By making the effort to learn R? See e.g. the Introduction to R tutorial that ships with R.
>
> -- Bert
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
> (650) 467-7374
>
> Data is not information. Information is not knowledge. And knowledge is certainly not wisdom.
> H. Gilbert Welch
>
> On Fri, May 2, 2014 at 2:23 PM, Dr Eberhard Lisse nos...@lisse.na wrote:
>> Hi,
>>
>> How do I do something like this without using sqldf?
>>
>>     a <- sqldf("SELECT COUNT(*) FROM b WHERE c = 'd'")
>>
>> or
>>
>>     e <- sqldf("SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f")
>>
>> greetings, el

David Winsemius
Alameda, CA, USA
[R] SQL vs R
Hi,

How do I do something like this without using sqldf?

    a <- sqldf("SELECT COUNT(*) FROM b WHERE c = 'd'")

or

    e <- sqldf("SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f")

greetings, el
Re: [R] SQL vs R
Hi,

With the new package dplyr you can write queries equivalent to the SQL ones you need.

You can find examples of how to apply it here:

http://martinsbioblogg.wordpress.com/2014/03/26/using-r-quickly-calculating-summary-statistics-with-dplyr/
http://martinsbioblogg.wordpress.com/2014/03/27/more-fun-with-and/

Regards,
Carlos.

2014-05-02 23:23 GMT+02:00 Dr Eberhard Lisse nos...@lisse.na:
> Hi,
>
> How do I do something like this without using sqldf?
>
>     a <- sqldf("SELECT COUNT(*) FROM b WHERE c = 'd'")
>
> or
>
>     e <- sqldf("SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f")
>
> greetings, el

-- 
Saludos,
Carlos Ortega
www.qualityexcellence.es
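As a rough illustration of the dplyr route mentioned in the reply above, the two queries from the original post might look like the sketch below. It assumes the dplyr package is installed, and the contents of the data frame b are invented (only the names b, c, d and f appear in the post), so this is a sketch under those assumptions and not the poster's actual setup.

```r
library(dplyr)

# Hypothetical stand-in for the poster's data frame `b`.
# The column names c and f come from the SQL; the values are made up.
b <- data.frame(
  c = c("d", "x", "d", "y", "d"),
  f = c("u", "v", "u", "w", "v"),
  stringsAsFactors = FALSE
)

# SELECT COUNT(*) FROM b WHERE c = 'd'
# filter() keeps the matching rows, summarise(n = n()) counts them.
a <- b %>%
  filter(c == "d") %>%
  summarise(n = n())

# SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f
# group_by() + summarise() mirrors GROUP BY; arrange() mirrors ORDER BY.
e <- b %>%
  group_by(f) %>%
  summarise(n = n()) %>%
  arrange(f)

print(a)
print(e)
```

Each verb maps onto one SQL clause, which is what makes the translation fairly mechanical: filter for WHERE, group_by for GROUP BY, summarise for the aggregate, arrange for ORDER BY.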
Re: [R] SQL vs R
By making the effort to learn R? See e.g. the Introduction to R tutorial that ships with R.

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge is certainly not wisdom.
H. Gilbert Welch

On Fri, May 2, 2014 at 2:23 PM, Dr Eberhard Lisse nos...@lisse.na wrote:
> Hi,
>
> How do I do something like this without using sqldf?
>
>     a <- sqldf("SELECT COUNT(*) FROM b WHERE c = 'd'")
>
> or
>
>     e <- sqldf("SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f")
>
> greetings, el