Re: [R] SQL vs R

2014-05-21 Thread Dr Eberhard Lisse
So,

some feedback.

Have installed MariaDB 10.0.10 on the Linux box. That sped things up.
Changed from InnoDB/XtraDB to Aria. That sped up loading of the data.
Have installed MariaDB on the iMac. That sped things up more.
Tried to tune MariaDB's config. Didn't speed things up much, except for
the query buffer.
Figured out replication (from the Linux box to the iMac). This slowed
loading down somewhat.
Played with the SQL. Sped things up significantly.
Played with the indexes. Did not speed things up much.
Found what I could do in data.table that was faster than SQL and did
that. Obvious increase in speed.

My R processing time came down from 35 to 6 1/2 minutes.


Removed all large tables before saving (and once the raw data was no
longer required). That reduced RData from 150MB to 7KB.

Pushed the table and image generation into a second R file. This takes 4
seconds. The corresponding LyX/LaTeX/Beamer/KnitR runs in 12 seconds.

Installed RStudio. Nice.

Adding new SQL queries adds between 30 and 90 seconds to the input R
file, next to nothing to the presentation generation.

I could not care less how long the input takes, even hours, as long as I
can save the analysis results and not the data into the RData.
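
A minimal sketch of the "save the results, not the data" step described
above, in base R. The object names (`raw_data`, `counts`) and the temporary
file are assumptions for illustration, not the actual code from this
workflow:

```r
set.seed(1)

# Hypothetical stand-in for a large raw table pulled from the database.
raw_data <- data.frame(f = sample(letters[1:3], 1e5, replace = TRUE))

# Keep only the small summary that the presentation needs.
counts <- as.data.frame(table(f = raw_data$f), responseName = "n")

# Drop the raw data before saving, so it never reaches the RData file.
rm(raw_data)

out <- tempfile(fileext = ".RData")  # assumed file name
save(counts, file = out)

# A second script (tables, images) would then load only the results.
results <- new.env()
load(out, envir = results)
print(ls(results))
```

Saving named objects with save() rather than the whole workspace makes it
harder to write a large table to disk by accident.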

el

PS: Ordered a MacPro :-)-O. Will report back.

on 2014-05-06, 15:40 Peter Crowther said the following:
 The dataset is not large by database standards.  Even in mySQL - not known
 for its speed at multi-row querying - the queries you describe should
 complete within a few seconds on even moderately recent hardware if your
 indexes are reasonable.
 
 What are your performance criteria for processing these queries, and how
 have you / your team optimised the relational database storage?
 
 Cheers,
 
 - Peter
 --
 Peter Crowther, Director, Melandra Limited
 
 
 On 6 May 2014 15:32, Dr Eberhard Lisse e...@lisse.na wrote:
 
 Exactly,
 
 which is why I am looking for something faster :-)-O
 
 el
 
 on 2014-05-06, 15:21 David R Forrest said the following:
 It sounds as if your underlying MySQL database is too slow for your
 purposes.  Whatever you layer on top of it will be constrained by
 the underlying database.  To speed up the process significantly,
 you may need to do work on the database backend part of the
 process.

 Dave

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 


Re: [R] SQL vs R

2014-05-06 Thread Eberhard Lisse
Thank you.

My requirements are that simple. One table, 11 fields, of which 3 are
interesting, 30 Million records, growing daily by between 30.

And, yes I have spent an enormous amount of time reading these things,
but for someone not dealing with this professionally and/or on a daily
basis, the documents don't help much.

el


on 2014-05-04, 05:26 Jeff Newmiller said the following:
 ?table
 ?aggregate
 
 Also, packages plyr, data.table, and dplyr.  You might consider
 reading [1], but if your interests are really as simple as your
 examples then the table function should be sufficient.  That
 function is discussed in the Introduction to R document that you
 really should have read before posting here.
 
 [1] http://www.jstatsoft.org/v40/i01/
[...]



Re: [R] SQL vs R

2014-05-06 Thread Jeff Newmiller
In what format is this growing data stored? CSV? SQL? Log textfile? You say 
you don't want to use sqldf, but you haven't said what you do want to use.
---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

On May 6, 2014 1:16:12 AM PDT, Eberhard Lisse nos...@lisse.na wrote:
Thank you.

My requirements are that simple. One table, 11 fields, of which 3 are
interesting, 30 Million records, growing daily by between 30.

And, yes I have spent an enormous amount of time reading these things,
but for someone not dealing with this professionally and/or on a daily
basis, the documents don't help much.

el


on 2014-05-04, 05:26 Jeff Newmiller said the following:
 ?table
 ?aggregate
 
 Also, packages plyr, data.table, and dplyr.  You might consider
 reading [1], but if your interests are really as simple as your
 examples then the table function should be sufficient.  That
 function is discussed in the Introduction to R document that you
 really should have read before posting here.
 
 [1] http://www.jstatsoft.org/v40/i01/
[...]



Re: [R] SQL vs R

2014-05-06 Thread Dr Eberhard Lisse
Jeff

It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a
dataframe it saves to 180MB. I work from the dataframe.

But, it's not only a size issue it's also a speed issue and hence I
don't care what I am going to use, as long as it is fast.

sqldf is easy to understand for me but it takes ages.  If
alternatives were roughly similar in speed I would remain with
sqldf.

dplyr sounds faster, and promising, but the intrinsic stuff is
way beyond me (elderly Gynaecologist) on the learning curve...

el

on 2014-05-06, 09:41 Jeff Newmiller said the following:
 In what format is this growing data stored?  CSV? SQL? Log
 textfile?  You say you don't want to use sqldf, but you haven't
 said what you do want to use.



Re: [R] SQL vs R

2014-05-06 Thread Carlos Ortega
Hi,

Yes, dplyr syntax is quite similar to SQL, and it is faster.
Another alternative you could consider is *data.table*, which has a
syntax very similar to the way you select subsets within a data.frame and,
in terms of performance, is (a bit) faster than sqldf.

You can get some idea of how to work with it here:

http://stackoverflow.com/questions/1727772/quickly-reading-very-large-tables-as-dataframes-in-r
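
As a rough illustration of the data.table syntax mentioned above, the two
queries from the original post might look like this. This is only a sketch:
it assumes the data.table package is installed, and the toy table `b` with
columns `c` and `f` is invented to match the names in the original question:

```r
library(data.table)  # assumes the data.table package is installed

# Toy stand-in for the table `b`; the values are invented.
b <- data.table(
  c = c("d", "x", "d", "y", "d"),
  f = c("beta", "alpha", "beta", "gamma", "alpha")
)

# SELECT COUNT(*) FROM b WHERE c = 'd'
a <- b[c == "d", .N]

# SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f
# keyby groups by f and returns the groups sorted by f.
e <- b[, .N, keyby = f]

print(a)
print(e)
```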

Regards,
Carlos Ortega
www.qualityexcellence.es





2014-05-06 11:12 GMT+02:00 Dr Eberhard Lisse e...@lisse.na:

 Jeff

 It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a
 dataframe it saves to 180MB. I work from the dataframe.

 But, it's not only a size issue it's also a speed issue and hence I
 don't care what I am going to use, as long as it is fast.

 sqldf is easy to understand for me but it takes ages.  If
 alternatives were roughly similar in speed I would remain with
 sqldf.

 dplyr sounds faster, and promising, but the intrinsic stuff is
 way beyond me (elderly Gynaecologist) on the learning curve...

 el

 on 2014-05-06, 09:41 Jeff Newmiller said the following:
  In what format is this growing data stored?  CSV? SQL? Log
  textfile?  You say you don't want to use sqldf, but you haven't
  said what you do want to use.





-- 
Saludos,
Carlos Ortega
www.qualityexcellence.es



Re: [R] SQL vs R

2014-05-06 Thread David McPearson
On Tue, 6 May 2014 10:12:50 +0100 Dr Eberhard Lisse e...@lisse.na wrote

 Jeff
 
 It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a
 dataframe it saves to 180MB. I work from the dataframe.
 
 But, it's not only a size issue it's also a speed issue and hence I
 don't care what I am going to use, as long as it is fast.
 
 sqldf is easy to understand for me but it takes ages.  If
 alternatives were roughly similar in speed I would remain with
 sqldf.
 
 dplyr sounds faster, and promising, but the intrinsic stuff is
 way beyond me (elderly Gynaecologist) on the learning curve...
 
 el
 
 on 2014-05-06, 09:41 Jeff Newmiller said the following:
  In what format is this growing data stored?  CSV? SQL? Log
  textfile?  You say you don't want to use sqldf, but you haven't
  said what you do want to use.
 


It seems like you are trying to extract a (relatively) small data set from a
much larger SQL database. Why not do the SQL stuff in the database and the
analysis (stats, graphics...) in R? Maybe use a make table query to grab the
data of interest, and then import the whole table into R for the analysis?
(Disclaimer: my ignorance of SQL is not far off total)

HTH
D.




Re: [R] SQL vs R

2014-05-06 Thread Dr Eberhard Lisse
David,

this is quite slow :-)-O

el

on 2014-05-06, 10:55 David McPearson said the following:
[...]
 It seems like you are trying to extract a (relatively) small data set from a
 much larger SQL database. Why not do the SQL stuff in the database and the
 analysis (stats, graphics...) in R? Maybe use a make table query to grab the
 data of interest, and then import the whole table into R for the analysis?
 (Disclaimer: my ignorance of SQL is not far off total)
 
 HTH
 D.
[...]
-- 
Dr. Eberhard W. Lisse  \/ Obstetrician & Gynaecologist (Saar)
e...@lisse.na/ * |   Telephone: +264 81 124 6733 (cell)
PO Box 8421 \ /
Bachbrecht, Namibia ;/



Re: [R] SQL vs R

2014-05-06 Thread Gabor Grothendieck
On Tue, May 6, 2014 at 5:12 AM, Dr Eberhard Lisse e...@lisse.na wrote:
 Jeff

 It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a
 dataframe it saves to 180MB. I work from the dataframe.

 But, it's not only a size issue it's also a speed issue and hence I
 don't care what I am going to use, as long as it is fast.

 sqldf is easy to understand for me but it takes ages.  If
 alternatives were roughly similar in speed I would remain with
 sqldf.

 dplyr sounds faster, and promising, but the intrinsic stuff is
 way beyond me (elderly Gynaecologist) on the learning curve...

You can create indices in sqldf and that can speed up processing
substantially for certain operations.   See examples 4h and 4i on the
sqldf home page: http://sqldf.googlecode.com. Also note that sqldf
supports not only the default SQLite backend but also MySQL, h2 and
postgresql.  See ?sqldf for info on using sqldf with MySQL and the
others.



Re: [R] SQL vs R

2014-05-06 Thread Dr Eberhard Lisse
Thanks,

tried all of that, too slow.

el

on 2014-05-06, 12:00 Gabor Grothendieck said the following:
 On Tue, May 6, 2014 at 5:12 AM, Dr Eberhard Lisse e...@lisse.na wrote:
 Jeff

 It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a
 dataframe it saves to 180MB. I work from the dataframe.

 But, it's not only a size issue it's also a speed issue and hence
 I don't care what I am going to use, as long as it is fast.

 sqldf is easy to understand for me but it takes ages.  If
 alternatives were roughly similar in speed I would remain with
 sqldf.

 dplyr sounds faster, and promising, but the intrinsic stuff is
 way beyond me (elderly Gynaecologist) on the learning curve...
 
 You can create indices in sqldf and that can speed up processing
 substantially for certain operations.  See examples 4h and 4i on
 the sqldf home page: http://sqldf.googlecode.com.  Also note that
 sqldf supports not only the default SQLite backend but also MySQL,
 h2 and postgresql.  See ?sqldf for info on using sqldf with MySQL
 and the others.
 

-- 
Dr. Eberhard W. Lisse  \/ Obstetrician & Gynaecologist (Saar)
e...@lisse.na/ * |   Telephone: +264 81 124 6733 (cell)
PO Box 8421 \ /
Bachbrecht, Namibia ;/



Re: [R] SQL vs R

2014-05-06 Thread David R Forrest
It sounds as if your underlying MySQL database is too slow for your purposes.  
Whatever you layer on top of it will be constrained by the underlying database. 
 To speed up the process significantly, you may need to do work on the database 
backend part of the process.

Dave


On May 6, 2014, at 7:08 AM, Dr Eberhard Lisse e...@lisse.na wrote:

 Thanks,
 
 tried all of that, too slow.
 
 el
 
 on 2014-05-06, 12:00 Gabor Grothendieck said the following:
 On Tue, May 6, 2014 at 5:12 AM, Dr Eberhard Lisse e...@lisse.na wrote:
 Jeff
 
 It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a
 dataframe it saves to 180MB. I work from the dataframe.
 
 But, it's not only a size issue it's also a speed issue and hence
 I don't care what I am going to use, as long as it is fast.
 
 sqldf is easy to understand for me but it takes ages.  If
 alternatives were roughly similar in speed I would remain with
 sqldf.
 
 dplyr sounds faster, and promising, but the intrinsic stuff is
 way beyond me (elderly Gynaecologist) on the learning curve...
 
 You can create indices in sqldf and that can speed up processing
 substantially for certain operations.  See examples 4h and 4i on
 the sqldf home page: http://sqldf.googlecode.com.  Also note that
 sqldf supports not only the default SQLite backend but also MySQL,
 h2 and postgresql.  See ?sqldf for info on using sqldf with MySQL
 and the others.
 
 
 -- 
 Dr. Eberhard W. Lisse  \/ Obstetrician & Gynaecologist (Saar)
 e...@lisse.na/ * |   Telephone: +264 81 124 6733 (cell)
 PO Box 8421 \ /
 Bachbrecht, Namibia ;/
 

--
Dr. David Forrest
d...@vims.edu
804-684-7900w
757-968-5509h
804-413-7125c
#240 Andrews Hall
Virginia Institute of Marine Science
Route 1208, Greate Road
PO Box 1346
Gloucester Point, VA, 23062-1346














Re: [R] SQL vs R

2014-05-06 Thread Dr Eberhard Lisse

Exactly,

which is why I am looking for something faster :-)-O

el

on 2014-05-06, 15:21 David R Forrest said the following:
 It sounds as if your underlying MySQL database is too slow for your
 purposes.  Whatever you layer on top of it will be constrained by
 the underlying database.  To speed up the process significantly,
 you may need to do work on the database backend part of the
 process.
 
 Dave


Re: [R] SQL vs R

2014-05-06 Thread Peter Crowther
The dataset is not large by database standards.  Even in mySQL - not known
for its speed at multi-row querying - the queries you describe should
complete within a few seconds on even moderately recent hardware if your
indexes are reasonable.

What are your performance criteria for processing these queries, and how
have you / your team optimised the relational database storage?

Cheers,

- Peter
--
Peter Crowther, Director, Melandra Limited


On 6 May 2014 15:32, Dr Eberhard Lisse e...@lisse.na wrote:


 Exactly,

 which is why I am looking for something faster :-)-O

 el

 on 2014-05-06, 15:21 David R Forrest said the following:
  It sounds as if your underlying MySQL database is too slow for your
  purposes.  Whatever you layer on top of it will be constrained by
  the underlying database.  To speed up the process significantly,
  you may need to do work on the database backend part of the
  process.
 
  Dave


Re: [R] SQL vs R

2014-05-06 Thread Bert Gunter
I believe this discussion should be taken offlist as it no longer
seems to be concerned with R.

-- Bert Gunter

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
H. Gilbert Welch




On Tue, May 6, 2014 at 7:40 AM, Peter Crowther
peter.crowt...@melandra.com wrote:
 The dataset is not large by database standards.  Even in mySQL - not known
 for its speed at multi-row querying - the queries you describe should
 complete within a few seconds on even moderately recent hardware if your
 indexes are reasonable.

 What are your performance criteria for processing these queries, and how
 have you / your team optimised the relational database storage?

 Cheers,

 - Peter
 --
 Peter Crowther, Director, Melandra Limited


 On 6 May 2014 15:32, Dr Eberhard Lisse e...@lisse.na wrote:


 Exactly,

 which is why I am looking for something faster :-)-O

 el

 on 2014-05-06, 15:21 David R Forrest said the following:
  It sounds as if your underlying MySQL database is too slow for your
  purposes.  Whatever you layer on top of it will be constrained by
  the underlying database.  To speed up the process significantly,
  you may need to do work on the database backend part of the
  process.
 
  Dave


Re: [R] SQL vs R

2014-05-06 Thread Thomas Lumley
On Wed, May 7, 2014 at 2:21 AM, David R Forrest d...@vims.edu wrote:
 It sounds as if your underlying MySQL database is too slow for your purposes. 
  Whatever you layer on top of it will be constrained by the underlying 
 database.  To speed up the process significantly, you may need to do work on 
 the database backend part of the process.


You might try MonetDB and its R interface -- it is fast for
aggregation operations, and either the current version or the upcoming
version has dplyr support.

-thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland



Re: [R] SQL vs R

2014-05-05 Thread David Winsemius

On May 5, 2014, at 11:44 AM, Dr Eberhard Lisse wrote:

 I do not wish to prolong this metadiscussion but I remain confused
 by your advice:
 
 1) You don't understand what I asked (ie would have to parse two
 simple SQL statements)

Correct ... at least for me. I could have guessed at what that statement might 
have meant, but why should I need to guess? Why not use a shared 
natural language rather than restricting your audience to the more limited group 
that understands both languages?

 2) The Original Post is understood enough, however, to point me to
 the Introduction to R (where I have not found something to help
 me)

That means my guess would have been wrong, since like Jeff Newmiller, I thought 
a simple call to `table` would have succeeded. (Section 5.10 although the 
desire for ordering suggested by my guess regarding 

 
 3) My name is not Pete.

I'm actually not sure who Pete was. It's a local expression of astonishment 
directed, not at you, but at Satish. That was a prelude to my effort to explain 
to Satish why the other respondents to the list have not seen fit to be more 
expansive in their responses. I thought Satish's comment was gratuitous (and 
likewise unhelpful).

 
 If you don't want to help me, don't.

Several people are trying to help. You are remaining obdurate in failing to 
explain what is desired in natural language or posting an example in R code 
with desired output, as well as in failing to heed multiple other bits of 
advice in the Posting Guide. The accepted practice in responses is to include 
any context that might further the conversation. To my mind that would have 
required that you include the original request:

 How do I do something like this without using sqldf?
 
 a <- sqldf("SELECT COUNT(*) FROM b WHERE c = 'd'")
 
 or
 
 e <- sqldf("SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f")

How does the earlier suggestion to look at the 'table' function fail to address 
the first alternative? ( It appears it might satisfy the second one as well.)
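
For reference, a minimal base R sketch of how the two queries could be
answered with a logical sum and `table`. The toy data frame `b` and its
values are invented for illustration; only the names `a`, `b`, `c`, `e` and
`f` come from the original request:

```r
# Toy stand-in for the poster's data frame `b`; the values are invented.
b <- data.frame(
  c = c("d", "x", "d", "y", "d"),
  f = c("beta", "alpha", "beta", "gamma", "alpha"),
  stringsAsFactors = FALSE
)

# SELECT COUNT(*) FROM b WHERE c = 'd'
# Rows where c is NA are not counted, as with NULL in SQL.
a <- sum(b$c == "d", na.rm = TRUE)

# SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f
# table() counts each distinct value and returns them in sorted order.
e <- as.data.frame(table(f = b$f), responseName = "n",
                   stringsAsFactors = FALSE)

print(a)
print(e)
```

Note that table() drops NA values by default, whereas GROUP BY would report
NULL as its own group; `useNA = "ifany"` changes that.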

-- 
David.


  Nobody is forcing you to reply.
 
 el
 
 On 2014-05-04, 06:56 , David Winsemius wrote:
 
 On May 3, 2014, at 9:10 PM, Satish Anupindi Rao wrote:
 
 
  By making the effort to learn R??  very constructive and not
 condescending at all.  We, lesser beings, are indebted to you,
 sir.
 
 For Pete's sake.  The OP didn't even express his original request
 in natural language or offer a working example.  Those of us who
 are not regular SQL users would have needed to parse out the SQL
 code in order to figure out what was intended.  (My guess is that
 it would have been quite easy to solve if those were what were
 offered.)  But making the effort to divine the intent didn't seem
 justified by the level of courtesy offered by the questioner.
 
 

David Winsemius
Alameda, CA, USA



Re: [R] SQL vs R

2014-05-05 Thread Gabor Grothendieck
On Fri, May 2, 2014 at 5:23 PM, Dr Eberhard Lisse nos...@lisse.na wrote:
 Hi,

 How do I do something like this without using sqldf?

 a <- sqldf("SELECT COUNT(*) FROM b WHERE c = 'd'")

 or

 e <- sqldf("SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f")


In the examples section at the bottom of ?sqldf are a number of SQL
statements and the corresponding R statements.

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] SQL vs R

2014-05-03 Thread Dr Eberhard W Lisse
Thank you very much, Mr Arkell.

el

On 2014-05-03, 07:11 , Bert Gunter wrote:
 By making the effort to learn R?
 
 See e.g. the Introduction to R tutorial that ships with R.
 
 -- Bert
 
 Bert Gunter
 Genentech Nonclinical Biostatistics
 (650) 467-7374
 
 Data is not information. Information is not knowledge. And knowledge
 is certainly not wisdom.
 H. Gilbert Welch
 
 
 
 
 On Fri, May 2, 2014 at 2:23 PM, Dr Eberhard Lisse nos...@lisse.na wrote:
 Hi,

 How do I do something like this without using sqldf?

 a <- sqldf("SELECT COUNT(*) FROM b WHERE c = 'd'")

 or

 e <- sqldf("SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f")

 greetings, el



Re: [R] SQL vs R

2014-05-03 Thread Dr Eberhard W Lisse
Thanks,

will try to figure this out :-)-O

el

On 2014-05-03, 06:40 , Carlos Ortega wrote:
 Hi,
 
 With the new package dplyr you can write queries with a syntax
 equivalent to SQL, like the one you need.
 You can find examples of how to apply it here:
 
 http://martinsbioblogg.wordpress.com/2014/03/26/using-r-quickly-calculating-summary-statistics-with-dplyr/
 
 http://martinsbioblogg.wordpress.com/2014/03/27/more-fun-with-and/
 
 Regards,
 Carlos.
 
 
 
 
 2014-05-02 23:23 GMT+02:00 Dr Eberhard Lisse nos...@lisse.na:
 
 Hi,
 
 How do I do something like this without using sqldf?
 
 a <- sqldf("SELECT COUNT(*) FROM b WHERE c = 'd'")
 
 or
 
 e <- sqldf("SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f")
 
 greetings, el
 
 
 
 
 
 -- 
 Saludos,
 Carlos Ortega
 www.qualityexcellence.es



Re: [R] SQL vs R

2014-05-03 Thread Rolf Turner


On 04/05/14 00:05, Dr Eberhard W Lisse wrote:


Thank you very much, Mr Arkell.


I don't get it.  Can anyone explain the (joke? allusion?) ?

cheers,

Rolf Turner


On 2014-05-03, 07:11 , Bert Gunter wrote:

By making the effort to learn R?

See e.g. the Introduction to R tutorial that ships with R.

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
H. Gilbert Welch




On Fri, May 2, 2014 at 2:23 PM, Dr Eberhard Lisse nos...@lisse.na wrote:

Hi,

How do I do something like this without using sqldf?

a <- sqldf("SELECT COUNT(*) FROM b WHERE c = 'd'")

or

e <- sqldf("SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f")




Re: [R] SQL vs R

2014-05-03 Thread Sarah Goslee
On Sat, May 3, 2014 at 5:42 PM, Rolf Turner r.tur...@auckland.ac.nz wrote:

 On 04/05/14 00:05, Dr Eberhard W Lisse wrote:

 Thank you very much, Mr Arkell.


 I don't get it.  Can anyone explain the (joke? allusion?) ?

I believe it's a moderately offensive reply from someone who feels
unfairly dismissed, derived from British pop culture. But someone
who's actually British could better explain, I'm sure.

Personally, I'm not sure how much work someone who appears to have not
read the posting guide should really expect the list to do on his
behalf. But snarky replies to reasonable requests to read the
documentation are easier than doing one's own work.

Sarah

 cheers,

 Rolf Turner

 On 2014-05-03, 07:11 , Bert Gunter wrote:

 By making the effort to learn R?

 See e.g. the Introduction to R tutorial that ships with R.

 -- Bert

 Bert Gunter
 Genentech Nonclinical Biostatistics
 (650) 467-7374

 Data is not information. Information is not knowledge. And knowledge
 is certainly not wisdom.
 H. Gilbert Welch




 On Fri, May 2, 2014 at 2:23 PM, Dr Eberhard Lisse nos...@lisse.na
 wrote:

 Hi,

 How do I do something like this without using sqldf?

 a <- sqldf("SELECT COUNT(*) FROM b WHERE c = 'd'")

 or

 e <- sqldf("SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f")





-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] SQL vs R

2014-05-03 Thread Dr Eberhard W Lisse
Google Pressdram :-)-O

el

On 2014-05-03, 23:42 , Rolf Turner wrote:
 On 04/05/14 00:05, Dr Eberhard W Lisse wrote:

 Thank you very much, Mr Arkell.

 I don't get it.  Can anyone explain the (joke? allusion?) ?

 cheers,

 Rolf Turner




Re: [R] SQL vs R

2014-05-03 Thread Rolf Turner

On 04/05/14 10:16, Dr Eberhard W Lisse wrote:

Google Pressdram :-)-O

el

On 2014-05-03, 23:42 , Rolf Turner wrote:

On 04/05/14 00:05, Dr Eberhard W Lisse wrote:


Thank you very much, Mr Arkell.


I don't get it.  Can anyone explain the (joke? allusion?) ?


Thank you.

cheers,

Rolf Turner



Re: [R] SQL vs R

2014-05-03 Thread Rolf Turner

On 04/05/14 09:58, Sarah Goslee wrote:

SNIP


Personally, I'm not sure how much work someone who appears to have not
read the posting guide should really expect the list to do on his
behalf. But snarky replies to reasonable requests to read the
documentation are easier than doing one's own work.


Well put.

cheers,

Rolf Turner



Re: [R] SQL vs R

2014-05-03 Thread Satish Anupindi Rao

"By making the effort to learn R?" Very constructive and not condescending
at all. We, lesser beings, are indebted to you, sir.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Bert Gunter
Sent: Saturday, May 03, 2014 1:12 AM
To: Dr Eberhard Lisse
Cc: r
Subject: Re: [R] SQL vs R

By making the effort to learn R?

See e.g. the Introduction to R tutorial that ships with R.

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge is 
certainly not wisdom.
H. Gilbert Welch




On Fri, May 2, 2014 at 2:23 PM, Dr Eberhard Lisse nos...@lisse.na wrote:
 Hi,

 How do I do something like this without using sqldf?

 a <- sqldf("SELECT COUNT(*) FROM b WHERE c = 'd'")

 or

 e <- sqldf("SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f")

 greetings, el



Re: [R] SQL vs R

2014-05-03 Thread Jeff Newmiller
?table
?aggregate

Also, packages plyr, data.table, and dplyr. You might consider reading [1], but 
if your interests are really as simple as your examples then the table function 
should be sufficient. That function is discussed in the Introduction to R 
document that you really should have read before posting here.

[1] http://www.jstatsoft.org/v40/i01/
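
As a rough illustration of the base R functions named above, the two sqldf queries from the original post might be written as follows. This is only a sketch: the sample data frame is made up, and the column names b$c and b$f are simply taken from the SQL in the question.

```r
# Hypothetical stand-in for the poster's data frame b (columns c and f assumed)
b <- data.frame(
  c = c("d", "x", "d", "d", "y"),
  f = c("q", "p", "q", "r", "p"),
  stringsAsFactors = FALSE
)

# SELECT COUNT(*) FROM b WHERE c = 'd'
# Summing a logical vector counts its TRUE values; na.rm = TRUE skips
# rows where c is NA, which a SQL WHERE clause would also exclude.
a <- sum(b$c == "d", na.rm = TRUE)

# SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f
# table() counts each distinct value and orders the result by value.
# Note that it drops NA by default (see the useNA argument).
e <- as.data.frame(table(f = b$f), responseName = "n",
                   stringsAsFactors = FALSE)

# The same grouping with aggregate(); the count ends up in the column
# named after the left-hand side of the formula (here: c).
e2 <- aggregate(c ~ f, data = b, FUN = length)

print(a)
print(e)
print(e2)
```

For counts as simple as these, table() is usually enough; aggregate() becomes more useful once other summary functions (sum, mean, ...) are needed per group.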

---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

On May 2, 2014 2:23:13 PM PDT, Dr Eberhard Lisse nos...@lisse.na wrote:
Hi,

How do I do something like this without using sqldf?

a <- sqldf("SELECT COUNT(*) FROM b WHERE c = 'd'")

or

e <- sqldf("SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f")

greetings, el



Re: [R] SQL vs R

2014-05-03 Thread David Winsemius

On May 3, 2014, at 9:10 PM, Satish Anupindi Rao wrote:

 
  "By making the effort to learn R?" Very constructive and not condescending
 at all. We, lesser beings, are indebted to you, sir.

For Pete's sake. The OP didn't even express his original request in natural
language or offer a working example. Those of us who are not regular SQL users
would have needed to parse the SQL code to figure out what was intended. (My
guess is that it would have been quite easy to solve had either of those been
offered.) But making the effort to divine the intent didn't seem justified by
the level of courtesy offered by the questioner.

-- 
David.

 
 -----Original Message-----
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
 Behalf Of Bert Gunter
 Sent: Saturday, May 03, 2014 1:12 AM
 To: Dr Eberhard Lisse
 Cc: r
 Subject: Re: [R] SQL vs R
 
 By making the effort to learn R?
 
 See e.g. the Introduction to R tutorial that ships with R.
 
 -- Bert
 
 Bert Gunter
 Genentech Nonclinical Biostatistics
 (650) 467-7374
 
 Data is not information. Information is not knowledge. And knowledge is 
 certainly not wisdom.
 H. Gilbert Welch
 
 
 
 
 On Fri, May 2, 2014 at 2:23 PM, Dr Eberhard Lisse nos...@lisse.na wrote:
 Hi,
 
 How do I do something like this without using sqldf?
 
 a <- sqldf("SELECT COUNT(*) FROM b WHERE c = 'd'")
 
 or
 
 e <- sqldf("SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f")
 
 greetings, el
 

David Winsemius
Alameda, CA, USA



[R] SQL vs R

2014-05-02 Thread Dr Eberhard Lisse
Hi,

How do I do something like this without using sqldf?

a <- sqldf("SELECT COUNT(*) FROM b WHERE c = 'd'")

or

e <- sqldf("SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f")

greetings, el



Re: [R] SQL vs R

2014-05-02 Thread Carlos Ortega
Hi,

With the new dplyr package you can write queries whose syntax closely mirrors
SQL, like the ones you need.
You can find examples of how to apply it here:

http://martinsbioblogg.wordpress.com/2014/03/26/using-r-quickly-calculating-summary-statistics-with-dplyr/

http://martinsbioblogg.wordpress.com/2014/03/27/more-fun-with-and/
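
For illustration only, here is how the two queries from the question might look in dplyr. It is a sketch under assumptions: the sample data frame is invented, and the column names c and f are taken from the SQL in the original post.

```r
library(dplyr)

# Hypothetical stand-in for the poster's data frame b (columns c and f assumed)
b <- data.frame(
  c = c("d", "x", "d", "d", "y"),
  f = c("q", "p", "q", "r", "p"),
  stringsAsFactors = FALSE
)

# SELECT COUNT(*) FROM b WHERE c = 'd'
# filter() plays the role of WHERE; n() counts the rows that remain.
a <- b %>%
  filter(c == "d") %>%
  summarise(n = n())

# SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f
# group_by() + summarise() mirrors GROUP BY; arrange() mirrors ORDER BY.
e <- b %>%
  group_by(f) %>%
  summarise(n = n()) %>%
  arrange(f)

print(a)
print(e)
```

Each verb maps onto one SQL clause, which is what makes the translation from sqldf fairly mechanical; count(b, f) is a shorter spelling of the grouped count.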

Regards,
Carlos.




2014-05-02 23:23 GMT+02:00 Dr Eberhard Lisse nos...@lisse.na:

 Hi,

 How do I do something like this without using sqldf?

 a <- sqldf("SELECT COUNT(*) FROM b WHERE c = 'd'")

 or

 e <- sqldf("SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f")

 greetings, el





-- 
Saludos,
Carlos Ortega
www.qualityexcellence.es



Re: [R] SQL vs R

2014-05-02 Thread Bert Gunter
By making the effort to learn R?

See e.g. the Introduction to R tutorial that ships with R.

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
H. Gilbert Welch




On Fri, May 2, 2014 at 2:23 PM, Dr Eberhard Lisse nos...@lisse.na wrote:
 Hi,

 How do I do something like this without using sqldf?

 a <- sqldf("SELECT COUNT(*) FROM b WHERE c = 'd'")

 or

 e <- sqldf("SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f")

 greetings, el
