[sqlite] PhD student

2015-02-26 Thread Hadley Wickham
I'd also recommend Paul Murrell's "Introduction to Data Technologies":
https://www.stat.auckland.ac.nz/~paul/ItDT/

Hadley


[sqlite] PhD student

2015-02-26 Thread Roman Fleysher
I like that!!!

Roman

From: sqlite-users-bounces at mailinglists.sqlite.org [sqlite-users-bounces at mailinglists.sqlite.org]
on behalf of Simon Slavin [slav...@bigfraud.org]
Sent: Thursday, February 26, 2015 5:33 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] PhD student

On 25 Feb 2015, at 4:28pm, VASILEIOU Eleftheria  wrote:

> Could you please provide me some resources for learning SQL and R?

<http://lmgtfy.com/?q=resources%2Bfor%2Blearning%2BSQL>

<http://lmgtfy.com/?q=resources%2Bfor%2Blearning%2BR>

Simon.
___
sqlite-users mailing list
sqlite-users at mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] PhD student

2015-02-26 Thread Jim Callahan
Books that discuss BOTH R and SQL are a very small subset and assume some
knowledge of both.
R INTRODUCTORY BOOKS
1. Peter Dalgaard, "Introductory Statistics with R", 2002.
"The book is based upon a set of notes developed for the course in Basic
Statistics for Health Researchers at the Faculty of Health Sciences of the
University of Copenhagen. This course had as its primary target.. students
for the Ph.D. degree in medicine." Intro page viii.
body mass index (BMI) and age of menarche.
2. Jared Lander, "R for Everyone", 2014.
More modern, but less focused on health and a little more scattershot.

R AUTHORITATIVE REFERENCE
1. Brian Ripley and William Venables, "Modern Applied Statistics with S",
2002.

Anything by John Chambers, Robert Gentleman or Brian Ripley or any member
of the "R Core Development Team" can be considered authoritative (the stuff
you can footnote without frowns) on R.

Also, if you are going to use the R mailing list, read all of the PDFs that
come with the base installation of R. It's better now, but the R mailing
list used to have a very strong "RTFM" attitude and did not want to explain
anything that was clearly covered in the manuals. Especially read the "R
Data Import/Export" manual PDF.

ADVANCED R (with SQL)
Depends on what you are doing.
If you are working with health surveys,
Thomas Lumley's "Complex Surveys" is invaluable. One of Lumley's
examples is the CDC's BRFSS: "The Behavioral Risk Factor Surveillance System
(BRFSS) is the world's largest, on-going telephone health survey system"
(from the CDC website). In Lumley's example, this is:

   - "The BRFSS 2007 data as a HUGE (245Mb) SQLite database"

1. Thomas Lumley, "Complex Surveys: A Guide to Analysis Using R",
http://r-survey.r-forge.r-project.org/svybook/index.html

On the other hand, if you are dealing with biological data, such as trying
to match results from GeneChips with existing reference sources, you might
prefer Robert Gentleman's "R Programming for Bioinformatics", especially
Chapter 8, "Data Technologies".

1. Robert Gentleman, "R Programming for Bioinformatics", 2009.
"We begin our discussion by describing a range of tools that have been
implemented in R and that can be used to process and transform data. Next
we discuss the different interfaces to databases that are available, but
focus our discussion on SQLite as it is used extensively within the
Bioconductor Project." page 229
The database discussion resumes in Section 8.4 on page 238 and reaches SQLite
on page 241, including a specific example:
"In the code below we load the SQLite package, initialize a driver and open
a database that has been supplied with the RBioinf [R] package that
accompanies this volume. The database contains a number of tables that map
between identifiers on the Affymetrix HG-U95v2 GeneChip and different
quantities of interest such as GO categories or PubMed IDs (that map
published papers that discuss the corresponding genes). We then list the
tables in that database."

Sometimes we get tired of reading dry tomes and we prefer something more
chatty and amusing.

For R and other tools I enjoy reading:

Cathy O'Neil's and Rachel Schutt's "Doing Data Science: Straight Talk from
the Frontline", 2013. It's an O'Reilly book.

For SQLite, I enjoy
Michael Owens's "The Definitive Guide to SQLite", 2006 -- maybe not the
whole book, but the Chapter 4 example on page 75, "Foods mentioned in episodes
of the Seinfeld sitcom", is a hoot (and turned out to help me solve a
real-world problem).

If you are doing anything beyond Stats 101 classical statistics it helps to
understand the Bayesian bogeyman.

A fascinating, non-technical, historical account is provided by Sharon
Bertsch McGrayne, in her book "The Theory that would not Die...".

BAYESIAN STATISTICS (HISTORY)
Sharon Bertsch McGrayne,
"The Theory That Would Not Die:
How Bayes' Rule Cracked the Enigma Code, Hunted Down Russian Submarines,
and Emerged Triumphant from Two Centuries of Controversy", 2011.
http://yalepress.yale.edu/book.asp?isbn=9780300169690

"For the student who is being exposed to Bayesian statistics for the first
time, McGrayne's book provides a wealth of illustrations to whet his or her
appetite for more. It will broaden and deepen the field of reference of the
more experienced statistician, and the general reader will find an
understandable, well-written, and fascinating account of a scientific field
of great importance today."
http://www.ams.org/notices/201205/rtx120500657p.pdf
All the more timely with the release of the movie "The Imitation Game",
because Turing & Co. cracked the German Enigma code using Bayesian
statistics.
There are few specifically "Bayesian" packages in R (for example, an
interface to BUGS), but Bayesian thinking lurks in the background of many of
them -- look for any use of the word "prior".

Hope this helps.
Jim


[sqlite] PhD student

2015-02-26 Thread Gabor Grothendieck
On Wed, Feb 25, 2015 at 11:28 AM, VASILEIOU Eleftheria
 wrote:
> I need to use R for the analysis in my project, and my supervisor
> suggested that I learn the SQL language for R.
> Could you please provide me some resources for learning SQL and R?

Assuming you are looking to use SQL to work with R data.frames, see
this link for numerous examples:

http://sqldf.googlecode.com


[sqlite] PhD student

2015-02-26 Thread Simon Slavin

On 25 Feb 2015, at 4:28pm, VASILEIOU Eleftheria  wrote:

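> Could you please provide me some resources for learning SQL and R?

<http://lmgtfy.com/?q=resources%2Bfor%2Blearning%2BSQL>

<http://lmgtfy.com/?q=resources%2Bfor%2Blearning%2BR>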

Simon.


[sqlite] PhD student

2015-02-26 Thread John McKown
On Wed, Feb 25, 2015 at 10:28 AM, VASILEIOU Eleftheria
 wrote:
>  Hi,
>
> I need to use R for the analysis in my project, and my supervisor
> suggested that I learn the SQL language for R.
> Could you please provide me some resources for learning SQL and R?

I'm not sure if you want to learn SQL and R, or if you want to learn
just R and its use of SQL (this assumes you know SQL already). My
confusion likely is a result of my being a Texan. ;-)

Some basics of the R language, in general, are here:

http://cran.r-project.org/manuals.html
http://adv-r.had.co.nz/ (this has more advanced R work by Hadley
Wickham, an R guru/wizard)
http://en.wikibooks.org/wiki/R_Programming

If you need to learn SQL too, there are some useful sites as well.
http://www.w3schools.com/sql/ is a nice one.
http://www.sqlcourse.com/ I don't know this one, but it looks interesting
Google search: 
https://www.google.com/webhp?sourceid=chrome-instant=1C1CHFX_enUS597US598=1=2=UTF-8#q=sql%20tutorial

If you tell us which SQL software you will be using, perhaps we could
recommend other sites or forums where you can get SQL specific help.

Now using SQL in R is a bit more difficult to explain, mainly because
there are differences depending on the SQL server you are using. I
have used both RODBC and DBI. RODBC is R for ODBC connections. ODBC is
an industry standard interface to a number of different SQL servers
such as MS SQL Server, PostgreSQL, and MySQL (MariaDB). DBI is another
interface, by the previously mentioned Hadley Wickham, which can
connect to many different database servers as well. Now, just for
myself, I prefer DBI because it is a closer match to what I am used to
using in other languages, such as Perl. A nice "README" by Mr. Wickham
is here:
https://github.com/rstats-db/DBI/blob/master/README.md
http://www.stat.berkeley.edu/~nolan/stat133/Fall05/lectures/SQL-R.pdf

An overview of RODBC is here:
http://cran.r-project.org/web/packages/RODBC/RODBC.pdf and perhaps of
some interest to you might be that the maintainer is in the U.K. Brian
Ripley ripley at stats.ox.ac.uk

You didn't say if you were going to be connecting to an existing SQL
system, or creating your own. If you are going to create your own,
then perhaps the easiest to implement is SQLite. It is not as full
featured as Oracle, MS SQL Server, or PostgreSQL, but it has the
advantage of being "embedded". That is, there is very little set up
because the "server" code is embedded in the R package itself (RSQLite
bundles the SQLite library). This means that you don't need to set up
an independent server. Of course, SQLite is "lite" compared to the
"full function" database servers previously mentioned. For SQLite, the
R package is RSQLite and you can look at the "README" here:
http://cran.r-project.org/web/packages/RSQLite/RSQLite.pdf
One nice thing about this is that it is the work of Mr. Wickham and is
generally compatible with his DBI package. This may be helpful because
you could start off "easy" with RSQLite, then "upgrade" to a "real
database server" with most of the R coding remaining generally the
same.
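
As a rough illustration of what "embedded" means in practice, here is a
minimal sketch. It uses Python's standard-library sqlite3 module rather
than R, an assumption made only so that the example is self-contained;
in R the corresponding steps would go through DBI and RSQLite. The
table name "patients" and its values are hypothetical, invented for the
illustration.

```python
import sqlite3

# Open an in-memory SQLite database: no server process, no configuration.
# (A file path instead of ":memory:" would create an on-disk database.)
conn = sqlite3.connect(":memory:")

# Hypothetical table and values, invented for this illustration.
conn.execute(
    "CREATE TABLE patients (id INTEGER PRIMARY KEY, age INTEGER, bmi REAL)"
)
conn.executemany(
    "INSERT INTO patients (age, bmi) VALUES (?, ?)",
    [(34, 22.5), (51, 27.5), (47, 30.0)],
)
conn.commit()

# Ordinary SQL with a bound parameter: count and average BMI of
# patients older than 40.
n_over_40, mean_bmi = conn.execute(
    "SELECT COUNT(*), AVG(bmi) FROM patients WHERE age > ?", (40,)
).fetchone()

print(n_over_40, mean_bmi)
conn.close()
```

The SQL statements themselves are what would carry over to a
client/server database; mostly the connection step changes.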

And one other thing that I will warn you about on this forum: it is a
"no homework" forum. This doesn't mean we won't help with general
questions about approaches and the like, but people can become a bit
"terse" if they feel you are trying to get someone to do your work. I
just mention this because it does come up on rare occasions.


>
>
> Thanks in advance,
> Eleftheria
>
> Eleftheria Vasileiou BSc, MPH
> Research Student, Centre for Population Health Sciences
> Room 815, Old Medical School, University of Edinburgh
>
> E.Vasileiou at ed.ac.uk
>
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.



-- 
He's about as useful as a wax frying pan.

10 to the 12th power microphones = 1 Megaphone

Maranatha! <><
John McKown


[sqlite] PhD student

2015-02-25 Thread VASILEIOU Eleftheria
 Hi,

I need to use R for the analysis in my project, and my supervisor
suggested that I learn the SQL language for R.
Could you please provide me some resources for learning SQL and R?


Thanks in advance,
Eleftheria

Eleftheria Vasileiou BSc, MPH
Research Student, Centre for Population Health Sciences
Room 815, Old Medical School, University of Edinburgh

E.Vasileiou at ed.ac.uk