Re: [R] Database connection query

2015-02-09 Thread Marc Schwartz

 On Feb 9, 2015, at 4:33 AM, Lalitha Kristipati 
 lalitha.kristip...@techmahindra.com wrote:
 
 Hi,
 
 I would like to know when to use drivers and when to use packages to connect 
 to databases in R
 
 Regards,
 Lalitha Kristipati
 Associate Software Engineer


In general, you will need both.

There is more information in the R Data Import/Export manual:

  
http://cran.r-project.org/doc/manuals/r-release/R-data.html#Relational-databases

and there is a SIG list for R and DB specific subject matter:

  https://stat.ethz.ch/mailman/listinfo/r-sig-db
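
For a concrete sketch of how the two fit together (assuming the DBI and RSQLite packages are installed): DBI is the interface package you call from R, while the driver, here supplied by RSQLite, does the actual talking to the database. RMySQL, RPostgreSQL or an ODBC driver would plug into the same calls.

```r
library(DBI)                                      # the interface package
con <- dbConnect(RSQLite::SQLite(), ":memory:")   # the driver, from RSQLite
dbWriteTable(con, "mtcars", mtcars)               # copy a data frame in
d <- dbGetQuery(con, "SELECT cyl, COUNT(*) AS n FROM mtcars GROUP BY cyl")
dbDisconnect(con)
```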

Regards,

Marc Schwartz

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Database connectivity

2012-04-16 Thread andrija djurovic
RODBC

On Mon, Apr 16, 2012 at 11:40 AM, Partha Sinha pnsinh...@gmail.com wrote:
 Dear All
 Please let me know how to connect to SQL Server from R, and which
 packages I will require for this.
 Thanks
 Parth
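
A minimal RODBC sketch for SQL Server; the DSN name, credentials and table name are placeholders for your own setup (a DSN must first be defined in the ODBC administrator):

```r
library(RODBC)
# "mydsn" is a hypothetical ODBC data source pointing at the SQL Server
ch <- odbcConnect("mydsn", uid = "user", pwd = "pass")
d  <- sqlQuery(ch, "select top 10 * from some_table")  # some_table is a placeholder
odbcClose(ch)
```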



Re: [R] Database

2012-02-29 Thread R. Michael Weylandt
I don't use Access but my general impression is that the advantages it
brings will be similar to those brought by any other database:
performance rather than ability -- they are both Turing complete after
all, after some trickery on the SQL end.

Databases allow much larger data sets than R currently does and often
allow faster queries -- some would argue the SQL syntax is clearer for
some subsetting operations, but that's perhaps a function of
familiarity. For the task you describe, it should be elementary in
both platforms and I'd just use whichever one the data was already in.
For more substantive data analysis, you almost certainly want to use
R.

Others with Access experience (or more SQL) can add more.

Michael

On Tue, Feb 28, 2012 at 6:06 PM, Trying To learn again
tryingtolearnag...@gmail.com wrote:
 Hi all,

 I'm new to using Access. I see that many things that you can do in Access
 you can also do in R, but not the contrary.

 My question is: is there any manual with examples comparing how to do
 database analysis in Access and the same in R?

 Imagine I want to compare the Name columns of two different databases; I
 want to see if there are identical names in both files.

 Is it better to use Access, or to use R (importing the data and working
 in R)?

 This is only an example.

 I know R is more specialized in statistics and data analysis, but I'm
 trying to avoid learning Access, SQL and so on.

 I cannot explain it better; I hope you understand me.



Re: [R] Database

2012-02-29 Thread Prof Brian Ripley

On 29/02/2012 12:45, R. Michael Weylandt wrote:

I don't use Access but my general impression is that the advantages it
brings will be similar to those brought by any other database:
performance rather than ability -- they are both Turing complete after
all, after some trickery on the SQL end.

Databases allow much larger data sets than R currently does and often


But not much larger in Access than 64-bit R allows: Access is a pretty 
limited system.



allow faster queries -- some would argue the SQL syntax is clearer for
some subsetting operations, but that's perhaps a function of
familiarity. For the task you describe, it should be elementary in
both platforms and I'd just use whichever one the data was already in.
For more substantive data analysis, you almost certainly want to use
R.

Others with Access experience (or more SQL) can add more.


Access does not bring the performance benefits of more advanced DBMS 
engines: on Windows I would certainly recommend using SQL Server Express 
(or whatever it is currently called) instead.  If you want to use a DBMS 
to supplement R (as per the R Data Import/Export manual) I would use 
MySQL or SQLite.




Michael

On Tue, Feb 28, 2012 at 6:06 PM, Trying To learn again
tryingtolearnag...@gmail.com  wrote:

Hi all,

I'm new to using Access. I see that many things that you can do in Access
you can also do in R, but not the contrary.

My question is: is there any manual with examples comparing how to do
database analysis in Access and the same in R?

Imagine I want to compare the Name columns of two different databases; I
want to see if there are identical names in both files.


'files'?  Where are the data?  If you mean different tables in one 
Access database then I would still do this in R via RODBC.



Is it better to use Access, or to use R (importing the data and working
in R)?

This is only an example.

I know R is more specialized in statistics and data analysis, but I'm
trying to avoid learning Access, SQL and so on.


In which case you can simply use RODBC to import tables to R and work there.
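
That route might look like the following sketch; odbcConnectAccess2007() is Windows-only, and the file and table names are placeholders:

```r
library(RODBC)
ch   <- odbcConnectAccess2007("mydb.accdb")  # hypothetical Access file
tab1 <- sqlFetch(ch, "Table1")               # whole table -> data frame
tab2 <- sqlFetch(ch, "Table2")
odbcClose(ch)
intersect(tab1$Name, tab2$Name)              # names present in both tables
```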


I cannot explain it better; I hope you understand me.




--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] Database

2012-02-29 Thread Gabor Grothendieck
On Tue, Feb 28, 2012 at 6:06 PM, Trying To learn again
tryingtolearnag...@gmail.com wrote:
 Hi all,

 I'm new to using Access. I see that many things that you can do in Access
 you can also do in R, but not the contrary.

 My question is: is there any manual with examples comparing how to do
 database analysis in Access and the same in R?


The examples section at the bottom of ?sqldf in the sqldf package
shows how to do a variety of calculations in both SQL and in R.  Also
see the home page at http://sqldf.googlecode.com
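
As a taste of that, here is a hedged sketch of the original poster's name-comparison example done both ways with sqldf (the data frames and column names are made up):

```r
library(sqldf)
df1 <- data.frame(Name = c("ann", "bob", "cat"))
df2 <- data.frame(Name = c("bob", "cat", "dan"))

# SQL version: names present in both "files"
sqldf("SELECT Name FROM df1 WHERE Name IN (SELECT Name FROM df2)")

# base R equivalent
intersect(df1$Name, df2$Name)
```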

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] Database

2012-02-29 Thread John Kane
It all depends on what you are doing, but R is pretty powerful.  I have never
used Access so I don't know what it can do, but I have played around with other
DBs at a very basic level, and most things I did could be done quite easily in
R.  Sheer data set size could be a problem, but unless you have millions of
data items you are probably okay in R.

For your example, assume nams1 is from dataset 1 and nams2 is from dataset 2.
=
nams1 <- letters[1:5]
nams2 <- letters[3:7]

nams1 %in% nams2
=
Done.

John Kane
Kingston ON Canada




Re: [R] Database abstraction

2011-02-28 Thread Rob Paul Tirrell
I was also wondering about precedents or similar packages in the R ecosystem.
I'm familiar with R-ORM and sqldf, which seem closest.
At this point any and all feedback and critiques are extremely welcome! I 
imagine other frequent database users have had problems much the same as mine.
I'd like to release this in the near future. I've been using Roxygen for 
documentation, but support for annotations on R5 methods isn't great...

Thanks all.

- Original Message -
From: Rob Tirrell r...@stanford.edu
To: r-help@r-project.org
Sent: Sunday, February 27, 2011 2:38:18 AM
Subject: [R] Database abstraction

Hi all -

I've been working on a lite ORM and database abstraction package for R.
Formatting complex queries by hand has always been an error-prone hassle, so
I've tried to do away with that as much as possible, instead, using R
objects to represent elements of a database system (statements, clauses,
operators, functions, fields, tables, etc.). R5 classes have made the
development of this package somewhat easier than it would have been in the
past (in my view, at least).

Then an expression like:
query(some_table$some_field)$where(some_table$some_field != NA &
some_table$some_other_field %in% c(1, 2, 3))
would generate SQL like:
SELECT `database`.`some_table`.`some_field` AS `some_field` WHERE
`database`.`some_table`.`some_field` IS NOT NULL AND
`database`.`some_table`.`some_other_field` IN (1, 2, 3);

The basic features:
- Abstraction of tables and fields to R5 objects and use of R functions
where a clear parallel in SQL exists (e.g. %in% -> IN, unique -> DISTINCT).
- Result proxying (by a data.frame-like object), which allows on-demand
fetching and live updates to results (where appropriate).
- Session management.

It's my feeling that most R users care more about using existing tables than
creating new ones, so introspection is more of a focus than is defining new
relations.

It's reasonably far along, and I've been using it for my own projects for
some time. However, certainly much syntax is not supported, and I've only
written a compiler for MySQL. At this point I'd welcome anyone interested
in contributing or offering advice. The code is (I think) designed to
support the addition of more complex logic at any stage (generation,
preparation, compilation, formatting) without becoming spaghettized. I've
written a reasonably-sized test suite that covers most of the functionality.

The code is hosted on github - https://github.com/rtirrell/databasr. Feel
free to fork, critique and/or use. I've taken some pains to ensure that the
interface will remain relatively stable.

Thanks!

--
Robert Tirrell | r...@stanford.edu | (607) 437-6532
Program in Biomedical Informatics | Butte Lab | Stanford University



Re: [R] database table merging tips with R

2008-09-11 Thread Aaron Mackey
I would load your set of userids into a temporary table in Oracle,
then join that table with the rest of your SQL query to get only the
matching rows out.

-Aaron

On Thu, Sep 11, 2008 at 2:33 PM, Avram Aelony [EMAIL PROTECTED] wrote:

 Dear R list,

 What is the best way to efficiently marry an R dataset with a very large 
 (Oracle) database table?

 The goal is to only return Oracle table rows that match IDs present in the R 
 dataset.
 I have an R data frame with 2000 user IDs analogous to: r = 
 data.frame(userid=round(runif(2000)*10,0))

 ...and I need to pull data from an Oracle table only for these 2000 IDs.  The 
 Oracle table is quite large. Additionally, the sql query may need to join to 
 other tables to bring in ancillary fields.

 I currently connect to Oracle via odbc:

 library(RODBC)
 connection <- odbcConnect("...", uid = "...", pwd = "...")
 d = sqlQuery(connection, "select userid, x, y, z from largetable where 
 timestamp > sysdate - 7")

 ...allowing me to pull data from the database table into the R object d and 
 then use the R merge function.  The problem however is that if d is too 
 large it may fail due to memory limitations or be inefficient.  I would like 
 to push the merge portion to the database and it would be very convenient if 
 it were possible to request that the query look to the R object for the ID's 
 to which it should restrict the output.

 Is there a way to do this?
 Something like the following fictional code:
 d = sqlQuery(connection, "select t.userid, x, y, z from largetable t where 
 r$userid = t.userid")

 Would sqldf (http://code.google.com/p/sqldf/) help me out here? If so, how?   
 This would be convenient and help me avoid needing to create a temporary 
 table to store the R data, join via sql, then return the data back to R.

 I am using R version 2.7.2 (2008-08-25) / i386-pc-mingw32 .
 Thanks for your comments, ideas, recommendations.


 -Avram



Re: [R] database table merging tips with R

2008-09-11 Thread Aaron Mackey
Sorry, I see now you want to avoid this, but you did ask what was the
"best way to efficiently ...", and the temp table solution certainly
matches your description.  What's wrong with using a temporary table?

-Aaron

On Thu, Sep 11, 2008 at 3:05 PM, Aaron Mackey [EMAIL PROTECTED] wrote:
 I would load your set of userid's into a temporary table in oracle,
 then join that table with the rest of your SQL query to get only the
 matching rows out.

 -Aaron

 On Thu, Sep 11, 2008 at 2:33 PM, Avram Aelony [EMAIL PROTECTED] wrote:

 Dear R list,

 What is the best way to efficiently marry an R dataset with a very large 
 (Oracle) database table?

 The goal is to only return Oracle table rows that match IDs present in the R 
 dataset.
 I have an R data frame with 2000 user IDs analogous to: r = 
 data.frame(userid=round(runif(2000)*10,0))

 ...and I need to pull data from an Oracle table only for these 2000 IDs.  
 The Oracle table is quite large. Additionally, the sql query may need to 
 join to other tables to bring in ancillary fields.

 I currently connect to Oracle via odbc:

 library(RODBC)
 connection - odbcConnect(, uid=, pwd=)
 d = sqlQuery(connection, select userid, x, y, z from largetable where 
 timestamp  sysdate -7)

 ...allowing me to pull data from the database table into the R object d 
 and then use the R merge function.  The problem however is that if d is 
 too large it may fail due to memory limitations or be inefficient.  I would 
 like to push the merge portion to the database and it would be very 
 convenient if it were possible to request that the query look to the R 
 object for the ID's to which it should restrict the output.

 Is there a way to do this?
 Something like the following fictional code:
 d = sqlQuery(connection, select t.userid, x, y, z from largetable t where 
 r$userid=t.userid)

 Would sqldf (http://code.google.com/p/sqldf/) help me out here? If so, how?  
  This would be convenient and help me avoid needing to create a temporary 
 table to store the R data, join via sql, then return the data back to R.

 I am using R version 2.7.2 (2008-08-25) / i386-pc-mingw32 .
 Thanks for your comments, ideas, recommendations.


 -Avram

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] database table merging tips with R

2008-09-11 Thread Avram Aelony

Perhaps I will need to create a temp table, but I am asking if there is a way 
to avoid it.  It would be great if there were a way to tie the R data frame 
temporarily to the query in a transparent fashion. If not, I will see if I can 
create/drop the temp table directly from sqlQuery.
-Avram


 
On Thursday, September 11, 2008, at 12:07PM, Aaron Mackey [EMAIL PROTECTED] 
wrote:
Sorry, I see now you want to avoid this, but you did ask what was the
best way to efficiently ..., and the temp. table solution certainly
matches your description.  What's wrong with using a temporary table?

-Aaron



Re: [R] database table merging tips with R

2008-09-11 Thread Aaron Mackey
I guess I'd do it something like this:

dbGetQuery(con, "CREATE TEMPORARY TABLE foo ( etc etc )")
sapply(userids, function(x) { dbGetQuery(con, paste("INSERT INTO foo
(userid) VALUES (", x, ")")) })

then later:

dbGetQuery(con, "DROP TABLE foo")

-Aaron

On Thu, Sep 11, 2008 at 3:21 PM, Avram Aelony [EMAIL PROTECTED] wrote:

 Perhaps I will need to create a temp table, but I am asking if there is a way 
 to avoid it.  It would be great if there were a way to tie the R data frame 
 temporarily to the query in a transparent fashion. If not, I will see if I 
 can create/drop the temp table directly from sqlQuery.
 -Avram



 On Thursday, September 11, 2008, at 12:07PM, Aaron Mackey [EMAIL 
 PROTECTED] wrote:
Sorry, I see now you want to avoid this, but you did ask what was the
best way to efficiently ..., and the temp. table solution certainly
matches your description.  What's wrong with using a temporary table?

-Aaron

On Thu, Sep 11, 2008 at 3:05 PM, Aaron Mackey [EMAIL PROTECTED] wrote:
 I would load your set of userid's into a temporary table in oracle,
 then join that table with the rest of your SQL query to get only the
 matching rows out.

 -Aaron

 On Thu, Sep 11, 2008 at 2:33 PM, Avram Aelony [EMAIL PROTECTED] wrote:

 Dear R list,

 What is the best way to efficiently marry an R dataset with a very large 
 (Oracle) database table?

 The goal is to only return Oracle table rows that match IDs present in the 
 R dataset.
 I have an R data frame with 2000 user IDs analogous to: r = 
 data.frame(userid=round(runif(2000)*10,0))

 ...and I need to pull data from an Oracle table only for these 2000 IDs.  
 The Oracle table is quite large. Additionally, the sql query may need to 
 join to other tables to bring in ancillary fields.

 I currently connect to Oracle via odbc:

 library(RODBC)
 connection - odbcConnect(, uid=, pwd=)
 d = sqlQuery(connection, select userid, x, y, z from largetable where 
 timestamp  sysdate -7)

 ...allowing me to pull data from the database table into the R object d 
 and then use the R merge function.  The problem however is that if d is 
 too large it may fail due to memory limitations or be inefficient.  I 
 would like to push the merge portion to the database and it would be very 
 convenient if it were possible to request that the query look to the R 
 object for the ID's to which it should restrict the output.

 Is there a way to do this?
 Something like the following fictional code:
 d = sqlQuery(connection, select t.userid, x, y, z from largetable t where 
 r$userid=t.userid)

 Would sqldf (http://code.google.com/p/sqldf/) help me out here? If so, 
 how?   This would be convenient and help me avoid needing to create a 
 temporary table to store the R data, join via sql, then return the data 
 back to R.

 I am using R version 2.7.2 (2008-08-25) / i386-pc-mingw32 .
 Thanks for your comments, ideas, recommendations.


 -Avram

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.






__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] database table merging tips with R

2008-09-11 Thread Coey Minear
Aaron Mackey writes:
  I guess I'd do it something like this:
  
  dbGetQuery(con, "CREATE TEMPORARY TABLE foo ( etc etc )")
  sapply(userids, function(x) { dbGetQuery(con, paste("INSERT INTO foo
  (userid) VALUES (", x, ")")) })
  
  then later:
  
  dbGetQuery(con, "DROP TABLE foo")
  

Actually, based on my reading of the DBI reference, you should be able
to do the following to create a table (although possibly not temporary):

  dbWriteTable(connection, "r_user_ids", r)

Then you can use the following to drop the table:

  dbRemoveTable(connection, "r_user_ids")

Of course, I don't know whether the ODBC driver implements these
functions or not.  (Is 'RODBC' built on DBI?  Looks like Aaron and I
have been assuming that.)
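
Under that assumption, a self-contained sketch of the round trip with DBI (using RSQLite here to stand in for the Oracle backend; table and column names follow the thread's example, and the data are made up):

```r
library(DBI)
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# the "largetable" and the R-side ids, scaled down for illustration
dbWriteTable(con, "largetable", data.frame(userid = 1:10, x = rnorm(10)))
r <- data.frame(userid = c(2, 5, 7))
dbWriteTable(con, "r_user_ids", r)           # push the R ids into the DB

d <- dbGetQuery(con, "SELECT t.userid, t.x
                      FROM largetable t
                      JOIN r_user_ids u ON u.userid = t.userid")

dbRemoveTable(con, "r_user_ids")             # clean up the scratch table
dbDisconnect(con)
```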

Coey



Re: [R] database table merging tips with R

2008-09-11 Thread Coey Minear
While the subquery with a temporary table is probably the better
option, you could just manually generate the subquery and pass it in
with the query.  As an example, if you have user_ids 1000-1005,
instead of having "... where user_id in (select user_id from
r_user_ids)", you would have "... where user_id in
(1000,1001,1002,1003,1004,1005)".  With 2000 user_ids, you'd just have
R generate the string (the whole query, for that matter) and pass it
in.  Sure, it's a few thousand characters going into the database as a
query, but it sure beats pulling down a few million characters that
you're going to just throw away.

I have no idea how the performance of the above method would compare
to using a temporary table (on the database side, of course), but I
have a hard time believing that it wouldn't be within a factor of
two.  But I'm just making the suggestion.
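
Generating that string from the data frame is a one-liner in R; a hedged sketch (the connection and table names follow the original post, and the ids are stand-ins):

```r
ids <- c(1000, 1001, 1002, 1003, 1004, 1005)   # stand-ins for r$userid
sql <- sprintf("select userid, x, y, z from largetable where userid in (%s)",
               paste(ids, collapse = ","))
# d <- sqlQuery(connection, sql)   # via RODBC, as in the original post
```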

Coey


Avram Aelony writes:
  
  Dear R list,
  
  What is the best way to efficiently marry an R dataset with a very large 
  (Oracle) database table?  
  
  The goal is to only return Oracle table rows that match IDs present in the R 
  dataset.  
  I have an R data frame with 2000 user IDs analogous to: r = 
  data.frame(userid=round(runif(2000)*10,0))
  
  ...and I need to pull data from an Oracle table only for these 2000 IDs.  
  The Oracle table is quite large. Additionally, the sql query may need to 
  join to other tables to bring in ancillary fields.
  
  I currently connect to Oracle via odbc: 
  
  library(RODBC)
  connection - odbcConnect(, uid=, pwd=)
  d = sqlQuery(connection, select userid, x, y, z from largetable where 
  timestamp  sysdate -7)
  
  ...allowing me to pull data from the database table into the R object d 
  and then use the R merge function.  The problem however is that if d is 
  too large it may fail due to memory limitations or be inefficient.  I would 
  like to push the merge portion to the database and it would be very 
  convenient if it were possible to request that the query look to the R 
  object for the ID's to which it should restrict the output.  
  
  Is there a way to do this?
  Something like the following fictional code:
  d = sqlQuery(connection, select t.userid, x, y, z from largetable t where 
  r$userid=t.userid)
  
  Would sqldf (http://code.google.com/p/sqldf/) help me out here? If so, how?  
   This would be convenient and help me avoid needing to create a temporary 
  table to store the R data, join via sql, then return the data back to R.
  
  I am using R version 2.7.2 (2008-08-25) / i386-pc-mingw32 . 
  Thanks for your comments, ideas, recommendations.
  
  
  -Avram
  
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 



Re: [R] database table merging tips with R

2008-09-11 Thread Avram Aelony

I have not devoted time to setting up ROracle since binaries are not available 
and it seems to require some effort to compile (see 
http://cran.r-project.org/web/packages/ROracle/index.html).  On the other hand, 
RODBC worked more or less magically once I set up the data sources.

What is your success using ROracle and why would it be preferable to RODBC?

-Avram

 
On Thursday, September 11, 2008, at 12:47PM, Coey Minear [EMAIL PROTECTED] 
wrote:
Aaron Mackey writes:
  I guess I'd do it something like this:
  
  dbGetQuery(con, "CREATE TEMPORARY TABLE foo ( etc etc )")
  sapply(userids, function (x) {
    dbGetQuery(con, paste("INSERT INTO foo (userid) VALUES (", x, ")"))
  })
  
  then later:
  
  dbGetQuery(con, "DROP TABLE foo")
  

Actually, based on my reading of the DBI reference, you should be able
to do the following to create a table (although possibly not temporary):

  dbWriteTable(connection, "r_user_ids", r)

Then you can use the following to drop the table:

  dbRemoveTable(connection, "r_user_ids")

Of course, I don't know whether the ODBC driver implements these
functions or not.  (Is 'RODBC' built on DBI?  Looks like Aaron and I
have been assuming that.)

Coey

  -Aaron
  
  On Thu, Sep 11, 2008 at 3:21 PM, Avram Aelony [EMAIL PROTECTED] wrote:
  
   Perhaps I will need to create a temp table, but I am asking if there is a 
   way to avoid it.  It would be great if there were a way to tie the R data 
   frame temporarily to the query in a transparent fashion. If not, I will 
   see if I can create/drop the temp table directly from sqlQuery.
   -Avram
  
  
  
   On Thursday, September 11, 2008, at 12:07PM, Aaron Mackey [EMAIL 
   PROTECTED] wrote:
  Sorry, I see now you want to avoid this, but you did ask what was the
  best way to efficiently ..., and the temp. table solution certainly
  matches your description.  What's wrong with using a temporary table?
  
  -Aaron
  
  On Thu, Sep 11, 2008 at 3:05 PM, Aaron Mackey [EMAIL PROTECTED] wrote:
   I would load your set of userid's into a temporary table in oracle,
   then join that table with the rest of your SQL query to get only the
   matching rows out.
  
   -Aaron
  
  
  
  
  
  
  






Re: [R] database table merging tips with R

2008-09-11 Thread Coey Minear
Avram Aelony writes:
  
  I have not devoted time to setting up ROracle since binaries are
  not available and it seems to require some effort to compile (see
  http://cran.r-project.org/web/packages/ROracle/index.html).  On the
  other hand, RODBC worked more or less magically once I set up the
  data sources.
  
  What is your success using ROracle and why would it be preferable
  to RODBC ?
  
  -Avram
  

Actually, I've only been using RSQLite, so that's where my (limited)
knowledge is coming from (and only for less than a week).  RSQLite is
based on DBI, and I noticed that Aaron and I were answering with DBI
functions.  I then saw that you kept referring to sqlQuery.
That's why I raised the issue of whether RODBC is based on DBI or not.

If RODBC is not based on DBI, then obviously you cannot leverage the
use of dbWriteTable and dbRemoveTable, as they may not be present.  If
RODBC offers something similar, you'll have to find out, but that may
have driven your initial desire to avoid the temporary table option.

Regardless of that, I don't see why you could not consider the
pre-built subquery option, unless you find that ODBC, or RODBC, has a
limit on what can be sent to the database.

Coey
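
For what it's worth, the dbWriteTable/dbRemoveTable recipe can be sketched end to end with any DBI-compliant driver.  In the sketch below RSQLite merely stands in for an Oracle driver, and the table and column names are illustrative, not the actual schema from the thread:

```r
library(DBI)
library(RSQLite)

con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Stand-in for the large table on the database side
dbWriteTable(con, "largetable",
             data.frame(userid = 1:10, x = runif(10)))

# The R-side data frame of IDs to restrict on
r <- data.frame(userid = c(2L, 5L, 9L))

# Push the R data frame up as a table, join server-side, then drop it
dbWriteTable(con, "r_user_ids", r)
d <- dbGetQuery(con,
  "SELECT t.userid, t.x
     FROM largetable t
     JOIN r_user_ids u ON u.userid = t.userid")
dbRemoveTable(con, "r_user_ids")
dbDisconnect(con)
```

With a DBI-based Oracle driver the same calls should carry over, modulo whatever temporary-table semantics Oracle provides.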


   

Re: [R] database table merging tips with R

2008-09-11 Thread Avram Aelony

At some point I'd like to try to compile the DBI-based ROracle package as well. 
 For now though, I'll stick with RODBC as it seems to do what I need.  I 
believe RODBC is not based on DBI, but that shouldn't preclude using the 
pre-built subquery option which is a great idea and should work regardless.  I 
will try that.   It might also be a great way to implement the innards of the 
fictional code (for both RODBC and ROracle ) that references a column in the R 
data frame as I had suggested previously or a better variation on that...!

Thanks for your responses and insight.

Avram


 
On Thursday, September 11, 2008, at 02:19PM, Coey Minear [EMAIL PROTECTED] 
wrote:
Avram Aelony writes:
  
  I have not devoted time to setting up ROracle since binaries are
  not available and it seems to require some effort to compile (see
  http://cran.r-project.org/web/packages/ROracle/index.html).  On the
  other hand, RODBC worked more or less magically once I set up the
  data sources.
  
  What is your success using ROracle and why would it be preferable
  to RODBC ?
  
  -Avram
  

Actually, I've only been using RSQLite, so that's where my (limited)
knowledge is coming from (and only for less than a week).  RSQLite
based on DBI, and noticed that Aaron and I were answering with DBI
functions.  I then noticed that you kept referring to sqlQuery.
That's why I raised the issue of whether RODBC is based on DBI or not.

If RODBC is not based on DBI, then obviously you cannot leverage the
use of dbWriteTable and dbRemoveTable, as they may not be present.  If
RODBC offers something similar, you'll have to find out, but that may
have driven your initial desire to avoid the temporary table option.

Regardless of that, I don't see why you could not consider the
pre-built subquery option, unless you find that ODBC, or RODBC, has a
limit on what can be sent to the database.

Coey


   
  On Thursday, September 11, 2008, at 12:47PM, Coey Minear [EMAIL 
  PROTECTED] wrote:
  Aaron Mackey writes:
I guess I'd do it something like this:

dbGetQuery(con, CREATE TEMPORARY TABLE foo ( etc etc))
sapply(@userids, function (x) { dbGetQuery(con, paste(INSERT INTO foo
(userid) VALUES (, x, ))) })

then later:

dbGetQuery(con, DROP TABLE foo);

  
  Actually, based on my reading of the DBI reference, you should be able
  to do the following to create a table (although possibly not temporary):
  
dbWriteTable(connection, r_user_ids, r)
  
  Then you can use the following to drop the table:
  
dbRemoveTable(connection, r_user_ids)
  
  Of course, I don't know whether the ODBC driver implements these
  functions or not.  (Is 'RODBC' built on DBI?  Looks like Aaron and I
  have been assuming that.)
  
  Coey
  
-Aaron

On Thu, Sep 11, 2008 at 3:21 PM, Avram Aelony [EMAIL PROTECTED] wrote:

 Perhaps I will need to create a temp table, but I am asking if there 
 is a way to avoid it.  It would be great if there were a way to tie 
 the R data frame temporarily to the query in a transparent fashion. 
 If not, I will see if I can create/drop the temp table directly from 
 sqlQuery.
 -Avram



 On Thursday, September 11, 2008, at 12:07PM, Aaron Mackey [EMAIL 
 PROTECTED] wrote:
Sorry, I see now you want to avoid this, but you did ask what was the
best way to efficiently ..., and the temp. table solution certainly
matches your description.  What's wrong with using a temporary table?

-Aaron

On Thu, Sep 11, 2008 at 3:05 PM, Aaron Mackey [EMAIL PROTECTED] 
wrote:
 I would load your set of userid's into a temporary table in oracle,
 then join that table with the rest of your SQL query to get only the
 matching rows out.

 -Aaron

 On Thu, Sep 11, 2008 at 2:33 PM, Avram Aelony [EMAIL PROTECTED] 
 wrote:

 Dear R list,

 What is the best way to efficiently marry an R dataset with a very 
 large (Oracle) database table?

 The goal is to only return Oracle table rows that match IDs 
 present in the R dataset.
 I have an R data frame with 2000 user IDs analogous to: r = 
 data.frame(userid=round(runif(2000)*10,0))

 ...and I need to pull data from an Oracle table only for these 
 2000 IDs.  The Oracle table is quite large. Additionally, the sql 
 query may need to join to other tables to bring in ancillary 
 fields.

 I currently connect to Oracle via odbc:

 library(RODBC)
 connection - odbcConnect(, uid=, pwd=)
 d = sqlQuery(connection, select userid, x, y, z from largetable 
 where timestamp  sysdate -7)

 ...allowing me to pull data from the database table into the R 
 object d and then use the R merge function.  The problem however 
 is that if d is too large it may fail due to memory limitations 
 or be inefficient.  I would like to push the merge portion to 

Re: [R] database table merging tips with R

2008-09-11 Thread Moshe Olshansky
One possibility is as follows:

If r$userid is your array of (2000) ID's then
s <- paste(r$userid, sep=",")
s <- paste("select t.userid, x, y, z from largetable t where t.userid in (", s, ")", sep="")
and finally
d <- sqlQuery(connection, s)

Regards,

Moshe.
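
One caveat with the IN-list approach, worth checking against your Oracle version: Oracle rejects IN lists with more than 1000 expressions (ORA-01795), so with many IDs the query has to be issued in chunks and the partial results bound back together.  A hedged sketch in base R, where `fetch_chunk` is a placeholder for the real `sqlQuery(connection, sql)` call and the query text follows the example from this thread:

```r
# Fetch rows matching 'ids' in chunks of at most 1000 (Oracle's IN-list limit),
# then row-bind the partial results.  'fetch_chunk' abstracts the database call.
fetch_by_ids <- function(ids, fetch_chunk, chunk_size = 1000) {
  groups <- split(ids, ceiling(seq_along(ids) / chunk_size))
  do.call(rbind, lapply(groups, function(chunk) {
    in_list <- paste(chunk, collapse = ",")
    fetch_chunk(paste0(
      "select t.userid, x, y, z from largetable t where t.userid in (",
      in_list, ")"))
  }))
}

# With RODBC you would call it as:
#   d <- fetch_by_ids(r$userid, function(sql) sqlQuery(connection, sql))
```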




Re: [R] database table merging tips with R

2008-09-11 Thread Moshe Olshansky
Just a small correction:

start with 

s <- paste(r$userid, collapse=",")

and not 

s <- paste(r$userid, sep=",")




Re: [R] database table merging tips with R

2008-09-11 Thread Thomas Lumley

On Thu, 11 Sep 2008, Coey Minear wrote:


Actually, based on my reading of the DBI reference, you should be able
to do the following to create a table (although possibly not temporary):

 dbWriteTable(connection, "r_user_ids", r)

Then you can use the following to drop the table:

 dbRemoveTable(connection, "r_user_ids")

Of course, I don't know whether the ODBC driver implements these
functions or not.  (Is 'RODBC' built on DBI?  Looks like Aaron and I
have been assuming that.)



RODBC is not built on DBI, but it has analogous functions: sqlSave and sqlDrop.

-thomas

Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle
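
Putting Thomas's pointer together with the earlier DBI recipe, the RODBC version would look roughly like the following.  This is an untested sketch: the DSN, credentials, and table/column names are placeholders, and `largetable` is the hypothetical Oracle table from the thread:

```r
library(RODBC)

connection <- odbcConnect("myDSN", uid = "user", pwd = "pass")

# The R-side IDs, as in the original example
r <- data.frame(userid = round(runif(2000) * 10, 0))

# sqlSave/sqlDrop play the role of DBI's dbWriteTable/dbRemoveTable
sqlSave(connection, r, tablename = "r_user_ids", rownames = FALSE)
d <- sqlQuery(connection,
              "select t.userid, x, y, z
                 from largetable t
                 join r_user_ids u on u.userid = t.userid")
sqlDrop(connection, "r_user_ids")
odbcClose(connection)
```

The join runs entirely on the database side, so only the matching rows ever travel back to R.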
