At some point I'd like to try to compile the DBI-based ROracle package as well. 
 For now though, I'll stick with RODBC as it seems to do what I need.  I 
believe RODBC is not based on DBI, but that shouldn't preclude using the 
pre-built subquery option which is a great idea and should work regardless.  I 
will try that.   It might also be a great way to implement the innards of the 
fictional code (for both RODBC and ROracle ) that references a column in the R 
data frame as I had suggested previously or a better variation on that...!

Thanks for your responses and insight.

Avram


 
On Thursday, September 11, 2008, at 02:19PM, "Coey Minear" <[EMAIL PROTECTED]> 
wrote:
>Avram Aelony writes:
> > 
> > I have not devoted time to setting up ROracle since binaries are
> > not available and it seems to require some effort to compile (see
> > http://cran.r-project.org/web/packages/ROracle/index.html).  On the
> > other hand, RODBC worked more or less magically once I set up the
> > data sources.
> > 
> > What is your success using ROracle and why would it be preferable
> > to RODBC ?
> > 
> > -Avram
> > 
>
>Actually, I've only been using RSQLite, so that's where my (limited)
>knowledge is coming from (and only for less than a week).  RSQLite
>based on DBI, and noticed that Aaron and I were answering with DBI
>functions.  I then noticed that you kept referring to "sqlQuery".
>That's why I raised the issue of whether RODBC is based on DBI or not.
>
>If RODBC is not based on DBI, then obviously you cannot leverage the
>use of dbWriteTable and dbRemoveTable, as they may not be present.  If
>RODBC offers something similar, you'll have to find out, but that may
>have driven your initial desire to avoid the temporary table option.
>
>Regardless of that, I don't see why you could not consider the
>pre-built subquery option, unless you find that ODBC, or RODBC, has a
>limit on what can be sent to the database.
>
>Coey
>
>
> >  
> > On Thursday, September 11, 2008, at 12:47PM, "Coey Minear" <[EMAIL 
> > PROTECTED]> wrote:
> > >Aaron Mackey writes:
> > > > I guess I'd do it something like this:
> > > > 
> > > > dbGetQuery(con, "CREATE TEMPORARY TABLE foo ( etc etc)")
> > > > sapply(@userids, function (x) { dbGetQuery(con, paste("INSERT INTO foo
> > > > (userid) VALUES (", x, ")")) })
> > > > 
> > > > then later:
> > > > 
> > > > dbGetQuery(con, "DROP TABLE foo");
> > > > 
> > >
> > >Actually, based on my reading of the DBI reference, you should be able
> > >to do the following to create a table (although possibly not temporary):
> > >
> > >  dbWriteTable(connection, "r_user_ids", r)
> > >
> > >Then you can use the following to drop the table:
> > >
> > >  dbRemoveTable(connection, "r_user_ids")
> > >
> > >Of course, I don't know whether the ODBC driver implements these
> > >functions or not.  (Is 'RODBC' built on DBI?  Looks like Aaron and I
> > >have been assuming that.)
> > >
> > >Coey
> > >
> > > > -Aaron
> > > > 
> > > > On Thu, Sep 11, 2008 at 3:21 PM, Avram Aelony <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > Perhaps I will need to create a temp table, but I am asking if there 
> > > > > is a way to avoid it.  It would be great if there were a way to tie 
> > > > > the R data frame temporarily to the query in a transparent fashion. 
> > > > > If not, I will see if I can create/drop the temp table directly from 
> > > > > sqlQuery.
> > > > > -Avram
> > > > >
> > > > >
> > > > >
> > > > > On Thursday, September 11, 2008, at 12:07PM, "Aaron Mackey" <[EMAIL 
> > > > > PROTECTED]> wrote:
> > > > >>Sorry, I see now you want to avoid this, but you did ask what was the
> > > > >>"best way to efficiently ...", and the temp. table solution certainly
> > > > >>matches your description.  What's wrong with using a temporary table?
> > > > >>
> > > > >>-Aaron
> > > > >>
> > > > >>On Thu, Sep 11, 2008 at 3:05 PM, Aaron Mackey <[EMAIL PROTECTED]> 
> > > > >>wrote:
> > > > >>> I would load your set of userid's into a temporary table in oracle,
> > > > >>> then join that table with the rest of your SQL query to get only the
> > > > >>> matching rows out.
> > > > >>>
> > > > >>> -Aaron
> > > > >>>
> > > > >>> On Thu, Sep 11, 2008 at 2:33 PM, Avram Aelony <[EMAIL PROTECTED]> 
> > > > >>> wrote:
> > > > >>>>
> > > > >>>> Dear R list,
> > > > >>>>
> > > > >>>> What is the best way to efficiently marry an R dataset with a very 
> > > > >>>> large (Oracle) database table?
> > > > >>>>
> > > > >>>> The goal is to only return Oracle table rows that match IDs 
> > > > >>>> present in the R dataset.
> > > > >>>> I have an R data frame with 2000 user IDs analogous to: r = 
> > > > >>>> data.frame(userid=round(runif(2000)*100000,0))
> > > > >>>>
> > > > >>>> ...and I need to pull data from an Oracle table only for these 
> > > > >>>> 2000 IDs.  The Oracle table is quite large. Additionally, the sql 
> > > > >>>> query may need to join to other tables to bring in ancillary 
> > > > >>>> fields.
> > > > >>>>
> > > > >>>> I currently connect to Oracle via odbc:
> > > > >>>>
> > > > >>>> library(RODBC)
> > > > >>>> connection <- odbcConnect("****", uid="****", pwd="****")
> > > > >>>> d = sqlQuery(connection, "select userid, x, y, z from largetable 
> > > > >>>> where timestamp > sysdate -7")
> > > > >>>>
> > > > >>>> ...allowing me to pull data from the database table into the R 
> > > > >>>> object "d" and then use the R merge function.  The problem however 
> > > > >>>> is that if "d" is too large it may fail due to memory limitations 
> > > > >>>> or be inefficient.  I would like to push the merge portion to the 
> > > > >>>> database and it would be very convenient if it were possible to 
> > > > >>>> request that the query look to the R object for the ID's to which 
> > > > >>>> it should restrict the output.
> > > > >>>>
> > > > >>>> Is there a way to do this?
> > > > >>>> Something like the following fictional code:
> > > > >>>> d = sqlQuery(connection, "select t.userid, x, y, z from largetable 
> > > > >>>> t where r$userid=t.userid")
> > > > >>>>
> > > > >>>> Would sqldf (http://code.google.com/p/sqldf/) help me out here? If 
> > > > >>>> so, how?   This would be convenient and help me avoid needing to 
> > > > >>>> create a temporary table to store the R data, join via sql, then 
> > > > >>>> return the data back to R.
> > > > >>>>
> > > > >>>> I am using R version 2.7.2 (2008-08-25) / i386-pc-mingw32 .
> > > > >>>> Thanks for your comments, ideas, recommendations.
> > > > >>>>
> > > > >>>>
> > > > >>>> -Avram
> > > > >>>>
> > > > >>>> ______________________________________________
> > > > >>>> R-help@r-project.org mailing list
> > > > >>>> https://stat.ethz.ch/mailman/listinfo/r-help
> > > > >>>> PLEASE do read the posting guide 
> > > > >>>> http://www.R-project.org/posting-guide.html
> > > > >>>> and provide commented, minimal, self-contained, reproducible code.
> > > > >>>>
> > > > >>>
> > > > >>
> > > > >>
> > > > >
> > > > 
> > > > ______________________________________________
> > > > R-help@r-project.org mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > PLEASE do read the posting guide 
> > > > http://www.R-project.org/posting-guide.html
> > > > and provide commented, minimal, self-contained, reproducible code.
> > > > 
> > >
> > >
> > >
> > 
>
>
>

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to