I guess I'd do it something like this:

dbGetQuery(con, "CREATE GLOBAL TEMPORARY TABLE foo (userid NUMBER)")
sapply(r$userid, function(x) {
  dbGetQuery(con, paste("INSERT INTO foo (userid) VALUES (", x, ")"))
})
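
...then join the scratch table against the big table to pull back only the
matching rows.  A sketch (the column names x, y, z are borrowed from your
query below):

d <- dbGetQuery(con, "SELECT t.userid, t.x, t.y, t.z
                      FROM largetable t
                      JOIN foo f ON f.userid = t.userid")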

then later:

dbGetQuery(con, "DROP TABLE foo")
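
Since you're on RODBC rather than DBI, sqlSave() should be able to create
and populate the scratch table in one call (a sketch, untested against
Oracle; the table name "foo" is arbitrary):

library(RODBC)
sqlSave(connection, r, tablename = "foo", rownames = FALSE)

...after which sqlQuery() takes the place of dbGetQuery() for the join and
the final DROP.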

-Aaron

On Thu, Sep 11, 2008 at 3:21 PM, Avram Aelony <[EMAIL PROTECTED]> wrote:
>
> Perhaps I will need to create a temp table, but I am asking if there is a way 
> to avoid it.  It would be great if there were a way to tie the R data frame 
> temporarily to the query in a transparent fashion. If not, I will see if I 
> can create/drop the temp table directly from sqlQuery.
> -Avram
>
>
>
> On Thursday, September 11, 2008, at 12:07PM, "Aaron Mackey" <[EMAIL PROTECTED]> wrote:
>>Sorry, I see now you want to avoid this, but you did ask what was the
>>"best way to efficiently ...", and the temp. table solution certainly
>>matches your description.  What's wrong with using a temporary table?
>>
>>-Aaron
>>
>>On Thu, Sep 11, 2008 at 3:05 PM, Aaron Mackey <[EMAIL PROTECTED]> wrote:
>>> I would load your set of userids into a temporary table in Oracle,
>>> then join that table with the rest of your SQL query to get only the
>>> matching rows out.
>>>
>>> -Aaron
>>>
>>> On Thu, Sep 11, 2008 at 2:33 PM, Avram Aelony <[EMAIL PROTECTED]> wrote:
>>>>
>>>> Dear R list,
>>>>
>>>> What is the best way to efficiently marry an R dataset with a very large 
>>>> (Oracle) database table?
>>>>
>>>> The goal is to return only the Oracle table rows that match IDs present
>>>> in the R dataset.  I have an R data frame with 2000 user IDs, analogous
>>>> to:
>>>>
>>>> r = data.frame(userid = round(runif(2000) * 100000, 0))
>>>>
>>>> ...and I need to pull data from an Oracle table only for these 2000
>>>> IDs.  The Oracle table is quite large.  Additionally, the SQL query may
>>>> need to join to other tables to bring in ancillary fields.
>>>>
>>>> I currently connect to Oracle via odbc:
>>>>
>>>> library(RODBC)
>>>> connection <- odbcConnect("****", uid="****", pwd="****")
>>>> d = sqlQuery(connection, "select userid, x, y, z from largetable where timestamp > sysdate - 7")
>>>>
>>>> ...allowing me to pull data from the database table into the R object
>>>> "d" and then use the R merge function.  The problem, however, is that if
>>>> "d" is too large, the pull may fail due to memory limitations or be
>>>> inefficient.  I would like to push the merge portion to the database; it
>>>> would be very convenient if the query could look to the R object for the
>>>> IDs to which it should restrict the output.
>>>>
>>>> Is there a way to do this?
>>>> Something like the following fictional code:
>>>> d = sqlQuery(connection, "select t.userid, x, y, z from largetable t where r$userid = t.userid")
>>>>
>>>> Would sqldf (http://code.google.com/p/sqldf/) help me out here?  If so,
>>>> how?  That would be convenient and would let me avoid creating a
>>>> temporary table to store the R data, joining via SQL, and then returning
>>>> the data back to R.
>>>>
>>>> I am using R version 2.7.2 (2008-08-25) / i386-pc-mingw32 .
>>>> Thanks for your comments, ideas, recommendations.
>>>>
>>>>
>>>> -Avram
>>>>

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
