Hi,

I have a large table in Postgresql (result of an MCMC simulation, with 1
million rows) and I would like to retrive colums (correspond to variables)
using RODBC.  I have a column called "index" which is used to order rows.

Unfortunately, sqlQuery can't return all the values from a column at once
(RODBC complains about lack of memory).  So I am using the following code:

getcolumns <- function(channel, tablename, colnames, totalrows,
                      ordered=TRUE,chunksize=1e5) {
  r <- matrix(double(0),totalrows,length(colnames))
  for (i in 1:ceiling(totalrows/chunksize)) {
    cat(".")
    r[((i-1)*chunksize+1):(i*chunksize)] <- as.matrix(
      sqlQuery(channel, paste("SELECT", paste(colnames,collapse=", "),
                              "FROM", tablename,
                              "WHERE index <=", i*chunksize,
                              "AND index >", (i-1)*chunksize,
                              if (ordered) "ORDER BY index;" else ";")))
  }
  cat("\n")
  drop(r)                               # convert to vector if needed
}

to retrieve it in chunks.  However, this is very slow -- takes about 15
minutes on my machine.  Is there a way to speed it up?

I am running Linux on a powerbook, RODBC version 1.1-4, R 2.1.1.  The
machine has only 512 Mb of RAM.

Thanks,

Tamas

______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Reply via email to