"Gabor Grothendieck" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> You could try using the COM interface rather than the ODBC
> interface.  Try code such as this:
>
> library(RDCOMClient)
> xls <- COMCreate("Excel.Application")
> xls[["Workbooks"]]$Open("MySpreadsheet.xls")
> sheet <- xls[["ActiveSheet"]]
> mydata <- sheet[["UsedRange"]][["value"]]
> xls$Quit()
>
> # convert mydata to a character matrix
> mydata.char <- matrix(unlist(mydata), nc = length(xx))

Gabor,

Thank you for that suggestion.  I try to avoid COM, but it seems to work
well with this problem.

Because I have empty cells, which are treated as NULLS, the unlist didn't
quite work.

Here's what I did:

library(RDCOMClient)
xls <- COMCreate("Excel.Application")
xls[["Workbooks"]]$Open("U:/efg/lab/R/Plasmid/construct list.xls")
sheet <- xls[["ActiveSheet"]]
mydata <- sheet[["UsedRange"]][["value"]]
xls$Quit()

for (column in 1:length(mydata))
{
  cat(column, " ", length(mydata[[column]]), " ",
length(unlist(mydata[[column]])), "\n")
}

The results show that while mydata is a list of columns, if you unlist each
column you'll be short by the number of NULL values.

1   1251   1251
2   1251   1198
3   1251   870
4   1251   327
5   1251   1250

This seemed a bit crude to fix that problem (can someone suggest a more
elegant way?):

mymatrix <- NULL
for (column in 1:length(mydata))
{
 # Use lappy to replace NULLs with "" strings, column-by-column
  mymatrix <- cbind(mymatrix, lapply(mydata[[column]], function(cell) {
ifelse(is.null(cell), "", cell) } ))
}
# Fix column names
colnames(mymatrix) <- mymatrix[1,]
mymatrix <- mymatrix[-1,]

> mymatrix[273:276,]
     Plasmid Number Plasmid
Concentration Comments Lost
[1,] 274            "yxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxy"         "1 ug/ul"
"4 mg"   ""
[2,] "275a"         "xyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyx" "1 ug/2
ul"   ""       ""
[3,] "275b"         "xyxyxyxyxyxyxyxyxyxyxyxyx"                   "1 ug/5
ul"   ""       ""
[4,] 276            "xyxyxyxyxyxyxyxyxyxyxyxyxyxy"                "1 ug/5
ul"   ""       "Assumed Lost"

Thank you for preserving "275a" and "275b" as the names here.

So, I'd recommend RDCOMClient over RODBC with Excel files.  "Being lucky"
shouldn't be part of processing Excel files.

efg

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Reply via email to