"Kevin Wright" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > >From my experience (somewhat of a guess):
> Excel uses the first 16 rows of data to determine if a column is numeric or > character. The data type which is most common in the first 16 rows will then > be used for the whole column. I ran some experiments trying to force RODBC to read column 1 of my worksheet as character data (the data are mostly numbers with two exceptions, 275a and 275b, as mentioned earlier). Here's the base code: > library(RODBC) > channel <- odbcConnectExcel("U:/efg/lab/R/Krumlauf-Plasmid/construct list.xls") > plasmid <- sqlFetch(channel,"Sheet1", as.is=TRUE) > odbcClose(channel) > names(plasmid) [1] "Plasmid Number" "Plasmid" "Concentration" "Comments" "Lost" When Excel Sheet1 has rows 2:13 as an "X" to attempt to force treatment of column 1 as character data: > class(plasmid$"Plasmid Number") [1] "numeric" > typeof(plasmid$"Plasmid Number") [1] "double" > plasmid$"Plasmid Number"[1:20] [1] NA NA NA NA NA NA NA NA NA NA NA NA 2 3 4 5 6 7 8 9 Why would any software with 12 consecutive "X" character strings "assume" the data are purely numeric? Add one more "X" so rows 2:14 have an "X" to attempt to force treatment of column 1 as character data: > class(plasmid$"Plasmid Number") [1] "character" > typeof(plasmid$"Plasmid Number") [1] "character" > plasmid$"Plasmid Number"[1:20] [1] "X" "X" "X" "X" "X" "X" "X" "X" "X" "X" "X" "X" "X" NA NA NA NA NA NA NA So RODBC now recognizes "character" Xs in column 1 and then declares all numbers as invalid? These are incredibly (bad) assumptions. I say this is a "bug", but it may be an ODBC problem and not one with "R. And if this is not an official "bug", then it's a serious design problem. Minimally, this issue should be described in the R Data Import/Export document, which everyone is told to read before asking a question. It's frustrating when packages like this work for "toy" problems, and the documentation never mentions the pitfalls of real data. > The gregmisc bundle has a different read.xls function that uses a Perl > script (xls2csv) and seems to be safer with mixed-type columns. > Requires a working version of Perl. Thanks for this suggestion, but I think I'll just convert the Excel spreadsheet to a .csv and maintain it in that format. efg ______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html