Dear R-listers:
I want to import a reasonably big file (15797 rows x 257 columns) into a table. The file is tab-delimited, with NA in every empty cell. Below I have reproduced the read.table calls I used. I have read the R Data Import/Export FAQ and still couldn't solve my problem (I might have missed it, of course). I'm using R 2.0.1 on a Mac G4, OS X 10.3.7.
I can import the file, but one column "invades" the other: if there is an empty cell marked NA in the first column, that cell gets the value from the second column. I tried importing four different files (details below), and I think the problem is related to the number of columns (with fewer columns it works).
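One way to test the column-count suspicion is count.fields(), which reports how many fields read.table would see on each line. The following is only a minimal sketch: the toy file and its contents are made up for illustration, and it assumes (without evidence from the output below) that some field contains a space. Under the default sep = "" any whitespace splits fields, so such a field would be counted as two, while sep = "\t" would keep it whole.

```r
# Hypothetical toy file: tab-delimited, one field ("Gr 94a") contains a space.
toy <- tempfile(fileext = ".txt")
writeLines(c("UNIQID\tAll.FB.Id\tSymbol",
             "NA\t10006\tp53",
             "NA\t10007\tGr 94a"), toy)

# Fields per line under the default separator (any run of whitespace).
n_default <- count.fields(toy, sep = "")
# Fields per line when only tabs separate fields.
n_tab <- count.fields(toy, sep = "\t")

print(n_default)
print(n_tab)

# Reading with an explicit tab separator keeps the columns aligned.
dat <- read.table(toy, header = TRUE, sep = "\t", as.is = TRUE)
print(dat)
```

If the two counts differ on the real file, table(count.fields(...)) shows how many lines are affected.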
Workarounds I can think of:
a) Split my file into several files, import them, and then combine them into one object in R.
b) Try to learn basic commands in awk or perl?
Any advice on this?
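Workaround a) might look roughly like the sketch below. The two piece files and their contents are hypothetical stand-ins; the sketch assumes each piece holds different columns of the same rows, in the same row order, and that the pieces are tab-delimited.

```r
# Hypothetical pieces: two tab-delimited files holding different columns
# of the same rows, in the same row order.
part1 <- tempfile(fileext = ".txt")
part2 <- tempfile(fileext = ".txt")
writeLines(c("UNIQID\tAll.FB.Id", "NA\t10006", "NA\t10007"), part1)
writeLines(c("Symbol\tFB.gn", "p53\tFBgn0039044", "Gr94a\tFBgn0041225"), part2)

# Read every piece with the same settings.
pieces <- lapply(c(part1, part2), read.table,
                 header = TRUE, sep = "\t", as.is = TRUE)

# All pieces must have the same number of rows before binding columns.
stopifnot(length(unique(sapply(pieces, nrow))) == 1)

# Bind the pieces side by side and label the rows a1, a2, ...
combined <- do.call(cbind, pieces)
rownames(combined) <- paste("a", seq_len(nrow(combined)), sep = "")
print(combined)
```

Binding by position is only safe when the row order is identical in every piece; if the pieces share a key column, merge() on that key would be the more cautious choice.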
Another question (much less important): I have a binary S-PLUS file for this object. I exported the object in S-PLUS as the FAQ says (data.dump), but data.restore doesn't exist as a function in my R session. Is it because I'm using a Mac?
Details of what I did:
##
a) Importing a shorter version of my file (58 columns): I get the "invading" behaviour and a row.names column whose origin I don't understand. (UNIQID should be empty and 10006 should be in All.FB.Id.)
AllFBImpFields <- read.table('AllFBAllFieldsNAShorter.txt', fill=TRUE, header=TRUE,
    row.names=paste('a', 1:15797, sep=''),
    as.is=TRUE, nrows=15797)
AllFBImpFields[1:2,1:5]
   row.names UNIQID All.FB.Id All.FB.5 All.FB.4
a1      <NA>  10006      <NA>     <NA>     <NA>
a2      <NA>  10007      <NA>     <NA>     <NA>
##
b) Importing only 5 columns of the previous file: it works. There is no "invasion" and the row.names column is not inserted.
AllFB5Cols <- read.table('AllFB5Cols.txt', fill=TRUE, header=TRUE,
    row.names=paste('a', 1:15797, sep=''),
    as.is=TRUE, nrows=15797)
AllFB5Cols[1:2,1:5]
   UNIQID All.FB.Id Symbol       FB.gn CG.name
a1   <NA>     10006    p53 FBgn0039044 CG10873
a2   <NA>     10007  Gr94a FBgn0041225 CG31280
##
c) Importing a file with 4 rows and 58 columns: I get the invasion behaviour, plus a warning that I don't get in a), although the first 4 rows of the file are the same as in a).
x4rowsAllCol <- read.table('AllFB4rowsAllCols.txt', fill=TRUE, header=TRUE,
    row.names=paste('a', 1:4, sep=''),
    as.is=TRUE, nrows=4)
Warning message:
incomplete final line found by readTableHeader on `AllFB4rowsAllCols.txt'
x4rowsAllCol[1:2,1:5]
   row.names UNIQID All.FB.Id All.FB.5 All.FB.4
a1        NA  10006        NA       NA       NA
a2        NA  10007        NA       NA       NA
##
d) Importing a file with 4 rows and 5 columns: the result is like b), but it gives the same warning as c).
x4rows5cols <- read.table('AllFB4rows5cols.txt', fill=TRUE, header=TRUE,
    row.names=paste('a', 1:4, sep=''),
    as.is=TRUE, nrows=4)
Warning message:
incomplete final line found by readTableHeader on `AllFB4rows5cols.txt'
x4rows5cols[1:2,1:5]
   UNIQID All.FB.Id All.FB.5 All.FB.4 All.FB.3
a1     NA     10006       NA       NA       NA
a2     NA     10007       NA       NA       NA
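On the warning in c) and d): as far as I understand it, "incomplete final line" means the last line of the file does not end with a newline, which could easily happen when a few rows are cut out of a bigger file by hand. The sketch below checks for that and appends the missing newline. The toy file is made up, and ends_with_newline() is a hypothetical helper written for this illustration, not a base R function.

```r
# Hypothetical helper: TRUE if the last byte of the file is a newline (0x0A).
ends_with_newline <- function(path) {
  bytes <- readBin(path, what = "raw", n = file.size(path))
  length(bytes) > 0 && bytes[length(bytes)] == as.raw(10L)
}

# Toy file written without a trailing newline (cat() adds none by itself).
toy <- tempfile(fileext = ".txt")
cat("UNIQID\tAll.FB.Id\nNA\t10006", file = toy)

before <- ends_with_newline(toy)

# Append the missing newline only when needed.
if (!before) cat("\n", file = toy, append = TRUE)

after <- ends_with_newline(toy)

# The repaired file now ends with a properly terminated line.
dat <- read.table(toy, header = TRUE, sep = "\t")
print(dat)
```

If this is the cause, it would also explain why a) gives no warning: the full-length file presumably ends with a newline, while the 4-row extracts do not.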
