On 25-Feb-05 Sean Davis wrote:
> I have a commonly recurring problem and wondered if folks
> would share tips. I routinely get tab-delimited text files
> that I need to read in.
> In very many cases, I get:
>
> > a <- read.table('junk.txt.txt',header=T,skip=10,sep="\t")
> Error in scan(file = file, what = what, sep = sep, quote = quote,
> dec = dec, :
> line 67 did not have 88 elements
>
> I am typically able to go through the file and find a single
> quote or something like that causing the problem, but with a
> recent set of files, I haven't been able to find such an issue.
> What can I do to get around this problem? I can use perl, also....

Hi Sean,

This is only a shot in the dark, but your description has
reminded me of similar messes in files which have been
exported from Excel.

What I have often done in such cases, to check (e.g.) the
numbers of fields in records (using 'awk' on Linux), is on
the following lines:

  cat filename | awk 'BEGIN{FS="\t"} {print NF}' | sort -n | uniq

In that case, if there are varying numbers of fields then
two or more different numbers will be printed instead of
the single value which it should be.

If you know how many fields to expect (e.g. 88), then you
can find the line numbers of offending records by something
like

  cat filename | awk 'BEGIN{FS="\t"} {if(NF!=88){print NR}}'

In data files with a lot of fields per record, doing it in
this kind of way is vastly superior to trying to spot the
problem by eye -- it's extremely difficult to count 88
tab-separated fields on screen!

Hoping this helps! If not, supply further details and we'll
see what we can think up.

Best wishes,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <[EMAIL PROTECTED]>
Fax-to-email: +44 (0)870 094 0861
Date: 25-Feb-05                                       Time: 20:54:43
------------------------------ XFMail ------------------------------

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
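
The two awk checks described in the reply can be wrapped into a small
script, sketched below. The function names (field_counts, bad_lines,
suspect_chars) and the demonstration file are made up for illustration;
they are not part of the original post. The third function is an
addition, on the assumption that the culprit may be a character that
read.table treats specially by default (a single quote, a double quote,
or '#').

```shell
#!/bin/sh
# Sketch only: helper names and the demo file below are hypothetical.

# Print each distinct tab-separated field count, with how many lines have it.
field_counts() {
    awk 'BEGIN{FS="\t"} {print NF}' "$1" | sort -n | uniq -c
}

# Print "line-number field-count" for every line whose count differs from $2.
bad_lines() {
    awk -v want="$2" 'BEGIN{FS="\t"} NF != want {print NR, NF}' "$1"
}

# List lines (prefixed with their line number) containing ' or " or #.
suspect_chars() {
    grep -n "[\"'#]" "$1"
}

# Demonstration file: line 3 is one field short, line 4 contains a '#'.
demo=$(mktemp)
printf 'a\tb\tc\n1\t2\t3\n4\t5\n6\t#7\t8\n' > "$demo"

field_counts "$demo"
bad_lines "$demo" 3
suspect_chars "$demo"
```

Note that awk numbers lines from the top of the file, so any header
lines skipped in R (skip=10 in the question) are included in the line
numbers printed. Also, awk counts raw tabs only: a line that has the
expected number of fields here but is still rejected by R is a likely
candidate for a stray quote or '#', which suspect_chars would list.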