On 10/09/2010 10:03 AM, Marcelo Estácio wrote:


Dear,

When I try to execute the following command, R doesn't read all the lines (it reads 
only 57658 lines, although the file has 814125 lines):



dados2 <- read.table(
  "C:\\Documents and Settings\\mgoncalves\\Desktop\\Tábua IFPD\\200701_02_03_04\\SegurosClube.txt",
  header = FALSE, sep = "^",
  colClasses = c("character", "character", "NULL", NA, "NULL", "NULL", "NULL",
                 "character", "character", "NULL", "NULL", "NULL", "NULL", NA,
                 "NULL", "NULL", "NULL", "NULL", NA, "NULL", "NULL"),
  quote = "", comment.char = "", skip = 1, fill = TRUE)

If I exclude "fill=TRUE", R gives the message



Warning message:
In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
   número de itens não é múltiplo do número de colunas (number of items is not a 
multiple of the number of columns)



I identified that the problem is the following line of my data (line 57659 of 
my file):



13850074571^01/01/1940^00000000000^93101104^^^1^01/05/2006^30/06/2006^13479^13479^13479^0^0^0^0^^66214-Previdência
 privada fechada^MARIA^DA CONCEI`O FERREIRA LOBATO^CORPORATE


As you can see, my data contain a "square" character like this:  (I don't know 
if you can see the character, but it looks like a white square). It seems that R 
interprets this character as the end of the file.

I opened my data in Notepad and copied the character. When I paste this 
character into R, it tries to close, asking if I want to save my work. What is 
happening?

That symbol is how some systems display the hex 1A character (Ctrl-Z), which marked the end of a file in DOS. Judging by the pathname, you are working on Windows, which has inherited that behaviour.
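
To confirm that hex 1A bytes really are present, and on which lines, something like the following might help. It is only a sketch: `find_sub_lines` is a made-up helper name, not a base R function, and the demonstration file is invented for illustration. It reads the file as raw bytes, so the 1A byte is treated like any other byte.

```r
# Hypothetical helper: report the line number of every 0x1A byte in a file.
find_sub_lines <- function(path) {
  # Read the whole file as raw bytes; nothing is interpreted as end-of-file here
  bytes <- readBin(path, what = "raw", n = file.info(path)$size)
  # Running count of newline (0x0A) bytes seen up to each position
  newline_count <- cumsum(bytes == as.raw(0x0a))
  # A byte sits on line (newlines before it) + 1
  newline_count[bytes == as.raw(0x1a)] + 1L
}

# Made-up three-line file with a 0x1A byte in the middle of line 2
demo <- tempfile(fileext = ".txt")
writeBin(c(charToRaw("x^y\nfoo"), as.raw(0x1a), charToRaw("bar\nz^w\n")), demo)

sub_lines <- find_sub_lines(demo)
print(sub_lines)
```

On the real file you would pass its path instead of `demo`; if the diagnosis is right, line 57659 should be among the values returned.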

The best way to get around it would be to correct those bad characters: they are almost certainly errors in the data file. If you want to keep them, then you could try reading the file in binary mode rather than text mode. You do this using

con <- file( "filename", open="rb")
read.table(con, header=FALSE, ...)
close(con)

You could also try reading it on a different OS; I don't think Linux cares about 1A characters.
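
If you decide to correct the bad characters, one possible approach is to copy the file with the 1A bytes removed and then read the cleaned copy as usual. This is a sketch under assumptions: `strip_sub_bytes` is a hypothetical helper, the small demonstration file is invented, and it assumes the whole file fits in memory. Whether dropping the byte (rather than replacing it with the intended letter) is acceptable depends on your data.

```r
# Hypothetical helper: write a copy of `infile` without any 0x1A bytes.
strip_sub_bytes <- function(infile, outfile) {
  # Raw read: every byte is returned, including 0x1A
  bytes <- readBin(infile, what = "raw", n = file.info(infile)$size)
  bad <- bytes == as.raw(0x1a)
  writeBin(bytes[!bad], outfile)
  sum(bad)  # number of bytes removed
}

# Made-up demonstration file with one embedded 0x1A byte on the second line
infile  <- tempfile(fileext = ".txt")
outfile <- tempfile(fileext = ".txt")
writeBin(c(charToRaw("a^b\nc"), as.raw(0x1a), charToRaw("d^e\nf^g\n")), infile)

removed <- strip_sub_bytes(infile, outfile)

# The cleaned copy can then be read in the usual way
dados <- read.table(outfile, header = FALSE, sep = "^", quote = "",
                    comment.char = "", colClasses = "character")
print(removed)
print(nrow(dados))
```

The same `read.table` arguments as in the original call (colClasses, skip and so on) would be used on the cleaned copy of the real file.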

Duncan Murdoch



Thanks very much.

Marcelo Estácio

                                        



______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

