Probably easier to change it outside of R, e.g.
perl -pe 's{0742076391\?39524}{whatever}g' file > newfile
but you may want to check that it really is a '?' character and not just
printed that way.
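If you would rather do that check from within R, here is a minimal sketch. The line in `sample_line` is made up for illustration; in practice you would pull the offending line out of the real file. It extracts whatever sits between the two runs of digits and prints its raw bytes, so a genuine '?' shows up as the single byte 3f and anything else suggests an encoding problem.

```r
# Hypothetical line standing in for the offending line of the real file.
sample_line <- "1,5;0,0742076391?39524;2,25"

# Capture whatever sits between the two digit runs of the corrupt value.
suspect <- sub(".*0,0742076391(.+)39524.*", "\\1", sample_line)

# Show the underlying bytes; a plain ASCII question mark is the single byte 3f.
print(charToRaw(suspect))

# TRUE only if the captured text is exactly one literal '?'.
is_question_mark <- identical(charToRaw(suspect), charToRaw("?"))
print(is_question_mark)
```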
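You could of course write this in R along the lines of
  while (length(line <- readLines(infile, 1L)) > 0) {
    # fixed = TRUE so the '?' is matched literally, not as a regex operator
    line <- sub("0,0742076391?39524", "whatever", line, fixed = TRUE)
    writeLines(line, outfile)
  }
for suitable, already opened connections infile and outfile ('in' itself
is a reserved word in R, so it cannot be used as a variable name).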
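To make the loop concrete, here is a self-contained sketch that runs against a small temporary file rather than the real one. The file contents, the connection names `con_in` and `con_out`, and the replacement value "0" (taken from the suggestion in the question) are all assumptions for illustration. The connections are opened explicitly, because readLines() on an unopened connection starts again from the top of the file on every call.

```r
# Build a small stand-in for the real file (hypothetical contents).
src <- tempfile(fileext = ".csv")
dst <- tempfile(fileext = ".csv")
writeLines(c("a;b", "1,5;0,0742076391?39524", "2,5;3,25"), src)

# Open both connections up front so each readLines() call continues
# from where the previous one stopped.
con_in  <- file(src, open = "r")
con_out <- file(dst, open = "w")

# Stream one line at a time so the whole file never has to fit in memory.
while (length(line <- readLines(con_in, n = 1L)) > 0) {
  # fixed = TRUE treats the pattern as a literal string, '?' included.
  line <- sub("0,0742076391?39524", "0", line, fixed = TRUE)
  writeLines(line, con_out)
}

close(con_in)
close(con_out)

# The repaired copy should now parse with numeric columns.
fixed <- read.csv2(dst, stringsAsFactors = FALSE)
str(fixed)
```

For a file of several hundred MB, reading a block of lines per call (say n = 10000L) instead of one should be noticeably faster; sub() and writeLines() are vectorised, so the loop body stays the same.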
HTH
Allan
On 16/06/11 22:40, DanMik wrote:
I'm fairly new to R.
I have a huge csv file, of 400.000+ K, and now it looks like one of the
values is corrupt (it contains a ?, so one value becomes
"0,0742076391?39524").
Because of the size I can't edit it in a text editor, and the file took
several days to create (many calculations).
When I read the file it can't be converted to numbers because of this one
value, which I found with scan() and whose coordinates I have found.
I'm reading the file with:
x <- read.csv2("filename.csv", stringsAsFactors = FALSE)
Can I read the file with everything as numeric, and replace non-numeric
values with 0?
Or somehow correct this one value?
I have tried first reading the file, then setting the value to 0 and then
using as.matrix and afterwards as.numeric. This just creates a lot of NAs.
--
View this message in context:
http://r.789695.n4.nabble.com/Reading-corrupt-csv-and-replace-wrong-value-tp3603848p3603848.html
Sent from the datatable-help mailing list archive at Nabble.com.
_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help