No, it wasn't known. Now added to the to-do list.
Thanks!
Matthew
On 05.01.2013 21:05, patricknic wrote:
Hit a snag reading some imperfect data. I'm not sure what it was exported
from, but the file has some lines with consecutive quotation marks (i.e., a
character field actually contained quotation marks before it was written to
a text file). Not sure if this is a known issue. A reproducible example:
text <- paste(rep(c('a,b,c,d,e,f\na,b,c,"d",e,f\na,b,c,""d"",e,f'), 10000),
              collapse="\n")
f <- tempfile()
writeLines(text, f)
df <- read.table(f, sep=",")
dt <- fread(f, sep=",", header=FALSE)
No error for read.table, but I get this error for fread:
Error in fread(f, sep = ",", header = FALSE) :
Expected sep (',') but 'd' ends field 4 on line 30 when detecting types:
a,b,c,""d
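Before deciding how to handle such a file, it may help to find which lines contain consecutive quotation marks. The following is a minimal base R sketch on a three-line version of the example above; the variable names `lines` and `bad` are only illustrative. Note that a legitimately empty quoted field (`""`) would be flagged as well, so the result is a list of candidates, not a list of definite errors.

```r
# Three-line version of the example data from this message
lines <- c('a,b,c,d,e,f',
           'a,b,c,"d",e,f',
           'a,b,c,""d"",e,f')

# Indices of lines containing two consecutive double quotes.
# fixed = TRUE matches the literal string rather than a regular expression.
bad <- grep('""', lines, fixed = TRUE)

print(bad)
print(lines[bad])
```

For a file on disk, `lines` could be obtained with `readLines()` first.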
This also gave me an idea for a suggestion: text replacement in readfile.c.
(I'm no C programmer, so I don't know if this would be more trouble than
it's worth. Also, not sure if it is in your project scope.) An R mock-up
(still using fread) of this would be something like:
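freadWrapper <- function(input=f, eliminate='"', ...) {
  A <- readLines(input)           # read the file passed in, not the global f
  B <- gsub(eliminate, "", A)     # drop every match of 'eliminate'
  C <- paste(B, collapse="\n")    # rebuild a single newline-separated string
  fread(C, ...)                   # parse the cleaned text
}
freadWrapper(f, sep=",", stringsAsFactors=FALSE, header=FALSE)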
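The same replacement idea can be tried without fread, which makes it easy to check what the cleaning step does on its own. This is only a sketch in base R: `stripChars` is a hypothetical helper name, not part of data.table, and it writes the cleaned text to a new temporary file so that any reader can be pointed at it. It assumes the quotes carry no meaning; if a quoted field contains the separator, removing the quotes would split that field.

```r
# Hypothetical helper (not part of data.table): write a cleaned copy of a
# file with every literal occurrence of 'eliminate' removed, return its path.
stripChars <- function(input, eliminate = '"') {
  lines <- readLines(input)
  cleaned <- gsub(eliminate, "", lines, fixed = TRUE)  # literal match, no regex
  out <- tempfile(fileext = ".csv")
  writeLines(cleaned, out)
  out
}

# Three-line version of the example data from this message
f <- tempfile()
writeLines(c('a,b,c,d,e,f',
             'a,b,c,"d",e,f',
             'a,b,c,""d"",e,f'), f)

g <- stripChars(f)
cleaned <- readLines(g)
df <- read.table(g, sep = ",", stringsAsFactors = FALSE)
print(df)
```

The returned path `g` could then be passed to whichever reader is preferred in place of the original file.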
--
View this message in context:
http://r.789695.n4.nabble.com/New-function-fread-in-v1-8-7-tp4653745p4654754.html
Sent from the datatable-help mailing list archive at Nabble.com.
_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help