No, it wasn't known. Now added to the to-do list.

Thanks!
Matthew

On 05.01.2013 21:05, patricknic wrote:
Hit a snag reading some imperfect data. I'm not sure what it was exported from, but the file has some lines with consecutive quotation marks (i.e., a character field actually contained quotation marks before it was written to a text file). Not sure if this is a known issue. A reproducible example:

text <- paste(rep(c('a,b,c,d,e,f\na,b,c,"d",e,f\na,b,c,""d"",e,f'), 10000),
collapse="\n")
f <- tempfile()
writeLines(text, f)

df <- read.table(f, sep=",")
dt <- fread(f, sep=",", header=FALSE)


No error for read.table, but I get this error for fread:

Error in fread(f, sep = ",", header = FALSE) :
Expected sep (',') but 'd' ends field 4 on line 30 when detecting types:
a,b,c,""d
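
To see which lines contain the consecutive quotation marks, here is a minimal sketch in base R only. `find_doubled_quotes` is a hypothetical helper name, not part of data.table, and the three-line file is a cut-down version of the example above:

```r
# Hypothetical helper (not part of data.table): return the line numbers
# of every line in a file that contains two consecutive double quotes.
find_doubled_quotes <- function(path) {
  lines <- readLines(path)
  which(grepl('""', lines, fixed = TRUE))
}

# Small stand-in for the file from the example above
tmp <- tempfile()
writeLines(c('a,b,c,d,e,f', 'a,b,c,"d",e,f', 'a,b,c,""d"",e,f'), tmp)

bad <- find_doubled_quotes(tmp)
print(bad)
```

Only the line holding ""d"" should be reported; the plain and singly quoted lines are left alone.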


This also gave me an idea for a suggestion: text replacement in readfile.c. (I'm no C programmer, so I don't know if this would be more trouble than it's worth. Also, not sure if it is in your project scope.) An R mock-up (still using fread) of this would be something like:

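freadWrapper <- function(input=f, eliminate='"', ...) {
  A <- readLines(input)          # read from the argument, not the global f
  B <- gsub(eliminate, "", A)    # strip every match of the pattern
  C <- paste(B, collapse="\n")   # rebuild a single newline-separated string
  fread(C, ...)
}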
freadWrapper(f, sep=",", stringsAsFactors=FALSE, header=FALSE)
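
A gentler variant of the same idea, sketched in base R only so that it runs without data.table: collapse each doubled quote to a single quote instead of deleting every quote. `collapse_doubled_quotes` is a hypothetical name. This assumes the doubled quotes only ever wrap a whole field, as in the example; it would mangle properly escaped quotes inside an already quoted field.

```r
# Hypothetical helper: replace each pair of consecutive double quotes
# with one double quote, leaving all other characters untouched.
collapse_doubled_quotes <- function(path) {
  lines <- readLines(path)
  gsub('""', '"', lines, fixed = TRUE)
}

# Small stand-in for the file from the example above
tmp2 <- tempfile()
writeLines(c('a,b,c,d,e,f', 'a,b,c,"d",e,f', 'a,b,c,""d"",e,f'), tmp2)

cleaned <- collapse_doubled_quotes(tmp2)

# Parse the cleaned text with base R; fread could be substituted here
df2 <- read.csv(text = paste(cleaned, collapse = "\n"),
                header = FALSE, stringsAsFactors = FALSE)
```

After cleaning, the third line has the same form as the second, which was already being read without error.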





--
View this message in context:

http://r.789695.n4.nabble.com/New-function-fread-in-v1-8-7-tp4653745p4654754.html
Sent from the datatable-help mailing list archive at Nabble.com.
_______________________________________________
datatable-help mailing list
[email protected]

https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
