Many thanks. I'll take a look. If you can find a way to narrow
down the problem, it should be quicker to resolve. Does it
happen with the first 2 items passed to rbindlist, or the first
10? Which item causes the NA? If each item is chopped to its
first 2 rows, does it still happen?
Also if the list of data.table/data.frame passed to rbindlist
is called L, and rbindlist(L) returns an NA column, does
lapply(L, sapply, class) reveal any type differences?
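As a rough illustration of that check, here is a minimal base R sketch. The list L, its column names and its values are made up for the example (they are not the census data), and plain data.frames are used so that it runs without data.table:

```r
# Hypothetical input: the third item has 'lat' as character instead of numeric.
L <- list(
  data.frame(id = 1:2, lat = c(30.1, 30.2)),
  data.frame(id = 3:4, lat = c(31.1, 31.2)),
  data.frame(id = 5:6, lat = c("+32.1", "+32.2"), stringsAsFactors = FALSE)
)

# One row per item of L, one column per column name: the class of each column.
classes <- t(sapply(L, function(x) sapply(x, function(col) class(col)[1])))
print(classes)

# Items whose column classes differ from those of the first item.
mismatch <- which(apply(classes, 1, function(r) any(r != classes[1, ])))
print(mismatch)
```

Any index reported in mismatch would be a good candidate for the item that triggers the coercion.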
It does sound like rbindlist should be issuing a warning, or at
least being more helpful, anyway.
Hm. It seems I put it in but commented it out:
if (TYPEOF(thiscol) != TYPEOF(target)) {
  thiscol = PROTECT(coerceVector(thiscol, TYPEOF(target)));
  coerced = TRUE;
  // TO DO: options(datatable.pedantic=TRUE) to issue this warning :
  // warning("Column %d of item %d is type '%s', inconsistent with column %d of item %d's type ('%s')",
  //         j+1, i+1, type2char(TYPEOF(thiscol)), j+1, first+1, type2char(TYPEOF(target)));
}
That coerce is likely what creates the NA. Column types are taken from
the first item of L. If a column is 'numeric' there but character in a
later item of L, any value that cannot be read as a number becomes NA.
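The effect can be reproduced in base R alone. This is only an analogue of the coerceVector call in the C code, not the rbindlist code itself, and the vectors target and thiscol below are made-up values that merely borrow the C variable names:

```r
# A numeric target column (as in the first item) and a character column
# from a later item; the values are made up for illustration.
target  <- c(1.5, 2.5)
thiscol <- c("3.5", "n/a")

# Coercing the character vector to the target's type turns any value that
# does not parse as a number into NA (R warns "NAs introduced by coercion";
# the warning is suppressed here to keep the output short).
coerced <- suppressWarnings(as.numeric(thiscol))
print(coerced)

# The combined column then carries the NA along.
print(c(target, coerced))
```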
Thinking about it, it can probably coerce the target to cope with the
later item ...
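Until something like that exists inside rbindlist, the same idea could be tried on the R side before binding. The sketch below is an assumption-laden illustration: promote_to_character is a hypothetical helper (not part of data.table), it only handles the numeric-versus-character case, and it uses base R data.frames with made-up values:

```r
# Hypothetical helper (not part of data.table): make a column character in
# every item if it is character in at least one item.
promote_to_character <- function(L) {
  cols <- names(L[[1]])
  for (j in cols) {
    if (any(sapply(L, function(x) is.character(x[[j]])))) {
      L <- lapply(L, function(x) { x[[j]] <- as.character(x[[j]]); x })
    }
  }
  L
}

# Made-up example: 'code' is numeric in the first item, character in the second.
L <- list(
  data.frame(id = 1:2, code = c(10, 20)),
  data.frame(id = 3:4, code = c("30", "x40"), stringsAsFactors = FALSE)
)
L2 <- promote_to_character(L)
combined <- do.call("rbind", L2)
print(combined)
```

Promoting to character loses nothing, whereas coercing the later item down to numeric would turn "x40" into NA.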
On 03.01.2013 20:30, patricknic wrote:
Apologies, I forgot to switch the directories in the code. Corrected on
Nabble and below.
# Directories
tempwd <- tempdir()
setwd(tempwd)
# Packages
library(dataframe)
library(data.table)
library(foreign)
# Get blocks and coordinates
state.fips <- as.character(c(paste0(0, c(1:2, 4:6, 8:9)), 10:13, 15:42,
                             44:51, 53:56))
tmpf <- tempfile(fileext=".zip")
dtlist <- lapply(state.fips, function(fips) {
  cat("State", fips, ":\t")
  nm <- paste0("tl_2011_", fips, "_tabblock")
  dbfname <- paste0(nm, ".dbf")
  if (!file.exists(file.path(tempwd, dbfname))) {
    cat("Downloading...\t")
    url <- paste0("http://www2.census.gov/geo/tiger/TIGER2011/TABBLOCK/",
                  nm, ".zip")
    # Download to tmpf, the temporary zip path defined above
    download.file(url, destfile=tmpf, quiet=FALSE)
    unzip(tmpf, exdir=tempwd)
  }
  # Remove every extracted file for this state except the .dbf
  del <- dir(tempwd, pattern=nm)
  invisible(lapply(del[grep("dbf", del, invert=TRUE)], file.remove))
  cat("Reading...\t")
  df <- read.dbf(dbfname, as.is=TRUE)
  dt <- as.data.table(df)
  cat("Done\n")
  dt[, list(blockfips = GEOID, land_area = ALAND, water_area = AWATER,
            long = as.numeric(INTPTLON),
            lat = as.numeric(INTPTLAT))]
})
b <- rbindlist(dtlist)
### No NA problem:
dtlist2 <- lapply(dtlist, as.data.frame)
b2 <- do.call("rbind", dtlist2)
--
View this message in context:
http://r.789695.n4.nabble.com/NAs-introduced-by-coercion-in-rbindlist-tp4654576p4654577.html
Sent from the datatable-help mailing list archive at Nabble.com.
_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help