Hi
I have searched for the last couple of days for a way to do this but have not
found a solution. With real data, I have used tidyr to do the task, but:
1. It has used all available memory (12 GB on an older desktop).
2. Future tables will be even larger, so they would need to be split.
3. It is slow, perhaps due to lack of free memory.
The data is provided in a format such that a variable "name" (and there are
several like this) actually contains both the variable name and an index, e.g.
var_09 is the ninth level of that variable. The data analysis needs that
level as a separate variable. Code and a toy data set are below.
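
To make the naming convention concrete, here is a minimal sketch of splitting such names into a name part and an index part. The vector `nms` and the result names `var_type` and `var_index` are made up for illustration; only `tstrsplit()` from data.table is assumed.

```r
library(data.table)

# Hypothetical column names following the "name_index" convention
nms <- c("var_09", "var_10", "a_1")

# tstrsplit() splits each string and returns one list element per piece
parts <- tstrsplit(nms, "_", fixed = TRUE)

var_type  <- parts[[1]]              # "var" "var" "a"
var_index <- as.integer(parts[[2]])  # 9 10 1
```

The index comes back as character, so it is converted explicitly; leading zeros such as "09" are dropped by `as.integer()`.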
# column split test
library(data.table)
library(tidyr)
# data table for melt and columns split
dt1 <- data.table(a_1 = 1:10, b_2 = 20:29,
                  folks = c("art", "brian", "ed", "rich", "dennis",
                            "frank", "derrick", "paul", "fred", "numnuts"),
                  a_2 = 2:11, b_1 = 21:30)
melt(dt1, id = "folks") # so far so good
dt1[, c("a") := tstrsplit(c(a_1), "_", fixed = TRUE)][
    , c("a") := tstrsplit(c(a_2), "_", fixed = TRUE)][]
# That is not producing what I want
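
One possible data.table route, sketched here as an assumption rather than a confirmed answer: the attempt above splits the values of the wide columns, whereas the name and index live in the column names, which only become data after `melt()`. Splitting the melted `variable` column may therefore be closer to the tidyr result. The table `dt` is a cut-down, hypothetical version of the toy data, and the output names `varType` and `varIndex` are chosen for illustration.

```r
library(data.table)

# Cut-down toy table (hypothetical), same layout as the data above
dt <- data.table(folks = c("art", "brian"),
                 a_1 = 1:2, b_2 = 20:21, a_2 = 2:3, b_1 = 21:22)

# Wide to long; keep `variable` as character so it can be split directly
long <- melt(dt, id.vars = "folks", variable.factor = FALSE)

# Split "a_1" into "a" and "1", adding both columns by reference
long[, c("varType", "varIndex") := tstrsplit(variable, "_", fixed = TRUE)]
long[, varIndex := as.integer(varIndex)]  # index as a number, not text
long[, variable := NULL]                  # drop the combined column

print(long)
```

Because `:=` adds the columns in place instead of copying the table, this might use less memory than the gather/separate pipeline, though that has not been measured here.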
# tidyr gives what I want
df <- data.frame(a_1 = 1:10, b_2 = 20:29,
                 folks = c("art", "brian", "ed", "rich", "dennis",
                           "frank", "derrick", "paul", "fred", "numnuts"),
                 a_2 = 2:11, b_1 = 21:30)
df %>%
  gather(value, nums, -folks) %>%
  separate(value, c("varTYpe", "varIndex"))

Carl Sutton
_______________________________________________
datatable-help mailing list
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help