[datatable-help] split data table column aka tidyr separate function

Carl Sutton Sun, 18 Dec 2016 19:17:14 -0800

Hi,
I have searched for the last couple of days for a way to do this but have not
found a solution.  With real data, I have used tidyr to do the task, but:
1.  It has used all available memory (12 GB on an older desktop).
2.  Future tables will be even larger, so they would need to be split.
3.  It is slow, perhaps due to lack of free memory.
The data is provided in a format such that a variable "name" (and there are
several like this) actually contains both the variable name and an index,
e.g. var_09 is the ninth level of that variable.  The data analysis needs that
level as a separate variable.  Code and toy data set are below.
#  column split test
library(data.table)
library(tidyr)
#  data table for melt and columns split
dt1 <- data.table(a_1 = 1:10, b_2 = 20:29,folks = c("art","brian","ed",
                "rich","dennis","frank", "derrick","paul","fred","numnuts"),
                  a_2 = 2:11, b_1 = 21:30)
melt(dt1, id = "folks")  #  so far so good
dt1[,c("a") := tstrsplit(c(a_1),"_",fixed = TRUE)][,c("a") := tstrsplit(c(a_2),
                          "_",fixed = TRUE)][]
#  That is not producing what I want
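A possible reason, offered as a guess rather than a diagnosis: tstrsplit() splits the values held in a column, and the values in a_1 are the integers 1:10, which contain no underscore. The text that needs splitting is the column name itself. A small sketch on a made-up vector of names (nm, parts and vals are hypothetical, for illustration only):

```r
library(data.table)

# Hypothetical vector holding the column names from the toy table
nm <- c("a_1", "b_2", "a_2", "b_1")

# Split the names (not the cell values) on a literal underscore
parts <- tstrsplit(nm, "_", fixed = TRUE)
parts[[1]]   # the variable type part of each name
parts[[2]]   # the index part of each name, still character

# Splitting cell values that contain no underscore just returns
# them unchanged, converted to character
vals <- tstrsplit(1:3, "_", fixed = TRUE)
vals[[1]]
```

This suggests the names have to end up in a column (for example through melt) before tstrsplit() can act on them.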


#  tidyr gives what I want
df <- data.frame(a_1 = 1:10, b_2 = 20:29,folks = c("art","brian","ed",
                "rich","dennis","frank", "derrick","paul","fred","numnuts"),
                 a_2 = 2:11, b_1 = 21:30)
df %>% gather(value, nums, -folks) %>%
        separate(value, c("varTYpe","varIndex"))

Carl Sutton
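For comparison, a minimal data.table sketch of the same reshape: melt to long form, then split the melted name column by reference with tstrsplit(). The small table dt2 is made up for illustration, and the column names varType and varIndex are modeled on the tidyr call. Whether this is faster or lighter on memory than tidyr on the real data is an assumption that has not been tested here.

```r
library(data.table)

# Made-up toy table with the same layout as the one in the post
dt2 <- data.table(a_1 = 1:3, b_2 = 20:22,
                  folks = c("art", "brian", "ed"),
                  a_2 = 2:4, b_1 = 21:23)

# Long form: one row per (folks, original column name) pair;
# keep the old column names as character rather than factor
long <- melt(dt2, id.vars = "folks", variable.name = "value",
             value.name = "nums", variable.factor = FALSE)

# Split the old column names into two new columns, by reference
long[, c("varType", "varIndex") := tstrsplit(value, "_", fixed = TRUE)]

# Drop the unsplit name column and make the index an integer
long[, value := NULL]
long[, varIndex := as.integer(varIndex)]
long[]
```

Because := adds the columns in place, the long table is not copied again after melt, which may help when memory is tight.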

_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
