Hi
I had thought I was finished with this aspect of the project, but yesterday
this error appeared:
Error in strsplit(as.character(x), ...) : object 'variable' not found
This occurs immediately after melting a 363-column data table. The head of
the melted variable column shows:
> head(races_1$variable)
[1] raceDate_1 raceDate_1 raceDate_1 raceDate_1 raceDate_1 raceDate_1
116 Levels: raceDate_1 raceDate_2 raceDate_3 raceDate_4 ... Winner_7
Now admittedly melt returns the variable column as a factor, but that works
just fine with toy data. I am perplexed why it bombs on real data. The error
message appears nonsensical because it is obvious the column does exist.
Also, since the error specifies STRSPLIT, not TSTRSPLIT, it may just be one
of base R's nonsensical error messages. Helpful to know there is an error,
but....
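A quick sanity check that the factor by itself is not the problem. This is only a sketch: `f` is a made-up vector, not my real data. The error text itself shows the input being wrapped in as.character() before it reaches strsplit, so a factor should split the same way as a character vector would.

```r
library(data.table)

# Hypothetical factor mimicking the melted `variable` column
f <- factor(c("raceDate_1", "raceDate_2", "Winner_7"))

# tstrsplit() on a factor: split on "_" and transpose into a list of columns
parts <- tstrsplit(f, "_")
str(parts)
```

If this runs cleanly, the factor class of the column can probably be ruled out as the cause.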
Here is the code leading up to this error:
suppressMessages(library(data.table))
races.names <- colnames(races)
id_vars <- races.names[1:14]
measure_vars <- races.names[15:363]  # yes, I have to reset the mode afterwards.
system.time(races_1 <- melt(races, id = id_vars, measure = measure_vars))
# Separate variable name from prior race numbers sequence (1:10)
races_1 <- races[, c("MPdata","PriorRaceSeq") :=
tstrsplit(variable, "_")]
> races_1 <- races[, c("MPdata","PriorRaceSeq") :=
+ tstrsplit(variable, "_")]
Error in strsplit(as.character(x), ...) : object 'variable' not found
And the str() of the data after this:
> str(races_1)
Classes ‘data.table’ and 'data.frame': 48511 obs. of 16 variables:
$ TrackToday : chr "AQU" "AQU" "AQU" "AQU" ...
$ DateToday : int 20120101 20120101 20120101 20120101 20120101 20120101 20120101 20120101 20120101 20120101 ...
$ RaceNumberToday : int 1 1 1 1 1 1 1 1 2 2 ...
$ PostPositionToday : int 1 2 3 4 5 6 7 8 1 2 ...
$ DistanceToday : int 1320 1320 1320 1320 1320 1320 1320 1320 1320 1320 ...
$ SurfaceToday : chr "d" "d" "d" "d" ...
$ RaceTypeToday : chr "AO" "AO" "AO" "AO" ...
$ RaceClassToday : chr "OClm 50000nw1" "OClm 50000nw1" "OClm 50000nw1" "OClm 50000nw1" ...
$ PurseToday : int 51000 51000 51000 51000 51000 51000 51000 51000 60000 60000 ...
$ ClaimingPriceToday: int 50000 50000 50000 50000 50000 50000 50000 50000 NA NA ...
$ MorningLineOdds : num 2 20 4 30 2.5 8 5 15 20 2 ...
$ HorseName : chr "FUNKY MUNKY MAMA" "STARSHIP WARPSPEED" "SIGGI THE ALIEN" "SHANDREA" ...
$ HDWrunStyle : chr "E " "P " "EP " "E " ...
$ DaysSinceLastRace : int 78 45 31 17 39 30 17 50 NA NA ...
$ variable : Factor w/ 349 levels "raceDate_1","raceDate_2",..: 1 1 1 1 1 1 1 1 1 1 ...
$ value : chr "20111015" "20111117" "20111201" "20111215" ...
- attr(*, ".internal.selfref")=<externalptr>
Toy data that works, thanks to a prior help request:
library(data.table)
library(tidyr)
# data table for melt and columns split
dt1 <- data.table(a_1 = 1:10, b_2 = 20:29,
                  folks = c("art", "brian", "ed", "rich", "dennis", "frank",
                            "derrick", "paul", "fred", "numnuts"),
                  a_2 = 2:11, b_1 = 21:30)
melted <- melt(dt1, id = "folks")[, c("varType", "varIndex") :=
    tstrsplit(variable, "_")][, variable := NULL]
What is also puzzling is that the next statement sets column "variable" to
NULL and that works. Thus logic says it is not the column that is missing
but "something" else entirely.
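One thing that may be worth ruling out: the failing statement indexes races (the wide table that went into melt), not races_1 (the melted result), and only the melted table has a "variable" column. Below is a minimal sketch of that situation on toy data. The names `wide` and `long` are made up for illustration, and this is an assumption about the cause, not a confirmed diagnosis.

```r
library(data.table)

# Hypothetical toy table standing in for the wide `races` table
wide <- data.table(folks = c("art", "brian"), a_1 = 1:2, a_2 = 3:4)
long <- melt(wide, id.vars = "folks")  # melt adds the factor column `variable`

# `:=` on the un-melted table: it has no `variable` column to look up,
# so capture whatever error message comes back instead of stopping
res <- tryCatch(
  wide[, c("varType", "varIndex") := tstrsplit(variable, "_")],
  error = function(e) conditionMessage(e)
)
print(res)

# The same `:=` on the melted table, where `variable` exists
long[, c("varType", "varIndex") := tstrsplit(variable, "_")]
print(long)
```

If the first call reproduces the message and the second one succeeds, then the table being indexed, rather than the factor column, would be the place to look.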
Any ideas greatly appreciated.
Carl Sutton