Hi,
More questions in my ongoing quest to convert from RapidMiner to R.
One thing has become VERY CLEAR: None of the issues I'm asking about
here are addressed in RapidMiner. How it handles missing values,
scaling, etc. is hidden within the "black box". Using R is forcing me
to take a much deeper look at my data and how my experiments are
constructed. (That's a very "Good Thing")
So, on to the question...
I'm scaling data based on groups. I have it working well in a nice
loop. (This WORKS, but if someone has a faster/cleaner way, I'd be
curious.)
# group-wide normalization
groups <- unique(rawdata$group)
group_names <- grep('norm_', names(rawdata))
for (group in groups) {
  for (name in group_names) {
    rawdata[rawdata$code == group, name] <-
      c(scale(rawdata[rawdata$code == group, name]))
  }
}
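A possible cleaner version, as a rough sketch only: ave() applies a function within each group and hands the results back in the original row order, so the inner subsetting goes away. The data frame `toy` and its column names are made up for illustration; I'm assuming the grouping column is `code`, as in the loop above.

```r
# Toy data; the column names here are invented for illustration.
toy <- data.frame(
  code   = c("a", "a", "a", "b", "b", "b"),
  norm_x = c(1, 2, 3, 10, 20, 30),
  norm_y = c(5, 5, 8, 0, 1, 2)
)

# Names (not positions) of the columns to scale.
norm_cols <- grep("^norm_", names(toy), value = TRUE)

# ave() runs FUN on each group's values and returns a vector
# aligned with the original rows.
for (col in norm_cols) {
  toy[[col]] <- ave(toy[[col]], toy$code,
                    FUN = function(v) as.numeric(scale(v)))
}
```

This still calls scale() on each group, so it behaves the same as the loop for constant groups.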
My problem is that if the particular vector I'm scaling is all 0,
then scale() returns NaN for every value, which subsequently breaks
my SVM training.
> foo <- c(0,0,0,0,0)
> scale(foo)
     [,1]
[1,]  NaN
[2,]  NaN
[3,]  NaN
[4,]  NaN
[5,]  NaN
attr(,"scaled:center")
[1] 0
attr(,"scaled:scale")
[1] 0
I would have expected scale() to just return 0 for all the values.
Is there some trick to fixing this?
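To make the behavior I expected concrete, here is a minimal sketch of a wrapper. Note that `safe_scale` is a hypothetical helper name, not something that exists in base R, and the choice to only center when the standard deviation is exactly 0 is my assumption about what "sensible" means here.

```r
# safe_scale() is a hypothetical helper, not a base R function.
safe_scale <- function(v) {
  centered <- v - mean(v, na.rm = TRUE)
  s <- sd(v, na.rm = TRUE)
  # A constant vector has sd 0 (and a single value gives NA).
  # In that case skip the division and return the centered values,
  # which are 0 for every non-missing element.
  if (is.na(s) || s == 0) {
    return(centered)
  }
  centered / s
}

safe_scale(c(0, 0, 0, 0, 0))   # zeros rather than NaN
safe_scale(c(1, 2, 3))         # same values as c(scale(c(1, 2, 3)))
```

The check is for an sd of exactly 0, so a group whose values differ only by floating-point noise would still be divided by a tiny number; a small tolerance might be safer for real data.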
Thanks!
-Noah
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.