
Re: StandardScaler in spark.ml.feature requires vector input?

2016-01-11 Thread Yanbo Liang

Hi Kristina, the input column of StandardScaler must be of vector type, because StandardScaler is usually used for feature scaling before model training, and the feature column is a vector in most cases. If you only want to standardize a numeric column, you can wrap it as a vector and feed into
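To make the "wrap it as a vector" idea concrete, here is a minimal plain-Python sketch; it is not Spark code. The helper names `wrap_as_vectors` and `standardize` are hypothetical, and the `with_mean` / `with_std` flags are only modeled on StandardScaler's `withMean` (default false) and `withStd` (default true) parameters. In Spark itself the wrapping step is commonly done with `VectorAssembler`, but that is an assumption here, since the message above is cut off before it names a method.

```python
from statistics import mean, stdev


def wrap_as_vectors(values):
    """Turn each scalar into a one-element vector (a plain list here)."""
    return [[float(v)] for v in values]


def standardize(vectors, with_mean=True, with_std=True):
    """Scale every vector dimension independently.

    Uses the sample standard deviation (n - 1 in the denominator).
    """
    dims = list(zip(*vectors))          # column-wise view of the vectors
    means = [mean(d) for d in dims]
    stds = [stdev(d) for d in dims]
    scaled = []
    for vec in vectors:
        row = []
        for x, m, s in zip(vec, means, stds):
            if with_mean:
                x = x - m               # center on the column mean
            if with_std and s != 0.0:   # skipping zero std is a choice of this sketch
                x = x / s               # scale to unit standard deviation
            row.append(x)
        scaled.append(row)
    return scaled


ages = [10, 20, 30]                     # made-up numeric column
scaled = standardize(wrap_as_vectors(ages))
scaled_no_center = standardize(wrap_as_vectors(ages), with_mean=False)
```

With centering switched off, the values are only divided by the standard deviation, so it is worth checking which flags are set whenever the scaled output looks unexpected.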

StandardScaler in spark.ml.feature requires vector input?

2016-01-09 Thread Kristina Rogale Plazonic

Hi, The code below gives me an unexpected result. I expected that StandardScaler (in ml, not mllib) would take a specified column of an input dataframe, subtract the mean of that column, and divide the difference by the column's standard deviation. However, Spark gives me the