Re: Standard preprocessing/scaling

dataginjaninja Thu, 29 May 2014 05:33:07 -0700

I do see the issue with centering sparse data. In practice, though, centering
matters less than scaling by the standard deviation: it is the lack of unit
variance that causes the convergence issues and long runtimes.


Will RowMatrix compute the variance of a column?
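
To make the scaling-without-centering idea concrete, here is a minimal sketch in plain Python. It is not MLlib code. It assumes each row is a dict mapping column index to a nonzero value, and the helper names (column_variances, scale_to_unit_variance) are hypothetical. The variance of each column is computed from running sums over the stored entries only, and each stored entry is then divided by its column's standard deviation, so zeros stay zero and the sparsity pattern is unchanged.

```python
import math


def column_variances(rows, num_cols):
    """Sample variance per column for sparse rows given as {col_index: value}.

    Entries missing from a row are treated as zeros, so they add nothing to
    the running sums but still count toward the number of rows n.
    """
    n = len(rows)
    if n < 2:
        raise ValueError("need at least two rows to compute a sample variance")
    sums = [0.0] * num_cols
    sq_sums = [0.0] * num_cols
    for row in rows:
        for j, v in row.items():
            sums[j] += v
            sq_sums[j] += v * v
    # var = (sum(x^2) - (sum(x))^2 / n) / (n - 1), clamped at 0 against round-off
    return [
        max((sq_sums[j] - sums[j] ** 2 / n) / (n - 1), 0.0)
        for j in range(num_cols)
    ]


def scale_to_unit_variance(rows, num_cols):
    """Divide each stored entry by its column's standard deviation.

    No mean is subtracted, so no new nonzero entries are created.
    Constant columns (variance 0) are left untouched.
    """
    variances = column_variances(rows, num_cols)
    factors = [1.0 / math.sqrt(v) if v > 0.0 else 1.0 for v in variances]
    return [{j: v * factors[j] for j, v in row.items()} for row in rows]
```

Note that the sum-of-squares formula used here can lose precision when a column's mean is large relative to its spread. A streaming update such as Welford's would be the more robust choice for real data.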



--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/MLlib-Standard-preprocessing-scaling-tp6826p6849.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
