You probably need to scale the values in the data set so that all features have
comparable ranges, and translate them so that each feature's mean is 0.
Otherwise, with features on very different scales, the SGD updates can diverge
and the weights overflow to `nan`. You can use a
pyspark.mllib.feature.StandardScaler(withMean=True, withStd=True) for that:
fit it on the feature vectors, then transform them before training.
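To illustrate what StandardScaler(withMean=True, withStd=True) computes, here is a
minimal plain-Python sketch (no Spark needed) of the same column-wise transform;
the sample data values are made up for illustration:

```python
import math

def standardize(rows):
    """Column-wise (x - mean) / std, like StandardScaler(True, True).

    rows: list of equal-length feature lists.
    Uses the sample standard deviation (divide by n - 1), as MLlib does.
    """
    n = len(rows)
    dims = len(rows[0])
    means = [sum(r[d] for r in rows) / n for d in range(dims)]
    stds = [math.sqrt(sum((r[d] - means[d]) ** 2 for r in rows) / (n - 1))
            for d in range(dims)]
    # After this, each column has mean 0 and comparable range.
    return [[(r[d] - means[d]) / stds[d] for d in range(dims)] for r in rows]

# Two features on wildly different scales -- the kind of data that
# makes LinearRegressionWithSGD blow up to nan without scaling.
raw = [[1.0, 1000.0], [2.0, 2000.0], [3.0, 3000.0]]
scaled = standardize(raw)
```

In PySpark itself the equivalent is roughly: build a StandardScaler(True, True),
call .fit() on the RDD of feature vectors, and .transform() each vector before
constructing the LabeledPoints you pass to LinearRegressionWithSGD.train().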
On 28.5.2015 at 6:08, Maheshakya Wijewardena wrote:
Hi,
I'm trying to use Spark's *LinearRegressionWithSGD* in PySpark with
the attached dataset; the code is also attached. When I check the model's
weight vector after training, it contains `nan` values:
[nan,nan,nan,nan,nan,nan,nan,nan]
But for some other data sets, this problem does not occur. What might be
the reason for this?
Is this an issue with the data I'm using or a bug?
Best regards.
--
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 Lanka (Pvt) Ltd
Email: mahesha...@wso2.com <mailto:mahesha...@wso2.com>
Mobile: +94711228855
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org