Liang,

Can you do me a favor and run predictOnValues on some sample test data to see whether it works on your end? It is not working for me: it keeps predicting 0.
My code:

    import org.apache.spark.SparkConf
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.regression.{LabeledPoint, StreamingLinearRegressionWithSGD}
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    // args: trainingDir testDir batchSeconds numFeatures numIterations
    val conf = new SparkConf().setMaster("local[2]").setAppName("StreamingLinearRegression")
    val ssc = new StreamingContext(conf, Seconds(args(2).toLong))
    val trainingData = ssc.textFileStream(args(0)).map(LabeledPoint.parse)
    val testData = ssc.textFileStream(args(1)).map(LabeledPoint.parse)
    val model = new StreamingLinearRegressionWithSGD()
      .setInitialWeights(Vectors.zeros(args(3).toInt))
      .setNumIterations(args(4).toInt)
      .setStepSize(.0001)
    model.trainOn(trainingData)
    model.predictOnValues(testData.map(lp => (lp.label, lp.features))).print()
    ssc.start()
    ssc.awaitTermination()

Thanks
Tri

From: Bui, Tri [mailto:tri....@verizonwireless.com.INVALID]
Sent: Tuesday, November 25, 2014 9:52 AM
To: Yanbo Liang
Cc: user@spark.apache.org
Subject: RE: Inaccurate Estimate of weights model from StreamingLinearRegressionWithSGD

Thanks Liang!

It was my mistake: I fat-fingered one of the data points. After correcting it, the result matches yours.

I am still not able to get the intercept. I am getting:

    [error] /data/project/LinearRegression/src/main/scala/StreamingLinearRegression.scala:47: value setIntercept is not a member of org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD

I tried the code below:

    val model = new StreamingLinearRegressionWithSGD().setInitialWeights(Vectors.zeros(args(3).toInt))
    model.setIntercept(addIntercept = true).trainOn(trainingData)

and:

    val model = new StreamingLinearRegressionWithSGD().setInitialWeights(Vectors.zeros(args(3).toInt))
      .setIntercept(true)

But I still get a compilation error.

Thanks
Tri

From: Yanbo Liang [mailto:yanboha...@gmail.com]
Sent: Tuesday, November 25, 2014 4:08 AM
To: Bui, Tri
Cc: user@spark.apache.org
Subject: Re: Inaccurate Estimate of weights model from StreamingLinearRegressionWithSGD

The case runs correctly in my environment.

    14/11/25 17:48:20 INFO regression.StreamingLinearRegressionWithSGD: Model updated at time 1416908900000 ms
    14/11/25 17:48:20 INFO regression.StreamingLinearRegressionWithSGD: Current model: weights, [0.9999999999998588]

Can you provide more details, if convenient?

The intercept can be turned on as follows:

    val model = new StreamingLinearRegressionWithSGD()
      .algorithm.setIntercept(true)

2014-11-25 3:31 GMT+08:00 Bui, Tri <tri....@verizonwireless.com.invalid>:

Hi,

I am getting an incorrect weights model from StreamingLinearRegressionWithSGD.

The input data has one feature:

    (1,[1])
    (2,[2])
    …
    (20,[20])

The result from "Current model: weights" is [-4.432]…, which is not correct.

Also, how do I turn on the intercept value for StreamingLinearRegression?

Thanks
Tri
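
For reference, the weight one would expect for the toy data in the original question (label i with single feature i, for i = 1..20) can be checked without Spark. The sketch below is plain Scala and is not MLlib's implementation: `ExpectedWeights`, `closedFormWeight`, and `gradientDescent` are hypothetical names made up for illustration, and the gradient descent here is a simple full-batch version with a constant step size, which is only an assumption about how one might sanity-check the data. It computes the closed-form least-squares weight (no intercept) and shows how strongly plain gradient descent on this data depends on the step size.

```scala
object ExpectedWeights {
  // Toy data from the thread as (label, feature) pairs: (1,[1]) ... (20,[20]).
  val data: Seq[(Double, Double)] = (1 to 20).map(i => (i.toDouble, i.toDouble))

  // Closed-form least squares for a single feature without intercept:
  // w = sum(x * y) / sum(x * x).
  def closedFormWeight(points: Seq[(Double, Double)]): Double = {
    val sxy = points.map { case (y, x) => x * y }.sum
    val sxx = points.map { case (_, x) => x * x }.sum
    sxy / sxx
  }

  // Illustrative full-batch gradient descent on squared error, starting at w = 0
  // with a constant step size (an assumption; not the MLlib update rule).
  def gradientDescent(points: Seq[(Double, Double)], stepSize: Double, iterations: Int): Double = {
    val n = points.size.toDouble
    (1 to iterations).foldLeft(0.0) { (w, _) =>
      val grad = points.map { case (y, x) => (w * x - y) * x }.sum / n
      w - stepSize * grad
    }
  }

  def main(args: Array[String]): Unit = {
    println(s"closed form:    ${closedFormWeight(data)}")
    println(s"GD, step 0.001: ${gradientDescent(data, 0.001, 200)}")
    println(s"GD, step 0.1:   ${gradientDescent(data, 0.1, 20)}")
  }
}
```

The closed-form weight for this data is 1, which agrees with the [0.9999999999998588] in Yanbo's log. In this simplified setting a small step size approaches that value, while a step size that is too large for features as big as 20 makes the iterates blow up, so a wildly wrong weight is worth checking against both the input data and the step size.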