RE: Inaccurate Estimate of weights model from StreamingLinearRegressionWithSGD
Thanks Yanbo! That works!

The only remaining issue is with the predicted value for lp.features, produced by the line below:

    model.predictOnValues(testData.map(lp => (lp.label, lp.features))).print()

It prints the test input data correctly, but the predicted value computed from lp.features is always printed as "0.0".

Thanks
Tri

(In reply to Yanbo Liang's message of Thursday, November 27, 2014 12:22 AM.)
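The thread does not record a resolution for the constant 0.0 prediction. One thing that may be worth ruling out, offered as an assumption rather than a diagnosis: a linear model whose weights are still the zero vector passed to setInitialWeights, with a zero intercept, predicts 0.0 for every input. The pure-Scala sketch below uses no Spark classes, and the names ZeroModelCheck and predict are hypothetical. It only illustrates what predictions from a model that has not yet been updated would look like.

```scala
object ZeroModelCheck {
  // Linear prediction: dot(weights, features) + intercept.
  // Hypothetical helper for illustration; this is not Spark's API.
  def predict(weights: Array[Double], intercept: Double, features: Array[Double]): Double =
    weights.zip(features).map { case (w, x) => w * x }.sum + intercept

  def main(args: Array[String]): Unit = {
    // One-feature model still at its initial state (all-zero weights, zero intercept).
    val initialWeights = Array.fill(1)(0.0)
    val inputs = Seq(Array(1.0), Array(7.5), Array(20.0))

    // Every prediction is 0.0, whatever the feature value is.
    inputs.foreach(x => println(predict(initialWeights, 0.0, x)))

    // With a non-zero weight (0.8588 is the value logged in this thread),
    // the prediction depends on the input.
    println(predict(Array(0.8588), 0.0, Array(20.0)))
  }
}
```

If the streaming job prints 0.0 for every test point while the log shows non-zero "Current model: weights", the predictions are probably not being computed from the updated weights. That inference is a guess and would need to be checked against the Spark version in use.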
RE: Inaccurate Estimate of weights model from StreamingLinearRegressionWithSGD
Liang,

Could you do me a favor and run predictOnValues on some sample test data to see whether it works on your end? It is not working for me: it keeps predicting 0.

My code:

    val conf = new SparkConf().setMaster("local[2]").setAppName("StreamingLinearRegression")
    val ssc = new StreamingContext(conf, Seconds(args(2).toLong))

    val trainingData = ssc.textFileStream(args(0)).map(LabeledPoint.parse)
    val testData = ssc.textFileStream(args(1)).map(LabeledPoint.parse)

    val model = new StreamingLinearRegressionWithSGD()
      .setInitialWeights(Vectors.zeros(args(3).toInt))
      .setNumIterations(args(4).toInt)
      .setStepSize(.0001)

    model.trainOn(trainingData)
    model.predictOnValues(testData.map(lp => (lp.label, lp.features))).print()

    ssc.start()
    ssc.awaitTermination()

Thanks
Tri

(Follow-up to my message of Tuesday, November 25, 2014 9:52 AM.)
RE: Inaccurate Estimate of weights model from StreamingLinearRegressionWithSGD
Thanks Yanbo!

Modified code below:

    val conf = new SparkConf().setMaster("local[2]").setAppName("StreamingLinearRegression")
    val ssc = new StreamingContext(conf, Seconds(args(2).toLong))

    val trainingData = ssc.textFileStream(args(0)).map(LabeledPoint.parse)
    val testData = ssc.textFileStream(args(1)).map(LabeledPoint.parse)

    val model = new StreamingLinearRegressionWithSGD()
      .setInitialWeights(Vectors.zeros(args(3).toInt))
      .setNumIterations(args(4).toInt)
      .setStepSize(.0001)
      .algorithm.setIntercept(true)

    model.trainOn(trainingData)
    model.predictOnValues(testData.map(lp => (lp.label, lp.features))).print()

    ssc.start()
    ssc.awaitTermination()

But I am getting a compile error:

    [error] /data/project/LinearRegression/src/main/scala/StreamingLinearRegression.scala:54: value trainOn is not a member of org.apache.spark.mllib.regression.LinearRegressionWithSGD
    [error] model.trainOn(trainingData)
    [error] ^
    [error] /data/project/LinearRegression/src/main/scala/StreamingLinearRegression.scala:55: value predictOnValues is not a member of org.apache.spark.mllib.regression.LinearRegressionWithSGD
    [error] model.predictOnValues(testData.map(lp => (lp.label, lp.features))).print()
    [error] ^
    [error] two errors found
    [error] (compile:compile) Compilation failed

Thanks
Tri

(In reply to Yanbo Liang's message of Tuesday, November 25, 2014 8:57 PM.)
Re: Inaccurate Estimate of weights model from StreamingLinearRegressionWithSGD
Hi Tri,

My latest reply to your problem may have been lost. In any case, the following code snippet runs correctly:

    val model = new StreamingLinearRegressionWithSGD().setInitialWeights(Vectors.zeros(args(3).toInt))
    model.algorithm.setIntercept(true)

Every setXXX() function in StreamingLinearRegressionWithSGD returns this.type, that is, the instance itself, so those calls can be chained. algorithm.setIntercept(true), however, returns the inner LinearRegressionWithSGD rather than the streaming object. It therefore has to be called on a separate line, without assigning its return value to model.

(In reply to Bui, Tri's message of 2014-11-27 1:04 GMT+08:00.)
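The chaining pitfall described above can be reproduced without Spark. The sketch below uses two hypothetical stand-in classes, InnerAlgorithm and StreamingLearner. They are not Spark classes and only mimic the shape discussed in this thread: fluent setters that return this.type, plus a public algorithm member that owns the intercept switch.

```scala
// Hypothetical stand-ins, NOT Spark classes.
class InnerAlgorithm {
  var addIntercept: Boolean = false
  // Fluent setter: returns the InnerAlgorithm it was called on.
  def setIntercept(flag: Boolean): this.type = { addIntercept = flag; this }
}

class StreamingLearner {
  val algorithm = new InnerAlgorithm
  var numIterations: Int = 50
  // Fluent setter: returns the StreamingLearner it was called on.
  def setNumIterations(n: Int): this.type = { numIterations = n; this }
  // Placeholder for training; it just reports how many values it saw.
  def trainOn(batch: Seq[Double]): Int = batch.size
}

object ChainingDemo {
  def main(args: Array[String]): Unit = {
    // The chain ends in algorithm.setIntercept, so the value is an InnerAlgorithm.
    val wrong: InnerAlgorithm =
      new StreamingLearner().setNumIterations(10).algorithm.setIntercept(true)
    // wrong.trainOn(Seq(1.0)) // would not compile: trainOn is not a member of InnerAlgorithm
    println(wrong.addIntercept)

    // Keep the outer object in `model` and configure the inner one separately.
    val model: StreamingLearner = new StreamingLearner().setNumIterations(10)
    model.algorithm.setIntercept(true)
    println(model.trainOn(Seq(1.0, 2.0)))
    println(model.algorithm.addIntercept)
  }
}
```

The type of a chained expression is the return type of its last call. That is why ending the chain with .algorithm.setIntercept(true) made model a LinearRegressionWithSGD in the compile errors quoted in this thread.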
Re: Inaccurate Estimate of weights model from StreamingLinearRegressionWithSGD
The case runs correctly in my environment:

    14/11/25 17:48:20 INFO regression.StreamingLinearRegressionWithSGD: Model updated at time 141690890 ms
    14/11/25 17:48:20 INFO regression.StreamingLinearRegressionWithSGD: Current model: weights, [0.8588]

Could you provide more details, if convenient?

The intercept can be turned on as follows:

    val model = new StreamingLinearRegressionWithSGD()
      .algorithm.setIntercept(true)

2014-11-25 3:31 GMT+08:00 Bui, Tri tri@verizonwireless.com.invalid:

> Hi,
>
> I am getting an incorrect weights model from StreamingLinearRegressionWithSGD.
>
> The input data, with one feature, is:
>
> (1,[1]) (2,[2]) … . (20,[20])
>
> The result from "Current model: weights" is [-4.432], which is not correct.
>
> Also, how do I turn on the intercept for StreamingLinearRegression?
>
> Thanks
> Tri
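As a sanity check on which weight is "correct" here, the least-squares slope for a one-feature model without intercept has the closed form w = sum(x*y) / sum(x*x). The sketch below assumes the elided data follows the pattern label = feature for 1 through 20, which the thread suggests but does not spell out. SlopeCheck and slopeThroughOrigin are hypothetical names, and no Spark code is involved.

```scala
object SlopeCheck {
  // Closed-form least-squares slope for y ~ w * x (no intercept).
  // Each point is (label, feature), mirroring the "(label,[feature])" text format.
  def slopeThroughOrigin(points: Seq[(Double, Double)]): Double = {
    val sxy = points.map { case (y, x) => x * y }.sum
    val sxx = points.map { case (_, x) => x * x }.sum
    sxy / sxx
  }

  def main(args: Array[String]): Unit = {
    // Assumed data: (1,[1]), (2,[2]), ..., (20,[20]).
    val points = (1 to 20).map(i => (i.toDouble, i.toDouble))
    println(slopeThroughOrigin(points)) // prints 1.0

    // The same points rendered in the textual form used in the question.
    points.take(2).foreach { case (y, x) => println(s"($y,[$x])") }
  }
}
```

Under that assumption the reference weight is exactly 1.0. An SGD estimate such as the 0.8588 logged above can be compared against it, whereas a value like -4.432 points to a problem in the input or the settings.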
RE: Inaccurate Estimate of weights model from StreamingLinearRegressionWithSGD
Thanks Liang!

It was my mistake: I fat-fingered one of the data points. After correcting it, my result matches yours.

I am still not able to turn on the intercept, though. I am getting:

    [error] /data/project/LinearRegression/src/main/scala/StreamingLinearRegression.scala:47: value setIntercept is not a member of org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD

I tried the code below:

    val model = new StreamingLinearRegressionWithSGD().setInitialWeights(Vectors.zeros(args(3).toInt))
    model.setIntercept(addIntercept = true).trainOn(trainingData)

and:

    val model = new StreamingLinearRegressionWithSGD().setInitialWeights(Vectors.zeros(args(3).toInt))
      .setIntercept(true)

But I still get the compilation error.

Thanks
Tri

(In reply to Yanbo Liang's message of Tuesday, November 25, 2014 4:08 AM.)
Re: Inaccurate Estimate of weights model from StreamingLinearRegressionWithSGD
Hi Tri,

setIntercept() is not a member function of StreamingLinearRegressionWithSGD. It is a member function of LinearRegressionWithSGD (a GeneralizedLinearAlgorithm), which StreamingLinearRegressionWithSGD holds as a member variable named algorithm.

So you need to change your code to:

    val model = new StreamingLinearRegressionWithSGD().setInitialWeights(Vectors.zeros(args(3).toInt))
      .algorithm.setIntercept(true)

Thanks
Yanbo

(In reply to Bui, Tri's message of 2014-11-25 23:51 GMT+08:00.)