spark git commit: [SPARK-10029] [MLLIB] [DOC] Add Python examples for mllib IsotonicRegression user guide

meng Tue, 18 Aug 2015 12:56:40 -0700

Repository: spark
Updated Branches:
  refs/heads/branch-1.5 7ff0e5d2f -> 80debff12



[SPARK-10029] [MLLIB] [DOC] Add Python examples for mllib IsotonicRegression 
user guide

Add Python examples for mllib IsotonicRegression user guide

Author: Yanbo Liang <yblia...@gmail.com>

Closes #8225 from yanboliang/spark-10029.

(cherry picked from commit f4fa61effe34dae2f0eab0bef57b2dee220cf92f)
Signed-off-by: Xiangrui Meng <m...@databricks.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/80debff1
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/80debff1
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/80debff1

Branch: refs/heads/branch-1.5
Commit: 80debff123e0b5dcc4e6f5899753a736de2c8e75
Parents: 7ff0e5d
Author: Yanbo Liang <yblia...@gmail.com>
Authored: Tue Aug 18 12:55:36 2015 -0700
Committer: Xiangrui Meng <m...@databricks.com>
Committed: Tue Aug 18 12:55:42 2015 -0700

----------------------------------------------------------------------
 docs/mllib-isotonic-regression.md | 35 ++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/80debff1/docs/mllib-isotonic-regression.md
----------------------------------------------------------------------
diff --git a/docs/mllib-isotonic-regression.md 
b/docs/mllib-isotonic-regression.md
index 5732bc4..6aa881f 100644
--- a/docs/mllib-isotonic-regression.md
+++ b/docs/mllib-isotonic-regression.md
@@ -160,4 +160,39 @@ model.save(sc.sc(), "myModelPath");
 IsotonicRegressionModel sameModel = IsotonicRegressionModel.load(sc.sc(), 
"myModelPath");
 {% endhighlight %}
 </div>
+
+<div data-lang="python" markdown="1">
+Data are read from a file where each line has a format label,feature
+i.e. 4710.28,500.00. The data are split to training and testing set.
+Model is created using the training set and a mean squared error is calculated 
from the predicted
+labels and real labels in the test set.
+
+{% highlight python %}
+import math
+from pyspark.mllib.regression import IsotonicRegression, 
IsotonicRegressionModel
+
+data = sc.textFile("data/mllib/sample_isotonic_regression_data.txt")
+
+# Create label, feature, weight tuples from input data with weight set to 
default value 1.0.
+parsedData = data.map(lambda line: tuple([float(x) for x in line.split(',')]) 
+ (1.0,))
+
+# Split data into training (60%) and test (40%) sets.
+training, test = parsedData.randomSplit([0.6, 0.4], 11)
+
+# Create isotonic regression model from training data.
+# Isotonic parameter defaults to true so it is only shown for demonstration
+model = IsotonicRegression.train(training)
+
+# Create tuples of predicted and real labels.
+predictionAndLabel = test.map(lambda p: (model.predict(p[1]), p[0]))
+
+# Calculate mean squared error between predicted and real labels.
+meanSquaredError = predictionAndLabel.map(lambda pl: math.pow((pl[0] - pl[1]), 
2)).mean()
+print("Mean Squared Error = " + str(meanSquaredError))
+
+# Save and load model
+model.save(sc, "myModelPath")
+sameModel = IsotonicRegressionModel.load(sc, "myModelPath")
+{% endhighlight %}
+</div>
 </div>


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org

spark git commit: [SPARK-10029] [MLLIB] [DOC] Add Python examples for mllib IsotonicRegression user guide

Reply via email to