Repository: spark
Updated Branches:
  refs/heads/branch-1.5 7ff0e5d2f -> 80debff12

[SPARK-10029] [MLLIB] [DOC] Add Python examples for mllib IsotonicRegression user guide

Add Python examples for mllib IsotonicRegression user guide

Author: Yanbo Liang <yblia...@gmail.com>

Closes #8225 from yanboliang/spark-10029.

(cherry picked from commit f4fa61effe34dae2f0eab0bef57b2dee220cf92f)
Signed-off-by: Xiangrui Meng <m...@databricks.com>

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/80debff1
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/80debff1
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/80debff1

Branch: refs/heads/branch-1.5
Commit: 80debff123e0b5dcc4e6f5899753a736de2c8e75
Parents: 7ff0e5d
Author: Yanbo Liang <yblia...@gmail.com>
Authored: Tue Aug 18 12:55:36 2015 -0700
Committer: Xiangrui Meng <m...@databricks.com>
Committed: Tue Aug 18 12:55:42 2015 -0700

----------------------------------------------------------------------
 docs/mllib-isotonic-regression.md | 35 ++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/80debff1/docs/mllib-isotonic-regression.md
----------------------------------------------------------------------
diff --git a/docs/mllib-isotonic-regression.md b/docs/mllib-isotonic-regression.md
index 5732bc4..6aa881f 100644
--- a/docs/mllib-isotonic-regression.md
+++ b/docs/mllib-isotonic-regression.md
@@ -160,4 +160,39 @@ model.save(sc.sc(), "myModelPath");
 IsotonicRegressionModel sameModel = IsotonicRegressionModel.load(sc.sc(), "myModelPath");
 {% endhighlight %}
 </div>
+
+<div data-lang="python" markdown="1">
+Data are read from a file where each line has a format label,feature
+i.e. 4710.28,500.00. The data are split to training and testing set.
+Model is created using the training set and a mean squared error is calculated from the predicted
+labels and real labels in the test set.
+
+{% highlight python %}
+import math
+from pyspark.mllib.regression import IsotonicRegression, IsotonicRegressionModel
+
+data = sc.textFile("data/mllib/sample_isotonic_regression_data.txt")
+
+# Create label, feature, weight tuples from input data with weight set to default value 1.0.
+parsedData = data.map(lambda line: tuple([float(x) for x in line.split(',')]) + (1.0,))
+
+# Split data into training (60%) and test (40%) sets.
+training, test = parsedData.randomSplit([0.6, 0.4], 11)
+
+# Create isotonic regression model from training data.
+# Isotonic parameter defaults to true so it is only shown for demonstration
+model = IsotonicRegression.train(training)
+
+# Create tuples of predicted and real labels.
+predictionAndLabel = test.map(lambda p: (model.predict(p[1]), p[0]))
+
+# Calculate mean squared error between predicted and real labels.
+meanSquaredError = predictionAndLabel.map(lambda pl: math.pow((pl[0] - pl[1]), 2)).mean()
+print("Mean Squared Error = " + str(meanSquaredError))
+
+# Save and load model
+model.save(sc, "myModelPath")
+sameModel = IsotonicRegressionModel.load(sc, "myModelPath")
+{% endhighlight %}
+</div>
 </div>
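
For reviewers who want to sanity-check the parsing and error steps of the Python example in this patch without a Spark context, the following is a minimal plain-Python sketch. It is not part of the commit. The helper names `parse_line` and `mean_squared_error`, the second sample row, and the constant stand-in predictor are all assumptions made for illustration; the real guide uses `IsotonicRegression.train` on an RDD.

```python
import math


def parse_line(line):
    """Turn a 'label,feature' line into a (label, feature, weight) tuple, weight 1.0."""
    return tuple(float(x) for x in line.split(",")) + (1.0,)


def mean_squared_error(prediction_and_label):
    """Average of squared differences over (prediction, label) pairs."""
    squared = [math.pow(p - l, 2) for p, l in prediction_and_label]
    return sum(squared) / len(squared)


# The first row is the format example quoted in the guide text;
# the second row is hypothetical data added for this sketch.
rows = ["4710.28,500.00", "4711.28,501.00"]
parsed = [parse_line(r) for r in rows]


# Stand-in predictor (an assumption, NOT a trained IsotonicRegressionModel):
# it ignores the feature and always returns the same value.
def predict(feature):
    return 4710.78


# Same (prediction, label) pairing as predictionAndLabel in the patch.
pairs = [(predict(feature), label) for label, feature, _weight in parsed]
mse = mean_squared_error(pairs)
print("Mean Squared Error = " + str(mse))
```

The tuple layout matches the patch: label first, feature second, and a trailing weight of 1.0, which is why the patch's `predictionAndLabel` step reads `p[1]` as the feature and `p[0]` as the label.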