GitHub user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/4675#discussion_r24935325
--- Diff: docs/ml-guide.md ---
@@ -300,19 +302,21 @@ List<LabeledPoint> localTest = Lists.newArrayList(
new LabeledPoint(1.0, Vectors.dense(-1.0, 1.5, 1.3)),
new LabeledPoint(0.0, Vectors.dense(3.0, 2.0, -0.1)),
new LabeledPoint(1.0, Vectors.dense(0.0, 2.2, -1.5)));
-JavaSchemaRDD test = jsql.createDataFrame(jsc.parallelize(localTest), LabeledPoint.class);
+DataFrame test = jsql.createDataFrame(jsc.parallelize(localTest), LabeledPoint.class);
// Make predictions on test documents using the Transformer.transform() method.
// LogisticRegression.transform will only use the 'features' column.
-// Note that model2.transform() outputs a 'probability' column instead of the usual 'score'
-// column since we renamed the lr.scoreCol parameter previously.
-model2.transform(test).registerAsTable("results");
-JavaSchemaRDD results =
- jsql.sql("SELECT features, label, probability, prediction FROM results");
+// Note that model2.transform() outputs a 'myProbability' column instead of the usual
+// 'probability' column since we renamed the lr.probabilityCol parameter previously.
+model2.transform(test).registerTempTable("results");
+DataFrame results =
+ jsql.sql("SELECT features, label, myProbability, prediction FROM results");
--- End diff --
With the DataFrame API, we no longer need to register a temp table and query it with SQL here.
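
A minimal sketch of what that could look like, assuming `model2` and `test` are defined as in the surrounding example and that the projection is done with `DataFrame.select` on column names. This is a fragment meant to slot into the guide's Java example, not a standalone program, and the form the PR finally adopts may differ:

```java
// Sketch only: select the columns directly on the transformed DataFrame,
// instead of calling registerTempTable("results") and running a SQL query.
// Assumes model2 (the fitted model) and test (the test DataFrame) from the example above.
DataFrame results = model2.transform(test)
  .select("features", "label", "myProbability", "prediction");
```

This keeps the column list in one place and avoids introducing the temporary table name `results` only to read it back.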