GitHub user thunterdb commented on a diff in the pull request:
https://github.com/apache/spark/pull/11549#discussion_r55597240
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala
---
@@ -569,9 +572,46 @@ class GeneralizedLinearRegressionModel private[ml] (
familyAndLink.fitted(eta)
}
+ private var trainingSummary: Option[GeneralizedLinearRegressionSummary] = None
+
+ private[regression] def setSummary(summary: GeneralizedLinearRegressionSummary): this.type = {
+ this.trainingSummary = Some(summary)
+ this
+ }
+
+ /**
+ * Gets summary of model on training set. An exception is
+ * thrown if `trainingSummary == None`.
+ */
+ @Since("2.0.0")
+ def summary: GeneralizedLinearRegressionSummary = trainingSummary match {
+ case Some(summ) => summ
+ case None =>
+ throw new SparkException(
+ "No training summary available for this GeneralizedLinearRegressionModel",
+ new NullPointerException())
+ }
+
@Since("2.0.0")
override def copy(extra: ParamMap): GeneralizedLinearRegressionModel = {
copyValues(new GeneralizedLinearRegressionModel(uid, coefficients, intercept), extra)
.setParent(parent)
}
}
+
+/**
+ * :: Experimental ::
+ * GeneralizedLinearRegressionModel results evaluated on a dataset.
+ *
+ * @param predictions dataframe outputted by the model's `transform` method.
+ * @param predictionCol field in "predictions" which gives the prediction of each instance.
+ * @param labelCol field in "predictions" which gives the true label of each instance.
+ * @param featuresCol field in "predictions" which gives the features of each instance as a vector.
+ */
+@Experimental
+@Since("2.0.0")
+class GeneralizedLinearRegressionSummary private[regression] (
+ @Since("2.0.0") @transient val predictions: DataFrame,
+ @Since("2.0.0") val predictionCol: String,
--- End diff --
Based on a private discussion with @jkbradley, we should not expose the
column names in the public API. Users should refer to the original model
to get the column names. In `LinearRegressionSummary`, they are passed
around because they are required for metrics. We do not need them here.
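
One way to picture the suggestion is the minimal sketch below. It is plain Scala with no Spark dependency and is not the code in this PR: `SketchSummary`, `SketchModel`, and `meanResidual` are hypothetical stand-ins, and rows are modeled as `Map[String, Double]` instead of a `DataFrame`. The column names are ordinary constructor parameters of the summary (no `val`), so they are usable internally but do not become public members; callers read them from the model.

```scala
// Hypothetical stand-in for a summary class; not the Spark API.
// predictionCol and labelCol are plain constructor parameters (no `val`),
// so they are not exposed as public members of the summary.
final class SketchSummary(
    val predictions: Seq[Map[String, Double]],
    predictionCol: String,
    labelCol: String) {

  // Example metric that needs the column names internally.
  def meanResidual: Double =
    predictions.map(row => row(labelCol) - row(predictionCol)).sum / predictions.size
}

// Hypothetical stand-in for the model; it remains the public source of column names.
final class SketchModel(val predictionCol: String, val labelCol: String) {
  private var trainingSummary: Option[SketchSummary] = None

  def setSummary(summary: SketchSummary): this.type = {
    trainingSummary = Some(summary)
    this
  }

  // Throws if no summary was attached, mirroring the accessor in the diff.
  def summary: SketchSummary = trainingSummary.getOrElse(
    throw new IllegalStateException("No training summary available for this model"))
}
```

With this shape, `model.predictionCol` compiles while `model.summary.predictionCol` does not, which is the separation the comment asks for.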