Github user yanboliang commented on a diff in the pull request:
https://github.com/apache/spark/pull/9413#discussion_r44254161
--- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala ---
@@ -474,6 +487,75 @@ class LinearRegressionSummary private[regression] (
predictions.select(t(col(predictionCol), col(labelCol)).as("residuals"))
}
+ /** Number of instances in DataFrame predictions */
+ lazy val numInstances: Long = predictions.count()
+
+ /** Degrees of freedom */
+ private val degreesOfFreedom: Long = if (model.getFitIntercept) {
+ numInstances - model.coefficients.size - 1
+ } else {
+ numInstances - model.coefficients.size
+ }
+
+ /**
+ * The weighted residuals, the usual residuals rescaled by
+ * the square root of the instance weights.
+ */
+ lazy val devianceResiduals: Array[Double] = {
--- End diff --
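For context, here is a minimal sketch of what the doc comment above describes, i.e. the usual residuals rescaled by the square root of the instance weights, expressed as a DataFrame transformation. The function name, column arguments, and the fallback to a unit weight when no weight column is set are illustrative assumptions, not the PR's actual (truncated) implementation:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, lit, sqrt}

// Hypothetical sketch: weighted residual = sqrt(w_i) * (label_i - prediction_i).
// Column names and the optional weight column are assumptions for illustration.
def devianceResidualsSketch(
    predictions: DataFrame,
    labelCol: String,
    predictionCol: String,
    weightCol: Option[String]): DataFrame = {
  // Fall back to a unit weight when no weight column is present.
  val w = weightCol.map(c => sqrt(col(c))).getOrElse(lit(1.0))
  predictions.select(((col(labelCol) - col(predictionCol)) * w).as("devianceResiduals"))
}
```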
@jkbradley A ```residuals``` method already exists, so I named this one
```devianceResiduals```. I agree with your suggestion to add other types of
residuals later, so I think we can try to combine the two functions into one
that takes an argument selecting the residual type. We also need to do some
code cleanup in ```LinearRegressionSummary``` because of redundant arguments;
I can finish that in a follow-up PR. @mengxr
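As an illustration of the idea above (merging the two accessors into one function that takes a residual-type argument), here is a rough, hypothetical sketch. The method signature, the supported type strings, and the column handling are assumptions, not the actual ```LinearRegressionSummary``` API:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, lit, sqrt}

// Hypothetical combined accessor: one residuals function selecting the residual
// type via an argument instead of separate residuals / devianceResiduals members.
def residualsSketch(
    predictions: DataFrame,
    labelCol: String,
    predictionCol: String,
    weightCol: Option[String],
    residualsType: String = "working"): DataFrame = {
  val raw = col(labelCol) - col(predictionCol)
  residualsType match {
    // Plain residuals: label minus prediction.
    case "working" => predictions.select(raw.as("residuals"))
    // Deviance residuals: plain residuals rescaled by sqrt of the instance weight.
    case "deviance" =>
      val w = weightCol.map(c => sqrt(col(c))).getOrElse(lit(1.0))
      predictions.select((raw * w).as("residuals"))
    case other =>
      throw new IllegalArgumentException(s"Unsupported residuals type: $other")
  }
}
```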