viirya commented on a change in pull request #28133: [SPARK-31156][SQL] DataFrameStatFunctions API to be consistent with respect to Column type
URL: https://github.com/apache/spark/pull/28133#discussion_r404532547
##########
File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameStatFunctions.scala
##########
@@ -132,7 +156,28 @@ final class DataFrameStatFunctions private[sql](df: DataFrame) {
    * @since 1.4.0
    */
   def cov(col1: String, col2: String): Double = {
-    StatFunctions.calculateCov(df, Seq(col1, col2))
+    cov(df.col(col1), df.col(col2))
+  }
+
+  /**
+   * Calculate the sample covariance of two numerical columns of a DataFrame.
+   * This version of cov accepts [[Column]] rather than names.
Review comment:
I don't think we need to mention this explicitly. The function signature
already makes it clear.