Jakob Bach created SPARK-21900:
----------------------------------

             Summary: Error in Skewness Computation
                 Key: SPARK-21900
                 URL: https://issues.apache.org/jira/browse/SPARK-21900
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.2.0
            Reporter: Jakob Bach


The skewness() aggregate SQL function in the Scala implementation
(org.apache.spark.sql.functions.skewness) seems to be buggy. The following code

{code:scala}
import org.apache.spark.sql.functions
import org.apache.spark.sql.SparkSession

object SkewTest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.
      builder().
      appName("Skewness example").
      master("local[1]").
      getOrCreate()
    
    // Single-column DataFrame; the default column name is "_1".
    val df = spark.createDataFrame(Seq(4, 1, 2, 3).map(Tuple1(_)))
    df.agg(functions.skewness("_1")).show()
  }
}
{code}

should output 0 (as it does for Seq(1,2,3,4)), but instead outputs

+--------------------+
|        skewness(_1)|
+--------------------+
|5.958081967793454...|
+--------------------+
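
For reference, the expected value can be checked without Spark. The following is only a minimal sketch in plain Scala, assuming the population definition of skewness, m3 / m2^1.5, where m2 and m3 are the second and third central moments. The object SkewnessCheck and the helper populationSkewness are hypothetical names used for illustration and are not part of Spark.

{code:scala}
object SkewnessCheck {
  // Hypothetical helper (not part of Spark): naive two-pass population
  // skewness, m3 / m2^1.5, with m2 and m3 the central moments.
  def populationSkewness(xs: Seq[Double]): Double = {
    val n = xs.length.toDouble
    val mean = xs.sum / n
    val m2 = xs.map(x => math.pow(x - mean, 2)).sum / n
    val m3 = xs.map(x => math.pow(x - mean, 3)).sum / n
    m3 / math.pow(m2, 1.5)
  }

  def main(args: Array[String]): Unit = {
    // Same values in two different orders; skewness should not depend on order.
    println(populationSkewness(Seq(1.0, 2.0, 3.0, 4.0)))
    println(populationSkewness(Seq(4.0, 1.0, 2.0, 3.0)))
  }
}
{code}

The mean is 2.5 and the deviations are -1.5, -0.5, 0.5 and 1.5, so their cubes cancel in pairs and the third central moment is zero for either ordering.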



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

