Oops, it seems I made a mistake. The error message is:

Exception in thread "main" org.apache.spark.sql.AnalysisException: undefined function countDistinct

On 27 Oct 2015 15:49, "Shagun Sodhani" <sshagunsodh...@gmail.com> wrote:
> Hi! I was trying out some aggregate functions in SparkSql and I noticed
> that certain aggregate operators are not working. This includes:
>
> approxCountDistinct
> countDistinct
> mean
> sumDistinct
>
> For example using countDistinct results in an error saying
> *Exception in thread "main" org.apache.spark.sql.AnalysisException:
> undefined function cosh;*
>
> I had a similar issue with cosh operator
> <http://apache-spark-developers-list.1001551.n3.nabble.com/Exception-when-using-cosh-td14724.html>
> as well some time back and it turned out that it was not registered in the
> registry:
> https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
>
> *I think it is the same issue again and would be glad to send over a PR
> if someone can confirm if this is an actual bug and not some mistake on my
> part.*
>
> Query I am using: SELECT countDistinct(`age`) as `data` FROM `table`
> Spark Version: 10.4
> SparkSql Version: 1.5.1
>
> I am using the standard example of (name, age) schema (though I am setting
> age as Double and not Int as I am trying out maths functions).
>
> The entire error stack can be found here <http://pastebin.com/G6YzQXnn>.
>
> Thanks!