Re: df.count() returns one more count than SELECT COUNT()

2017-04-06 Thread Mohamed Nadjib MAMI
That was the case. Thanks for the quick and clean answer, Hemanth.

Regards,
Mohamed Nadjib Mami
Research Associate @ Fraunhofer IAIS - PhD Student @ Bonn University

On Thu, Apr 6, 2017 at 7:33 PM, Hemanth Gudela wrote:

> Nulls are excluded with spark.sql("SELECT count(distinct col) FROM
> Table").show()
>
> I think it is ANSI SQL behaviour.
>
> scala> spark.sql("select distinct count(null)").show(false)
> +-----------+
> |count(NULL)|
> +-----------+
> |0          |
> +-----------+
>
> scala> spark.sql("select distinct null").count
> res1: Long = 1
>
> Regards,
> Hemanth
>
> From: Mohamed Nadjib Mami
> Date: Thursday, 6 April 2017 at 20.29
> To: "user@spark.apache.org"
> Subject: df.count() returns one more count than SELECT COUNT()
>
> spark.sql("SELECT count(distinct col) FROM Table").show()
>


Re: df.count() returns one more count than SELECT COUNT()

2017-04-06 Thread Hemanth Gudela
Nulls are excluded with spark.sql("SELECT count(distinct col) FROM 
Table").show()
I think it is ANSI SQL behaviour.

scala> spark.sql("select distinct count(null)").show(false)
+-----------+
|count(NULL)|
+-----------+
|0          |
+-----------+

scala> spark.sql("select distinct null").count
res1: Long = 1

Regards,
Hemanth

From: Mohamed Nadjib Mami 
Date: Thursday, 6 April 2017 at 20.29
To: "user@spark.apache.org" 
Subject: df.count() returns one more count than SELECT COUNT()

spark.sql("SELECT count(distinct col) FROM Table").show()
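The discrepancy in the subject line follows from ANSI SQL's NULL handling: COUNT(col) and COUNT(DISTINCT col) skip NULL values, while df.count(), like COUNT(*), counts every row regardless of NULLs. A minimal sketch of these semantics outside Spark, in plain Python with None standing in for SQL NULL (the sample data is made up for illustration):

```python
# Sketch of ANSI SQL COUNT semantics over a single column,
# modelled as a Python list where None represents SQL NULL.

col = ["a", "b", "a", None]  # a column with one NULL row

# COUNT(*) / df.count(): counts all rows, NULLs included
count_star = len(col)

# COUNT(col): NULLs are excluded
count_col = sum(1 for v in col if v is not None)

# COUNT(DISTINCT col): NULLs excluded, then duplicates collapsed
count_distinct = len({v for v in col if v is not None})

print(count_star)      # 4
print(count_col)       # 3
print(count_distinct)  # 2
```

So a DataFrame with one NULL in the counted column will always show df.count() one higher than SELECT COUNT(col), which is exactly what the original question observed.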