[GitHub] spark pull request #22728: [SPARK-25736][SQL][TEST] add tests to verify the ...

2018-10-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22728


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22728: [SPARK-25736][SQL][TEST] add tests to verify the ...

2018-10-15 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/22728#discussion_r225299298
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/count.sql ---
@@ -0,0 +1,21 @@
+-- Test data.
+CREATE OR REPLACE TEMPORARY VIEW testData AS SELECT * FROM VALUES
+(1, 1), (1, 2), (2, 1), (1, 1), (null, 2), (1, null), (null, null)
+AS testData(a, b);
+
+-- count with single expression
+SELECT count(a), count(b), count(a + b), count((a, b)) FROM testData;
+
+-- distinct count with single expression
+SELECT
+  count(DISTINCT a),
+  count(DISTINCT b),
+  count(DISTINCT (a + b)),
+  count(DISTINCT (a, b))
+FROM testData;
+
+-- count with multiple expressions
+SELECT count(a, b), count(b, a), count(testData.*) FROM testData;
+
+-- distinct count with multiple expressions
+SELECT count(DISTINCT a, b), count(DISTINCT b, a), count(DISTINCT testData.*) FROM testData;
--- End diff --

Also include `count(DISTINCT *)` 





[GitHub] spark pull request #22728: [SPARK-25736][SQL][TEST] add tests to verify the ...

2018-10-15 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/22728#discussion_r225298840
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/count.sql ---
@@ -0,0 +1,21 @@
+-- Test data.
+CREATE OR REPLACE TEMPORARY VIEW testData AS SELECT * FROM VALUES
+(1, 1), (1, 2), (2, 1), (1, 1), (null, 2), (1, null), (null, null)
+AS testData(a, b);
+
+-- count with single expression
+SELECT count(a), count(b), count(a + b), count((a, b)) FROM testData;
+
+-- distinct count with single expression
+SELECT
+  count(DISTINCT a),
+  count(DISTINCT b),
+  count(DISTINCT (a + b)),
+  count(DISTINCT (a, b))
+FROM testData;
+
+-- count with multiple expressions
+SELECT count(a, b), count(b, a), count(testData.*) FROM testData;
--- End diff --

Please also include `count(*)`





[GitHub] spark pull request #22728: [SPARK-25736][SQL][TEST] add tests to verify the ...

2018-10-15 Thread cloud-fan
GitHub user cloud-fan opened a pull request:

https://github.com/apache/spark/pull/22728

[SPARK-25736][SQL][TEST] add tests to verify the behavior of multi-column count

## What changes were proposed in this pull request?

As far as I can tell, multi-column count is not widely supported by 
mainstream databases (PostgreSQL, for example, does not support it), and 
the SQL standard does not clearly define it.

Since Spark supports it, we should clearly document the current behavior 
and add tests to verify it.
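
For illustration, here is a minimal sketch of the behavior in question. It assumes the rule in Spark's function documentation: `count(expr[, expr...])` counts the rows for which all supplied expressions are non-null, and `count(DISTINCT expr[, expr...])` counts the unique combinations among those rows. It is a plain-Python model rather than Spark code, and the helper names `count_non_null` and `count_distinct_non_null` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Row = Tuple[Optional[int], ...]

# Same rows as the testData view in count.sql; None stands in for SQL NULL.
TEST_DATA: List[Row] = [
    (1, 1), (1, 2), (2, 1), (1, 1), (None, 2), (1, None), (None, None),
]


def count_non_null(rows: Sequence[Row], cols: Sequence[int]) -> int:
    """Model of count(c1, c2, ...): rows where every listed column is non-null."""
    return sum(1 for row in rows if all(row[c] is not None for c in cols))


def count_distinct_non_null(rows: Sequence[Row], cols: Sequence[int]) -> int:
    """Model of count(DISTINCT c1, c2, ...): unique all-non-null combinations."""
    seen = {
        tuple(row[c] for c in cols)
        for row in rows
        if all(row[c] is not None for c in cols)
    }
    return len(seen)


if __name__ == "__main__":
    A, B = 0, 1  # column positions of a and b
    print(count_non_null(TEST_DATA, [A]))              # count(a)             -> 5
    print(count_non_null(TEST_DATA, [A, B]))           # count(a, b)          -> 4
    print(count_distinct_non_null(TEST_DATA, [A, B]))  # count(DISTINCT a, b) -> 3
```

Under this model the argument order does not matter, so `count(a, b)` and `count(b, a)` give the same result.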

## How was this patch tested?

N/A


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cloud-fan/spark doc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22728.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22728


commit 62b4b84f135c2f71ecc8192deabec3d694b6bbc9
Author: Wenchen Fan 
Date:   2018-10-15T15:25:14Z

add tests to verify the behavior of count for corner cases



