[GitHub] spark pull request #15532: [SPARK-17989][SQL] Check ascendingOrder type in s...

2016-10-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/15532


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15532: [SPARK-17989][SQL] Check ascendingOrder type in s...

2016-10-19 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/15532#discussion_r84007392
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -124,7 +124,13 @@ case class SortArray(base: Expression, ascendingOrder: 
Expression)
 
   override def checkInputDataTypes(): TypeCheckResult = base.dataType 
match {
 case ArrayType(dt, _) if RowOrdering.isOrderable(dt) =>
-  TypeCheckResult.TypeCheckSuccess
+  ascendingOrder match {
+case Literal(_: Boolean, BooleanType) =>
+  TypeCheckResult.TypeCheckSuccess
+case _ =>
+  TypeCheckResult.TypeCheckFailure(
+"Sort order in second argument requires a boolean literal.")
--- End diff --

I just checked the error msg for ExpectsInputTypes and it is not consistent 
anymore. Anyway I think this is fine. I will go through this and fix them 
myself later.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15532: [SPARK-17989][SQL] Check ascendingOrder type in s...

2016-10-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/15532#discussion_r84004834
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -124,7 +124,15 @@ case class SortArray(base: Expression, ascendingOrder: 
Expression)
 
   override def checkInputDataTypes(): TypeCheckResult = base.dataType 
match {
 case ArrayType(dt, _) if RowOrdering.isOrderable(dt) =>
-  TypeCheckResult.TypeCheckSuccess
+  ascendingOrder match {
+case Literal(_: Boolean, BooleanType) =>
+  TypeCheckResult.TypeCheckSuccess
+case _ =>
+  TypeCheckResult.TypeCheckFailure(
+s"Sort order in second argument requires 
${BooleanType.simpleString} type as " +
--- End diff --

Actually, I wrote this as I was worried of the case `cast(NULL as boolean)`.

> Sort order in second argument requires boolean type as non-null, however, 
it is 'CAST(NULL AS BOOLEAN)' as boolean type.

I will just make it short. Actually, that sentence was copied from the 
default one.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15532: [SPARK-17989][SQL] Check ascendingOrder type in s...

2016-10-18 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/15532#discussion_r84004551
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -124,7 +124,15 @@ case class SortArray(base: Expression, ascendingOrder: 
Expression)
 
   override def checkInputDataTypes(): TypeCheckResult = base.dataType 
match {
 case ArrayType(dt, _) if RowOrdering.isOrderable(dt) =>
-  TypeCheckResult.TypeCheckSuccess
+  ascendingOrder match {
+case Literal(_: Boolean, BooleanType) =>
+  TypeCheckResult.TypeCheckSuccess
+case _ =>
+  TypeCheckResult.TypeCheckFailure(
+s"Sort order in second argument requires 
${BooleanType.simpleString} type as " +
--- End diff --

(also technically your current english sentence is grammatically incorrect; 
you can't use comma to separate however here)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15532: [SPARK-17989][SQL] Check ascendingOrder type in s...

2016-10-18 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/15532#discussion_r84004506
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -124,7 +124,15 @@ case class SortArray(base: Expression, ascendingOrder: 
Expression)
 
   override def checkInputDataTypes(): TypeCheckResult = base.dataType 
match {
 case ArrayType(dt, _) if RowOrdering.isOrderable(dt) =>
-  TypeCheckResult.TypeCheckSuccess
+  ascendingOrder match {
+case Literal(_: Boolean, BooleanType) =>
+  TypeCheckResult.TypeCheckSuccess
+case _ =>
+  TypeCheckResult.TypeCheckFailure(
+s"Sort order in second argument requires 
${BooleanType.simpleString} type as " +
--- End diff --

maybe even ignore the actual value to keep it short.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15532: [SPARK-17989][SQL] Check ascendingOrder type in s...

2016-10-18 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/15532#discussion_r84004457
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -124,7 +124,15 @@ case class SortArray(base: Expression, ascendingOrder: 
Expression)
 
   override def checkInputDataTypes(): TypeCheckResult = base.dataType 
match {
 case ArrayType(dt, _) if RowOrdering.isOrderable(dt) =>
-  TypeCheckResult.TypeCheckSuccess
+  ascendingOrder match {
+case Literal(_: Boolean, BooleanType) =>
+  TypeCheckResult.TypeCheckSuccess
+case _ =>
+  TypeCheckResult.TypeCheckFailure(
+s"Sort order in second argument requires 
${BooleanType.simpleString} type as " +
--- End diff --

Even better, say "requires a boolean literal"


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15532: [SPARK-17989][SQL] Check ascendingOrder type in s...

2016-10-18 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/15532#discussion_r84004431
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -124,7 +124,15 @@ case class SortArray(base: Expression, ascendingOrder: 
Expression)
 
   override def checkInputDataTypes(): TypeCheckResult = base.dataType 
match {
 case ArrayType(dt, _) if RowOrdering.isOrderable(dt) =>
-  TypeCheckResult.TypeCheckSuccess
+  ascendingOrder match {
+case Literal(_: Boolean, BooleanType) =>
+  TypeCheckResult.TypeCheckSuccess
+case _ =>
+  TypeCheckResult.TypeCheckFailure(
+s"Sort order in second argument requires 
${BooleanType.simpleString} type as " +
--- End diff --

i think you should just say "requires boolean type" and remove "as 
non-null".

I don't know what that was saying ...



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15532: [SPARK-17989][SQL] Check ascendingOrder type in s...

2016-10-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/15532#discussion_r84004039
  
--- Diff: sql/core/src/test/resources/sql-tests/results/array.sql.out ---
@@ -124,8 +124,23 @@ struct,sort_array(tinyint_array,
 -- !query 8 output
 [true] [1,2]   [1,2]   [1,2]   [1,2]   
[9223372036854775808,9223372036854775809]   [1.0,2.0]   [1.0,2.0]   
[2016-03-13,2016-03-14] [2016-11-12 20:54:00.0,2016-11-15 20:54:00.0]
 
-
 -- !query 9
+select sort_array(array('b', 'd'), '1')
+-- !query 9 schema
+struct<>
+-- !query 9 output
+org.apache.spark.sql.AnalysisException
+cannot resolve 'sort_array(array('b', 'd'), '1')' due to data type 
mismatch: Sort order in second argument of sort_array() requires boolean type 
as non-null, however, it is ''1'' as string type.; line 1 pos 7
--- End diff --

Thank you for confirming @tejasapatil 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15532: [SPARK-17989][SQL] Check ascendingOrder type in s...

2016-10-18 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request:

https://github.com/apache/spark/pull/15532#discussion_r84003975
  
--- Diff: sql/core/src/test/resources/sql-tests/results/array.sql.out ---
@@ -124,8 +124,23 @@ struct,sort_array(tinyint_array,
 -- !query 8 output
 [true] [1,2]   [1,2]   [1,2]   [1,2]   
[9223372036854775808,9223372036854775809]   [1.0,2.0]   [1.0,2.0]   
[2016-03-13,2016-03-14] [2016-11-12 20:54:00.0,2016-11-15 20:54:00.0]
 
-
 -- !query 9
+select sort_array(array('b', 'd'), '1')
+-- !query 9 schema
+struct<>
+-- !query 9 output
+org.apache.spark.sql.AnalysisException
+cannot resolve 'sort_array(array('b', 'd'), '1')' due to data type 
mismatch: Sort order in second argument of sort_array() requires boolean type 
as non-null, however, it is ''1'' as string type.; line 1 pos 7
--- End diff --

Yes. Now that I see the final error message, its fine to omit that.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15532: [SPARK-17989][SQL] Check ascendingOrder type in s...

2016-10-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/15532#discussion_r84001844
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/array.sql ---
@@ -71,6 +71,10 @@ select
   sort_array(timestamp_array)
 from primitive_arrays;
 
+select sort_array(array('b', 'd'), '1');
--- End diff --

Sure, I will.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15532: [SPARK-17989][SQL] Check ascendingOrder type in s...

2016-10-18 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/15532#discussion_r84001226
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/array.sql ---
@@ -71,6 +71,10 @@ select
   sort_array(timestamp_array)
 from primitive_arrays;
 
+select sort_array(array('b', 'd'), '1');
--- End diff --

add some comment explaining what this is testing? (error reporting)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15532: [SPARK-17989][SQL] Check ascendingOrder type in s...

2016-10-18 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/15532#discussion_r83930227
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -124,7 +124,14 @@ case class SortArray(base: Expression, ascendingOrder: 
Expression)
 
   override def checkInputDataTypes(): TypeCheckResult = base.dataType 
match {
 case ArrayType(dt, _) if RowOrdering.isOrderable(dt) =>
-  TypeCheckResult.TypeCheckSuccess
+  ascendingOrder match {
+case Literal(_: Boolean, BooleanType) =>
+  TypeCheckResult.TypeCheckSuccess
+case _ =>
+  TypeCheckResult.TypeCheckFailure(
+s"Sort order in second argument requires 
${BooleanType.simpleString} type, however," +
--- End diff --

+1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15532: [SPARK-17989][SQL] Check ascendingOrder type in s...

2016-10-18 Thread tejasapatil
Github user tejasapatil commented on a diff in the pull request:

https://github.com/apache/spark/pull/15532#discussion_r83927813
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -124,7 +124,14 @@ case class SortArray(base: Expression, ascendingOrder: 
Expression)
 
   override def checkInputDataTypes(): TypeCheckResult = base.dataType 
match {
 case ArrayType(dt, _) if RowOrdering.isOrderable(dt) =>
-  TypeCheckResult.TypeCheckSuccess
+  ascendingOrder match {
+case Literal(_: Boolean, BooleanType) =>
+  TypeCheckResult.TypeCheckSuccess
+case _ =>
+  TypeCheckResult.TypeCheckFailure(
+s"Sort order in second argument requires 
${BooleanType.simpleString} type, however," +
--- End diff --

Can you explicitly mention that the error is in context of `sort_array()` ? 
For SQL query which spans 100+ lines, its easy to get lost while tracking why 
it failed. Having better error messages would make that easier for users.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #15532: [SPARK-17989][SQL] Check ascendingOrder type in s...

2016-10-18 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request:

https://github.com/apache/spark/pull/15532

[SPARK-17989][SQL] Check ascendingOrder type in sort_array function ahead

## What changes were proposed in this pull request?

This PR proposes to check the second argument, `ascendingOrder` ahead not 
during execution with `ClassCastException` exception message.

```scala
spark.range(1).selectExpr("array(1,2,3) as a").selectExpr("sort_array(a, 
'a')").show()
```

**Before**

```
java.lang.ClassCastException: org.apache.spark.unsafe.types.UTF8String 
cannot be cast to java.lang.Boolean
at scala.runtime.BoxesRunTime.unboxToBoolean(BoxesRunTime.java:85)
at 
org.apache.spark.sql.catalyst.expressions.SortArray.nullSafeEval(collectionOperations.scala:185)
at 
org.apache.spark.sql.catalyst.expressions.BinaryExpression.eval(Expression.scala:416)
at 
org.apache.spark.sql.catalyst.optimizer.ConstantFolding$$anonfun$apply$1$$anonfun$applyOrElse$1.applyOrElse(expressions.scala:50)
at 
org.apache.spark.sql.catalyst.optimizer.ConstantFolding$$anonfun$apply$1$$anonfun$applyOrElse$1.applyOrElse(expressions.scala:43)
at 
org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:292)
at 
org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:292)
at 
org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:74)
at 
org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:291)
at 
org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:297)

**After**

```
Sort order in second argument requires boolean type, however, ''a'' is of 
string type.;
```

## How was this patch tested?

Unit test in `DataFrameFunctionsSuite`.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HyukjinKwon/spark SPARK-17989

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15532.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15532


commit 343719bfef501702367b3a38c813ae8e4067
Author: hyukjinkwon 
Date:   2016-10-18T16:01:16Z

Check ahead

commit 5f7324580caa6bdbe1e658fe332c82c12aef9dbf
Author: hyukjinkwon 
Date:   2016-10-18T16:06:50Z

Add a constant folding one in test just in case




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org