[GitHub] [spark] maropu commented on a change in pull request #27991: [SPARK-31227][SQL] Non-nullable null type in complex types should not coerce to nullable type
maropu commented on a change in pull request #27991: URL: https://github.com/apache/spark/pull/27991#discussion_r413771594 ## File path: sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala ## @@ -1532,6 +1532,13 @@ class DataFrameFunctionsSuite extends QueryTest with SharedSparkSession { assert(e.getMessage.contains("string, binary or array")) } + test("SPARK-31227: Non-nullable null type should not coerce to nullable type in concat") { Review comment: Yea, the output has the same type with array(null); ``` scala> sql("select concat(array(), array(NULL))").printSchema root |-- concat(array(), array(NULL)): array (nullable = false) ||-- element: null (containsNull = true) scala> sql("select array()").printSchema root |-- array(): array (nullable = false) ||-- element: null (containsNull = false) scala> sql("select array(null)").printSchema root |-- array(NULL): array (nullable = false) ||-- element: null (containsNull = true) ``` Any concern? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] maropu commented on a change in pull request #27991: [SPARK-31227][SQL] Non-nullable null type in complex types should not coerce to nullable type
maropu commented on a change in pull request #27991: [SPARK-31227][SQL] Non-nullable null type in complex types should not coerce to nullable type URL: https://github.com/apache/spark/pull/27991#discussion_r397799044 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ## @@ -210,8 +206,15 @@ object Cast { case _ => false // overflow } + /** + * Returns `true` iff it should change the nullability of this type in map type, + * array type, expressions such as cast, etc. + * + * Note that you should take the nullability context into account. + * For example, it should not force to nullable type when null type in array + * type is non-nullable which means an empty array of null type. + */ def forceNullable(from: DataType, to: DataType): Boolean = (from, to) match { -case (NullType, _) => true Review comment: nit: I think its better to add more tests other than integer types (e.g., `null types -> date/timestamp types` discussed above) in https://github.com/apache/spark/pull/27991/files#diff-0ced8bd3a8e8459e7f66333b0d936771R1169 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] maropu commented on a change in pull request #27991: [SPARK-31227][SQL] Non-nullable null type in complex types should not coerce to nullable type
maropu commented on a change in pull request #27991: [SPARK-31227][SQL] Non-nullable null type in complex types should not coerce to nullable type URL: https://github.com/apache/spark/pull/27991#discussion_r397552138 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ## @@ -210,8 +206,15 @@ object Cast { case _ => false // overflow } + /** + * Returns `true` iff it should change the nullability of this type in map type, + * array type, expressions such as cast, etc. + * + * Note that you should take the nullability context into account. + * For example, it should not force to nullable type when null type in array + * type is non-nullable which means an empty array of null type. + */ def forceNullable(from: DataType, to: DataType): Boolean = (from, to) match { -case (NullType, _) => true Review comment: We can do this now? ``` case (NullType, _) => false // empty array or map case ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] maropu commented on a change in pull request #27991: [SPARK-31227][SQL] Non-nullable null type in complex types should not coerce to nullable type
maropu commented on a change in pull request #27991: [SPARK-31227][SQL] Non-nullable null type in complex types should not coerce to nullable type URL: https://github.com/apache/spark/pull/27991#discussion_r396820238 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ## @@ -210,8 +206,13 @@ object Cast { case _ => false // overflow } + /** + * Returns `true` iif it should change the nullability of this type in the container, + * e.g. map type, array type, expression, etc. This function should take the nullability + * in the container into account. For example, it should not force to a nullable type + * when null type is non-nullable, which means an empty array of null type. Review comment: nit: `iif` -> `iff`, and `container` -> `complex type` for consistency? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] maropu commented on a change in pull request #27991: [SPARK-31227][SQL] Non-nullable null type in complex types should not coerce to nullable type
maropu commented on a change in pull request #27991: [SPARK-31227][SQL] Non-nullable null type in complex types should not coerce to nullable type URL: https://github.com/apache/spark/pull/27991#discussion_r396820238 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ## @@ -210,8 +206,13 @@ object Cast { case _ => false // overflow } + /** + * Returns `true` iif it should change the nullability of this type in the container, + * e.g. map type, array type, expression, etc. This function should take the nullability + * in the container into account. For example, it should not force to a nullable type + * when null type is non-nullable, which means an empty array of null type. Review comment: nit: `iif` -> `iff` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] maropu commented on a change in pull request #27991: [SPARK-31227][SQL] Non-nullable null type in complex types should not coerce to nullable type
maropu commented on a change in pull request #27991: [SPARK-31227][SQL] Non-nullable null type in complex types should not coerce to nullable type URL: https://github.com/apache/spark/pull/27991#discussion_r396820174 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ## @@ -77,7 +77,8 @@ object Cast { resolvableNullability(fn || forceNullable(fromType, toType), tn) case (MapType(fromKey, fromValue, fn), MapType(toKey, toValue, tn)) => - canCast(fromKey, toKey) && canCastMapKeyNullSafe(fromKey, toKey) && + canCast(fromKey, toKey) && +(!forceNullable(fromKey, toKey)) && Review comment: nit: can we remove the outermost parentheses? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org