[GitHub] spark pull request #21713: [SPARK-24737][SQL] Type coercion between StructTy...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/21713 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21713: [SPARK-24737][SQL] Type coercion between StructTy...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21713#discussion_r200324941 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -166,6 +166,30 @@ object TypeCoercion { case (l, r) => None } + private def mergeComplexTypes( + t1: DataType, + t2: DataType, --- End diff -- yea it's wrong after this patch --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21713: [SPARK-24737][SQL] Type coercion between StructTy...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21713#discussion_r200319604 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -166,6 +166,30 @@ object TypeCoercion { case (l, r) => None } + private def mergeComplexTypes( + t1: DataType, + t2: DataType, --- End diff -- The test result might be wrong? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21713: [SPARK-24737][SQL] Type coercion between StructTy...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/21713#discussion_r200315553 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -166,6 +166,30 @@ object TypeCoercion { case (l, r) => None } + private def mergeComplexTypes( + t1: DataType, + t2: DataType, --- End diff -- Seems like the behavior will be changed if we reuse there, and the test [TypeCoercionSuite.scala#L397-L400](https://github.com/apache/spark/blob/bed6849dbf51e1981772cd353ce1a7ae4f0626e2/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala#L397-L400) fails. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21713: [SPARK-24737][SQL] Type coercion between StructTy...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21713#discussion_r200281741 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -166,6 +166,30 @@ object TypeCoercion { case (l, r) => None } + private def mergeComplexTypes( + t1: DataType, + t2: DataType, --- End diff -- Maybe `findTightestCommonType` can also reuse `mergeComplexTypes` ? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21713: [SPARK-24737][SQL] Type coercion between StructTy...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21713#discussion_r200277231 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -166,6 +166,30 @@ object TypeCoercion { case (l, r) => None } + private def mergeComplexTypes( + t1: DataType, + t2: DataType, + mergeFunc: (DataType, DataType) => Option[DataType]): Option[DataType] = (t1, t2) match { +case (ArrayType(et1, containsNull1), ArrayType(et2, containsNull2)) => + mergeFunc(et1, et2).map(ArrayType(_, containsNull1 || containsNull2)) +case (MapType(kt1, vt1, valueContainsNull1), MapType(kt2, vt2, valueContainsNull2)) => + mergeFunc(kt1, kt2).flatMap { kt => +mergeFunc(vt1, vt2).map { vt => + MapType(kt, vt, valueContainsNull1 || valueContainsNull2) +} + } +case (StructType(fields1), StructType(fields2)) if fields1.length == fields2.length => + val resolver = SQLConf.get.resolver + fields1.zip(fields2).foldLeft(Option(new StructType())) { +case (Some(struct), (field1, field2)) if resolver(field1.name, field2.name) => --- End diff -- Ah, I see. `findTightestCommonType` only picks first field name when caseSensitive = false. This does the same thing actually. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21713: [SPARK-24737][SQL] Type coercion between StructTy...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21713#discussion_r200275396 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -166,6 +166,30 @@ object TypeCoercion { case (l, r) => None } + private def mergeComplexTypes( + t1: DataType, + t2: DataType, + mergeFunc: (DataType, DataType) => Option[DataType]): Option[DataType] = (t1, t2) match { +case (ArrayType(et1, containsNull1), ArrayType(et2, containsNull2)) => + mergeFunc(et1, et2).map(ArrayType(_, containsNull1 || containsNull2)) +case (MapType(kt1, vt1, valueContainsNull1), MapType(kt2, vt2, valueContainsNull2)) => + mergeFunc(kt1, kt2).flatMap { kt => +mergeFunc(vt1, vt2).map { vt => + MapType(kt, vt, valueContainsNull1 || valueContainsNull2) +} + } +case (StructType(fields1), StructType(fields2)) if fields1.length == fields2.length => + val resolver = SQLConf.get.resolver + fields1.zip(fields2).foldLeft(Option(new StructType())) { +case (Some(struct), (field1, field2)) if resolver(field1.name, field2.name) => --- End diff -- In case the names between two fields are different, `findTightestCommonType` uses the name of first field. This seems stricter? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21713: [SPARK-24737][SQL] Type coercion between StructTy...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21713#discussion_r200251401 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -166,6 +166,30 @@ object TypeCoercion { case (l, r) => None } + private def mergeComplexTypes( --- End diff -- nit: I would use similar naming conversion such as `findWiderTypeForComplex` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21713: [SPARK-24737][SQL] Type coercion between StructTy...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/21713#discussion_r200109841 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -185,6 +185,15 @@ object TypeCoercion { MapType(kt, vt, valueContainsNull1 || valueContainsNull2) } } +case (StructType(fields1), StructType(fields2)) if fields1.length == fields2.length => --- End diff -- is it time to reduce code duplication now? ``` def mergeComplexTypes( t1: DataType, t2: DataType, mergeFunc: (DataType, DataType) => Option[DataType]): Option[DataType] = (t1, t2) match { case (ArrayType(et1, containsNull1), ArrayType(et2, containsNull2)) => mergeFunc(et1, et2).map(ArrayType(_, containsNull1 || containsNull2)) ... } ``` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21713: [SPARK-24737][SQL] Type coercion between StructTy...
GitHub user ueshin opened a pull request: https://github.com/apache/spark/pull/21713 [SPARK-24737][SQL] Type coercion between StructTypes. ## What changes were proposed in this pull request? We can support type coercion between `StructType`s where all the internal types are compatible. ## How was this patch tested? Added tests. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ueshin/apache-spark issues/SPARK-24737/structtypecoercion Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/21713.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #21713 commit 5b9b9013826d126b8d7ce986515f395d147acb91 Author: Takuya UESHIN Date: 2018-07-04T08:25:50Z Type coercion between StructTypes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org