[GitHub] [spark] cloud-fan commented on a change in pull request #28996: [SPARK-29358][SQL] Make unionByName optionally fill missing columns with nulls

GitBox Sun, 12 Jul 2020 23:32:15 -0700


cloud-fan commented on a change in pull request #28996:
URL: https://github.com/apache/spark/pull/28996#discussion_r453460284




##########
File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
##########
@@ -2048,19 +2088,34 @@ class Dataset[T] private[sql](
     // Builds a project list for `other` based on `logicalPlan` output names
     val rightProjectList = leftOutputAttrs.map { lattr =>
       rightOutputAttrs.find { rattr => resolver(lattr.name, rattr.name) }.getOrElse {
-        throw new AnalysisException(
-          s"""Cannot resolve column name "${lattr.name}" among """ +
-            s"""(${rightOutputAttrs.map(_.name).mkString(", ")})""")
+        if (allowMissingColumns) {

Review comment:
       I think the major problem here is that we put the by-name logic in the API 
method rather than in the `Analyzer`. Shall we add a boolean parameter to `Union` and 
move the by-name logic into the type coercion rules?
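
To make the suggestion concrete, here is a minimal sketch of what "flags on the plan node, resolution in a rule" could look like. It is a toy model in plain Scala, not Spark's actual classes: `Attr`, `NullAs`, `Relation`, `Project`, this `Union`, and `resolveUnionByName` are all hypothetical names chosen for illustration, and `IllegalArgumentException` stands in for `AnalysisException`. It only covers the case visible in the hunk above (a left column missing on the right side).

```scala
// Toy model only: none of these types are Spark's. All names are hypothetical.
object UnionByNameSketch {
  // An output column: either a real attribute or a null literal aliased to a name.
  sealed trait Expr { def name: String }
  final case class Attr(name: String) extends Expr
  final case class NullAs(name: String) extends Expr

  sealed trait Plan { def output: Seq[Expr] }
  final case class Relation(output: Seq[Expr]) extends Plan
  final case class Project(projectList: Seq[Expr], child: Plan) extends Plan {
    def output: Seq[Expr] = projectList
  }
  // The by-name behaviour is carried as flags on the node instead of being
  // expanded eagerly inside the API method.
  final case class Union(
      left: Plan,
      right: Plan,
      byName: Boolean = false,
      allowMissingColumns: Boolean = false) extends Plan {
    def output: Seq[Expr] = left.output
  }

  type Resolver = (String, String) => Boolean
  val caseInsensitive: Resolver = (a, b) => a.equalsIgnoreCase(b)

  // Analyzer-style rule: rewrite a by-name union into a positional one by
  // projecting the right side into the left side's column order.
  def resolveUnionByName(plan: Plan, resolver: Resolver): Plan = plan match {
    case Union(left, right, true, allowMissing) =>
      val rightProjectList = left.output.map { lattr =>
        right.output.find(rattr => resolver(lattr.name, rattr.name)).getOrElse {
          if (allowMissing) {
            // Fill the missing column with a null under the left column's name.
            NullAs(lattr.name)
          } else {
            throw new IllegalArgumentException(
              s"""Cannot resolve column name "${lattr.name}" among """ +
                s"""(${right.output.map(_.name).mkString(", ")})""")
          }
        }
      }
      // Flags are cleared: the result is an ordinary positional union.
      Union(left, Project(rightProjectList, right))
    case other => other
  }
}
```

With this shape, the API method would only construct `Union(left, right, byName = true, allowMissingColumns = ...)`, and the column matching would run when the rule fires, alongside the other resolution logic.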




