cloud-fan commented on a change in pull request #28996:
URL: https://github.com/apache/spark/pull/28996#discussion_r453460284
##########
File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
##########
@@ -2048,19 +2088,34 @@ class Dataset[T] private[sql](
// Builds a project list for `other` based on `logicalPlan` output names
val rightProjectList = leftOutputAttrs.map { lattr =>
rightOutputAttrs.find { rattr => resolver(lattr.name, rattr.name)
}.getOrElse {
- throw new AnalysisException(
- s"""Cannot resolve column name "${lattr.name}" among """ +
- s"""(${rightOutputAttrs.map(_.name).mkString(", ")})""")
+ if (allowMissingColumns) {
Review comment:
I think the major problem here is that we put the by-name logic in the API
method rather than in the `Analyzer`. Shall we add a boolean parameter to `Union` and
move the by-name logic to the type coercion rules?
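
To make the suggestion concrete, here is a minimal sketch of the idea using a toy plan model. Nothing in it is Spark's real Catalyst API: `Attr`, `NullAs`, `Relation`, `Project`, the flags on this `Union`, and `resolveByName` are all hypothetical names invented for illustration. It also assumes that a missing column is filled with a null-valued column when `allowMissingColumns` is set, and it only covers building the right-side project list shown in the hunk above.

```scala
// Toy model only: these are NOT Spark's Catalyst classes.
object ByNameUnionSketch {
  sealed trait Expr { def name: String }
  final case class Attr(name: String) extends Expr
  // Hypothetical stand-in for a null literal aliased to `name`.
  final case class NullAs(name: String) extends Expr

  sealed trait Plan { def output: Seq[Attr] }
  final case class Relation(output: Seq[Attr]) extends Plan
  final case class Project(exprs: Seq[Expr], child: Plan) extends Plan {
    def output: Seq[Attr] = exprs.map(e => Attr(e.name))
  }
  // The by-name intent is carried on the plan node as plain flags, so the
  // API method only has to construct the node.
  final case class Union(
      left: Plan,
      right: Plan,
      byName: Boolean = false,
      allowMissingColumns: Boolean = false) extends Plan {
    def output: Seq[Attr] = left.output
  }

  type Resolver = (String, String) => Boolean
  val caseInsensitive: Resolver = (a, b) => a.equalsIgnoreCase(b)

  // Analyzer-style rule: rewrites a by-name Union into a positional one by
  // projecting the right side into the left side's column order.
  def resolveByName(plan: Plan, resolver: Resolver): Plan = plan match {
    case Union(left, right, true, allowMissing) =>
      val projectList: Seq[Expr] = left.output.map { lattr =>
        right.output.find(rattr => resolver(lattr.name, rattr.name)) match {
          case Some(rattr) => rattr
          case None if allowMissing => NullAs(lattr.name)
          case None =>
            throw new IllegalArgumentException(
              s"""Cannot resolve column name "${lattr.name}" among """ +
                s"""(${right.output.map(_.name).mkString(", ")})""")
        }
      }
      Union(left, Project(projectList, right))
    case other => other
  }
}
```

With this shape, the `Dataset` method would only build `Union(left, right, byName = true, allowMissingColumns = ...)`, and the name matching would run during analysis alongside the other resolution and coercion rules.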