Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/21403#discussion_r190160677
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala ---
@@ -45,6 +46,10 @@ object RewritePredicateSubquery extends Rule[LogicalPlan] with PredicateHelper {
  private def getValueExpression(e: Expression): Seq[Expression] = {
    e match {
      case cns : CreateNamedStruct => cns.valExprs
+     case Literal(struct: InternalRow, dt: StructType) if dt.isInstanceOf[StructType] =>
+       dt.zipWithIndex.map { case (field, idx) => Literal(struct.get(idx, field.dataType)) }
+     case a @ AttributeReference(_, dt: StructType, _, _) =>
--- End diff ---
I am not sure we should unpack the struct and do a field-by-field comparison. The reason is that the field-by-field comparison can yield a `null` value, whereas the struct-level comparison cannot. This matters a lot for null-aware anti joins.
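
To make the difference concrete, here is a minimal sketch of the two semantics (the object name and the local `SparkSession` setup are just assumptions for illustration, not part of this PR):

```scala
// Minimal sketch: struct-level vs. field-by-field equality under nulls.
// The object name and local session are illustrative assumptions only.
import org.apache.spark.sql.SparkSession

object StructVsFieldComparison {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("struct-vs-field")
      .getOrCreate()

    // Struct-level comparison: both structs are non-null, so the ordering-based
    // comparison returns a definite true/false, never null.
    spark.sql(
      "SELECT named_struct('a', 1, 'b', CAST(NULL AS INT)) = " +
      "named_struct('a', 1, 'b', 2) AS struct_eq"
    ).show()  // struct_eq is false, not null

    // Field-by-field comparison: the null field propagates through the
    // conjunction, so the predicate evaluates to null instead of false.
    spark.sql("SELECT (1 = 1 AND CAST(NULL AS INT) = 2) AS field_eq")
      .show()  // field_eq is null

    spark.stop()
  }
}
```

With `NOT IN` (planned as a null-aware anti join), an unknown (`null`) comparison filters out the outer row just as a match does, while a plain `false` keeps it, so unpacking the struct into per-field comparisons can change which rows survive.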