Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/21403#discussion_r190160677
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala ---
@@ -45,6 +46,10 @@ object RewritePredicateSubquery extends Rule[LogicalPlan] with PredicateHelper {
  private def getValueExpression(e: Expression): Seq[Expression] = {
    e match {
      case cns : CreateNamedStruct => cns.valExprs
+     case Literal(struct: InternalRow, dt: StructType) if dt.isInstanceOf[StructType] =>
+       dt.zipWithIndex.map { case (field, idx) => Literal(struct.get(idx, field.dataType)) }
+     case a @ AttributeReference(_, dt: StructType, _, _) =>
--- End diff ---
I am not sure we should unpack the struct and do a field-by-field comparison. The reason is that the field-by-field comparison can yield a `null` value, whereas the struct-level comparison cannot. This matters a lot for null-aware anti joins.
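
To make the difference concrete, here is a minimal sketch of the two semantics (the object name and the local `SparkSession` setup are just assumptions for illustration, not part of this PR):

```scala
// Minimal sketch: struct-level vs. field-by-field equality under nulls.
// The object name and local session are illustrative assumptions only.
import org.apache.spark.sql.SparkSession

object StructVsFieldComparison {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("struct-vs-field")
      .getOrCreate()

    // Struct-level comparison: both structs are non-null, so the ordering-based
    // comparison returns a definite true/false, never null.
    spark.sql(
      "SELECT named_struct('a', 1, 'b', CAST(NULL AS INT)) = " +
      "named_struct('a', 1, 'b', 2) AS struct_eq"
    ).show()  // struct_eq is false, not null

    // Field-by-field comparison: the null field propagates through the
    // conjunction, so the predicate evaluates to null instead of false.
    spark.sql("SELECT (1 = 1 AND CAST(NULL AS INT) = 2) AS field_eq")
      .show()  // field_eq is null

    spark.stop()
  }
}
```

With `NOT IN` (planned as a null-aware anti join), an unknown (`null`) comparison filters out the outer row just as a match does, while a plain `false` keeps it, so unpacking the struct into per-field comparisons can change which rows survive.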