Github user dilipbiswal commented on the pull request:

    https://github.com/apache/spark/pull/8983#issuecomment-145948447
  
    @cloud-fan, please confirm my understanding of the code (I am fairly new to the codebase :-) ).
    In the code we go through the entire IN list and evaluate each expression, setting the hasNull flag when an item evaluates to null.
    We continue with the next items and return true if we see a match. If we have not seen a match, we look at the hasNull flag and return null if it is set and false otherwise.
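
    The evaluation described above can be sketched as follows. This is a minimal standalone illustration in Python, not Spark's actual implementation: the function name sql_in is hypothetical, None stands in for SQL NULL, and returning NULL for a NULL left-hand value is an assumption based on standard SQL semantics rather than something stated in this comment.

```python
from typing import Iterable, Optional


def sql_in(value: Optional[int],
           candidates: Iterable[Optional[int]]) -> Optional[bool]:
    """Three-valued IN: returns True, False, or None (standing in for NULL)."""
    # Assumption: a NULL left-hand value yields NULL, as in standard SQL.
    if value is None:
        return None

    has_null = False
    for candidate in candidates:
        if candidate is None:
            # Remember that a NULL was seen, but keep scanning for a match.
            has_null = True
        elif candidate == value:
            # A match wins regardless of any NULLs in the list.
            return True

    # No match: the result is NULL if any item was NULL, otherwise False.
    return None if has_null else False
```

    Under this sketch, the position of NULL in the list does not change the result when a match exists: sql_in(1, [1, 2, None]) and sql_in(1, [None, 1, 2]) both return True, which is consistent with both queries below returning every row.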
    
    To confirm whether there is an issue, I ran the following two queries again. The output looks OK to me.
    
    select * from inttab where 1 in (1,2,NULL)
    var2: org.apache.spark.sql.DataFrame = [c1: int]
    +---+
    | c1|
    +---+
    |  1|
    |  2|
    |  3|
    |  4|
    |  5|
    +---+
    
    == Parsed Logical Plan ==
    'Project [unresolvedalias(*)]
     'Filter 1 IN (1,2,null)
      'UnresolvedRelation [inttab], None
    
    == Analyzed Logical Plan ==
    c1: int
    Project [c1#0]
     Filter 1 IN (cast(1 as int),cast(2 as int),cast(null as int))
      Subquery inttab
       LogicalRDD [c1#0], MapPartitionsRDD[4] at rddToDataFrameHolder at <console>:26
    
    == Optimized Logical Plan ==
    LogicalRDD [c1#0], MapPartitionsRDD[4] at rddToDataFrameHolder at <console>:26
    
    == Physical Plan ==
    Scan PhysicalRDD[c1#0]
    
    Code Generation: true
    
    select * from inttab where 1 in (NULL,1,2)
    var2: org.apache.spark.sql.DataFrame = [c1: int]
    +---+
    | c1|
    +---+
    |  1|
    |  2|
    |  3|
    |  4|
    |  5|
    +---+
    
    == Parsed Logical Plan ==
    'Project [unresolvedalias(*)]
     'Filter 1 IN (null,1,2)
      'UnresolvedRelation [inttab], None
    
    == Analyzed Logical Plan ==
    c1: int
    Project [c1#0]
     Filter 1 IN (cast(null as int),cast(1 as int),cast(2 as int))
      Subquery inttab
       LogicalRDD [c1#0], MapPartitionsRDD[4] at rddToDataFrameHolder at <console>:26
    
    == Optimized Logical Plan ==
    LogicalRDD [c1#0], MapPartitionsRDD[4] at rddToDataFrameHolder at <console>:26
    
    == Physical Plan ==
    Scan PhysicalRDD[c1#0]
    
    Code Generation: true
    
    Please let me know your thoughts.

