GitHub user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23057#discussion_r234409212
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala ---
    @@ -1280,4 +1281,34 @@ class SubquerySuite extends QueryTest with SharedSQLContext {
           assert(subqueries.length == 1)
         }
       }
    +
    +  test("SPARK-26078: deduplicate fake self joins for IN subqueries") {
    +    withTempView("a", "b") {
    +      val a = spark.createDataFrame(spark.sparkContext.parallelize(Seq(Row("a", 2), Row("b", 1))),
    +        StructType(Seq(StructField("id", StringType), StructField("num", IntegerType))))
    +      val b = spark.createDataFrame(spark.sparkContext.parallelize(Seq(Row("a", 2), Row("b", 1))),
    +        StructType(Seq(StructField("id", StringType), StructField("num", IntegerType))))
    --- End diff --
    
    The two schemas are the same. Can we define the schema just once?
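
For illustration, here is a minimal sketch of defining the schema once and reusing it for both DataFrames. The object name `SharedSchemaSketch`, the helper `frame`, and the local `SparkSession` are hypothetical and not part of the PR; inside SubquerySuite the `spark` session would come from SharedSQLContext instead.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

object SharedSchemaSketch {
  // Assumption: a local session for a standalone run. In the test suite,
  // `spark` is already provided by SharedSQLContext.
  lazy val spark: SparkSession = SparkSession.builder()
    .master("local[1]")
    .appName("shared-schema-sketch")
    .getOrCreate()

  // The schema is declared once and shared by both DataFrames.
  val schema: StructType = StructType(Seq(
    StructField("id", StringType),
    StructField("num", IntegerType)))

  // Hypothetical helper: builds a DataFrame from rows using the shared schema.
  def frame(rows: Seq[Row]): DataFrame =
    spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)

  def main(args: Array[String]): Unit = {
    val a = frame(Seq(Row("a", 2), Row("b", 1)))
    val b = frame(Seq(Row("a", 2), Row("b", 1)))
    // Both DataFrames carry the same schema instance.
    assert(a.schema == b.schema)
    spark.stop()
  }
}
```

In the test itself, the same effect could be had by binding the `StructType` to a single `val` inside the `withTempView` block and passing it to both `createDataFrame` calls.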


---
