[GitHub] [spark] mingjialiu commented on pull request #29564: [WIP][SPARK-32708] Query optimization fails to reuse exchange with DataSourceV2

GitBox Thu, 10 Sep 2020 11:19:11 -0700


mingjialiu commented on pull request #29564:
URL: https://github.com/apache/spark/pull/29564#issuecomment-690597364



   > The fix LGTM, can you add a test?
   
   Hi, it's a bit tricky to reproduce in a unit test. Can I get some pointers on 
producing different expression IDs for the same column?
   
   - Example that does NOT reproduce the issue in a unit test:
   
       val df = spark.read.format(classOf[AdvancedDataSourceV2].getName).load()
       // q1 and q2 are built identically from the same DataFrame
       val q1 = df.select(($"i" + 1).as("k"), ($"i" - 1).as("j")).filter($"i" > 5)
       val q2 = df.select(($"i" + 1).as("k"), ($"i" - 1).as("j")).filter($"i" > 5)
       // Collect the V2 scan nodes from the joined plan and compare them
       val scans1 = getV2ScanExecs(q1.join(q2, "j"))
       assert(scans1(0).sameResult(scans1(1)))
   
      scans1(0).sameResult(scans1(1)) always returns true, even when the filtered 
columns are not properly canonicalized (as circled in the screenshots).
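
To make the concern concrete, here is a toy model of expression-ID normalization during canonicalization. It is only a sketch: `Attr`, `GreaterThan`, `Scan`, and every method in `CanonicalizationSketch` are hypothetical stand-ins, not Spark's classes or its actual canonicalization code. It assumes, as one plausible reading of the behavior above, that the two scans in the unit test end up carrying the same expression ID for column i, which would hide a missing normalization of the pushed filters.

```scala
// Toy model only: these types are hypothetical and are NOT Spark's classes.
final case class Attr(name: String, exprId: Long)
final case class GreaterThan(attr: Attr, value: Int)
final case class Scan(output: Seq[Attr], pushedFilters: Seq[GreaterThan])

object CanonicalizationSketch {
  // Replace an attribute's expression ID with its ordinal position in `output`,
  // so that equivalent scans with different IDs can compare equal.
  private def normalize(attr: Attr, output: Seq[Attr]): Attr = {
    val ordinal = output.indexWhere(_.exprId == attr.exprId)
    if (ordinal >= 0) attr.copy(exprId = ordinal.toLong) else attr
  }

  // Incomplete canonicalization: output is normalized, pushed filters are not.
  def canonicalizeOutputOnly(scan: Scan): Scan =
    scan.copy(output = scan.output.map(normalize(_, scan.output)))

  // Full canonicalization: output and pushed filters are both normalized.
  def canonicalizeFully(scan: Scan): Scan =
    Scan(
      scan.output.map(normalize(_, scan.output)),
      scan.pushedFilters.map(f => f.copy(attr = normalize(f.attr, scan.output))))

  // Two scans "produce the same result" when their canonical forms are equal.
  def sameResult(a: Scan, b: Scan, canonicalize: Scan => Scan): Boolean =
    canonicalize(a) == canonicalize(b)

  // Build a scan of column `i` with the filter `i > 5`, using the given ID.
  def scanOfI(exprId: Long): Scan = {
    val i = Attr("i", exprId)
    Scan(Seq(i), Seq(GreaterThan(i, 5)))
  }
}
```

In this model, `sameResult(scanOfI(1), scanOfI(1), canonicalizeOutputOnly)` is true, so a test whose scans share an ID cannot tell the two canonicalizers apart. With `scanOfI(1)` and `scanOfI(2)`, the output-only version returns false and the full version returns true, which is the distinction a regression test would need to exercise.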
      
   
   
   
   




