dramaticlly opened a new issue, #5224:
URL: https://github.com/apache/iceberg/issues/5224

   Hey Iceberg Community:
   
   we recently migrated from Iceberg 0.13 with Spark 3.1 to Spark 3.2 and realized 
that some existing SQL delete jobs are producing a lot more shuffle data than they 
did in Spark 3.1. When we explain the SQL statement and inspect the logical plan, we 
see that DynamicFileFilter 
(https://github.com/apache/iceberg/blob/master/spark/v3.1/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/DynamicFileFilter.scala)
 is missing from the Spark 3.2 extensions, and we would appreciate some help understanding why.
   
   It looks like the dynamic file filter was introduced in 
https://github.com/apache/iceberg/pull/3415/ on 10/31/2021, 
   while initial Spark 3.2 support was merged in 
https://github.com/apache/iceberg/pull/3335/files on 10/22/2021, so we want to 
check whether the timing of these two changes explains why the filter never made it into the Spark 3.2 extensions.
   
   The delete SQL:
   ```sql
   DELETE FROM $table1
   WHERE $table1.date <= '20211228' AND $table1.date >= '20220627'
   AND upper($table1.$column1) IN (SELECT * FROM $table2)
   ```
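   
   For reference, this is roughly how we compared the plans (a minimal sketch run against the same statement; `$table1`, `$table2`, and `$column1` are placeholders):
   ```sql
   -- Print the parsed, analyzed, and optimized logical plans plus the physical plan.
   -- On Spark 3.1 + Iceberg 0.13 the optimized plan contains a DynamicFileFilter node;
   -- on Spark 3.2 it does not.
   EXPLAIN EXTENDED
   DELETE FROM $table1
   WHERE $table1.date <= '20211228' AND $table1.date >= '20220627'
   AND upper($table1.$column1) IN (SELECT * FROM $table2)
   ```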
   
   Spark logical plan screenshot:
   
![Spark logical plan for the delete query](https://user-images.githubusercontent.com/5961173/177842058-574c871d-9f45-43b1-81b1-77f94bf9ac89.jpg)
   
   
   Iceberg Version: 0.13.0 (we have not turned on merge-on-read for this table yet; see the snippet below the version info)
   
   Spark Version: 3.2.0 (too much shuffle data) vs. 3.1.1 (works as expected with 
DynamicFileFilter)
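   
   In case it helps the discussion, this is how we would switch the table to merge-on-read (a sketch assuming the standard Iceberg `write.delete.mode` table property; `$table1` is a placeholder):
   ```sql
   -- Row-level deletes default to copy-on-write; merge-on-read writes delete
   -- files at delete time instead of rewriting the affected data files.
   ALTER TABLE $table1 SET TBLPROPERTIES ('write.delete.mode' = 'merge-on-read')
   ```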
   
   Appreciate your help!
   CC @szehon-ho @rdblue @aokolnychyi @wypoon 
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

