aokolnychyi commented on a change in pull request #3764:
URL: https://github.com/apache/iceberg/pull/3764#discussion_r772553485



##########
File path: 
spark/v3.2/spark-extensions/src/main/scala/org/apache/spark/sql/execution/dynamicpruning/RowLevelCommandDynamicPruning.scala
##########
@@ -86,6 +89,16 @@ case class RowLevelCommandDynamicPruning(spark: 
SparkSession) extends Rule[Logic
     val matchingRowsPlan = command match {
       case d: DeleteFromIcebergTable =>
         Filter(d.condition.get, relation)
+      case u: UpdateIcebergTable =>
+        // UPDATEs with subqueries may be rewritten using a UNION with two 
identical scan relations
+        // each scan relation will get its own dynamic filter that will be 
shared during execution
+        // the analyzer will assign different expr IDs for each scan relation 
output attributes
+        // that's why the condition may refer to invalid attr expr IDs and 
must be transformed

Review comment:
       I've tried to improve the comment to make it clear and also made a note 
about the rewrite rule. Let me know if its any better.
   
   > For example, UPDATE t SET id = 1 WHERE true doesn't need to project id 
from the table. If id is not projected, then the attrMap is incorrect.
   
   It was working correctly as we don't support schema pruning in copy-on-write 
operations. The only case where it is applicable if we have `true` update 
condition and then we can discard only columns assigned to literals. Seemed 
like a very narrow use case. I think we can handle it later as we did not 
support it in the old implementation. If the condition is not trivial, the 
assigned expression will be `if ... else oldValue` so we won't be able to prune 
the column.
   
   That being said, I did modify the logic for building the attribute map. It 
should work even if we add schema pruning.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to