rdblue commented on a change in pull request #2021:
URL: https://github.com/apache/iceberg/pull/2021#discussion_r560577678



##########
File path: 
spark3-extensions/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteMergeInto.scala
##########
@@ -106,5 +113,18 @@ case class RewriteMergeInto(conf: SQLConf) extends 
Rule[LogicalPlan] with Rewrit
   private def getClauseCondition(clause: MergeAction): Expression = {
     clause.condition.getOrElse(TRUE_LITERAL)
   }
+
+  private def isCountCheckEnabled(table: Table, actions: Seq[MergeAction]): 
Boolean = {
+    def hasUnconditionalDelete(action: Option[MergeAction]): Boolean = {
+      action match {
+        case Some(DeleteAction(None)) => true
+        case _ => false
+      }
+    }
+    val propertyVal = PropertyUtil.propertyAsBoolean(table.properties(),
+      MERGE_WRITE_CARDINALITY_CHECK, MERGE_WRITE_CARDINALITY_CHECK_DEFAULT)
+    val hasOneUnconditionalDelete = actions.size == 1 && 
hasUnconditionalDelete(actions.headOption)
+    !hasOneUnconditionalDelete && propertyVal

Review comment:
       If I understand this logic correctly, the idea is that if the action 
does not depend on the source row at all, then don't perform the cardinality 
check?
   
   That makes sense to me, but I would move the logic out of this method that 
returns whether the check is enabled. I consider the check enabled if the table 
property is enabled. The logic to skip the check based on the structure of the 
MERGE command should be separate so that we can maintain it separately.
   
   So I'd split this into `isCardinalityCheckEnabled(table)` and 
`isCardinalityCheckNeeded(actions)`, then combine those in the rule: `val 
runCheck = isCardinalityCheckEnabled(table) && 
isCardinalityCheckNeeded(actions)`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to