This is an automated email from the ASF dual-hosted git repository.
cloud-fan pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 992290bc09c3 [SPARK-56868][SQL][FOLLOW-UP] Clarify scaladoc for V2 runtime filter helpers
992290bc09c3 is described below
commit 992290bc09c3b0974e3ab804bf515ba37d5065ef
Author: Szehon Ho <[email protected]>
AuthorDate: Thu May 21 17:07:00 2026 +0800
[SPARK-56868][SQL][FOLLOW-UP] Clarify scaladoc for V2 runtime filter helpers
### What changes were proposed in this pull request?
Clarify scaladoc for `PushDownUtils.pushRuntimeFilters` and
`PushDownUtils.replanWithRuntimeFilters` introduced in
[SPARK-56868](https://issues.apache.org/jira/browse/SPARK-56868)
([#55887](https://github.com/apache/spark/pull/55887)):
- Document the mutating-scan constraint on `pushRuntimeFilters`, where
`SupportsRuntimeV2Filtering.filter` is invoked.
- Replace the mixed Note/Precondition text on `replanWithRuntimeFilters`
with a concise Notes bullet list.
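
To illustrate the mutating-scan constraint, here is a minimal sketch of why a caller would push filters at most once per scan instance and cache the result. `MutatingScan` and `FilteredPartitions` are hypothetical stand-ins written for this example; they are not Spark classes and do not reproduce the `SupportsRuntimeV2Filtering` or `PushDownUtils` signatures.

```scala
// Hypothetical stand-in for a scan whose filter call mutates internal state.
// This is NOT Spark's SupportsRuntimeV2Filtering interface.
final class MutatingScan(initial: Seq[Int]) {
  private var partitions: Seq[Int] = initial
  private var filterCalls: Int = 0

  // Narrows the scan in place, the way a mutating filter call would.
  def filter(keep: Int => Boolean): Unit = {
    filterCalls += 1
    partitions = partitions.filter(keep)
  }

  def planInputPartitions(): Seq[Int] = partitions

  def timesFiltered: Int = filterCalls
}

// Hypothetical caller: the lazy val runs the mutating push on first access
// only, and every later access reuses the cached partitions.
final class FilteredPartitions(scan: MutatingScan, keep: Int => Boolean) {
  lazy val partitions: Seq[Int] = {
    scan.filter(keep)
    scan.planInputPartitions()
  }
}

object Demo {
  def main(args: Array[String]): Unit = {
    val scan = new MutatingScan(Seq(1, 2, 3, 4))
    val cached = new FilteredPartitions(scan, _ % 2 == 0)

    val first = cached.partitions
    val second = cached.partitions // served from the cache, no second push

    assert(first == Seq(2, 4))
    assert(second == first)
    assert(scan.timesFiltered == 1)
  }
}
```

The `lazy val` is just one way to get the at-most-once guarantee; any caller-side cache keyed by the scan instance would serve the same purpose.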
### Why are the changes needed?
The original helper scaladoc mixed concerns (execute-time requirements,
caller caching, and preconditions) and referenced `pushRuntimeFilters` from the
wrong method. This follow-up makes the documentation easier to read without
changing behavior.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Documentation-only change; existing tests cover runtime filter behavior.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #55958 from szehon-ho/spark-56868-cleanup.
Authored-by: Szehon Ho <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
---
.../sql/execution/datasources/v2/PushDownUtils.scala | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/PushDownUtils.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/PushDownUtils.scala
index dc6de6f29af9..c54cc98014d7 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/PushDownUtils.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/PushDownUtils.scala
@@ -141,6 +141,9 @@ object PushDownUtils extends Logging {
* the first pass are used to derive PartitionPredicates in the second pass, avoiding duplicate
* pushdown.
*
+ * Note: Do not call multiple times for the same `scan` instance;
+ * [[SupportsRuntimeV2Filtering.filter]] is mutating.
+ *
* @return true if any filters were pushed to the data source
*/
def pushRuntimeFilters(
@@ -195,14 +198,11 @@ object PushDownUtils extends Logging {
* preserved the original partitioning and pads with `None` to preserve key alignment with the
* pre-filter partition set.
*
- * Must be called at execute time: runtime filters carry [[DynamicPruningExpression]] and
- * scalar-subquery references whose values are only resolved after their broadcast/subquery
- * side completes. The mutating [[pushRuntimeFilters]] call must run at most once per scan
- * instance; callers are responsible for caching the result.
- *
- * Precondition: when `outputPartitioning` is a [[KeyedPartitioning]], every element of
- * `originalPartitions` (and every partition re-planned by the data source) must implement
- * [[HasPartitionKey]].
+ * Notes:
+ * - Do not call multiple times for the same `scan` instance;
+ * [[SupportsRuntimeV2Filtering.filter]] is mutating.
+ * - When `outputPartitioning` is a [[KeyedPartitioning]], every split from
+ * `planInputPartitions()` used on this path must implement [[HasPartitionKey]].
*
* @param scan the V2 scan to push filters into
* @param runtimeFilters runtime filters to translate and push