Kontinuation opened a new pull request, #641: URL: https://github.com/apache/sedona-db/pull/641
## Summary

- Adds a `KnnQuerySideFilterPushdown` optimizer rule that automatically pushes query-side-only filters below the `SpatialJoinPlanNode` extension node for KNN inner joins
- Only handles `INNER JOIN` (conservative start); outer join support can be added later
- Updates docs to document the automatic pushdown behavior and clarify when `barrier()` is still needed

## Background

Previously, KNN joins blocked ALL filter pushdown (both query-side and object-side) because the `SpatialJoinPlanNode` extension node's default `prevent_predicate_push_down_columns()` returns all columns. Object-side pushdown must remain blocked (it changes KNN candidate sets), but query-side pushdown is safe and should be automatic.

DataFusion's built-in `PushDownFilter` pushes the same predicate to ALL children of an extension node, so a query-side filter like `h.stars >= 4` would fail when applied to the object-side child that doesn't have column `h.stars`. This requires a custom optimizer rule instead.

## Implementation

The `KnnQuerySideFilterPushdown` rule:

1. Pattern matches `Filter(predicate, Extension(SpatialJoinPlanNode))` where the join filter contains `ST_KNN`
2. Uses `find_knn_query_side()` to determine which child is the query side (from the first argument of `ST_KNN`)
3. Splits the filter predicate into conjuncts; pushes query-side-only conjuncts below the extension node; keeps the rest above
4. Runs **before** DataFusion's `PushDownFilter` so the pushed-down filters are further optimized into scan nodes in the same pass

## Testing

- 12 unit tests for `find_st_knn_call` and `find_knn_query_side`
- 3 integration tests verifying correct plan structure (filter pushed into query-side child, object-side filters stay above)

Depends on #635
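
As a rough illustration of step 3 of the implementation (splitting the predicate into conjuncts and pushing only the query-side ones), here is a minimal, self-contained sketch. It is not the code in this PR: `Pred`, `leaf`, `split_conjuncts`, `referenced_columns`, and `partition_conjuncts` are hypothetical stand-ins, and the real rule presumably works on DataFusion's own expression and plan types rather than this toy enum. The sketch also assumes that a conjunct referencing no columns stays above the join, which is a conservative choice made here, not something the PR states.

```rust
use std::collections::BTreeSet;

/// Toy predicate type (hypothetical; stands in for a real expression tree).
#[derive(Debug, Clone, PartialEq)]
enum Pred {
    /// A leaf predicate plus the columns it references, e.g. `h.stars >= 4`.
    Leaf { text: String, columns: Vec<String> },
    /// Logical AND of two predicates.
    And(Box<Pred>, Box<Pred>),
}

/// Convenience constructor for a leaf predicate.
fn leaf(text: &str, columns: &[&str]) -> Pred {
    Pred::Leaf {
        text: text.to_string(),
        columns: columns.iter().map(|c| c.to_string()).collect(),
    }
}

/// Flatten nested ANDs into a list of conjuncts, preserving left-to-right order.
fn split_conjuncts(pred: Pred, out: &mut Vec<Pred>) {
    match pred {
        Pred::And(left, right) => {
            split_conjuncts(*left, out);
            split_conjuncts(*right, out);
        }
        other => out.push(other),
    }
}

/// Collect every column name referenced anywhere in the predicate.
fn referenced_columns(pred: &Pred) -> BTreeSet<String> {
    match pred {
        Pred::Leaf { columns, .. } => columns.iter().cloned().collect(),
        Pred::And(left, right) => {
            let mut cols = referenced_columns(left);
            cols.extend(referenced_columns(right));
            cols
        }
    }
}

/// Split a filter into (pushed, kept):
/// - `pushed`: conjuncts that reference only query-side columns
/// - `kept`: everything else, which stays above the join node
fn partition_conjuncts(
    pred: Pred,
    query_side_columns: &BTreeSet<String>,
) -> (Vec<Pred>, Vec<Pred>) {
    let mut conjuncts = Vec::new();
    split_conjuncts(pred, &mut conjuncts);
    conjuncts.into_iter().partition(|conjunct| {
        let cols = referenced_columns(conjunct);
        // Assumption: column-free conjuncts are left in place rather than pushed.
        !cols.is_empty() && cols.is_subset(query_side_columns)
    })
}
```

With a query side exposing `h.stars` and `h.geom`, a filter such as `h.stars >= 4 AND r.rating > 3 AND h.stars > r.rating` (the `r` table is made up for this example) would yield `h.stars >= 4` as the only pushed conjunct; the object-side and mixed conjuncts remain above the join.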
