This is an automated email from the ASF dual-hosted git repository.

jiayu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/sedona.git


The following commit(s) were added to refs/heads/master by this push:
     new 85e6107d2 [DOC] Update docs to explain the case of filtering after KNN (#1575)
85e6107d2 is described below

commit 85e6107d20bd39920767bba4893f5a3c04b578c1
Author: Feng Zhang <[email protected]>
AuthorDate: Tue Sep 3 16:23:53 2024 -0700

    [DOC] Update docs to explain the case of filtering after KNN (#1575)
---
 docs/api/sql/NearestNeighbourSearching.md | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)
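
Note on the paragraph added below: the order in which a filter and a kNN join run can change the result, which is why the new text recommends materializing the join before filtering. The following is a toy sketch in plain Scala, without Spark or Sedona, and is not how Sedona implements the join. The `KnnFilterOrder` object, the `Obj` case class, the `knn` helper, and the sample data are all hypothetical names chosen for illustration.

```scala
// Toy illustration only: plain Scala collections, no Spark or Sedona.
// All names and data here are hypothetical.
object KnnFilterOrder {
  final case class Obj(id: String, x: Double, y: Double, category: String)

  // Squared Euclidean distance is sufficient for ranking neighbours.
  private def dist2(qx: Double, qy: Double, o: Obj): Double = {
    val dx = o.x - qx
    val dy = o.y - qy
    dx * dx + dy * dy
  }

  // The k objects nearest to the query point (qx, qy).
  def knn(qx: Double, qy: Double, objects: Seq[Obj], k: Int): Seq[Obj] =
    objects.sortBy(o => dist2(qx, qy, o)).take(k)

  val objects: Seq[Obj] = Seq(
    Obj("a", 1.0, 0.0, "cafe"),
    Obj("b", 2.0, 0.0, "bank"),
    Obj("c", 3.0, 0.0, "cafe"),
    Obj("d", 4.0, 0.0, "cafe")
  )

  private def isCafe(o: Obj): Boolean = o.category == "cafe"

  // Filter applied after the join: take the 2 nearest objects, then keep cafes.
  val filterAfter: Seq[String] =
    knn(0.0, 0.0, objects, 2).filter(isCafe).map(_.id)

  // Filter pushed to the object side: keep cafes first, then take the 2 nearest.
  val filterBefore: Seq[String] =
    knn(0.0, 0.0, objects.filter(isCafe), 2).map(_.id)

  def main(args: Array[String]): Unit = {
    println(s"filter after kNN:  $filterAfter")
    println(s"filter before kNN: $filterBefore")
  }
}
```

With this sample data, filtering after the join keeps only object `a`, because `b` is among the two nearest neighbours and is then discarded. Filtering first yields `a` and `c`, because `b` never competes for a slot. Which of the two results is wanted depends on the query, so the sketch only shows that the two orders are not interchangeable.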

diff --git a/docs/api/sql/NearestNeighbourSearching.md b/docs/api/sql/NearestNeighbourSearching.md
index 224c63e44..bc65777cb 100644
--- a/docs/api/sql/NearestNeighbourSearching.md
+++ b/docs/api/sql/NearestNeighbourSearching.md
@@ -19,6 +19,30 @@ In case there are ties in the distance, the result will include all the tied geo
 spark.sedona.join.knn.includeTieBreakers=true
 ```
 
+Filter Pushdown Considerations:
+
+When using ST_KNN with filters applied to the resulting DataFrame, some of these filters may be pushed down to the object side of the kNN join. This means the filters will be applied to the object side reader before the kNN join is executed. If you want the filters to be applied after the kNN join, ensure that you first materialize the kNN join results and then apply the filters.
+
+For example, you can use the following approach:
+
+Scala Example:
+
+```
+val knnResult = knnJoinDF.cache()
+val filteredResult = knnResult.filter(condition)
+```
+
+SQL Example:
+
+```
+CREATE OR REPLACE TEMP VIEW knnResult AS
+SELECT * FROM (
+  -- Your KNN join SQL here
+) AS knnView;
+CACHE TABLE knnResult;
+SELECT * FROM knnResult WHERE condition;
+```
+
 SQL Example
 
 Suppose we have two tables `QUERIES` and `OBJECTS` with the following data:
