zhangfengcdt opened a new issue, #2356:
URL: https://github.com/apache/sedona/issues/2356
We need to implement a similar barrier udf function in sedona similar to the
one in sedona db.
## Optimization Barrier
Use the `barrier` function to prevent filter pushdown and control predicate
evaluation order in complex spatial joins. This function creates an
optimization barrier by evaluating boolean expressions at runtime.
The `barrier` function takes a boolean expression as a string, followed by
pairs of variable names and their values that will be substituted into the
expression:
```sql
barrier(expression, var_name1, var_value1, var_name2, var_value2, ...)
```
The placement of filters relative to KNN joins changes the semantic meaning
of the query:
- **Filter before KNN**: First filters the data, then finds K nearest
neighbors from the filtered subset. This answers "What are the K nearest
high-rated restaurants?"
- **Filter after KNN**: First finds K nearest neighbors from all data, then
filters those results. This answers "Of the K nearest restaurants, which ones
are high-rated?"
### Example
Find the 3 nearest high-rated restaurants to luxury hotels, ensuring the KNN
join completes before filtering.
```sql
SELECT
h.name AS hotel,
r.name AS restaurant,
r.rating
FROM hotels AS h
INNER JOIN restaurants AS r
ON ST_KNN(h.geometry, r.geometry, 3, false)
WHERE barrier('rating > 4.0 AND stars >= 4',
'rating', r.rating,
'stars', h.stars)
```
With the barrier function, this query first finds the 3 nearest restaurants
to each hotel (regardless of rating), then filters to keep only those pairs
where the restaurant has rating > 4.0 and the hotel has stars >= 4. Without the
barrier, an optimizer might push the filters down, changing the query to first
filter for high-rated restaurants and luxury hotels, then find the 3 nearest
among those filtered sets.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]