Kristin Cowalcijk created SEDONA-260:
----------------------------------------
Summary: More intuitive configuration of partition and index-build
side of spatial joins in Sedona SQL
Key: SEDONA-260
URL: https://issues.apache.org/jira/browse/SEDONA-260
Project: Apache Sedona
Issue Type: Improvement
Reporter: Kristin Cowalcijk
Apache Sedona allows configuring the dominant side of spatial partitioning and
the side on which Sedona builds spatial indices (See [Apache Sedona -
parameters|https://sedona.apache.org/latest-snapshot/api/sql/Parameter/]).
However, the way Apache Sedona defines the left and right sides of the join is
quite counterintuitive.
For queries such as {{{}SELECT * FROM A JOIN B ON ST_(left, right){}}}, or
{{{}df_a.join(df_b, expr("ST_{}}}{{{}(left, right)")){}}}, the left-side
relation and right-side relation were completely determined by the join
condition: {{{}ST_*(left, right){}}}. The relation {{left}} references to is
the left-side relation, and the relation {{right}} references to is the
right-side relation.
For example, the left side relation of the following query is `df_msb`, even
though it appears to be on the right side of the join:
{code:sql}
SELECT * FROM df_pickup JOIN df_msb ON ST_Contains(df_msb.geom,
df_pickup.pickup)
{code}
If we replace the join condition with {{{}ST_Within(df_pickup.pickup,
df_msb.geom){}}}, {{df_pickup}} becomes the left-side relation.
A more intuitive way is to always treat {{df_pickup}} as the left-side
relation, regardless of how the join condition is written.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)