jiayuasu commented on issue #854:
URL: https://github.com/apache/sedona/issues/854#issuecomment-1586053598

   @Kontinuation I like this idea.
   
   Let's break this proposal into 3 standalone PRs. I believe they can be 
implemented separately without relying on each other.
   
   Step 1: Move the sampling logic to `analyze()`
   
   1. Update the `analyze()` function of `SpatialRDD` to include the Poisson sampler
   2. Build a spatial partitioning grid using the samples we collected in 
`analyze()`.
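
   A minimal sketch of what Step 1 could look like, in plain Java with no Spark 
or Sedona dependencies. `AnalyzeResult`, `AnalyzeSketch`, and 
`stripeBoundaries` are hypothetical names, not Sedona's API, and the 
equal-count x-stripes are only a stand-in for a real partitioning grid. The 
point is that one pass can collect the count, the bounding box, and a Poisson 
sample together.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for the statistics analyze() would return; not Sedona's API.
final class AnalyzeResult {
    long count;  // total number of records seen
    double minX = Double.POSITIVE_INFINITY, maxX = Double.NEGATIVE_INFINITY;
    double minY = Double.POSITIVE_INFINITY, maxY = Double.NEGATIVE_INFINITY;
    final List<double[]> samples = new ArrayList<>();  // sampled (x, y) points
}

final class AnalyzeSketch {
    // Draw from Poisson(mean) with Knuth's multiplication method (fine for small means).
    static int poisson(double mean, Random rng) {
        double limit = Math.exp(-mean);
        double product = rng.nextDouble();
        int k = 0;
        while (product > limit) {
            k++;
            product *= rng.nextDouble();
        }
        return k;
    }

    // One pass: update count and bounding box, and Poisson-sample each point.
    static AnalyzeResult analyze(List<double[]> points, double fraction, long seed) {
        Random rng = new Random(seed);
        AnalyzeResult r = new AnalyzeResult();
        for (double[] p : points) {
            r.count++;
            r.minX = Math.min(r.minX, p[0]);
            r.maxX = Math.max(r.maxX, p[0]);
            r.minY = Math.min(r.minY, p[1]);
            r.maxY = Math.max(r.maxY, p[1]);
            // Poisson sampling may emit a point zero, one, or several times.
            int copies = poisson(fraction, rng);
            for (int i = 0; i < copies; i++) {
                r.samples.add(p);
            }
        }
        return r;
    }

    // Simplified grid: x-split positions so each stripe holds roughly equal sample counts.
    static double[] stripeBoundaries(List<double[]> samples, int numStripes) {
        double[] xs = samples.stream().mapToDouble(p -> p[0]).sorted().toArray();
        if (xs.length == 0 || numStripes < 2) {
            return new double[0];
        }
        double[] cuts = new double[numStripes - 1];
        for (int i = 1; i < numStripes; i++) {
            cuts[i - 1] = xs[(int) ((long) i * xs.length / numStripes)];
        }
        return cuts;
    }
}
```

   With this shape, the partitioner would consume `samples` from the analyze 
result instead of triggering a separate sampling job.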
   
   Step 2: Add heuristics to determine the join side in 
`TraitJoinQueryExec.scala` 
(https://github.com/apache/sedona/blob/master/sql/common/src/main/scala/org/apache/spark/sql/sedona_sql/strategy/join/TraitJoinQueryExec.scala#L59)
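
   One possible heuristic for Step 2, sketched under the assumption that "join 
side" means which input gets indexed (the build side) while the other is 
streamed. `BuildSide` and `JoinSideHeuristic` are hypothetical names and do 
not exist in `TraitJoinQueryExec.scala`. The sketch only uses approximate row 
counts; a real heuristic might also weigh geometry size or partition count.

```java
// Hypothetical enum; not part of Sedona.
enum BuildSide { LEFT, RIGHT }

final class JoinSideHeuristic {
    // Index the side with fewer rows, since a smaller index is cheaper to
    // build and keep in memory. A negative count means "statistics unknown",
    // in which case the caller-supplied fallback is used. Ties go to LEFT.
    static BuildSide chooseBuildSide(long leftCount, long rightCount, BuildSide fallback) {
        if (leftCount < 0 || rightCount < 0) {
            return fallback;
        }
        return rightCount < leftCount ? BuildSide.RIGHT : BuildSide.LEFT;
    }
}
```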
   
   Step 3: `DynamicIndexLookupJudgement` automatically determines the stream 
side on a per-grid basis.
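
   For Step 3, the same idea can be applied per grid cell rather than once per 
join. This is a rough sketch that assumes per-cell record counts are available 
for both inputs. `PerPartitionStreamSide` is a hypothetical name and says 
nothing about how `DynamicIndexLookupJudgement` is actually structured.

```java
import java.util.ArrayList;
import java.util.List;

final class PerPartitionStreamSide {
    enum Side { LEFT, RIGHT }

    // For each grid cell, stream the larger input and index the smaller one.
    // Index i in both arrays must refer to the same grid cell.
    static List<Side> chooseStreamSides(long[] leftCounts, long[] rightCounts) {
        if (leftCounts.length != rightCounts.length) {
            throw new IllegalArgumentException("both inputs must use the same grid");
        }
        List<Side> sides = new ArrayList<>(leftCounts.length);
        for (int i = 0; i < leftCounts.length; i++) {
            // Ties stream the left input.
            sides.add(leftCounts[i] >= rightCounts[i] ? Side.LEFT : Side.RIGHT);
        }
        return sides;
    }
}
```

   A skewed dataset can then stream the left input in one cell and the right 
input in another, which a single global choice cannot do.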
   
   @dfischercodethoughts What you are proposing is Step 2. You can take a stab 
at it if you want. Since `v1.4.1` will be released soon, I expect the entire 
proposal to be completed in `v1.5.0`.
   

