zhangfengcdt opened a new pull request, #65:
URL: https://github.com/apache/sedona-db/pull/65
**Summary**
- Add integration tests for KNN join functionality with synthetic data
- Include cross-verification against PostGIS for correctness validation
- Add comprehensive benchmarking comparing SedonaDB, PostGIS, and DuckDB
- Test various scenarios: basic joins, polygon joins, edge cases, and attribute preservation
- Benchmark results show SedonaDB running 8-655× faster than DuckDB and PostGIS
**Integration Tests Added**
- **Basic KNN Join**: Point-to-point KNN queries with configurable k values
- **Mixed Geometry Types**: Point-to-polygon KNN operations
- **Edge Cases**: Handle scenarios where k exceeds the number of available targets
- **Attribute Preservation**: Verify that additional columns are maintained in the results
- **Correctness Validation**: Cross-verify results against PostGIS using equivalent queries
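The semantics these tests check can be illustrated with a brute-force reference, offered here as a minimal sketch only. It is not SedonaDB's implementation and not the test code from this PR: the function name `knn_join_bruteforce`, the dict-based row layout, and the tie-break on `id` are all assumptions made for illustration, and it covers only the point-to-point case with Euclidean distance.

```python
import heapq
import math


def knn_join_bruteforce(queries, targets, k):
    """Hypothetical reference KNN join (illustration only).

    Each row is a dict with "id", "x", "y" and any extra attribute columns.
    Returns one output row per (query, neighbor) pair.
    """
    results = []
    for q in queries:
        def distance_to_query(t, q=q):
            return math.dist((q["x"], q["y"]), (t["x"], t["y"]))

        # nsmallest returns at most len(targets) rows, so k larger than the
        # number of targets yields every target instead of raising an error.
        # Ties are broken on target id to keep the output deterministic.
        nearest = heapq.nsmallest(
            k, targets, key=lambda t: (distance_to_query(t), t["id"])
        )
        for t in nearest:
            results.append(
                {
                    # Whole rows are carried through, so extra columns survive.
                    "query": q,
                    "target": t,
                    "distance": distance_to_query(t),
                }
            )
    return results


if __name__ == "__main__":
    trips = [{"id": 1, "x": 0.0, "y": 0.0, "name": "trip-a"}]
    buildings = [
        {"id": 10, "x": 1.0, "y": 0.0, "kind": "house"},
        {"id": 11, "x": 3.0, "y": 4.0, "kind": "shop"},
        {"id": 12, "x": 0.0, "y": 2.0, "kind": "barn"},
    ]
    for row in knn_join_bruteforce(trips, buildings, k=2):
        print(row["query"]["name"], row["target"]["kind"], row["distance"])
```

A reference like this is quadratic in the input sizes, so it is only suitable for small synthetic datasets, but its output can be compared row by row against an engine's KNN join result.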
**Benchmark Results**
Performance comparison across three engines using both small (100 trips × 1000 buildings) and large (1000 trips × 2000 buildings) datasets:
### Large Dataset Results
| Engine | k=1 | k=5 | k=10 |
|----------|-----------|-----------|-----------|
| SedonaDB | 3.32ms | 6.50ms | 9.38ms |
| DuckDB | 211.19ms | 209.19ms | 202.81ms |
| PostGIS | 2171.37ms | 2237.47ms | 2221.19ms |
### Small Dataset Results
| Engine | k=1 | k=5 | k=10 |
|----------|----------|----------|----------|
| SedonaDB | 0.93ms | 1.16ms | 1.49ms |
| DuckDB | 11.52ms | 11.71ms | 12.21ms |
| PostGIS | 161.93ms | 162.62ms | 165.21ms |
**SedonaDB is 8-655× faster** than DuckDB and PostGIS across all tested scenarios.
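The speedup range can be rechecked from the two tables above. The sketch below hard-codes the rounded timings as published and divides each competitor's time by SedonaDB's. The helper name `speedups` and the data layout are assumptions for illustration. With these rounded values the ratios span roughly 8.2× (DuckDB, small dataset, k=10) to about 654× (PostGIS, large dataset, k=1), so the upper bound of 655× quoted here may have been derived from unrounded measurements.

```python
K_VALUES = [1, 5, 10]

# Timings in milliseconds, copied from the benchmark tables (rounded values).
TIMINGS_MS = {
    "large": {
        "SedonaDB": [3.32, 6.50, 9.38],
        "DuckDB": [211.19, 209.19, 202.81],
        "PostGIS": [2171.37, 2237.47, 2221.19],
    },
    "small": {
        "SedonaDB": [0.93, 1.16, 1.49],
        "DuckDB": [11.52, 11.71, 12.21],
        "PostGIS": [161.93, 162.62, 165.21],
    },
}


def speedups(timings, baseline="SedonaDB"):
    """Map (dataset, engine, k) to engine_time / baseline_time."""
    ratios = {}
    for dataset, engines in timings.items():
        base_times = engines[baseline]
        for engine, times in engines.items():
            if engine == baseline:
                continue
            for k, engine_ms, base_ms in zip(K_VALUES, times, base_times):
                ratios[(dataset, engine, k)] = engine_ms / base_ms
    return ratios


if __name__ == "__main__":
    ratios = speedups(TIMINGS_MS)
    lowest = min(ratios, key=ratios.get)
    highest = max(ratios, key=ratios.get)
    print(f"smallest speedup: {ratios[lowest]:.1f}x at {lowest}")
    print(f"largest speedup:  {ratios[highest]:.1f}x at {highest}")
```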