zhangfengcdt opened a new issue, #2100:
URL: https://github.com/apache/sedona/issues/2100
The current sjoin implementation in Sedona's geopandas compatibility
layer is incomplete and lacks essential functionality compared to the
reference geopandas implementation. It supports only basic intersects
operations, with limited parameter handling.
Current Limitations
- Only supports intersects predicate (hardcoded)
- Only supports inner join type
- Ignores most parameters (predicate, how, distance, on_attribute,
lsuffix, rsuffix)
- Incorrect type validation (rejects GeoDataFrame inputs)
- No support for distance-based operations (dwithin)
- No column suffix handling for overlapping columns
- Minimal test coverage (only 2 basic tests)
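To illustrate the missing suffix handling, here is a minimal pure-Python
sketch of how overlapping column names could be renamed. The helper name
apply_suffixes is hypothetical and not part of Sedona. The defaults
("left"/"right", joined with an underscore) and the dropping of the right
geometry column are assumptions modeled on geopandas behavior.

```python
def apply_suffixes(left_cols, right_cols, lsuffix="left", rsuffix="right",
                   geometry="geometry"):
    """Rename columns present on both sides (hypothetical helper).

    Assumption: the left geometry column is kept unchanged and the right
    geometry column is dropped from the result.
    """
    # Columns that collide, excluding the geometry column itself.
    overlap = (set(left_cols) & set(right_cols)) - {geometry}
    new_left = [f"{c}_{lsuffix}" if c in overlap else c for c in left_cols]
    new_right = [
        f"{c}_{rsuffix}" if c in overlap else c
        for c in right_cols
        if c != geometry
    ]
    return new_left, new_right


left, right = apply_suffixes(["id", "name", "geometry"],
                             ["id", "zone", "geometry"])
print(left)   # ['id_left', 'name', 'geometry']
print(right)  # ['id_right', 'zone']
```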
Implementation Tasks
- Enhance core sjoin function with complete parameter support
- Implement all spatial predicates using Sedona spatial functions
- Add support for all join types (inner, left, right)
- Implement distance-based operations for dwithin predicate
- Add column suffix handling for overlapping columns
- Support attribute-based joining with on_attribute parameter
- Fix type validation to accept both GeoSeries and GeoDataFrame
- Add comprehensive error handling and validation
- Create extensive test suite covering all functionality
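The validation task could look roughly like the sketch below. The function
name validate_sjoin_args and the two constants are hypothetical. The
predicate and join-type sets are taken from the lists in this issue, and
the rule that dwithin requires a distance is an assumption.

```python
# Predicates and join types as listed in this issue.
SUPPORTED_PREDICATES = {
    "intersects", "contains", "within", "touches",
    "crosses", "overlaps", "dwithin",
}
SUPPORTED_HOW = {"inner", "left", "right"}


def validate_sjoin_args(how="inner", predicate="intersects", distance=None):
    """Check sjoin arguments up front (hypothetical helper)."""
    if how not in SUPPORTED_HOW:
        raise ValueError(
            f"`how` was {how!r} but is expected to be in {sorted(SUPPORTED_HOW)}"
        )
    if predicate not in SUPPORTED_PREDICATES:
        raise ValueError(
            f"`predicate` was {predicate!r} but is expected to be in "
            f"{sorted(SUPPORTED_PREDICATES)}"
        )
    # Assumption: a distance must be supplied for the dwithin predicate.
    if predicate == "dwithin" and distance is None:
        raise ValueError("`distance` is required when predicate='dwithin'")
    return how, predicate, distance
```

Failing fast here keeps invalid arguments from reaching the Spark query
and surfacing as a less readable error.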
Testing
The implementation should include comprehensive tests covering:
- Basic sjoin functionality for GeoSeries and GeoDataFrame
- All spatial predicates (intersects, contains, within, touches, crosses,
overlaps, dwithin)
- All join types (inner, left, right)
- Column suffix handling
- Distance-based operations
- Attribute-based joining
- Error handling for invalid inputs
- Edge cases and performance scenarios
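For the join-type tests, a small pure-Python reference for the expected row
pairs may be useful to assert against. This is only a sketch: the name
expected_pairs is hypothetical, rows are identified by positional index,
and the predicate is assumed to have been evaluated already into a list of
matching (left, right) pairs.

```python
def expected_pairs(n_left, n_right, matches, how="inner"):
    """Return the (left_idx, right_idx) pairs a join should produce.

    `matches` holds the index pairs that satisfy the spatial predicate.
    Unmatched rows are paired with None for left and right joins.
    """
    pairs = sorted(matches)
    if how == "inner":
        return pairs
    if how == "left":
        matched = {l for l, _ in matches}
        pairs += [(l, None) for l in range(n_left) if l not in matched]
        # Order by left index; unmatched rows sort after matched ones.
        return sorted(pairs, key=lambda p: (p[0], p[1] is None, p[1] or 0))
    if how == "right":
        matched = {r for _, r in matches}
        pairs += [(None, r) for r in range(n_right) if r not in matched]
        # Order by right index; unmatched rows sort after matched ones.
        return sorted(pairs, key=lambda p: (p[1], p[0] is None, p[0] or 0))
    raise ValueError(f"unsupported join type: {how!r}")


matches = [(0, 0), (0, 1), (2, 1)]
print(expected_pairs(3, 3, matches, "inner"))  # [(0, 0), (0, 1), (2, 1)]
print(expected_pairs(3, 3, matches, "left"))   # [(0, 0), (0, 1), (1, None), (2, 1)]
print(expected_pairs(3, 3, matches, "right"))  # [(0, 0), (0, 1), (2, 1), (None, 2)]
```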