petern48 opened a new issue, #2230: URL: https://github.com/apache/sedona/issues/2230
This is an umbrella issue that will encompass many other issues for implementing additional geopandas functionality.

Setup:
- Read the [Sedona Geopandas developer guide](https://sedona.apache.org/latest-snapshot/community/geopandas/)
- Set up the build environment by following these [compile docs](https://github.com/apache/sedona/blob/master/docs/setup/compile.md)

Implementation Steps
1. Pick a `GeoSeries` function to implement from the (subset) list below or from the [full list in the original documentation](https://geopandas.org/en/stable/docs/reference/geoseries.html).
2. Look for a similar function that has already been implemented in Sedona Geopandas.
   - Chances are, you'll just need to use one of the helper functions `_query_geometry_column` or `_row_wise_operations`.
3. Implement the function in `geoseries.py` and `base.py`. Usually, this is as simple as calling the corresponding Sedona function (e.g. `ST_IsRing` for `.is_ring()`). The `base.py` part is important so that GeoDataFrame can also execute it.
4. Add documentation under the `base.py` implementation by copying and pasting from the geopandas docstrings.
5. Add tests to `test_geoseries.py` and `test_match_geopandas_series.py` following similar conventions.
   - In `test_geoseries.py`, include the example in the docstring as part of the test.
   - For `test_match_geopandas_series.py`, follow the convention of similar functions. Most of the time, you just run the function on each of the `self.geoms` and compare the output with geopandas using the helper functions.
6. Create a new issue on GitHub. Either create a subissue of this [EPIC] issue or explicitly mention this issue. Then, submit a PR linked to the issue you just created (not this issue). Go ahead and CC @petern48 on your PR too, so I can review it.
7. Repeat!

**Note about AI use**: I have zero problem with contributors using AI to help them write their code faster.
HOWEVER, I strongly urge you to at least run your code and tests locally on your laptop. You're not going to have a good time trying to iterate on failed CI runs. If you need help fixing your code to pass the tests, it's fine to ask for help. But if you are constantly pushing changes an LLM suggested without checking them, and waiting for CI to tell you whether your code is correct, you will be putting an annoying burden on reviewers, since we need to manually approve your CI to run after each change. I also recommend doing your first one on your own by following an example PR or implementation. It should be fairly easy to figure out.

Example PR: https://github.com/apache/sedona/pull/2232

List of unimplemented functions along with existing functions to model their implementations off of:

Unary predicates (boolean operations): model off of `is_empty`
- [x] is_ring
- [ ] is_ccw (non-trivial)
- [x] is_closed

Binary predicates
- [x] relate (ST_Relate)
- [ ] relate_pattern (non-trivial)

Set theoretic methods: model off of `difference`
- [x] symmetric_difference (use ST_SymDifference)
- [x] union (use ST_Union)
- [ ] clip_by_rect

Aggregations (model off of `union_all`)
- [x] intersection_all (ST_Intersection_Aggr)
- [ ] voronoi_polygons (ST_VoronoiPolygons)
- [ ] polygonize (ST_Polygonize)
- [ ] delaunay_triangles (ST_DelaunayTriangles)
- [ ] build_area (ST_BuildArea)
- [ ] explode
- [ ] dissolve

Other
- [x] convex_hull (ST_ConvexHull)
- [ ] concave_hull (ST_ConcaveHull)
- [x] force_2d (ST_Force_2D)
- [x] force_3d (ST_Force3D)
- [ ] frechet_distance (ST_FrechetDistance)
- [ ] hausdorff_distance (ST_HausdorffDistance)
- [x] minimum_bounding_circle (ST_MinimumBoundingCircle)
- [x] minimum_bounding_radius (ST_MinimumBoundingRadius)

Joins (model off of `sjoin`)
- [ ] sjoin_nearest (use ST_KNN, though note this is more complicated)

This doesn't include everything. It's also possible I misgrouped a few functions, but here's a good starting list.
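
To illustrate why step 3 of the implementation steps asks for changes in both `geoseries.py` and `base.py`, here is a toy, pure-Python sketch. Every class and method name in it (`ToyGeoFrame`, `ToyGeoSeries`, `ToyGeoDataFrame`, `_geometry_column`) is a hypothetical stand-in, not Sedona's actual code, and the "closed ring" check is only a crude placeholder for what `ST_IsRing` would compute. It assumes the shared base class forwards the call to the geometry column, which is one plausible way such delegation could work.

```python
# Toy stand-ins only: these classes are NOT the real Sedona implementation.
# They illustrate why a method is added in both base.py and geoseries.py.


class ToyGeoFrame:
    """Plays the role of the shared base class (base.py)."""

    def is_ring(self):
        """The docstring copied from geopandas would live here."""
        # Hypothetical delegation: forward the call to the geometry column.
        return self._geometry_column().is_ring()

    def _geometry_column(self):
        raise NotImplementedError


class ToyGeoSeries(ToyGeoFrame):
    """Plays the role of GeoSeries (geoseries.py)."""

    def __init__(self, geoms):
        # Each geometry is modeled as a list of (x, y) tuples.
        self._geoms = geoms

    def _geometry_column(self):
        return self

    def is_ring(self):
        # The real code would call the Sedona function (e.g. ST_IsRing).
        # Here "at least 4 points and first == last" is a crude placeholder.
        return [len(g) >= 4 and g[0] == g[-1] for g in self._geoms]


class ToyGeoDataFrame(ToyGeoFrame):
    """Plays the role of GeoDataFrame; it inherits is_ring from the base."""

    def __init__(self, geometry):
        self._geometry = geometry

    def _geometry_column(self):
        return self._geometry


closed = [(0, 0), (1, 0), (1, 1), (0, 0)]
open_line = [(0, 0), (1, 0), (1, 1)]
series = ToyGeoSeries([closed, open_line])
frame = ToyGeoDataFrame(series)
```

In this sketch, `frame.is_ring()` works without any code in `ToyGeoDataFrame` itself, because the base class method forwards to the series. That is the motivation for touching the base class in addition to the series class.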
Ping me when we need to add more to the list.

---

### Additional unimplemented functions

Binary predicates (additional): model off of `contains`
- [ ] geom_equals
- [ ] geom_equals_exact
- [ ] geom_almost_equals
- [ ] disjoint
- [ ] contains_properly

General methods and attributes
- [ ] count_coordinates
- [ ] count_geometries
- [ ] count_interior_rings
- [ ] get_coordinates
- [ ] get_precision
- [ ] set_precision
- [ ] interiors
- [ ] exterior
- [ ] representative_point
- [ ] offset_curve

Constructive methods and attributes
- [ ] shortest_line
- [ ] sample_points
- [ ] reverse
- [ ] remove_repeated_points
- [ ] normalize
- [ ] minimum_rotated_rectangle
- [ ] minimum_clearance
- [ ] extract_unique_points
- [ ] transform

Linestring operations
- [ ] shared_paths
- [ ] project
- [ ] line_merge
- [ ] interpolate

Affine transformations
- [ ] affine_transform
- [ ] rotate
- [ ] scale
- [ ] skew
- [ ] translate

Overlay operations
- [ ] GeoDataFrame.overlay
- [ ] GeoDataFrame.clip
- [ ] GeoSeries.clip

Plotting
- [ ] GeoDataFrame.plot
- [ ] GeoDataFrame.explore
- [ ] GeoSeries.plot
- [ ] GeoSeries.explore

Interface
- [ ] GeoDataFrame.iterfeatures
- [ ] GeoDataFrame.__geo_interface__
- [ ] GeoSeries.__geo_interface__

Indexing
- [ ] GeoDataFrame.cx
- [ ] GeoSeries.cx

Tools (top-level functions)
- [ ] geopandas.points_from_xy
- [ ] geopandas.tools.collect
- [ ] geopandas.tools.geocode
- [ ] geopandas.tools.reverse_geocode
- [ ] geopandas.clip
- [ ] geopandas.overlay

Serialization / IO / conversion
- [ ] GeoDataFrame.to_postgis
- [ ] GeoDataFrame.to_feather
- [ ] GeoDataFrame.to_geo_dict
- [ ] GeoDataFrame.from_postgis
- [ ] GeoDataFrame.from_features

Spatial joins (additional)
- [ ] GeoDataFrame.sjoin_nearest

For anyone interested, here's the epic for initial Geopandas Support, which links to all of the rest of the PRs: https://github.com/apache/sedona/issues/2001

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
