Re: [PR] [GH-2731] Implement GeoSeries: frechet_distance, hausdorff_distance, geom_equals, interpolate, project [sedona]

via GitHub Fri, 13 Mar 2026 00:34:07 -0700


jiayuasu commented on code in PR #2732:
URL: https://github.com/apache/sedona/pull/2732#discussion_r2929580275



##########
python/tests/geopandas/test_geoseries.py:
##########
@@ -2635,6 +2635,177 @@ def test_relate(self):
         expected = pd.Series(["FF2F11212", "212101212"])
         self.check_pd_series_equal(result, expected)
 
+    def test_frechet_distance(self):
+        s1 = GeoSeries(
+            [
+                LineString([(0, 0), (1, 0), (2, 0)]),
+                LineString([(0, 0), (1, 1)]),
+            ]
+        )
+        s2 = GeoSeries(
+            [
+                LineString([(0, 1), (1, 2), (2, 1)]),
+                LineString([(1, 0), (2, 1)]),
+            ]
+        )
+
+        result = s1.frechet_distance(s2, align=False)
+        expected = pd.Series([2.0, 1.0])
+        self.check_pd_series_equal(result, expected)

Review Comment:
   Correction: empty geometries DO work in Sedona GeoSeries — my previous claim 
about a "Spark UDT limitation" was wrong. The match tests now include empty 
geometries and pass. We also fixed the upstream Java bugs 
(`ST_LineInterpolatePoint` crash, `ST_LineLocatePoint` returning `-Infinity`) 
and added Java + Scala Spark tests for empty geometry handling. See 2d1b4ac.
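
As a sanity check on the expected values in the hunk above (`[2.0, 1.0]` for `frechet_distance`), here is a minimal sketch of a vertex-only (discrete) Fréchet distance in plain Python. It is not Sedona's implementation. Whether `frechet_distance` in this PR computes exactly the discrete variant is an assumption, and the helper name `discrete_frechet` is hypothetical. For these two fixtures the recurrence reproduces the expected results.

```python
from functools import lru_cache
from math import hypot


def discrete_frechet(p, q):
    """Discrete Frechet distance between two vertex sequences.

    Hypothetical helper for illustration only; it considers vertices,
    not points along the segments between them.
    """

    @lru_cache(maxsize=None)
    def c(i, j):
        # Euclidean distance between vertex i of p and vertex j of q.
        d = hypot(p[i][0] - q[j][0], p[i][1] - q[j][1])
        if i == 0 and j == 0:
            return d
        if i == 0:
            return max(c(0, j - 1), d)
        if j == 0:
            return max(c(i - 1, 0), d)
        # Advance along p, along q, or along both; keep the cheapest coupling.
        return max(min(c(i - 1, j), c(i - 1, j - 1), c(i, j - 1)), d)

    return c(len(p) - 1, len(q) - 1)


# Same coordinates as the s1 / s2 fixtures in test_frechet_distance.
pair_0 = discrete_frechet([(0, 0), (1, 0), (2, 0)], [(0, 1), (1, 2), (2, 1)])
pair_1 = discrete_frechet([(0, 0), (1, 1)], [(1, 0), (2, 1)])
print(pair_0, pair_1)
```

In the first pair the vertex `(1, 2)` is at least 2 units from every vertex of the other line, which is what pins the result at 2.0. In the second pair the lines are the same segment shifted by one unit in x, so the result is 1.0.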



##########
python/tests/geopandas/test_geoseries.py:
##########
@@ -2635,6 +2635,177 @@ def test_relate(self):
         expected = pd.Series(["FF2F11212", "212101212"])
         self.check_pd_series_equal(result, expected)
 
+    def test_frechet_distance(self):
+        s1 = GeoSeries(
+            [
+                LineString([(0, 0), (1, 0), (2, 0)]),
+                LineString([(0, 0), (1, 1)]),
+            ]
+        )
+        s2 = GeoSeries(
+            [
+                LineString([(0, 1), (1, 2), (2, 1)]),
+                LineString([(1, 0), (2, 1)]),
+            ]
+        )
+
+        result = s1.frechet_distance(s2, align=False)
+        expected = pd.Series([2.0, 1.0])
+        self.check_pd_series_equal(result, expected)
+
+        # Test with single geometry
+        line = LineString([(0, 1), (1, 2), (2, 1)])
+        result = s1.frechet_distance(line)
+        expected = pd.Series([2.0, 1.0])
+        self.check_pd_series_equal(result, expected)
+
+        # Test that GeoDataFrame works too
+        df_result = s1.to_geoframe().frechet_distance(s2, align=False)
+        expected = pd.Series([2.0, 1.0])
+        self.check_pd_series_equal(df_result, expected)
+
+        # Test that densify raises NotImplementedError
+        with pytest.raises(NotImplementedError):
+            s1.frechet_distance(s2, densify=0.5)
+
+    def test_hausdorff_distance(self):
+        s1 = GeoSeries(
+            [
+                LineString([(0, 0), (1, 0), (2, 0)]),
+                LineString([(0, 0), (1, 1)]),
+            ]
+        )
+        s2 = GeoSeries(
+            [
+                LineString([(0, 1), (1, 2), (2, 1)]),
+                LineString([(1, 0), (2, 1)]),
+            ]
+        )
+
+        result = s1.hausdorff_distance(s2, align=False)
+        expected = pd.Series([2.0, 1.0])
+        self.check_pd_series_equal(result, expected)
+
+        # Test with single geometry
+        line = LineString([(0, 1), (1, 2), (2, 1)])
+        result = s1.hausdorff_distance(line)
+        expected = pd.Series([2.0, 1.0])
+        self.check_pd_series_equal(result, expected)
+
+        # Test that GeoDataFrame works too
+        df_result = s1.to_geoframe().hausdorff_distance(s2, align=False)
+        expected = pd.Series([2.0, 1.0])
+        self.check_pd_series_equal(df_result, expected)
+
+        # Test with densify parameter
+        result = s1.hausdorff_distance(s2, densify=0.5, align=False)
+        expected = pd.Series([2.0, 1.0])
+        self.check_pd_series_equal(result, expected)
+
+    def test_geom_equals(self):
+        s1 = GeoSeries(
+            [
+                Point(0, 0),
+                Point(1, 1),
+                Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
+            ]
+        )
+        s2 = GeoSeries(
+            [
+                Point(0, 0),
+                Point(2, 2),
+                Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
+            ]
+        )
+
+        result = s1.geom_equals(s2, align=False)
+        expected = pd.Series([True, False, True])
+        self.check_pd_series_equal(result, expected)

Review Comment:
   Correction: empty geometries DO work in Sedona GeoSeries — my previous claim 
about a "Spark UDT limitation" was wrong. The match tests now include empty 
geometries and pass. We also fixed the upstream Java bugs 
(`ST_LineInterpolatePoint` crash, `ST_LineLocatePoint` returning `-Infinity`) 
and added Java + Scala Spark tests for empty geometry handling. See 2d1b4ac.



##########
python/tests/geopandas/test_geoseries.py:
##########
@@ -2635,6 +2635,177 @@ def test_relate(self):
         expected = pd.Series(["FF2F11212", "212101212"])
         self.check_pd_series_equal(result, expected)
 
+    def test_frechet_distance(self):
+        s1 = GeoSeries(
+            [
+                LineString([(0, 0), (1, 0), (2, 0)]),
+                LineString([(0, 0), (1, 1)]),
+            ]
+        )
+        s2 = GeoSeries(
+            [
+                LineString([(0, 1), (1, 2), (2, 1)]),
+                LineString([(1, 0), (2, 1)]),
+            ]
+        )
+
+        result = s1.frechet_distance(s2, align=False)
+        expected = pd.Series([2.0, 1.0])
+        self.check_pd_series_equal(result, expected)
+
+        # Test with single geometry
+        line = LineString([(0, 1), (1, 2), (2, 1)])
+        result = s1.frechet_distance(line)
+        expected = pd.Series([2.0, 1.0])
+        self.check_pd_series_equal(result, expected)
+
+        # Test that GeoDataFrame works too
+        df_result = s1.to_geoframe().frechet_distance(s2, align=False)
+        expected = pd.Series([2.0, 1.0])
+        self.check_pd_series_equal(df_result, expected)
+
+        # Test that densify raises NotImplementedError
+        with pytest.raises(NotImplementedError):
+            s1.frechet_distance(s2, densify=0.5)
+
+    def test_hausdorff_distance(self):
+        s1 = GeoSeries(
+            [
+                LineString([(0, 0), (1, 0), (2, 0)]),
+                LineString([(0, 0), (1, 1)]),
+            ]
+        )
+        s2 = GeoSeries(
+            [
+                LineString([(0, 1), (1, 2), (2, 1)]),
+                LineString([(1, 0), (2, 1)]),
+            ]
+        )
+
+        result = s1.hausdorff_distance(s2, align=False)
+        expected = pd.Series([2.0, 1.0])
+        self.check_pd_series_equal(result, expected)
+
+        # Test with single geometry
+        line = LineString([(0, 1), (1, 2), (2, 1)])
+        result = s1.hausdorff_distance(line)
+        expected = pd.Series([2.0, 1.0])
+        self.check_pd_series_equal(result, expected)
+
+        # Test that GeoDataFrame works too
+        df_result = s1.to_geoframe().hausdorff_distance(s2, align=False)
+        expected = pd.Series([2.0, 1.0])
+        self.check_pd_series_equal(df_result, expected)
+
+        # Test with densify parameter
+        result = s1.hausdorff_distance(s2, densify=0.5, align=False)
+        expected = pd.Series([2.0, 1.0])
+        self.check_pd_series_equal(result, expected)
+
+    def test_geom_equals(self):
+        s1 = GeoSeries(
+            [
+                Point(0, 0),
+                Point(1, 1),
+                Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
+            ]
+        )
+        s2 = GeoSeries(
+            [
+                Point(0, 0),
+                Point(2, 2),
+                Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
+            ]
+        )
+
+        result = s1.geom_equals(s2, align=False)
+        expected = pd.Series([True, False, True])
+        self.check_pd_series_equal(result, expected)
+
+        # Test with single geometry
+        result = s1.geom_equals(Point(0, 0))
+        expected = pd.Series([True, False, False])
+        self.check_pd_series_equal(result, expected)
+
+        # Test that GeoDataFrame works too
+        df_result = s1.to_geoframe().geom_equals(s2, align=False)
+        expected = pd.Series([True, False, True])
+        self.check_pd_series_equal(df_result, expected)
+
+    def test_interpolate(self):
+        s = GeoSeries(
+            [
+                LineString([(0, 0), (2, 0), (0, 2)]),
+                LineString([(0, 0), (2, 2)]),
+                LineString([(2, 0), (0, 2)]),
+            ]
+        )

Review Comment:
   Correction: empty geometries DO work in Sedona GeoSeries — my previous claim 
about a "Spark UDT limitation" was wrong. The match tests now include empty 
geometries and pass. We also fixed the upstream Java bugs 
(`ST_LineInterpolatePoint` crash, `ST_LineLocatePoint` returning `-Infinity`) 
and added Java + Scala Spark tests for empty geometry handling. See 2d1b4ac.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
