Copilot commented on code in PR #2732:
URL: https://github.com/apache/sedona/pull/2732#discussion_r2929662860
##########
common/src/main/java/org/apache/sedona/common/Functions.java:
##########
@@ -1485,7 +1486,8 @@ public static Geometry triangulatePolygon(Geometry geom) {
return ConstrainedDelaunayTriangulator.triangulate(geom);
}
- public static double lineLocatePoint(Geometry geom, Geometry point) {
+ public static Double lineLocatePoint(Geometry geom, Geometry point) {
+ if (geom.isEmpty()) return null;
double length = geom.getLength();
LengthIndexedLine indexedLine = new LengthIndexedLine(geom);
return indexedLine.indexOf(point.getCoordinate()) / length;
Review Comment:
`lineLocatePoint` can still return `NaN`/`Infinity` (or throw) for
valid-but-degenerate inputs. If `geom` is non-empty but has `length == 0`
(e.g., repeated coordinates), this does `index / 0`. Also, if `point` is empty,
`point.getCoordinate()` is null and `indexOf(...)` will NPE. Please add guards
(e.g., return 0.0 for zero-length lines and null for empty points) to avoid
producing invalid results / runtime failures and to better match
PostGIS/GeoPandas behavior.
##########
common/src/test/java/org/apache/sedona/common/FunctionsTest.java:
##########
@@ -4407,6 +4407,21 @@ public void lineLocatePoint() {
assertEquals(expectedResult3, actual3, FP_TOLERANCE);
}
+ @Test
+ public void lineLocatePointEmpty() {
+ LineString emptyLine = GEOMETRY_FACTORY.createLineString();
+ Geometry point = GEOMETRY_FACTORY.createPoint(new Coordinate(1, 1));
+ assertNull(Functions.lineLocatePoint(emptyLine, point));
+ }
+
+ @Test
Review Comment:
The new empty-geometry tests are helpful, but they don’t cover the closely
related degenerate case where a LineString is non-empty but has zero length
(e.g., repeated coordinates), which currently leads to `0/0 -> NaN` in
`Functions.lineLocatePoint`. Adding an assertion for that case (and for an
empty `point` argument) would prevent regressions and validate the intended
semantics.
##########
python/sedona/spark/geopandas/geoseries.py:
##########
@@ -1330,6 +1330,89 @@ def distance(self, other, align=None) -> pspd.Series:
)
return result
+ def frechet_distance(self, other, align=None, densify=None) -> pspd.Series:
+ if densify is not None:
+ raise NotImplementedError(
+ "Sedona does not support the densify parameter for
frechet_distance."
+ )
+
+ other_series, extended = self._make_series_of_val(other)
+ align = False if extended else align
+
+ spark_expr = stf.ST_FrechetDistance(F.col("L"), F.col("R"))
+ result = self._row_wise_operation(
+ spark_expr,
+ other_series,
+ align,
+ default_val=None,
+ )
+ return result
+
+ def hausdorff_distance(self, other, align=None, densify=None) ->
pspd.Series:
+ other_series, extended = self._make_series_of_val(other)
+ align = False if extended else align
+
+ if densify is not None:
+ spark_expr = stf.ST_HausdorffDistance(F.col("L"), F.col("R"),
densify)
+ else:
+ spark_expr = stf.ST_HausdorffDistance(F.col("L"), F.col("R"))
+ result = self._row_wise_operation(
+ spark_expr,
+ other_series,
+ align,
+ default_val=None,
+ )
+ return result
+
+ def geom_equals(self, other, align=None) -> pspd.Series:
+ other_series, extended = self._make_series_of_val(other)
+ align = False if extended else align
+
+ spark_expr = stp.ST_Equals(F.col("L"), F.col("R"))
+ result = self._row_wise_operation(
+ spark_expr,
+ other_series,
+ align,
+ returns_geom=False,
+ default_val=False,
+ )
+ return _to_bool(result)
+
+ def interpolate(self, distance, normalized=False) -> "GeoSeries":
+ other_series, extended = self._make_series_of_val(distance)
+ align = not extended
+
+ if normalized:
+ spark_expr = stf.ST_LineInterpolatePoint(F.col("L"), F.col("R"))
+ else:
+ length = stf.ST_Length(F.col("L"))
+ fraction = F.when(length == 0, F.lit(0.0)).otherwise(F.col("R") /
length)
+ spark_expr = stf.ST_LineInterpolatePoint(F.col("L"), fraction)
+ return self._row_wise_operation(
+ spark_expr,
+ other_series,
+ align=align,
+ returns_geom=True,
+ )
+
+ def project(self, other, normalized=False, align=None) -> pspd.Series:
+ other_series, extended = self._make_series_of_val(other)
+ align = False if extended else align
+
+ if normalized:
+ spark_expr = stf.ST_LineLocatePoint(F.col("L"), F.col("R"))
+ else:
+ spark_expr = stf.ST_LineLocatePoint(F.col("L"), F.col("R")) *
stf.ST_Length(
+ F.col("L")
+ )
Review Comment:
`project(normalized=False)` multiplies `ST_LineLocatePoint(...)` by
`ST_Length(L)`. For zero-length (non-empty) lines, `ST_LineLocatePoint`
currently computes `index/length` (0/0) which yields `NaN`, and `NaN * 0`
remains `NaN` (instead of the expected 0.0 absolute distance). Consider
explicitly handling `ST_Length(L) == 0` (return 0.0) for both normalized and
absolute modes to avoid propagating `NaN`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]