Copilot commented on code in PR #2332:
URL: https://github.com/apache/sedona/pull/2332#discussion_r2325822947


##########
python/sedona/spark/geopandas/sindex.py:
##########
@@ -170,10 +186,14 @@ def nearest(
 
         Note: Unlike Geopandas, Sedona does not support geometry input of type 
np.array or GeoSeries.
 
+        Note: nearest() has different behaviors depending on how the index is 
constructed.
+        When constructed from a np.array, this method returns indices like 
original geopandas.
+        When constructed from a GeoSeries or PySparkDataFrame, this method 
returns geometries.
+
         Returns
         -------
         list or tuple
-            List of indices of nearest geometries, optionally with distances.
+            List of the nearest geometries, optionally with distances.

Review Comment:
   Similar to the query method, this docstring states it returns geometries but 
should acknowledge that it returns indices when constructed from np.array, 
based on the behavioral notes above.
   ```suggestion
               If the index was constructed from a np.array, returns a list of 
indices (or a tuple of (indices, distances) if return_distance is True).
               If constructed from a GeoSeries or PySparkDataFrame, returns a 
list of geometries (or a tuple of (geometries, distances) if return_distance is 
True).
   ```



##########
python/tests/geopandas/test_sindex.py:
##########
@@ -63,6 +63,31 @@ def setup_method(self):
             ]
         )
 
+    def test_construct_from_geoseries(self):
+        # Construct from a GeoSeries
+        gs = GeoSeries([Point(x, x) for x in range(5)])
+        sindex = SpatialIndex(gs)
+        result = sindex.query(Point(2, 2))
+        # SpatialIndex constructed from GeoSeries return geometries
+        assert result == [Point(2, 2)]
+
+    def test_construct_from_pyspark_dataframe(self):
+        # Construct from PySparkDataFrame
+        df = self.spark.createDataFrame(
+            [(Point(x, x),) for x in range(5)], ["geometry"]
+        )
+        sindex = SpatialIndex(df, column_name="geometry")
+        result = sindex.query(Point(2, 2))
+        assert result == [Point(2, 2)]
+
+    def test_construct_from_nparray(self):
+        # Construct from np.array
+        array = np.array([Point(x, x) for x in range(5)])
+        sindex = SpatialIndex(array)
+        result = sindex.query(Point(2, 2))
+        # Returns indices like original geopandas
+        assert result.tolist() == np.array([2])

Review Comment:
   The assertion compares a list to a numpy array, which will always be False. 
Use `assert result.tolist() == [2]` or `assert result == np.array([2])` instead.
   ```suggestion
           assert result.tolist() == [2]
   ```



##########
python/sedona/spark/geopandas/sindex.py:
##########
@@ -220,20 +243,28 @@ def nearest(
 
     def intersection(self, bounds):
         """
-        Find geometries that intersect the given bounding box.
+        Find geometries that intersect the given bounding box. Similar to the 
Geopandas version,
+        this is a compatibility wrapper for rtree.index.Index.intersection, 
use query instead.
 
         Parameters
         ----------
         bounds : tuple
             Bounding box as (min_x, min_y, max_x, max_y).
 
+        Note: intersection() has different behaviors depending on how the 
index is constructed.
+        When constructed from a np.array, this method returns indices like 
original geopandas.
+        When constructed from a GeoSeries or PySparkDataFrame, this method 
returns geometries.
+
+        Note: Unlike Geopandas, Sedona does not support geometry input of type 
np.array or GeoSeries.
+        It is recommended to instead use GeoSeries.intersects directly.
+
         Returns
         -------
         list
-            List of indices of matching geometries.
+            List of matching geometries.

Review Comment:
   The return description should be consistent with the behavioral notes and 
acknowledge that indices are returned when constructed from np.array.
   ```suggestion
               List of matching indices (if constructed from a np.array) or 
list of matching geometries (if constructed from a GeoSeries or 
PySparkDataFrame).
   ```



##########
python/sedona/spark/geopandas/sindex.py:
##########
@@ -82,12 +93,17 @@ def query(self, geometry: BaseGeometry, predicate: str = 
None, sort: bool = Fals
         sort : bool, optional, default False
             Whether to sort the results.
 
+        Note: query() has different behaviors depending on how the index is 
constructed.
+        When constructed from a np.array, this method returns indices like 
original geopandas.
+        When constructed from a GeoSeries or PySparkDataFrame, this method 
returns geometries.
+
         Note: Unlike Geopandas, Sedona does not support geometry input of type 
np.array or GeoSeries.
+        It is recommended to instead use GeoSeries.intersects directly.
 
         Returns
         -------
         list
-            List of indices of matching geometries.
+            List of matching geometries.

Review Comment:
   The docstring states this returns 'List of matching geometries' but the note 
above indicates it returns indices when constructed from np.array. The return 
description should acknowledge both behaviors or be more generic.
   ```suggestion
               If the index was constructed from a np.array, returns a list of 
indices of matching geometries.
               If constructed from a GeoSeries or PySparkDataFrame, returns a 
list of matching geometries.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@sedona.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to