petern48 commented on code in PR #2040:
URL: https://github.com/apache/sedona/pull/2040#discussion_r2178731619
##########
python/sedona/geopandas/geoseries.py:
##########
@@ -132,6 +129,20 @@ def __init__(
"allow_override=True)' to overwrite CRS or "
"'GeoSeries.to_crs(crs)' to reproject geometries. "
)
+ # This is a temporary workaround since pyspark errors when creating a ps.Series from a ps.Series
+ # This is NOT a scalable solution since we call to_pandas() on the data and is a hacky solution
+ # but this should be resolved if/once https://github.com/apache/spark/pull/51300 is merged in.
+ # For now, we reset self._anchor = data to keep the geometry information (e.g. crs) that's lost in to_pandas()
+ super().__init__(
+ data=data.to_pandas(),
+ index=index,
+ dtype=dtype,
+ name=name,
+ copy=copy,
+ fastpath=fastpath,
+ )
+
+ self._anchor = data
Review Comment:
It's not ideal that we have to do it this way for now, but I'm pretty happy
that I found a workaround, hacky as it is, that gets the constructor working
properly. If this approach is approved, I'll open a separate issue to track the
temporary fix. Hopefully the Spark PR gets approved and backported.
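
For anyone skimming the thread, here is a rough, self-contained sketch of the pattern the workaround follows. It does not use pyspark or Sedona at all: `LocalSeries`, `ToySeries`, and `ToyGeoSeries` are hypothetical stand-ins I made up for illustration (roughly playing the roles of a pandas Series, `ps.Series`, and our `GeoSeries`), and the way `crs` is looked up through `_anchor` here is an assumption of the toy model, not a description of how the real classes behave.

```python
class LocalSeries:
    """Hypothetical stand-in for a local, in-memory series. Carries no CRS."""

    def __init__(self, values):
        self.values = list(values)


class ToySeries:
    """Hypothetical stand-in for a parent class whose constructor
    errors when handed an instance of its own type."""

    def __init__(self, data):
        if isinstance(data, ToySeries):
            raise TypeError("ToySeries cannot be built from a ToySeries")
        self._anchor = data  # object that actually holds the values

    @property
    def values(self):
        return self._anchor.values

    def to_local(self):
        # Materializes everything locally; extra metadata (e.g. crs) is dropped.
        return LocalSeries(self.values)


class ToyGeoSeries(ToySeries):
    def __init__(self, data, crs=None):
        if isinstance(data, ToySeries):
            # Workaround: hand the parent a local copy it can accept...
            super().__init__(data.to_local())
            # ...then re-point _anchor at the original so its metadata survives.
            self._anchor = data
        else:
            super().__init__(data)
        self._crs = crs

    @property
    def crs(self):
        # Fall back to the anchor's crs when none was given explicitly.
        if self._crs is not None:
            return self._crs
        return getattr(self._anchor, "crs", None)


src = ToyGeoSeries(LocalSeries(["POINT (0 0)", "POINT (1 1)"]), crs="EPSG:4326")
copy = ToyGeoSeries(src)  # would raise TypeError without the workaround
```

The cost is the same as in the real change: `to_local()` pulls every value into memory just to satisfy the parent constructor, which is why this is only meant as a stopgap.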