Hi.
I'm trying to convert a geopandas object, gdf_lines, to LineStrings in
Sedona.
In: print(gdf_lines)
Out:
Name
Cuadrante 46 LINESTRING (-101.72470 21.13983, -101.72485 21...
Cuadrante 47 LINESTRING (-101.73533 21.13968, -101.73675 21...
Cuadrante 48 LINESTRING (-101.75310 21.15606, -101.75198 21...
In: type(gdf_lines)
Out: geopandas.geoseries.GeoSeries
In: gdf_lines.geom_type
Out:
Name
Cuadrante 46 LineString
Cuadrante 47 LineString
Cuadrante 48 LineString
And I want to do something like the following example:
In:
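from pyspark.sql.types import IntegerType, StructField, StructType
from sedona.sql.types import GeometryType
from shapely.geometry import LineString

# Schema: an integer id plus a Sedona geometry column
schema = StructType(
    [
        StructField("id", IntegerType(), False),
        StructField("geom", GeometryType(), False)
    ]
)

# One row holding a single shapely LineString
line = [(40, 40), (30, 30), (40, 20), (30, 10)]
data = [[1, LineString(line)]]

# `spark` is an existing SparkSession with Sedona registered
gdf = spark.createDataFrame(data, schema)
gdf.show(1, False)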
Out:
+---+---------------------------------------+
|id |geom |
+---+---------------------------------------+
|1 |LINESTRING (40 40, 30 30, 40 20, 30 10)|
+---+---------------------------------------+
but it looks like I actually got a GeoSeries, which has no column name for the geometry.
Do I need to convert it to a dataframe with a named geometry column first and then
apply a schema like in the example?
Or is there a shorter path?
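
To make the "shorter path" idea concrete, here is a rough, unverified sketch of one route I can imagine: pull (name, WKT) pairs out of the GeoSeries and let Sedona parse the WKT, instead of building a schema by hand. The helper name to_wkt_rows and the column names "name", "wkt" and "geom" are my own placeholders, and the Spark part assumes a session with Sedona's SQL functions (such as ST_GeomFromWKT) already registered, so it is left as comments.

```python
def to_wkt_rows(series):
    """Turn (name, geometry) pairs into plain (name, WKT) tuples.

    `series` is anything with .items() yielding (index, geometry) pairs,
    such as a GeoSeries; each geometry must expose a `.wkt` attribute,
    as shapely geometries do.
    """
    return [(str(name), geom.wkt) for name, geom in series.items()]


# Hypothetical usage, assuming `spark` is a SparkSession with Sedona's
# SQL functions registered and `gdf_lines` is the GeoSeries shown above:
#
#   rows = to_wkt_rows(gdf_lines)
#   df = spark.createDataFrame(rows, ["name", "wkt"])
#   df = df.selectExpr("name", "ST_GeomFromWKT(wkt) AS geom")
#   df.show(3, False)
```

This would keep the index (Name) as an ordinary string column and defer all geometry handling to Sedona, but I don't know whether it is the recommended way.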