Hi.
I'm trying to convert a geopandas dataframe, gdf_lines, to line strings in
Sedona.

In: print(gdf_lines)
Out:

Name
Cuadrante 46                LINESTRING (-101.72470 21.13983, -101.72485 21...
Cuadrante 47                LINESTRING (-101.73533 21.13968, -101.73675 21...
Cuadrante 48                LINESTRING (-101.75310 21.15606, -101.75198 21...


In: type(gdf_lines)
Out: geopandas.geoseries.GeoSeries

In: gdf_lines.geom_type
Out:
Name
Cuadrante 46                LineString
Cuadrante 47                LineString
Cuadrante 48                LineString

I want to do something like the following example:
In:
from pyspark.sql.types import IntegerType, StructField, StructType
from sedona.sql.types import GeometryType
from shapely.geometry import LineString
schema = StructType(
    [
        StructField("id", IntegerType(), False),
        StructField("geom", GeometryType(), False)
    ]
)
line = [(40, 40), (30, 30), (40, 20), (30, 10)]
data = [[1, LineString(line)]]
gdf = spark.createDataFrame(data, schema)
gdf.show(1, False)

Out:
+---+---------------------------------------+
|id |geom                                   |
+---+---------------------------------------+
|1  |LINESTRING (40 40, 30 30, 40 20, 30 10)|
+---+---------------------------------------+

but it looks like I actually got a GeoSeries, without a column name for the geometry.

Do I need to convert it to a dataframe with a named geometry column first and
then apply a schema like in the example?
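
To make the question concrete, the longer route I have in mind would be roughly
the following. This is an untested sketch, not something I have confirmed works:
geoseries_to_rows and to_sedona_df are hypothetical helper names of my own, and
it assumes that iterating over the GeoSeries yields the shapely geometries and
that spark is a session with Sedona already registered.

```python
def geoseries_to_rows(series, start_id=1):
    """Build [id, geometry] rows from an iterable of geometries.

    Assumption: iterating over a GeoSeries yields its shapely geometries,
    so gdf_lines can be passed in directly.
    """
    return [[i, geom] for i, geom in enumerate(series, start=start_id)]


def to_sedona_df(spark, series):
    """Create a Spark DataFrame with the same schema as the example above.

    Imports are kept local so that defining the helper does not require
    pyspark or sedona to be installed.
    """
    from pyspark.sql.types import IntegerType, StructField, StructType
    from sedona.sql.types import GeometryType

    schema = StructType(
        [
            StructField("id", IntegerType(), False),
            StructField("geom", GeometryType(), False),
        ]
    )
    return spark.createDataFrame(geoseries_to_rows(series), schema)
```

I would then call to_sedona_df(spark, gdf_lines) and check the result with
show(). Note that this drops the Name index, which I would rather keep.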

Or could there be another shorter path?
