FengJiang2018 opened a new issue, #1059: URL: https://github.com/apache/sedona/issues/1059
## Expected behavior

Writing a GeoParquet file should generate the `geo` metadata, and reading it back with the following call should not raise an error:

``` python
df = sedona.read.format("geoparquet").load(path)
```

## Actual behavior

The GeoParquet file was created without `geo` metadata, and reading it with the following call raised an error:

``` python
df = sedona.read.format("geoparquet").load(path)
```

## Steps to reproduce the problem

It seems that when I use `df.write` to write a GeoParquet file, the geo metadata is not created for the Sedona geometry column. I am not sure whether I missed anything.

#1, I am using the Overture public dataset as input and build the DataFrame with a Sedona geometry column as follows:

``` python
from pyspark.sql.functions import expr

# Parse the WKB bytes in the raw "geometry" column into a Sedona geometry column
df_building = sedona.read.option("inferschema",True).parquet(inputpath) \
    .withColumn("geometry2",expr("ST_GeomFromWKB(geometry)"))
df_building.createOrReplaceTempView("rawdf")
```

#2, Yes, I am using the DataFrame API to write a GeoParquet file with a Sedona geometry type column on Databricks.

``` python
# Replace the raw WKB column with the Sedona geometry column, then write as GeoParquet
newdf = spark.sql("select *, ST_GeoHash(geometry2, 5) as geohash from rawdf order by geohash").drop("geometry").withColumnRenamed("geometry2", "geometry")
newdf.write.mode("overwrite").format("geoparquet") \
    .save(path+"/final1.parquet")
```

Here is what I saw from `printSchema`: the column shows as geometry type, and `nullable` is true, which I assume is expected. Correct me if this is wrong.
``` cmd
root
 |-- geometry: geometry (nullable = true)
 |-- geohash: string (nullable = true)
```

#3, I got the error when reading the GeoParquet file from #2 in the following way:

``` python
df = sedona.read.format("geoparquet").load(newpath)
```

But there is no read error if I use the following code, although **no geo metadata** could be found in the df schema:

``` python
df = sedona.read.format("geoparquet").parquet(newpath)
```

## Settings

Sedona version = 1.5.0

Apache Spark version = 3.4.0

Apache Flink version = N/A

API type = Python

Scala version = 2.12

JRE version = 1.8

Python version = 3.10

Environment = Azure Databricks, notebook
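
To confirm independently of Sedona whether a written file carries the `geo` key, the Parquet footer can be inspected directly. The following is a minimal sketch, not part of the original report. The helper names `check_geo_metadata` and `read_footer_metadata` are hypothetical. The list of required fields (`version`, `primary_column`, `columns`, and per column `encoding` and `geometry_types`) reflects my reading of the GeoParquet 1.0 specification. The footer reader assumes `pyarrow` is installed and that the path is reachable as a local file.

``` python
import json


def check_geo_metadata(key_value_metadata):
    """Return a list of problems found in a Parquet footer's key/value metadata.

    An empty list means the basic GeoParquet fields are present.
    `key_value_metadata` is a dict of bytes -> bytes (or None), which is the
    shape pyarrow exposes for footer metadata.
    """
    if not key_value_metadata or b"geo" not in key_value_metadata:
        return ["missing 'geo' key in file metadata"]
    try:
        geo = json.loads(key_value_metadata[b"geo"].decode("utf-8"))
    except ValueError as exc:  # covers both decode and JSON parse errors
        return [f"'geo' value is not valid UTF-8 JSON: {exc}"]
    if not isinstance(geo, dict):
        return ["'geo' value is not a JSON object"]

    problems = []
    # Top-level fields assumed to be required by the GeoParquet spec
    for field in ("version", "primary_column", "columns"):
        if field not in geo:
            problems.append(f"'geo' is missing required field '{field}'")

    columns = geo.get("columns", {})
    if not isinstance(columns, dict):
        problems.append("'columns' is not a JSON object")
        return problems

    primary = geo.get("primary_column")
    if primary is not None and primary not in columns:
        problems.append(f"primary_column '{primary}' is not listed in 'columns'")

    # Per-column fields assumed to be required by the GeoParquet spec
    for name, col in columns.items():
        if not isinstance(col, dict):
            problems.append(f"column '{name}' metadata is not a JSON object")
            continue
        for field in ("encoding", "geometry_types"):
            if field not in col:
                problems.append(f"column '{name}' is missing '{field}'")
    return problems


def read_footer_metadata(path):
    """Read footer key/value metadata from a single Parquet file (assumes pyarrow)."""
    import pyarrow.parquet as pq  # imported lazily so the checker has no dependency

    return pq.read_metadata(path).metadata
```

Because Spark writes a directory of part files, `read_footer_metadata` would need the path of one part file inside `final1.parquet`, not the directory itself. Calling `check_geo_metadata(read_footer_metadata(part_file))` on the output of #2 should then report the missing `geo` key if the metadata really was not written.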