[GitHub] [sedona] jiayuasu commented on issue #1031: Errors reading data written with sedona 1.3.1

via GitHub Tue, 26 Sep 2023 12:40:25 -0700


jiayuasu commented on issue #1031:
URL: https://github.com/apache/sedona/issues/1031#issuecomment-1736177682


   @sebbegg I wonder how you read the Parquet file in Sedona 1.4.1. The Spark 
`parquet` reader and writer do not understand GeometryUDT, so in 1.3.1 the 
geometry column in a Parquet file is just an array of bytes that represents 
the WKB format of the geometry.
   
   Maybe you can try using ST_GeomFromWKB in 1.4.1 to read the geometry column 
directly:
   
   ```
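   df = spark.read.format("parquet").XXX  # XXX stands for the load call, e.g. .load(path)
   # selectExpr parses the string as a SQL expression; select("...") would treat it as a column name
   df = df.selectExpr("ST_GeomFromWKB(geom) AS geom")
   df.show()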
   ```
   
   If this does not work, you might have to write a UDF to convert 
ArrayType[binary] to BinaryType. This might help: 
https://stackoverflow.com/questions/57847527/how-do-i-convert-arrayfloattype-to-binarytype-in-spark-dataframes-using-scala
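
A rough sketch of such a UDF in PySpark, assuming the column arrives as an array of signed bytes (Spark's ByteType ranges from -128 to 127) and is named `geom`. The helper names `signed_bytes_to_binary` and `add_binary_column`, and the output column name `geom_wkb`, are made up for illustration and are not part of Sedona:

```python
def signed_bytes_to_binary(values):
    """Convert a list of signed byte values (-128..127) to a Python bytes object."""
    if values is None:
        return None
    # Mask each value to 0..255 so that bytes() accepts it.
    return bytes(v & 0xFF for v in values)


def add_binary_column(df, src="geom", dst="geom_wkb"):
    """Add a BinaryType column `dst` built from the byte-array column `src`.

    Hypothetical helper: `src` and `dst` are assumed column names.
    """
    # Imported here so the pure-Python helper above works without Spark installed.
    from pyspark.sql import functions as F
    from pyspark.sql.types import BinaryType

    to_binary = F.udf(signed_bytes_to_binary, BinaryType())
    return df.withColumn(dst, to_binary(F.col(src)))
```

With that in place, something like `add_binary_column(df).selectExpr("ST_GeomFromWKB(geom_wkb) AS geom")` might work, provided the bytes really are WKB.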


