[ https://issues.apache.org/jira/browse/SEDONA-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584550#comment-17584550 ]
Doug Dennis commented on SEDONA-153:
------------------------------------

I'm not sure about the connection with ST_GeomFromWKT, but the exception I get when running test_null_deserializer is coming from the Python side. The deserialize method attempts to iterate over None. Here is some of the pytest output that demonstrates this:
{code:java}
    def deserialize(self, datum):
>       bytes_data = b''.join([struct.pack('b', el) for el in datum])
E       TypeError: 'NoneType' object is not iterable

sedona/sql/types.py:40: TypeError
{code}
My solution would be to add guards to GeometryType in Python, like what Spark does with their TimestampType:

[https://github.com/apache/spark/blob/master/python/pyspark/sql/types.py#L266]

[https://github.com/apache/spark/blob/master/python/pyspark/sql/types.py#L273]

> Python Serialization Fails with Nulls
> -------------------------------------
>
>                 Key: SEDONA-153
>                 URL: https://issues.apache.org/jira/browse/SEDONA-153
>             Project: Apache Sedona
>          Issue Type: Bug
>            Reporter: Doug Dennis
>            Priority: Major
>
> The following currently fail due to Shapely not liking nulls/Nones:
> {code:python}
> def test_null_deserializer(self):
>     result = self.spark.sql("select st_geomfromwkt(null)").collect()[0][0]
>     assert result is None
>
> def test_null_serializer(self):
>     data = [
>         [1, None]
>     ]
>     schema = t.StructType(
>         [
>             t.StructField("id", IntegerType(), True),
>             t.StructField("geom", GeometryType(), True),
>         ]
>     )
>     self.spark.createDataFrame(
>         data,
>         schema
>     ).createOrReplaceTempView("points")
>     # count(*) counts rows; a bare "count" would be read as a column name
>     count = self.spark.sql("select count(*) from points").collect()[0][0]
>     assert count == 1
> {code}
> The solution is to add some null guards to methods in the python GeometryType
> class. I can make a PR for this but I wasn't sure if I needed to wait for
> this issue to be approved or acknowledged or something :)
> Edit: I adjusted the deserializer test. I accidentally used a previous
> version that fails on analysis.
> This version fails when Python attempts to iterate over the None.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
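
For illustration, the null guards proposed in the comment could look roughly like the sketch below. This is not Sedona's actual code: NullSafeGeometryCodec is a hypothetical stand-in for the serialize/deserialize pair on GeometryType, and it works on raw bytes rather than Shapely geometries so that it runs without Spark or Shapely installed. Only the byte-joining expression is taken from the traceback; everything else is an assumption.

```python
import struct
from typing import List, Optional


class NullSafeGeometryCodec:
    """Hypothetical stand-in for GeometryType's serialize/deserialize pair.

    The real methods convert to and from geometry objects; here the payload
    is plain bytes so only the None handling is on display.
    """

    def serialize(self, payload: Optional[bytes]) -> Optional[List[int]]:
        # Guard: a null geometry passes through as None.
        if payload is None:
            return None
        # The traceback packs each element with format 'b' (signed byte),
        # so emit signed values in the range -128..127.
        return [struct.unpack('b', bytes([b]))[0] for b in payload]

    def deserialize(self, datum: Optional[List[int]]) -> Optional[bytes]:
        # Guard: without this check, iterating over None raises
        # TypeError: 'NoneType' object is not iterable.
        if datum is None:
            return None
        # Same expression as the failing line in the traceback.
        return b''.join([struct.pack('b', el) for el in datum])


codec = NullSafeGeometryCodec()
null_result = codec.deserialize(None)
round_trip = codec.deserialize(codec.serialize(b'\x00\x7f\x80\xff'))
```

The shape follows the TimestampType pattern linked in the comment: check for None first and only convert when a value is present, so nulls survive a round trip in both directions.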