Feng Zhang created SEDONA-700:
---------------------------------

             Summary: ST_KNN fails on null and empty geometries
                 Key: SEDONA-700
                 URL: https://issues.apache.org/jira/browse/SEDONA-700
             Project: Apache Sedona
          Issue Type: Bug
            Reporter: Feng Zhang

ST_KNN in Sedona 1.7.0 fails on null geometries:

{{import pyspark.sql.functions as f
df1 = spark.sql("SELECT ST_GeomFromText(col1) as geom1 from values ('POINT (0.0 0.0)'), (null)").cache()
df2 = spark.sql("SELECT ST_Point(0.0, 0.0) as geom2").cache()
df1.join(df2, f.expr("ST_KNN(geom1, geom2, 1)")).show()}}

{{java.lang.NullPointerException: Cannot read the array length because "this.bytes" is null
    at org.apache.sedona.common.geometrySerde.UnsafeGeometryBuffer.getLength(UnsafeGeometryBuffer.java:86)
    at org.apache.sedona.common.geometrySerde.GeometrySerializer.checkBufferSize(GeometrySerializer.java:428)
    at org.apache.sedona.common.geometrySerde.GeometrySerializer.deserialize(GeometrySerializer.java:73)
    at org.apache.sedona.common.geometrySerde.GeometrySerializer.deserialize(GeometrySerializer.java:69)
    at org.apache.sedona.common.geometrySerde.GeometrySerializer.deserialize(GeometrySerializer.java:65)
    at org.apache.sedona.sql.utils.GeometrySerializer$.deserialize(GeometrySerializer.scala:50)
    at org.apache.spark.sql.sedona_sql.strategy.join.TraitJoinQueryBase.$anonfun$toSpatialRDD$1(TraitJoinQueryBase.scala:52)}}

And on empty geometries:

{{df1 = spark.sql("SELECT ST_GeomFromText('POINT EMPTY') as geom1").cache()
df2 = spark.sql("SELECT ST_Point(0.0, 0.0) as geom2").cache()
df1.join(df2, f.expr("ST_KNN(geom1, geom2, 1)")).show()}}

{{java.lang.IllegalStateException: getX called on empty Point
    at org.locationtech.jts.geom.Point.getX(Point.java:100)
    at org.apache.sedona.common.utils.HalfOpenRectangle.contains(HalfOpenRectangle.java:32)
    at org.apache.sedona.core.spatialPartitioning.quadtree.ExtendedQuadTree.placeObject(ExtendedQuadTree.java:150)
    at org.apache.sedona.core.spatialPartitioning.QuadTreeRTPartitioner.placeObject(QuadTreeRTPartitioner.java:89)
    at org.apache.sedona.core.spatialRDD.SpatialRDD$3.call(SpatialRDD.java:299)
    at org.apache.sedona.core.spatialRDD.SpatialRDD$3.call(SpatialRDD.java:296)
    at org.apache.spark.api.java.JavaRDDLike.$anonfun$flatMapToPair$1(JavaRDDLike.scala:143)
    at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
    at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:140)
    at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:104)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)}}

Converted from: https://github.com/apache/sedona/issues/1757
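
Until the join handles these inputs itself, one possible workaround is to drop null and empty geometries before calling ST_KNN. The sketch below is untested against Sedona 1.7.0 and rests on assumptions: the helper names `non_empty_geometry_predicate` and `knn_join_skipping_invalid` are hypothetical (not part of Sedona), the column names `geom1` and `geom2` are taken from the reproduction above, and it assumes that filtering on `IS NOT NULL` and Sedona's `ST_IsEmpty` is enough to avoid both exceptions.

```python
def non_empty_geometry_predicate(column: str) -> str:
    """Build a Spark SQL predicate keeping only non-null, non-empty geometries.

    Hypothetical helper, not part of Sedona. ST_IsEmpty is a Sedona SQL function.
    """
    return f"{column} IS NOT NULL AND NOT ST_IsEmpty({column})"


def knn_join_skipping_invalid(df1, df2, k=1):
    """Run the ST_KNN join from the report after filtering both inputs.

    Assumes df1 has a geometry column `geom1` and df2 has `geom2`,
    as in the reproduction, and that Sedona is registered on the session.
    """
    # Imported lazily so the predicate helper works without PySpark installed.
    from pyspark.sql import functions as f

    clean1 = df1.filter(non_empty_geometry_predicate("geom1"))
    clean2 = df2.filter(non_empty_geometry_predicate("geom2"))
    return clean1.join(clean2, f.expr(f"ST_KNN(geom1, geom2, {k})"))
```

With the DataFrames from the reproduction, this would be called as `knn_join_skipping_invalid(df1, df2, 1).show()`. Rows with null or empty geometries are silently dropped from the result, which may not match the semantics a proper fix would choose.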