Martin Andersson created SEDONA-266:
---------------------------------------

             Summary: RS_Values throws UnsupportedOperationException for 
shuffled point arrays
                 Key: SEDONA-266
                 URL: https://issues.apache.org/jira/browse/SEDONA-266
             Project: Apache Sedona
          Issue Type: Bug
            Reporter: Martin Andersson


RS_Values(raster, array(st_point(1,1))) works well, but if the point array has 
been serialized (for instance, because of a shuffle), an 
UnsupportedOperationException is thrown.

Spark has several ArrayData implementations. GenericArrayData is commonly 
returned from expressions. Once serialized it is converted to an 
UnsafeArrayData. UnsafeArrayData throws an exception if the "array" method is 
called. See 
[https://github.com/apache/spark/blob/94de3ca2942bb04852510abccf06df1fa8b2dab3/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeArrayData.java#L102]
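
The difference between the two implementations can be reproduced outside Sedona. The following is a minimal sketch, assuming only Spark Catalyst on the classpath. It uses doubles instead of Sedona's serialized geometries to keep the example small, and readDoubles is a hypothetical helper, not Sedona's actual fix. It reads elements through the accessors that every ArrayData implementation supports instead of calling "array".
{code:scala}
import org.apache.spark.sql.catalyst.expressions.UnsafeArrayData
import org.apache.spark.sql.catalyst.util.{ArrayData, GenericArrayData}

// GenericArrayData is backed by an Array[Any], so "array" is available.
val generic: ArrayData = new GenericArrayData(Array[Any](1.0, 2.0, 3.0))
val backing: Array[Any] = generic.array

// UnsafeArrayData is backed by raw bytes; calling "array" on it throws
// java.lang.UnsupportedOperationException.
val unsafe: ArrayData = UnsafeArrayData.fromPrimitiveArray(Array(1.0, 2.0, 3.0))
// unsafe.array  // would throw

// Hypothetical helper: index-based access works for both implementations.
def readDoubles(data: ArrayData): Seq[Double] =
  (0 until data.numElements()).map(i => data.getDouble(i))

val fromGeneric = readDoubles(generic)
val fromUnsafe = readDoubles(unsafe)
{code}
An expression that reads its array argument this way does not depend on which ArrayData implementation Spark hands it.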
 
The relevant stack trace in Sedona:
{code:java}
Caused by: java.lang.UnsupportedOperationException: Not supported on 
UnsafeArrayData.
    at 
org.apache.spark.sql.catalyst.expressions.UnsafeArrayData.array(UnsafeArrayData.java:102)
    at 
org.apache.spark.sql.sedona_sql.expressions.raster.RS_Values.eval(Functions.scala:897)
{code}
 

It is possible to work around the bug by adding a bogus array operation to 
convert the UnsafeArrayData back to a GenericArrayData again after shuffling.
{code:java}
expr("RS_Values(raster, points)")
{code}
Becomes
{code:java}
expr("RS_Values(raster, filter(points, x -> true))")
{code}
The filter function won't change the array. The only purpose is to internally 
convert the UnsafeArrayData back to a GenericArrayData.
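
As a sketch of how the workaround might be applied to a whole DataFrame: the helper name withRasterValues and the output column name "values" are hypothetical, and the input is assumed to have "raster" and "points" columns with the Sedona SQL functions already registered on the session.
{code:scala}
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.expr

// Hypothetical helper: wraps the point array in a no-op filter so that
// RS_Values receives a GenericArrayData even after a shuffle.
def withRasterValues(df: DataFrame): DataFrame =
  df.withColumn("values", expr("RS_Values(raster, filter(points, x -> true))"))
{code}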



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
