Kristin Cowalcijk created SEDONA-319:
----------------------------------------
Summary: RS_BandAsArray does not always produce serializable rasters
Key: SEDONA-319
URL: https://issues.apache.org/jira/browse/SEDONA-319
Project: Apache Sedona
Issue Type: Bug
Reporter: Kristin Cowalcijk
Sometimes {{RS_BandAsArray}} produces rasters that cannot be serialized. As far
as we know, adding a new band to a raster with UInt8 pixel values always
produces a non-serializable result. The following snippet reproduces the problem
using a GeoTIFF image in core/src/test/resources/:
{code:scala}
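// Load the test GeoTIFF as raw binary content
var df = sparkSession.read.format("binaryFile").load(resourceFolder + "raster/test3.tif")
// Decode the raster and extract band 1 as an array
df = df.selectExpr("RS_FromGeoTiff(content) as raster",
  "RS_BandAsArray(RS_FromGeoTiff(content), 1) as band")
// Add the array back as band 2; serializing the resulting raster fails
df = df.selectExpr("RS_AddBandFromArray(raster, band, 2)")
df.collect()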
{code}
The stacktrace is as follows:
{code}
Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0) (bogon executor driver): java.lang.IllegalArgumentException: No Serializers available for the ColorModel.
    at javax.media.jai.remote.SerializableRenderedImage.<init>(SerializableRenderedImage.java:507)
    at javax.media.jai.remote.SerializableRenderedImage.<init>(SerializableRenderedImage.java:390)
    at org.apache.sedona.common.raster.Serde.serialize(Serde.java:35)
    at org.apache.spark.sql.sedona_sql.expressions.raster.RS_AddBandFromArray.$anonfun$eval$1(MapAlgebra.scala:814)
    at scala.Option.map(Option.scala:230)
    at org.apache.spark.sql.sedona_sql.expressions.raster.RS_AddBandFromArray.eval(MapAlgebra.scala:814)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
    at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
    at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:365)
    at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:890)
    at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:890)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
    at org.apache.spark.scheduler.Task.run(Task.scala:136)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
{code}
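As background for reading the trace: the exception is raised in the {{SerializableRenderedImage}} constructor, reached from {{Serde.serialize}}, and complains about the image's ColorModel. The standard {{java.awt.image}} color model classes do not implement {{java.io.Serializable}}, so a wrapper that serializes an image presumably needs a dedicated serializer for the specific color model type it encounters. The sketch below only illustrates that JDK fact. It does not use Sedona or JAI, the object name {{ColorModelCheck}} and its helpers are hypothetical, and the 8-bit grayscale {{ComponentColorModel}} is an assumed stand-in; which concrete ColorModel class is attached to the failing raster is not known from this report.

```scala
import java.awt.Transparency
import java.awt.color.ColorSpace
import java.awt.image.{ColorModel, ComponentColorModel, DataBuffer}

// Hypothetical helper, not part of Sedona or JAI.
object ColorModelCheck {

  // Build a single-band, 8-bit grayscale color model (an assumed stand-in
  // for what a UInt8 band might carry; the real class may differ).
  def grayByteColorModel(): ColorModel = {
    val gray = ColorSpace.getInstance(ColorSpace.CS_GRAY)
    new ComponentColorModel(gray, false, false, Transparency.OPAQUE, DataBuffer.TYPE_BYTE)
  }

  // True if the object can be written with plain Java serialization.
  def isJavaSerializable(obj: AnyRef): Boolean =
    obj.isInstanceOf[java.io.Serializable]

  def main(args: Array[String]): Unit = {
    val cm = grayByteColorModel()
    println(s"${cm.getClass.getName} implements java.io.Serializable: ${isJavaSerializable(cm)}")
  }
}
```

Running the same check against the ColorModel of the raster returned by {{RS_AddBandFromArray}} (for example by printing its class name before {{Serde.serialize}} is called) may help narrow down which color model type lacks a serializer.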