Jia Yu created SEDONA-494:
-----------------------------

             Summary: Raster data source cannot write to HDFS
                 Key: SEDONA-494
                 URL: https://issues.apache.org/jira/browse/SEDONA-494
             Project: Apache Sedona
          Issue Type: Bug
            Reporter: Jia Yu


h2. Reproduce

 

When running the following code:

var df = spark.read.format("binaryFile").load("/user/spark/raster/input.tif")
df.write.format("raster").mode(org.apache.spark.sql.SaveMode.Overwrite).save("output")

 

Only a _SUCCESS file is found in the output path.

The HDFS audit.log shows that the TIFF file was created, but there is no 'rename' command for it.
The executor log contains:
"SparkHadoopMapRedUtil: No need to commit output of task because needsTaskCommit=false: attempt_xxx
BasicWriteTaskStatsTracker: Expected 1 files, but only saw 0. This could be due to the output format not writing empty files, or files being not immediately visible in the filesystem."

 
h2. Solution:

In "org.apache.spark.sql.sedona_sql.io.raster.RasterFileFormat.scala", changing
val out = hfs.create(new Path(Paths.get(savePath, new Path(rasterFilePath).getName).toString))
to
val out = hfs.create(new Path(savePath, new Path(rasterFilePath).getName))
solves the problem.

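Paths.get (java.nio.file) builds paths for the local file system and should not be used to build paths for Hadoop FileSystem implementations; use org.apache.hadoop.fs.Path instead.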
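
The difference between the two ways of joining the path can be seen in a small standalone sketch. It assumes hadoop-common is on the classpath; the object name PathJoinDemo, the helper names, and the namenode:8020 URIs are hypothetical and only for illustration. This is not the actual RasterFileFormat code.

```scala
import java.nio.file.Paths
import org.apache.hadoop.fs.Path

// Illustrative sketch only: names and URIs below are hypothetical.
object PathJoinDemo {

  // Old approach: join with java.nio, then wrap the resulting string in a Hadoop Path.
  // java.nio parses savePath as a local file system path, not as a URI.
  def joinWithNio(savePath: String, rasterFilePath: String): Path =
    new Path(Paths.get(savePath, new Path(rasterFilePath).getName).toString)

  // Proposed approach: let Hadoop's Path(parent, child) resolve the child
  // against the parent, keeping the parent's scheme and authority.
  def joinWithHadoop(savePath: String, rasterFilePath: String): Path =
    new Path(savePath, new Path(rasterFilePath).getName)

  def main(args: Array[String]): Unit = {
    val savePath = "hdfs://namenode:8020/user/spark/output"
    val rasterFilePath = "hdfs://namenode:8020/user/spark/raster/input.tif"

    println(s"java.nio join: ${joinWithNio(savePath, rasterFilePath)}")
    println(s"Hadoop join:   ${joinWithHadoop(savePath, rasterFilePath)}")
  }
}
```

On a Unix-like JVM the java.nio join is expected to collapse the "//" after the scheme, so the resulting Path no longer carries the namenode authority, while the Hadoop join keeps the URI intact. On other platforms Paths.get may reject such a string outright. Whether this is exactly what happens inside the writer depends on what savePath holds at runtime, which the sketch does not reproduce.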



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
