JimShady commented on issue #1047:
URL: https://github.com/apache/sedona/issues/1047#issuecomment-1770368349
I am getting there, I think. This is my complete code now:
```
from pyspark.sql.functions import first

# Load every matching GeoTIFF as binary and decode it into a raster column.
my_files = sedona.read.format("binaryFile").load("/tmp/PE_FLRF_UD_Q*")
my_files.createOrReplaceTempView("my_files")
my_files_new = spark.sql("""
    SELECT 1 AS id, RS_FromGeoTiff(content) AS raster,
           modificationTime, length, path
    FROM my_files
""")

# Pivot so that each file path becomes its own raster column.
pivot_df = my_files_new.groupBy("id").pivot("path").agg(first("raster"))

# Clean the pivoted column names so they can be referenced in SQL.
newColumns = []
problematic_chars = './,;{}()=:'
for column in pivot_df.columns:
    column = column.lower()
    column = column.replace(' ', '_')
    for c in problematic_chars:
        column = column.replace(c, '')
    newColumns.append(column)
new_pivot_df = pivot_df.toDF(*newColumns)
new_pivot_df.createOrReplaceTempView("new_pivot_df")

# Stack the six rasters into a single multi-band raster.
added_df = sedona.sql("""
    SELECT RS_AddBand(RS_AddBand(RS_AddBand(RS_AddBand(RS_AddBand(
               dbfstmppe_flrf_ud_q20_re_02tif,
               dbfstmppe_flrf_ud_q50_re_02tif),
               dbfstmppe_flrf_ud_q100_re_02tif),
               dbfstmppe_flrf_ud_q200_re_02tif),
               dbfstmppe_flrf_ud_q500_re_02tif),
               dbfstmppe_flrf_ud_q1500_re_02tif) AS raster
    FROM new_pivot_df
""")
added_df.createOrReplaceTempView("added_df")

# Run the map algebra expression across the stacked bands.
output = sedona.sql("""
    SELECT RS_MapAlgebra(raster, 'D',
        'out = rast[5] > 10 && rast[4] < 4 ? (rast[0] + rast[1] + rast[2]) / rast[3] : 0'
    ) raster
    FROM added_df
""")
output.createOrReplaceTempView("output")

# Encode the result as GeoTIFF and write it out.
new = sedona.sql("SELECT RS_AsGeoTiff(raster, 'LZW', '0.75') FROM output")
new.write.format("raster").save("my_raster_file.tif")
```
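The column-renaming loop above can be checked on its own, without Spark. This is only a minimal sketch: the helper name `sanitize_column` is made up for illustration, and the example path is an assumption, picked because it reduces to the first column name used in the `RS_AddBand` query. The real values in the `path` column may look different.

```python
# Same characters that the renaming loop strips out.
PROBLEMATIC_CHARS = "./,;{}()=:"


def sanitize_column(name: str) -> str:
    """Lower-case the name, turn spaces into underscores, drop problematic characters."""
    name = name.lower().replace(" ", "_")
    for c in PROBLEMATIC_CHARS:
        name = name.replace(c, "")
    return name


# Hypothetical file path (an assumption, not taken from the actual job).
example_path = "dbfs:/tmp/PE_FLRF_UD_Q20_RE_02.tif"
print(sanitize_column(example_path))  # dbfstmppe_flrf_ud_q20_re_02tif
```

Printing the sanitized names like this before writing the SQL makes it easier to confirm that the column names in the `RS_AddBand` query actually exist in `new_pivot_df`.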
It seems to run fine until I try to save to file. I get "java.lang.OutOfMemoryError: Java heap space":
```
Py4JJavaError: An error occurred while calling o6369.save.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 52.0 failed 4 times, most recent failure: Lost task 3.3 in stage 52.0 (TID 91) (10.130.180.118 executor 8): java.lang.OutOfMemoryError: Java heap space
```