Hi Martin,

Sorry for the late reply. I was totally swamped by my other commitments in the past few days.

I think Option 2 makes more sense to me, since we can assume there is only one RasterUDT column. We will allow users to specify the file name column. In the current GeoTiff writer, there is an option reserved for this purpose:
https://github.com/apache/sedona/blob/master/sql/src/main/scala/org/apache/spark/sql/sedona_sql/io/ImageWriteOptions.scala#L29
Do you think you can re-use it?

We use the same GeoTiff writer. If the DF has a RasterUDT column, it will be saved as GeoTiff. If the DF has a DOUBLE array column, it will also be saved as GeoTiff.

Thanks,
Jia

On Fri, Feb 24, 2023 at 8:03 AM Martin Andersson <[email protected]> wrote:
> Hi,
>
> The recent merge of the new raster type in
> https://github.com/apache/sedona/pull/773 opens up the possibility of
> adapting the geotiff data source to write rasters. To achieve this, we can
> modify the data source to include two modes - classic (the current mode)
> and raster. The mode selection can be automatic, with the data source
> switching to raster mode if the data frame does not meet the requirements
> of classic mode.
>
> In raster mode, the writer would require a raster column and an optional
> filename column. If a filename is not provided, it could be generated using
> the upper-left corner of the envelope and a uuid. For instance,
> "ul_6490550_1338130_e89b4567-e89b-12d3-a456-426614174000.tiff". The
> challenge is to inform the writer about the relevant columns to use.
>
> Option 1. Use columns with specific names like classic mode. Columns should
> be named "filename" and "raster". If those are not found in the data frame
> the writer will throw an exception.
>
> Option 2. If there is exactly one column of type raster use that,
> regardless of its name. Add a parameter to set the filename column.
>
> This would work for any data frame containing a raster column:
> df.write.format("geotiff").save("DESTINATION_PATH")
>
> If you want to provide a filename:
> df.write.format("geotiff").option("filenameColumn",
> "my_filename_column").save("DESTINATION_PATH")
>
> Option 3. Other?
>
> What are your thoughts?
>
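
To make Option 2 concrete, here is a minimal sketch of the two decisions the writer would have to make: picking the single raster column and generating a default filename. It is plain Python with no Spark or Sedona dependency, and it is not the actual writer code. The function names, the representation of a schema as (column name, type name) pairs, the "raster" type label, and the truncation of the corner coordinates to integers are all assumptions made for illustration.

```python
import uuid


def resolve_raster_column(schema):
    """Return the name of the only raster column in the schema.

    `schema` is a list of (column_name, type_name) pairs. The type label
    "raster" is a hypothetical stand-in for the real raster type.
    """
    raster_cols = [name for name, type_name in schema if type_name == "raster"]
    if len(raster_cols) != 1:
        # Option 2 only works when the choice is unambiguous.
        raise ValueError(
            f"expected exactly one raster column, found {len(raster_cols)}"
        )
    return raster_cols[0]


def default_filename(ul_x, ul_y):
    """Build a filename from the upper-left corner of the envelope and a uuid.

    Truncating the coordinates to integers is an assumption based on the
    example filename in the proposal.
    """
    return f"ul_{int(ul_x)}_{int(ul_y)}_{uuid.uuid4()}.tiff"


if __name__ == "__main__":
    schema = [("id", "long"), ("tile", "raster")]
    print(resolve_raster_column(schema))  # tile
    print(default_filename(6490550, 1338130))
```

Raising an error when zero or several raster columns are present keeps the automatic selection predictable. A user with several raster columns would have to select one before writing.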
