Hi Martin,

Sorry for the late reply. I was totally swamped by my other commitments
over the past few days.

Option 2 makes more sense to me, since we can assume there is only one
RasterUDT column. We will allow users to specify the file name column. In
the current GeoTiff writer, there is an option reserved for this purpose:
https://github.com/apache/sedona/blob/master/sql/src/main/scala/org/apache/spark/sql/sedona_sql/io/ImageWriteOptions.scala#L29

Do you think you can re-use it? We use the same GeoTiff writer in both
cases. If the DataFrame has a RasterUDT column, it will be saved as
GeoTiff. If the DataFrame has a DOUBLE array column, it will also be saved
as GeoTiff.
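
To make the Option 2 column lookup concrete, here is a rough sketch in
plain Python. It is not Sedona code: the function name, the "raster" type
label, and the (name, type) schema representation are all made up for
illustration, and the real writer would inspect Spark DataTypes instead.

```python
from typing import List, Optional, Tuple

# Hypothetical type label; a real writer would check for the RasterUDT type.
RASTER_TYPE = "raster"


def resolve_raster_columns(
    schema: List[Tuple[str, str]],
    filename_column: Optional[str] = None,
) -> Tuple[str, Optional[str]]:
    """Return (raster column name, filename column name or None).

    schema is a list of (column name, type label) pairs.
    """
    # Option 2: use the raster column regardless of its name,
    # but only if there is exactly one such column.
    raster_cols = [name for name, dtype in schema if dtype == RASTER_TYPE]
    if len(raster_cols) != 1:
        raise ValueError(
            f"expected exactly one raster column, found {len(raster_cols)}"
        )

    # The filename column is optional; if given, it must exist.
    names = [name for name, _ in schema]
    if filename_column is not None and filename_column not in names:
        raise ValueError(f"filename column {filename_column!r} not found")

    return raster_cols[0], filename_column
```

For example, resolve_raster_columns([("id", "int"), ("rast", "raster")])
picks "rast" and leaves the filename to be generated by the writer.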

Thanks,
Jia




On Fri, Feb 24, 2023 at 8:03 AM Martin Andersson <
[email protected]> wrote:

> Hi,
>
> The recent merge of the new raster type in
> https://github.com/apache/sedona/pull/773 opens up the possibility of
> adapting the geotiff data source to write rasters. To achieve this, we can
> modify the data source to include two modes - classic (the current mode)
> and raster. The mode selection can be automatic, with the data source
> switching to raster mode if the data frame does not meet the requirements
> of classic mode.
>
> In raster mode, the writer would require a raster column and an optional
> filename column. If a filename is not provided, it could be generated using
> the upper-left corner of the envelope and a uuid. For instance,
> "ul_6490550_1338130_e89b4567-e89b-12d3-a456-426614174000.tiff". The
> challenge is to inform the writer about the relevant columns to use.
>
> Option 1. Use columns with specific names like classic mode. Columns should
> be named "filename" and "raster". If those are not found in the data frame
> the writer will throw an exception.
>
> Option 2. If there is exactly one column of type raster, use that,
> regardless of its name. Add a parameter to set the filename column.
>
> This would work for any data frame containing a raster column:
> df.write.format("geotiff").save("DESTINATION_PATH")
>
> If you want to provide a filename:
> df.write.format("geotiff").option("filenameColumn",
> "my_filename_column").save("DESTINATION_PATH")
>
> Option 3. Other?
>
> What are your thoughts?
>
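
The generated filename described in the quoted message could be built
roughly as follows. This is only a sketch: the function name is
hypothetical, truncating the upper-left coordinates to integers is an
assumption based on the example name, and uuid4 is just one possible
choice of UUID.

```python
import uuid


def default_filename(ul_x: float, ul_y: float) -> str:
    """Build a name like ul_<x>_<y>_<uuid>.tiff from the envelope's
    upper-left corner."""
    # Assumption: coordinates are truncated to integers, as in the example
    # "ul_6490550_1338130_<uuid>.tiff".
    return f"ul_{int(ul_x)}_{int(ul_y)}_{uuid.uuid4()}.tiff"
```

The random UUID keeps names unique even when two rasters share the same
upper-left corner.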
