[sedona] 02/03: Add docs

jiayu Thu, 11 May 2023 01:39:01 -0700

This is an automated email from the ASF dual-hosted git repository.

jiayu pushed a commit to branch geotiff-enhance
in repository https://gitbox.apache.org/repos/asf/sedona.git


commit 7128ca72e1a6c96f92b1c2aafefa07c1984bccb8
Author: Jia Yu <[email protected]>
AuthorDate: Thu May 11 01:37:39 2023 -0700

    Add docs
---
 docs/api/sql/Raster-writer.md | 69 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)

diff --git a/docs/api/sql/Raster-writer.md b/docs/api/sql/Raster-writer.md
index a2663c31..70f9584b 100644
--- a/docs/api/sql/Raster-writer.md
+++ b/docs/api/sql/Raster-writer.md
@@ -1,6 +1,75 @@
 !!!note
        Sedona writers are available in Scala, Java and Python and have the same APIs.
        
+## Write RasterUDT to raster files
+
+Introduction: You can write a Sedona Raster DataFrame to any supported raster format using Sedona's built-in `raster` data source. With this, you can even read GeoTiff rasters and write them out as ArcGrid rasters. Note that the `raster` data source does not support reading rasters; use Spark's built-in `binaryFile` data source together with Sedona RS constructors to read them.
+
+Since: `v1.4.1`
+
+Available options:
+
+* rasterType
+       * mandatory
+       * Allowed values: `geotiff`, `arcgrid`
+* pathField
+       * optional. If you use this option, the column it names must exist in the DataFrame schema. If this option is not used, each produced raster image will have a random UUID file name.
+       * Allowed values: the name of any column that holds the path of each raster file
+
+The Raster DataFrame to be written must have one of the following two schemas:
+
+```html
+root
+ |-- rs_fromgeotiff(content): raster (nullable = true)
+```
+
+or
+
+```html
+root
+ |-- rs_fromgeotiff(content): raster (nullable = true)
+ |-- path: string (nullable = true)
+```
+
+Spark SQL example 1:
+
+```scala
+rasterDf.write.format("raster").option("rasterType", "geotiff").mode(SaveMode.Overwrite).save("my_raster_file") // rasterDf: placeholder name for the Raster DataFrame to write
+```
+
+Spark SQL example 2:
+
+```scala
+rasterDf.write.format("raster").option("rasterType", "geotiff").option("pathField", "path").mode(SaveMode.Overwrite).save("my_raster_file") // rasterDf: placeholder name for the Raster DataFrame to write
+```
+
+The produced file structure will look like this:
+
+```html
+my_raster_file
+- part-00000-6c7af016-c371-4564-886d-1690f3b27ca8-c000
+       - test1.tiff
+       - .test1.tiff.crc
+- part-00001-6c7af016-c371-4564-886d-1690f3b27ca8-c000
+       - test2.tiff
+       - .test2.tiff.crc
+- part-00002-6c7af016-c371-4564-886d-1690f3b27ca8-c000
+       - test3.tiff
+       - .test3.tiff.crc
+- _SUCCESS
+```
+
+To read the files back into a Sedona Raster DataFrame, use the following command (note the `*` in the path):
+
+```scala
+sparkSession.read.format("binaryFile").load("my_raster_file/*")
+```
+
+You can then construct the Raster type in Sedona with `RS_FromGeoTiff(content)` (if the data was written in GeoTiff format).
+
+The newly created DataFrame can be written to disk again, but it must be saved under a different name, such as `my_raster_file_modified`.
+
+
 ## Write Array[Double] to GeoTiff files
 
 Introduction: You can write a GeoTiff dataframe as GeoTiff images using the spark `write` feature with the format `geotiff`. The geotiff raster column needs to be an array of double type data.
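
The output layout described in the added docs (one `part-*` directory per partition, each holding a raster file plus a hidden `.crc` checksum file, and a top-level `_SUCCESS` marker) can be inspected without Spark. The following is only a minimal sketch in plain Python and is not part of the commit: the helper names `list_raster_files` and `build_demo_layout` are hypothetical, and the directory names are assumed to follow the listing shown in the diff.

```python
from pathlib import Path
import tempfile


def list_raster_files(output_dir):
    """Return raster files found under output_dir/part-*/ (hypothetical helper).

    Hidden checksum files (".<name>.crc") and the top-level "_SUCCESS"
    marker are skipped, mirroring what `my_raster_file/*` is meant to pick up.
    """
    root = Path(output_dir)
    rasters = []
    for part_dir in sorted(root.glob("part-*")):
        if not part_dir.is_dir():
            continue
        for entry in sorted(part_dir.iterdir()):
            # Skip hidden files such as ".test1.tiff.crc".
            if entry.is_file() and not entry.name.startswith("."):
                rasters.append(entry)
    return rasters


def build_demo_layout(base_dir):
    """Create a small fake output tree (assumed layout) and return its root."""
    root = Path(base_dir) / "my_raster_file"
    for index, name in enumerate(["test1.tiff", "test2.tiff"]):
        part_dir = root / f"part-{index:05d}-demo-c000"
        part_dir.mkdir(parents=True)
        (part_dir / name).write_bytes(b"")            # stand-in for a raster
        (part_dir / f".{name}.crc").write_bytes(b"")  # stand-in for a checksum
    (root / "_SUCCESS").write_bytes(b"")
    return root


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo_root = build_demo_layout(tmp)
        for raster in list_raster_files(demo_root):
            print(raster.relative_to(demo_root))
```

The demo tree contains empty placeholder files only; real output would hold the GeoTiff or ArcGrid bytes written by the `raster` data source.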
