This is an automated email from the ASF dual-hosted git repository.
jiayu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/sedona.git
The following commit(s) were added to refs/heads/master by this push:
new e9d1f6fbc3 [DOCS] Improve docs and fix deadlinks (#1846)
e9d1f6fbc3 is described below
commit e9d1f6fbc3979491464087efa6a5644f8c26a61c
Author: Jia Yu <[email protected]>
AuthorDate: Mon Mar 10 00:57:20 2025 -0700
[DOCS] Improve docs and fix deadlinks (#1846)
---
.github/linters/.markdown-lint.yml | 6 +
docs/api/sql/Function.md | 2 +-
docs/api/sql/Raster-map-algebra.md | 4 +-
docs/api/sql/Spider.md | 2 +-
docs/setup/databricks.md | 57 +----
docs/setup/docker.md | 4 +-
docs/setup/maven-coordinates.md | 46 ----
docs/setup/release-notes.md | 2 +-
docs/tutorial/files/geoparquet-sedona-spark.md | 10 +-
docs/tutorial/files/shapefiles-sedona-spark.md | 8 +-
.../files/stac-sedona-spark.md} | 142 +++++------
docs/tutorial/python-vector-osm.md | 159 ------------
docs/tutorial/raster.md | 83 +------
docs/tutorial/rdd.md | 4 +-
docs/tutorial/snowflake/sql.md | 2 +-
docs/tutorial/sql.md | 271 +++------------------
mkdocs.yml | 15 +-
17 files changed, 135 insertions(+), 682 deletions(-)
diff --git a/.github/linters/.markdown-lint.yml
b/.github/linters/.markdown-lint.yml
index 7c2ae00edc..9ce345ed0f 100644
--- a/.github/linters/.markdown-lint.yml
+++ b/.github/linters/.markdown-lint.yml
@@ -17,6 +17,9 @@
# https://github.com/DavidAnson/markdownlint#rules--aliases
+# ul-style Unordered list style
+MD004: false
+
# ul-indent - Unordered list indentation
MD007: false
@@ -55,3 +58,6 @@ MD041: false
# code-block-style - Code block style
MD046: false
+
+# link-fragments Link fragments should be valid
+MD051: false
diff --git a/docs/api/sql/Function.md b/docs/api/sql/Function.md
index 51f63f3a13..2dc28f21ce 100644
--- a/docs/api/sql/Function.md
+++ b/docs/api/sql/Function.md
@@ -460,7 +460,7 @@ POINT ZM(1 1 1 1)
## ST_AsGeoJSON
!!!note
- This method is not recommended. Please use [Sedona GeoJSON data
source](../../tutorial/sql.md#save-as-geojson) to write GeoJSON files.
+ This method is not recommended. Please use [Sedona GeoJSON data
source](../../tutorial/sql.md#save-geojson) to write GeoJSON files.
Introduction: Return the [GeoJSON](https://geojson.org/) string representation
of a geometry
diff --git a/docs/api/sql/Raster-map-algebra.md
b/docs/api/sql/Raster-map-algebra.md
index 5b22280ee6..70b7e52eef 100644
--- a/docs/api/sql/Raster-map-algebra.md
+++ b/docs/api/sql/Raster-map-algebra.md
@@ -34,7 +34,7 @@ RS_MapAlgebra(rast: Raster, pixelType: String, script:
String, [noDataValue: Dou
* `rast`: The raster to apply the map algebra expression to.
* `pixelType`: The data type of the output raster. This can be one of `D`
(double), `F` (float), `I` (integer), `S` (short), `US` (unsigned short) or `B`
(byte). If specified `NULL`, the output raster will have the same data type as
the input raster.
-* `script`: The map algebra script. [Refer here for more details on the
format.](#:~:text=The Jiffle script is,current output pixel value)
+* `script`: The map algebra script. [Refer here for more details on the
format.](https://github.com/geosolutions-it/jai-ext/wiki/Jiffle)
* `noDataValue`: (Optional) The nodata value of the output raster.
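For illustration, a single-raster call might look like the following sketch (the view name `raster_table`, the raster column `rast`, and the NDVI band indices are assumptions, not part of this page):

```python
# Hedged sketch: single-raster RS_MapAlgebra producing an NDVI band.
# Assumes a view `raster_table` with a raster column `rast` whose bands 0 and 3
# hold the red and near-infrared values; adjust the indices to your data.
ndvi_df = sedona.sql("""
    SELECT RS_MapAlgebra(rast, 'D', 'out = (rast[3] - rast[0]) / (rast[3] + rast[0]);') AS ndvi
    FROM raster_table
""")
ndvi_df.show()
```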
As of version `v1.5.1`, the `RS_MapAlgebra` function allows two raster column
inputs, with multi-band rasters supported. The function accepts 5 parameters:
@@ -46,7 +46,7 @@ RS_MapAlgebra(rast0: Raster, rast1: Raster, pixelType:
String, script: String, n
* `rast0`: The first raster to apply the map algebra expression to.
* `rast1`: The second raster to apply the map algebra expression to.
* `pixelType`: The data type of the output raster. This can be one of `D`
(double), `F` (float), `I` (integer), `S` (short), `US` (unsigned short) or `B`
(byte). If specified `NULL`, the output raster will have the same data type as
the input raster.
-* `script`: The map algebra script. [Refer here for more details on the
format.](#:~:text=The Jiffle script is,current output pixel value)
+* `script`: The map algebra script. [Refer here for more details on the
format.](https://github.com/geosolutions-it/jai-ext/wiki/Jiffle)
* `noDataValue`: (Not optional) The nodata value of the output raster, `null`
is allowed.
Spark SQL Example for two raster input `RS_MapAlgebra`:
diff --git a/docs/api/sql/Spider.md b/docs/api/sql/Spider.md
index 5dfa6569fe..207259d10e 100644
--- a/docs/api/sql/Spider.md
+++ b/docs/api/sql/Spider.md
@@ -21,7 +21,7 @@ Sedona offers a spatial data generator called Spider. It is a
data source that g
## Quick Start
-Once you have your [`SedonaContext` object created](../Overview#quick-start),
you can create a DataFrame with the `spider` data source.
+Once you have your [`SedonaContext` object created](Overview.md#quick-start),
you can create a DataFrame with the `spider` data source.
```python
df_random_points = sedona.read.format("spider").load(n=1000,
distribution="uniform")
diff --git a/docs/setup/databricks.md b/docs/setup/databricks.md
index 0a9e7cda9e..d0a8d332d9 100644
--- a/docs/setup/databricks.md
+++ b/docs/setup/databricks.md
@@ -17,60 +17,10 @@
under the License.
-->
-Please pay attention to the Spark version postfix and Scala version postfix on
our [Maven Coordinate page](maven-coordinates.md). Databricks Spark and Apache
Spark's compatibility can be found
[here](https://docs.databricks.com/en/release-notes/runtime/index.html).
-
-## Community edition (free-tier)
-
-You just need to install the Sedona jars and Sedona Python on Databricks using
Databricks default web UI. Then everything will work.
-
-### Install libraries
-
-1) From the Libraries tab install from Maven Coordinates
-
-```
-org.apache.sedona:sedona-spark-shaded-3.4_2.12:{{ sedona.current_version }}
-org.datasyslab:geotools-wrapper:{{ sedona.current_geotools }}
-```
-
-2) For enabling python support, from the Libraries tab install from PyPI
-
-```
-apache-sedona=={{ sedona.current_version }}
-geopandas==1.0.1
-keplergl==0.3.7
-pydeck==0.9.1
-```
-
-### Initialize
-
-After you have installed the libraries and started the cluster, you can
initialize the Sedona `ST_*` functions and types by running from your code:
-
-(scala)
-
-```scala
-import org.apache.sedona.sql.utils.SedonaSQLRegistrator
-SedonaSQLRegistrator.registerAll(spark)
-```
-
-(or python)
-
-```python
-from sedona.register.geo_registrator import SedonaRegistrator
-
-SedonaRegistrator.registerAll(spark)
-```
-
-## Advanced editions
-
-In Databricks advanced editions, you need to install Sedona via [cluster
init-scripts](https://docs.databricks.com/clusters/init-scripts.html) as
described below. We recommend Databricks 10.x+. Sedona is not guaranteed to be
100% compatible with `Databricks photon acceleration`. Sedona requires Spark
internal APIs to inject many optimization strategies, which sometimes is not
accessible in `Photon`.
-
-In Spark 3.2, `org.apache.spark.sql.catalyst.expressions.Generator` class
added a field `nodePatterns`. Any SQL functions that rely on Generator class
may have issues if compiled for a runtime with a differing spark version. For
Sedona, those functions are:
-
-* ST_MakeValid
-* ST_SubDivideExplode
+In Databricks advanced editions, you need to install Sedona via [cluster init-scripts](https://docs.databricks.com/clusters/init-scripts.html) as described below. Sedona is not guaranteed to be 100% compatible with `Databricks Photon acceleration`. Sedona relies on Spark internal APIs to inject many optimization strategies, and these APIs are sometimes not accessible in `Photon`.
!!!note
- The following steps use DBR including Apache Spark 3.4.x as an example.
Please change the Spark version according to your DBR version.
+    The following steps use a DBR that includes Apache Spark 3.4.x as an example. Please change the Spark version according to your DBR version. Please pay attention to the Spark version postfix and Scala version postfix on our [Maven Coordinate page](maven-coordinates.md). The compatibility between Databricks Runtime and Apache Spark versions can be found [here](https://docs.databricks.com/en/release-notes/runtime/index.html).
### Download Sedona jars
@@ -91,9 +41,6 @@ Of course, you can also do the steps above manually.
### Create an init script
-!!!warning
- Starting from December 2023, Databricks has disabled all DBFS based init
script (/dbfs/XXX/<script-name>.sh). So you will have to store the init script
from a workspace level (`/Workspace/Users/<user-name>/<script-name>.sh`) or
Unity Catalog volume
(`/Volumes/<catalog>/<schema>/<volume>/<path-to-script>/<script-name>.sh`).
Please see [Databricks init
scripts](https://docs.databricks.com/en/init-scripts/cluster-scoped.html#configure-a-cluster-scoped-init-script-using-the-ui)
for more [...]
-
!!!note
If you are creating a Shared cluster, you won't be able to use init
scripts and jars stored under `Workspace`. Please instead store them in
`Volumes`. The overall process should be the same.
diff --git a/docs/setup/docker.md b/docs/setup/docker.md
index 4af395020b..a5f708368e 100644
--- a/docs/setup/docker.md
+++ b/docs/setup/docker.md
@@ -68,7 +68,7 @@ This command will bind the container's ports 8888, 8080,
8081, 4040, 8085 to the
Example 2:
```bash
-docker run -p 8888:8888 -p 8080:8080 -p 8081:8081 -p 4040:4040 -p 8085:8085
apache/sedona:{{ sedona.current_version }}
+docker run -d -p 8888:8888 -p 8080:8080 -p 8081:8081 -p 4040:4040 -p 8085:8085 apache/sedona:{{ sedona.current_version }}
```
This command will start a container with 4GB RAM for the driver and 4GB RAM for the executor, using the Sedona {{ sedona.current_version }} image.
@@ -91,7 +91,7 @@ docker run -d -e DRIVER_MEM=6g -e EXECUTOR_MEM=8g \
### Start coding
-Open your browser and go to [http://localhost:8888/](http://localhost:8888/)
to start coding with Sedona. You can also access Apache Zeppelin at
[http://localhost:8085/classic/](http://localhost:8085/classic/ ) using your
browser.
+Open your browser and go to [http://localhost:8888/](http://localhost:8888/) to start coding with Sedona in Jupyter Notebook. You can also access Apache Zeppelin at [http://localhost:8085/classic/](http://localhost:8085/classic/) in your browser.
### Notes
diff --git a/docs/setup/maven-coordinates.md b/docs/setup/maven-coordinates.md
index 90d5a2c419..e54870be7b 100644
--- a/docs/setup/maven-coordinates.md
+++ b/docs/setup/maven-coordinates.md
@@ -169,52 +169,6 @@ The optional GeoTools library is required if you want to
use CRS transformation,
</dependency>
```
-### netCDF-Java 5.4.2
-
-This is required only if you want to read HDF/NetCDF files using
`RS_FromNetCDF`. Note that this JAR is not in Maven Central so you will need to
add this repository to your pom.xml or build.sbt, or specify the URL in Spark
Config `spark.jars.repositories` or spark-submit `--repositories` option.
-
-!!!warning
- This jar was a required dependency due to a bug in Sedona 1.5.1. You
will need to specify the URL of the repository in `spark.jars.repositories` if
you use 1.5.1. This has been fixed in Sedona 1.5.2 and later.
-
-Under BSD 3-clause (compatible with Apache 2.0 license)
-
-!!! abstract "Add HDF/NetCDF dependency"
-
- === "Sedona 1.3.1+"
-
- Add unidata repo to your pom.xml
-
- ```
- <repositories>
- <repository>
- <id>unidata-all</id>
- <name>Unidata All</name>
-
<url>https://artifacts.unidata.ucar.edu/repository/unidata-all/</url>
- </repository>
- </repositories>
- ```
-
- Then add cdm-core to your POM dependency.
-
- ```xml
- <dependency>
- <groupId>edu.ucar</groupId>
- <artifactId>cdm-core</artifactId>
- <version>5.4.2</version>
- </dependency>
- ```
-
- === "Before Sedona 1.3.1"
-
- ```xml
- <!--
https://mvnrepository.com/artifact/org.datasyslab/sernetcdf -->
- <dependency>
- <groupId>org.datasyslab</groupId>
- <artifactId>sernetcdf</artifactId>
- <version>0.1.0</version>
- </dependency>
- ```
-
## Use Sedona unshaded jars
!!!warning
diff --git a/docs/setup/release-notes.md b/docs/setup/release-notes.md
index 7a5666846a..36c620beac 100644
--- a/docs/setup/release-notes.md
+++ b/docs/setup/release-notes.md
@@ -1164,7 +1164,7 @@ Sedona 1.4.0 is compiled against, Spark 3.3 / Flink 1.12,
Java 8.
* [X] **Sedona Spark & Flink** Serialize and deserialize geometries 3 - 7X
faster
* [X] **Sedona Spark & Flink** Google S2 based spatial join for fast
approximate point-in-polygon join. See [Join query in
Spark](../api/sql/Optimizer.md#google-s2-based-approximate-equi-join) and [Join
query in Flink](../tutorial/flink/sql.md#join-query)
-* [X] **Sedona Spark** Pushdown spatial predicate on GeoParquet to reduce
memory consumption by 10X: see
[explanation](../api/sql/Optimizer.md#Push-spatial-predicates-to-GeoParquet)
+* [X] **Sedona Spark** Pushdown spatial predicate on GeoParquet to reduce
memory consumption by 10X: see
[explanation](../api/sql/Optimizer.md#push-spatial-predicates-to-geoparquet)
* [X] **Sedona Spark** Automatically use broadcast index spatial join for
small datasets
* [X] **Sedona Spark** New RasterUDT added to Sedona GeoTiff reader.
* [X] **Sedona Spark** A number of bug fixes and improvement to the Sedona R
module.
diff --git a/docs/tutorial/files/geoparquet-sedona-spark.md
b/docs/tutorial/files/geoparquet-sedona-spark.md
index 28da219474..11d95d9c6d 100644
--- a/docs/tutorial/files/geoparquet-sedona-spark.md
+++ b/docs/tutorial/files/geoparquet-sedona-spark.md
@@ -76,7 +76,7 @@ df.show(truncate=False)
Here are the results:
```
-+---+---------------------+
++---+---------------------+
|id |geometry |
+---+---------------------+
|a |LINESTRING (2 5, 6 1)|
@@ -199,10 +199,10 @@ The value of `geoparquet.crs` and
`geoparquet.crs.<column_name>` can be one of t
* `""` (empty string): Omit the `crs` field. This implies that the CRS is
[OGC:CRS84](https://www.opengis.net/def/crs/OGC/1.3/CRS84) for CRS-aware
implementations.
* `"{...}"` (PROJJSON string): The `crs` field will be set as the PROJJSON
object representing the Coordinate Reference System (CRS) of the geometry. You
can find the PROJJSON string of a specific CRS from here: https://epsg.io/
(click the JSON option at the bottom of the page). You can also customize your
PROJJSON string as needed.
-Please note that Sedona currently cannot set/get a projjson string to/from a
CRS. Its geoparquet reader will ignore the projjson metadata and you will have
to set your CRS via [`ST_SetSRID`](../api/sql/Function.md#st_setsrid) after
reading the file.
+Please note that Sedona currently cannot set/get a projjson string to/from a
CRS. Its geoparquet reader will ignore the projjson metadata and you will have
to set your CRS via [`ST_SetSRID`](../../api/sql/Function.md#st_setsrid) after
reading the file.
Its geoparquet writer will not leverage the SRID field of a geometry so you
will have to always set the `geoparquet.crs` option manually when writing the
file, if you want to write a meaningful CRS field.
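For example, a write that sets the CRS metadata explicitly could look like the sketch below (the PROJJSON string and output path are placeholders):

```python
# Hedged sketch: write GeoParquet with an explicit CRS in the column metadata.
# Replace the placeholder with the PROJJSON of your CRS (e.g. copied from https://epsg.io/).
projjson_str = '{"type": "GeographicCRS", "...": "..."}'  # placeholder PROJJSON
df.write.format("geoparquet") \
    .option("geoparquet.crs", projjson_str) \
    .save("/path/to/output.parquet")
```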
-Due to the same reason, Sedona geoparquet reader and writer do NOT check the
axis order (lon/lat or lat/lon) and assume they are handled by the users
themselves when writing / reading the files. You can always use
[`ST_FlipCoordinates`](../api/sql/Function.md#st_flipcoordinates) to swap the
axis order of your geometries.
+Due to the same reason, Sedona geoparquet reader and writer do NOT check the
axis order (lon/lat or lat/lon) and assume they are handled by the users
themselves when writing / reading the files. You can always use
[`ST_FlipCoordinates`](../../api/sql/Function.md#st_flipcoordinates) to swap
the axis order of your geometries.
## Save GeoParquet with Covering Metadata
@@ -231,7 +231,7 @@
df_bbox.write.format("geoparquet").option("geoparquet.covering.geometry", "bbox"
## Sort then Save GeoParquet
-To maximize the performance of Sedona GeoParquet filter pushdown, we suggest
that you sort the data by their geohash values (see
[ST_GeoHash](../api/sql/Function.md#st_geohash)) and then save as a GeoParquet
file. An example is as follows:
+To maximize the performance of Sedona GeoParquet filter pushdown, we suggest
that you sort the data by their geohash values (see
[ST_GeoHash](../../api/sql/Function.md#st_geohash)) and then save as a
GeoParquet file. An example is as follows:
```
SELECT col1, col2, geom, ST_GeoHash(geom, 5) as geohash
@@ -253,7 +253,7 @@ Let’s look at an example of a dataset with points and three
bounding boxes.
Now, let’s apply a spatial filter to read points within a particular area:
-
+
Here is the query:
diff --git a/docs/tutorial/files/shapefiles-sedona-spark.md
b/docs/tutorial/files/shapefiles-sedona-spark.md
index 3b24349b68..a7df23c521 100644
--- a/docs/tutorial/files/shapefiles-sedona-spark.md
+++ b/docs/tutorial/files/shapefiles-sedona-spark.md
@@ -196,11 +196,11 @@ Due to these limitations, other options are worth
investigating.
There are a variety of other file formats that are good for geometric data:
* Iceberg
-* [GeoParquet](../geoparquet-sedona-spark)
+* [GeoParquet](geoparquet-sedona-spark.md)
* FlatGeoBuf
-* [GeoPackage](../geopackage-sedona-spark)
-* [GeoJSON](../geojson-sedona-spark)
-* [CSV](../csv-geometry-sedona-spark)
+* [GeoPackage](geopackage-sedona-spark.md)
+* [GeoJSON](geojson-sedona-spark.md)
+* [CSV](csv-geometry-sedona-spark.md)
* GeoTIFF
## Why Sedona does not support Shapefile writes
diff --git a/docs/api/sql/Stac.md b/docs/tutorial/files/stac-sedona-spark.md
similarity index 85%
rename from docs/api/sql/Stac.md
rename to docs/tutorial/files/stac-sedona-spark.md
index 8d56644e5e..062e6c5f55 100644
--- a/docs/api/sql/Stac.md
+++ b/docs/tutorial/files/stac-sedona-spark.md
@@ -17,6 +17,8 @@
under the License.
-->
+# STAC catalog with Apache Sedona and Spark
+
The STAC data source allows you to read data from a SpatioTemporal Asset
Catalog (STAC) API. The data source supports reading STAC items and collections.
## Usage
@@ -108,29 +110,29 @@ root
+------------+--------------------+-------+--------------------+--------------------+--------------------+-----+-----------+--------------------+--------------+------------+--------------------+--------------------+-----------+-----------+-------------+-------+----+--------------------+--------------------+--------------------+
```
-# Filter Pushdown
+## Filter Pushdown
The STAC data source supports predicate pushdown for spatial and temporal filters, applying them at the source level to reduce the amount of data that needs to be read.
-## Spatial Filter Pushdown
+### Spatial Filter Pushdown
Spatial filter pushdown allows the data source to apply spatial predicates
(e.g., st_contains, st_intersects) directly at the data source level, reducing
the amount of data transferred and processed.
-## Temporal Filter Pushdown
+### Temporal Filter Pushdown
Temporal filter pushdown allows the data source to apply temporal predicates
(e.g., BETWEEN, >=, <=) directly at the data source level, similarly reducing
the amount of data transferred and processed.
-# Examples
+## Examples
Here are some examples demonstrating how to query a STAC data source that is
loaded into a table named `STAC_TABLE`.
-## SQL Select Without Filters
+### SQL Select Without Filters
```sql
SELECT id, datetime as dt, geometry, bbox FROM STAC_TABLE
```
-## SQL Select With Temporal Filter
+### SQL Select With Temporal Filter
```sql
SELECT id, datetime as dt, geometry, bbox
@@ -140,7 +142,7 @@ SELECT id, datetime as dt, geometry, bbox FROM STAC_TABLE
In this example, the data source will push down the temporal filter to the
underlying data source.
-## SQL Select With Spatial Filter
+### SQL Select With Spatial Filter
```sql
SELECT id, geometry
@@ -150,7 +152,7 @@ In this example, the data source will push down the
temporal filter to the under
In this example, the data source will push down the spatial filter to the
underlying data source.
-## Sedona Configuration for STAC Reader
+### Sedona Configuration for STAC Reader
When using the STAC reader in Sedona, several configuration options can be set
to control the behavior of the reader. These configurations are typically set
in a `Map[String, String]` and passed to the reader. Below are the key sedona
configuration options:
@@ -192,73 +194,13 @@ These configurations can be combined into a single
`Map[String, String]` and pas
These options above provide fine-grained control over how the STAC data is
read and processed in Sedona.
-# Python API
+## Python API
The Python API allows you to interact with a SpatioTemporal Asset Catalog
(STAC) API using the Client class. This class provides methods to open a
connection to a STAC API, retrieve collections, and search for items with
various filters.
-## Client Class
-
-## Methods
-
-### `open(url: str) -> Client`
-
-Opens a connection to the specified STAC API URL.
-
-**Parameters:**
-
-- `url` (*str*): The URL of the STAC API to connect to.
- **Example:** `"https://planetarycomputer.microsoft.com/api/stac/v1"`
-
-**Returns:**
-
-- `Client`: An instance of the `Client` class connected to the specified URL.
-
----
-
-### `get_collection(collection_id: str) -> CollectionClient`
-
-Retrieves a collection client for the specified collection ID.
-
-**Parameters:**
-
-- `collection_id` (*str*): The ID of the collection to retrieve.
- **Example:** `"aster-l1t"`
-
-**Returns:**
-
-- `CollectionClient`: An instance of the `CollectionClient` class for the
specified collection.
-
----
-
-### `search(*ids: Union[str, list], collection_id: str, bbox: Optional[list] =
None, datetime: Optional[Union[str, datetime.datetime, list]] = None,
max_items: Optional[int] = None, return_dataframe: bool = True) ->
Union[Iterator[PyStacItem], DataFrame]`
-
-Searches for items in the specified collection with optional filters.
-
-**Parameters:**
-
-- `ids` (*Union[str, list]*): A variable number of item IDs to filter the
items.
- **Example:** `"item_id1"` or `["item_id1", "item_id2"]`
-- `collection_id` (*str*): The ID of the collection to search in.
- **Example:** `"aster-l1t"`
-- `bbox` (*Optional[list]*): A list of bounding boxes for filtering the items.
Each bounding box is represented as a list of four float values: `[min_lon,
min_lat, max_lon, max_lat]`.
- **Example:** `[[ -180.0, -90.0, 180.0, 90.0 ]]`
-- `datetime` (*Optional[Union[str, datetime.datetime, list]]*): A single
datetime, RFC 3339-compliant timestamp, or a list of date-time ranges for
filtering the items.
- **Example:**
- - `"2020-01-01T00:00:00Z"`
- - `datetime.datetime(2020, 1, 1)`
- - `[["2020-01-01T00:00:00Z", "2021-01-01T00:00:00Z"]]`
-- `max_items` (*Optional[int]*): The maximum number of items to return from
the search, even if there are more matching results.
- **Example:** `100`
-- `return_dataframe` (*bool*): If `True` (default), return the result as a
Spark DataFrame instead of an iterator of `PyStacItem` objects.
- **Example:** `True`
-
-**Returns:**
+### Sample Code
-- *Union[Iterator[PyStacItem], DataFrame]*: An iterator of `PyStacItem`
objects or a Spark DataFrame that matches the specified filters.
-
-## Sample Code
-
-### Initialize the Client
+#### Initialize the Client
```python
from sedona.stac.client import Client
@@ -267,7 +209,7 @@ from sedona.stac.client import Client
client = Client.open("https://planetarycomputer.microsoft.com/api/stac/v1")
```
-### Search Items on a Collection Within a Year
+#### Search Items on a Collection Within a Year
```python
items = client.search(
@@ -275,7 +217,7 @@ items = client.search(
)
```
-### Search Items on a Collection Within a Month and Max Items
+#### Search Items on a Collection Within a Month and Max Items
```python
items = client.search(
@@ -283,7 +225,7 @@ items = client.search(
)
```
-### Search Items with Bounding Box and Interval
+#### Search Items with Bounding Box and Interval
```python
items = client.search(
@@ -295,14 +237,14 @@ items = client.search(
)
```
-### Search Multiple Items with Multiple Bounding Boxes
+#### Search Multiple Items with Multiple Bounding Boxes
```python
bbox_list = [[-180.0, -90.0, 180.0, 90.0], [-100.0, -50.0, 100.0, 50.0]]
items = client.search(collection_id="aster-l1t", bbox=bbox_list,
return_dataframe=False)
```
-### Search Items and Get DataFrame as Return with Multiple Intervals
+#### Search Items and Get DataFrame as Return with Multiple Intervals
```python
interval_list = [
@@ -315,7 +257,7 @@ df = client.search(
df.show()
```
-### Save Items in DataFrame to GeoParquet with Both Bounding Boxes and
Intervals
+#### Save Items in DataFrame to GeoParquet with Both Bounding Boxes and
Intervals
```python
# Save items in DataFrame to GeoParquet with both bounding boxes and intervals
@@ -326,7 +268,51 @@ client.get_collection("aster-l1t").save_to_geoparquet(
These examples demonstrate how to use the Client class to search for items in
a STAC collection with various filters and return the results as either an
iterator of PyStacItem objects or a Spark DataFrame.
-# References
+### Methods
+
+**`open(url: str) -> Client`**
+Opens a connection to the specified STAC API URL.
+
+Parameters:
+
+* `url` (*str*): The URL of the STAC API to connect to. Example:
`"https://planetarycomputer.microsoft.com/api/stac/v1"`
+
+Returns:
+
+* `Client`: An instance of the `Client` class connected to the specified URL.
+
+---
+
+**`get_collection(collection_id: str) -> CollectionClient`**
+Retrieves a collection client for the specified collection ID.
+
+Parameters:
+
+* `collection_id` (*str*): The ID of the collection to retrieve. Example:
`"aster-l1t"`
+
+Returns:
+
+* `CollectionClient`: An instance of the `CollectionClient` class for the
specified collection.
+
+---
+
+**`search(*ids: Union[str, list], collection_id: str, bbox: Optional[list] =
None, datetime: Optional[Union[str, datetime.datetime, list]] = None,
max_items: Optional[int] = None, return_dataframe: bool = True) ->
Union[Iterator[PyStacItem], DataFrame]`**
+Searches for items in the specified collection with optional filters.
+
+Parameters:
+
+* `ids` (*Union[str, list]*): A variable number of item IDs to filter the
items. Example: `"item_id1"` or `["item_id1", "item_id2"]`
+* `collection_id` (*str*): The ID of the collection to search in. Example:
`"aster-l1t"`
+* `bbox` (*Optional[list]*): A list of bounding boxes for filtering the items,
represented as `[min_lon, min_lat, max_lon, max_lat]`. Example: `[[ -180.0,
-90.0, 180.0, 90.0 ]]`
+* `datetime` (*Optional[Union[str, datetime.datetime, list]]*): A single
datetime, RFC 3339-compliant timestamp, or a list of date-time ranges. Example:
`"2020-01-01T00:00:00Z"`, `datetime.datetime(2020, 1, 1)`,
`[["2020-01-01T00:00:00Z", "2021-01-01T00:00:00Z"]]`
+* `max_items` (*Optional[int]*): The maximum number of items to return.
Example: `100`
+* `return_dataframe` (*bool*): If `True` (default), return the result as a
Spark DataFrame instead of an iterator of `PyStacItem` objects. Example: `True`
+
+Returns:
+
+* *Union[Iterator[PyStacItem], DataFrame]*: An iterator of `PyStacItem`
objects or a Spark DataFrame that matches the specified filters.
+
+## References
- STAC Specification: https://stacspec.org/
diff --git a/docs/tutorial/python-vector-osm.md
b/docs/tutorial/python-vector-osm.md
deleted file mode 100644
index 00f19f4322..0000000000
--- a/docs/tutorial/python-vector-osm.md
+++ /dev/null
@@ -1,159 +0,0 @@
-<!--
- Licensed to the Apache Software Foundation (ASF) under one
- or more contributor license agreements. See the NOTICE file
- distributed with this work for additional information
- regarding copyright ownership. The ASF licenses this file
- to you under the Apache License, Version 2.0 (the
- "License"); you may not use this file except in compliance
- with the License. You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing,
- software distributed under the License is distributed on an
- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- KIND, either express or implied. See the License for the
- specific language governing permissions and limitations
- under the License.
- -->
-
-# Example of spark + sedona + hdfs with slave nodes and OSM vector data
consults
-
-```
-from IPython.display import display, HTML
-from pyspark.sql import SparkSession
-from pyspark import StorageLevel
-import pandas as pd
-from pyspark.sql.types import StructType, StructField,StringType, LongType,
IntegerType, DoubleType, ArrayType
-from pyspark.sql.functions import regexp_replace
-from sedona.register import SedonaRegistrator
-from sedona.utils import SedonaKryoRegistrator, KryoSerializer
-from pyspark.sql.functions import col, split, expr
-from pyspark.sql.functions import udf, lit
-from sedona.utils import SedonaKryoRegistrator, KryoSerializer
-from pyspark.sql.functions import col, split, expr
-from pyspark.sql.functions import udf, lit, flatten
-from pywebhdfs.webhdfs import PyWebHdfsClient
-from datetime import date
-from pyspark.sql.functions import monotonically_increasing_id
-import json
-```
-
-## Registering spark session, adding node executor configurations and sedona
registrator
-
-```
-spark = SparkSession.\
- builder.\
- appName("Overpass-API").\
- enableHiveSupport().\
- master("local[*]").\
- master("spark://spark-master:7077").\
- config("spark.executor.memory", "15G").\
- config("spark.driver.maxResultSize", "135G").\
- config("spark.sql.shuffle.partitions", "500").\
- config(' spark.sql.adaptive.coalescePartitions.enabled', True).\
- config('spark.sql.adaptive.enabled', True).\
- config('spark.sql.adaptive.coalescePartitions.initialPartitionNum', 125).\
- config("spark.sql.execution.arrow.pyspark.enabled", True).\
- config("spark.sql.execution.arrow.fallback.enabled", True).\
- config('spark.kryoserializer.buffer.max', 2047).\
- config("spark.serializer", KryoSerializer.getName).\
- config("spark.kryo.registrator", SedonaKryoRegistrator.getName).\
- config("spark.jars.packages",
"org.apache.sedona:sedona-spark-shaded-3.0_2.12:1.4.0,org.datasyslab:geotools-wrapper:1.4.0-28.2")
.\
- enableHiveSupport().\
- getOrCreate()
-
-SedonaRegistrator.registerAll(spark)
-sc = spark.sparkContext
-```
-
-## Connecting to Overpass API to search and downloading data for saving into
HDFS
-
-```
-import requests
-import json
-
-overpass_url = "http://overpass-api.de/api/interpreter"
-overpass_query = """
-[out:json];
-area[name = "Foz do Iguaçu"];
-way(area)["highway"~""];
-out geom;
->;
-out skel qt;
-"""
-
-response = requests.get(overpass_url,
- params={'data': overpass_query})
-data = response.json()
-hdfs = PyWebHdfsClient(host='179.106.229.159',port='50070', user_name='root')
-file_name = "foz_roads_osm.json"
-hdfs.delete_file_dir(file_name)
-hdfs.create_file(file_name, json.dumps(data))
-
-```
-
-## Connecting spark sedona with saved hdfs file
-
-```
-path = "hdfs://776faf4d6a1e:8020/"+file_name
-df = spark.read.json(path, multiLine = "true")
-```
-
-## Consulting and organizing data for analysis
-
-```
-from pyspark.sql.functions import explode, arrays_zip
-
-df.createOrReplaceTempView("df")
-tb = spark.sql("select *, size(elements) total_nodes from df")
-tb.show(5)
-
-isolate_total_nodes = tb.select("total_nodes").toPandas()
-total_nodes = isolate_total_nodes["total_nodes"].iloc[0]
-print(total_nodes)
-
-isolate_ids = tb.select("elements.id").toPandas()
-ids = pd.DataFrame(isolate_ids["id"].iloc[0]).drop_duplicates()
-print(ids[0].iloc[1])
-
-formatted_df = tb\
-.withColumn("id", explode("elements.id"))
-
-formatted_df.show(5)
-
-formatted_df = tb\
-.withColumn("new", arrays_zip("elements.id", "elements.geometry",
"elements.nodes", "elements.tags"))\
-.withColumn("new", explode("new"))
-
-formatted_df.show(5)
-
-# formatted_df.printSchema()
-
-formatted_df =
formatted_df.select("new.0","new.1","new.2","new.3.maxspeed","new.3.incline","new.3.surface",
"new.3.name", "total_nodes")
-formatted_df =
formatted_df.withColumnRenamed("0","id").withColumnRenamed("1","geom").withColumnRenamed("2","nodes").withColumnRenamed("3","tags")
-formatted_df.createOrReplaceTempView("formatted_df")
-formatted_df.show(5)
-# TODO atualizar daqui para baixo para considerar a linha inteira na lógica
-points_tb = spark.sql("select geom, id from formatted_df where geom IS NOT
NULL")
-points_tb = points_tb\
-.withColumn("new", arrays_zip("geom.lat", "geom.lon"))\
-.withColumn("new", explode("new"))
-
-points_tb = points_tb.select("new.0","new.1", "id")
-
-points_tb = points_tb.withColumnRenamed("0","lat").withColumnRenamed("1","lon")
-points_tb.printSchema()
-
-points_tb.createOrReplaceTempView("points_tb")
-
-points_tb.show(5)
-
-coordinates_tb = spark.sql("select (select
collect_list(CONCAT(p1.lat,',',p1.lon)) from points_tb p1 where p1.id = p2.id
group by p1.id) as coordinates, p2.id, p2.maxspeed, p2.incline, p2.surface,
p2.name, p2.nodes, p2.total_nodes from formatted_df p2")
-coordinates_tb.createOrReplaceTempView("coordinates_tb")
-coordinates_tb.show(5)
-
-roads_tb = spark.sql("SELECT
ST_LineStringFromText(REPLACE(REPLACE(CAST(coordinates as
string),'[',''),']',''), ',') as geom, id, maxspeed, incline, surface, name,
nodes, total_nodes FROM coordinates_tb WHERE coordinates IS NOT NULL")
-roads_tb.createOrReplaceTempView("roads_tb")
-roads_tb.show(5)
-```
diff --git a/docs/tutorial/raster.md b/docs/tutorial/raster.md
index 641b441428..65541ced6c 100644
--- a/docs/tutorial/raster.md
+++ b/docs/tutorial/raster.md
@@ -21,7 +21,7 @@
Sedona uses 1-based indexing for all raster functions except [map algebra
function](../api/sql/Raster-map-algebra.md), which uses 0-based indexing.
!!!note
- Since v`1.5.0`, Sedona assumes geographic coordinates to be in
longitude/latitude order. If your data is lat/lon order, please use
`ST_FlipCoordinates` to swap X and Y.
+    Sedona assumes geographic coordinates to be in longitude/latitude order. If your data is in lat/lon order, please use `ST_FlipCoordinates` to swap X and Y.
Starting from `v1.1.0`, Sedona SQL supports raster data sources and raster
operators in DataFrame and SQL. Raster support is available in all Sedona
language bindings including ==Scala, Java, Python, and R==.
@@ -67,8 +67,6 @@ Detailed SedonaSQL APIs are available here: [SedonaSQL
API](../api/sql/Overview.
Use the following code to create your Sedona config at the beginning. If you
already have a SparkSession (usually named `spark`) created by Wherobots/AWS
EMR/Databricks, please skip this step and use `spark` directly.
-==Sedona >= 1.4.1==
-
You can add additional Spark runtime config to the config builder. For
example,
`SedonaContext.builder().config("spark.sql.autoBroadcastJoinThreshold",
"10485760")`
=== "Scala"
@@ -114,65 +112,10 @@ You can add additional Spark runtime config to the config
builder. For example,
```
Please replace the `3.3` in the package name of sedona-spark-shaded with
the corresponding major.minor version of Spark, such as
`sedona-spark-shaded-3.4_2.12:{{ sedona.current_version }}`.
-==Sedona < 1.4.1==
-
-The following method has been deprecated since Sedona 1.4.1. Please use the
method above to create your Sedona config.
-
-=== "Scala"
-
- ```scala
- var sparkSession = SparkSession.builder()
- .master("local[*]") // Delete this if run in cluster mode
- .appName("readTestScala") // Change this to a proper name
- // Enable Sedona custom Kryo serializer
- .config("spark.serializer", classOf[KryoSerializer].getName) //
org.apache.spark.serializer.KryoSerializer
- .config("spark.kryo.registrator",
classOf[SedonaKryoRegistrator].getName)
- .getOrCreate() // org.apache.sedona.core.serde.SedonaKryoRegistrator
- ```
- If you use SedonaViz together with SedonaSQL, please use the following
two lines to enable Sedona Kryo serializer instead:
- ```scala
- .config("spark.serializer", classOf[KryoSerializer].getName) //
org.apache.spark.serializer.KryoSerializer
- .config("spark.kryo.registrator",
classOf[SedonaVizKryoRegistrator].getName) //
org.apache.sedona.viz.core.Serde.SedonaVizKryoRegistrator
- ```
-
-=== "Java"
-
- ```java
- SparkSession sparkSession = SparkSession.builder()
- .master("local[*]") // Delete this if run in cluster mode
- .appName("readTestScala") // Change this to a proper name
- // Enable Sedona custom Kryo serializer
- .config("spark.serializer", KryoSerializer.class.getName) //
org.apache.spark.serializer.KryoSerializer
- .config("spark.kryo.registrator", SedonaKryoRegistrator.class.getName)
- .getOrCreate() // org.apache.sedona.core.serde.SedonaKryoRegistrator
- ```
- If you use SedonaViz together with SedonaSQL, please use the following
two lines to enable Sedona Kryo serializer instead:
- ```scala
- .config("spark.serializer", KryoSerializer.class.getName) //
org.apache.spark.serializer.KryoSerializer
- .config("spark.kryo.registrator",
SedonaVizKryoRegistrator.class.getName) //
org.apache.sedona.viz.core.Serde.SedonaVizKryoRegistrator
- ```
-
-=== "Python"
-
- ```python
- sparkSession = SparkSession. \
- builder. \
- appName('appName'). \
- config("spark.serializer", KryoSerializer.getName). \
- config("spark.kryo.registrator", SedonaKryoRegistrator.getName). \
- config('spark.jars.packages',
- 'org.apache.sedona:sedona-spark-shaded-3.3_2.12:{{
sedona.current_version }},'
- 'org.datasyslab:geotools-wrapper:{{ sedona.current_geotools
}}'). \
- getOrCreate()
- ```
- Please replace the `3.3` in the package name of sedona-spark-shaded with
the corresponding major.minor version of Spark, such as
`sedona-spark-shaded-3.4_2.12:{{ sedona.current_version }}`.
-
## Initiate SedonaContext
Add the following line after creating the Sedona config. If you already have a
SparkSession (usually named `spark`) created by Wherobots/AWS EMR/Databricks,
please call `SedonaContext.create(spark)` instead.
-==Sedona >= 1.4.1==
-
=== "Scala"
```scala
@@ -197,30 +140,6 @@ Add the following line after creating the Sedona config.
If you already have a S
sedona = SedonaContext.create(config)
```
-==Sedona < 1.4.1==
-
-The following method has been deprecated since Sedona 1.4.1. Please use the
method above to create your SedonaContext.
-
-=== "Scala"
-
- ```scala
- SedonaSQLRegistrator.registerAll(sparkSession)
- ```
-
-=== "Java"
-
- ```java
- SedonaSQLRegistrator.registerAll(sparkSession)
- ```
-
-=== "Python"
-
- ```python
- from sedona.register import SedonaRegistrator
-
- SedonaRegistrator.registerAll(spark)
- ```
-
You can also register everything by passing `--conf
spark.sql.extensions=org.apache.sedona.sql.SedonaSqlExtensions` to
`spark-submit` or `spark-shell`.
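A hedged sketch of the equivalent configuration from PySpark follows (the app name and package coordinates are illustrative and must match your Spark/Scala versions as described above):

```python
# Hedged sketch: register Sedona via the SQL extensions config instead of SedonaContext.
# The package coordinates below are illustrative; align them with your Spark/Scala versions.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("sedona-extensions-example")
    .config("spark.sql.extensions", "org.apache.sedona.sql.SedonaSqlExtensions")
    .config(
        "spark.jars.packages",
        "org.apache.sedona:sedona-spark-shaded-3.4_2.12:{{ sedona.current_version }},"
        "org.datasyslab:geotools-wrapper:{{ sedona.current_geotools }}",
    )
    .getOrCreate()
)
```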
## Load data from files
diff --git a/docs/tutorial/rdd.md b/docs/tutorial/rdd.md
index fd5a1fcfde..d3b27896f7 100644
--- a/docs/tutorial/rdd.md
+++ b/docs/tutorial/rdd.md
@@ -765,9 +765,9 @@ Distance join can only accept `COVERED_BY` and `INTERSECTS`
as spatial predicate
The details of spatial partitioning in join query is
[here](#use-spatial-partitioning).
-The details of using spatial indexes in join query is
[here](#use-spatial-indexes-2).
+The details of using spatial indexes in join query is
[here](#use-spatial-indexes_2).
-The output format of the distance join query is [here](#output-format-2).
+The output format of the distance join query is [here](#output-format_2).
!!!note
Distance join query is equal to the following query in Spatial SQL:
diff --git a/docs/tutorial/snowflake/sql.md b/docs/tutorial/snowflake/sql.md
index 02ef7c7e01..ba42f23138 100644
--- a/docs/tutorial/snowflake/sql.md
+++ b/docs/tutorial/snowflake/sql.md
@@ -302,7 +302,7 @@ Please use the following steps:
### 1. Generate S2 ids for both tables
-Use [ST_S2CellIds](../../api/snowflake/vector-data/Function.md#ST_S2CellIDs)
to generate cell IDs. Each geometry may produce one or more IDs.
+Use [ST_S2CellIds](../../api/snowflake/vector-data/Function.md#st_s2cellids)
to generate cell IDs. Each geometry may produce one or more IDs.
```sql
SELECT * FROM lefts, TABLE(FLATTEN(ST_S2CellIDs(lefts.geom, 15))) s1
diff --git a/docs/tutorial/sql.md b/docs/tutorial/sql.md
index 4ea1ff0754..bd5327f675 100644
--- a/docs/tutorial/sql.md
+++ b/docs/tutorial/sql.md
@@ -20,7 +20,7 @@
This page outlines the steps to manage spatial data using SedonaSQL.
!!!note
- Since v`1.5.0`, Sedona assumes geographic coordinates to be in
longitude/latitude order. If your data is lat/lon order, please use
`ST_FlipCoordinates` to swap X and Y.
+    Sedona assumes geographic coordinates to be in longitude/latitude order. If your data is in lat/lon order, please use `ST_FlipCoordinates` to swap X and Y.
SedonaSQL supports SQL/MM Part3 Spatial SQL Standard. It includes four kinds
of SQL operators as follows. All these operators can be directly called through:
@@ -64,8 +64,6 @@ Detailed SedonaSQL APIs are available here: [SedonaSQL
API](../api/sql/Overview.
Use the following code to create your Sedona config at the beginning. If you
already have a SparkSession (usually named `spark`) created by AWS
EMR/Databricks/Microsoft Fabric, please ==skip this step==.
-==Sedona >= 1.4.1==
-
You can add additional Spark runtime config to the config builder. For
example,
`SedonaContext.builder().config("spark.sql.autoBroadcastJoinThreshold",
"10485760")`
=== "Scala"
@@ -111,65 +109,10 @@ You can add additional Spark runtime config to the config
builder. For example,
```
If you are using a different Spark version, please replace the `3.3` in
package name of sedona-spark-shaded with the corresponding major.minor version
of Spark, such as `sedona-spark-shaded-3.4_2.12:{{ sedona.current_version }}`.
-==Sedona < 1.4.1==
-
-The following method has been deprecated since Sedona 1.4.1. Please use the
method above to create your Sedona config.
-
-=== "Scala"
-
- ```scala
- var sparkSession = SparkSession.builder()
- .master("local[*]") // Delete this if run in cluster mode
- .appName("readTestScala") // Change this to a proper name
- // Enable Sedona custom Kryo serializer
- .config("spark.serializer", classOf[KryoSerializer].getName) //
org.apache.spark.serializer.KryoSerializer
- .config("spark.kryo.registrator",
classOf[SedonaKryoRegistrator].getName)
- .getOrCreate() // org.apache.sedona.core.serde.SedonaKryoRegistrator
- ```
- If you use SedonaViz together with SedonaSQL, please use the following
two lines to enable Sedona Kryo serializer instead:
- ```scala
- .config("spark.serializer", classOf[KryoSerializer].getName) //
org.apache.spark.serializer.KryoSerializer
- .config("spark.kryo.registrator",
classOf[SedonaVizKryoRegistrator].getName) //
org.apache.sedona.viz.core.Serde.SedonaVizKryoRegistrator
- ```
-
-=== "Java"
-
- ```java
- SparkSession sparkSession = SparkSession.builder()
- .master("local[*]") // Delete this if run in cluster mode
- .appName("readTestJava") // Change this to a proper name
- // Enable Sedona custom Kryo serializer
- .config("spark.serializer", KryoSerializer.class.getName()) //
org.apache.spark.serializer.KryoSerializer
- .config("spark.kryo.registrator", SedonaKryoRegistrator.class.getName())
- .getOrCreate() // org.apache.sedona.core.serde.SedonaKryoRegistrator
- ```
- If you use SedonaViz together with SedonaSQL, please use the following
two lines to enable Sedona Kryo serializer instead:
- ```java
- .config("spark.serializer", KryoSerializer.class.getName()) //
org.apache.spark.serializer.KryoSerializer
- .config("spark.kryo.registrator",
SedonaVizKryoRegistrator.class.getName()) //
org.apache.sedona.viz.core.Serde.SedonaVizKryoRegistrator
- ```
-
-=== "Python"
-
- ```python
- sparkSession = SparkSession. \
- builder. \
- appName('readTestPython'). \
- config("spark.serializer", KryoSerializer.getName()). \
- config("spark.kryo.registrator", SedonaKryoRegistrator.getName()). \
- config('spark.jars.packages',
- 'org.apache.sedona:sedona-spark-shaded-3.3_2.12:{{
sedona.current_version }},'
- 'org.datasyslab:geotools-wrapper:{{ sedona.current_geotools
}}'). \
- getOrCreate()
- ```
- If you are using Spark versions >= 3.4, please replace the `3.0` in
package name of sedona-spark-shaded with the corresponding major.minor version
of Spark, such as `sedona-spark-shaded-3.4_2.12:{{ sedona.current_version }}`.
-
## Initiate SedonaContext
Add the following line after creating the Sedona config. If you already have a SparkSession (usually named `spark`) created by AWS EMR/Databricks/Microsoft Fabric, please call `sedona = SedonaContext.create(spark)` instead. For ==Databricks==, the situation is more complicated; please refer to the [Databricks setup guide](../setup/databricks.md), but generally you don't need to create a SedonaContext.
-==Sedona >= 1.4.1==
-
=== "Scala"
```scala
@@ -194,33 +137,9 @@ Add the following line after creating Sedona config. If
you already have a Spark
sedona = SedonaContext.create(config)
```
-==Sedona < 1.4.1==
-
-The following method has been deprecated since Sedona 1.4.1. Please use the
method above to create your SedonaContext.
-
-=== "Scala"
-
- ```scala
- SedonaSQLRegistrator.registerAll(sparkSession)
- ```
-
-=== "Java"
-
- ```java
- SedonaSQLRegistrator.registerAll(sparkSession)
- ```
-
-=== "Python"
-
- ```python
- from sedona.register import SedonaRegistrator
-
- SedonaRegistrator.registerAll(spark)
- ```
-
You can also register everything by passing `--conf
spark.sql.extensions=org.apache.sedona.sql.SedonaSqlExtensions` to
`spark-submit` or `spark-shell`.
-## Load data from files
+## Load data from text files
Assume we have a WKT file, namely `usa-county.tsv`, at Path
`/Download/usa-county.tsv` as follows:
@@ -233,6 +152,8 @@ POLYGON (..., ...) Lancaster County
The file may have many other columns.
+### Load the raw DataFrame
+
Use the following code to load the data and create a raw DataFrame:
=== "Scala"
@@ -267,7 +188,7 @@ The output will be like this:
|POLYGON ((-96.910...| 31|109|00835876|31109| Lancaster| Lancaster County|
06| H1|G4020| 339|30700|null| A|2169240202|22877180|+40.7835474|-096.6886584|
```
-## Create a Geometry type column
+### Create a Geometry type column
All geometrical operations in SedonaSQL are on Geometry type objects. Therefore, before running any queries, you need to create a Geometry type column on a DataFrame.
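As a hedged sketch (column names assume the TSV above was loaded without headers, so they are `_c0`, `_c1`, ...):

```python
# Hedged sketch: build a Geometry column from the WKT strings loaded above.
# Assumes the WKT sits in `_c0` and a county name in a later column such as `_c6`.
raw_df.createOrReplaceTempView("rawdf")
county_df = sedona.sql("""
    SELECT ST_GeomFromWKT(_c0) AS countyshape, _c6 AS name
    FROM rawdf
""")
county_df.show(5)
```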
@@ -315,56 +236,6 @@ root
Since `v1.6.1`, Sedona supports reading GeoJSON files using the `geojson` data
source. It is designed to handle JSON files that use [GeoJSON
format](https://datatracker.ietf.org/doc/html/rfc7946) for their geometries.
-This includes SpatioTemporal Asset Catalog (STAC) files, GeoJSON features,
GeoJSON feature collections and other variations.
-The key functionality lies in the way 'geometry' fields are processed: these
are specifically read as Sedona's `GeometryUDT` type, ensuring integration with
Sedona's suite of spatial functions.
-
-### Key features
-
-- Broad Support: The reader and writer are versatile, supporting all
GeoJSON-formatted files, including STAC files, feature collections, and more.
-- Geometry Transformation: When reading, fields named 'geometry' are
automatically converted from GeoJSON format to Sedona's `GeometryUDT` type and
vice versa when writing.
-
-### Load MultiLine GeoJSON FeatureCollection
-
-Suppose we have a GeoJSON FeatureCollection file as follows.
-This entire file is considered as a single GeoJSON FeatureCollection object.
-Multiline format is preferable for scenarios where files need to be
human-readable or manually edited.
-
-```json
-{ "type": "FeatureCollection",
- "features": [
- { "type": "Feature",
- "geometry": {"type": "Point", "coordinates": [102.0, 0.5]},
- "properties": {"prop0": "value0"}
- },
- { "type": "Feature",
- "geometry": {
- "type": "LineString",
- "coordinates": [
- [102.0, 0.0], [103.0, 1.0], [104.0, 0.0], [105.0, 1.0]
- ]
- },
- "properties": {
- "prop0": "value1",
- "prop1": 0.0
- }
- },
- { "type": "Feature",
- "geometry": {
- "type": "Polygon",
- "coordinates": [
- [ [100.0, 0.0], [101.0, 0.0], [101.0, 1.0],
- [100.0, 1.0], [100.0, 0.0] ]
- ]
- },
- "properties": {
- "prop0": "value2",
- "prop1": {"this": "that"}
- }
- }
- ]
-}
-```
-
Set the `multiLine` option to `True` to read multiline GeoJSON files.
=== "Python"
@@ -402,81 +273,7 @@ Set the `multiLine` option to `True` to read multiline
GeoJSON files.
df.printSchema();
```
-The output is as follows:
-
-```
-+--------------------+------+
-| geometry| prop0|
-+--------------------+------+
-| POINT (102 0.5)|value0|
-|LINESTRING (102 0...|value1|
-|POLYGON ((100 0, ...|value2|
-+--------------------+------+
-
-root
- |-- geometry: geometry (nullable = false)
- |-- prop0: string (nullable = true)
-
-```
-
-### Load Single Line GeoJSON Features
-
-Suppose we have a single-line GeoJSON Features dataset as follows. Each line
is a single GeoJSON Feature.
-This format is efficient for processing large datasets where each line is a
separate, self-contained GeoJSON object.
-
-```json
-{"type":"Feature","geometry":{"type":"Point","coordinates":[102.0,0.5]},"properties":{"prop0":"value0"}}
-{"type":"Feature","geometry":{"type":"LineString","coordinates":[[102.0,0.0],[103.0,1.0],[104.0,0.0],[105.0,1.0]]},"properties":{"prop0":"value1"}}
-{"type":"Feature","geometry":{"type":"Polygon","coordinates":[[[100.0,0.0],[101.0,0.0],[101.0,1.0],[100.0,1.0],[100.0,0.0]]]},"properties":{"prop0":"value2"}}
-```
-
-By default, when `option` is not specified, Sedona reads a GeoJSON file as a
single line GeoJSON.
-
-=== "Python"
-
- ```python
- df = sedona.read.format("geojson").load("PATH/TO/MYFILE.json")
- .withColumn("prop0",
f.expr("properties['prop0']")).drop("properties").drop("type")
-
- df.show()
- df.printSchema()
- ```
-
-=== "Scala"
-
- ```scala
- val df = sedona.read.format("geojson").load("PATH/TO/MYFILE.json")
- .withColumn("prop0",
expr("properties['prop0']")).drop("properties").drop("type")
-
- df.show()
- df.printSchema()
- ```
-
-=== "Java"
-
- ```java
- Dataset<Row> df =
sedona.read.format("geojson").load("PATH/TO/MYFILE.json")
- .withColumn("prop0",
expr("properties['prop0']")).drop("properties").drop("type")
-
- df.show()
- df.printSchema()
- ```
-
-The output is as follows:
-
-```
-+--------------------+------+
-| geometry| prop0|
-+--------------------+------+
-| POINT (102 0.5)|value0|
-|LINESTRING (102 0...|value1|
-|POLYGON ((100 0, ...|value2|
-+--------------------+------+
-
-root
- |-- geometry: geometry (nullable = false)
- |-- prop0: string (nullable = true)
-```
+See [this page](files/geojson-sedona-spark.md) for more information on loading
GeoJSON files.
## Load Shapefile
@@ -502,7 +299,7 @@ Since v`1.7.0`, Sedona supports loading Shapefile as a
DataFrame.
The input path can be a directory containing one or multiple shapefiles, or
path to a `.shp` file.
-See [this page](../files/shapefile-sedona-spark) for more information on
loading Shapefiles.
+See [this page](files/shapefiles-sedona-spark.md) for more information on
loading Shapefiles.
## Load GeoParquet
@@ -550,7 +347,31 @@ Please refer to [Reading Legacy Parquet
Files](../api/sql/Reading-legacy-parquet
GeoParquet file reader does not work on Databricks runtime when Photon
is enabled. Please disable Photon when using
GeoParquet file reader on Databricks runtime.
-See [this page](../files/geoparquet-sedona-spark) for more information on
loading GeoParquet.
+See [this page](files/geoparquet-sedona-spark.md) for more information on
loading GeoParquet.
+
+## Load data from STAC catalog
+
+The Sedona STAC data source allows you to read data from a SpatioTemporal Asset Catalog (STAC) API. The data source supports reading STAC items and collections.
+
+You can load a STAC collection from an S3 collection file object:
+
+```python
+df = sedona.read.format("stac").load(
+ "s3a://example.com/stac_bucket/stac_collection.json"
+)
+```
+
+You can also load a STAC collection from an HTTP/HTTPS endpoint:
+
+```python
+df = sedona.read.format("stac").load(
+
"https://earth-search.aws.element84.com/v1/collections/sentinel-2-pre-c1-l2a"
+)
+```
+
+The STAC data source supports predicate pushdown for spatial and temporal filters, applying them at the source level to reduce the amount of data that needs to be read.
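For example, a query of the hedged form below would let both predicates be pushed down (the polygon, dates, and view name are placeholders; `geometry` and `datetime` follow the STAC item schema):

```python
# Hedged sketch: spatial and temporal predicates that the STAC reader can push down.
# The polygon, date range, and view name are placeholders.
df.createOrReplaceTempView("stac_items")
filtered = sedona.sql("""
    SELECT id, datetime, geometry
    FROM stac_items
    WHERE ST_Intersects(geometry, ST_GeomFromText('POLYGON ((-123 37, -123 39, -121 39, -121 37, -123 37))'))
      AND datetime BETWEEN '2020-01-01' AND '2020-12-31'
""")
filtered.show()
```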
+
+See [this page](files/stac-sedona-spark.md) for more information on loading
data from STAC.
## Load data from JDBC data sources
@@ -612,7 +433,7 @@ For Postgis there is no need to add a query to convert
geometry types since it's
.withColumn("geom", f.expr("ST_GeomFromWKB(geom)")))
```
-## Load from GeoPackage
+## Load GeoPackage
Since v1.7.0, Sedona supports loading Geopackage file format as a DataFrame.
@@ -634,9 +455,9 @@ Since v1.7.0, Sedona supports loading Geopackage file
format as a DataFrame.
df = sedona.read.format("geopackage").option("tableName",
"tab").load("/path/to/geopackage")
```
-See [this page](../files/geopackage-sedona-spark) for more information on
loading GeoPackage.
+See [this page](files/geopackage-sedona-spark.md) for more information on
loading GeoPackage.
-## Load from OSM PBF
+## Load OSM PBF
Since v1.7.1, Sedona supports loading OSM PBF file format as a DataFrame.
@@ -732,14 +553,6 @@ and for relation
+-----+--------+--------+--------------------+--------------------+--------------------+--------------------+
```
-Known limitations (v1.7.0):
-
-- webp rasters are not supported
-- ewkb geometries are not supported
-- filtering based on geometries envelopes are not supported
-
-All points above should be resolved soon, stay tuned !
-
## Transform the Coordinate Reference System
Sedona doesn't control the coordinate unit (degree-based or meter-based) of all geometries in a Geometry column. The unit of all related distances in SedonaSQL is the same as the unit of all geometries in a Geometry column.
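As a hedged sketch (view and column names are illustrative), reprojecting degree-based WGS84 geometries to a meter-based CRS could look like:

```python
# Hedged sketch: reproject geometries from EPSG:4326 (degrees) to EPSG:3857 (meters)
# so that distance-related functions operate in meters. Names are illustrative.
county_df.createOrReplaceTempView("countydf")
projected_df = sedona.sql("""
    SELECT ST_Transform(countyshape, 'EPSG:4326', 'EPSG:3857') AS countyshape, name
    FROM countydf
""")
projected_df.show(5)
```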
@@ -828,7 +641,7 @@ The output will look like this:
+----------------+---+------+-------+
```
-See [this page](../concepts/clustering-algorithms) for more information on the
DBSCAN algorithm.
+See [this page](concepts/clustering-algorithms.md) for more information on the
DBSCAN algorithm.
## Calculate the Local Outlier Factor (LOF)
@@ -1393,7 +1206,7 @@ SELECT ST_AsText(countyshape)
FROM polygondf
```
-## Save as GeoJSON
+## Save GeoJSON
Since `v1.6.1`, the GeoJSON data source in Sedona can be used to save a
Spatial DataFrame to a single-line JSON file, with geometries written in
GeoJSON format.
@@ -1401,13 +1214,7 @@ Since `v1.6.1`, the GeoJSON data source in Sedona can be
used to save a Spatial
df.write.format("geojson").save("YOUR/PATH.json")
```
-The structure of the generated file will be like this:
-
-```json
-{"type":"Feature","geometry":{"type":"Point","coordinates":[102.0,0.5]},"properties":{"prop0":"value0"}}
-{"type":"Feature","geometry":{"type":"LineString","coordinates":[[102.0,0.0],[103.0,1.0],[104.0,0.0],[105.0,1.0]]},"properties":{"prop0":"value1"}}
-{"type":"Feature","geometry":{"type":"Polygon","coordinates":[[[100.0,0.0],[101.0,0.0],[101.0,1.0],[100.0,1.0],[100.0,0.0]]]},"properties":{"prop0":"value2"}}
-```
+See [this page](files/geojson-sedona-spark.md) for more information on writing
to GeoJSON.
## Save GeoParquet
@@ -1417,7 +1224,7 @@ Since v`1.3.0`, Sedona natively supports writing
GeoParquet file. GeoParquet can
df.write.format("geoparquet").save(geoparquetoutputlocation +
"/GeoParquet_File_Name.parquet")
```
-See [this page](../files/geoparquet-sedona-spark) for more information on
writing to GeoParquet.
+See [this page](files/geoparquet-sedona-spark.md) for more information on
writing to GeoParquet.
## Save to Postgis
diff --git a/mkdocs.yml b/mkdocs.yml
index eb478fc515..6f1cf1c5e1 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -66,6 +66,10 @@ nav:
- GeoParquet: tutorial/files/geoparquet-sedona-spark.md
- GeoJSON: tutorial/files/geojson-sedona-spark.md
- Shapefiles: tutorial/files/shapefiles-sedona-spark.md
+ - STAC catalog: tutorial/files/stac-sedona-spark.md
+ - Concepts:
+ - Spatial Joins: tutorial/concepts/spatial-joins.md
+ - Clustering Algorithms:
tutorial/concepts/clustering-algorithms.md
- Map visualization SQL app:
- Scala/Java: tutorial/viz.md
- Use Apache Zeppelin: tutorial/zeppelin.md
@@ -81,9 +85,6 @@ nav:
- Examples:
- Scala/Java: tutorial/demo.md
- Python: tutorial/jupyter-notebook.md
- - Concepts:
- - Spatial Joins: tutorial/concepts/spatial-joins.md
- - Clustering Algorithms: tutorial/concepts/clustering-algorithms.md
- API Docs:
- Sedona with Apache Spark:
- SQL:
@@ -97,7 +98,6 @@ nav:
- Query optimization: api/sql/Optimizer.md
- Nearest-Neighbour searching:
api/sql/NearestNeighbourSearching.md
- "Spider:Spatial Data Generator": api/sql/Spider.md
- - Reading STAC Data Source: api/sql/Stac.md
- Reading Legacy Parquet Files:
api/sql/Reading-legacy-parquet.md
- Visualization:
- SedonaPyDeck: api/sql/Visualization_SedonaPyDeck.md
@@ -145,11 +145,6 @@ nav:
- Make a release: community/publish.md
- Vote a release: community/vote.md
- Publications: community/publication.md
- - Use cases:
- - Spatially aggregate airports per country:
usecases/ApacheSedonaSQL_SpatialJoin_AirportsPerCountry.ipynb
- - Match foot traffic to Seattle coffee shops:
usecases/contrib/foot-traffic.ipynb
- - Raster image manipulation: usecases/ApacheSedonaRaster.ipynb
- - Read Overture Maps data: usecases/Sedona_OvertureMaps_GeoParquet.ipynb
- Apache Software Foundation:
- Foundation: asf/asf.md
- License: https://www.apache.org/licenses/" target="_blank
@@ -239,7 +234,5 @@ plugins:
- macros
- git-revision-date-localized:
type: datetime
- - mkdocs-jupyter:
- include_source: True
- mike:
canonical_version: "latest"