This is an automated email from the ASF dual-hosted git repository.

jiayu pushed a commit to branch branch-0.1.0 in repository https://gitbox.apache.org/repos/asf/sedona-db.git
commit acde5cd0afd7f03f28c4cb2fa9552e71d5c3defc Author: Kelly-Ann Dolor <[email protected]> AuthorDate: Tue Sep 23 17:17:58 2025 -0700 [DOCS] Changing navigation, fixing typos (#141) Co-authored-by: Copilot <[email protected]> --- CONTRIBUTING.md | 241 ++++++++++++++++++++++++++++++++++++++++++--- docs/contributors-guide.md | 4 +- docs/crs-examples.ipynb | 6 +- docs/crs-examples.md | 8 +- mkdocs.yml | 65 ++++++++---- 5 files changed, 282 insertions(+), 42 deletions(-) diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 940d7b1..0fda9ad 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -17,28 +17,243 @@ under the License. --> -# How to contribute to Apache SedonaDB +<!--- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at -Welcome! We'd love to have you contribute to Apache SedonaDB! + http://www.apache.org/licenses/LICENSE-2.0 -## Did you find a bug? + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. +--> -Create an issue with a reproducible example. Please specify the SedonaDB version with a code snippet and error message. +# Contributors Guide -## Did you create a PR to fix a bug? +This guide details how to set up your development environment as a SedonaDB Contributor. -See [here](https://sedona.apache.org/latest/community/rule/#make-a-pull-request) for instructions on how to open PRs. 
+## Fork and clone the repository -We appreciate bug fixes - thank you in advance! +Your first step is to create a personal copy of the repository and connect it to the main project. -## Would you like to add a new feature or change existing code? +1. Fork the repository -If you would like to add a feature or change existing behavior, please make sure to create an issue ticket and get the planned work approved by the core team first! + * Navigate to the official [SedonaDB GitHub repository](https://github.com/apache/sedona-db). + * Click the **Fork** button in the top-right corner. This creates a complete copy of the project in your own GitHub account. -It's always better to get aligned with the core devs before writing any code. +1. Clone your fork -## Do you have questions about the source code? + * Next, clone your newly created fork to your local machine. This command downloads the repository into a new folder named `sedona-db`. + * Replace `YourUsername` with your actual GitHub username. -Feel free to create an issue or join the [Discord](https://discord.gg/9A3k5dEBsY) with questions! + ```shell + git clone https://github.com/YourUsername/sedona-db.git + cd sedona-db + ``` -Thanks for reading and looking forward to collaborating with you! +1. Configure the remotes + + * Your local repository needs to know where the original project is so you can pull in updates. You'll add a remote link, traditionally named **`upstream`**, to the main SedonaDB repository. + * Your fork is automatically configured as the **`origin`** remote. + + ```shell + # Add the main repository as the "upstream" remote + git remote add upstream https://github.com/apache/sedona-db.git + ``` + +1. Verify the configuration + + * Run the following command to verify that you have two remotes configured correctly: `origin` (your fork) and `upstream` (the main repository). 
+ + ```shell + git remote -v + ``` + + * The output should look like this: + + ```shell + origin https://github.com/YourUsername/sedona-db.git (fetch) + origin https://github.com/YourUsername/sedona-db.git (push) + upstream https://github.com/apache/sedona-db.git (fetch) + upstream https://github.com/apache/sedona-db.git (push) + ``` + +## Rust + +SedonaDB is written in Rust and is a standard `cargo` workspace. + +You can install a recent version of the Rust compiler and cargo from +[rustup.rs](https://rustup.rs/) and run tests using `cargo test`. + +A local development version of the CLI can be run with `cargo run --bin sedona-cli`. + +### Test data setup + +Some tests require submodules that contain test data or pinned versions of +external dependencies. These submodules can be initialized with: + +```shell +git submodule init +git submodule update --recursive +``` + +Additionally, some of the data required in the tests can be downloaded by running the following script. + +```bash +python submodules/download-assets.py +``` + +### System dependencies + +Some crates wrap external native libraries and require system dependencies +to build. + +!!!note "`sedona-s2geography`" + At this time, the only crate that requires this is the `sedona-s2geography` + crate, which requires [CMake](https://cmake.org), + [Abseil](https://github.com/abseil/abseil-cpp) and OpenSSL. + +#### macOS + +These can be installed on macOS with [Homebrew](https://brew.sh): + +```shell +brew install abseil openssl cmake geos +``` + +#### Linux and Windows + +On Linux and Windows, it is recommended to use [vcpkg](https://github.com/microsoft/vcpkg) +to provide external dependencies. 
This can be done by setting the `CMAKE_TOOLCHAIN_FILE` +environment variable: + +```shell +export CMAKE_TOOLCHAIN_FILE=/path/to/vcpkg/scripts/buildsystems/vcpkg.cmake +``` + +#### Visual Studio Code (VSCode) Configuration + +When using VSCode, it may be necessary to set this environment variable in `settings.json` +such that it can be found by rust-analyzer when running build/run tasks: + +```json +{ + "rust-analyzer.runnables.extraEnv": { + "CMAKE_TOOLCHAIN_FILE": "/path/to/vcpkg/scripts/buildsystems/vcpkg.cmake" + }, + "rust-analyzer.cargo.extraEnv": { + "CMAKE_TOOLCHAIN_FILE": "/path/to/vcpkg/scripts/buildsystems/vcpkg.cmake" + } +} +``` + +## Python + +Python bindings to SedonaDB are built with the [Maturin](https://www.maturin.rs) build +backend. + +To install a development version of the main Python bindings for the first time, run the following commands: + +```shell +cd python/sedonadb +pip install -e ".[test]" +``` + +If editing Rust code in either SedonaDB or the Python bindings, you can recompile the +native component with: + +```shell +maturin develop +``` + +## Debugging + +### Rust + +Debugging Rust code is most easily done by writing or finding a test that triggers +the desired behavior and running it using the *Debug* selection in +[VSCode](https://code.visualstudio.com/) with the +[rust-analyzer](https://marketplace.visualstudio.com/items?itemName=rust-lang.rust-analyzer) +extension. Rust code can also be debugged using the CLI by finding the `main()` function in +`sedona-cli` and choosing the *Debug* run option. + +### Python, C, and C++ + +Installation of Python bindings with `maturin develop` ensures a debug-friendly build for +debugging Rust, Python, or C/C++ code. 
Python code can be debugged using breakpoints in +any IDE that supports debugging an editable Python package installation (e.g., VSCode); +Rust, C, or C++ code can be debugged using the +[CodeLLDB](https://marketplace.visualstudio.com/items?itemName=vadimcn.vscode-lldb) +*Attach to Process...* command from the command palette in VSCode. + +## Low-level benchmarking + +Low-level Rust benchmarks use [criterion](https://github.com/bheisler/criterion.rs). +In general, there is at least one benchmark for every implementation of a function +(some functions have more than one implementation provided by different libraries), +and a few other benchmarks for low-level iteration where work was done to optimize +specific cases. + +### Running benchmarks + +Benchmarks for a specific crate can be run with `cargo bench`: + +```shell +cd rust/sedona-geo +cargo bench +``` + +Benchmarks for a specific function can be run with a filter. These can be run +from the workspace or a specific crate (although the output is usually easier +to read for a specific crate). + +```shell +cargo bench -- st_area +``` + +### Managing results + +By default, criterion saves the last run and will report the difference between the +current benchmark and the last time it was run (although there are options to +save and load various baselines). + +A report of the latest results for all benchmarks can be opened with the following command: + +=== "macOS" + ```shell + open target/criterion/report/index.html + ``` +=== "Ubuntu" + ```shell + xdg-open target/criterion/report/index.html + ``` + +All previous saved benchmark runs can be cleared with: + +```shell +rm -rf target/criterion +``` + +## Documentation + +To contribute to the SedonaDB documentation: + +1. Clone the repository and create a fork. +1. Install the Documentation dependencies: + ```sh + pip install -r docs/requirements.txt + ``` +1. Make your changes to the documentation files. +1. 
Preview your changes locally using these commands: + * `mkdocs serve` - Start the live-reloading docs server. + * `mkdocs build` - Build the documentation site. + * `mkdocs -h` - Print help message and exit. +1. Push your changes and open a pull request. diff --git a/docs/contributors-guide.md b/docs/contributors-guide.md index 2183c65..34589e7 100644 --- a/docs/contributors-guide.md +++ b/docs/contributors-guide.md @@ -27,7 +27,7 @@ Your first step is to create a personal copy of the repository and connect it to 1. Fork the repository - * Navigate to the official [Apache SedonaDB GitHub repository](https://github.com/apache/sedona-db). + * Navigate to the official [SedonaDB GitHub repository](https://github.com/apache/sedona-db). * Click the **Fork** button in the top-right corner. This creates a complete copy of the project in your own GitHub account. 1. Clone your fork @@ -42,7 +42,7 @@ Your first step is to create a personal copy of the repository and connect it to 1. Configure the remotes - * Your local repository needs to know where the original project is so you can pull in updates. You'll add a remote link, traditionally named **`upstream`**, to the main Apache SedonaDB repository. + * Your local repository needs to know where the original project is so you can pull in updates. You'll add a remote link, traditionally named **`upstream`**, to the main SedonaDB repository. * Your fork is automatically configured as the **`origin`** remote. 
```shell diff --git a/docs/crs-examples.ipynb b/docs/crs-examples.ipynb index 0548891..920f836 100644 --- a/docs/crs-examples.ipynb +++ b/docs/crs-examples.ipynb @@ -5,7 +5,7 @@ "id": "91910e50-a5ae-4d5a-a431-62ac5fbc11ca", "metadata": {}, "source": [ - "# Joining Geospatial Data with Different CRSs.\n", + "# Joining Spatial Data with Different Coordinate Systems\n", "\n", "> Note: Before running this notebook, ensure that you have installed SedonaDB: `pip install \"sedona[db]\"`\n", "\n", @@ -156,7 +156,7 @@ "id": "561b3c8c-4952-4fa7-9fe1-3fa0522b0d9f", "metadata": {}, "source": [ - "### Join with mismatched CRSs\n", + "### Join with mismatched Coordinate Reference Systems\n", "\n", "The cities and countries tables have different CRSs.\n", "\n", @@ -293,7 +293,7 @@ "\n", "The example highlights the following features:\n", "\n", - "1. SedonaDB reads the CRS stored in the files/\n", + "1. SedonaDB reads the CRS stored in the files.\n", "2. SedonaDB protects you from accidentally joining files with mismatched CRSs.\n", "3. It's easy to convert a GeoPandas DataFrame to a SedonaDB DataFrame and maintain the CRS." ] diff --git a/docs/crs-examples.md b/docs/crs-examples.md index 472469c..a5fbd02 100644 --- a/docs/crs-examples.md +++ b/docs/crs-examples.md @@ -17,7 +17,7 @@ under the License. --> -# Joining Geospatial Data with Different CRSs. +# Joining Spatial Data with Different Coordinate Systems > Note: Before running this notebook, ensure that you have installed SedonaDB: > `pip install "sedona[db]"` @@ -106,7 +106,7 @@ cities.to_view("cities", overwrite=True) countries.to_view("countries", overwrite=True) ``` -### Join with mismatched CRSs +### Join with mismatched Coordinate Reference Systems The cities and countries tables have different CRSs. 
@@ -138,7 +138,7 @@ where ST_Intersects(cities.geometry, countries.geometry) ----> 6 """).show() - File ~/sedona-db/python/sedonadb/python/sedonadb/dataframe.py:380, in DataFrame.show(self, limit, width, ascii) + File ~/sedona-db/sedona-db/python/sedonadb/python/sedonadb/dataframe.py:380, in DataFrame.show(self, limit, width, ascii) 356 """Print the first limit rows to the console 357 358 Args: @@ -214,7 +214,7 @@ This example shows how to join a `vermont` table with an EPSG 32618 CRS with a ` The example highlights the following features: -1. SedonaDB reads the CRS stored in the files/ +1. SedonaDB reads the CRS stored in the files. 2. SedonaDB protects you from accidentally joining files with mismatched CRSs. 3. It's easy to convert a GeoPandas DataFrame to a SedonaDB DataFrame and maintain the CRS. diff --git a/mkdocs.yml b/mkdocs.yml index 314b68b..6b8babe 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -15,29 +15,58 @@ # specific language governing permissions and limitations # under the License. 
+extra: + version: + provider: mike + default: + - latest + social: + - icon: fontawesome/brands/github + link: 'https://github.com/apache/sedona-db/' + - icon: fontawesome/brands/twitter + link: 'https://twitter.com/ApacheSedona' + - icon: fontawesome/brands/discord + link: 'https://discord.gg/9A3k5dEBsY' + community_links: &community_links + - Get Involved: + - Sedona OSS Blog: "https://sedona.apache.org/latest/blog/" + - Community: "https://sedona.apache.org/latest/community/contact/" + - Apache Software Foundation: "https://sedona.apache.org/latest/asf/asf/" + site_name: SedonaDB -site_description: "Documentation for Apache SedonaDB" +site_description: "Documentation for SedonaDB" site_url: https://sedona.apache.org/sedonadb/ nav: - - SedonaDB: index.md - - Quickstart: quickstart-python.md - - SedonaDB Guides: - - Working with Vector Data: programming-guide.md - - Working with GeoPandas: geopandas-interop.md - - Working with Overture: overture-examples.md - - Working with Parquet Files: working-with-parquet-files.md - - Joining Geospatial Data with Different CRSs: crs-examples.md - - Contributors Guide: contributors-guide.md - - SedonaDB Reference: + # This becomes the 'SedonaDB' tab + - SedonaDB: + - Home: index.md + # The alias (*) works perfectly from the 'extra' block + - <<: *community_links + + # This becomes the 'Quickstart' tab + - Python Quickstart: + - quickstart-python.md + - <<: *community_links + + # This becomes the 'SedonaDB Guides' tab + - SedonaDB Guides: + - Working with Vector Data: programming-guide.md + - Working with GeoPandas: geopandas-interop.md + - Working with Overture: overture-examples.md + - Working with Parquet Files: working-with-parquet-files.md + - Contributors Guide: contributors-guide.md + - Joining Spatial Data with Different Coordinate Systems: crs-examples.md + - <<: *community_links + + # This becomes the 'SedonaDB Reference' tab + - SedonaDB Reference: - Python: - Python Functions: reference/python.md - SQL: - SQL 
Functions: reference/sql.md - Spatial Joins: reference/sql-joins.md - - Blog: "https://sedona.apache.org/latest/blog/" - - Community: "https://sedona.apache.org/latest/community/contact/" - - Apache Software Foundation: "https://sedona.apache.org/latest/asf/asf/" - - Sedona Homepage: "https://sedona.apache.org/latest/" + - <<: *community_links + - Sedona Homepage: "https://sedona.apache.org/" repo_url: https://github.com/apache/sedona-db edit_uri: https://github.com/apache/sedona-db/blob/main/docs/ @@ -67,11 +96,7 @@ theme: - navigation.sections - navigation.tabs - navigation.tabs.sticky -extra: - version: - provider: mike - default: - - latest + extra_css: - stylesheets/extra.css
