jiayuasu commented on code in PR #1963:
URL: https://github.com/apache/sedona/pull/1963#discussion_r2114920224
##########
docs/setup/databricks.md:
##########
@@ -17,38 +17,93 @@
 under the License.
 -->
-In Databricks advanced editions, you need to install Sedona via [cluster init-scripts](https://docs.databricks.com/clusters/init-scripts.html) as described below. Sedona is not guaranteed to be 100% compatible with `Databricks photon acceleration`. Sedona requires Spark internal APIs to inject many optimization strategies, which sometimes is not accessible in `Photon`.
-
-The following steps use DBR including Apache Spark 3.5.x as an example. Please change the Spark version according to your DBR version. Please pay attention to the Spark version postfix and Scala version postfix on our [Maven Coordinate page](maven-coordinates.md). Databricks Spark and Apache Spark's compatibility can be found [here](https://docs.databricks.com/en/release-notes/runtime/index.html).
-
-!!! bug
-    Databricks Runtime 16.2 (non-LTS) introduces a change in the json4s dependency, which may lead to compatibility issues with Apache Sedona. We recommend using a currently supported LTS version, such as Databricks Runtime 15.4 LTS or 14.3 LTS, to ensure stability. A patch will be provided once an official Databricks Runtime 16 LTS version is released.
-
-### Download Sedona jars
-
-Download the Sedona jars to a DBFS location. You can do that manually via UI or from a notebook by executing this code in a cell:
-
-```bash
+You can run Sedona in Databricks to leverage the functionality that Sedona provides. Here’s an example of a Databricks notebook that’s running Sedona code:
+
+
+
+Sedona isn’t available in all Databricks environments because of the platform's limitations. This post explains how and where you can run Sedona in Databricks.
+
+This post also demonstrates how to read Databricks Delta tables in environments outside Databricks.
+
+## Databricks and Sedona version requirements
+
+Databricks and Sedona depend on Spark, Scala, and other libraries.
+
+For example, one Databricks Runtime 16.4 depends on Scala 2.12 and Spark 3.5. Here are the version requirements for a few Databricks runtimes.
+
+
+
+If you use a Databricks Runtime compiled with Spark 3.5 and Scala 2.12, then you should use a Sedona version compiled with Spark 3.5 and Scala 2.12. You need to make sure the Scala versions are aligned, even if you’re using the Python or SQL APIs.
+
+Here are the recommended Databricks Runtimes for different Spark and Sedona versions:
+
+<table>

Review Comment:
   I don't think we should provide this table, because it goes out of date as soon as either we or Databricks release a new version.
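
   One possible alternative to a static table is to show readers how to derive the matching artifact from the Spark and Scala versions of their own runtime. The sketch below is only an illustration: `sedona_artifact_id` is a hypothetical helper, and it assumes the Sedona shaded jar is named `sedona-spark-shaded-<spark major.minor>_<scala major.minor>`, which should be checked against the Maven coordinates page because older Spark releases may share one artifact.

```python
def sedona_artifact_id(spark_version: str, scala_version: str) -> str:
    """Build a Sedona shaded-jar artifact ID from full version strings.

    Hypothetical helper. Assumes the artifact suffix is simply the Spark
    and Scala major.minor versions; verify against the Maven coordinates
    page before relying on it.
    """
    # Keep only the major.minor part, e.g. "3.5.2" -> "3.5".
    spark_major_minor = ".".join(spark_version.split(".")[:2])
    scala_major_minor = ".".join(scala_version.split(".")[:2])
    return f"sedona-spark-shaded-{spark_major_minor}_{scala_major_minor}"


# Example values matching the Spark 3.5 / Scala 2.12 case in the diff.
print(sedona_artifact_id("3.5.2", "2.12.18"))  # sedona-spark-shaded-3.5_2.12
```

   In a notebook, the Spark version string can be read from `spark.version`; the Scala version would have to come from the Databricks Runtime release notes or the JVM.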
