(sedona) 01/01: Add Microsoft Fabric tutorial

jiayu Mon, 22 Apr 2024 00:33:20 -0700

This is an automated email from the ASF dual-hosted git repository.

jiayu pushed a commit to branch fabric
in repository https://gitbox.apache.org/repos/asf/sedona.git


commit a4c0edee9d040490c584e257d38e9c07072ce393
Author: Jia Yu <[email protected]>
AuthorDate: Mon Apr 22 00:32:56 2024 -0700

    Add Microsoft Fabric tutorial
---
 docs/image/fabric/fabric-1.png | Bin 0 -> 209175 bytes
 docs/image/fabric/fabric-2.png | Bin 0 -> 75166 bytes
 docs/image/fabric/fabric-3.png | Bin 0 -> 89032 bytes
 docs/image/fabric/fabric-4.png | Bin 0 -> 97093 bytes
 docs/image/fabric/fabric-5.png | Bin 0 -> 103507 bytes
 docs/image/fabric/fabric-6.png | Bin 0 -> 189504 bytes
 docs/image/fabric/fabric-7.png | Bin 0 -> 123955 bytes
 docs/image/fabric/fabric-8.png | Bin 0 -> 146759 bytes
 docs/image/fabric/fabric-9.png | Bin 0 -> 150114 bytes
 docs/setup/databricks.md       |   3 --
 docs/setup/emr.md              |   3 --
 docs/setup/fabric.md           |  89 +++++++++++++++++++++++++++++++++++++++++
 docs/setup/wherobots.md        |   6 +--
 mkdocs.yml                     |   6 ++-
 14 files changed, 96 insertions(+), 11 deletions(-)

diff --git a/docs/image/fabric/fabric-1.png b/docs/image/fabric/fabric-1.png
new file mode 100644
index 000000000..fa00a5839
Binary files /dev/null and b/docs/image/fabric/fabric-1.png differ
diff --git a/docs/image/fabric/fabric-2.png b/docs/image/fabric/fabric-2.png
new file mode 100644
index 000000000..64992734b
Binary files /dev/null and b/docs/image/fabric/fabric-2.png differ
diff --git a/docs/image/fabric/fabric-3.png b/docs/image/fabric/fabric-3.png
new file mode 100644
index 000000000..67f0dc8bb
Binary files /dev/null and b/docs/image/fabric/fabric-3.png differ
diff --git a/docs/image/fabric/fabric-4.png b/docs/image/fabric/fabric-4.png
new file mode 100644
index 000000000..6d8b705a2
Binary files /dev/null and b/docs/image/fabric/fabric-4.png differ
diff --git a/docs/image/fabric/fabric-5.png b/docs/image/fabric/fabric-5.png
new file mode 100644
index 000000000..f4f3b7bc0
Binary files /dev/null and b/docs/image/fabric/fabric-5.png differ
diff --git a/docs/image/fabric/fabric-6.png b/docs/image/fabric/fabric-6.png
new file mode 100644
index 000000000..00b250cf2
Binary files /dev/null and b/docs/image/fabric/fabric-6.png differ
diff --git a/docs/image/fabric/fabric-7.png b/docs/image/fabric/fabric-7.png
new file mode 100644
index 000000000..2162e33b1
Binary files /dev/null and b/docs/image/fabric/fabric-7.png differ
diff --git a/docs/image/fabric/fabric-8.png b/docs/image/fabric/fabric-8.png
new file mode 100644
index 000000000..eb0ac3c2c
Binary files /dev/null and b/docs/image/fabric/fabric-8.png differ
diff --git a/docs/image/fabric/fabric-9.png b/docs/image/fabric/fabric-9.png
new file mode 100644
index 000000000..45effa93a
Binary files /dev/null and b/docs/image/fabric/fabric-9.png differ
diff --git a/docs/setup/databricks.md b/docs/setup/databricks.md
index ea86d37f2..1e26805f6 100644
--- a/docs/setup/databricks.md
+++ b/docs/setup/databricks.md
@@ -6,9 +6,6 @@ You just need to install the Sedona jars and Sedona Python on 
Databricks using D
 
 We recommend Databricks 10.x+.
 
-!!!tip
-       Wherobots Cloud provides a free tool to deploy Apache Sedona to 
Databricks. Please sign up [here](https://www.wherobots.services/).
-
 * Sedona 1.0.1 & 1.1.0 is compiled against Spark 3.1 (~ Databricks DBR 9 LTS, 
DBR 7 is Spark 3.0)
 * Sedona 1.1.1, 1.2.0 are compiled against Spark 3.2 (~ DBR 10 & 11)
 * Sedona 1.2.1, 1.3.1, 1.4.0 are compiled against Spark 3.3
diff --git a/docs/setup/emr.md b/docs/setup/emr.md
index 237dff7b4..6d687f35e 100644
--- a/docs/setup/emr.md
+++ b/docs/setup/emr.md
@@ -1,8 +1,5 @@
 We recommend Sedona-1.3.1-incubating and above for EMR. In the tutorial, we 
use AWS Elastic MapReduce (EMR) 6.9.0. It has the following applications 
installed: Hadoop 3.3.3, JupyterEnterpriseGateway 2.6.0, Livy 0.7.1, Spark 
3.3.0.
 
-!!!tip
-       Wherobots Cloud provides a free tool to deploy Apache Sedona to AWS 
EMR. Please sign up [here](https://www.wherobots.services/).
-
 This tutorial is tested on EMR on EC2 with EMR Studio (notebooks). EMR on EC2 
uses YARN to manage resources.
 
 !!!note
diff --git a/docs/setup/fabric.md b/docs/setup/fabric.md
new file mode 100644
index 000000000..0326c2aaf
--- /dev/null
+++ b/docs/setup/fabric.md
@@ -0,0 +1,89 @@
+This tutorial will guide you through the process of installing Sedona on 
Microsoft Fabric Synapse Data Engineering's Spark environment.
+
+## Step 1: Open Microsoft Fabric Synapse Data Engineering
+
+Go to the [Microsoft Fabric portal](https://app.fabric.microsoft.com/) and 
choose the `Data Engineering` option.
+
+![](../../image/fabric/fabric-1.png)
+
+## Step 2: Create a Microsoft Fabric Data Engineering environment
+
+On the left side, click `My Workspace` and then click `+ New` to create a new 
`Environment`. Let's name it `ApacheSedona`.
+
+![](../../image/fabric/fabric-2.png)
+
+## Step 3: Select the Apache Spark version
+
+In the `Environment` page, click the `Home` tab and select the appropriate 
version of Apache Spark. You will need this version to install the correct 
version of Apache Sedona.
+
+![](../../image/fabric/fabric-3.png)
+
+## Step 4: Install the Sedona Python package
+
+In the `Environment` page, click the `Public libraries` tab and then type in 
`apache-sedona`. Please select the appropriate version of Apache Sedona. The 
source is `PyPI`.
+
+![](../../image/fabric/fabric-4.png)
+
+## Step 5: Save and publish the environment
+
+Click the `Save` button and then click the `Publish` button to save and 
publish the environment. This will create the environment with the Apache 
Sedona Python package installed. The publishing process will take about 10 
minutes.
+
+![](../../image/fabric/fabric-5.png)
+
+## Step 6: Download Sedona jars
+
+1. Find the Sedona jars you need on the [Sedona Maven coordinates](maven-coordinates.md) page.
+2. Download the `sedona-spark-shaded` jars from [Maven 
Central](https://search.maven.org/search?q=g:org.apache.sedona). Please pay 
attention to the Spark version and Scala version of the jars. If you select 
Spark 3.4 in the Fabric environment, you should download the Sedona jars with 
Spark 3.4 and Scala 2.12 and the jar name should be like 
`sedona-spark-shaded-3.4_2.12-1.5.1.jar`.
+3. Download the `geotools-wrapper` jar from [Maven Central](https://search.maven.org/search?q=g:org.datasyslab). Please pay attention to the Sedona version of the jar. If you select Sedona 1.5.1, you should download the `geotools-wrapper` jar with version 1.5.1, and the jar name should look like `geotools-wrapper-1.5.1-28.2.jar`.
+
+## Step 7: Upload Sedona jars to the Fabric environment LakeHouse storage
+
+In the notebook page, choose the `Explorer` and click the `LakeHouses` option. 
If you don't have a LakeHouse, you can create one. Then choose `Files` and 
upload the 2 jars you downloaded in the previous step.
+
+After the upload, you should be able to see the 2 jars in the LakeHouse 
storage. Then please copy the `ABFS` paths of the 2 jars. In this example, the 
paths are
+
+```angular2html
+abfss://9e9d4196-870a-4901-8fa5-e24841492...@onelake.dfs.fabric.microsoft.com/e15f3695-af7e-47de-979e-473c3caa9f5b/Files/sedona-spark-shaded-3.4_2.12-1.5.1.jar
+
+abfss://9e9d4196-870a-4901-8fa5-e24841492...@onelake.dfs.fabric.microsoft.com/e15f3695-af7e-47de-979e-473c3caa9f5b/Files/geotools-wrapper-1.5.1-28.2.jar
+```
+
+![](../../image/fabric/fabric-6.png)
+
+![](../../image/fabric/fabric-7.png)
+
+## Step 8: Start the notebook with the Sedona environment and install the jars
+
+In the notebook page, select the `ApacheSedona` environment you created before.
+
+![](../../image/fabric/fabric-8.png)
+
+In the notebook, you can install the jars by running the following code. Please replace the value of `spark.jars` with the `ABFS` paths of the 2 jars you uploaded in the previous step, separated by a comma.
+
+```python
+%%configure -f
+{
+    "conf": {
+        "spark.jars": "abfss://XXX/Files/sedona-spark-shaded-3.4_2.12-1.5.1.jar,abfss://XXX/Files/geotools-wrapper-1.5.1-28.2.jar"
+    }
+}
+```
+
+## Step 9: Verify the installation
+
+You can verify the installation by running the following code in the notebook.
+
+```python
+from sedona.spark import *
+
+
+sedona = SedonaContext.create(spark)
+
+
+sedona.sql("SELECT ST_GeomFromEWKT('SRID=4269;POINT(40.7128 -74.0060)')").show()
+```
+
+If you see the output of the point, then the installation is successful.
+
+![](../../image/fabric/fabric-9.png)
+
diff --git a/docs/setup/wherobots.md b/docs/setup/wherobots.md
index 78c38779f..1bff8d322 100644
--- a/docs/setup/wherobots.md
+++ b/docs/setup/wherobots.md
@@ -1,7 +1,7 @@
-## SedonaDB
+## WherobotsDB
 
-Wherobots Cloud offers fully-managed and fully provisioned cloud services for 
SedonaDB, a comprehensive spatial analytics database system. You can play with 
it using Wherobots Jupyter Scala and Python kernel. No installation is needed.
+Wherobots Cloud offers fully-managed and fully provisioned cloud services for 
WherobotsDB, a comprehensive spatial analytics database system. You can play 
with it using Wherobots Jupyter Scala and Python kernel. No installation is 
needed.
 
-SedonaDB is 100% compatible with Apache Sedona 1.5.0+ in terms of public APIs 
but provides more functionalities.
+WherobotsDB is 100% compatible with Apache Sedona in terms of public APIs but 
provides more functionalities and better performance.
 
 It is easy to migrate your existing Sedona workflow to Wherobots Cloud. Please 
sign up at [Wherobots Cloud](https://www.wherobots.services/).
diff --git a/mkdocs.yml b/mkdocs.yml
index 14c64eaa8..2b9ea36bf 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -23,7 +23,8 @@ nav:
         - Install on Wherobots: setup/wherobots.md
         - Install on Databricks: setup/databricks.md
         - Install on AWS EMR: setup/emr.md
-        - Set up Spark cluster: setup/cluster.md
+        - Install on Microsoft Fabric: setup/fabric.md
+        - Set up Spark cluster manually: setup/cluster.md
       - Install with Apache Flink:
         - Install Sedona Scala/Java: setup/flink/install-scala.md
       - Install with Snowflake:
@@ -196,4 +197,5 @@ plugins:
   - macros
   - git-revision-date-localized:
       type: datetime
-  - mkdocs-jupyter
+  - mkdocs-jupyter:
+            include_source: True
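
Note on Steps 6 and 8 of the new `docs/setup/fabric.md`: the two jar names and the `spark.jars` value follow a regular pattern, so they can be generated rather than typed by hand. The sketch below is only an illustration and is not part of the commit. The helper names `sedona_jar_names` and `configure_payload` are hypothetical, and the naming pattern is inferred from the two example file names in the tutorial, so please check it against Maven Central for other versions. `abfss://XXX/Files` is the same placeholder the tutorial uses.

```python
import json


def sedona_jar_names(spark_version, scala_version, sedona_version, geotools_version):
    """Return the two jar file names that Step 6 asks you to download.

    The pattern is an assumption inferred from the tutorial's examples.
    """
    shaded = f"sedona-spark-shaded-{spark_version}_{scala_version}-{sedona_version}.jar"
    geotools = f"geotools-wrapper-{sedona_version}-{geotools_version}.jar"
    return shaded, geotools


def configure_payload(files_dir, jar_names):
    """Build the JSON body that goes under the %%configure -f line in Step 8."""
    base = files_dir.rstrip("/")
    # spark.jars takes a single comma-separated string of jar paths.
    jars = ",".join(f"{base}/{name}" for name in jar_names)
    # json.dumps guarantees valid JSON (no trailing commas).
    return json.dumps({"conf": {"spark.jars": jars}}, indent=4)


names = sedona_jar_names("3.4", "2.12", "1.5.1", "28.2")
payload = configure_payload("abfss://XXX/Files", names)
print(payload)
```

Paste the printed JSON below `%%configure -f` in the first notebook cell, after replacing the placeholder directory with the real `ABFS` path copied from the LakeHouse.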
