This is an automated email from the ASF dual-hosted git repository.

ephraimanierobi pushed a commit to branch v2-8-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit ff104018545582eab9d57f86705b3844e1429c06
Author: Kenten Danas <[email protected]>
AuthorDate: Tue Jan 23 13:42:31 2024 -0800

    Update Objectstore tutorial with prereqs section (#36983)
    
    * Add prerequisites section to object storage tutorial
    
    * Fix trailing whitespace
    
    (cherry picked from commit 891b9bcc690623fff364376e165e2ccade855ab1)
---
 docs/apache-airflow/tutorial/objectstorage.rst | 28 ++++++++++++++++----------
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/docs/apache-airflow/tutorial/objectstorage.rst b/docs/apache-airflow/tutorial/objectstorage.rst
index 610450b931..943e8031a7 100644
--- a/docs/apache-airflow/tutorial/objectstorage.rst
+++ b/docs/apache-airflow/tutorial/objectstorage.rst
@@ -25,17 +25,23 @@ This tutorial shows how to use the Object Storage API to manage objects that
 reside on object storage, like S3, gcs and azure blob storage. The API is introduced
 as part of Airflow 2.8.
 
-The tutorial covers a simple pattern that is often used in data engineering and
-data science workflows: accessing a web api, saving and analyzing the result. For the
-tutorial to work you will need to have Duck DB installed, which is a in-process
-analytical database. You can do this by running ``pip install duckdb``. The tutorial
-makes use of S3 Object Storage. This requires that the amazon provider is installed
-including ``s3fs`` by running ``pip install apache-airflow-providers-amazon[s3fs]``.
-If you would like to use a different storage provider, you can do so by changing the
-URL in the ``create_object_storage_path`` function to the appropriate URL for your
-provider, for example by replacing ``s3://`` with ``gs://`` for Google Cloud Storage.
-You will also need the right provider to be installed then. Finally, you will need
-``pandas``, which can be installed by running ``pip install pandas``.
+The tutorial covers a simple pattern that is often used in data engineering and data
+science workflows: accessing a web api, saving and analyzing the result.
+
+Prerequisites
+-------------
+To complete this tutorial, you need a few things:
+
+- DuckDB, an in-process analytical database,
+  which can be installed by running ``pip install duckdb``.
+- An S3 bucket, along with the Amazon provider including ``s3fs``. You can install
+  the provider package by running
+  ``pip install apache-airflow-providers-amazon[s3fs]``.
+  Alternatively, you can use a different storage provider by changing the URL in
+  the ``create_object_storage_path`` function to the appropriate URL for your
+  provider, for example by replacing ``s3://`` with ``gs://`` for Google Cloud
+  Storage, and installing a different provider.
+- ``pandas``, which you can install by running ``pip install pandas``.
 
 
 Creating an ObjectStoragePath
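
The prerequisites text in the diff above notes that switching storage providers amounts to changing the URL scheme (e.g. ``s3://`` to ``gs://``) passed to the path-construction function. As a minimal illustration of that scheme swap using only the standard library (the bucket name and object path below are made up for the example, and this sketch does not touch the Airflow API itself):

```python
from urllib.parse import urlsplit, urlunsplit


def swap_scheme(url: str, new_scheme: str) -> str:
    """Return the same object-storage URL with a different scheme,
    e.g. s3 -> gs to target Google Cloud Storage instead of S3."""
    parts = urlsplit(url)
    return urlunsplit((new_scheme, parts.netloc, parts.path,
                       parts.query, parts.fragment))


# Hypothetical bucket/key, for illustration only.
print(swap_scheme("s3://my-bucket/raw/data.parquet", "gs"))
# -> gs://my-bucket/raw/data.parquet
```

Remember that, as the diff says, the provider package matching the new scheme must also be installed for the URL to resolve at runtime.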
