kacpermuda commented on code in PR #37620:
URL: https://github.com/apache/airflow/pull/37620#discussion_r1502249368


##########
docs/apache-airflow-providers-openlineage/guides/structure.rst:
##########
@@ -17,16 +17,60 @@
     under the License.
 
 
-Structure of OpenLineage Airflow integration
+OpenLineage Airflow integration
 --------------------------------------------
 
-OpenLineage integration implements AirflowPlugin. This allows it to be discovered on Airflow start and
-register Airflow Listener.
+OpenLineage is an open framework for data lineage collection and analysis.
+At its core is an extensible specification that systems can use to interoperate with lineage metadata.
+`Check out OpenLineage docs <https://openlineage.io/docs/>`_.
 
-The listener is then called when certain events happen in Airflow - when DAGs or TaskInstances start, complete or fail.
-For DAGs, the listener runs in Airflow Scheduler.
-For TaskInstances, the listener runs on Airflow Worker.
+Quickstart
+==========
+
+To instrument your Airflow instance with OpenLineage, see :ref:`guides/user:openlineage`.
+
+To implement OpenLineage support for Airflow Operators, see :ref:`guides/developer:openlineage`.
+
+What's in it for me ?
+=====================
+
+The metadata collected can answer questions like:
+
+- Why did specific data transformation fail?
+- What are the upstream sources feeding into certain dataset?
+- What downstream processes rely on this specific dataset?
+- Is my data fresh?
+- Can I identify the bottleneck in my data processing pipeline?
+- How did the latest code change affect data processing times?
+- How can I trace the cause of data inaccuracies in my report?
+- How are data privacy and compliance requirements being managed through the data's lifecycle?
+- Are there redundant data processes that can be optimized or removed?
+- What data dependencies exist for this critical report?
+
+Understanding complex inter-DAG dependencies and providing up-to-date runtime visibility into DAG execution can be challenging.
+OpenLineage integrates with Airflow to collect DAG lineage metadata so that inter-DAG dependencies are easily maintained
+and viewable via a lineage graph, while also keeping a catalog of historical runs of DAGs.
+
+.. image:: https://openlineage.io/assets/images/af-schematic-ad8c295a182cb32b94ee27b96727fa98.svg
+   :alt: airflow_lineage
+   :width: 1792
+
+For OpenLineage backend that will receive events, you can use `Marquez <https://marquezproject.ai/>`_
+
+.. image:: https://marquezproject.ai/img/screenshot.png

Review Comment:
   Removed that part.


