kacpermuda commented on code in PR #37620:
URL: https://github.com/apache/airflow/pull/37620#discussion_r1502248327
##########
airflow/providers/openlineage/provider.yaml:
##########
@@ -58,65 +58,67 @@ config:
openlineage:
description: |
This section applies settings for OpenLineage integration.
- For backwards compatibility with `openlineage-python` one can still use
- `openlineage.yml` file or `OPENLINEAGE_` environment variables. However, below
- configuration takes precedence over those.
- More in documentation - https://openlineage.io/docs/client/python#configuration.
+ More about configuration and its precedence can be found at
+ https://airflow.apache.org/docs/apache-airflow-providers-openlineage/stable/guides/user.html#transport-setup
options:
disabled:
description: |
- Set this to true if you don't want OpenLineage to emit events.
+ Disable sending events without uninstalling the OpenLineage Provider by setting this to true.
type: boolean
example: ~
default: "False"
version_added: ~
disabled_for_operators:
description: |
- Semicolon separated string of Airflow Operator names to disable
+ Exclude some Operators from emitting OpenLineage events by passing a string of semicolon separated
+ full import paths of Operators to disable.
type: string
example: "airflow.operators.bash.BashOperator;airflow.operators.python.PythonOperator"
default: ""
version_added: 1.1.0
namespace:
description: |
- OpenLineage namespace
+ Set namespace that the lineage data belongs to, so that if you use multiple OpenLineage producers,
+ events coming from them will be logically separated.
version_added: ~
type: string
- example: "food_delivery"
+ example: "my_airflow_instance_1"
default: ~
extractors:
description: |
- Semicolon separated paths to custom OpenLineage extractors.
+ Register custom OpenLineage Extractors by passing a string of semicolon separated full import paths.
type: string
example: full.path.to.ExtractorClass;full.path.to.AnotherExtractorClass
default: ~
version_added: ~
config_path:
description: |
- Path to YAML config. This provides backwards compatibility to pass config as
+ Provide path to YAML config file. This provides backwards compatibility to pass config as
`openlineage.yml` file.
Review Comment:
Changed that.
##########
airflow/providers/openlineage/provider.yaml:
##########
@@ -58,65 +58,67 @@ config:
openlineage:
description: |
This section applies settings for OpenLineage integration.
- For backwards compatibility with `openlineage-python` one can still use
- `openlineage.yml` file or `OPENLINEAGE_` environment variables. However, below
- configuration takes precedence over those.
- More in documentation - https://openlineage.io/docs/client/python#configuration.
+ More about configuration and its precedence can be found at
+ https://airflow.apache.org/docs/apache-airflow-providers-openlineage/stable/guides/user.html#transport-setup
options:
disabled:
description: |
- Set this to true if you don't want OpenLineage to emit events.
+ Disable sending events without uninstalling the OpenLineage Provider by setting this to true.
type: boolean
example: ~
default: "False"
version_added: ~
disabled_for_operators:
description: |
- Semicolon separated string of Airflow Operator names to disable
+ Exclude some Operators from emitting OpenLineage events by passing a string of semicolon separated
+ full import paths of Operators to disable.
type: string
example: "airflow.operators.bash.BashOperator;airflow.operators.python.PythonOperator"
default: ""
version_added: 1.1.0
namespace:
description: |
- OpenLineage namespace
+ Set namespace that the lineage data belongs to, so that if you use multiple OpenLineage producers,
+ events coming from them will be logically separated.
version_added: ~
type: string
- example: "food_delivery"
+ example: "my_airflow_instance_1"
default: ~
extractors:
description: |
- Semicolon separated paths to custom OpenLineage extractors.
+ Register custom OpenLineage Extractors by passing a string of semicolon separated full import paths.
type: string
example: full.path.to.ExtractorClass;full.path.to.AnotherExtractorClass
default: ~
version_added: ~
config_path:
description: |
- Path to YAML config. This provides backwards compatibility to pass config as
+ Provide path to YAML config file. This provides backwards compatibility to pass config as
`openlineage.yml` file.
version_added: ~
type: string
- example: ~
+ example: "full/path/to/openlineage.yml"
default: ""
transport:
description: |
- OpenLineage Client transport configuration. It should contain type
- and additional options per each type.
+ Pass OpenLineage Client transport configuration as JSON string. It should contain type of the
+ transport and additional options (different for each transport type). For more details see:
+ https://openlineage.io/docs/client/python/#built-in-transport-types
Currently supported types are:
* HTTP
* Kafka
* Console
+ * File
type: string
- example: '{"type": "http", "url": "http://localhost:5000"}'
+ example: '{"type": "http", "url": "http://localhost:5000", "endpoint": "api/v1/lineage"}'
default: ""
version_added: ~
disable_source_code:
description: |
- If disabled, OpenLineage events do not contain source code of particular
- operators, like PythonOperator.
+ Disable including source code in OpenLineage events by setting this to true. Several Operators (e.g.
+ Python, Bash) will by default include their source code in their OpenLineage events if not disabled.
Review Comment:
Changed that.
##########
docs/apache-airflow-providers-openlineage/guides/structure.rst:
##########
@@ -17,16 +17,60 @@
under the License.
-Structure of OpenLineage Airflow integration
+OpenLineage Airflow integration
--------------------------------------------
-OpenLineage integration implements AirflowPlugin. This allows it to be discovered on Airflow start and
-register Airflow Listener.
+OpenLineage is an open framework for data lineage collection and analysis.
+At its core is an extensible specification that systems can use to interoperate with lineage metadata.
Review Comment:
Changed that.
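
For readers of this thread, here is a rough sketch of how the options described in the hunks above could be supplied through Airflow's usual `AIRFLOW__{SECTION}__{KEY}` environment variable convention. The helper name `openlineage_env` is hypothetical and not part of the provider; the option names and example values are taken from the `provider.yaml` diff, and the provider itself is not imported or exercised here.

```python
import json


def openlineage_env(options: dict) -> dict:
    """Map [openlineage] option names to AIRFLOW__OPENLINEAGE__* variables.

    Hypothetical helper for illustration only; not part of the provider.
    """
    env = {}
    for key, value in options.items():
        if isinstance(value, dict):
            # `transport` is described as a JSON string.
            value = json.dumps(value)
        elif isinstance(value, (list, tuple)):
            # `extractors` and `disabled_for_operators` are
            # semicolon separated full import paths.
            value = ";".join(value)
        elif isinstance(value, bool):
            value = str(value)
        env[f"AIRFLOW__OPENLINEAGE__{key.upper()}"] = value
    return env


env = openlineage_env(
    {
        "namespace": "my_airflow_instance_1",
        "transport": {
            "type": "http",
            "url": "http://localhost:5000",
            "endpoint": "api/v1/lineage",
        },
        "disabled_for_operators": [
            "airflow.operators.bash.BashOperator",
            "airflow.operators.python.PythonOperator",
        ],
        "disable_source_code": True,
    }
)

for name, value in env.items():
    print(f"{name}={value}")
```

The printed `NAME=value` pairs would still need to be exported in the environment of the Airflow scheduler and workers; the same values could equally be placed under an `[openlineage]` section in `airflow.cfg`.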