pankajkoti commented on code in PR #31817:
URL: https://github.com/apache/airflow/pull/31817#discussion_r1227630848


##########
docs/apache-airflow-providers-openlineage/guides/developer.rst:
##########
@@ -50,4 +49,25 @@ Instead of returning complete OpenLineage event, the 
provider defines ``Operator
       run_facets: dict[str, BaseFacet] = Factory(dict)
       job_facets: dict[str, BaseFacet] = Factory(dict)
 
-OpenLineage integration takes care to enrich it with things like general 
Airflow facets, proper event time and type.
+OpenLineage integration itself takes care to enrich it with things like 
general Airflow facets, proper event time and type, creating proper OpenLineage 
RunEvent.
+
+How to properly implement OpenLineage methods?
+==============================================
+
+There are several things worth noting when implementing OpenLineage in 
operators.
+
+First, do not import OpenLineage methods on top-level, but in OL method itself.
+This allows users to use your provider even if they do not have OpenLineage 
provider installed.
+
+Another important point is to make sure your provider returns 
OpenLineage-compliant dataset names.
+It allows OpenLineage consumers to properly match information about datasets 
gathered from different sources.
+
+Naming convention is described in the `OpenLineage docs 
<https://openlineage.io/docs/spec/naming>`__.
+
+OpenLineage implementation should not waste time of users that do not use it.
+This means not doing heavy processing or network calls in the ``execute`` 
method that aren't used by it.
+Better option is to save relevant information in Operator attribute - and then 
use it
+in OpenLineage method.
+
+Good example is ``BigQueryExecuteQueryOperator``. It saves ``job_ids`` of 
queries that were executer.

Review Comment:
   ```suggestion
   Good example is ``BigQueryExecuteQueryOperator``. It saves ``job_ids`` of 
queries that were executed.
   ```



##########
docs/apache-airflow-providers-openlineage/guides/developer.rst:
##########
@@ -50,4 +49,25 @@ Instead of returning complete OpenLineage event, the 
provider defines ``Operator
       run_facets: dict[str, BaseFacet] = Factory(dict)
       job_facets: dict[str, BaseFacet] = Factory(dict)
 
-OpenLineage integration takes care to enrich it with things like general 
Airflow facets, proper event time and type.
+OpenLineage integration itself takes care to enrich it with things like 
general Airflow facets, proper event time and type, creating proper OpenLineage 
RunEvent.
+
+How to properly implement OpenLineage methods?
+==============================================
+
+There are several things worth noting when implementing OpenLineage in 
operators.
+
+First, do not import OpenLineage methods on top-level, but in OL method itself.
+This allows users to use your provider even if they do not have OpenLineage 
provider installed.
+
+Another important point is to make sure your provider returns 
OpenLineage-compliant dataset names.
+It allows OpenLineage consumers to properly match information about datasets 
gathered from different sources.
+
+Naming convention is described in the `OpenLineage docs 
<https://openlineage.io/docs/spec/naming>`__.
+
+OpenLineage implementation should not waste time of users that do not use it.
+This means not doing heavy processing or network calls in the ``execute`` 
method that aren't used by it.
+Better option is to save relevant information in Operator attribute - and then 
use it

Review Comment:
   ```suggestion
   Better option is to save relevant information in Operator attributes - and 
then use it
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to