eladkal commented on a change in pull request #7410: [AIRFLOW-6790] Add basic Tableau Integration
URL: https://github.com/apache/airflow/pull/7410#discussion_r381519644
 
 

 ##########
 File path: airflow/providers/salesforce/example_dags/example_tableau_refresh_workbook.py
 ##########
 @@ -0,0 +1,59 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+This is an example DAG that performs a refresh operation on a Tableau Workbook (aka Extract).
+Since this is an asynchronous operation, we don't know when the operation actually finishes.
+That's why we have a second task that checks exactly that, so you can perform further
+operations after the extract has been refreshed.
+"""
+from datetime import timedelta
+
+from airflow import DAG
+from airflow.providers.salesforce.operators.tableau_refresh_workbook import TableauRefreshWorkbookOperator
+from airflow.providers.salesforce.sensors.tableau_job_status import TableauJobStatusSensor
+from airflow.utils.dates import days_ago
+
+DEFAULT_ARGS = {
+    'owner': 'airflow',
+    'depends_on_past': False,
+    'start_date': days_ago(2),
+    'email': ['[email protected]'],
+    'email_on_failure': False,
+    'email_on_retry': False
+}
+
+with DAG(
+    dag_id='example_tableau_refresh_workbook',
+    default_args=DEFAULT_ARGS,
+    dagrun_timeout=timedelta(hours=2),
+    schedule_interval=None,
+    tags=['example'],
+) as dag:
+    task_refresh_workbook = TableauRefreshWorkbookOperator(
+        site_id='my_site',
+        workbook_name='MyWorkbook',
+        task_id='refresh_tableau_workbook',
+        dag=dag
+    )
+    task_check_job_status = TableauJobStatusSensor(
 
 Review comment:
   I think having a separate sensor or making it configurable is crucial.
   
   When you perform an extract, the processing of the job "moves" from the Airflow worker to the Tableau server. There is no reason for the Airflow operator to keep occupying the worker needlessly.
   
   BashOperator is different: the script runs on the Airflow worker's resources. It can also be used to connect to another service, which results in waiting time, but that is optional, unlike TableauRefreshWorkbookOperator, where the work is always executed on the Tableau server's resources.
   
   Think also about a case where many DAGs use TableauRefreshWorkbookOperator at the same time. If they all occupy workers until their extracts finish, the entire Airflow cluster might be paralyzed.
   
   I think @feluelle's concern about a 2nd refresh is a valid one.
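
   The trigger-then-poll split argued for above can be sketched in plain Python (no Airflow dependency). This is only an illustration of the pattern, not the PR's actual API: `trigger_refresh`, `get_status`, and the operator/sensor functions here are hypothetical stand-ins for the Tableau server calls and for TableauRefreshWorkbookOperator / TableauJobStatusSensor.

```python
import time

# Hypothetical stand-in for the Tableau server: triggering a refresh
# returns a job id immediately; the actual work continues server-side.
_JOBS = {}

def trigger_refresh(workbook_name):
    _JOBS[workbook_name] = {"polls_until_done": 3}
    return workbook_name  # use the workbook name as a fake job id

def get_status(job_id):
    # One cheap point-in-time status check against the server.
    job = _JOBS[job_id]
    if job["polls_until_done"] > 0:
        job["polls_until_done"] -= 1
        return "InProgress"
    return "Success"

def blocking_operator(workbook_name):
    # Anti-pattern discussed above: the operator holds a worker slot
    # for the entire duration of the server-side refresh.
    job_id = trigger_refresh(workbook_name)
    while get_status(job_id) != "Success":
        time.sleep(0.01)
    return job_id

def fire_and_forget_operator(workbook_name):
    # Preferred: trigger and return immediately; a separate sensor
    # checks completion later.
    return trigger_refresh(workbook_name)

def sensor_poke(job_id):
    # Each poke is one cheap check; between pokes the worker slot can
    # be released (mode='reschedule' in Airflow sensor terms).
    return get_status(job_id) == "Success"
```

   With this split, the DAG ties up a worker only for the milliseconds each poke takes, instead of for the hours a large extract refresh can run.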

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
