Jonny Evans created AIRFLOW-7027:
------------------------------------

             Summary: The mirrored data folder for BigQuery_operators can't be accessed on manual runs
                 Key: AIRFLOW-7027
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-7027
             Project: Apache Airflow
          Issue Type: Bug
          Components: contrib, DAG
    Affects Versions: 1.10.9
         Environment: Windows 10 Pro, i7-4790S Processor, 16GB RAM
            Reporter: Jonny Evans


Using Airflow through Google Cloud Composer, I've placed a series of text 
files in the /data folder of the bucket, as the documentation suggests for 
storing external data files, and have written a BigQueryOperator of the 
following format: 
{{
with open('/home/airflow/gcs/data/{0}.txt'.format(models.Variable.get('tmpcreatives')), 'r') as tmp_file:
    tmp_transfer = tmp_file.read()

bq_sql_tmptransfer = bigquery_operator.BigQueryOperator(
    task_id='task1',
    sql="""{0}""".format(tmp_transfer.format(
        tradata=dag.params["ClientDatabase"] + dag.params["bq_param1"],
        rawdata=dag.params["ClientDatabase"] + dag.params["bq_param2"])),
    use_legacy_sql=False,
)
}}
On scheduled runs, the DAG runs fine and completes the task. However, if I try 
to manually trigger the DAG or look at the run logs, it comes up with the 
message 'DAG "DataCreation_DAG_" seems to be missing'. This is only an issue 
when I use the open() function; if I replace that section with a hardcoded 
string, the DAG works fine even on manual runs. I think it's a bug with 
mounting the /data folder from the bucket, but I'm not entirely sure.
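
As a possible workaround sketch, assuming the manual-trigger failure comes from open() raising during DAG parsing when the /data mount isn't available: the file read could be guarded so parsing never fails. The read_sql_template helper and its fallback string below are hypothetical names for illustration, not part of the Airflow API:

```python
import os


def read_sql_template(path, fallback="SELECT 1"):
    """Read a SQL template from the mounted /data folder, returning a
    harmless fallback query when the file (or the mount that should
    provide it) is absent, so the DAG file can always be parsed."""
    if os.path.exists(path):
        with open(path, "r") as f:
            return f.read()
    return fallback


# Hypothetical usage in place of the bare open() call in the DAG file:
# tmp_transfer = read_sql_template(
#     '/home/airflow/gcs/data/{0}.txt'.format(
#         models.Variable.get('tmpcreatives')))
```

This only hides the symptom, though; if the mount really is missing on manual runs, the task would render the fallback SQL instead of the real query.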



--
This message was sent by Atlassian Jira
(v8.3.4#803005)