nathadfield opened a new issue #11282:
URL: https://github.com/apache/airflow/issues/11282


   **Apache Airflow version**: 1.10.*
   **Backport packages version**: 2020.10.5rc1
   
   **What happened**:
   
   Good work has been done to improve the `BigQueryInsertJobOperator` by 
assigning a `job_id` built from a combination of `dag_id`, `task_id`, 
`execution_date` and an additional `uniqueness_suffix`. However, other 
operators that create BigQuery jobs, e.g. `GCSToBigQueryOperator`, do not take 
advantage of this and still create a `job_id` based on a timestamp.
   
   
https://github.com/apache/airflow/blob/master/airflow/providers/google/cloud/hooks/bigquery.py#L1475
   
   If more than one task based on this operator is launched at the same time, 
the tasks generate the same `job_id` and BigQuery rejects the duplicate with 
an error:
   
   `Already Exists: Job my-project:EU.airflow_1601894411`
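
To illustrate the collision, here is a minimal sketch. It assumes the `job_id` is the prefix `airflow_` followed by the Unix timestamp truncated to whole seconds, which is inferred from the error message above; `timestamp_job_id` is a hypothetical stand-in, not the hook's actual function.

```python
def timestamp_job_id(now: float) -> str:
    # Hypothetical stand-in for a timestamp-based scheme: the fractional
    # part of the timestamp is dropped, so only whole seconds remain.
    return f"airflow_{int(now)}"


# Two tasks that start half a second apart, within the same second.
first = timestamp_job_id(1601894411.2)
second = timestamp_job_id(1601894411.7)

# Both tasks end up with the same job_id, so the second job would be rejected.
collision = first == second
```

Under that assumption, any two tasks that reach the job submission within the same second produce identical IDs.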
   
   **What you expected to happen**:
   Any task that starts a BigQuery job should create a `job_id` that is unique.
   
   Perhaps the method of `job_id` creation needs to move into the BigQuery 
hook, if possible?
   
https://github.com/apache/airflow/blob/master/airflow/providers/google/cloud/operators/bigquery.py#L2060
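
As a rough sketch of what a shared helper might look like if it lived in the hook: the function name `build_job_id`, its signature, and the sanitising rule are all assumptions for illustration, not the existing operator code. It only relies on BigQuery job IDs being limited to letters, digits, underscores and dashes.

```python
import re
from datetime import datetime


def build_job_id(dag_id: str, task_id: str, execution_date: datetime,
                 uniqueness_suffix: str) -> str:
    # Hypothetical helper: combine the task's identifying fields so that
    # different tasks (or different runs) never share a job_id.
    raw = (
        f"airflow_{dag_id}_{task_id}_"
        f"{execution_date.isoformat()}_{uniqueness_suffix}"
    )
    # Replace any run of characters outside letters, digits, underscores
    # and dashes with a single underscore.
    return re.sub(r"[^0-9a-zA-Z_\-]+", "_", raw)


execution_date = datetime(2020, 10, 5, 10, 40, 11)

# Two tasks in the same DAG run, started at the same moment.
job_a = build_job_id("my_dag", "load_a", execution_date, "abc123")
job_b = build_job_id("my_dag", "load_b", execution_date, "abc123")
```

Because `task_id` is part of the ID, the two tasks above get different job IDs even when they start at the same moment. The caller would still need to choose how `uniqueness_suffix` is produced, for example to distinguish retries of the same task.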
 
   


