[
https://issues.apache.org/jira/browse/AIRFLOW-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mark Secada closed AIRFLOW-1750.
--------------------------------
Resolution: Fixed
> GoogleCloudStorageToBigQueryOperator 404 HttpError
> --------------------------------------------------
>
> Key: AIRFLOW-1750
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1750
> Project: Apache Airflow
> Issue Type: Bug
> Components: gcp
> Affects Versions: Airflow 1.8
> Environment: Python 2.7.13
> Reporter: Mark Secada
> Fix For: Airflow 1.8
>
>
> I'm trying to write a DAG which uploads JSON files to GoogleCloudStorage and
> then moves them to BigQuery. I was able to upload these files to
> GoogleCloudStorage, but when I run this second task, I get a 404 HttpError.
> The error looks like this:
> {code:bash}
> ERROR - <HttpError 404 when requesting https://www.googleapis.com/bigquery/v2/projects//jobs?alt=json returned "Not Found">
> Traceback (most recent call last):
>   File "/Users/myname/anaconda/lib/python2.7/site-packages/airflow/models.py", line 1374, in run
>     result = task_copy.execute(context=context)
>   File "/Users/myname/anaconda/lib/python2.7/site-packages/airflow/contrib/operators/gcs_to_bq.py", line 153, in execute
>     schema_update_options=self.schema_update_options)
>   File "/Users/myname/anaconda/lib/python2.7/site-packages/airflow/contrib/hooks/bigquery_hook.py", line 476, in run_load
>     return self.run_with_configuration(configuration)
>   File "/Users/myname/anaconda/lib/python2.7/site-packages/airflow/contrib/hooks/bigquery_hook.py", line 498, in run_with_configuration
>     .insert(projectId=self.project_id, body=job_data) \
>   File "/Users/myname/anaconda/lib/python2.7/site-packages/oauth2client/util.py", line 135, in positional_wrapper
>     return wrapped(*args, **kwargs)
>   File "/Users/myname/anaconda/lib/python2.7/site-packages/googleapiclient/http.py", line 838, in execute
>     raise HttpError(resp, content, uri=self.uri)
> {code}
> My code for the task is here:
> {code:python}
> # Some comments here
> from airflow.contrib.operators.gcs_to_bq import GoogleCloudStorageToBigQueryOperator
>
> # `source` and `dag` are defined elsewhere in the DAG file.
> t3 = GoogleCloudStorageToBigQueryOperator(
>     task_id='move_' + source + '_from_gcs_to_bq',
>     bucket='mybucket',
>     source_objects=['news/latest_headline_' + source + '.json'],
>     destination_project_dataset_table='mydataset.latest_news_headlines',
>     schema_object='news/latest_headline_' + source + '.json',
>     source_format='NEWLINE_DELIMITED_JSON',
>     write_disposition='WRITE_APPEND',
>     dag=dag)
> {code}
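
One detail in the error above may be worth checking: the request URL contains `projects//jobs`, so the project ID segment looks empty, and the traceback shows that value coming from `self.project_id` on the hook. The sketch below only illustrates how to confirm that from the logged URL. The helper name `project_id_from_url` is hypothetical and not part of Airflow, and this is a guess at the cause of the 404, not the confirmed resolution of this issue.

```python
def project_id_from_url(url):
    """Return the path segment that follows 'projects' in a BigQuery v2 URL.

    Hypothetical helper for illustration only; an empty string means the
    request was built without a project ID.
    """
    # Drop the query string, then split the rest into path segments.
    segments = url.split("?", 1)[0].split("/")
    # The project ID is the segment right after the literal 'projects'.
    return segments[segments.index("projects") + 1]


# URL copied from the error message in this issue.
logged_url = "https://www.googleapis.com/bigquery/v2/projects//jobs?alt=json"
project_id = project_id_from_url(logged_url)

if not project_id:
    print("Project ID is empty; check the project configured for the hook.")
```

If the segment is empty, as it is here, the project configured on the connection used by the BigQuery hook would be the first thing to inspect.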