[jira] [Updated] (AIRFLOW-3601) update operators to BigQuery to support location

2019-04-11 Thread Yohei Onishi (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yohei Onishi updated AIRFLOW-3601:
--
Description: 
Location support for BigQueryHook was merged in PR 4324:
[https://github.com/apache/incubator-airflow/pull/4324]

The following operators need to be updated (a rough sketch of threading the location parameter through to the hook is shown after the lists below).

* bigquery_get_data.py
** BigQueryGetDataOperator (fix in 
https://issues.apache.org/jira/browse/AIRFLOW-4287)
*  bigquery_operator.py
** BigQueryOperator
** BigQueryCreateEmptyTableOperator
** BigQueryCreateExternalTableOperator
** BigQueryDeleteDatasetOperator
** BigQueryCreateEmptyDatasetOperator
* bigquery_to_bigquery.py
** BigQueryToBigQueryOperator (fix in 
https://issues.apache.org/jira/browse/AIRFLOW-4288)
* bigquery_to_gcs.py
** BigQueryToCloudStorageOperator
* gcs_to_bq.py
** GoogleCloudStorageToBigQueryOperator
* bigquery_sensor.py
** BigQueryTableSensor

The following operators do not require location, since they do not use location internally:
* bigquery_check_operator.py
** BigQueryCheckOperator
* bigquery_operator.py
** BigQueryDeleteDatasetOperator 
https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/delete
* bigquery_table_delete_operator.py
** BigQueryTableDeleteOperator 
https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/delete
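
A rough sketch (illustrative only, not the actual patch) of how a location parameter could be threaded from one of these operators down to the hook; the parameter names below are assumptions, and the exact hook-side signature added by PR 4324 should be checked:
{code:python}
# Sketch only: an operator forwarding a new `location` argument to BigQueryHook.
# The hook-side `location` parameter is assumed from PR 4324; names may differ.
from airflow.contrib.hooks.bigquery_hook import BigQueryHook
from airflow.models import BaseOperator
from airflow.utils.decorators import apply_defaults


class MyLocationAwareBigQueryOperator(BaseOperator):
    """Hypothetical example operator, not part of Airflow."""

    @apply_defaults
    def __init__(self, sql, bigquery_conn_id='bigquery_default',
                 location=None, *args, **kwargs):
        super(MyLocationAwareBigQueryOperator, self).__init__(*args, **kwargs)
        self.sql = sql
        self.bigquery_conn_id = bigquery_conn_id
        # Region of the dataset/job, e.g. 'asia-northeast1'
        self.location = location

    def execute(self, context):
        hook = BigQueryHook(bigquery_conn_id=self.bigquery_conn_id,
                            location=self.location)  # assumed kwarg from PR 4324
        cursor = hook.get_conn().cursor()
        cursor.run_query(sql=self.sql)
{code}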


  was:
Location support for BigQueryHook was merged in PR 4324:
[https://github.com/apache/incubator-airflow/pull/4324]

The following operators need to be updated.

* bigquery_get_data.py
** BigQueryGetDataOperator (fix in 
https://issues.apache.org/jira/browse/AIRFLOW-4287)
*  bigquery_operator.py
** BigQueryOperator
** BigQueryCreateEmptyTableOperator
** BigQueryCreateExternalTableOperator
** BigQueryDeleteDatasetOperator
** BigQueryCreateEmptyDatasetOperator
* bigquery_to_bigquery.py
** BigQueryToBigQueryOperator
* bigquery_to_gcs.py
** BigQueryToCloudStorageOperator
* gcs_to_bq.py
** GoogleCloudStorageToBigQueryOperator
* bigquery_sensor.py
** BigQueryTableSensor

The following operators do not require location, since they do not use location internally:
* bigquery_check_operator.py
** BigQueryCheckOperator
* bigquery_operator.py
** BigQueryDeleteDatasetOperator 
https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/delete
* bigquery_table_delete_operator.py
** BigQueryTableDeleteOperator 
https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/delete



> update operators to BigQuery to support location
> 
>
> Key: AIRFLOW-3601
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3601
> Project: Apache Airflow
>  Issue Type: Task
>Affects Versions: 1.10.1
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> Location support for BigQueryHook was merged in PR 4324:
> [https://github.com/apache/incubator-airflow/pull/4324]
> The following operators need to be updated.
> * bigquery_get_data.py
> ** BigQueryGetDataOperator (fix in 
> https://issues.apache.org/jira/browse/AIRFLOW-4287)
> *  bigquery_operator.py
> ** BigQueryOperator
> ** BigQueryCreateEmptyTableOperator
> ** BigQueryCreateExternalTableOperator
> ** BigQueryDeleteDatasetOperator
> ** BigQueryCreateEmptyDatasetOperator
> * bigquery_to_bigquery.py
> ** BigQueryToBigQueryOperator (fix in 
> https://issues.apache.org/jira/browse/AIRFLOW-4288)
> * bigquery_to_gcs.py
> ** BigQueryToCloudStorageOperator
> * gcs_to_bq.py
> ** GoogleCloudStorageToBigQueryOperator
> * bigquery_sensor.py
> ** BigQueryTableSensor
> The following operators do not require location, since they do not use location internally:
> * bigquery_check_operator.py
> ** BigQueryCheckOperator
> * bigquery_operator.py
> ** BigQueryDeleteDatasetOperator 
> https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/delete
> * bigquery_table_delete_operator.py
> ** BigQueryTableDeleteOperator 
> https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/delete



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work started] (AIRFLOW-4287) location support for BigQueryGetDataOperator

2019-04-11 Thread Yohei Onishi (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on AIRFLOW-4287 started by Yohei Onishi.
-
> location support for BigQueryGetDataOperator
> 
>
> Key: AIRFLOW-4287
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4287
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> * BigQueryGetDataOperator does not support locations other than US/EU.
> * See the full list of operators that need location support: 
> https://issues.apache.org/jira/browse/AIRFLOW-3601



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work started] (AIRFLOW-4288) location support for BigQueryToBigQueryOperator

2019-04-11 Thread Yohei Onishi (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on AIRFLOW-4288 started by Yohei Onishi.
-
> location support for BigQueryToBigQueryOperator
> ---
>
> Key: AIRFLOW-4288
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4288
> Project: Apache Airflow
>  Issue Type: Improvement
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> * BigQueryToBigQueryOperator does not support locations other than US/EU.
> * See the full list of operators that need location support: 
> https://issues.apache.org/jira/browse/AIRFLOW-3601



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-4288) location support for BigQueryToBigQueryOperator

2019-04-11 Thread Yohei Onishi (JIRA)
Yohei Onishi created AIRFLOW-4288:
-

 Summary: location support for BigQueryToBigQueryOperator
 Key: AIRFLOW-4288
 URL: https://issues.apache.org/jira/browse/AIRFLOW-4288
 Project: Apache Airflow
  Issue Type: Improvement
Reporter: Yohei Onishi
Assignee: Yohei Onishi


* BigQueryToBigQueryOperator does not support locations other than US/EU (a hypothetical usage sketch is shown after this list).
* See the full list of operators that need location support: 
https://issues.apache.org/jira/browse/AIRFLOW-3601
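
A hypothetical usage sketch once such a kwarg exists; the location argument below is an assumption, not a current parameter of the operator:
{code:python}
# Hypothetical: BigQueryToBigQueryOperator with a proposed `location` kwarg.
from airflow.contrib.operators.bigquery_to_bigquery import BigQueryToBigQueryOperator

copy_table = BigQueryToBigQueryOperator(
    task_id='copy_table',
    source_project_dataset_tables='my-project.src_dataset.src_table',
    destination_project_dataset_table='my-project.dst_dataset.dst_table',
    write_disposition='WRITE_TRUNCATE',
    location='asia-northeast1',  # proposed new argument, not yet supported
    dag=dag,  # assumes an existing DAG object
)
{code}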




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3601) update operators to BigQuery to support location

2019-04-11 Thread Yohei Onishi (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yohei Onishi updated AIRFLOW-3601:
--
Description: 
Location support for BigQueryHook was merged in PR 4324:
[https://github.com/apache/incubator-airflow/pull/4324]

The following operators need to be updated.

* bigquery_get_data.py
** BigQueryGetDataOperator (fix in 
https://issues.apache.org/jira/browse/AIRFLOW-4287)
*  bigquery_operator.py
** BigQueryOperator
** BigQueryCreateEmptyTableOperator
** BigQueryCreateExternalTableOperator
** BigQueryDeleteDatasetOperator
** BigQueryCreateEmptyDatasetOperator
* bigquery_to_bigquery.py
** BigQueryToBigQueryOperator
* bigquery_to_gcs.py
** BigQueryToCloudStorageOperator
* gcs_to_bq.py
** GoogleCloudStorageToBigQueryOperator
* bigquery_sensor.py
** BigQueryTableSensor

The following operators do not require location, since they do not use location internally:
* bigquery_check_operator.py
** BigQueryCheckOperator
* bigquery_operator.py
** BigQueryDeleteDatasetOperator 
https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/delete
* bigquery_table_delete_operator.py
** BigQueryTableDeleteOperator 
https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/delete


  was:
Location support for BigQueryHook was merged in PR 4324:
[https://github.com/apache/incubator-airflow/pull/4324]

The following operators need to be updated.

* bigquery_get_data.py
** BigQueryGetDataOperator (fix in 
https://issues.apache.org/jira/browse/AIRFLOW-4287)
*  bigquery_operator.py
** BigQueryOperator
** BigQueryCreateEmptyTableOperator
** BigQueryCreateExternalTableOperator
** BigQueryDeleteDatasetOperator
** BigQueryCreateEmptyDatasetOperator
*  bigquery_table_delete_operator.py
** BigQueryTableDeleteOperator
* bigquery_to_bigquery.py
** BigQueryToBigQueryOperator
* bigquery_to_gcs.py
** BigQueryToCloudStorageOperator
* gcs_to_bq.py
** GoogleCloudStorageToBigQueryOperator
* bigquery_sensor.py
** BigQueryTableSensor

The following operators do not require location, since they do not use location internally:
* bigquery_check_operator.py
** BigQueryCheckOperator
* bigquery_operator.py
** BigQueryDeleteDatasetOperator 
https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/delete
* bigquery_table_delete_operator.py
** BigQueryTableDeleteOperator 
https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/delete



> update operators to BigQuery to support location
> 
>
> Key: AIRFLOW-3601
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3601
> Project: Apache Airflow
>  Issue Type: Task
>Affects Versions: 1.10.1
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> Location support for BigQueryHook was merged in PR 4324:
> [https://github.com/apache/incubator-airflow/pull/4324]
> The following operators need to be updated.
> * bigquery_get_data.py
> ** BigQueryGetDataOperator (fix in 
> https://issues.apache.org/jira/browse/AIRFLOW-4287)
> *  bigquery_operator.py
> ** BigQueryOperator
> ** BigQueryCreateEmptyTableOperator
> ** BigQueryCreateExternalTableOperator
> ** BigQueryDeleteDatasetOperator
> ** BigQueryCreateEmptyDatasetOperator
> * bigquery_to_bigquery.py
> ** BigQueryToBigQueryOperator
> * bigquery_to_gcs.py
> ** BigQueryToCloudStorageOperator
> * gcs_to_bq.py
> ** GoogleCloudStorageToBigQueryOperator
> * bigquery_sensor.py
> ** BigQueryTableSensor
> The following operators do not require location, since they do not use location internally:
> * bigquery_check_operator.py
> ** BigQueryCheckOperator
> * bigquery_operator.py
> ** BigQueryDeleteDatasetOperator 
> https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/delete
> * bigquery_table_delete_operator.py
> ** BigQueryTableDeleteOperator 
> https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/delete



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3601) update operators to BigQuery to support location

2019-04-11 Thread Yohei Onishi (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yohei Onishi updated AIRFLOW-3601:
--
Description: 
Location support for BigQueryHook was merged in PR 4324:
[https://github.com/apache/incubator-airflow/pull/4324]

The following operators need to be updated.

* bigquery_get_data.py (BigQueryGetDataOperator)
*  bigquery_operator.py
** BigQueryOperator
** BigQueryCreateEmptyTableOperator
** BigQueryCreateExternalTableOperator
** BigQueryDeleteDatasetOperator
** BigQueryCreateEmptyDatasetOperator
*  bigquery_table_delete_operator.py
** BigQueryTableDeleteOperator
* bigquery_to_bigquery.py
 ** BigQueryToBigQueryOperator
* bigquery_to_gcs.py
 ** BigQueryToCloudStorageOperator
* gcs_to_bq.py
 ** GoogleCloudStorageToBigQueryOperator
* bigquery_sensor.py
** BigQueryTableSensor

The following operators do not require location, since they do not use location internally:
* bigquery_check_operator.py (BigQueryCheckOperator)
* bigquery_operator.py (BigQueryDeleteDatasetOperator): 
https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/delete
* bigquery_table_delete_operator.py (BigQueryTableDeleteOperator): 
https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/delete


  was:
Location support for BigQueryHook was merged in PR 4324:
[https://github.com/apache/incubator-airflow/pull/4324]

The following operators need to be updated.
 * bigquery_check_operator.py
 ** BigQueryCheckOperator
 * bigquery_get_data.py
 ** BigQueryGetDataOperator
 * bigquery_operator.py
 ** BigQueryOperator
 ** BigQueryCreateEmptyTableOperator
 ** BigQueryCreateExternalTableOperator
 ** BigQueryDeleteDatasetOperator
 ** BigQueryCreateEmptyDatasetOperator
 * bigquery_table_delete_operator.py
 ** BigQueryTableDeleteOperator
 * bigquery_to_bigquery.py
 ** BigQueryToBigQueryOperator
 * bigquery_to_gcs.py
 ** BigQueryToCloudStorageOperator
 * gcs_to_bq.py
 ** GoogleCloudStorageToBigQueryOperator
 * bigquery_sensor.py
 ** BigQueryTableSensor

The following operators do not require location, since they do not use location internally:
* bigquery_check_operator.py (BigQueryCheckOperator)
* bigquery_operator.py (BigQueryDeleteDatasetOperator): 
https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/delete
* bigquery_table_delete_operator.py (BigQueryTableDeleteOperator): 
https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/delete



> update operators to BigQuery to support location
> 
>
> Key: AIRFLOW-3601
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3601
> Project: Apache Airflow
>  Issue Type: Task
>Affects Versions: 1.10.1
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> Location support for BigQueryHook was merged in PR 4324:
> [https://github.com/apache/incubator-airflow/pull/4324]
> The following operators need to be updated.
> * bigquery_get_data.py (BigQueryGetDataOperator)
> *  bigquery_operator.py
> ** BigQueryOperator
> ** BigQueryCreateEmptyTableOperator
> ** BigQueryCreateExternalTableOperator
> ** BigQueryDeleteDatasetOperator
> ** BigQueryCreateEmptyDatasetOperator
> *  bigquery_table_delete_operator.py
> ** BigQueryTableDeleteOperator
> * bigquery_to_bigquery.py
>  ** BigQueryToBigQueryOperator
> * bigquery_to_gcs.py
>  ** BigQueryToCloudStorageOperator
> * gcs_to_bq.py
>  ** GoogleCloudStorageToBigQueryOperator
> * bigquery_sensor.py
> ** BigQueryTableSensor
> The following operators do not require location, since they do not use location internally:
> * bigquery_check_operator.py (BigQueryCheckOperator)
> * bigquery_operator.py (BigQueryDeleteDatasetOperator): 
> https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/delete
> * bigquery_table_delete_operator.py (BigQueryTableDeleteOperator): 
> https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/delete



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3601) update operators to BigQuery to support location

2019-04-11 Thread Yohei Onishi (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yohei Onishi updated AIRFLOW-3601:
--
Description: 
Location support for BigQueryHook was merged in PR 4324:
[https://github.com/apache/incubator-airflow/pull/4324]

The following operators need to be updated.

* bigquery_get_data.py
** BigQueryGetDataOperator (fix in 
https://issues.apache.org/jira/browse/AIRFLOW-4287)
*  bigquery_operator.py
** BigQueryOperator
** BigQueryCreateEmptyTableOperator
** BigQueryCreateExternalTableOperator
** BigQueryDeleteDatasetOperator
** BigQueryCreateEmptyDatasetOperator
*  bigquery_table_delete_operator.py
** BigQueryTableDeleteOperator
* bigquery_to_bigquery.py
** BigQueryToBigQueryOperator
* bigquery_to_gcs.py
** BigQueryToCloudStorageOperator
* gcs_to_bq.py
** GoogleCloudStorageToBigQueryOperator
* bigquery_sensor.py
** BigQueryTableSensor

The following operators do not require location, since they do not use location internally:
* bigquery_check_operator.py
** BigQueryCheckOperator
* bigquery_operator.py
** BigQueryDeleteDatasetOperator 
https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/delete
* bigquery_table_delete_operator.py
** BigQueryTableDeleteOperator 
https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/delete


  was:
Location support for BigQueryHook was merged in PR 4324:
[https://github.com/apache/incubator-airflow/pull/4324]

The following operators need to be updated.

* bigquery_get_data.py (BigQueryGetDataOperator)
*  bigquery_operator.py
** BigQueryOperator
** BigQueryCreateEmptyTableOperator
** BigQueryCreateExternalTableOperator
** BigQueryDeleteDatasetOperator
** BigQueryCreateEmptyDatasetOperator
*  bigquery_table_delete_operator.py
** BigQueryTableDeleteOperator
* bigquery_to_bigquery.py
** BigQueryToBigQueryOperator
* bigquery_to_gcs.py
** BigQueryToCloudStorageOperator
* gcs_to_bq.py
** GoogleCloudStorageToBigQueryOperator
* bigquery_sensor.py
** BigQueryTableSensor

The following operators do not require location, since they do not use location internally:
* bigquery_check_operator.py
** BigQueryCheckOperator
* bigquery_operator.py
** BigQueryDeleteDatasetOperator 
https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/delete
* bigquery_table_delete_operator.py
** BigQueryTableDeleteOperator 
https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/delete



> update operators to BigQuery to support location
> 
>
> Key: AIRFLOW-3601
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3601
> Project: Apache Airflow
>  Issue Type: Task
>Affects Versions: 1.10.1
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> Location support for BigQueryHook was merged in PR 4324:
> [https://github.com/apache/incubator-airflow/pull/4324]
> The following operators need to be updated.
> * bigquery_get_data.py
> ** BigQueryGetDataOperator (fix in 
> https://issues.apache.org/jira/browse/AIRFLOW-4287)
> *  bigquery_operator.py
> ** BigQueryOperator
> ** BigQueryCreateEmptyTableOperator
> ** BigQueryCreateExternalTableOperator
> ** BigQueryDeleteDatasetOperator
> ** BigQueryCreateEmptyDatasetOperator
> *  bigquery_table_delete_operator.py
> ** BigQueryTableDeleteOperator
> * bigquery_to_bigquery.py
> ** BigQueryToBigQueryOperator
> * bigquery_to_gcs.py
> ** BigQueryToCloudStorageOperator
> * gcs_to_bq.py
> ** GoogleCloudStorageToBigQueryOperator
> * bigquery_sensor.py
> ** BigQueryTableSensor
> The following operators do not require location, since they do not use location internally:
> * bigquery_check_operator.py
> ** BigQueryCheckOperator
> * bigquery_operator.py
> ** BigQueryDeleteDatasetOperator 
> https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/delete
> * bigquery_table_delete_operator.py
> ** BigQueryTableDeleteOperator 
> https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/delete



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-4287) location support for BigQueryGetDataOperator

2019-04-11 Thread Yohei Onishi (JIRA)
Yohei Onishi created AIRFLOW-4287:
-

 Summary: location support for BigQueryGetDataOperator
 Key: AIRFLOW-4287
 URL: https://issues.apache.org/jira/browse/AIRFLOW-4287
 Project: Apache Airflow
  Issue Type: Improvement
Reporter: Yohei Onishi


* BigQueryGetDataOperator does not support locations other than US/EU (a hypothetical usage sketch is shown after this list).
* See the full list of operators that need location support: 
https://issues.apache.org/jira/browse/AIRFLOW-3601
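
A hypothetical usage sketch once such a kwarg exists; the location argument below is an assumption, not a current parameter of the operator:
{code:python}
# Hypothetical: BigQueryGetDataOperator with a proposed `location` kwarg.
from airflow.contrib.operators.bigquery_get_data import BigQueryGetDataOperator

get_rows = BigQueryGetDataOperator(
    task_id='get_rows',
    dataset_id='my_dataset',
    table_id='my_table',
    max_results='100',
    location='asia-northeast1',  # proposed new argument, not yet supported
    dag=dag,  # assumes an existing DAG object
)
{code}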



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3601) update operators to BigQuery to support location

2019-04-11 Thread Yohei Onishi (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yohei Onishi updated AIRFLOW-3601:
--
Description: 
Location support for BigQueryHook was merged in PR 4324:
[https://github.com/apache/incubator-airflow/pull/4324]

The following operators need to be updated.

* bigquery_get_data.py (BigQueryGetDataOperator)
*  bigquery_operator.py
** BigQueryOperator
** BigQueryCreateEmptyTableOperator
** BigQueryCreateExternalTableOperator
** BigQueryDeleteDatasetOperator
** BigQueryCreateEmptyDatasetOperator
*  bigquery_table_delete_operator.py
** BigQueryTableDeleteOperator
* bigquery_to_bigquery.py
** BigQueryToBigQueryOperator
* bigquery_to_gcs.py
** BigQueryToCloudStorageOperator
* gcs_to_bq.py
** GoogleCloudStorageToBigQueryOperator
* bigquery_sensor.py
** BigQueryTableSensor

The following operators do not require location, since they do not use location internally:
* bigquery_check_operator.py
** BigQueryCheckOperator
* bigquery_operator.py
** BigQueryDeleteDatasetOperator 
https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/delete
* bigquery_table_delete_operator.py
** BigQueryTableDeleteOperator 
https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/delete


  was:
Location support for BigQueryHook was merged in PR 4324:
[https://github.com/apache/incubator-airflow/pull/4324]

The following operators need to be updated.

* bigquery_get_data.py (BigQueryGetDataOperator)
*  bigquery_operator.py
** BigQueryOperator
** BigQueryCreateEmptyTableOperator
** BigQueryCreateExternalTableOperator
** BigQueryDeleteDatasetOperator
** BigQueryCreateEmptyDatasetOperator
*  bigquery_table_delete_operator.py
** BigQueryTableDeleteOperator
* bigquery_to_bigquery.py
 ** BigQueryToBigQueryOperator
* bigquery_to_gcs.py
 ** BigQueryToCloudStorageOperator
* gcs_to_bq.py
 ** GoogleCloudStorageToBigQueryOperator
* bigquery_sensor.py
  ** BigQueryTableSensor

The following operators do not require location, since they do not use location internally:
* bigquery_check_operator.py
** BigQueryCheckOperator
* bigquery_operator.py
** BigQueryDeleteDatasetOperator 
https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/delete
* bigquery_table_delete_operator.py
** BigQueryTableDeleteOperator 
https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/delete



> update operators to BigQuery to support location
> 
>
> Key: AIRFLOW-3601
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3601
> Project: Apache Airflow
>  Issue Type: Task
>Affects Versions: 1.10.1
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> Location support for BigQueryHook was merged in PR 4324:
> [https://github.com/apache/incubator-airflow/pull/4324]
> The following operators need to be updated.
> * bigquery_get_data.py (BigQueryGetDataOperator)
> *  bigquery_operator.py
> ** BigQueryOperator
> ** BigQueryCreateEmptyTableOperator
> ** BigQueryCreateExternalTableOperator
> ** BigQueryDeleteDatasetOperator
> ** BigQueryCreateEmptyDatasetOperator
> *  bigquery_table_delete_operator.py
> ** BigQueryTableDeleteOperator
> * bigquery_to_bigquery.py
> ** BigQueryToBigQueryOperator
> * bigquery_to_gcs.py
> ** BigQueryToCloudStorageOperator
> * gcs_to_bq.py
> ** GoogleCloudStorageToBigQueryOperator
> * bigquery_sensor.py
> ** BigQueryTableSensor
> The following operators do not require location, since they do not use location internally:
> * bigquery_check_operator.py
> ** BigQueryCheckOperator
> * bigquery_operator.py
> ** BigQueryDeleteDatasetOperator 
> https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/delete
> * bigquery_table_delete_operator.py
> ** BigQueryTableDeleteOperator 
> https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/delete



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3601) update operators to BigQuery to support location

2019-04-11 Thread Yohei Onishi (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yohei Onishi updated AIRFLOW-3601:
--
Description: 
Location support for BigQueryHook was merged in PR 4324:
[https://github.com/apache/incubator-airflow/pull/4324]

The following operators need to be updated.

* bigquery_get_data.py (BigQueryGetDataOperator)
*  bigquery_operator.py
** BigQueryOperator
** BigQueryCreateEmptyTableOperator
** BigQueryCreateExternalTableOperator
** BigQueryDeleteDatasetOperator
** BigQueryCreateEmptyDatasetOperator
*  bigquery_table_delete_operator.py
** BigQueryTableDeleteOperator
* bigquery_to_bigquery.py
 ** BigQueryToBigQueryOperator
* bigquery_to_gcs.py
 ** BigQueryToCloudStorageOperator
* gcs_to_bq.py
 ** GoogleCloudStorageToBigQueryOperator
* bigquery_sensor.py
  ** BigQueryTableSensor

The following operators do not require location, since they do not use location internally:
* bigquery_check_operator.py
** BigQueryCheckOperator
* bigquery_operator.py
** BigQueryDeleteDatasetOperator 
https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/delete
* bigquery_table_delete_operator.py
** BigQueryTableDeleteOperator 
https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/delete


  was:
Location support for BigQueryHook was merged in PR 4324:
[https://github.com/apache/incubator-airflow/pull/4324]

The following operators need to be updated.

* bigquery_get_data.py (BigQueryGetDataOperator)
*  bigquery_operator.py
** BigQueryOperator
** BigQueryCreateEmptyTableOperator
** BigQueryCreateExternalTableOperator
** BigQueryDeleteDatasetOperator
** BigQueryCreateEmptyDatasetOperator
*  bigquery_table_delete_operator.py
** BigQueryTableDeleteOperator
* bigquery_to_bigquery.py
 ** BigQueryToBigQueryOperator
* bigquery_to_gcs.py
 ** BigQueryToCloudStorageOperator
* gcs_to_bq.py
 ** GoogleCloudStorageToBigQueryOperator
* bigquery_sensor.py
** BigQueryTableSensor

The following operators do not require location, since they do not use location internally:
* bigquery_check_operator.py (BigQueryCheckOperator)
* bigquery_operator.py (BigQueryDeleteDatasetOperator): 
https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/delete
* bigquery_table_delete_operator.py (BigQueryTableDeleteOperator): 
https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/delete



> update operators to BigQuery to support location
> 
>
> Key: AIRFLOW-3601
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3601
> Project: Apache Airflow
>  Issue Type: Task
>Affects Versions: 1.10.1
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> Location support for BigQueryHook was merged in PR 4324:
> [https://github.com/apache/incubator-airflow/pull/4324]
> The following operators need to be updated.
> * bigquery_get_data.py (BigQueryGetDataOperator)
> *  bigquery_operator.py
> ** BigQueryOperator
> ** BigQueryCreateEmptyTableOperator
> ** BigQueryCreateExternalTableOperator
> ** BigQueryDeleteDatasetOperator
> ** BigQueryCreateEmptyDatasetOperator
> *  bigquery_table_delete_operator.py
> ** BigQueryTableDeleteOperator
> * bigquery_to_bigquery.py
>  ** BigQueryToBigQueryOperator
> * bigquery_to_gcs.py
>  ** BigQueryToCloudStorageOperator
> * gcs_to_bq.py
>  ** GoogleCloudStorageToBigQueryOperator
> * bigquery_sensor.py
>   ** BigQueryTableSensor
> The following operators do not require location, since they do not use location internally:
> * bigquery_check_operator.py
> ** BigQueryCheckOperator
> * bigquery_operator.py
> ** BigQueryDeleteDatasetOperator 
> https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/delete
> * bigquery_table_delete_operator.py
> ** BigQueryTableDeleteOperator 
> https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/delete



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3601) update operators to BigQuery to support location

2019-04-11 Thread Yohei Onishi (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yohei Onishi updated AIRFLOW-3601:
--
Description: 
Location support for BigQueryHook was merged in PR 4324:
[https://github.com/apache/incubator-airflow/pull/4324]

The following operators need to be updated.
 * bigquery_check_operator.py
 ** BigQueryCheckOperator
 * bigquery_get_data.py
 ** BigQueryGetDataOperator
 * bigquery_operator.py
 ** BigQueryOperator
 ** BigQueryCreateEmptyTableOperator
 ** BigQueryCreateExternalTableOperator
 ** BigQueryDeleteDatasetOperator
 ** BigQueryCreateEmptyDatasetOperator
 * bigquery_table_delete_operator.py
 ** BigQueryTableDeleteOperator
 * bigquery_to_bigquery.py
 ** BigQueryToBigQueryOperator
 * bigquery_to_gcs.py
 ** BigQueryToCloudStorageOperator
 * gcs_to_bq.py
 ** GoogleCloudStorageToBigQueryOperator
 * bigquery_sensor.py
 ** BigQueryTableSensor

The following operators do not require location, since they do not use location internally:
* bigquery_check_operator.py (BigQueryCheckOperator)
* bigquery_operator.py (BigQueryDeleteDatasetOperator): 
https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/delete
* bigquery_table_delete_operator.py (BigQueryTableDeleteOperator): 
https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/delete


  was:
Location support for BigQueryHook was merged in PR 4324:
[https://github.com/apache/incubator-airflow/pull/4324]

The following operators need to be updated.
 * bigquery_check_operator.py
 ** BigQueryCheckOperator
 * bigquery_get_data.py
 ** BigQueryGetDataOperator
 * bigquery_operator.py
 ** BigQueryOperator
 ** BigQueryCreateEmptyTableOperator
 ** BigQueryCreateExternalTableOperator
 ** BigQueryDeleteDatasetOperator
 ** BigQueryCreateEmptyDatasetOperator
 * bigquery_table_delete_operator.py
 ** BigQueryTableDeleteOperator
 * bigquery_to_bigquery.py
 ** BigQueryToBigQueryOperator
 * bigquery_to_gcs.py
 ** BigQueryToCloudStorageOperator
 * gcs_to_bq.py
 ** GoogleCloudStorageToBigQueryOperator
 * bigquery_sensor.py
 ** BigQueryTableSensor


> update operators to BigQuery to support location
> 
>
> Key: AIRFLOW-3601
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3601
> Project: Apache Airflow
>  Issue Type: Task
>Affects Versions: 1.10.1
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> Location support for BigQueryHook was merged in PR 4324:
> [https://github.com/apache/incubator-airflow/pull/4324]
> The following operators need to be updated.
>  * bigquery_check_operator.py
>  ** BigQueryCheckOperator
>  * bigquery_get_data.py
>  ** BigQueryGetDataOperator
>  * bigquery_operator.py
>  ** BigQueryOperator
>  ** BigQueryCreateEmptyTableOperator
>  ** BigQueryCreateExternalTableOperator
>  ** BigQueryDeleteDatasetOperator
>  ** BigQueryCreateEmptyDatasetOperator
>  * bigquery_table_delete_operator.py
>  ** BigQueryTableDeleteOperator
>  * bigquery_to_bigquery.py
>  ** BigQueryToBigQueryOperator
>  * bigquery_to_gcs.py
>  ** BigQueryToCloudStorageOperator
>  * gcs_to_bq.py
>  ** GoogleCloudStorageToBigQueryOperator
>  * bigquery_sensor.py
>  ** BigQueryTableSensor
> The following operators do not require location, since they do not use location internally:
> * bigquery_check_operator.py (BigQueryCheckOperator)
> * bigquery_operator.py (BigQueryDeleteDatasetOperator): 
> https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/delete
> * bigquery_table_delete_operator.py (BigQueryTableDeleteOperator): 
> https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/delete



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work started] (AIRFLOW-3601) update operators to BigQuery to support location

2019-01-07 Thread Yohei Onishi (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on AIRFLOW-3601 started by Yohei Onishi.
-
> update operators to BigQuery to support location
> 
>
> Key: AIRFLOW-3601
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3601
> Project: Apache Airflow
>  Issue Type: Task
>Affects Versions: 1.10.1
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> Location support for BigQueryHook was merged in PR 4324:
> [https://github.com/apache/incubator-airflow/pull/4324]
> The following operators need to be updated.
>  * bigquery_check_operator.py
>  ** BigQueryCheckOperator
>  * bigquery_get_data.py
>  ** BigQueryGetDataOperator
>  * bigquery_operator.py
>  ** BigQueryOperator
>  ** BigQueryCreateEmptyTableOperator
>  ** BigQueryCreateExternalTableOperator
>  ** BigQueryDeleteDatasetOperator
>  ** BigQueryCreateEmptyDatasetOperator
>  * bigquery_table_delete_operator.py
>  ** BigQueryTableDeleteOperator
>  * bigquery_to_bigquery.py
>  ** BigQueryToBigQueryOperator
>  * bigquery_to_gcs.py
>  ** BigQueryToCloudStorageOperator
>  * gcs_to_bq.py
>  ** GoogleCloudStorageToBigQueryOperator
>  * bigquery_sensor.py
>  ** BigQueryTableSensor



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3571) GoogleCloudStorageToBigQueryOperator succeeds in uploading CSV file from GCS to BigQuery but the task fails

2018-12-30 Thread Yohei Onishi (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730967#comment-16730967
 ] 

Yohei Onishi commented on AIRFLOW-3571:
---

[https://github.com/apache/incubator-airflow/pull/4324]

PR 4324 is merged, so I can now work on my issue.
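
For reference, a rough sketch of what the hook-level change enables; the location argument below is an assumption about the merged API and should be checked against the PR:
{code:python}
# Illustrative only: with location support in BigQueryHook, jobs in regions
# such as asia-northeast1 can be polled under the correct location.
from airflow.contrib.hooks.bigquery_hook import BigQueryHook

hook = BigQueryHook(bigquery_conn_id='bigquery_default',
                    location='asia-northeast1')  # assumed kwarg from PR 4324
cursor = hook.get_conn().cursor()
cursor.run_query(sql='SELECT 1')
{code}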

> GoogleCloudStorageToBigQueryOperator succeeds in uploading CSV file from GCS 
> to BigQuery but the task fails
> -
>
> Key: AIRFLOW-3571
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3571
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.10.0
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> I am using the following services in the asia-northeast1-c zone:
> * GCS: asia-northeast1-c
> * BigQuery dataset and table: asia-northeast1-c
> * Composer: asia-northeast1-c
> My task created by GoogleCloudStorageToBigQueryOperator succeeded in uploading a CSV file from a GCS bucket to a BigQuery table, but the task failed with the following error.
> {code:java}
> [2018-12-26 21:35:47,464] {base_task_runner.py:107} INFO - Job 146: Subtask bq_load_data_into_dest_table_from_gcs [2018-12-26 21:35:47,464] {discovery.py:871} INFO - URL being requested: GET https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
> [2018-12-26 21:35:47,931] {models.py:1736} ERROR - ('BigQuery job status check failed. Final error was: %s', 404)
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 981, in run_with_configuration
>     jobId=self.running_job_id).execute()
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
>     return wrapped(*args, **kwargs)
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/http.py", line 851, in execute
>     raise HttpError(resp, content, uri=self.uri)
> googleapiclient.errors.HttpError: <HttpError 404 when requesting https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json returned "Not found: Job my-project:job_abc123">
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/models.py", line 1633, in _run_raw_task
>     result = task_copy.execute(context=context)
>   File "/usr/local/lib/airflow/airflow/contrib/operators/gcs_to_bq.py", line 237, in execute
>     time_partitioning=self.time_partitioning)
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 951, in run_load
>     return self.run_with_configuration(configuration)
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 1003, in run_with_configuration
>     err.resp.status)
> Exception: ('BigQuery job status check failed. Final error was: %s', 404)
> {code}
> The task failed to find the job {color:#ff0000}my-project:job_abc123{color}, but the correct job id is {color:#ff0000}my-project:asia-northeast1:job_abc123{color}. (Note: these are just examples, not actual ids.)
> I suppose the operator does not handle the location (zone) properly.
> {code:java}
> $ bq show -j my-project:asia-northeast1:job_abc123
> Job my-project:asia-northeast1:job_abc123
>
>  Job Type   State    Start Time        Duration  User Email                   Bytes Processed  Bytes Billed  Billing Tier  Labels
>  ---------- -------- ----------------- --------- --------------------------- ---------------- ------------- ------------- --------
>  load       SUCCESS  27 Dec 05:35:47   0:00:01   my-service-account-id-email
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3571) GoogleCloudStorageToBigQueryOperator succeeds in uploading CSV file from GCS to BigQuery but the task fails

2018-12-29 Thread Yohei Onishi (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730906#comment-16730906
 ] 

Yohei Onishi commented on AIRFLOW-3571:
---

OK will do

> GoogleCloudStorageToBigQueryOperator succeeds in uploading CSV file from GCS 
> to BigQuery but the task fails
> -
>
> Key: AIRFLOW-3571
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3571
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.10.0
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> I am using the following services in the asia-northeast1-c zone:
> * GCS: asia-northeast1-c
> * BigQuery dataset and table: asia-northeast1-c
> * Composer: asia-northeast1-c
> My task created by GoogleCloudStorageToBigQueryOperator succeeded in uploading a CSV file from a GCS bucket to a BigQuery table, but the task failed with the following error.
> {code:java}
> [2018-12-26 21:35:47,464] {base_task_runner.py:107} INFO - Job 146: Subtask bq_load_data_into_dest_table_from_gcs [2018-12-26 21:35:47,464] {discovery.py:871} INFO - URL being requested: GET https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
> [2018-12-26 21:35:47,931] {models.py:1736} ERROR - ('BigQuery job status check failed. Final error was: %s', 404)
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 981, in run_with_configuration
>     jobId=self.running_job_id).execute()
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
>     return wrapped(*args, **kwargs)
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/http.py", line 851, in execute
>     raise HttpError(resp, content, uri=self.uri)
> googleapiclient.errors.HttpError: <HttpError 404 when requesting https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json returned "Not found: Job my-project:job_abc123">
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/models.py", line 1633, in _run_raw_task
>     result = task_copy.execute(context=context)
>   File "/usr/local/lib/airflow/airflow/contrib/operators/gcs_to_bq.py", line 237, in execute
>     time_partitioning=self.time_partitioning)
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 951, in run_load
>     return self.run_with_configuration(configuration)
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 1003, in run_with_configuration
>     err.resp.status)
> Exception: ('BigQuery job status check failed. Final error was: %s', 404)
> {code}
> The task failed to find the job {color:#ff0000}my-project:job_abc123{color}, but the correct job id is {color:#ff0000}my-project:asia-northeast1:job_abc123{color}. (Note: these are just examples, not actual ids.)
> I suppose the operator does not handle the location (zone) properly.
> {code:java}
> $ bq show -j my-project:asia-northeast1:job_abc123
> Job my-project:asia-northeast1:job_abc123
>
>  Job Type   State    Start Time        Duration  User Email                   Bytes Processed  Bytes Billed  Billing Tier  Labels
>  ---------- -------- ----------------- --------- --------------------------- ---------------- ------------- ------------- --------
>  load       SUCCESS  27 Dec 05:35:47   0:00:01   my-service-account-id-email
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (AIRFLOW-2790) snakebite syntax error: baseTime = min(time * (1L << retries), cap);

2018-12-29 Thread Yohei Onishi (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yohei Onishi reassigned AIRFLOW-2790:
-

Assignee: Yohei Onishi

> snakebite syntax error: baseTime = min(time * (1L << retries), cap);
> 
>
> Key: AIRFLOW-2790
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2790
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: hooks
>Affects Versions: 1.9.0
> Environment: Amazon Linux
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> Does anybody know how I can fix this issue?
>  * Got the following error when importing airflow.operators.sensors.ExternalTaskSensor.
>  * apache-airflow 1.9.0 depends on snakebite 2.11.0, which does not work with Python 3: https://github.com/spotify/snakebite/issues/250
> [2018-07-23 06:42:51,828] \{models.py:288} ERROR - Failed to import: /home/airflow/airflow/dags/example_task_sensor2.py
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 285, in process_file
>     m = imp.load_source(mod_name, filepath)
>   File "/usr/lib64/python3.6/imp.py", line 172, in load_source
>     module = _load(spec)
>   File "<frozen importlib._bootstrap>", line 675, in _load
>   File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
>   File "<frozen importlib._bootstrap_external>", line 678, in exec_module
>   File "<frozen importlib._bootstrap>", line 205, in _call_with_frames_removed
>   File "/home/airflow/airflow/dags/example_task_sensor2.py", line 10, in <module>
>     from airflow.operators.sensors import ExternalTaskSensor
>   File "/usr/local/lib/python3.6/site-packages/airflow/operators/sensors.py", line 34, in <module>
>     from airflow.hooks.hdfs_hook import HDFSHook
>   File "/usr/local/lib/python3.6/site-packages/airflow/hooks/hdfs_hook.py", line 20, in <module>
>     from snakebite.client import Client, HAClient, Namenode, AutoConfigClient
>   File "/usr/local/lib/python3.6/site-packages/snakebite/client.py", line 1473
>     baseTime = min(time * (1L << retries), cap);
> ^



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3571) GoogleCloudStorageToBigQueryOperator succeeds in uploading CSV file from GCS to BigQuery but the task fails

2018-12-27 Thread Yohei Onishi (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730064#comment-16730064
 ] 

Yohei Onishi commented on AIRFLOW-3571:
---

Sorry, I misunderstood the PR. It only fixes 
[bigquery_hook.py|https://github.com/apache/incubator-airflow/pull/4324/files#diff-ee06f8fcbc476ea65446a30160c2a2b2]
 and bigquery_operator.py, not GoogleCloudStorageToBigQueryOperator.

 [https://github.com/apache/incubator-airflow/pull/4324]

 

> GoogleCloudStorageToBigQueryOperator succeeds in uploading CSV file from GCS 
> to BigQuery but the task fails
> -
>
> Key: AIRFLOW-3571
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3571
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.10.0
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> I am using the following services in the asia-northeast1-c zone:
> * GCS: asia-northeast1-c
> * BigQuery dataset and table: asia-northeast1-c
> * Composer: asia-northeast1-c
> My task created by GoogleCloudStorageToBigQueryOperator succeeded in uploading a CSV file from a GCS bucket to a BigQuery table, but the task failed with the following error.
> {code:java}
> [2018-12-26 21:35:47,464] {base_task_runner.py:107} INFO - Job 146: Subtask bq_load_data_into_dest_table_from_gcs [2018-12-26 21:35:47,464] {discovery.py:871} INFO - URL being requested: GET https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
> [2018-12-26 21:35:47,931] {models.py:1736} ERROR - ('BigQuery job status check failed. Final error was: %s', 404)
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 981, in run_with_configuration
>     jobId=self.running_job_id).execute()
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
>     return wrapped(*args, **kwargs)
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/http.py", line 851, in execute
>     raise HttpError(resp, content, uri=self.uri)
> googleapiclient.errors.HttpError: <HttpError 404 when requesting https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json returned "Not found: Job my-project:job_abc123">
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/models.py", line 1633, in _run_raw_task
>     result = task_copy.execute(context=context)
>   File "/usr/local/lib/airflow/airflow/contrib/operators/gcs_to_bq.py", line 237, in execute
>     time_partitioning=self.time_partitioning)
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 951, in run_load
>     return self.run_with_configuration(configuration)
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 1003, in run_with_configuration
>     err.resp.status)
> Exception: ('BigQuery job status check failed. Final error was: %s', 404)
> {code}
> The task failed to find the job {color:#ff0000}my-project:job_abc123{color}, but the correct job id is {color:#ff0000}my-project:asia-northeast1:job_abc123{color}. (Note: these are just examples, not actual ids.)
> I suppose the operator does not handle the location (zone) properly.
> {code:java}
> $ bq show -j my-project:asia-northeast1:job_abc123
> Job my-project:asia-northeast1:job_abc123
>
>  Job Type   State    Start Time        Duration  User Email                   Bytes Processed  Bytes Billed  Billing Tier  Labels
>  ---------- -------- ----------------- --------- --------------------------- ---------------- ------------- ------------- --------
>  load       SUCCESS  27 Dec 05:35:47   0:00:01   my-service-account-id-email
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (AIRFLOW-3571) GoogleCloudStorageToBigQueryOperator succeeds in uploading CSV file from GCS to BigQuery but the task fails

2018-12-27 Thread Yohei Onishi (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yohei Onishi reopened AIRFLOW-3571:
---

> GoogleCloudStorageToBigQueryOperator succeeds in uploading CSV file from GCS 
> to BigQuery but the task fails
> -
>
> Key: AIRFLOW-3571
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3571
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.10.0
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> I am using the following services in the asia-northeast1-c zone:
> * GCS: asia-northeast1-c
> * BigQuery dataset and table: asia-northeast1-c
> * Composer: asia-northeast1-c
> My task created by GoogleCloudStorageToBigQueryOperator succeeded in uploading a CSV file from a GCS bucket to a BigQuery table, but the task failed with the following error.
> {code:java}
> [2018-12-26 21:35:47,464] {base_task_runner.py:107} INFO - Job 146: Subtask bq_load_data_into_dest_table_from_gcs [2018-12-26 21:35:47,464] {discovery.py:871} INFO - URL being requested: GET https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
> [2018-12-26 21:35:47,931] {models.py:1736} ERROR - ('BigQuery job status check failed. Final error was: %s', 404)
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 981, in run_with_configuration
>     jobId=self.running_job_id).execute()
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
>     return wrapped(*args, **kwargs)
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/http.py", line 851, in execute
>     raise HttpError(resp, content, uri=self.uri)
> googleapiclient.errors.HttpError: <HttpError 404 when requesting https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json returned "Not found: Job my-project:job_abc123">
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/models.py", line 1633, in _run_raw_task
>     result = task_copy.execute(context=context)
>   File "/usr/local/lib/airflow/airflow/contrib/operators/gcs_to_bq.py", line 237, in execute
>     time_partitioning=self.time_partitioning)
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 951, in run_load
>     return self.run_with_configuration(configuration)
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 1003, in run_with_configuration
>     err.resp.status)
> Exception: ('BigQuery job status check failed. Final error was: %s', 404)
> {code}
> The task failed to find the job {color:#ff0000}my-project:job_abc123{color}, but the correct job id is {color:#ff0000}my-project:asia-northeast1:job_abc123{color}. (Note: these are just examples, not actual ids.)
> I suppose the operator does not handle the location (zone) properly.
> {code:java}
> $ bq show -j my-project:asia-northeast1:job_abc123
> Job my-project:asia-northeast1:job_abc123
>
>  Job Type   State    Start Time        Duration  User Email                   Bytes Processed  Bytes Billed  Billing Tier  Labels
>  ---------- -------- ----------------- --------- --------------------------- ---------------- ------------- ------------- --------
>  load       SUCCESS  27 Dec 05:35:47   0:00:01   my-service-account-id-email
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (AIRFLOW-3571) GoogleCloudStorageToBigQueryOperator succeeds in uploading CSV file from GCS to BigQuery but the task fails

2018-12-27 Thread Yohei Onishi (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yohei Onishi closed AIRFLOW-3571.
-
Resolution: Duplicate

Closed as a duplicate.

> GoogleCloudStorageToBigQueryOperator succeeds in uploading CSV file from GCS 
> to BigQuery but the task fails
> -
>
> Key: AIRFLOW-3571
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3571
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.10.0
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> I am using the following services in the asia-northeast1-c zone:
> * GCS: asia-northeast1-c
> * BigQuery dataset and table: asia-northeast1-c
> * Composer: asia-northeast1-c
> My task created by GoogleCloudStorageToBigQueryOperator succeeded in uploading a CSV file from a GCS bucket to a BigQuery table, but the task failed with the following error.
> {code:java}
> [2018-12-26 21:35:47,464] {base_task_runner.py:107} INFO - Job 146: Subtask bq_load_data_into_dest_table_from_gcs [2018-12-26 21:35:47,464] {discovery.py:871} INFO - URL being requested: GET https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
> [2018-12-26 21:35:47,931] {models.py:1736} ERROR - ('BigQuery job status check failed. Final error was: %s', 404)
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 981, in run_with_configuration
>     jobId=self.running_job_id).execute()
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
>     return wrapped(*args, **kwargs)
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/http.py", line 851, in execute
>     raise HttpError(resp, content, uri=self.uri)
> googleapiclient.errors.HttpError: <HttpError 404 when requesting https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json returned "Not found: Job my-project:job_abc123">
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/models.py", line 1633, in _run_raw_task
>     result = task_copy.execute(context=context)
>   File "/usr/local/lib/airflow/airflow/contrib/operators/gcs_to_bq.py", line 237, in execute
>     time_partitioning=self.time_partitioning)
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 951, in run_load
>     return self.run_with_configuration(configuration)
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 1003, in run_with_configuration
>     err.resp.status)
> Exception: ('BigQuery job status check failed. Final error was: %s', 404)
> {code}
> The task failed to find the job {color:#ff0000}my-project:job_abc123{color}, but the correct job id is {color:#ff0000}my-project:asia-northeast1:job_abc123{color}. (Note: these are just examples, not actual ids.)
> I suppose the operator does not handle the location (zone) properly.
> {code:java}
> $ bq show -j my-project:asia-northeast1:job_abc123
> Job my-project:asia-northeast1:job_abc123
>
>  Job Type   State    Start Time        Duration  User Email                   Bytes Processed  Bytes Billed  Billing Tier  Labels
>  ---------- -------- ----------------- --------- --------------------------- ---------------- ------------- ------------- --------
>  load       SUCCESS  27 Dec 05:35:47   0:00:01   my-service-account-id-email
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (AIRFLOW-3571) GoogleCloudStorageToBigQueryOperator succeeds in uploading CSV file from GCS to BigQuery but the task fails

2018-12-27 Thread Yohei Onishi (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729468#comment-16729468
 ] 

Yohei Onishi edited comment on AIRFLOW-3571 at 12/27/18 9:25 AM:
-

This PR also fixes the operator, so I will close this ticket once the PR is merged.


was (Author: yohei):
This PR also fixes the operator, so I will close once the PR is merged.

> GoogleCloudStorageToBigQueryOperator succeeds in uploading CSV file from GCS 
> to BigQuery but the task fails
> -
>
> Key: AIRFLOW-3571
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3571
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.10.0
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> I am using the following services in the asia-northeast1-c zone:
> * GCS: asia-northeast1-c
> * BigQuery dataset and table: asia-northeast1-c
> * Composer: asia-northeast1-c
> My task created by GoogleCloudStorageToBigQueryOperator succeeded in uploading a CSV file from a GCS bucket to a BigQuery table, but the task failed with the following error.
> {code:java}
> [2018-12-26 21:35:47,464] {base_task_runner.py:107} INFO - Job 146: Subtask bq_load_data_into_dest_table_from_gcs [2018-12-26 21:35:47,464] {discovery.py:871} INFO - URL being requested: GET https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
> [2018-12-26 21:35:47,931] {models.py:1736} ERROR - ('BigQuery job status check failed. Final error was: %s', 404)
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 981, in run_with_configuration
>     jobId=self.running_job_id).execute()
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
>     return wrapped(*args, **kwargs)
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/http.py", line 851, in execute
>     raise HttpError(resp, content, uri=self.uri)
> googleapiclient.errors.HttpError: <HttpError 404 when requesting https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json returned "Not found: Job my-project:job_abc123">
> During handling of the above exception, another exception occurred:
> Traceback (most recent call last):
>   File "/usr/local/lib/airflow/airflow/models.py", line 1633, in _run_raw_task
>     result = task_copy.execute(context=context)
>   File "/usr/local/lib/airflow/airflow/contrib/operators/gcs_to_bq.py", line 237, in execute
>     time_partitioning=self.time_partitioning)
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 951, in run_load
>     return self.run_with_configuration(configuration)
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 1003, in run_with_configuration
>     err.resp.status)
> Exception: ('BigQuery job status check failed. Final error was: %s', 404)
> {code}
> The task failed to find the job {color:#ff0000}my-project:job_abc123{color}, but the correct job id is {color:#ff0000}my-project:asia-northeast1:job_abc123{color}. (Note: these are just examples, not actual ids.)
> I suppose the operator does not handle the location (zone) properly.
> {code:java}
> $ bq show -j my-project:asia-northeast1:job_abc123
> Job my-project:asia-northeast1:job_abc123
>
>  Job Type   State    Start Time        Duration  User Email                   Bytes Processed  Bytes Billed  Billing Tier  Labels
>  ---------- -------- ----------------- --------- --------------------------- ---------------- ------------- ------------- --------
>  load       SUCCESS  27 Dec 05:35:47   0:00:01   my-service-account-id-email
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3571) GoogleCloudStorageToBigQueryOperator succeeds in uploading CSV file from GCS to BigQuery but the task fails

2018-12-27 Thread Yohei Onishi (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729468#comment-16729468
 ] 

Yohei Onishi commented on AIRFLOW-3571:
---

This PR also fixes the operator, so I will close this issue once the PR is 
merged.

> GoogleCloudStorageToBigQueryOperator succeeds to uploading CSV file from GCS 
> to BiqQuery but a task is failed
> -
>
> Key: AIRFLOW-3571
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3571
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.10.0
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> I am using the following services in the asia-northeast1-c zone:
>  * GCS: asia-northeast1-c
>  * BigQuery dataset and table: asia-northeast1-c
>  * Composer: asia-northeast1-c
> My task created by GoogleCloudStorageToBigQueryOperator succeeded in 
> uploading a CSV file from a GCS bucket to a BigQuery table, but the task 
> failed due to the following error.
>   
> {code:java}
> [2018-12-26 21:35:47,464] {base_task_runner.py:107} INFO - Job 146: Subtask 
> bq_load_data_into_dest_table_from_gcs [2018-12-26 21:35:47,464] 
> {discovery.py:871} INFO - URL being requested: GET 
> https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
> [2018-12-26 21:35:47,931] {models.py:1736} ERROR - ('BigQuery job status 
> check failed. Final error was: %s', 404)
> Traceback (most recent call last)
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
> 981, in run_with_configuratio
> jobId=self.running_job_id).execute(
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/_helpers.py", 
> line 130, in positional_wrappe
> return wrapped(*args, **kwargs
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/http.py", line 
> 851, in execut
> raise HttpError(resp, content, uri=self.uri
> googleapiclient.errors.HttpError:  https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
>  returned "Not found: Job my-project:job_abc123"
> During handling of the above exception, another exception occurred
> Traceback (most recent call last)
>   File "/usr/local/lib/airflow/airflow/models.py", line 1633, in _run_raw_tas
> result = task_copy.execute(context=context
>   File "/usr/local/lib/airflow/airflow/contrib/operators/gcs_to_bq.py", line 
> 237, in execut
> time_partitioning=self.time_partitioning
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
> 951, in run_loa
> return self.run_with_configuration(configuration
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
> 1003, in run_with_configuratio
> err.resp.status
> Exception: ('BigQuery job status check failed. Final error was: %s', 404
> {code}
> The task failed to find the job {color:#ff}my-project:job_abc123{color}, 
> but the correct job id is {color:#ff}my-project:asia-northeast1:job_abc123{color}. 
> (Note: this is just an example, not an actual id.)
>  I suppose the operator does not handle the zone properly.
>   
> {code:java}
> $ bq show -j my-project:asia-northeast1:job_abc123
> Job my-project:asia-northeast1:job_abc123
> Job Type State Start Time Duration User Email Bytes Processed Bytes Billed Billing Tier Labels
> ---------------------------------------------------------------------------------------------
> load SUCCESS 27 Dec 05:35:47 0:00:01 my-service-account-id-email
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3568) S3ToGoogleCloudStorageOperator failed after succeeding in copying files from s3

2018-12-26 Thread Yohei Onishi (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729402#comment-16729402
 ] 

Yohei Onishi commented on AIRFLOW-3568:
---

OK I will check when I have time.

> S3ToGoogleCloudStorageOperator failed after succeeding in copying files from 
> s3
> ---
>
> Key: AIRFLOW-3568
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3568
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.10.0
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> I tried to copy files from s3 to gcs using 
> S3ToGoogleCloudStorageOperator. The file successfully was uploaded to GCS but 
> the task failed with the following error.
> {code:java}
> [2018-12-26 07:56:33,062] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,062] {discovery.py:871} INFO - 
> URL being requested: POST 
> https://www.googleapis.com/upload/storage/v1/b/stg-rfid-etl-tmp/o?name=rfid_wh%2Fuq%2Fjp%2Fno_resp_carton_1D%2F2018%2F12%2F24%2F21%2Fno_resp_carton_20181224210201.csv=json=media
> [2018-12-26 07:56:33,214] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,213] {s3_to_gcs_operator.py:177} 
> INFO - All done, uploaded 1 files to Google Cloud Storage
> [2018-12-26 07:56:33,217] {models.py:1736} ERROR - Object of type 'set' is 
> not JSON serializable
> Traceback (most recent call last)
>   File "/usr/local/lib/airflow/airflow/models.py", line 1637, in _run_raw_tas
> self.xcom_push(key=XCOM_RETURN_KEY, value=result
>   File "/usr/local/lib/airflow/airflow/models.py", line 1983, in xcom_pus
> execution_date=execution_date or self.execution_date
>   File "/usr/local/lib/airflow/airflow/utils/db.py", line 74, in wrappe
> return func(*args, **kwargs
>   File "/usr/local/lib/airflow/airflow/models.py", line 4531, in se
> value = json.dumps(value).encode('UTF-8'
>   File "/usr/local/lib/python3.6/json/__init__.py", line 231, in dump
> return _default_encoder.encode(obj
>   File "/usr/local/lib/python3.6/json/encoder.py", line 199, in encod
> chunks = self.iterencode(o, _one_shot=True
>   File "/usr/local/lib/python3.6/json/encoder.py", line 257, in iterencod
> return _iterencode(o, 0
>   File "/usr/local/lib/python3.6/json/encoder.py", line 180, in defaul
> o.__class__.__name__
> TypeError: Object of type 'set' is not JSON serializabl
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,217] {models.py:1736} ERROR - 
> Object of type 'set' is not JSON serializable
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 Traceback (most recent call last):
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 1637, in _run_raw_task
> [2018-12-26 07:56:33,220] {models.py:1756} INFO - Marking task as UP_FOR_RETRY
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 self.xcom_push(key=XCOM_RETURN_KEY, value=result)
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 1983, in xcom_push
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 execution_date=execution_date or 
> self.execution_date)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/utils/db.py", 
> line 74, in wrapper
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 return func(*args, **kwargs)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 4531, in set
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 value = json.dumps(value).encode('UTF-8')
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/python3.6/json/__init__.py", 
> line 231, in dumps
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 return _default_encoder.encode(obj)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/python3.6/json/encoder.py", 
> line 199, in encode
> [2018-12-26 07:56:33,222] {base_task_runner.py:107} INFO - Job 39: 

[jira] [Commented] (AIRFLOW-3571) GoogleCloudStorageToBigQueryOperator succeeds to uploading CSV file from GCS to BiqQuery but a task is failed

2018-12-26 Thread Yohei Onishi (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729399#comment-16729399
 ] 

Yohei Onishi commented on AIRFLOW-3571:
---

Thanks, will do.

> GoogleCloudStorageToBigQueryOperator succeeds to uploading CSV file from GCS 
> to BiqQuery but a task is failed
> -
>
> Key: AIRFLOW-3571
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3571
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.10.0
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> I am using the following services in the asia-northeast1-c zone:
>  * GCS: asia-northeast1-c
>  * BigQuery dataset and table: asia-northeast1-c
>  * Composer: asia-northeast1-c
> My task created by GoogleCloudStorageToBigQueryOperator succeeded in 
> uploading a CSV file from a GCS bucket to a BigQuery table, but the task 
> failed due to the following error.
>   
> {code:java}
> [2018-12-26 21:35:47,464] {base_task_runner.py:107} INFO - Job 146: Subtask 
> bq_load_data_into_dest_table_from_gcs [2018-12-26 21:35:47,464] 
> {discovery.py:871} INFO - URL being requested: GET 
> https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
> [2018-12-26 21:35:47,931] {models.py:1736} ERROR - ('BigQuery job status 
> check failed. Final error was: %s', 404)
> Traceback (most recent call last)
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
> 981, in run_with_configuratio
> jobId=self.running_job_id).execute(
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/_helpers.py", 
> line 130, in positional_wrappe
> return wrapped(*args, **kwargs
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/http.py", line 
> 851, in execut
> raise HttpError(resp, content, uri=self.uri
> googleapiclient.errors.HttpError:  https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
>  returned "Not found: Job my-project:job_abc123"
> During handling of the above exception, another exception occurred
> Traceback (most recent call last)
>   File "/usr/local/lib/airflow/airflow/models.py", line 1633, in _run_raw_tas
> result = task_copy.execute(context=context
>   File "/usr/local/lib/airflow/airflow/contrib/operators/gcs_to_bq.py", line 
> 237, in execut
> time_partitioning=self.time_partitioning
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
> 951, in run_loa
> return self.run_with_configuration(configuration
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
> 1003, in run_with_configuratio
> err.resp.status
> Exception: ('BigQuery job status check failed. Final error was: %s', 404
> {code}
> The task failed to find the job {color:#ff}my-project:job_abc123{color}, 
> but the correct job id is {color:#ff}my-project:asia-northeast1:job_abc123{color}. 
> (Note: this is just an example, not an actual id.)
>  I suppose the operator does not handle the zone properly.
>   
> {code:java}
> $ bq show -j my-project:asia-northeast1:job_abc123
> Job my-project:asia-northeast1:job_abc123
> Job Type State Start Time Duration User Email Bytes Processed Bytes Billed Billing Tier Labels
> ---------------------------------------------------------------------------------------------
> load SUCCESS 27 Dec 05:35:47 0:00:01 my-service-account-id-email
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (AIRFLOW-3571) GoogleCloudStorageToBigQueryOperator succeeds to uploading CSV file from GCS to BiqQuery but a task is failed

2018-12-26 Thread Yohei Onishi (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yohei Onishi reassigned AIRFLOW-3571:
-

Assignee: Yohei Onishi

> GoogleCloudStorageToBigQueryOperator succeeds to uploading CSV file from GCS 
> to BiqQuery but a task is failed
> -
>
> Key: AIRFLOW-3571
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3571
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.10.0
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> I am using the following services in the asia-northeast1-c zone:
>  * GCS: asia-northeast1-c
>  * BigQuery dataset and table: asia-northeast1-c
>  * Composer: asia-northeast1-c
> My task created by GoogleCloudStorageToBigQueryOperator succeeded in 
> uploading a CSV file from a GCS bucket to a BigQuery table, but the task 
> failed due to the following error.
>   
> {code:java}
> [2018-12-26 21:35:47,464] {base_task_runner.py:107} INFO - Job 146: Subtask 
> bq_load_data_into_dest_table_from_gcs [2018-12-26 21:35:47,464] 
> {discovery.py:871} INFO - URL being requested: GET 
> https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
> [2018-12-26 21:35:47,931] {models.py:1736} ERROR - ('BigQuery job status 
> check failed. Final error was: %s', 404)
> Traceback (most recent call last)
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
> 981, in run_with_configuratio
> jobId=self.running_job_id).execute(
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/_helpers.py", 
> line 130, in positional_wrappe
> return wrapped(*args, **kwargs
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/http.py", line 
> 851, in execut
> raise HttpError(resp, content, uri=self.uri
> googleapiclient.errors.HttpError:  https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
>  returned "Not found: Job my-project:job_abc123"
> During handling of the above exception, another exception occurred
> Traceback (most recent call last)
>   File "/usr/local/lib/airflow/airflow/models.py", line 1633, in _run_raw_tas
> result = task_copy.execute(context=context
>   File "/usr/local/lib/airflow/airflow/contrib/operators/gcs_to_bq.py", line 
> 237, in execut
> time_partitioning=self.time_partitioning
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
> 951, in run_loa
> return self.run_with_configuration(configuration
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
> 1003, in run_with_configuratio
> err.resp.status
> Exception: ('BigQuery job status check failed. Final error was: %s', 404
> {code}
> The task failed to find the job {color:#ff}my-project:job_abc123{color}, 
> but the correct job id is {color:#ff}my-project:asia-northeast1:job_abc123{color}. 
> (Note: this is just an example, not an actual id.)
>  I suppose the operator does not handle the zone properly.
>   
> {code:java}
> $ bq show -j my-project:asia-northeast1:job_abc123
> Job my-project:asia-northeast1:job_abc123
> Job Type State Start Time Duration User Email Bytes Processed Bytes Billed Billing Tier Labels
> ---------------------------------------------------------------------------------------------
> load SUCCESS 27 Dec 05:35:47 0:00:01 my-service-account-id-email
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3568) S3ToGoogleCloudStorageOperator failed after succeeding in copying files from s3

2018-12-26 Thread Yohei Onishi (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729281#comment-16729281
 ] 

Yohei Onishi commented on AIRFLOW-3568:
---

[~jackjack10] My PR is approved. Thank you for your support.

> S3ToGoogleCloudStorageOperator failed after succeeding in copying files from 
> s3
> ---
>
> Key: AIRFLOW-3568
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3568
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.10.0
>Reporter: Yohei Onishi
>Assignee: Yohei Onishi
>Priority: Major
>
> I tried to copy files from s3 to gcs using 
> S3ToGoogleCloudStorageOperator. The file successfully was uploaded to GCS but 
> the task failed with the following error.
> {code:java}
> [2018-12-26 07:56:33,062] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,062] {discovery.py:871} INFO - 
> URL being requested: POST 
> https://www.googleapis.com/upload/storage/v1/b/stg-rfid-etl-tmp/o?name=rfid_wh%2Fuq%2Fjp%2Fno_resp_carton_1D%2F2018%2F12%2F24%2F21%2Fno_resp_carton_20181224210201.csv=json=media
> [2018-12-26 07:56:33,214] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,213] {s3_to_gcs_operator.py:177} 
> INFO - All done, uploaded 1 files to Google Cloud Storage
> [2018-12-26 07:56:33,217] {models.py:1736} ERROR - Object of type 'set' is 
> not JSON serializable
> Traceback (most recent call last)
>   File "/usr/local/lib/airflow/airflow/models.py", line 1637, in _run_raw_tas
> self.xcom_push(key=XCOM_RETURN_KEY, value=result
>   File "/usr/local/lib/airflow/airflow/models.py", line 1983, in xcom_pus
> execution_date=execution_date or self.execution_date
>   File "/usr/local/lib/airflow/airflow/utils/db.py", line 74, in wrappe
> return func(*args, **kwargs
>   File "/usr/local/lib/airflow/airflow/models.py", line 4531, in se
> value = json.dumps(value).encode('UTF-8'
>   File "/usr/local/lib/python3.6/json/__init__.py", line 231, in dump
> return _default_encoder.encode(obj
>   File "/usr/local/lib/python3.6/json/encoder.py", line 199, in encod
> chunks = self.iterencode(o, _one_shot=True
>   File "/usr/local/lib/python3.6/json/encoder.py", line 257, in iterencod
> return _iterencode(o, 0
>   File "/usr/local/lib/python3.6/json/encoder.py", line 180, in defaul
> o.__class__.__name__
> TypeError: Object of type 'set' is not JSON serializabl
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,217] {models.py:1736} ERROR - 
> Object of type 'set' is not JSON serializable
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 Traceback (most recent call last):
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 1637, in _run_raw_task
> [2018-12-26 07:56:33,220] {models.py:1756} INFO - Marking task as UP_FOR_RETRY
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 self.xcom_push(key=XCOM_RETURN_KEY, value=result)
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 1983, in xcom_push
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 execution_date=execution_date or 
> self.execution_date)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/utils/db.py", 
> line 74, in wrapper
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 return func(*args, **kwargs)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 4531, in set
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 value = json.dumps(value).encode('UTF-8')
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/python3.6/json/__init__.py", 
> line 231, in dumps
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 return _default_encoder.encode(obj)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/python3.6/json/encoder.py", 
> line 199, in encode
> [2018-12-26 07:56:33,222] 

[jira] [Commented] (AIRFLOW-3571) GoogleCloudStorageToBigQueryOperator succeeds to uploading CSV file from GCS to BiqQuery but a task is failed

2018-12-26 Thread Yohei Onishi (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729263#comment-16729263
 ] 

Yohei Onishi commented on AIRFLOW-3571:
---

It seems {color:#d04437}GoogleCloudStorageToBigQueryOperator{color} does not 
support regions other than US / EU.
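
That would explain the 404 above: without a location, BigQuery's jobs.get only 
looks in the US and EU multi-regions, so a job that ran in asia-northeast1 is 
reported as not found even though it succeeded. A minimal sketch of the 
difference using the BigQuery REST API directly (project and job ids are 
illustrative, default application credentials assumed):

{code:python}
# Minimal sketch (illustrative ids, default credentials): the same job lookup
# with and without the job's location.
from googleapiclient.discovery import build

bq = build('bigquery', 'v2', cache_discovery=False)

# No location: the API checks the US/EU multi-regions only, so a job created
# in asia-northeast1 raises HttpError 404 ("Not found: Job ...").
bq.jobs().get(projectId='my-project', jobId='job_abc123').execute()

# With the location, the lookup finds the job and returns its status.
bq.jobs().get(projectId='my-project', jobId='job_abc123',
              location='asia-northeast1').execute()
{code}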

> GoogleCloudStorageToBigQueryOperator succeeds to uploading CSV file from GCS 
> to BiqQuery but a task is failed
> -
>
> Key: AIRFLOW-3571
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3571
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.10.0
>Reporter: Yohei Onishi
>Priority: Major
>
> I am using the following services in the asia-northeast1-c zone:
>  * GCS: asia-northeast1-c
>  * BigQuery dataset and table: asia-northeast1-c
>  * Composer: asia-northeast1-c
> My task created by GoogleCloudStorageToBigQueryOperator succeeded in 
> uploading a CSV file from a GCS bucket to a BigQuery table, but the task 
> failed due to the following error.
>   
> {code:java}
> [2018-12-26 21:35:47,464] {base_task_runner.py:107} INFO - Job 146: Subtask 
> bq_load_data_into_dest_table_from_gcs [2018-12-26 21:35:47,464] 
> {discovery.py:871} INFO - URL being requested: GET 
> https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
> [2018-12-26 21:35:47,931] {models.py:1736} ERROR - ('BigQuery job status 
> check failed. Final error was: %s', 404)
> Traceback (most recent call last)
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
> 981, in run_with_configuratio
> jobId=self.running_job_id).execute(
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/_helpers.py", 
> line 130, in positional_wrappe
> return wrapped(*args, **kwargs
>   File "/usr/local/lib/python3.6/site-packages/googleapiclient/http.py", line 
> 851, in execut
> raise HttpError(resp, content, uri=self.uri
> googleapiclient.errors.HttpError:  https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
>  returned "Not found: Job my-project:job_abc123"
> During handling of the above exception, another exception occurred
> Traceback (most recent call last)
>   File "/usr/local/lib/airflow/airflow/models.py", line 1633, in _run_raw_tas
> result = task_copy.execute(context=context
>   File "/usr/local/lib/airflow/airflow/contrib/operators/gcs_to_bq.py", line 
> 237, in execut
> time_partitioning=self.time_partitioning
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
> 951, in run_loa
> return self.run_with_configuration(configuration
>   File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
> 1003, in run_with_configuratio
> err.resp.status
> Exception: ('BigQuery job status check failed. Final error was: %s', 404
> {code}
> The task failed to find the job {color:#ff}my-project:job_abc123{color}, 
> but the correct job id is {color:#ff}my-project:asia-northeast1:job_abc123{color}. 
> (Note: this is just an example, not an actual id.)
>  I suppose the operator does not handle the zone properly.
>   
> {code:java}
> $ bq show -j my-project:asia-northeast1:job_abc123
> Job my-project:asia-northeast1:job_abc123
> Job Type State Start Time Duration User Email Bytes Processed Bytes Billed Billing Tier Labels
> ---------------------------------------------------------------------------------------------
> load SUCCESS 27 Dec 05:35:47 0:00:01 my-service-account-id-email
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (AIRFLOW-3571) GoogleCloudStorageToBigQueryOperator succeeds to uploading CSV file from GCS to BiqQuery but a task is failed

2018-12-26 Thread Yohei Onishi (JIRA)
Yohei Onishi created AIRFLOW-3571:
-

 Summary: GoogleCloudStorageToBigQueryOperator succeeds to 
uploading CSV file from GCS to BiqQuery but a task is failed
 Key: AIRFLOW-3571
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3571
 Project: Apache Airflow
  Issue Type: Bug
  Components: contrib
Affects Versions: 1.10.0
Reporter: Yohei Onishi


I am using the following services in the asia-northeast1-c zone:
 * GCS: asia-northeast1-c
 * BigQuery dataset and table: asia-northeast1-c
 * Composer: asia-northeast1-c

My task created by GoogleCloudStorageToBigQueryOperator succeeded in uploading 
a CSV file from a GCS bucket to a BigQuery table, but the task failed due to 
the following error.
 
{code:java}
[2018-12-26 21:35:47,464] {base_task_runner.py:107} INFO - Job 146: Subtask 
bq_load_data_into_dest_table_from_gcs [2018-12-26 21:35:47,464] 
{discovery.py:871} INFO - URL being requested: GET 
https://www.googleapis.com/bigquery/v2/projects/fr-stg-datalake/jobs/job_QQE9TDEu88mfdw_fJHHEo9FtjXja?alt=json
[2018-12-26 21:35:47,931] {models.py:1736} ERROR - ('BigQuery job status check 
failed. Final error was: %s', 404)
Traceback (most recent call last):
  File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
981, in run_with_configuration
jobId=self.running_job_id).execute()
  File "/usr/local/lib/python3.6/site-packages/googleapiclient/_helpers.py", 
line 130, in positional_wrapper
return wrapped(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/googleapiclient/http.py", line 
851, in execute
raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
 returned "Not found: Job my-project:job_abc123"

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/airflow/airflow/models.py", line 1633, in _run_raw_task
result = task_copy.execute(context=context)
  File "/usr/local/lib/airflow/airflow/contrib/operators/gcs_to_bq.py", line 
237, in execute
time_partitioning=self.time_partitioning)
  File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
951, in run_load
return self.run_with_configuration(configuration)
  File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
1003, in run_with_configuration
err.resp.status)
Exception: ('BigQuery job status check failed. Final error was: %s', 404)
{code}
The task failed to find the job {color:#FF}my-project:job_abc123{color}, but 
the correct job id is {color:#FF}my-project:asia-northeast1:job_abc123{color}. 
(Note: this is just an example, not an actual id.)
I suppose the operator does not handle the zone properly.
 
{code:java}
$ bq show -j my-project:asia-northeast1:job_abc123
Job my-project:asia-northeast1:job_abc123

Job Type State Start Time Duration User Email Bytes Processed Bytes Billed Billing Tier Labels
----------------------------------------------------------------------------------------------
load SUCCESS 27 Dec 05:35:47 0:00:01 my-service-account-id-email
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (AIRFLOW-3571) GoogleCloudStorageToBigQueryOperator succeeds to uploading CSV file from GCS to BiqQuery but a task is failed

2018-12-26 Thread Yohei Onishi (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yohei Onishi updated AIRFLOW-3571:
--
Description: 
I am using the following services in the asia-northeast1-c zone:
 * GCS: asia-northeast1-c
 * BigQuery dataset and table: asia-northeast1-c
 * Composer: asia-northeast1-c

My task created by GoogleCloudStorageToBigQueryOperator succeeded in uploading 
a CSV file from a GCS bucket to a BigQuery table, but the task failed due to 
the following error.
  
{code:java}
[2018-12-26 21:35:47,464] {base_task_runner.py:107} INFO - Job 146: Subtask 
bq_load_data_into_dest_table_from_gcs [2018-12-26 21:35:47,464] 
{discovery.py:871} INFO - URL being requested: GET 
https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
[2018-12-26 21:35:47,931] {models.py:1736} ERROR - ('BigQuery job status check 
failed. Final error was: %s', 404)
Traceback (most recent call last):
  File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
981, in run_with_configuration
jobId=self.running_job_id).execute()
  File "/usr/local/lib/python3.6/site-packages/googleapiclient/_helpers.py", 
line 130, in positional_wrapper
return wrapped(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/googleapiclient/http.py", line 
851, in execute
raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
 returned "Not found: Job my-project:job_abc123"

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/airflow/airflow/models.py", line 1633, in _run_raw_task
result = task_copy.execute(context=context)
  File "/usr/local/lib/airflow/airflow/contrib/operators/gcs_to_bq.py", line 
237, in execute
time_partitioning=self.time_partitioning)
  File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
951, in run_load
return self.run_with_configuration(configuration)
  File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
1003, in run_with_configuration
err.resp.status)
Exception: ('BigQuery job status check failed. Final error was: %s', 404)
{code}
The task failed to find the job {color:#ff}my-project:job_abc123{color}, but 
the correct job id is {color:#ff}my-project:asia-northeast1:job_abc123{color}. 
(Note: this is just an example, not an actual id.)
 I suppose the operator does not handle the zone properly.
  
{code:java}
$ bq show -j my-project:asia-northeast1:job_abc123
Job my-project:asia-northeast1:job_abc123

Job Type State Start Time Duration User Email Bytes Processed Bytes Billed Billing Tier Labels
----------------------------------------------------------------------------------------------
load SUCCESS 27 Dec 05:35:47 0:00:01 my-service-account-id-email
{code}

  was:
I am using the following service in asia-northeast1-c zone. * GCS: 
asia-northeast1-c
 * BigQuery dataset and table: asia-northeast1-c
 * Composer: asia-northeast1-c

My task created by GoogleCloudStorageToBigQueryOperator succeeded to uploading 
CSV file from a GCS bucket to a BigQuery table but the task was failed due to 
the following error.
 
{code:java}
[2018-12-26 21:35:47,464] {base_task_runner.py:107} INFO - Job 146: Subtask 
bq_load_data_into_dest_table_from_gcs [2018-12-26 21:35:47,464] 
{discovery.py:871} INFO - URL being requested: GET 
https://www.googleapis.com/bigquery/v2/projects/fr-stg-datalake/jobs/job_QQE9TDEu88mfdw_fJHHEo9FtjXja?alt=json
[2018-12-26 21:35:47,931] {models.py:1736} ERROR - ('BigQuery job status check 
failed. Final error was: %s', 404)
Traceback (most recent call last)
  File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
981, in run_with_configuratio
jobId=self.running_job_id).execute(
  File "/usr/local/lib/python3.6/site-packages/googleapiclient/_helpers.py", 
line 130, in positional_wrappe
return wrapped(*args, **kwargs
  File "/usr/local/lib/python3.6/site-packages/googleapiclient/http.py", line 
851, in execut
raise HttpError(resp, content, uri=self.uri
googleapiclient.errors.HttpError: https://www.googleapis.com/bigquery/v2/projects/my-project/jobs/job_abc123?alt=json
 returned "Not found: Job my-project:job_abc123"

During handling of the above exception, another exception occurred

Traceback (most recent call last)
  File "/usr/local/lib/airflow/airflow/models.py", line 1633, in _run_raw_tas
result = task_copy.execute(context=context
  File "/usr/local/lib/airflow/airflow/contrib/operators/gcs_to_bq.py", line 
237, in execut
time_partitioning=self.time_partitioning
  File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 
951, in run_loa
return self.run_with_configuration(configuration
  File 

[jira] [Commented] (AIRFLOW-2939) `set` fails in case of `exisiting_files is None` and in case of `json.dumps`

2018-12-26 Thread Yohei Onishi (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728959#comment-16728959
 ] 

Yohei Onishi commented on AIRFLOW-2939:
---

Sent a PR for this issue: https://github.com/apache/incubator-airflow/pull/4371
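
The core of the fix is the snippet in the Solution section below: XCom values 
go through json.dumps, which handles a list but not a set, and set(None) itself 
raises. A quick, self-contained illustration (file names are made up):

{code:python}
# Self-contained illustration of the failure and the guard (made-up file names).
import json

files = ['a.csv', 'b.csv']
existing_files = None  # the destination listing can legitimately be None

# set(existing_files) would raise TypeError when existing_files is None, so
# guard first; converting back to a list keeps the value JSON serializable,
# which is what XCom's json.dumps needs.
if existing_files is not None:
    files = list(set(files) - set(existing_files))

print(json.dumps(files))   # works: lists are JSON serializable
# json.dumps(set(files))   # would raise: set is not JSON serializable
{code}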

> `set` fails in case of `exisiting_files is None` and in case of `json.dumps`
> 
>
> Key: AIRFLOW-2939
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2939
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: operators
>Affects Versions: 2.0.0
>Reporter: Kiyoshi Nomo
>Priority: Major
>
> h1. Problems
> h2. TypeError: 'NoneType' object is not iterable
> [https://github.com/apache/incubator-airflow/blob/06b62c42b0b55ea55b86b130317594738d2f36a2/airflow/contrib/operators/gcs_to_s3.py#L91]
>  
> {code:java}
> >>> set(None)
> Traceback (most recent call last):
> File "", line 1, in 
> TypeError: 'NoneType' object is not iterable
> {code}
>  
> h2. TypeError: set(['a']) is not JSON serializable
> [https://github.com/apache/incubator-airflow/blob/b78c7fb8512f7a40f58b46530e9b3d5562fe84ea/airflow/models.py#L4483]
>  
> {code:python}
> >>> json.dumps(set(['a']))
> Traceback (most recent call last):
> File "", line 1, in 
> File "/usr/local/opt/pyenv/versions/2.7.11/lib/python2.7/json/__init__.py", 
> line 244, in dumps
> return _default_encoder.encode(obj)
> File "/usr/local/opt/pyenv/versions/2.7.11/lib/python2.7/json/encoder.py", 
> line 207, in encode
> chunks = self.iterencode(o, _one_shot=True)
> File "/usr/local/opt/pyenv/versions/2.7.11/lib/python2.7/json/encoder.py", 
> line 270, in iterencode
> return _iterencode(o, 0)
> File "/usr/local/opt/pyenv/versions/2.7.11/lib/python2.7/json/encoder.py", 
> line 184, in default
> raise TypeError(repr(o) + " is not JSON serializable")
> TypeError: set(['a']) is not JSON serializable
> {code}
>  
> h1. Solution
>  * Check that `existing_files` is not None.
>  * Convert both lists to `set`, take the difference, and convert the result 
> back to a `list`.
> {code:python}
> if existing_files is not None:
>     files = list(set(files) - set(existing_files))
> {code}
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-3568) S3ToGoogleCloudStorageOperator failed after succeeding in copying files from s3

2018-12-26 Thread Yohei Onishi (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728958#comment-16728958
 ] 

Yohei Onishi commented on AIRFLOW-3568:
---

sent PR https://github.com/apache/incubator-airflow/pull/4371

> S3ToGoogleCloudStorageOperator failed after succeeding in copying files from 
> s3
> ---
>
> Key: AIRFLOW-3568
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3568
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.10.0
>Reporter: Yohei Onishi
>Priority: Major
>
> I tried to copy files from s3 to gcs using 
> S3ToGoogleCloudStorageOperator. The file successfully was uploaded to GCS but 
> the task failed with the following error.
> {code:java}
> [2018-12-26 07:56:33,062] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,062] {discovery.py:871} INFO - 
> URL being requested: POST 
> https://www.googleapis.com/upload/storage/v1/b/stg-rfid-etl-tmp/o?name=rfid_wh%2Fuq%2Fjp%2Fno_resp_carton_1D%2F2018%2F12%2F24%2F21%2Fno_resp_carton_20181224210201.csv=json=media
> [2018-12-26 07:56:33,214] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,213] {s3_to_gcs_operator.py:177} 
> INFO - All done, uploaded 1 files to Google Cloud Storage
> [2018-12-26 07:56:33,217] {models.py:1736} ERROR - Object of type 'set' is 
> not JSON serializable
> Traceback (most recent call last)
>   File "/usr/local/lib/airflow/airflow/models.py", line 1637, in _run_raw_tas
> self.xcom_push(key=XCOM_RETURN_KEY, value=result
>   File "/usr/local/lib/airflow/airflow/models.py", line 1983, in xcom_pus
> execution_date=execution_date or self.execution_date
>   File "/usr/local/lib/airflow/airflow/utils/db.py", line 74, in wrappe
> return func(*args, **kwargs
>   File "/usr/local/lib/airflow/airflow/models.py", line 4531, in se
> value = json.dumps(value).encode('UTF-8'
>   File "/usr/local/lib/python3.6/json/__init__.py", line 231, in dump
> return _default_encoder.encode(obj
>   File "/usr/local/lib/python3.6/json/encoder.py", line 199, in encod
> chunks = self.iterencode(o, _one_shot=True
>   File "/usr/local/lib/python3.6/json/encoder.py", line 257, in iterencod
> return _iterencode(o, 0
>   File "/usr/local/lib/python3.6/json/encoder.py", line 180, in defaul
> o.__class__.__name__
> TypeError: Object of type 'set' is not JSON serializabl
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,217] {models.py:1736} ERROR - 
> Object of type 'set' is not JSON serializable
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 Traceback (most recent call last):
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 1637, in _run_raw_task
> [2018-12-26 07:56:33,220] {models.py:1756} INFO - Marking task as UP_FOR_RETRY
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 self.xcom_push(key=XCOM_RETURN_KEY, value=result)
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 1983, in xcom_push
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 execution_date=execution_date or 
> self.execution_date)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/utils/db.py", 
> line 74, in wrapper
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 return func(*args, **kwargs)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 4531, in set
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 value = json.dumps(value).encode('UTF-8')
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/python3.6/json/__init__.py", 
> line 231, in dumps
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 return _default_encoder.encode(obj)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/python3.6/json/encoder.py", 
> line 199, in encode
> [2018-12-26 07:56:33,222] {base_task_runner.py:107} INFO - Job 39: Subtask 

[jira] [Commented] (AIRFLOW-3568) S3ToGoogleCloudStorageOperator failed after succeeding in copying files from s3

2018-12-26 Thread Yohei Onishi (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728946#comment-16728946
 ] 

Yohei Onishi commented on AIRFLOW-3568:
---

ok will do

> S3ToGoogleCloudStorageOperator failed after succeeding in copying files from 
> s3
> ---
>
> Key: AIRFLOW-3568
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3568
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.10.0
>Reporter: Yohei Onishi
>Priority: Major
>
> I tried to copy files from s3 to gcs using 
> S3ToGoogleCloudStorageOperator. The file successfully was uploaded to GCS but 
> the task failed with the following error.
> {code:java}
> [2018-12-26 07:56:33,062] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,062] {discovery.py:871} INFO - 
> URL being requested: POST 
> https://www.googleapis.com/upload/storage/v1/b/stg-rfid-etl-tmp/o?name=rfid_wh%2Fuq%2Fjp%2Fno_resp_carton_1D%2F2018%2F12%2F24%2F21%2Fno_resp_carton_20181224210201.csv=json=media
> [2018-12-26 07:56:33,214] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,213] {s3_to_gcs_operator.py:177} 
> INFO - All done, uploaded 1 files to Google Cloud Storage
> [2018-12-26 07:56:33,217] {models.py:1736} ERROR - Object of type 'set' is 
> not JSON serializable
> Traceback (most recent call last)
>   File "/usr/local/lib/airflow/airflow/models.py", line 1637, in _run_raw_tas
> self.xcom_push(key=XCOM_RETURN_KEY, value=result
>   File "/usr/local/lib/airflow/airflow/models.py", line 1983, in xcom_pus
> execution_date=execution_date or self.execution_date
>   File "/usr/local/lib/airflow/airflow/utils/db.py", line 74, in wrappe
> return func(*args, **kwargs
>   File "/usr/local/lib/airflow/airflow/models.py", line 4531, in se
> value = json.dumps(value).encode('UTF-8'
>   File "/usr/local/lib/python3.6/json/__init__.py", line 231, in dump
> return _default_encoder.encode(obj
>   File "/usr/local/lib/python3.6/json/encoder.py", line 199, in encod
> chunks = self.iterencode(o, _one_shot=True
>   File "/usr/local/lib/python3.6/json/encoder.py", line 257, in iterencod
> return _iterencode(o, 0
>   File "/usr/local/lib/python3.6/json/encoder.py", line 180, in defaul
> o.__class__.__name__
> TypeError: Object of type 'set' is not JSON serializabl
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,217] {models.py:1736} ERROR - 
> Object of type 'set' is not JSON serializable
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 Traceback (most recent call last):
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 1637, in _run_raw_task
> [2018-12-26 07:56:33,220] {models.py:1756} INFO - Marking task as UP_FOR_RETRY
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 self.xcom_push(key=XCOM_RETURN_KEY, value=result)
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 1983, in xcom_push
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 execution_date=execution_date or 
> self.execution_date)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/utils/db.py", 
> line 74, in wrapper
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 return func(*args, **kwargs)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 4531, in set
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 value = json.dumps(value).encode('UTF-8')
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/python3.6/json/__init__.py", 
> line 231, in dumps
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 return _default_encoder.encode(obj)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/python3.6/json/encoder.py", 
> line 199, in encode
> [2018-12-26 07:56:33,222] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 chunks = 

[jira] [Commented] (AIRFLOW-3568) S3ToGoogleCloudStorageOperator failed after succeeding in copying files from s3

2018-12-26 Thread Yohei Onishi (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-3568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16728941#comment-16728941
 ] 

Yohei Onishi commented on AIRFLOW-3568:
---

Thanks, I think his suggestion can fix my issue. 
https://issues.apache.org/jira/browse/AIRFLOW-2939

> S3ToGoogleCloudStorageOperator failed after succeeding in copying files from 
> s3
> ---
>
> Key: AIRFLOW-3568
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3568
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.10.0
>Reporter: Yohei Onishi
>Priority: Major
>
> I tried to copy files from s3 to gcs using 
> S3ToGoogleCloudStorageOperator. The file successfully was uploaded to GCS but 
> the task failed with the following error.
> {code:java}
> [2018-12-26 07:56:33,062] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,062] {discovery.py:871} INFO - 
> URL being requested: POST 
> https://www.googleapis.com/upload/storage/v1/b/stg-rfid-etl-tmp/o?name=rfid_wh%2Fuq%2Fjp%2Fno_resp_carton_1D%2F2018%2F12%2F24%2F21%2Fno_resp_carton_20181224210201.csv=json=media
> [2018-12-26 07:56:33,214] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,213] {s3_to_gcs_operator.py:177} 
> INFO - All done, uploaded 1 files to Google Cloud Storage
> [2018-12-26 07:56:33,217] {models.py:1736} ERROR - Object of type 'set' is 
> not JSON serializable
> Traceback (most recent call last)
>   File "/usr/local/lib/airflow/airflow/models.py", line 1637, in _run_raw_tas
> self.xcom_push(key=XCOM_RETURN_KEY, value=result
>   File "/usr/local/lib/airflow/airflow/models.py", line 1983, in xcom_pus
> execution_date=execution_date or self.execution_date
>   File "/usr/local/lib/airflow/airflow/utils/db.py", line 74, in wrappe
> return func(*args, **kwargs
>   File "/usr/local/lib/airflow/airflow/models.py", line 4531, in se
> value = json.dumps(value).encode('UTF-8'
>   File "/usr/local/lib/python3.6/json/__init__.py", line 231, in dump
> return _default_encoder.encode(obj
>   File "/usr/local/lib/python3.6/json/encoder.py", line 199, in encod
> chunks = self.iterencode(o, _one_shot=True
>   File "/usr/local/lib/python3.6/json/encoder.py", line 257, in iterencod
> return _iterencode(o, 0
>   File "/usr/local/lib/python3.6/json/encoder.py", line 180, in defaul
> o.__class__.__name__
> TypeError: Object of type 'set' is not JSON serializabl
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,217] {models.py:1736} ERROR - 
> Object of type 'set' is not JSON serializable
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 Traceback (most recent call last):
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 1637, in _run_raw_task
> [2018-12-26 07:56:33,220] {models.py:1756} INFO - Marking task as UP_FOR_RETRY
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 self.xcom_push(key=XCOM_RETURN_KEY, value=result)
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 1983, in xcom_push
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 execution_date=execution_date or 
> self.execution_date)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/utils/db.py", 
> line 74, in wrapper
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 return func(*args, **kwargs)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 4531, in set
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 value = json.dumps(value).encode('UTF-8')
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/python3.6/json/__init__.py", 
> line 231, in dumps
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 return _default_encoder.encode(obj)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/python3.6/json/encoder.py", 
> line 199, in encode
> [2018-12-26 07:56:33,222] 

[jira] [Updated] (AIRFLOW-3568) S3ToGoogleCloudStorageOperator failed after succeeding in copying files from s3

2018-12-26 Thread Yohei Onishi (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-3568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yohei Onishi updated AIRFLOW-3568:
--
Summary: S3ToGoogleCloudStorageOperator failed after succeeding in copying 
files from s3  (was: S3ToGoogleCloudStorageOperator failed to copy files from 
s3)

> S3ToGoogleCloudStorageOperator failed after succeeding in copying files from 
> s3
> ---
>
> Key: AIRFLOW-3568
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3568
> Project: Apache Airflow
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 1.10.0
>Reporter: Yohei Onishi
>Priority: Major
>
> I tried to copy files from s3 to gcs using 
> S3ToGoogleCloudStorageOperator. The file successfully was uploaded to GCS but 
> the task failed with the following error.
> {code:java}
> [2018-12-26 07:56:33,062] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,062] {discovery.py:871} INFO - 
> URL being requested: POST 
> https://www.googleapis.com/upload/storage/v1/b/stg-rfid-etl-tmp/o?name=rfid_wh%2Fuq%2Fjp%2Fno_resp_carton_1D%2F2018%2F12%2F24%2F21%2Fno_resp_carton_20181224210201.csv=json=media
> [2018-12-26 07:56:33,214] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,213] {s3_to_gcs_operator.py:177} 
> INFO - All done, uploaded 1 files to Google Cloud Storage
> [2018-12-26 07:56:33,217] {models.py:1736} ERROR - Object of type 'set' is 
> not JSON serializable
> Traceback (most recent call last)
>   File "/usr/local/lib/airflow/airflow/models.py", line 1637, in _run_raw_tas
> self.xcom_push(key=XCOM_RETURN_KEY, value=result
>   File "/usr/local/lib/airflow/airflow/models.py", line 1983, in xcom_pus
> execution_date=execution_date or self.execution_date
>   File "/usr/local/lib/airflow/airflow/utils/db.py", line 74, in wrappe
> return func(*args, **kwargs
>   File "/usr/local/lib/airflow/airflow/models.py", line 4531, in se
> value = json.dumps(value).encode('UTF-8'
>   File "/usr/local/lib/python3.6/json/__init__.py", line 231, in dump
> return _default_encoder.encode(obj
>   File "/usr/local/lib/python3.6/json/encoder.py", line 199, in encod
> chunks = self.iterencode(o, _one_shot=True
>   File "/usr/local/lib/python3.6/json/encoder.py", line 257, in iterencod
> return _iterencode(o, 0
>   File "/usr/local/lib/python3.6/json/encoder.py", line 180, in defaul
> o.__class__.__name__
> TypeError: Object of type 'set' is not JSON serializabl
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 [2018-12-26 07:56:33,217] {models.py:1736} ERROR - 
> Object of type 'set' is not JSON serializable
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 Traceback (most recent call last):
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 1637, in _run_raw_task
> [2018-12-26 07:56:33,220] {models.py:1756} INFO - Marking task as UP_FOR_RETRY
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 self.xcom_push(key=XCOM_RETURN_KEY, value=result)
> [2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 1983, in xcom_push
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 execution_date=execution_date or 
> self.execution_date)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/utils/db.py", 
> line 74, in wrapper
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 return func(*args, **kwargs)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", 
> line 4531, in set
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 value = json.dumps(value).encode('UTF-8')
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/python3.6/json/__init__.py", 
> line 231, in dumps
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3 return _default_encoder.encode(obj)
> [2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
> gcs_copy_files_from_s3   File "/usr/local/lib/python3.6/json/encoder.py", 
> line 199, in encode
> [2018-12-26 

[jira] [Created] (AIRFLOW-3568) S3ToGoogleCloudStorageOperator failed to copy files from s3

2018-12-26 Thread Yohei Onishi (JIRA)
Yohei Onishi created AIRFLOW-3568:
-

 Summary: S3ToGoogleCloudStorageOperator failed to copy files from 
s3
 Key: AIRFLOW-3568
 URL: https://issues.apache.org/jira/browse/AIRFLOW-3568
 Project: Apache Airflow
  Issue Type: Bug
  Components: contrib
Affects Versions: 1.10.0
Reporter: Yohei Onishi


I tried to copy files from S3 to GCS using 
S3ToGoogleCloudStorageOperator. The file was successfully uploaded to GCS, but 
the task failed with the following error.
{code:java}
[2018-12-26 07:56:33,062] {base_task_runner.py:107} INFO - Job 39: Subtask 
gcs_copy_files_from_s3 [2018-12-26 07:56:33,062] {discovery.py:871} INFO - URL 
being requested: POST 
https://www.googleapis.com/upload/storage/v1/b/stg-rfid-etl-tmp/o?name=rfid_wh%2Fuq%2Fjp%2Fno_resp_carton_1D%2F2018%2F12%2F24%2F21%2Fno_resp_carton_20181224210201.csv=json=media
[2018-12-26 07:56:33,214] {base_task_runner.py:107} INFO - Job 39: Subtask 
gcs_copy_files_from_s3 [2018-12-26 07:56:33,213] {s3_to_gcs_operator.py:177} 
INFO - All done, uploaded 1 files to Google Cloud Storage
[2018-12-26 07:56:33,217] {models.py:1736} ERROR - Object of type 'set' is not 
JSON serializable
Traceback (most recent call last):
  File "/usr/local/lib/airflow/airflow/models.py", line 1637, in _run_raw_task
self.xcom_push(key=XCOM_RETURN_KEY, value=result)
  File "/usr/local/lib/airflow/airflow/models.py", line 1983, in xcom_push
execution_date=execution_date or self.execution_date)
  File "/usr/local/lib/airflow/airflow/utils/db.py", line 74, in wrapper
return func(*args, **kwargs)
  File "/usr/local/lib/airflow/airflow/models.py", line 4531, in set
value = json.dumps(value).encode('UTF-8')
  File "/usr/local/lib/python3.6/json/__init__.py", line 231, in dumps
return _default_encoder.encode(obj)
  File "/usr/local/lib/python3.6/json/encoder.py", line 199, in encode
chunks = self.iterencode(o, _one_shot=True)
  File "/usr/local/lib/python3.6/json/encoder.py", line 257, in iterencode
return _iterencode(o, 0)
  File "/usr/local/lib/python3.6/json/encoder.py", line 180, in default
o.__class__.__name__
TypeError: Object of type 'set' is not JSON serializable
[2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
gcs_copy_files_from_s3 [2018-12-26 07:56:33,217] {models.py:1736} ERROR - 
Object of type 'set' is not JSON serializable
[2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
gcs_copy_files_from_s3 Traceback (most recent call last):
[2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", line 
1637, in _run_raw_task
[2018-12-26 07:56:33,220] {models.py:1756} INFO - Marking task as UP_FOR_RETRY
[2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
gcs_copy_files_from_s3 self.xcom_push(key=XCOM_RETURN_KEY, value=result)
[2018-12-26 07:56:33,220] {base_task_runner.py:107} INFO - Job 39: Subtask 
gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", line 
1983, in xcom_push
[2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
gcs_copy_files_from_s3 execution_date=execution_date or self.execution_date)
[2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/utils/db.py", 
line 74, in wrapper
[2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
gcs_copy_files_from_s3 return func(*args, **kwargs)
[2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
gcs_copy_files_from_s3   File "/usr/local/lib/airflow/airflow/models.py", line 
4531, in set
[2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
gcs_copy_files_from_s3 value = json.dumps(value).encode('UTF-8')
[2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
gcs_copy_files_from_s3   File "/usr/local/lib/python3.6/json/__init__.py", line 
231, in dumps
[2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
gcs_copy_files_from_s3 return _default_encoder.encode(obj)
[2018-12-26 07:56:33,221] {base_task_runner.py:107} INFO - Job 39: Subtask 
gcs_copy_files_from_s3   File "/usr/local/lib/python3.6/json/encoder.py", line 
199, in encode
[2018-12-26 07:56:33,222] {base_task_runner.py:107} INFO - Job 39: Subtask 
gcs_copy_files_from_s3 chunks = self.iterencode(o, _one_shot=True)
[2018-12-26 07:56:33,222] {base_task_runner.py:107} INFO - Job 39: Subtask 
gcs_copy_files_from_s3   File "/usr/local/lib/python3.6/json/encoder.py", line 
257, in iterencode
[2018-12-26 07:56:33,222] {base_task_runner.py:107} INFO - Job 39: Subtask 
gcs_copy_files_from_s3 return _iterencode(o, 0)
[2018-12-26 07:56:33,222] {base_task_runner.py:107} INFO - Job 39: Subtask 
gcs_copy_files_from_s3   File