[jira] [Commented] (AIRFLOW-5224) gcs_to_bq.GoogleCloudStorageToBigQueryOperator - Specify Encoding for BQ ingestion

2019-08-30 Thread Anand Kumar (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16919392#comment-16919392
 ] 

Anand Kumar commented on AIRFLOW-5224:
--

[~walisko] Thanks for the update, but I think the strategic solution would be 
to add an encoding parameter to this operator.

> gcs_to_bq.GoogleCloudStorageToBigQueryOperator - Specify Encoding for BQ 
> ingestion
> --
>
>                 Key: AIRFLOW-5224
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5224
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: DAG, gcp
>    Affects Versions: 1.10.0
>         Environment: airflow software platform
>            Reporter: Anand Kumar
>            Priority: Blocker
>
> Hi,
> The current business project we are enabling is built entirely on GCP 
> components, with Composer and Airflow being one of the key processes. We have 
> built various data pipelines using Airflow for multiple work-streams, where 
> data is ingested from a GCS bucket into BigQuery.
> Following recent updates on the Google BigQuery infrastructure side, 
> validation of UTF-8 characters appears to have been tightened, which has 
> caused multiple failures in our existing business processes.
> On further analysis we found that, when ingesting data into BigQuery from a 
> GCS bucket, the encoding now needs to be specified explicitly, but the 
> operator below currently provides no parameter for specifying the 
> encoding:
> _*gcs_to_bq.GoogleCloudStorageToBigQueryOperator*_
> Could someone please treat this as a priority and help us with a fix to 
> bring us back to BAU mode?
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (AIRFLOW-5224) gcs_to_bq.GoogleCloudStorageToBigQueryOperator - Specify Encoding for BQ ingestion

2019-09-19 Thread Anand Kumar (Jira)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933606#comment-16933606
 ] 

Anand Kumar commented on AIRFLOW-5224:
--

[~kaxilnaik] That's right: it doesn't handle this automatically, so an 
encoding option in this operator would help.

Which version will this fix be included in?
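
Since the encoding is not detected automatically, a caller has to decide which value to pass. A minimal sketch of one way to make that choice is below. The helper name `guess_load_encoding` is hypothetical, and falling back to ISO-8859-1 is an assumption: every byte sequence decodes as Latin-1, so the fallback cannot be verified, only assumed.

```python
def guess_load_encoding(data: bytes) -> str:
    """Return "UTF-8" if the bytes are valid UTF-8, otherwise "ISO-8859-1".

    Hypothetical helper for illustration only. The ISO-8859-1 fallback is an
    assumption about the source files, not something that can be checked.
    """
    try:
        # Strict decoding raises UnicodeDecodeError on any invalid sequence.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "ISO-8859-1"
    return "UTF-8"
```

In practice only a sample of each file would need to be checked, provided the sample does not end in the middle of a multi-byte character.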



[jira] [Created] (AIRFLOW-5224) gcs_to_bq.GoogleCloudStorageToBigQueryOperator - Specify Encoding for BQ ingestion

2019-08-15 Thread Anand Kumar (JIRA)
Anand Kumar created AIRFLOW-5224:


             Summary: gcs_to_bq.GoogleCloudStorageToBigQueryOperator - Specify Encoding for BQ ingestion
                 Key: AIRFLOW-5224
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5224
             Project: Apache Airflow
          Issue Type: Bug
          Components: DAG
    Affects Versions: 1.10.0
         Environment: airflow software platform
            Reporter: Anand Kumar


Hi,

The current business project we are enabling is built entirely on GCP 
components, with Composer and Airflow being one of the key processes. We have 
built various data pipelines using Airflow for multiple work-streams, where data 
is ingested from a GCS bucket into BigQuery.

Following recent updates on the Google BigQuery infrastructure side, validation 
of UTF-8 characters appears to have been tightened, which has caused multiple 
failures in our existing business processes.

On further analysis we found that, when ingesting data into BigQuery from a GCS 
bucket, the encoding now needs to be specified explicitly, but the operator 
below currently provides no parameter for specifying the encoding:

_*gcs_to_bq.GoogleCloudStorageToBigQueryOperator*_

Could you please treat this as a priority and help us with a fix to bring us 
back to BAU mode?
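
To illustrate the request, here is a minimal sketch of a BigQuery load-job configuration that carries an explicit `encoding` field, which is the setting the operator does not currently expose. The helper `build_load_configuration` and its parameters are hypothetical and are not part of Airflow. The dictionary keys follow the BigQuery REST API load configuration, and the two accepted encodings are the ones assumed for this sketch.

```python
from typing import Any, Dict, Sequence

# Encodings accepted by this sketch (an assumption; check the BigQuery docs).
SUPPORTED_ENCODINGS = ("UTF-8", "ISO-8859-1")


def build_load_configuration(
    source_uris: Sequence[str],
    project_id: str,
    dataset_id: str,
    table_id: str,
    encoding: str = "UTF-8",
    source_format: str = "CSV",
) -> Dict[str, Any]:
    """Build a load-job configuration dict with an explicit encoding.

    Hypothetical helper for illustration; it only assembles the dict and
    does not call BigQuery or Airflow.
    """
    if encoding not in SUPPORTED_ENCODINGS:
        raise ValueError(f"Unsupported encoding: {encoding!r}")
    return {
        "load": {
            "sourceUris": list(source_uris),
            "destinationTable": {
                "projectId": project_id,
                "datasetId": dataset_id,
                "tableId": table_id,
            },
            "sourceFormat": source_format,
            # The field this issue asks the operator to pass through.
            "encoding": encoding,
        }
    }
```

An `encoding` argument on the operator could be forwarded into a configuration of this shape, with UTF-8 as the default so that existing DAGs keep their current behaviour.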

 





[jira] [Updated] (AIRFLOW-5224) gcs_to_bq.GoogleCloudStorageToBigQueryOperator - Specify Encoding for BQ ingestion

2019-08-15 Thread Anand Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/AIRFLOW-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Kumar updated AIRFLOW-5224:
-
Description: 
Hi,

The current business project we are enabling is built entirely on GCP 
components, with Composer and Airflow being one of the key processes. We have 
built various data pipelines using Airflow for multiple work-streams, where data 
is ingested from a GCS bucket into BigQuery.

Following recent updates on the Google BigQuery infrastructure side, validation 
of UTF-8 characters appears to have been tightened, which has caused multiple 
failures in our existing business processes.

On further analysis we found that, when ingesting data into BigQuery from a GCS 
bucket, the encoding now needs to be specified explicitly, but the operator 
below currently provides no parameter for specifying the encoding:

_*gcs_to_bq.GoogleCloudStorageToBigQueryOperator*_

Could someone please treat this as a priority and help us with a fix to bring 
us back to BAU mode?

 

  was:
Hi,

The current business project we are enabling is built entirely on GCP 
components, with Composer and Airflow being one of the key processes. We have 
built various data pipelines using Airflow for multiple work-streams, where data 
is ingested from a GCS bucket into BigQuery.

Following recent updates on the Google BigQuery infrastructure side, validation 
of UTF-8 characters appears to have been tightened, which has caused multiple 
failures in our existing business processes.

On further analysis we found that, when ingesting data into BigQuery from a GCS 
bucket, the encoding now needs to be specified explicitly, but the operator 
below currently provides no parameter for specifying the encoding:

_*gcs_to_bq.GoogleCloudStorageToBigQueryOperator*_

Could you please treat this as a priority and help us with a fix to bring us 
back to BAU mode?

 

