[ https://issues.apache.org/jira/browse/AIRFLOW-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933598#comment-16933598 ]
Eric Walisko commented on AIRFLOW-5224:
---------------------------------------

[~kaxilnaik] I wish it did; however, we saw this error, for example:
{code:java}
{u'reason': u'invalid', u'message': u'Error while reading data, error message: \\xa3\\x31\\x30.99 is not a valid UTF-8 string, in row starting at position: 13662888 in column: 10', u'location': u'[redacted]'}
{code}

> gcs_to_bq.GoogleCloudStorageToBigQueryOperator - Specify Encoding for BQ
> ingestion
> ----------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-5224
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5224
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: DAG, gcp
>    Affects Versions: 1.10.0
>        Environment: airflow software platform
>            Reporter: Anand Kumar
>           Priority: Blocker
>
> Hi,
> The current business project we are enabling is built entirely on GCP
> components, using Cloud Composer with Airflow as one of the key processes. We
> have built various data pipelines using Airflow for multiple work streams in
> which data is ingested from a GCS bucket into BigQuery.
> Following recent updates on the Google BigQuery infrastructure side,
> validation of UTF-8 characters appears to have been tightened, which has
> caused multiple failures in our existing business processes.
> On further analysis we found that, going forward, the encoding needs to be
> explicitly specified when ingesting data into BigQuery from a GCS bucket, but
> the operator below currently does not expose a parameter for specifying an
> explicit encoding:
> _*gcs_to_bq.GoogleCloudStorageToBigQueryOperator*_
> Could someone please treat this as a priority and help us with a fix to
> bring us back to BAU mode?

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
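
For context, a minimal sketch of where an explicit encoding fits in a BigQuery load-job configuration. The `build_load_config` helper and the bucket/table names are hypothetical, assumed for illustration; the `encoding` field itself is the standard `configuration.load.encoding` setting ("UTF-8" or "ISO-8859-1") in the BigQuery Jobs API, which is what an operator-level parameter would ultimately need to populate:

```python
# Illustrative only: build_load_config is NOT an Airflow or BigQuery API;
# it just shows the shape of the load-job configuration dict.

def build_load_config(source_uris, destination_table, encoding="UTF-8"):
    """Return a BigQuery load-job configuration with an explicit encoding."""
    return {
        "load": {
            "sourceUris": source_uris,
            "destinationTable": destination_table,
            "sourceFormat": "CSV",
            # Explicit character encoding of the source data. The error in
            # the comment above involves \xa3 ('£' in Latin-1), which hints
            # the files were ISO-8859-1 rather than UTF-8.
            "encoding": encoding,
        }
    }

config = build_load_config(
    ["gs://example-bucket/data/*.csv"],                      # assumed path
    {"projectId": "p", "datasetId": "d", "tableId": "t"},    # assumed table
    encoding="ISO-8859-1",
)
print(config["load"]["encoding"])  # → ISO-8859-1
```

Depending on the Airflow version, the operator's `src_fmt_configs` argument may already allow passing `{"encoding": "ISO-8859-1"}` through to the load job as a workaround, but a first-class `encoding` parameter on the operator would make this explicit.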