[ 
https://issues.apache.org/jira/browse/BEAM-8841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17045744#comment-17045744
 ] 

Chamikara Madhusanka Jayalath commented on BEAM-8841:
-----------------------------------------------------

Chun, assigning the Jira to you since you already have a PR out :)

> Add ability to perform BigQuery file loads using avro
> -----------------------------------------------------
>
>                 Key: BEAM-8841
>                 URL: https://issues.apache.org/jira/browse/BEAM-8841
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-py-gcp
>            Reporter: Chun Yang
>            Assignee: Chun Yang
>            Priority: Minor
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently, JSON format is used for file loads into BigQuery in the Python 
> SDK. JSON has some disadvantages including size of serialized data and 
> inability to represent NaN and infinity float values.
> BigQuery supports loading files in avro format, which can overcome these 
> disadvantages. The Java SDK already supports loading files using avro format 
> (BEAM-2879) so it makes sense to support it in the Python SDK as well.
> The change will be somewhere around 
> [{{BigQueryBatchFileLoads}}|https://github.com/apache/beam/blob/3e7865ee6c6a56e51199515ec5b4b16de1ddd166/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py#L554].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to