[
https://issues.apache.org/jira/browse/BEAM-8841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Anonymous updated BEAM-8841:
----------------------------
Status: Triage Needed (was: Resolved)
> Add ability to perform BigQuery file loads using avro
> -----------------------------------------------------
>
> Key: BEAM-8841
> URL: https://issues.apache.org/jira/browse/BEAM-8841
> Project: Beam
> Issue Type: Improvement
> Components: io-py-gcp
> Reporter: Chun Yang
> Assignee: Chun Yang
> Priority: P3
> Fix For: 2.21.0
>
> Time Spent: 9h 40m
> Remaining Estimate: 0h
>
> Currently, the Python SDK uses JSON format for file loads into BigQuery.
> JSON has some disadvantages, including the size of the serialized data and
> the inability to represent NaN and infinite float values.
> BigQuery supports loading files in Avro format, which can overcome these
> disadvantages. The Java SDK already supports loading files in Avro format
> (BEAM-2879), so it makes sense to support it in the Python SDK as well.
> The change will be somewhere around
> [{{BigQueryBatchFileLoads}}|https://github.com/apache/beam/blob/3e7865ee6c6a56e51199515ec5b4b16de1ddd166/sdks/python/apache_beam/io/gcp/bigquery_file_loads.py#L554].
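
The NaN/infinity limitation mentioned in the description can be reproduced with Python's standard `json` module alone. The following is a minimal sketch and not code from the Beam SDK; the sample row and the `strict_dumps` helper are hypothetical names chosen for illustration.

```python
import json

# Hypothetical row containing float values that strict JSON cannot express.
row = {"name": "sensor-1", "reading": float("nan"), "limit": float("inf")}

# By default, Python's encoder emits the non-standard tokens NaN and Infinity,
# which are not part of the JSON specification.
lenient = json.dumps(row)


def strict_dumps(obj):
    """Serialize obj as spec-compliant JSON, or return None if that is impossible."""
    try:
        # allow_nan=False makes the encoder raise ValueError on NaN/Infinity.
        return json.dumps(obj, allow_nan=False)
    except ValueError:
        return None


strict = strict_dumps(row)
print(lenient)
print(strict)
```

The lenient output is only usable by parsers that accept the non-standard tokens, while the strict encoder cannot serialize the row at all. A format with native IEEE 754 floating-point types, such as Avro, does not face this choice.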
--
This message was sent by Atlassian Jira
(v8.20.10#820010)