Hi Beam users,

We recently added a feature that makes the BigQuery source for the Python SDK
more performant by using the fastavro library [1] instead of the avro library
[2]. You can enable it with the following pipeline option in Beam 2.9.0 or
later.

--experiment=use_fastavro
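
If you build your pipeline options in code rather than on the command line,
something like the following rough sketch should work. The helper name
add_fastavro_experiment, the run function, and the table name
"my-project:my_dataset.my_table" are hypothetical and only for illustration;
the sketch also assumes the remaining Dataflow options (project, temp
location, and so on) are already present in argv.

```python
FASTAVRO_FLAG = "--experiment=use_fastavro"


def add_fastavro_experiment(argv):
    """Return a copy of argv with the use_fastavro experiment added once."""
    args = list(argv)
    if FASTAVRO_FLAG not in args:
        args.append(FASTAVRO_FLAG)
    return args


def run(argv):
    # Beam is imported here so the helper above works without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(add_fastavro_experiment(argv))
    with beam.Pipeline(options=options) as p:
        # Hypothetical table name; replace it with your own table.
        _ = p | "ReadFromBigQuery" >> beam.io.Read(
            beam.io.BigQuerySource(table="my-project:my_dataset.my_table"))
```

Calling run(sys.argv[1:]) with --runner=DataflowRunner in the arguments would
then submit the job with the experiment enabled.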

Please try this out and file a JIRA against me if you notice any issues.
Please note that this change only affects the Dataflow runner, which uses Avro
export when reading from BigQuery. The Direct runner uses direct table reads
and will not be affected by this option. The current Python BigQuery source is
a native source that is only supported by these two runners.

Thanks,
Cham

[1] https://pypi.org/project/fastavro/
[2] https://pypi.org/project/avro/
