Hi Beam users,

We recently added a feature that makes the BigQuery source for the Python SDK more performant by using the fastavro library [1] instead of the avro library [2]. You can enable it with the following pipeline option in Beam 2.9.0 or later:

--experiment=use_fastavro

Please try this out and file a JIRA against me if you notice any issues.

Please note that this change only affects the Dataflow runner, which uses Avro export when reading from BigQuery. The Direct runner uses direct table reads and is not affected by this option. The current Python BigQuery source is a native source that is supported only by these two runners.

Thanks,
Cham

[1] https://pypi.org/project/fastavro/
[2] https://pypi.org/project/avro/
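
P.S. If you build your pipeline options in code rather than on the command line, here is a minimal sketch of one way to add the flag. The names FASTAVRO_FLAG, with_fastavro and build_options are hypothetical helpers made up for this example, not part of Beam. The sketch assumes that PipelineOptions is given the same flag strings you would otherwise type on the command line.

```python
# Hypothetical helper names; only the flag string comes from the announcement.
FASTAVRO_FLAG = "--experiment=use_fastavro"


def with_fastavro(argv):
    """Return a copy of argv with the fastavro experiment flag added once."""
    args = list(argv)
    if FASTAVRO_FLAG not in args:
        args.append(FASTAVRO_FLAG)
    return args


def build_options(argv):
    """Build Beam PipelineOptions from argv with the experiment enabled.

    Needs apache-beam 2.9.0 or later. The import is done lazily so that
    with_fastavro() can be used and tested without Beam installed.
    """
    from apache_beam.options.pipeline_options import PipelineOptions

    return PipelineOptions(with_fastavro(argv))
```

You would then pass the result to the pipeline, for example beam.Pipeline(options=build_options(sys.argv[1:])). As noted above, the flag should only make a difference when the pipeline runs on the Dataflow runner.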
