[ https://issues.apache.org/jira/browse/BEAM-10917?focusedWorklogId=634890&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-634890 ]
ASF GitHub Bot logged work on BEAM-10917:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 06/Aug/21 00:25
Start Date: 06/Aug/21 00:25
Worklog Time Spent: 10m
Work Description: satybald commented on pull request #15185:
URL: https://github.com/apache/beam/pull/15185#issuecomment-893912744
The fast Avro pipeline finished, and it is indeed much faster than the Python
Avro version. It has the same ~5 GiB/s throughput as a regular batch Extract
job, so I believe we are worker-bound here (though that is an assumption that
would be nice to back up with data).
**Elapsed time**
Batch Extract Job - 58 min
BQ Storage with Fast Avro - 1 hour 27 min
In terms of elapsed time, however, it was about 30 min slower. I believe this
is because the job hit 3 gRPC errors, so the master had to fail the work item
and retry it elsewhere. Each such failure added ~10 min to the total execution
time.
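
As a rough sanity check of the figures above, the arithmetic can be sketched as follows. The variable names are hypothetical, and the ~10 min cost per retry is the estimate from the comment, not a measured value.

```python
# Elapsed times reported in the comment, in minutes.
batch_extract_min = 58
storage_fast_avro_min = 1 * 60 + 27  # 1 hour 27 min

# Assumptions taken from the comment: 3 gRPC errors, ~10 min lost per retry.
grpc_errors = 3
approx_cost_per_retry_min = 10

# Observed gap between the two jobs versus the overhead the retries would explain.
slowdown_min = storage_fast_avro_min - batch_extract_min
estimated_retry_overhead_min = grpc_errors * approx_cost_per_retry_min

print(f"observed slowdown: {slowdown_min} min")
print(f"estimated retry overhead: {estimated_retry_overhead_min} min")
```

If the per-retry estimate holds, the retries alone would account for roughly the whole gap, which is consistent with the two jobs having similar throughput.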

Issue Time Tracking
-------------------
Worklog Id: (was: 634890)
Time Spent: 14h 40m (was: 14.5h)
> Implement a BigQuery bounded source using the BigQuery storage API
> ------------------------------------------------------------------
>
> Key: BEAM-10917
> URL: https://issues.apache.org/jira/browse/BEAM-10917
> Project: Beam
> Issue Type: New Feature
> Components: io-py-gcp
> Reporter: Kenneth Jung
> Assignee: Kanthi Subramanian
> Priority: P3
> Time Spent: 14h 40m
> Remaining Estimate: 0h
>
> The Java SDK contains a bounded source implementation which uses the BigQuery
> storage API to read from BigQuery. We should implement the same for Python.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)