[ 
https://issues.apache.org/jira/browse/BEAM-7495?focusedWorklogId=281140&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-281140
 ]

ASF GitHub Bot logged work on BEAM-7495:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 23/Jul/19 16:51
            Start Date: 23/Jul/19 16:51
    Worklog Time Spent: 10m 
      Work Description: aryann commented on pull request #9079: [BEAM-7495] Add 
progress reporting to the BigQuery source
URL: https://github.com/apache/beam/pull/9079#discussion_r306425962
 
 

 ##########
 File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryStorageStreamSource.java
 ##########
 @@ -322,5 +332,23 @@ public synchronized void close() {
           "Successfully split BigQuery Storage API stream. Split response: {}", splitResponse);
       return source.fromExisting(splitResponse.getRemainderStream());
     }
+
+    @Override
+    public synchronized Double getFractionConsumed() {
+      return fractionConsumed;
+    }
+
+    private static float getFractionConsumed(ReadRowsResponse response) {
+      // TODO(aryann): Once we rebuild the generated client code, we should change this to
+      // use getFractionConsumed().
 
 Review comment:
   My thinking is that this is okay, since our primary use case is not to be a
reference implementation, though I do think we should update the client
sometime soon.

   I'm not sure it makes sense to block 2.15 on the client being updated,
though. The comments I've added explain the purpose of the fields we are
setting for anyone trying to understand the internals of our source.
   
   That said, I'm willing to change my mind if there is a stronger opinion on 
this. :)
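
   For readers following along, the progress-reporting pattern in the diff
above can be sketched in isolation roughly as follows. This is only an
illustration: ProgressSketch and onResponse are hypothetical names, and the
clamping and "never go backwards" behavior are assumptions, not something the
pull request is confirmed to do.

```java
// Illustrative sketch; ProgressSketch and onResponse are hypothetical names,
// not part of BigQueryStorageStreamSource.
final class ProgressSketch {
  // Null until the first response arrives, matching the nullable Double
  // returned by getFractionConsumed() in the diff.
  private Double fractionConsumed;

  /** Records the progress value carried by one response. */
  synchronized void onResponse(float reportedFraction) {
    // Assumption: keep the value inside [0, 1].
    double clamped = Math.max(0.0, Math.min(1.0, reportedFraction));
    // Assumption: never report progress moving backwards.
    if (fractionConsumed == null || clamped > fractionConsumed) {
      fractionConsumed = clamped;
    }
  }

  /** Returns the latest recorded fraction, or null if nothing was read yet. */
  synchronized Double getFractionConsumed() {
    return fractionConsumed;
  }
}
```

   Both methods are synchronized so that a thread polling for progress sees a
consistent value while the reading thread updates it.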
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

            Worklog Id:     (was: 281140)
            Time Spent: 5h  (was: 4h 50m)
    Remaining Estimate: 499h  (was: 499h 10m)

> Add support for dynamic worker re-balancing when reading BigQuery data using 
> Cloud Dataflow
> -------------------------------------------------------------------------------------------
>
>                 Key: BEAM-7495
>                 URL: https://issues.apache.org/jira/browse/BEAM-7495
>             Project: Beam
>          Issue Type: New Feature
>          Components: io-java-gcp
>            Reporter: Aryan Naraghi
>            Assignee: Aryan Naraghi
>            Priority: Major
>   Original Estimate: 504h
>          Time Spent: 5h
>  Remaining Estimate: 499h
>
> Currently, the BigQuery connector for reading data using the BigQuery Storage 
> API does not implement any of the facilities on the source that Dataflow 
> needs in order to split streams.
>  
> On the server side, the BigQuery Storage API supports splitting streams at a 
> fraction. By adding support to the connector, we enable Dataflow to split 
> streams, which unlocks dynamic worker re-balancing.
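
To make "splitting a stream at a fraction" concrete, here is a minimal toy
model. It assumes a stream can be treated as a half-open range of row offsets,
which is a simplification and not how the BigQuery Storage API represents
streams. The class and method names are hypothetical; only the idea of a
primary part and a remainder part follows the description above.

```java
// Toy model only: a "stream" is a half-open range of row offsets [start, end).
// None of these names come from Beam or the BigQuery Storage API.
final class FractionSplitSketch {
  final long start;
  final long end;

  FractionSplitSketch(long start, long end) {
    if (start > end) {
      throw new IllegalArgumentException("start must be <= end");
    }
    this.start = start;
    this.end = end;
  }

  long size() {
    return end - start;
  }

  /**
   * Splits this range at the given fraction of its size.
   * Returns {primary, remainder}, or null when the split would leave
   * either side empty (the caller then simply keeps reading).
   */
  FractionSplitSketch[] splitAtFraction(double fraction) {
    if (fraction <= 0.0 || fraction >= 1.0) {
      return null;
    }
    long splitPoint = start + (long) Math.floor(size() * fraction);
    if (splitPoint <= start || splitPoint >= end) {
      return null;
    }
    return new FractionSplitSketch[] {
      new FractionSplitSketch(start, splitPoint), // stays with the current worker
      new FractionSplitSketch(splitPoint, end)    // can be handed to another worker
    };
  }
}
```

In this model a runner could hand the remainder range to an idle worker, which
is the essence of dynamic worker re-balancing.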



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
