[ 
https://issues.apache.org/jira/browse/BEAM-11326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312674#comment-17312674
 ] 

Kenneth Knowles commented on BEAM-11326:
----------------------------------------

Based on the PR being merged, I'm assuming this is fixed?

> Enforce deadlines during splitAtFraction in BigQueryStorageStreamSource
> -----------------------------------------------------------------------
>
>                 Key: BEAM-11326
>                 URL: https://issues.apache.org/jira/browse/BEAM-11326
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-gcp
>    Affects Versions: 2.25.0
>            Reporter: Kenneth Jung
>            Assignee: Kenneth Jung
>            Priority: P2
>             Fix For: 2.29.0
>
>          Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> In the 
> [BigQueryStorageStreamSource](https://github.com/apache/beam/blob/3bb232fb098700de408f574585dfe74bbaff7230/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryStorageStreamSource.java#L279),
>  we perform two RPCs during splitAtFraction: one to split the current stream 
> into primary and residual child streams, and a second to validate the 
> reader's current offset within the primary stream to ensure that the reader 
> has not advanced beyond the split point during the split process. For 
> sufficiently large streams -- particularly when combined with selective 
> predicate filters -- this process can take longer than the 2-minute limit 
> after which the Dataflow runtime considers the worker lost, which can 
> ultimately cause pipeline execution failures.
> The short-term solution is to apply a single, consistent deadline to both 
> RPCs, so that the split operation fails if it takes too long. This does not 
> address the potential for sub-optimal parallelism and dynamic work 
> rebalancing, but it should at least prevent pipeline execution failures.
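
For illustration only, the shared-deadline idea described above could be sketched roughly as follows. This is not the code from the merged PR: the class and method names (SplitDeadlineSketch, trySplit, callWithDeadline) are hypothetical, and the two RPCs are stood in for by plain Callables. It uses only the JDK, and a failed or timed-out call is treated as "the split did not happen".

```java
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: one time budget shared by two sequential calls.
final class SplitDeadlineSketch {

  /**
   * Runs splitCall and then validateCall under a single deadline.
   * Returns true only if both calls finish in time and both return true.
   */
  static boolean trySplit(
      Callable<Boolean> splitCall, Callable<Boolean> validateCall, Duration budget) {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    // The deadline is computed once, so the second call only gets what is left.
    long deadlineNanos = System.nanoTime() + budget.toNanos();
    try {
      if (!callWithDeadline(executor, splitCall, deadlineNanos)) {
        return false;
      }
      return callWithDeadline(executor, validateCall, deadlineNanos);
    } finally {
      // Interrupts any call that is still running after a timeout.
      executor.shutdownNow();
    }
  }

  /** Runs one call, waiting at most until the shared deadline. */
  static boolean callWithDeadline(
      ExecutorService executor, Callable<Boolean> call, long deadlineNanos) {
    long remainingNanos = deadlineNanos - System.nanoTime();
    if (remainingNanos <= 0) {
      return false; // Budget already spent by the earlier call.
    }
    Future<Boolean> future = executor.submit(call);
    try {
      return Boolean.TRUE.equals(future.get(remainingNanos, TimeUnit.NANOSECONDS));
    } catch (TimeoutException e) {
      future.cancel(true);
      return false; // Fail the split rather than keep the worker blocked.
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      future.cancel(true);
      return false;
    } catch (ExecutionException e) {
      return false; // The call itself failed; treat the split as refused.
    }
  }
}
```

The point of computing the deadline once, rather than giving each call its own timeout, is that the total time spent in the split stays bounded by the budget no matter how it is divided between the two calls.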



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
