[
https://issues.apache.org/jira/browse/BEAM-7726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004337#comment-17004337
]
Robert Burke edited comment on BEAM-7726 at 2/4/20 10:45 PM:
-------------------------------------------------------------
The data channel is correctly multiplexing bundles. There's no other way to
support multiple streams in the current protocol and gRPC without the runner
having multiple endpoints, or the process doing so (e.g. multiple SDK harnesses
per worker, which is how Python handles it).
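To make the multiplexing above concrete, here is a minimal sketch of one shared stream being routed to per-bundle readers. It is not the Go SDK's actual data channel code: Elem, Demux, and the bundle IDs are hypothetical names for illustration, and plain Go channels stand in for the gRPC stream.

```go
package main

import "fmt"

// Elem is a hypothetical stand-in for one message on the shared data stream.
type Elem struct {
	InstID string // identifies the bundle the payload belongs to
	Data   string
}

// Demux routes elements from the single shared stream to one channel per
// bundle. A send blocks when that bundle's buffer is full, which stalls
// every other bundle sharing the stream.
func Demux(stream <-chan Elem, readers map[string]chan string) {
	for e := range stream {
		if ch, ok := readers[e.InstID]; ok {
			ch <- e.Data
		}
	}
	// The stream has ended, so no reader will receive more data.
	for _, ch := range readers {
		close(ch)
	}
}

func main() {
	stream := make(chan Elem, 4)
	readers := map[string]chan string{
		"bundle-1": make(chan string, 4),
		"bundle-2": make(chan string, 4),
	}
	for _, e := range []Elem{{"bundle-1", "a"}, {"bundle-2", "x"}, {"bundle-1", "b"}} {
		stream <- e
	}
	close(stream)
	Demux(stream, readers)
	for v := range readers["bundle-1"] {
		fmt.Println("bundle-1:", v)
	}
	for v := range readers["bundle-2"] {
		fmt.Println("bundle-2:", v)
	}
}
```

Because every bundle shares the one loop in Demux, a reader that stops consuming (for example while waiting on a state backed iterable) can hold up delivery for all the others once its buffer fills.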
I think I have a resolution for state backed iterables blocking the
data channel, which will work for any runners that support datasource split
requests. If the data channel is eventually split down to the current value
and no more, we can close the reader, which will cause the channel to be
unblocked. Any buffered data will be drained. Care needs to be taken to avoid
deadlocks, data loss, or race conditions, but there should only be lock
contention when the Split thread is closing the reader.
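A rough sketch of the close-on-split idea, under stated assumptions: reader, deliver, onSplit, and the index arguments are hypothetical names, not the SDK's types, and a done channel plus sync.Once stands in for whatever locking the real code would use. The point is only that closing the reader releases a blocked delivery and that later data is dropped rather than queued.

```go
package main

import (
	"fmt"
	"sync"
)

// reader is a hypothetical per-bundle reader fed by the shared data channel.
type reader struct {
	buf  chan string   // bounded buffer of undelivered elements
	done chan struct{} // closed once the reader no longer wants data
	once sync.Once     // makes Close safe to call more than once
}

func newReader(n int) *reader {
	return &reader{buf: make(chan string, n), done: make(chan struct{})}
}

// deliver is called from the data channel's read loop. It reports whether
// the element was accepted. After Close, elements are dropped so the
// shared stream is never blocked by this reader.
func (r *reader) deliver(v string) bool {
	select {
	case <-r.done:
		return false // already closed: drain without buffering
	default:
	}
	select {
	case r.buf <- v:
		return true
	case <-r.done:
		return false // closed while waiting for buffer space
	}
}

// Close marks the reader as finished and releases any blocked deliver call.
func (r *reader) Close() { r.once.Do(func() { close(r.done) }) }

// onSplit is called from the split thread. If the split leaves nothing
// beyond the element currently being processed, the reader is closed.
func onSplit(r *reader, currentIdx, splitIdx int) {
	if splitIdx <= currentIdx+1 {
		r.Close()
	}
}

func main() {
	r := newReader(1)
	fmt.Println(r.deliver("a")) // buffer has room

	res := make(chan bool)
	go func() { res <- r.deliver("b") }() // buffer is full and nothing reads it

	onSplit(r, 0, 1) // split down to the current element only
	fmt.Println(<-res)
	fmt.Println(r.deliver("c")) // dropped: reader is closed
}
```

In this sketch the only synchronization on the close path is the sync.Once inside Close, which matches the expectation that contention arises only when the split thread closes the reader.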
Edit: I wasn't able to confirm that this actually worked better, and even
though there was no material locking overhead, the additional complexity in
that part of the code isn't worth the questionable benefits. Tabling for now.
> [Go SDK] State Backed Iterables
> -------------------------------
>
> Key: BEAM-7726
> URL: https://issues.apache.org/jira/browse/BEAM-7726
> Project: Beam
> Issue Type: Improvement
> Components: sdk-go
> Affects Versions: Not applicable
> Reporter: Robert Burke
> Assignee: Robert Burke
> Priority: Major
> Fix For: Not applicable
>
> Time Spent: 3h
> Remaining Estimate: 0h
>
> The Go SDK should support the State backed iterables protocol per the proto.
> [https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/beam_runner_api.proto#L644]
>
> Primary case is for iterables after CoGBKs.