[
https://issues.apache.org/jira/browse/BEAM-9505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103272#comment-17103272
]
Niel Markwick commented on BEAM-9505:
-------------------------------------
There are only two workarounds: either build your own version of 2.17 patched
with my fix from the [GitHub pull request|https://github.com/apache/beam/pull/11438],
or somehow filter out empty windows.
150,000 errors per hour is ~40/sec. I know that's a lot, but is it really
slowing down your pipeline?
If you are talking about end-to-end latency, you can reduce it by lowering
the grouping factor in SpannerIO.write() (at the expense of potentially more
Spanner CPU usage).
2.22 will default to a groupingFactor of 1 for streaming pipelines.
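As a rough sketch of lowering the grouping factor on a version that does not yet default to 1, something like the following should work. The instance and database IDs are placeholders, and the helper class and method names are made up for illustration; it assumes the Beam GCP IO module is on the classpath.

```java
import org.apache.beam.sdk.io.gcp.spanner.SpannerIO;

public class LowLatencySpannerWrite {
  // Hypothetical helper: builds a SpannerIO write transform with a small grouping factor.
  static SpannerIO.Write lowLatencyWrite() {
    return SpannerIO.write()
        .withInstanceId("my-instance")   // placeholder instance ID
        .withDatabaseId("my-database")   // placeholder database ID
        // A lower grouping factor means fewer mutations are collected for
        // sorting and batching before each write, trading Spanner CPU for latency.
        .withGroupingFactor(1);
  }
}
```

The transform would then be applied to a PCollection of Mutation objects, e.g. mutations.apply(LowLatencySpannerWrite.lowLatencyWrite()).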
> SpannerIO spurious error message with empty bundles
> ---------------------------------------------------
>
> Key: BEAM-9505
> URL: https://issues.apache.org/jira/browse/BEAM-9505
> Project: Beam
> Issue Type: Bug
> Components: io-java-gcp
> Affects Versions: 2.18.0, 2.19.0, 2.20.0
> Reporter: Niel Markwick
> Assignee: Niel Markwick
> Priority: Minor
> Fix For: 2.22.0
>
> Attachments: Worker error log count.png
>
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> -When using DataflowRunner in streaming mode. DoFn.StartBundle is called
> multiple times for the same bundle.-
> -This does not occur with DirectRunner.-
> -This breaks DoFn's which require per-bundle setup and teardown procedures.-
> When a bundle is empty (such as in streaming if a window is empty), SpannerIO
> will report a spurious error message:
> {{IllegalStateException: Sorter should be null here}}