[ 
https://issues.apache.org/jira/browse/BEAM-9505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103272#comment-17103272
 ] 

Niel Markwick commented on BEAM-9505:
-------------------------------------

The only workarounds are either to build your own version of 2.17 patched with 
my fix from the [GitHub pull request|https://github.com/apache/beam/pull/11438], 
or to somehow filter out empty windows.

150,000 errors per hour is ~40/sec. I know that's a lot, but is it really 
slowing down your pipeline?

If you are talking about end-to-end latency, you can reduce it by lowering 
the grouping factor in SpannerIO.write() (at the expense of potentially more 
Spanner CPU usage).

2.22 will default to a groupingFactor of 1 for streaming pipelines.
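
On releases before 2.22, the grouping factor can be set explicitly on the write transform. A minimal sketch, assuming an existing PCollection of Mutations; the class name, step name, and the instance and database IDs are placeholders, not values from this issue:

```java
import com.google.cloud.spanner.Mutation;
import org.apache.beam.sdk.io.gcp.spanner.SpannerIO;
import org.apache.beam.sdk.io.gcp.spanner.SpannerWriteResult;
import org.apache.beam.sdk.values.PCollection;

public class LowLatencySpannerWrite {

  // Applies SpannerIO.write() with a grouping factor of 1, so fewer
  // mutations are buffered for sorting and grouping before being written.
  // This trades lower end-to-end latency for potentially higher Spanner
  // CPU usage.
  static SpannerWriteResult write(PCollection<Mutation> mutations) {
    return mutations.apply(
        "WriteToSpanner", // hypothetical step name
        SpannerIO.write()
            .withInstanceId("my-instance") // placeholder instance ID
            .withDatabaseId("my-database") // placeholder database ID
            .withGroupingFactor(1));
  }
}
```

This only addresses latency; it is not expected to suppress the spurious error message itself.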

> SpannerIO spurious error message with empty bundles
> ---------------------------------------------------
>
>                 Key: BEAM-9505
>                 URL: https://issues.apache.org/jira/browse/BEAM-9505
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-gcp
>    Affects Versions: 2.18.0, 2.19.0, 2.20.0
>            Reporter: Niel Markwick
>            Assignee: Niel Markwick
>            Priority: Minor
>             Fix For: 2.22.0
>
>         Attachments: Worker error log count.png
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> -When using DataflowRunner in streaming mode. DoFn.StartBundle is called 
> multiple times for the same bundle.-
> -This does not occur with DirectRunner.- 
> -This breaks DoFn's which require per-bundle setup and teardown  procedures.-
> When a bundle is empty (such as in streaming if a window is empty), SpannerIO 
> will report a spurious error message:
> {{IllegalStateException: Sorter should be null here}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
