Yurii Atamanchuk created BEAM-7883:
--------------------------------------
Summary: PubsubIO (Java) write batch size can exceed request
payload limit
Key: BEAM-7883
URL: https://issues.apache.org/jira/browse/BEAM-7883
Project: Beam
Issue Type: Bug
Components: io-java-gcp
Affects Versions: 2.13.0
Reporter: Yurii Atamanchuk
In some (probably rare) cases, a PubsubIO write in batch mode can produce a
request that exceeds the request payload limit of 10 MB. PubsubIO ensures that
the batch size stays below the limit (10 MB by default). However, the batch is
then sent through PubsubJsonClient, which converts message payloads to URL-safe
Base64, and that encoding can inflate the message size (in my case, for JSON
strings, by up to 25-30%). As a result, we get a 400 response (with the message
'Request payload size exceeds the limit: 10485760 bytes'), even though the
original batch was within the size limit.
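To illustrate the inflation, here is a minimal stdlib-only sketch (it does not use Beam or PubsubJsonClient; the class and method names are made up for illustration). Padded Base64 emits 4 characters for every 3 input bytes, so the payload bytes alone grow by about a third. The overhead actually observed for a whole request may differ, since the request also contains JSON structure and attributes that are not encoded.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class Base64Inflation {
    // Length of the padded Base64 encoding of rawBytes input bytes: 4 * ceil(rawBytes / 3).
    static long encodedLength(long rawBytes) {
        return 4 * ((rawBytes + 2) / 3);
    }

    public static void main(String[] args) {
        // A small example payload; any byte array behaves the same way.
        byte[] payload = "{\"key\":\"value\"}".getBytes(StandardCharsets.UTF_8);
        String encoded = Base64.getUrlEncoder().encodeToString(payload);
        System.out.println(payload.length + " raw bytes -> "
                + encoded.length() + " encoded characters");

        // A batch that exactly fills the 10485760-byte limit before encoding.
        long limit = 10L * 1024 * 1024;
        System.out.println(limit + " raw bytes -> "
                + encodedLength(limit) + " encoded bytes");
    }
}
```

A batch that is just under the limit before encoding therefore ends up well over it once the payloads are encoded.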
The obvious workaround is to reduce the batch size
(`PubsubIO.writeMessages().to(...).withMaxBatchBytesSize(... e.g. 5 MB ...)`),
but it is a bit annoying.
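As a rough way to pick a value for the workaround, the sketch below computes the largest raw batch size whose Base64 encoding still fits under the limit. This is only an estimate under stated assumptions: the helper name `maxRawBytes` and the 1 MiB reserve are hypothetical, the reserve is a guess meant to cover the JSON envelope, attributes, and per-message padding, and none of this reflects how PubsubIO itself sizes batches.

```java
final class SafeBatchSize {
    // Limit quoted in the 400 response: 10485760 bytes.
    static final int REQUEST_LIMIT_BYTES = 10 * 1024 * 1024;

    // Largest raw byte count whose Base64 encoding (4 output bytes per
    // 3 input bytes) fits into limitBytes after setting aside
    // reservedOverheadBytes for everything that is not payload.
    static int maxRawBytes(int limitBytes, int reservedOverheadBytes) {
        int budget = limitBytes - reservedOverheadBytes;
        if (budget < 0) {
            throw new IllegalArgumentException("reserve exceeds limit");
        }
        return (budget / 4) * 3;
    }

    public static void main(String[] args) {
        // Hypothetical reserve; tune it for the actual message shape.
        int reserved = 1024 * 1024;
        int maxBatchBytes = maxRawBytes(REQUEST_LIMIT_BYTES, reserved);
        System.out.println("Suggested max batch size: " + maxBatchBytes + " bytes");
        // The result would then be passed to the option named in this report:
        // PubsubIO.writeMessages().to(...).withMaxBatchBytesSize(maxBatchBytes)
    }
}
```

With no reserve the bound is 7864320 bytes (7.5 MiB), which suggests the 5 MB value mentioned above is on the conservative side for payload-dominated messages.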