Yurii Atamanchuk created BEAM-7883:
--------------------------------------

             Summary: PubsubIO (Java) write batch size can exceed request payload limit
                 Key: BEAM-7883
                 URL: https://issues.apache.org/jira/browse/BEAM-7883
             Project: Beam
          Issue Type: Bug
          Components: io-java-gcp
    Affects Versions: 2.13.0
            Reporter: Yurii Atamanchuk


In some (probably rare) cases, a batch written by PubsubIO (in batch mode) can 
exceed the request payload limit of 10 MB. PubsubIO ensures that the batch size 
stays below the limit (10 MB by default), but it then uses PubsubJsonClient, 
which converts message payloads to URL-safe Base64. That encoding inflates the 
message size (in my case, for JSON strings, by up to 25-30%). As a result, we 
get a 400 response (with the message 'Request payload size exceeds the limit: 
10485760 bytes'), even though the original batch was within the limit.
The obvious workaround is to reduce the batch size 
(`PubsubIO.writeMessages().to(...).withMaxBatchBytesSize(... e.g. 5 MB ...)`), 
but it is a bit annoying.
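
To make the size arithmetic concrete, here is a minimal sketch of how Base64 
inflation relates to the 10485760-byte limit. It is not Beam code: the class 
and method names are hypothetical, and it assumes padded Base64, which emits 
4 output characters for every 3 input bytes. It also ignores the JSON envelope 
(field names, quotes, attributes) that the request adds on top, so a safe batch 
size in practice is probably somewhat lower than the bound it prints.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration only; not part of Beam or PubsubIO.
class Base64Overhead {
    // Limit quoted in the 400 response above.
    static final long REQUEST_LIMIT_BYTES = 10_485_760L;

    // Length of the padded Base64 encoding of rawBytes bytes: 4 * ceil(rawBytes / 3).
    static long encodedLength(long rawBytes) {
        return 4 * ((rawBytes + 2) / 3);
    }

    // Largest raw payload whose padded Base64 encoding still fits in limitBytes.
    static long maxRawBytes(long limitBytes) {
        return (limitBytes / 4) * 3;
    }

    public static void main(String[] args) {
        // Cross-check the formula against the JDK's URL-safe encoder.
        byte[] payload = "{\"key\":\"value\"}".getBytes(StandardCharsets.UTF_8);
        String encoded = Base64.getUrlEncoder().encodeToString(payload);
        System.out.println(payload.length + " raw bytes -> "
            + encoded.length() + " encoded chars (formula: "
            + encodedLength(payload.length) + ")");

        // A batch that exactly fills the limit before encoding no longer fits after it.
        System.out.println("encoded size of a full batch: "
            + encodedLength(REQUEST_LIMIT_BYTES));

        // Upper bound on raw payload bytes, before any JSON envelope overhead.
        System.out.println("max raw payload bytes: "
            + maxRawBytes(REQUEST_LIMIT_BYTES));
    }
}
```

Under these assumptions, the raw payload bytes in a batch would need to stay 
at or below about 7.5 MB for the encoded payloads alone to fit, which is why a 
value such as 5 MB works as a conservative workaround.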



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)