Egbert created BEAM-7107:
----------------------------
Summary: PubsubIO may exceed maximum payload size
Key: BEAM-7107
URL: https://issues.apache.org/jira/browse/BEAM-7107
Project: Beam
Issue Type: Bug
Components: io-java-gcp
Affects Versions: 2.11.0
Reporter: Egbert
In a batch job on Dataflow that reads payloads and metadata from a BigQuery
table and publishes them to Pub/Sub via PubsubIO, I sometimes see errors:
{noformat}
com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad
Request
"message" : "Request payload size exceeds the limit: 10485760 bytes.",
{noformat}
The PubsubIO Javadoc says it will use the global limit of 10 MiB by default,
but it seems that doesn't hold in all circumstances. I'm handling relatively
large records here, up to 600 KiB per message.
Adding
{code:java}
.withMaxBatchBytesSize(5242880){code}
after
{code:java}
PubsubIO.writeMessages().to(topic){code}
fixes this issue.
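For reference, the workaround in context looks roughly like the sketch below. The topic name and the {{messages}} collection are placeholders, not from the original pipeline; only the {{withMaxBatchBytesSize()}} call is the actual workaround reported above:
{code:java}
// Assumed surrounding pipeline; 'messages' is a PCollection<PubsubMessage>.
messages.apply("WriteToPubsub",
    PubsubIO.writeMessages()
        .to("projects/my-project/topics/my-topic")
        // Cap each publish batch at 5 MiB, well under the 10 MiB request
        // limit, leaving headroom for attributes and encoding overhead.
        .withMaxBatchBytesSize(5242880));
{code}
A plausible explanation (speculation on my part) is that the default batch-size accounting counts only message payload bytes, so with large messages a batch near 10 MiB of payload can exceed the limit once attributes and request encoding are included.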
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)