andre-sampaio commented on PR #34761: URL: https://github.com/apache/beam/pull/34761#issuecomment-2835623278

Thanks for your contribution! The error you are seeing indicates you are sending more mutations than your Bigtable cluster can handle: requests pile up on the server side until they can be processed, and once too many requests are queued you start seeing this error message. Generally speaking, increasing your batch sizes can make the problem worse, especially in bursty workloads. A good signal for whether this is what is happening to you is to check your cluster's CPU usage during these bursts and see whether it sits at ~100%.

I don't see anything wrong with this PR, but you may want to instead add knobs for [MAX_OUTSTANDING_ELEMENTS and MAX_OUTSTANDING_BYTES](https://github.com/googleapis/python-bigtable/blob/main/google/cloud/bigtable/batcher.py#L32) and try *reducing* those. Which one depends on your use case: if you are writing many small rows, reduce outstanding elements; if you are writing large rows, reduce outstanding bytes. Alternatively, reduce the number of workers on your Beam job.

Let me know if this explanation helps.
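For reference, here is a minimal sketch of what those two limits control when using the client library directly. It assumes the legacy `google.cloud.bigtable.batcher` module, where `MAX_OUTSTANDING_ELEMENTS` and `MAX_OUTSTANDING_BYTES` are module-level constants picked up when a `MutationsBatcher` is built (check the linked `batcher.py` in your installed version before relying on this); the project, instance, table, and numeric values are placeholders:

```python
# Hedged sketch, not a supported knob: assumes the legacy
# google.cloud.bigtable.batcher module reads these module-level constants
# when the MutationsBatcher is constructed. Verify against the batcher.py
# linked above for your installed client version.
from google.cloud import bigtable
from google.cloud.bigtable import batcher as bt_batcher

# Defaults are 100,000 outstanding mutations / 100 MB outstanding bytes.
# Lower whichever matches your workload: elements for many small rows,
# bytes for large rows.
bt_batcher.MAX_OUTSTANDING_ELEMENTS = 20_000           # example value
bt_batcher.MAX_OUTSTANDING_BYTES = 20 * 1024 * 1024    # example value (20 MB)

# "my-project", "my-instance", "my-table" are placeholders.
client = bigtable.Client(project="my-project")
table = client.instance("my-instance").table("my-table")

# With lower limits the batcher blocks the caller sooner instead of letting
# mutations queue up on the Bigtable server during bursts.
mutation_batcher = bt_batcher.MutationsBatcher(table)
row = table.direct_row(b"row-key-1")
row.set_cell("cf1", b"qualifier", b"value")
mutation_batcher.mutate(row)
mutation_batcher.flush()
```

In the Beam connector the batcher is created inside `WriteToBigTable` itself, so to apply this in a pipeline these limits would need to be surfaced as connector options, which is the kind of knob I'm suggesting above.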
Thanks for your contribution! The error you are seeing indicates you are sending more mutations than your bt cluster can handle, which causes requests to pile up on the server side until they can be processed and once too many requests get queued you start seeing this error message. Generally speaking increasing your batch sizes can make the problem worse, specially in bursty workloads. A good signal for whether or not this is what is happening for you is checking the cpu usage for your cluster during these bursts and seeing if they are at ~100%. I don't see anything wrong with this PR, but you may want to instead add knobs for [MAX_OUTSTANDING_ELEMENTS and MAX_OUTSTANDING_BYTES](https://github.com/googleapis/python-bigtable/blob/main/google/cloud/bigtable/batcher.py#L32) and try *reducing* those (which one depends on your use case, if you are writing many small rows reduce elements, if you are writing large rows reduce outstanding bytes). Or alternatively reduce the number of workers on your beam job. Let me know if this explanation helps -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@beam.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org