andre-sampaio commented on PR #34761: URL: https://github.com/apache/beam/pull/34761#issuecomment-2835623278

Thanks for your contribution! The error you are seeing indicates you are sending more mutations than your Bigtable cluster can handle: requests pile up on the server side until they can be processed, and once too many requests are queued you start seeing this error message. Generally speaking, increasing your batch sizes can make the problem worse, especially in bursty workloads. A good signal for whether this is what is happening to you is to check your cluster's CPU usage during these bursts and see whether it sits at ~100%.

I don't see anything wrong with this PR, but you may want to instead add knobs for [MAX_OUTSTANDING_ELEMENTS and MAX_OUTSTANDING_BYTES](https://github.com/googleapis/python-bigtable/blob/main/google/cloud/bigtable/batcher.py#L32) and try *reducing* those. Which one depends on your use case: if you are writing many small rows, reduce outstanding elements; if you are writing large rows, reduce outstanding bytes. Alternatively, reduce the number of workers on your Beam job.

Let me know if this explanation helps.
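For reference, here is a minimal sketch of what those two limits control when using the client library directly. It assumes the legacy `google.cloud.bigtable.batcher` module, where `MAX_OUTSTANDING_ELEMENTS` and `MAX_OUTSTANDING_BYTES` are module-level constants picked up when a `MutationsBatcher` is built (check the linked `batcher.py` in your installed version before relying on this); the project, instance, table, and numeric values are placeholders:

```python
# Hedged sketch, not a supported knob: assumes the legacy
# google.cloud.bigtable.batcher module reads these module-level constants
# when the MutationsBatcher is constructed. Verify against the batcher.py
# linked above for your installed client version.
from google.cloud import bigtable
from google.cloud.bigtable import batcher as bt_batcher

# Defaults are 100,000 outstanding mutations / 100 MB outstanding bytes.
# Lower whichever matches your workload: elements for many small rows,
# bytes for large rows.
bt_batcher.MAX_OUTSTANDING_ELEMENTS = 20_000           # example value
bt_batcher.MAX_OUTSTANDING_BYTES = 20 * 1024 * 1024    # example value (20 MB)

# "my-project", "my-instance", "my-table" are placeholders.
client = bigtable.Client(project="my-project")
table = client.instance("my-instance").table("my-table")

# With lower limits the batcher blocks the caller sooner instead of letting
# mutations queue up on the Bigtable server during bursts.
mutation_batcher = bt_batcher.MutationsBatcher(table)
row = table.direct_row(b"row-key-1")
row.set_cell("cf1", b"qualifier", b"value")
mutation_batcher.mutate(row)
mutation_batcher.flush()
```

In the Beam connector the batcher is created inside `WriteToBigTable` itself, so to apply this in a pipeline these limits would need to be surfaced as connector options, which is the kind of knob I'm suggesting above.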
Thanks for your contribution! The error you are seeing indicates you are sending more mutations than your bt cluster can handle, which causes requests to pile up on the server side until they can be processed and once too many requests get queued you start seeing this error message. Generally speaking increasing your batch sizes can make the problem worse, specially in bursty workloads. A good signal for whether or not this is what is happening for you is checking the cpu usage for your cluster during these bursts and seeing if they are at ~100%. I don't see anything wrong with this PR, but you may want to instead add knobs for [MAX_OUTSTANDING_ELEMENTS and MAX_OUTSTANDING_BYTES](https://github.com/googleapis/python-bigtable/blob/main/google/cloud/bigtable/batcher.py#L32) and try *reducing* those (which one depends on your use case, if you are writing many small rows reduce elements, if you are writing large rows reduce outstanding bytes). Or alternatively reduce the number of workers on your beam job. Let me know if this explanation helps -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@beam.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org