nguymin4 opened a new pull request, #31791:
URL: https://github.com/apache/beam/pull/31791

   **Issue**
   In https://github.com/apache/beam/pull/27085/files, we replaced 
`_MutationsBatcher` with `MutationsBatcher` from 
`google.cloud.bigtable.batcher`.
   
   However, we didn't set `flush_count` and `max_row_bytes` on 
`MutationsBatcher`, so the constants `FLUSH_COUNT` and `MAX_ROW_BYTES` are 
currently unused.
   
   => This PR fixes that issue.
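
A minimal sketch of the idea, not the actual diff: the thresholds are passed explicitly instead of being left to the library defaults. The constant values below are placeholders for illustration, and `batcher_options` and `make_batcher` are hypothetical helpers that do not exist in Beam. The sketch assumes `MutationsBatcher` accepts `flush_count` and `max_row_bytes` as keyword arguments.

```python
# Placeholder values for illustration only; the real constants live in
# Beam's Bigtable IO module and may differ.
FLUSH_COUNT = 1000
MAX_ROW_BYTES = 5 * 1024 * 1024


def batcher_options() -> dict:
    """Keyword arguments that make the batcher honor the module constants."""
    return {"flush_count": FLUSH_COUNT, "max_row_bytes": MAX_ROW_BYTES}


def make_batcher(table):
    """Build a MutationsBatcher for `table` with explicit thresholds.

    Hypothetical helper, not Beam's code. The import is deferred so the
    options above can be inspected without the client library installed.
    """
    from google.cloud.bigtable.batcher import MutationsBatcher

    return MutationsBatcher(table, **batcher_options())
```

Without the explicit arguments, the batcher silently falls back to the library defaults, which is what left the two constants unused.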
   
   **Implication**
   In the `google.cloud.bigtable` library, `MutationsBatcher` also sets a default 
value for `flush_count`, but there is currently a discrepancy between the 
documented default (`1000`) and the actual default (`100`). [See 
more](https://github.com/googleapis/python-bigtable/blob/c573e9b695658f3e0c78eb3443d0cf26fea57da9/google/cloud/bigtable/batcher.py#L189)
   
   For small and medium-sized use cases this shouldn't cause any problems. But I have 
one case that writes 1 million rows, where `flush_count: 100` is too low and 
creates a lot of requests to the Bigtable server.
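
As a rough illustration of the difference, the sketch below estimates how many flushes a single batcher would perform for 1 million rows. It assumes a flush is triggered only by the row-count threshold, ignoring the byte-size and time-based triggers, so the real numbers may differ. `estimated_flushes` is a hypothetical helper written for this example.

```python
import math


def estimated_flushes(total_rows: int, flush_count: int) -> int:
    """Estimate flushes when only the row-count threshold triggers a flush."""
    if flush_count <= 0:
        raise ValueError("flush_count must be positive")
    return math.ceil(total_rows / flush_count)


TOTAL_ROWS = 1_000_000

for flush_count in (100, 1000):
    flushes = estimated_flushes(TOTAL_ROWS, flush_count)
    print(f"flush_count={flush_count}: ~{flushes} flushes")
```

Under that assumption, a threshold of `100` yields ten times as many flushes as `1000` for the same workload.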
   
   One potential improvement is to make both `flush_count` and `max_row_bytes` 
configurable. But this is open to further discussion and out of scope for 
this PR.
   

