GitHub user cjmcgraw opened a pull request:

    https://github.com/apache/beam/pull/3619

    [BEAM-2660] Set PubsubIO batch size using builder

    BEAM-2660 asks for controlling batch size using the `PubsubIO.Write.Builder`
    
    This PR adds Two values configurable through the `PubsubIO.Write.Builder`:
    - `maxBatchSize` - controls the bulk batch request size
    - `maxBatchByteSize` - controls the bulk batch bytes request size
    
    In this PR I have also made a modification to the 
`PubsubIO.Write.PubsubBoundedWriter`. Now the writer will dynamically track the 
number of bytes allocated for all messages. If the number of bytes exceeds the 
threshold it will publish before adding more messages.
    
    If the message size exceeds the `maxBatchByteSize` then an exception will 
be thrown
    
    An example use case of the new parameter is:
    
    ```java
    PubsubIO.writeMessages()
        .withMaxBatchSize(100)
        .withMaxBatchByteSize(100000)
       .to("my-topic")
    ```

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cjmcgraw/beam update-pubsubIO

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/3619.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3619
    
----
commit 7eff3ea99da3fad85e10ac50c2b2bc6fec89a1fc
Author: Carl McGraw <[email protected]>
Date:   2017-07-22T22:30:40Z

    Added maxPublishBatchSize parameter to PubsubBoundedWriter class.

commit 95f23cd98c2008e0f5712ed68036bfb71caaa144
Author: Carl McGraw <[email protected]>
Date:   2017-07-23T00:30:18Z

    updated BoundedPubsubWriter to dynamically flush if queued messages exceed 
a pre-defined maximum batch byte size

commit c2abeb926c71bf21bbcc9406986c340d2c9d63e0
Author: Carl McGraw <[email protected]>
Date:   2017-07-23T01:17:03Z

    updated UnboundedPubsubSink to accept new parameters.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

Reply via email to