Hi, Omkar, Your observation is correct. Currently bundle is implemented in a per-event basis (Code is DoFnOp.processElement, https://docs.google.com/spreadsheets/d/1pIUQ8J658B7GPNDt5dwiJBRQyWep1n2rXlkONyA6ZMM/edit#gid=1709587251). We are working on supporting bundles in Samza right now so in Beam we can take advantage of it. Bundling is also critical to have better python performance so we are trying to get it out very soon (Feb-March).
On the other hand, in java if you want to process in a batch fashion, you can use the Beam GroupIntoBatches api ( https://beam.apache.org/releases/javadoc/2.0.0/org/apache/beam/sdk/transforms/GroupIntoBatches.html). This will group the elements into batches and then deliver to your ParDo afterwards. Please let us know whether this works for you. Thanks, Xinyu On Fri, Jan 18, 2019 at 10:27 AM Daniel Chen <dch...@linkedin.com> wrote: > + Xinyu > > On 1/18/19, 10:05 AM, "Deshpande, Omkar" <omkar_deshpa...@intuit.com> > wrote: > > Hello, > > I am using Samza runner with Apache Beam. Is there any documentation > available on how bundles are implemented in the Samza runner? > I have observed every Kafka record getting processed in its own > bundle. How can I get larger bundles? > > <beam.version>2.9.0 <samza.version>0.14.1 > > Thanks, > Omkar > > >