Sorry, the correct link for the first reference is:
https://github.com/apache/beam/blob/master/runners/samza/src/main/java/org/apache/beam/runners/samza/runtime/DoFnOp.java

Thanks,
Xinyu

On Fri, Jan 18, 2019 at 10:35 AM Xinyu Liu <xinyuliu...@gmail.com> wrote:

> Hi, Omkar,
>
> Your observation is correct. Currently, bundling is implemented on a
> per-event basis (the code is in DoFnOp.processElement,
> https://docs.google.com/spreadsheets/d/1pIUQ8J658B7GPNDt5dwiJBRQyWep1n2rXlkONyA6ZMM/edit#gid=1709587251).
> We are working on supporting bundles in Samza right now so that Beam can
> take advantage of them. Bundling is also critical for better Python
> performance, so we are trying to release it very soon (February-March).
>
> On the other hand, in Java, if you want to process elements in batches,
> you can use the Beam GroupIntoBatches API (
> https://beam.apache.org/releases/javadoc/2.0.0/org/apache/beam/sdk/transforms/GroupIntoBatches.html).
> This groups the elements into batches and then delivers them to your
> ParDo. Please let us know whether this works for you.
>
> Thanks,
> Xinyu
>
>
>
> On Fri, Jan 18, 2019 at 10:27 AM Daniel Chen <dch...@linkedin.com> wrote:
>
>> + Xinyu
>>
>> On 1/18/19, 10:05 AM, "Deshpande, Omkar" <omkar_deshpa...@intuit.com>
>> wrote:
>>
>>     Hello,
>>
>>     I am using the Samza runner with Apache Beam. Is there any
>> documentation available on how bundles are implemented in the Samza
>> runner?
>>     I have observed every Kafka record being processed in its own
>> bundle. How can I get larger bundles?
>>
>>     <beam.version>2.9.0</beam.version>
>>     <samza.version>0.14.1</samza.version>
>>
>>     Thanks,
>>     Omkar
>>
>>
>>
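
To make the GroupIntoBatches suggestion in the quoted reply more concrete, here is a minimal plain-Java sketch of the per-key batching behaviour that transform provides: records are buffered per key, a batch is emitted once a buffer reaches the batch size, and leftovers are flushed at the end. This is not Beam or Samza code and not how either project implements it; the class `BatchingSketch`, the helper `Batch`, and the sample keys and values are all hypothetical names chosen for illustration. In an actual pipeline the equivalent step would be applying `GroupIntoBatches.<K, V>ofSize(batchSize)` to a keyed `PCollection<KV<K, V>>`, which yields `KV<K, Iterable<V>>` elements for the downstream ParDo.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of per-key batching. Illustration only; not Beam code.
class BatchingSketch {

    // Hypothetical holder for one emitted batch: a key plus its grouped values.
    static final class Batch {
        final String key;
        final List<String> values;

        Batch(String key, List<String> values) {
            this.key = key;
            this.values = values;
        }
    }

    // Each input record is a {key, value} pair. Values are buffered per key and
    // emitted as soon as a buffer holds batchSize values.
    static List<Batch> groupIntoBatches(List<String[]> keyedRecords, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        Map<String, List<String>> buffers = new LinkedHashMap<>();
        List<Batch> out = new ArrayList<>();
        for (String[] kv : keyedRecords) {
            List<String> buffer = buffers.computeIfAbsent(kv[0], k -> new ArrayList<>());
            buffer.add(kv[1]);
            if (buffer.size() == batchSize) {
                // Copy the buffer so that clearing it does not affect the emitted batch.
                out.add(new Batch(kv[0], new ArrayList<>(buffer)));
                buffer.clear();
            }
        }
        // Flush incomplete batches once the input is exhausted.
        for (Map.Entry<String, List<String>> e : buffers.entrySet()) {
            if (!e.getValue().isEmpty()) {
                out.add(new Batch(e.getKey(), new ArrayList<>(e.getValue())));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical sample input: five records spread over two keys.
        List<String[]> records = List.of(
            new String[] {"p0", "a"},
            new String[] {"p1", "x"},
            new String[] {"p0", "b"},
            new String[] {"p0", "c"},
            new String[] {"p1", "y"});
        for (Batch b : groupIntoBatches(records, 2)) {
            System.out.println(b.key + " -> " + b.values);
        }
    }
}
```

Note that the batching is per key, so the input has to be keyed first; with Kafka input, the record key or partition is one possible choice, though that is an assumption about the use case rather than something stated in the thread.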
