Hi Felix,

What you have described is correct. The commits are strictly ordered, i.e. 
batch 1, then batch 2, and so on, even if the tuples of batch 2 complete 
processing before those of batch 1 (with pipelining).
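
To illustrate the ordering, here is a rough toy model in Python. It is not 
Storm code and uses no Storm API; the function name `commit_in_order` and the 
assumption that batch ids run 1, 2, 3, ... are made up for this sketch. It only 
shows how batches that finish executing out of order can still be committed 
strictly in order.

```python
def commit_in_order(completion_order):
    """Return the order in which batches would commit, given the order in
    which they finish executing. Batch ids are assumed to be 1, 2, 3, ..."""
    finished = set()      # batches whose tuples have all been processed
    committed = []        # commit order, which is what we want to observe
    next_to_commit = 1    # only this batch is allowed to commit next

    for batch_id in completion_order:
        finished.add(batch_id)
        # Commit every batch that is finished and next in line; a batch
        # that finished early simply waits here for its predecessors.
        while next_to_commit in finished:
            committed.append(next_to_commit)
            next_to_commit += 1
    return committed


# Batch 2 finishes before batch 1, yet batch 1 still commits first.
print(commit_in_order([2, 1, 3]))  # [1, 2, 3]
```

In this model, a batch that finishes early is held back until all earlier 
batches have committed, which matches the behaviour you summarized.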

- Arun



On 2/9/16, 4:55 AM, "Felix Dreissig" <[email protected]> wrote:

>Thanks for your quick replies, Bobby and Arun!
>
>On 8 Feb 2016, at 18:03, Arun Mahadevan <[email protected]> wrote:
>> The execute phase is pipelined and only the commits are strictly ordered. 
>> 
>> So a trident bolt could receive tuples from batch1, batch2 and again batch1 
>> and so on. The framework internally maintains separate context for each 
>> batch and the execute is invoked with the respective batch’s context. The 
>> bolts could also emit tuples which are forwarded to the next bolt in the DAG 
>> without waiting for the batch to complete.
>
>Just to make sure I get this right: The intermixing of tuples from different 
>batches only happens when pipelining is enabled, doesn’t it?
>
>So, could the properties be summarized as follows?
>Without pipelining: Tuples are assigned to a batch and emitted as soon as 
>possible. When all tuples of a batch have completed processing, a commit is 
>issued and afterwards, tuples of the next batch will begin processing.
>With pipelining: Tuples assigned to multiple different batches (at most 
>`topology.max.spout.pending` batches) may be active at a time. When all tuples 
>of a batch have completed processing, results from that batch are committed. 
>As long as the commit isn’t finished, no second commit will be started.
>
>Regards,
>Felix
