To understand this, if you haven't already read them, I'd advise you to take a 
look at 
https://github.com/nathanmarz/storm/wiki/Trident-tutorial 
then 
https://github.com/nathanmarz/storm/wiki/Trident-API-Overview 

To summarize this and reuse Nathan's words: you might have, for instance, 100 
tuples in one batch. Then partitioning kicks in and you may end up with one 
partition of 30 tuples, another of 50 and a last one of 20, which can be 
processed in parallel. 

If you use .partitionBy, you define the "key fields" of your partitioning. 
For instance, partitionBy(new Fields("id")) ensures that all the tuples from 
the batch that have the same "id" value are grouped in the same partition. 
Aggregation and groupBy use that mechanism, as it is necessary to have the 
whole set of tuples that you want to aggregate together in one place. 
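
To make the "same key, same partition" idea concrete, here is a minimal 
sketch in plain Java. It does not use Storm at all: the class 
PartitionByIdSketch and the helper partitionFor are hypothetical names, and 
hashing the key modulo the partition count is only an assumption used to 
mimic the guarantee, not a description of how Trident routes tuples 
internally. 

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PartitionByIdSketch {

    // Hypothetical helper: derive a partition index from the key field.
    // Assumption for illustration only; Trident's real routing lives inside Storm.
    static int partitionFor(String id, int numPartitions) {
        // floorMod keeps the index non-negative even for negative hash codes
        return Math.floorMod(id.hashCode(), numPartitions);
    }

    // Split one batch (represented here only by its "id" values) into partitions.
    static List<List<String>> partitionBatch(List<String> batchIds, int numPartitions) {
        List<List<String>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        for (String id : batchIds) {
            partitions.get(partitionFor(id, numPartitions)).add(id);
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<String> batch = Arrays.asList("a", "b", "a", "c", "b", "a");
        List<List<String>> partitions = partitionBatch(batch, 3);
        for (int i = 0; i < partitions.size(); i++) {
            System.out.println("partition " + i + ": " + partitions.get(i));
        }
    }
}
```

Whatever the batch contains, every tuple with a given id ends up in exactly 
one partition, which is the property an aggregation per key relies on. 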

That being said, you should understand that in the complete method of your 
Aggregator, only the tuples in the current partition will have been aggregated. 
You can still use .global() or .batchGlobal() to aggregate all the tuples of 
the batch in one partition. 
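
The difference between aggregating per partition and aggregating the whole 
batch can be sketched as follows. Again this is plain Java, not the Trident 
API: PartitionAggregateSketch and its methods are hypothetical names, and the 
30/50/20 split is simply the example from above. The per-partition count 
stands in for what an Aggregator would have seen in complete when it runs on 
each partition separately; the whole-batch count stands in for the situation 
where all tuples have first been sent to a single partition. 

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionAggregateSketch {

    // Build a partition holding `count` dummy tuples (consecutive integers).
    static List<Integer> makePartition(int start, int count) {
        List<Integer> tuples = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            tuples.add(start + i);
        }
        return tuples;
    }

    // One count per partition: each aggregation only sees its own partition.
    static int[] countPerPartition(List<List<Integer>> partitions) {
        int[] counts = new int[partitions.size()];
        for (int i = 0; i < partitions.size(); i++) {
            counts[i] = partitions.get(i).size();
        }
        return counts;
    }

    // Simulates moving every tuple of the batch into a single partition
    // before counting, so one aggregation sees the whole batch.
    static int countWholeBatch(List<List<Integer>> partitions) {
        List<Integer> single = new ArrayList<>();
        for (List<Integer> partition : partitions) {
            single.addAll(partition);
        }
        return single.size();
    }

    // The example batch of 100 tuples split into 30, 50 and 20.
    static List<List<Integer>> exampleBatch() {
        List<List<Integer>> partitions = new ArrayList<>();
        partitions.add(makePartition(0, 30));
        partitions.add(makePartition(30, 50));
        partitions.add(makePartition(80, 20));
        return partitions;
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions = exampleBatch();
        int[] counts = countPerPartition(partitions);
        for (int i = 0; i < counts.length; i++) {
            System.out.println("partition " + i + " count: " + counts[i]);
        }
        System.out.println("whole batch count: " + countWholeBatch(partitions));
    }
}
```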


Laurent 

----- Original Message -----

From: "Michal Singer" <[email protected]> 
To: [email protected] 
Sent: Tuesday, 21 January 2014 08:37:59 
Subject: RE: batch and partition - differences 



So a batch can be divided into multiple partitions? And then, for example, an 
aggregator will aggregate all the tuples in the batch in the complete method? 
Thanks 

From: [email protected] [mailto: [email protected] ] On Behalf Of 
Nathan Marz 
Sent: Tuesday, January 21, 2014 7:06 AM 
To: [email protected] 
Subject: Re: batch and partition - differences 


A batch is all the tuples being computed on at once each run of the topology. 
Each stage of the processing is split into partitions for parallelization. 



On Mon, Jan 20, 2014 at 3:31 AM, Michal Singer < [email protected] > wrote: 


Hi, it is not clear what the difference is between a batch and a partition in 
Trident. 
Is a partition the task that the batch is performed on? 
Can someone explain the difference? 
Thanks 





-- 


Twitter: @nathanmarz 
http://nathanmarz.com 
