To understand this, if you haven't already read them, I'd advise you to take a
look at
https://github.com/nathanmarz/storm/wiki/Trident-tutorial
then
https://github.com/nathanmarz/storm/wiki/Trident-API-Overview
To summarize this and reuse Nathan's words: suppose you have 100 tuples in one
batch. Then partitioning kicks in, and you may end up with one partition of 30
tuples, another of 50 and a last one of 20, which can be processed in
parallel.
If you use .partitionBy, you define the "key fields" of your partitioning.
For instance, partitionBy(new Fields("id")) ensures that all the tuples from
the batch that have the same "id" value are grouped in the same partition.
Aggregation and groupBy use that mechanism, since the whole set of tuples that
you want to aggregate together has to be in one place.
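
To make the "same key, same partition" idea concrete, here is a minimal sketch
in plain Java. It does not use the Storm API at all: the class
PartitionBySketch and its methods are hypothetical names, and hashing the key
modulo the partition count is an assumption about how such a partitioner can
work, not a copy of Trident's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public final class PartitionBySketch {

    // Hypothetical helper: derives a partition index from the key's hash code.
    // floorMod keeps the index non-negative even for negative hash codes.
    public static int partitionFor(String id, int numPartitions) {
        return Math.floorMod(id.hashCode(), numPartitions);
    }

    // Distributes the "id" values of one batch over numPartitions partitions.
    public static List<List<String>> partitionBy(List<String> batchIds, int numPartitions) {
        List<List<String>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        for (String id : batchIds) {
            partitions.get(partitionFor(id, numPartitions)).add(id);
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<String> batch = Arrays.asList("a", "b", "a", "c", "b", "a");
        // Every occurrence of a given id ends up in the same inner list.
        System.out.println(partitionBy(batch, 3));
    }
}
```

Since the partition index depends only on the key, two tuples with the same
"id" can never be separated, which is what an aggregation per key relies on.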
That being said, you should understand that in the complete method of your
Aggregator, only the tuples in the current partition will have been aggregated.
You can still use .global() or .batchGlobal() to bring all the tuples of the
batch into one partition before aggregating.
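
The difference can be sketched with the 30/50/20 example from above. This is
again plain Java, not Trident code: PartitionCountSketch, split, global and
countPerPartition are hypothetical names, and the loop only mimics the
init/aggregate/complete cycle of an Aggregator that counts tuples.

```java
import java.util.ArrayList;
import java.util.List;

public final class PartitionCountSketch {

    // Splits a batch into consecutive partitions of the given sizes.
    public static List<List<Integer>> split(List<Integer> batch, int... sizes) {
        List<List<Integer>> partitions = new ArrayList<>();
        int start = 0;
        for (int size : sizes) {
            partitions.add(new ArrayList<>(batch.subList(start, start + size)));
            start += size;
        }
        return partitions;
    }

    // Mimics the effect of sending every tuple of the batch to one partition.
    public static List<List<Integer>> global(List<List<Integer>> partitions) {
        List<Integer> all = new ArrayList<>();
        for (List<Integer> partition : partitions) {
            all.addAll(partition);
        }
        List<List<Integer>> single = new ArrayList<>();
        single.add(all);
        return single;
    }

    // Mimics a counting aggregator: one result is emitted per partition.
    public static List<Integer> countPerPartition(List<List<Integer>> partitions) {
        List<Integer> emitted = new ArrayList<>();
        for (List<Integer> partition : partitions) {
            int count = 0;                  // "init": fresh state for this partition
            for (Integer tuple : partition) {
                count++;                    // "aggregate": called once per tuple
            }
            emitted.add(count);             // "complete": sees this partition only
        }
        return emitted;
    }

    public static void main(String[] args) {
        List<Integer> batch = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            batch.add(i);
        }
        List<List<Integer>> partitions = split(batch, 30, 50, 20);
        System.out.println(countPerPartition(partitions));          // prints [30, 50, 20]
        System.out.println(countPerPartition(global(partitions)));  // prints [100]
    }
}
```

In Trident terms, the first call roughly corresponds to running the aggregator
per partition, and the second to calling .global() or .batchGlobal() on the
stream before aggregating.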
Laurent
----- Original Message -----
From: "Michal Singer" <[email protected]>
To: [email protected]
Sent: Tuesday, January 21, 2014 08:37:59
Subject: RE: batch and partition - differences
So a batch can be divided into multiple partitions? And then, for example, an
aggregator will aggregate all the tuples in the batch in the complete method?
thanks
From: [email protected] [mailto: [email protected] ] On Behalf Of
Nathan Marz
Sent: Tuesday, January 21, 2014 7:06 AM
To: [email protected]
Subject: Re: batch and partition - differences
A batch is all the tuples being computed on at once each run of the topology.
Each stage of the processing is split into partitions for parallelization.
On Mon, Jan 20, 2014 at 3:31 AM, Michal Singer < [email protected] > wrote:
Hi, it is not clear what the difference is between a batch and a partition in
Trident.
Is partition the task that the batch is performed on?
Can someone explain the difference?
thanks
--
Twitter: @nathanmarz
http://nathanmarz.com