Trident batching under the hood?

Dong Mo Thu, 03 Apr 2014 11:54:31 -0700

Dear list,

I am trying to better understand Trident's batching process.


I understand that Trident's spout takes its input in batches.

My question is whether the notion of a batch is still maintained while the
topology executes.

For example, I have this Trident topology:

Spout (batched stream) ---- FunctionA (operates on the batch) ----
PartitionByField (involves network transfer due to repartitioning) ----
FunctionB (on a new batch, or on a stream of tuples?)

FunctionA takes the batched input from the spout and does, for example, some
mapping.

So will PartitionByField execute only once FunctionA has finished processing
the whole batch of input? Or does FunctionA map each tuple and immediately
emit it to the corresponding next stage by field, like a fluid?

That is, does Trident's internal processing proceed discretely, batch by
batch, the way it takes in its input, or does it fall back to a
tuple-by-tuple fluid model? Is it possible to reason about "barriers" in
Trident's internal processing?
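
To make the two alternatives concrete, here is a toy model in plain Java. It
is not Trident code and makes no claim about which behavior Trident actually
has; the class and method names (BatchingModels, barrier, fluid) are made up
for illustration. Each method only records the order in which FunctionA and
FunctionB would see the tuples of one batch under the model I have in mind.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical toy model of the two execution orders asked about above.
// It does not use or imitate the Trident API.
class BatchingModels {

    // "Barrier" model: FunctionA finishes the whole batch before any
    // tuple is handed on to FunctionB.
    static List<String> barrier(List<String> batch) {
        List<String> events = new ArrayList<>();
        List<String> mapped = new ArrayList<>();
        for (String tuple : batch) {
            events.add("A:" + tuple);   // FunctionA sees the tuple
            mapped.add(tuple);          // held back until the batch is done
        }
        for (String tuple : mapped) {
            events.add("B:" + tuple);   // FunctionB starts only afterwards
        }
        return events;
    }

    // "Fluid" model: each tuple is handed to FunctionB as soon as
    // FunctionA has processed it.
    static List<String> fluid(List<String> batch) {
        List<String> events = new ArrayList<>();
        for (String tuple : batch) {
            events.add("A:" + tuple);
            events.add("B:" + tuple);
        }
        return events;
    }

    public static void main(String[] args) {
        List<String> batch = Arrays.asList("x", "y", "z");
        System.out.println(barrier(batch)); // [A:x, A:y, A:z, B:x, B:y, B:z]
        System.out.println(fluid(batch));   // [A:x, B:x, A:y, B:y, A:z, B:z]
    }
}
```

In the first ordering there is a clear point between the two stages that I
would call a barrier; in the second there is none within a batch.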

Thanks
-Mo
