In partial key grouping, the same fields (name) sometimes go to first and
second node even though I used "name" as partial  key grouping fields.
Is it right behavior?


Partial key grouping does not always send tuples with the same field
values to the same task. It computes two hash values for the key, which
gives two candidate tasks for those tuples, and load balances between those
two tasks. That is why both tasks in your environment have processed an
almost equal number of tuples.
For example, if you have 10 tasks, tuples whose selected name field is
David might go to task-1 and task-3. All tuples with name David are then
load balanced between task-1 and task-3, which is why you do not see all
tuples with the same field value going to the same task.
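
To make the idea concrete, here is a minimal sketch of "two hashes, pick the
less loaded task". It is only an illustration: the class name, the method
names and the two hash functions (String.hashCode with and without a salt)
are assumptions made for this example, not the code Storm uses.

```java
class PartialKeyGroupingSketch {
    // Number of tuples sent so far to each downstream task.
    private final long[] sentPerTask;

    PartialKeyGroupingSketch(int numTasks) {
        sentPerTask = new long[numTasks];
    }

    // Two candidate tasks for a key. Both hash functions are stand-ins
    // chosen for this sketch (assumption), not the ones Storm uses.
    int[] candidates(String key) {
        int n = sentPerTask.length;
        int first = Math.floorMod(key.hashCode(), n);
        int second = Math.floorMod((key + "#salt").hashCode(), n);
        return new int[] {first, second};
    }

    // Send the tuple to whichever candidate has received fewer tuples so far.
    int chooseTask(String key) {
        int[] c = candidates(key);
        int chosen = sentPerTask[c[0]] <= sentPerTask[c[1]] ? c[0] : c[1];
        sentPerTask[chosen]++;
        return chosen;
    }

    long sentTo(int task) {
        return sentPerTask[task];
    }

    public static void main(String[] args) {
        PartialKeyGroupingSketch grouping = new PartialKeyGroupingSketch(10);
        for (int i = 0; i < 6; i++) {
            System.out.println("David -> task-" + grouping.chooseTask("David"));
        }
    }
}
```

With this sketch, tuples for one name only ever reach its two candidate
tasks, and they are split between them. A fields grouping would correspond
to using only the first hash.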


When I used WordCountExample, if errors happen in "Count" bolt, how does
"fail" from Count bolt forward to spout?

1) Count bolt sends "fail" to previous bolt (split) and then split bolt
sends it to "spout"
2) Count bolt directly sends "fail" to "spout"


Please go through the link below to understand guaranteed message
processing, and send any further questions you have after that.
http://storm.apache.org/releases/current/Guaranteeing-message-processing.html
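
As a rough sketch of the mechanism that page describes: a bolt's ack or fail
is not relayed back through the upstream bolts. It is reported to an acker
task, which tracks the tuple tree of each spout tuple and notifies the spout
that emitted it. The toy model below is my own simplification; the class
AckerSketch, its methods and the small fixed tuple ids are made up for the
example (Storm uses random 64-bit ids), and it is not Storm's code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class AckerSketch {
    // Spout tuple (root) id -> XOR of all tuple ids anchored or acked so far.
    private final Map<Long, Long> pending = new HashMap<>();
    // What the acker would tell the spout, recorded here for inspection.
    final List<String> spoutNotifications = new ArrayList<>();

    // The spout emitted a tuple with a message id; tracking starts.
    void spoutEmitted(long rootId, long tupleId) {
        pending.put(rootId, tupleId);
    }

    // A bolt acked ackedId after emitting newIds anchored to it.
    void ack(long rootId, long ackedId, long... newIds) {
        Long current = pending.get(rootId);
        if (current == null) {
            return; // already completed or failed
        }
        long value = current ^ ackedId;
        for (long id : newIds) {
            value ^= id;
        }
        if (value == 0) { // every tuple in the tree has been acked
            pending.remove(rootId);
            spoutNotifications.add("ack " + rootId);
        } else {
            pending.put(rootId, value);
        }
    }

    // A bolt failed a tuple in the tree: the spout is told right away.
    void fail(long rootId) {
        if (pending.remove(rootId) != null) {
            spoutNotifications.add("fail " + rootId);
        }
    }

    public static void main(String[] args) {
        AckerSketch acker = new AckerSketch();
        acker.spoutEmitted(1L, 100L);       // spout emits a sentence
        acker.ack(1L, 100L, 200L, 201L);    // split bolt emits two words, acks
        acker.fail(1L);                     // count bolt fails one word
        System.out.println(acker.spoutNotifications);
    }
}
```

In this model the split bolt is never involved in the failure path: the
count bolt's fail reaches the acker, and the acker tells the spout.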



Thanks,
Satish.

On Mon, Jun 20, 2016 at 7:16 PM, Junguk Cho <[email protected]> wrote:

> Hi.
>
> I have two questions.
>
> First, it is about "Partial Key grouping" & "Fields Grouping".
>
> In my examples, I used employee class which has "name", "phonenumber",
> "salary" as tuple to send next worker.
> I only used "name" as key for groupings.
>
> Fields Grouping works as what I expected.
> Based on "Fields" values, it sends tuples to a next hop.
> However, Partial Key grouping did not work what I expected.
> Below are outputs from programs.
>
>
> # From fieldsgrouping
> # First node
> Mike 12345 13451
> David 12345 13451
> Andy 12345 13451
> Junguk 12345 13452
> Mike 12345 13452
> David 12345 13452
> Andy 12345 13452
>
> # Second node
> Bob 12345 13451
> Bob 12345 13452
>
>
> #From partial key grouping
> # First node
> Mike 12345 13451
> David 12345 13451
> Mike 12345 13452
> Bob 12345 13452
>
> # Second node
> Junguk 12345 13451
> Bob 12345 13451
> Andy 12345 13451
> Junguk 12345 13452
> David 12345 13452
> Andy 12345 13452
>
> In partial key grouping, the same fields (name) sometimes go to first and
> second node even though I used "name" as partial  key grouping fields.
> Is it right behavior?
> Or when we use partial key grouping, does it need other nodes to aggregate
> information from one first and second nodes?
>
> # Second question
> It is about "guaranteeing message processing"
> If I want to make a topology reliable,
> first in spout, I used the emit(java.util.List<java.lang.Object> tuple,
> java.lang.Object messageId) method from SpoutOutputCollector
> <https://nathanmarz.github.io/storm/doc/backtype/storm/spout/SpoutOutputCollector.html#emit(java.util.List,%20java.lang.Object)>
> and then in bolts, I used collector.emit(Tuple anchor, List<Object>
> tuple).
>
> When I used WordCountExample, if errors happen in "Count" bolt, how does
> "fail" from Count bolt forward to spout?
>
> 1) Count bolt sends "fail" to previous bolt (split) and then split bolt
> sends it to "spout"
> 2) Count bolt directly sends "fail" to "spout"
>
>
> Thanks in advance.
> - Junguk
>
>
