Re: Storm partialkeygrouping & reliability

Navin Ipe Tue, 21 Jun 2016 04:55:25 -0700

But, as far as I know, if tuples with key A go to task 1 and task 2, then
tuples with key A will always continue going to task 1 and task 2. Partial
key grouping is like fields grouping, except that each key's tuples are
balanced across two tasks.
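
To make that concrete, here is a minimal Java sketch of the "two candidate
tasks per key" idea. It is not Storm's PartialKeyGrouping code: the class
name, the second hash function, and the per-task counters are assumptions
made up for illustration, and Storm uses its own hash functions.

```java
// Hypothetical sketch of two-choice key grouping; not Storm's implementation.
class PartialKeySketch {
    // Number of tuples this sender has routed to each task so far.
    private final long[] sent;

    PartialKeySketch(int numTasks) {
        this.sent = new long[numTasks];
    }

    // A key always maps to the same two candidate tasks.
    // Both hash functions are stand-ins chosen only for this example.
    int[] candidates(String key) {
        int n = sent.length;
        int h = key.hashCode();
        int first = Math.floorMod(h, n);
        int second = Math.floorMod(Integer.rotateLeft(h, 16) * 31 + 17, n);
        return new int[] {first, second};
    }

    // Route one tuple: pick whichever candidate has received fewer tuples.
    int chooseTask(String key) {
        int[] c = candidates(key);
        int chosen = sent[c[0]] <= sent[c[1]] ? c[0] : c[1];
        sent[chosen]++;
        return chosen;
    }

    // Copy of the per-task counters, for inspection.
    long[] sentCounts() {
        return sent.clone();
    }
}
```

Calling chooseTask("David") repeatedly on one instance only ever returns the
two tasks reported by candidates("David"). Since the state for one key ends
up split across two tasks, a downstream step would have to merge the two
partial results whenever a single per-key total is needed.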


On Tue, Jun 21, 2016 at 4:24 PM, Satish Duggana <[email protected]>
wrote:

>
> In partial key grouping, tuples with the same field (name) sometimes go to
> the first node and sometimes to the second, even though I used "name" as
> the partial key grouping field. Is this the right behavior?
>
>
> Partial key grouping does not always send tuples with the same field
> values to the same task. It computes two hash values to find two tasks to
> which those tuples can be sent, and load-balances between those two tasks.
> That is why both tasks in your environment have processed an almost equal
> number of tuples.
> For example, if you have 10 tasks, a tuple whose selected name field is
> David can go to task-1 or task-3. So all tuples with the name David are
> load-balanced between task-1 and task-3. That is why not all tuples with
> the same field value go to the same task.
>
>
> When I used WordCountExample, if errors happen in the "Count" bolt, how is
> the "fail" from the Count bolt forwarded to the spout?
>
> 1) Count bolt sends "fail" to previous bolt (split) and then split bolt
> sends it to "spout"
> 2) Count bolt directly sends "fail" to "spout"
>
>
> Please go through the link below to understand guaranteed message
> processing, and send any questions you still have after that.
>
> http://storm.apache.org/releases/current/Guaranteeing-message-processing.html
>
>
>
> Thanks,
> Satish.
>
> On Mon, Jun 20, 2016 at 7:16 PM, Junguk Cho <[email protected]> wrote:
>
>> Hi.
>>
>> I have two questions.
>>
>> First, it is about "Partial Key grouping" & "Fields Grouping".
>>
>> In my example, I used an employee class with the fields "name",
>> "phonenumber", and "salary" as the tuple sent to the next worker.
>> I only used "name" as the key for the groupings.
>>
>> Fields grouping works as I expected.
>> Based on the "Fields" values, it sends tuples to the next hop.
>> However, partial key grouping did not work as I expected.
>> Below are the outputs from the programs.
>>
>>
>> # From fieldsgrouping
>> # First node
>> Mike 12345 13451
>> David 12345 13451
>> Andy 12345 13451
>> Junguk 12345 13452
>> Mike 12345 13452
>> David 12345 13452
>> Andy 12345 13452
>>
>> # Second node
>> Bob 12345 13451
>> Bob 12345 13452
>>
>>
>> # From partial key grouping
>> # First node
>> Mike 12345 13451
>> David 12345 13451
>> Mike 12345 13452
>> Bob 12345 13452
>>
>> # Second node
>> Junguk 12345 13451
>> Bob 12345 13451
>> Andy 12345 13451
>> Junguk 12345 13452
>> David 12345 13452
>> Andy 12345 13452
>>
>> In partial key grouping, tuples with the same field (name) sometimes go to
>> the first node and sometimes to the second, even though I used "name" as
>> the partial key grouping field. Is this the right behavior?
>> Or, when we use partial key grouping, does it need another node to
>> aggregate the information from the first and second nodes?
>>
>> # Second question
>> It is about "guaranteeing message processing"
>> To make a topology reliable, first, in the spout, I used the
>> emit(java.util.List<java.lang.Object> tuple, java.lang.Object messageId)
>> method from SpoutOutputCollector
>> <https://nathanmarz.github.io/storm/doc/backtype/storm/spout/SpoutOutputCollector.html#emit(java.util.List,%20java.lang.Object)>,
>> and then, in the bolts, I used
>> collector.emit(Tuple anchor, List<Object> tuple).
>>
>> When I used WordCountExample, if errors happen in the "Count" bolt, how is
>> the "fail" from the Count bolt forwarded to the spout?
>>
>> 1) Count bolt sends "fail" to previous bolt (split) and then split bolt
>> sends it to "spout"
>> 2) Count bolt directly sends "fail" to "spout"
>>
>>
>> Thanks in advance.
>> - Junguk
>>
>>
>


-- 
Regards,
Navin
