But, as far as I know, if tuples with key A go to task 1 and task 2, then tuples with key A will always keep going to task 1 and task 2. Partial key grouping is the same as fields grouping, except that each key's tuples are load-balanced across its two candidate tasks instead of all going to one task.
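
To make that concrete, here is a minimal sketch in Python of the idea described in this thread: two hashes per key, then pick the less loaded of the two candidate tasks. It is a simulation only, not Storm's actual implementation. The names PartialKeyRouter, fields_choose and _hash, and the use of MD5 with a seed, are all assumptions made for illustration.

```python
import hashlib


def _hash(key: str, seed: int, num_tasks: int) -> int:
    """Hypothetical seeded hash: maps a key to a task index in [0, num_tasks)."""
    digest = hashlib.md5(f"{seed}:{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_tasks


def fields_choose(key: str, num_tasks: int) -> int:
    """Fields-grouping style: one hash, so a key always maps to one task."""
    return _hash(key, 1, num_tasks)


class PartialKeyRouter:
    """Partial-key-grouping style: two candidate tasks per key,
    each tuple goes to whichever candidate has received fewer tuples."""

    def __init__(self, num_tasks: int) -> None:
        self.num_tasks = num_tasks
        self.load = [0] * num_tasks  # tuples sent to each task so far

    def candidates(self, key: str) -> tuple:
        # The two candidates are fixed for a given key; they can coincide.
        return (_hash(key, 1, self.num_tasks), _hash(key, 2, self.num_tasks))

    def choose(self, key: str) -> int:
        first, second = self.candidates(key)
        task = first if self.load[first] <= self.load[second] else second
        self.load[task] += 1
        return task


if __name__ == "__main__":
    router = PartialKeyRouter(num_tasks=10)
    for name in ["Mike", "David", "Andy", "David", "Mike", "David"]:
        print(name, "fields ->", fields_choose(name, 10),
              "| partial key ->", router.choose(name),
              "of", router.candidates(name))
```

In this sketch the same name never leaves its two candidate tasks, but it can alternate between them, which matches the output Junguk saw. A downstream step would still have to merge the partial results from both tasks.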

On Tue, Jun 21, 2016 at 4:24 PM, Satish Duggana <[email protected]> wrote:

>
> In partial key grouping, the same fields (name) sometimes go to first and
> second node even though I used "name" as partial key grouping fields.
> Is it right behavior?
>
>
> partial-key-grouping does not always send tuples with the same field
> values to the same task. This grouping computes two hash values and finds
> two tasks to which those tuples can be sent. It load-balances between those
> two tasks. That is why both tasks in your environment have processed an
> almost equal number of tuples.
> For example: if you have 10 tasks, a tuple whose selected name field is
> David can go to task-1 and task-3. So all tuples with the name David are
> load-balanced between task-1 and task-3. That is why not all tuples with
> the same field value go to the same task.
>
>
> When I used WordCountExample, if errors happen in "Count" bolt, how does
> "fail" from Count bolt forward to spout?
>
> 1) Count bolt sends "fail" to previous bolt (split) and then split bolt
> sends it to "spout"
> 2) Count bolt directly sends "fail" to "spout"
>
>
> You can go through the link below to understand guaranteed message
> processing. You can send further queries, if you have any, after that.
>
> http://storm.apache.org/releases/current/Guaranteeing-message-processing.html
>
>
>
> Thanks,
> Satish.
>
> On Mon, Jun 20, 2016 at 7:16 PM, Junguk Cho <[email protected]> wrote:
>
>> Hi.
>>
>> I have two questions.
>>
>> First, it is about "Partial Key grouping" & "Fields Grouping".
>>
>> In my examples, I used an employee class which has "name", "phonenumber",
>> "salary" as the tuple to send to the next worker.
>> I only used "name" as the key for groupings.
>>
>> Fields Grouping works as I expected.
>> Based on "Fields" values, it sends tuples to the next hop.
>> However, Partial Key grouping did not work as I expected.
>> Below are outputs from programs.
>>
>>
>> # From fieldsgrouping
>> # First node
>> Mike 12345 13451
>> David 12345 13451
>> Andy 12345 13451
>> Junguk 12345 13452
>> Mike 12345 13452
>> David 12345 13452
>> Andy 12345 13452
>>
>> # Second node
>> Bob 12345 13451
>> Bob 12345 13452
>>
>>
>> # From partial key grouping
>> # First node
>> Mike 12345 13451
>> David 12345 13451
>> Mike 12345 13452
>> Bob 12345 13452
>>
>> # Second node
>> Junguk 12345 13451
>> Bob 12345 13451
>> Andy 12345 13451
>> Junguk 12345 13452
>> David 12345 13452
>> Andy 12345 13452
>>
>> In partial key grouping, the same fields (name) sometimes go to first and
>> second node even though I used "name" as partial key grouping fields.
>> Is it right behavior?
>> Or when we use partial key grouping, does it need other nodes to
>> aggregate information from the first and second nodes?
>>
>> # Second question
>> It is about "guaranteeing message processing".
>> If I want to make a topology reliable,
>> first, in the spout, I used the
>> emit(java.util.List<java.lang.Object> tuple, java.lang.Object messageId)
>> method from SpoutOutputCollector
>> <https://nathanmarz.github.io/storm/doc/backtype/storm/spout/SpoutOutputCollector.html#emit(java.util.List,%20java.lang.Object)>
>> and then, in the bolts, I used collector.emit(Tuple anchor, List<Object>
>> tuple).
>>
>> When I used WordCountExample, if errors happen in "Count" bolt, how does
>> "fail" from Count bolt forward to spout?
>>
>> 1) Count bolt sends "fail" to previous bolt (split) and then split bolt
>> sends it to "spout"
>> 2) Count bolt directly sends "fail" to "spout"
>>
>>
>> Thanks in advance.
>> - Junguk
>>
>

--
Regards,
Navin
