In your case, if messages have the same field value, they will be sent to
only one executor in the whole topology.
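
To illustrate, here is a rough sketch of the idea (not Storm's actual code).
Fields grouping can be thought of as hashing the grouping field values and
taking the result modulo the number of tasks of the target bolt. Which worker
a task lives in plays no part in the choice. The class name, the chooseTask
helper and the "user-42" key are made up for this example; the task and
worker names are the ones from your message.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: this is NOT Storm's actual implementation.
final class FieldsGroupingSketch {

    // Hypothetical helper: pick a target task index from the grouping field
    // values alone. Equal values give equal hash codes, hence the same index.
    static int chooseTask(List<?> groupingValues, int numTasks) {
        return Math.floorMod(groupingValues.hashCode(), numTasks);
    }

    public static void main(String[] args) {
        // Bolt tasks b1..b4 from the example, spread over workers w1 and w2.
        String[] boltTasks = {"b1 (w1)", "b2 (w1)", "b3 (w2)", "b4 (w2)"};

        // m1 is read by s1 in w1 and m2 by s2 in w2, same field value.
        List<String> m1Key = Arrays.asList("user-42");
        List<String> m2Key = Arrays.asList("user-42");

        int t1 = chooseTask(m1Key, boltTasks.length);
        int t2 = chooseTask(m2Key, boltTasks.length);

        System.out.println("m1 -> " + boltTasks[t1]);
        System.out.println("m2 -> " + boltTasks[t2]);
        System.out.println("same task: " + (t1 == t2));
    }
}
```

Both spout tasks compute the same index from the same field value, so m1 and
m2 end up at the same bolt task even though they were read in different
workers.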

Best regards,
Dmytro Dragan
On Jun 8, 2015 08:31, "Seungtack Baek" <[email protected]>
wrote:

> Thanks a lot for such a timely response.
>
> So, even if the bolt tasks reside in different workers (different servers
> in our use case), the messages go to all 32 tasks, right?
>
> Also, this leads me to another question. (I think the answer is yes.)
> Given that fields grouping guarantees that messages with the same "field
> value" go to the same task, does "the same task" mean across all workers,
> or only within the same worker?
>
> For example, let's say we have two Kafka partitions 0 and 1, spout tasks s1
> and s2, and bolt tasks b1, b2, b3 and b4 distributed across two workers w1
> and w2. So it looks like:
> w1
>  - partition_0 -> s1 -> b1 & b2
> w2
>  - partition_1 -> s2 -> b3 & b4
>
> When two messages with the same field value, m1 and m2, are produced to
> Kafka partitions 0 and 1, respectively, do both m1 and m2 go to the same
> bolt task, say b3? Or does each get sent to the same bolt task within its
> own worker (say b1 in w1 and b3 in w2)?
>
> Simply put, does fields grouping group messages across the whole topology,
> or only within a single worker?
>
> Thanks,
> Baek
>
>
>
>
>
> *Seungtack Baek | Precocity, LLC*
>
> Tel/Direct: (972) 378-1030 | Mobile: (214) 477-5715
>
> *[email protected] <[email protected]>* |
> www.precocityllc.com
>
>
> This is the end of this message.
>
> --
>
> On Mon, Jun 8, 2015 at 12:17 AM, Dima Dragan <[email protected]>
> wrote:
>
>> Hi, Seungtack!
>>
>> The distribution of messages depends only on the grouping (not on which
>> worker a task runs in). In the case of shuffle grouping, tuples are
>> randomly distributed across all of the bolt's tasks, in such a way that
>> each bolt task is guaranteed to get an equal number of tuples.
>>
>> Best regards,
>> Dmytro Dragan
>> On Jun 8, 2015 07:12, "Seungtack Baek" <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>> I have read in the documentation that if you have more spout tasks
>>> than Kafka partitions, the excess tasks will remain idle for the entire
>>> lifecycle of the topology.
>>>
>>> Now, let's consider 4 spout tasks, 32 bolt tasks (of one class) in 4
>>> workers (on 4 nodes) and 2 partitions in Kafka. Then 2 spout tasks will
>>> each be assigned a Kafka partition and the other 2 will remain idle.
>>> However, does that mean that only the bolts within the same worker will get
>>> the messages (assuming shuffle grouping)? Or do the messages get emitted
>>> to whatever bolt tasks are available, regardless of which worker?
>>>
>>> Thanks,
>>> Baek
>>>
>
