[ 
https://issues.apache.org/jira/browse/SAMZA-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544919#comment-14544919
 ] 

Navina Ramesh commented on SAMZA-676:
-------------------------------------

bq. So broad-stream-partition0 will only be processed once, either in task 1 or 
task 2. Therefore, my suggestion is that, when the consumer returns the result, 
it should also return the taskName information, such as task 1 -> Map, task 2 
-> Map 
If a container has more than one task and you fetch a message from a broadcast 
stream partition, will you invoke each of the tasks in order? I just need a 
clarification. 

bq. Is it a little more clear?
Yes. Thank you for explaining. I think there is a typo in the document as well.

bq.  change to Consumer API and Chooser API will also help for 
multiple-partition subscribe
I don't think you have mentioned how your design affects the Chooser API. Can 
you please explain or point me to the section if I missed it?

bq. Should we encourage the broadcast topic to have so many partitions ?
I think we should keep things simple. We can either enforce only one partition 
for any broadcast stream or have all partitions consumed by all tasks. I 
think the design in this JIRA should easily enable the SAMZA-353 feature. If 
you think about both use cases, I think you can come up with an efficient and 
simple configuration.
 
I have another question. In a system that consumes from a broadcast stream, 
how will we calibrate the throughput of a job (messages processed per second)? 
The same message is handled more than once, by different tasks. 
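
To make the throughput question concrete, here is a minimal sketch. It is a 
toy model, not Samza's API: run_container, the task names, and the routing of 
non-broadcast partitions are all hypothetical, and it assumes that a message 
from a broadcast partition is handed to every task in the container, in order. 
It only shows that "messages fetched" and "process invocations" diverge once a 
broadcast stream is involved.

```python
from collections import Counter


def run_container(task_names, messages, broadcast_partitions):
    """Deliver (partition, payload) messages and count the work done.

    Hypothetical model: a message from a broadcast partition goes to every
    task in order; any other message goes to exactly one task.
    """
    fetched = 0            # messages pulled from the consumer
    processed = Counter()  # process invocations per task
    for partition, _payload in messages:
        fetched += 1
        if partition in broadcast_partitions:
            targets = task_names
        else:
            # Assumed routing for non-broadcast partitions (illustration only).
            targets = [task_names[partition % len(task_names)]]
        for task in targets:
            processed[task] += 1
    return fetched, processed


fetched, processed = run_container(
    ["task 1", "task 2"],
    [(0, "a"), (0, "b"), (1, "c")],
    broadcast_partitions={0},
)
invocations = sum(processed.values())
print(fetched, invocations, dict(processed))
```

In this model the two counters differ by one extra invocation per broadcast 
message for each additional task, so a throughput metric has to state which of 
the two it reports.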



> Implement Broadcast Stream
> --------------------------
>
>                 Key: SAMZA-676
>                 URL: https://issues.apache.org/jira/browse/SAMZA-676
>             Project: Samza
>          Issue Type: Improvement
>          Components: container
>            Reporter: Yan Fang
>            Assignee: Yan Fang
>         Attachments: BroadcastStreamDesign.md, BroadcastStreamDesign.pdf
>
>
> There is a lot of discussion in SAMZA-353 about assigning the same SSP to 
> multiple taskNames. This ticket covers a subset of that discussion and 
> focuses only on the broadcast stream implementation. 
> The goal is to assign one SSP to all the taskNames. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
