[ 
https://issues.apache.org/jira/browse/SAMZA-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542925#comment-14542925
 ] 

Navina Ramesh commented on SAMZA-676:
-------------------------------------

Hi Yan,
I went through your design document and have a few questions/comments.

1. I think use cases #3 and #4 are very similar. I have seen many instances of 
#4 come up (regarding bootstrap streams), and they would work well with a 
global state implementation. For now, the broadcast stream is a convenience 
feature. I think we have been suggesting workarounds on the mailing lists :) 
Thanks for picking it up!
2. 
|TaskName|Partition 0|Partition 1|Partition 2|
|Stream A|Partition 0|Partition 1|Partition 2|
|Stream B|Partition 0|Partition 1|Partition 2|
|Stream C|Partition 0|Partition 1| |
|*Broadcast Stream*|*Partition 0*|*Partition 0*|*Partition 0*|

bq. a. Do all broadcast streams have only 1 partition?
bq. b. How does this affect the consumer’s MessageChooser priority? Does it 
give the broadcast stream higher priority by default? More generally, my 
question is how all tasks will proceed at the same rate. We could have hot 
partitions, and the tasks consuming them may not react to the broadcast stream 
at the same time as other tasks.
bq. c. Is the broadcast stream also intended for making config changes at the 
task level? Isn’t that the JC's job?
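
To make sure I am reading the table in item 2 correctly, here is a rough sketch of the grouping as I understand it. This is my own illustration, not code from the design doc: the stream names, partition counts, and the helper `group_by_partition` are all hypothetical, and I am assuming tasks are named after partition ids as in the table.

```python
from collections import defaultdict

# Hypothetical inputs mirroring the table: stream name -> partition count.
REGULAR_STREAMS = {"StreamA": 3, "StreamB": 3, "StreamC": 2}
# Broadcast SSPs as (stream, partition); every task receives all of them.
BROADCAST_SSPS = [("BroadcastStream", 0)]


def group_by_partition(regular_streams, broadcast_ssps):
    """Map each task name to the (stream, partition) pairs it consumes."""
    num_tasks = max(regular_streams.values())
    assignment = defaultdict(list)
    for task_id in range(num_tasks):
        task_name = f"Partition {task_id}"
        # Regular streams: task i gets partition i, if the stream has one.
        for stream, partitions in regular_streams.items():
            if task_id < partitions:
                assignment[task_name].append((stream, task_id))
        # Broadcast streams: the same SSP is attached to every task.
        assignment[task_name].extend(broadcast_ssps)
    return dict(assignment)


ASSIGNMENT = group_by_partition(REGULAR_STREAMS, BROADCAST_SSPS)
```

In this sketch every task ends up with `("BroadcastStream", 0)`, which is what prompts questions a and b above.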

3. bq. However, this is the feature we will need for the broadcast stream. 
Because all the tasks will have the broadcast stream. When more than two tasks 
are assigned to the same container, the two broadcast streams have different 
offsets, the consumer needs to consume the same stream more than once, with 
different offsets.
> Can you explain this better?
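
My current reading of the quoted passage, which may well be wrong, is roughly the following. The checkpoint layout, the toy log, and `read_per_task` are assumptions of mine for illustration only; they are not taken from the design doc or from Samza's code.

```python
# Hypothetical checkpoints: (task name, SSP) -> next offset to process.
# Both tasks live in the same container and share the broadcast SSP.
BROADCAST_SSP = ("BroadcastStream", 0)
CHECKPOINTS = {
    ("Partition 0", BROADCAST_SSP): 5,
    ("Partition 1", BROADCAST_SSP): 2,
}

# Toy stand-in for the broadcast partition: list index == offset.
BROADCAST_LOG = [f"msg-{i}" for i in range(7)]


def read_per_task(checkpoints, log, ssp):
    """Return, per task, the messages it reads starting at its own offset."""
    return {
        task: log[offset:]
        for (task, task_ssp), offset in checkpoints.items()
        if task_ssp == ssp
    }


READS = read_per_task(CHECKPOINTS, BROADCAST_LOG, BROADCAST_SSP)
```

If this is the intended behavior, the same partition is read twice in one container, once from offset 5 and once from offset 2.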

4. 
bq. task.global.input=kafka.foo#1,kafka.doo#0
Why is the partition number needed here? Are you suggesting that tasks can 
consume from only one partition of the broadcast stream? 
If I have a broadcast topic with 32 partitions and I want all tasks to consume 
from all of them, then listing every partition in the config will be tedious. 
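
To illustrate the concern, here is a small sketch. I am assuming each entry has the form `system.stream#partition`, which is only my reading of the example value above; the helper names and the topic name `broadcast-topic` are hypothetical.

```python
def parse_global_input(value):
    """Parse 'system.stream#partition' entries into (system, stream, partition)."""
    ssps = []
    for entry in value.split(","):
        # Split off the partition first, then the system name.
        system_stream, partition = entry.strip().rsplit("#", 1)
        system, stream = system_stream.split(".", 1)
        ssps.append((system, stream, int(partition)))
    return ssps


def all_partitions_value(system, stream, num_partitions):
    """Build the config value that lists every partition explicitly."""
    return ",".join(f"{system}.{stream}#{p}" for p in range(num_partitions))


PARSED = parse_global_input("kafka.foo#1,kafka.doo#0")
FULL = all_partitions_value("kafka", "broadcast-topic", 32)
```

With this format, a 32-partition topic needs 32 comma-separated entries in `task.global.input`.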


> Implement Broadcast Stream
> --------------------------
>
>                 Key: SAMZA-676
>                 URL: https://issues.apache.org/jira/browse/SAMZA-676
>             Project: Samza
>          Issue Type: Improvement
>          Components: container
>            Reporter: Yan Fang
>            Assignee: Yan Fang
>         Attachments: BroadcastStreamDesign.md, BroadcastStreamDesign.pdf
>
>
> There is a lot of discussion in SAMZA-353 about assigning the same SSP to 
> multiple taskNames. This ticket is a subset of that discussion and focuses 
> only on the broadcast stream implementation. 
> The goal is to assign one SSP to all the taskNames. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)