[ 
https://issues.apache.org/jira/browse/SAMZA-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106016#comment-14106016
 ] 

Chris Riccomini commented on SAMZA-353:
---------------------------------------

bq. let AM read the global state and then broadcast to all the containers

I think this option might work for small state. You could even have the AM just 
serve the state via some HTTP interface, and let SamzaContainers pull it down 
on startup. Some things to think about with this approach:

# As you said, it seems to prohibit the state from getting very big. I don't 
think we want to be in a situation where we're serving/broadcasting 20 gigs of 
state to each SamzaContainer.
# Thus far, we've avoided having any RPC between the AM and the 
SamzaContainers. That'll probably change as part of SAMZA-348, if we have 
containers call back to AMs for config information, though.
# This approach doesn't seem to support updating the global state after the 
fact. A concrete use-case here is a Hadoop job that periodically runs and 
updates the global state (by pushing data out to a Kafka topic after it 
crunches some data).
# The AM logic definitely gets a bit thicker with this approach. So far, 
keeping the AM light has made operating Samza jobs a bit easier, since we can 
pretty much always count on the AM working properly.

> Support assigning the same SSP to multiple tasknames
> ----------------------------------------------------
>
>                 Key: SAMZA-353
>                 URL: https://issues.apache.org/jira/browse/SAMZA-353
>             Project: Samza
>          Issue Type: Bug
>          Components: container
>    Affects Versions: 0.8.0
>            Reporter: Jakob Homan
>
> Post SAMZA-123, it is possible to add the same SSP to multiple tasknames, 
> although currently we check for this and error out if this is done.  We 
> should think through the implications of having the same SSP appear in 
> multiple tasknames and support this if it makes sense.  
> This could be used as a broadcast stream that's either added by Samza itself 
> to each taskname, or individual groupers could do this as makes sense.  Right 
> now the container maintains a map of SSP to TaskInstance and delivers the ssp 
> to that task instance.  With this change, we'd need to change the map to SSP 
> to Set[TaskInstance] and deliver the message to each TI in the set.
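
The routing change described in the issue (a map of SSP to a set of task instances instead of a single task instance) could be sketched roughly as below. This is only an illustration under stated assumptions, not Samza's container code: `BroadcastRouting`, `Ssp`, `Task`, `register`, and `deliver` are hypothetical stand-ins for the real SystemStreamPartition, TaskInstance, and container run loop.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

public class BroadcastRouting {
    // Hypothetical stand-in for a SystemStreamPartition (system, stream, partition).
    public static final class Ssp {
        final String system;
        final String stream;
        final int partition;

        public Ssp(String system, String stream, int partition) {
            this.system = system;
            this.stream = stream;
            this.partition = partition;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Ssp)) return false;
            Ssp other = (Ssp) o;
            return system.equals(other.system)
                && stream.equals(other.stream)
                && partition == other.partition;
        }

        @Override
        public int hashCode() {
            return Objects.hash(system, stream, partition);
        }
    }

    // Hypothetical stand-in for a TaskInstance; it only records what it receives.
    public static final class Task {
        public final String name;
        public final List<String> received = new ArrayList<>();

        public Task(String name) {
            this.name = name;
        }

        void process(String message) {
            received.add(message);
        }
    }

    // One SSP may now map to several tasks, instead of exactly one.
    private final Map<Ssp, Set<Task>> tasksBySsp = new HashMap<>();

    // Registering the same SSP for a second task adds to the set rather than erroring out.
    public void register(Ssp ssp, Task task) {
        tasksBySsp.computeIfAbsent(ssp, k -> new LinkedHashSet<>()).add(task);
    }

    // Deliver one message to every task registered for the SSP; returns the fan-out count.
    public int deliver(Ssp ssp, String message) {
        Set<Task> tasks = tasksBySsp.getOrDefault(ssp, Collections.<Task>emptySet());
        for (Task task : tasks) {
            task.process(message);
        }
        return tasks.size();
    }

    public static void main(String[] args) {
        BroadcastRouting router = new BroadcastRouting();
        Ssp shared = new Ssp("kafka", "global-state", 0);
        router.register(shared, new Task("task-0"));
        router.register(shared, new Task("task-1"));
        System.out.println("delivered to " + router.deliver(shared, "update") + " tasks");
    }
}
```

An insertion-ordered set is used here only so that delivery order within a container is deterministic; the issue itself does not specify an ordering, and questions such as per-task offsets and checkpointing for a shared SSP are left out of this sketch.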



--
This message was sent by Atlassian JIRA
(v6.2#6252)
