[ 
https://issues.apache.org/jira/browse/SAMZA-41?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Riccomini updated SAMZA-41:
---------------------------------

    Description: 
LocalJobFactory currently creates a single container (either in ProcessJob or 
ThreadJob) and assigns all partitions to it using:

{code}
val partitions = Util.getMaxInputStreamPartitions(config)
{code}

This works when you only wish to run a single container that processes all 
messages. There are situations where one container is not enough, though. If 
you aren't using YARN, we don't provide an easy way to run multiple containers 
that split partitions between them. This support would be useful for running 
containers in EC2, for example, where you might wish to run two EC2 instances 
that host Samza containers sharing the partitions of a single job.

Some potential solutions:

1. Let developers statically assign partitions in the config file.
2. Let developers define a container ID and container count, and let 
LocalJobFactory/ProcessJob/ThreadJob figure out which partitions the container 
should own. For example, a container with id 0 and container count 2 would own 
partitions 0, 2, 4, 6, 8, etc.
3. Write a different JobFactory for this case (e.g. EC2JobFactory).
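
As a rough sketch of option 2, assuming partitions are numbered from 0 and 
are dealt out round-robin by taking the partition number modulo the container 
count: the object PartitionAssignment, the method partitionsFor, and its 
parameters are hypothetical names for illustration, not existing Samza APIs.

```scala
// Hypothetical helper, not part of Samza: computes the partitions that one
// container would own, given its ID and the total number of containers.
object PartitionAssignment {
  def partitionsFor(containerId: Int, containerCount: Int, partitionCount: Int): Seq[Int] = {
    require(containerCount > 0, "container count must be positive")
    require(containerId >= 0 && containerId < containerCount,
      "container ID must be in the range [0, container count)")
    require(partitionCount >= 0, "partition count must not be negative")

    // Partition p goes to the container whose ID equals p modulo the
    // container count, so every partition has exactly one owner.
    (0 until partitionCount).filter(_ % containerCount == containerId)
  }
}
```

With this sketch, PartitionAssignment.partitionsFor(0, 2, 10) yields partitions 
0, 2, 4, 6, 8, and container 1 would get the odd-numbered partitions. Each 
container can compute its own share without talking to the others, as long as 
all of them agree on the container count.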

  was:
LocalJobFactory currently creates a single container (either in ProcessJob or 
ThreadJob) and assigns all partitions to it using:

{code}
val partitions = Util.getMaxInputStreamPartitions(config)
{code}

This works in the case where you only wish to run a single container that 
processes all messages. There are situations where one container is not enough, 
though. If you aren't using YARN, we don't provide an easy way to run multiple 
containers that split partitions between them. This support would be useful for 
running containers in EC2, for example, where you'd wish to run two EC2 
instances (for example) that host Samza containers that share partitions for a 
single job.

Some potential solutions:

1. Let developers statically assign partitions in config file.
2. Let developers define a container ID and container count, and let 
LocalJobFactory/ProcessJob/ThreadJob figure out which partitions the container 
should own. For example, a container with id 0 and container count 2 would own 
partitions 0, 2, 4, 6, 8, etc.


    
> Support static partition assignment in LocalJobFactory
> ------------------------------------------------------
>
>                 Key: SAMZA-41
>                 URL: https://issues.apache.org/jira/browse/SAMZA-41
>             Project: Samza
>          Issue Type: Bug
>          Components: container
>    Affects Versions: 0.6.0
>            Reporter: Chris Riccomini
>
> LocalJobFactory currently creates a single container (either in ProcessJob or 
> ThreadJob) and assigns all partitions to it using:
> {code}
> val partitions = Util.getMaxInputStreamPartitions(config)
> {code}
> This works when you only wish to run a single container that processes all 
> messages. There are situations where one container is not enough, though. If 
> you aren't using YARN, we don't provide an easy way to run multiple 
> containers that split partitions between them. This support would be useful 
> for running containers in EC2, for example, where you might wish to run two 
> EC2 instances that host Samza containers sharing the partitions of a single 
> job.
> Some potential solutions:
> 1. Let developers statically assign partitions in the config file.
> 2. Let developers define a container ID and container count, and let 
> LocalJobFactory/ProcessJob/ThreadJob figure out which partitions the 
> container should own. For example, a container with id 0 and container count 
> 2 would own partitions 0, 2, 4, 6, 8, etc.
> 3. Write a different JobFactory for this case (e.g. EC2JobFactory).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
