[ https://issues.apache.org/jira/browse/SAMZA-41?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Riccomini updated SAMZA-41:
---------------------------------
    Labels: project  (was: )

> Support static partition assignment in LocalJobFactory
> ------------------------------------------------------
>
>                 Key: SAMZA-41
>                 URL: https://issues.apache.org/jira/browse/SAMZA-41
>             Project: Samza
>          Issue Type: Bug
>          Components: container
>    Affects Versions: 0.6.0
>            Reporter: Chris Riccomini
>              Labels: project
>
> LocalJobFactory currently creates a single container (either in ProcessJob or
> ThreadJob) and assigns all partitions to it using:
> {code}
> val partitions = Util.getMaxInputStreamPartitions(config)
> {code}
> This works in the case where you only wish to run a single container that
> processes all messages. There are situations where one container is not
> enough, though. If you aren't using YARN, we don't provide an easy way to run
> multiple containers that split partitions between them. This support would be
> useful for running containers in EC2, for example, where you'd wish to run
> two EC2 instances (for example) that host Samza containers that share
> partitions for a single job.
> Some potential solutions:
> 1. Let developers statically assign partitions in config file.
> 2. Let developers define a container ID and container count, and let
> LocalJobFactory/ProcessJob/ThreadJob figure out which partitions the
> container should own. For example, a container with id 0 and container count
> 2 would own partitions 0, 2, 4, 6, 8, etc.
> 3. Write a different JobFactory for this case (e.g. EC2JobFactory)
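
The partition ownership rule described in option 2 can be sketched as a simple modulo assignment. This is only an illustration of the idea, not Samza code: the object name StaticPartitionAssignment, the method ownedPartitions, and the partitionCount parameter are hypothetical, and the sketch assumes partitions are numbered 0 until partitionCount and container IDs run from 0 until containerCount.

```scala
// Hypothetical helper, not part of Samza: illustrates option 2 from the issue.
object StaticPartitionAssignment {

  /**
   * Returns the partition IDs a container would own if partitions
   * 0 until partitionCount are dealt out round-robin across
   * containerCount containers (assumed numbering, see lead-in).
   */
  def ownedPartitions(containerId: Int, containerCount: Int, partitionCount: Int): Seq[Int] = {
    require(containerCount > 0, "containerCount must be positive")
    require(containerId >= 0 && containerId < containerCount,
      "containerId must be in [0, containerCount)")
    require(partitionCount >= 0, "partitionCount must not be negative")

    // A partition belongs to the container whose ID equals
    // the partition number modulo the container count.
    (0 until partitionCount).filter(_ % containerCount == containerId)
  }

  def main(args: Array[String]): Unit = {
    // Container 0 of 2, with 10 partitions: owns 0, 2, 4, 6, 8.
    println(ownedPartitions(containerId = 0, containerCount = 2, partitionCount = 10).mkString(", "))
    // Container 1 of 2 owns the remaining partitions: 1, 3, 5, 7, 9.
    println(ownedPartitions(containerId = 1, containerCount = 2, partitionCount = 10).mkString(", "))
  }
}
```

With this rule every partition is owned by exactly one container, and each container can compute its own share from its ID and the container count alone, without coordinating with the others. The trade-off is that changing the container count reassigns most partitions.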