[ 
https://issues.apache.org/jira/browse/SAMZA-662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated SAMZA-662:
------------------------------
    Attachment: SAMZA-662-0.9.1-branch.patch

This is a rather large bug and is affecting users on the 0.9.0 release 
(SAMZA-685).  We should, in general, provide a patch for bugs back to the 
affected branch in order to make it easier for those on that branch to get the 
fix quickly and to minimize the work needed to create a bug-fix from that 
release.

Attaching a patch that applies to 0.9.0.  Did not apply cleanly, so is not the 
original patch, but there were no significant changes to make it apply.  
Someone else should still review the patch.

It's unfortunate there's no test for this fix, which would make it easier to be 
certain there's no interbranch funkiness with the new patch.  Opened SAMZA-686 
to address this, but we don't need to wait for a test.  

It'd be good to get started on the 0.9.1 release as soon as possible.  

> Samza auto-creates changelog stream without sufficient partitions when 
> container number > 1
> -------------------------------------------------------------------------------------------
>
>                 Key: SAMZA-662
>                 URL: https://issues.apache.org/jira/browse/SAMZA-662
>             Project: Samza
>          Issue Type: Bug
>          Components: container
>    Affects Versions: 0.9.0
>            Reporter: Yan Fang
>            Assignee: Guozhang Wang
>         Attachments: SAMZA-662-0.9.1-branch.patch, SAMZA-662.v1.patch
>
>
> We currently provide auto-create for changelog streams. However, when there 
> are more than 1 containers, it is possible that Samza creates a changelog 
> stream with insufficient partitions. 
> Reason:
> assume we have an input stream with 3 partitions and then we assign 3 
> containers for this job. According to the 
> [JobCoordinator|https://github.com/apache/samza/blob/master/samza-core/src/main/scala/org/apache/samza/coordinator/JobCoordinator.scala],
>  we will get:
> ||Container(Model) || InputStream Partition || Changelog Partition ||
> |0 | 0 | 0|
> |1 | 1 | 1|
> |2 | 2 | 2|
> If Container 0 is brought up first, it calls 
> {code}
>     val maxChangeLogStreamPartitions = containerModel.getTasks.values
>             .max(Ordering.by { task:TaskModel => 
> task.getChangelogPartition.getPartitionId })
>             .getChangelogPartition.getPartitionId + 1
> {code}
> in 
> [SamzaContainer|https://github.com/apache/samza/blob/master/samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala].
> The maxChangeLogStreamPartition is 1. So we will auto-create a changelog 
> stream with only 1 partitions.
> Similarly, if the Container 2 is brought up first, we will get a stream with 
> 2 partitions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to