[
https://issues.apache.org/jira/browse/S4-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585212#comment-13585212
]
kishore gopalakrishna commented on S4-110:
------------------------------------------
What Aimee suggest is that its easy to think in terms of PE's instead of trying
to align partitioning of multiple streams.
Lets take the above case you mentioned where a PE consumes S1 and S2 streams.
Partitioning streams
Lets say S1 and S2 are partitioned p1 and p2 ways. If some one wants to join p1
and p2 then we have the following options
1) p1=p2, then we can align each individual partitions of S1 and S2 to be on
the same node. This avoids extra hop
2) p1 != p2, in this case we cannot need to have another re-route-PE that
listens to S1 and S2 and write the events to another stream S1-S2-merged and
this can be partitioned any number of ways. This results in one extra hop for
each stream involved in the join.
Another problem with this approach irrespective of above two options is, the
parallelism of each PE is determined by the partitioning on the stream. For
example is S1 is partitioned 10 ways, then a countPE and statisticMeanPE need
to be partitioned in the same way. This may not be desirable in some cases.
Partitioning PE's
In this case, we partition the PE irrespective of S1 and S2. Lets say PE is
partitioned p ways. With this option, the sender will have to construct a map
of stream to petype and (petype,partition) to (node). So for a given a stream
and key it will compute partition for each pe (by hashing the key) and compute
the set of nodes that it needs to send the event to.
This will avoid the need for re-route-pe we needed in the case of stream
partitioning. But in worst case, it might result in sending one event for every
PE that is interested in the stream.
So both methods have pro's and cons and one might be favorable over other based
on the scenario.
Its possible to support both mode but prefer not to add more complications. We
should pick one option( either is fine) for now.
> Enhance cluster management features in S4
> -----------------------------------------
>
> Key: S4-110
> URL: https://issues.apache.org/jira/browse/S4-110
> Project: Apache S4
> Issue Type: New Feature
> Reporter: kishore gopalakrishna
> Assignee: kishore gopalakrishna
> Fix For: 0.6
>
>
> In S4 the number of partition is fixed for all streams and is dependent on
> the number of nodes in the cluster. Adding new nodes to S4 cluster causes
> the number of partitions to change. This results in lot of data movement. For
> example if there are 4 nodes and you add another node then nearly all keys
> will be remapped which result is huge data movement where as ideally only 20%
> of the data should move.
> By using Helix, every stream can be partitioned differently and independent
> of the number of nodes. Helix distributes the partitions evenly among the
> nodes. When new nodes are added, partitions can be migrated to new nodes
> without changing the number of partitions and minimizes the data movement.
>
> In S4 handles failures by having stand by nodes that are idle most of the
> time and become active when a node fails. Even though this works, its not
> ideal in terms of efficient hardware usage since the stand by nodes are idle
> most of the time. This also increases the fail over time since the PE state
> has to be transfered to only one node.
> Helix allows S4 to have Active and Standby nodes at a partition level so that
> all nodes can be active but some partitions will be Active and some in stand
> by mode. When a node fails, the partitions that were Active on that node
> will be evenly distributed among the remaining nodes. This provides automatic
> load balancing and also improves fail over time, since PE state can be
> transfered to multiple nodes in parallel.
> I have a prototype implementation here
> https://github.com/kishoreg/incubator-s4
> Instructions to build it and try it out are in the Readme.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira