[
https://issues.apache.org/jira/browse/S4-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159548#comment-13159548
]
Leo Neumeyer commented on S4-22:
--------------------------------
Subcluster definition: a group of nodes over which we distribute an S4
application, where we also establish dependencies between applications across
clusters. (If there were no dependencies, the subclusters would just be plain
separate clusters.)
The key here is how to establish inter-app communication across clusters. An
EventSource (ES) API is made available in a Twitter preprocessing cluster. When
configured, the app that hosts the ES needs to know how to talk to, for
example, the SmatApp app hosted on a different subcluster. If nodes are
symmetric, this can be done very easily without introducing any changes to the
current API or to the apps. In fact, the app developer shouldn't have to know
what the final configuration will be (single node, one cluster, two
subclusters, etc.).
If nodes were not symmetric across clusters, we would need a different
approach. That's why Bruce and I concluded that a fully symmetric cluster was
the better approach: there is very little downside and it's simpler, since all
nodes are still identical. In the future we could relax this requirement, but
it doesn't seem worth complicating things now.
The main challenge now is to support multiple senders, one for each subcluster.
Events would be handed to each sender, and each sender would decide whether the
event must be transmitted and to which node.
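
To make the multiple-sender idea concrete, here is a rough sketch. It is not
the S4 API: Event, SubclusterSender, offer and dispatch are hypothetical names,
and both the "subscribed streams" check and the hash-based node choice are
assumptions about how a sender might decide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SubclusterSenders {

    /** Hypothetical event: a stream name plus a routing key. */
    static final class Event {
        final String stream;
        final String key;
        Event(String stream, String key) { this.stream = stream; this.key = key; }
    }

    /** Hypothetical sender bound to exactly one subcluster. */
    static final class SubclusterSender {
        final String subcluster;
        final Set<String> subscribedStreams; // assumption: streams consumed in this subcluster
        final int nodeCount;
        final List<String> sent = new ArrayList<>(); // stands in for the comm layer

        SubclusterSender(String subcluster, Set<String> subscribedStreams, int nodeCount) {
            this.subcluster = subcluster;
            this.subscribedStreams = subscribedStreams;
            this.nodeCount = nodeCount;
        }

        /** Returns the target node index, or -1 if this subcluster does not want the event. */
        int offer(Event e) {
            if (!subscribedStreams.contains(e.stream)) {
                return -1; // sender decides not to transmit
            }
            // Assumption: pick the node by hashing the key; floorMod keeps the index non-negative.
            int node = Math.floorMod(e.key.hashCode(), nodeCount);
            sent.add(subcluster + "/" + node + ":" + e.key);
            return node;
        }
    }

    /** Hands the event to every sender and returns how many of them transmitted it. */
    static int dispatch(List<SubclusterSender> senders, Event e) {
        int transmitted = 0;
        for (SubclusterSender s : senders) {
            if (s.offer(e) >= 0) {
                transmitted++;
            }
        }
        return transmitted;
    }

    public static void main(String[] args) {
        List<SubclusterSender> senders = new ArrayList<>();
        senders.add(new SubclusterSender("preprocessing",
                new HashSet<>(Arrays.asList("raw-tweets")), 2));
        senders.add(new SubclusterSender("analytics",
                new HashSet<>(Arrays.asList("clean-tweets")), 4));

        int n = dispatch(senders, new Event("clean-tweets", "user-42"));
        System.out.println("senders that transmitted: " + n);
    }
}
```

The app only calls dispatch and never sees how many subclusters exist, which is
the property we want: the same app code works on a single node, one cluster, or
several subclusters.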
Thoughts?
> Adaptor
> -------
>
> Key: S4-22
> URL: https://issues.apache.org/jira/browse/S4-22
> Project: Apache S4
> Issue Type: Improvement
> Affects Versions: 0.5
> Reporter: Leo Neumeyer
> Assignee: Bruce Robbins
> Fix For: 0.5
>
> Attachments: s4-subclusters.pdf
>
>
> Need an adaptor for v0.5
> Idea I posted earlier:
> What do you think of this idea for a simple adaptor:
> - Adaptor extends App
> - Adaptor can send events but not receive (for now)
> - Adaptor is deployed as a regular App to the S4 cluster and as an
> Adaptor type in a host (separate from the S4 cluster).
> - Adaptor, unlike regular apps, can accept event data (in any format)
> directly, not via comm layer.
> - Input data is transformed into S4 events using a modular approach
> and by providing standard modules such as JSON.
> - Output events are exposed using EventSource and consumed by other
> apps without even knowing that they are Adaptors (only the App type is
> exposed in the cluster).
> - S4 events can be processed locally using PEs and Streams as usual.
> (We kind of need to get a local Sender for the local PEs and a
> standard cluster Sender for the EventSource object.)
> So why this approach?
> The GOOD:
> - Seems to be the least disruptive way to inject external events
> - Apps can easily consume the events in a modular way without any
> dependencies. Getting events from an adaptor or from another app is
> identical.
> - The adaptor would be packaged and deployed to the cluster as if it
> was an App (no incremental cost)
> - The adaptor can do preprocessing using the same programming model
> and can reuse PEs.
> The CHALLENGE:
> - We need to also deploy the Adaptor in a separate host. On the other
> hand, this is inevitable. At least we use the same approach instead of
> creating a different system.
> - The Adaptor will need to be integrated with ZK to get the physical
> addresses.
> - We need to deal with two senders.
> for later: two-way communication and adapter clusters.
> thoughts?