Hi,

Please find some answers inline. I hope this helps.

On Dec 2, 2013, at 17:47 , Sergio Vavassori <svavass...@conwet.com> wrote:

> Good morning,
> 
> I have started using Apache S4 for a university project and I wanted to ask
> you some questions about its architecture, mainly to be sure to do the
> modifications I need in the right way and to see if there is a cleaner and
> simpler one.
> 
> It's my understanding that a cluster is a group of nodes and each node has
> the same application-code copy; this means that if I want to partition the
> ProcessingElements between nodes I need to group them in different clusters.
> So, mapping S4 elements into a "classical" Stream Processing naming (Nodes,
> Operators, Slides...), would be having one application (Operator) per
> cluster and configure the ProcessingElements as singleton (1 Slide per
> Node).

You should use a key to partition your stream, rather than splitting the
ProcessingElements across clusters. Use a KeyFinder to identify the keys in
events.
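
To illustrate the idea of partitioning by key, here is a minimal plain-Java
sketch. Everything in it is an assumption made for illustration: the
ClickEvent class, the local KeyFinder interface, the UserIdKeyFinder and the
hash-modulo partitionFor helper are stand-ins, not the S4 classes, so please
check the S4 sources for the real signatures.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch of key-based partitioning; all types are illustrative stand-ins, not the S4 API. */
public class KeyPartitionSketch {

    /** Hypothetical event carrying a user id and a URL. */
    static final class ClickEvent {
        final String userId;
        final String url;

        ClickEvent(String userId, String url) {
            this.userId = userId;
            this.url = url;
        }
    }

    /** Stand-in for a key finder: extracts the key fields from an event. */
    interface KeyFinder<T> {
        List<String> get(T event);
    }

    /** Keys click events by user id. */
    static final class UserIdKeyFinder implements KeyFinder<ClickEvent> {
        @Override
        public List<String> get(ClickEvent event) {
            return Arrays.asList(event.userId);
        }
    }

    /** Maps a key to one of partitionCount partitions; floorMod keeps the result non-negative. */
    static int partitionFor(List<String> key, int partitionCount) {
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    public static void main(String[] args) {
        KeyFinder<ClickEvent> finder = new UserIdKeyFinder();
        int partitions = 4; // assumed number of partitions in the cluster
        List<ClickEvent> events = Arrays.asList(
                new ClickEvent("alice", "/home"),
                new ClickEvent("bob", "/cart"),
                new ClickEvent("alice", "/checkout"));
        for (ClickEvent e : events) {
            int p = partitionFor(finder.get(e), partitions);
            System.out.println(e.userId + " " + e.url + " -> partition " + p);
        }
    }
}
```

Events with the same key always land on the same partition, so the work is
spread across the nodes of a single cluster.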

> 
> About inter-cluster streaming:
> Is it possible to have broadcast stream between one cluster and all nodes
> of another cluster? Or should I re-implement RemoteSenders to do that? In
> this last case, is there a way to unbind Module mapping between interface
> and class used to resolve @inject?

Normally, when the key is null, the dispatch mode is broadcast. The exception
is inter-cluster communication, where events are sent in round-robin mode by
default. To change that, you would need to define your own senders and a
configuration module, and override or replace the existing related modules.
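
The default behaviour described above can be pictured with the following
plain-Java sketch. It is not the S4 sender code: the DispatchSketch class, the
Scope enum and the targets method are hypothetical names, and routing keyed
events by hash modulo the partition count is my assumption for the
illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the dispatch rules; names are illustrative, not the S4 sender classes. */
public class DispatchSketch {

    enum Scope { INTRA_CLUSTER, INTER_CLUSTER }

    private final int partitionCount;
    private int next = 0; // round-robin cursor for null-key inter-cluster events

    DispatchSketch(int partitionCount) {
        this.partitionCount = partitionCount;
    }

    /** Returns the partitions that should receive an event with the given key. */
    List<Integer> targets(String key, Scope scope) {
        List<Integer> out = new ArrayList<>();
        if (key != null) {
            // Assumption: keyed events go to a single partition chosen by hash.
            out.add(Math.floorMod(key.hashCode(), partitionCount));
        } else if (scope == Scope.INTRA_CLUSTER) {
            // Null key inside a cluster: broadcast to every partition.
            for (int i = 0; i < partitionCount; i++) {
                out.add(i);
            }
        } else {
            // Null key across clusters: one partition at a time, in rotation.
            out.add(next);
            next = (next + 1) % partitionCount;
        }
        return out;
    }

    public static void main(String[] args) {
        DispatchSketch d = new DispatchSketch(3);
        System.out.println("intra, null key: " + d.targets(null, Scope.INTRA_CLUSTER));
        System.out.println("inter, null key: " + d.targets(null, Scope.INTER_CLUSTER));
        System.out.println("inter, null key: " + d.targets(null, Scope.INTER_CLUSTER));
    }
}
```

A custom sender that broadcasts to a remote cluster would, in these terms,
use the broadcast branch for the inter-cluster case as well.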

> Is there a way to have that feature as per-stream configuration rather than
> all-stream cluster-wide?

That would be possible, perhaps even just by injecting your own implementations.
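
As a rough sketch of what a per-stream setting could look like, an injected
implementation might hold a default mode plus overrides by stream name. The
PerStreamModeSketch class, the Mode enum and the stream names below are all
hypothetical; nothing here is an existing S4 configuration option.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of per-stream dispatch modes with a cluster-wide default; names are hypothetical. */
public class PerStreamModeSketch {

    enum Mode { ROUND_ROBIN, BROADCAST }

    private final Mode defaultMode;
    private final Map<String, Mode> overrides = new HashMap<>();

    PerStreamModeSketch(Mode defaultMode) {
        this.defaultMode = defaultMode;
    }

    /** Registers a mode for one stream; returns this so calls can be chained. */
    PerStreamModeSketch override(String streamName, Mode mode) {
        overrides.put(streamName, mode);
        return this;
    }

    /** Looks up the mode for a stream, falling back to the default. */
    Mode modeFor(String streamName) {
        return overrides.getOrDefault(streamName, defaultMode);
    }

    public static void main(String[] args) {
        PerStreamModeSketch modes = new PerStreamModeSketch(Mode.ROUND_ROBIN)
                .override("controlStream", Mode.BROADCAST); // hypothetical stream name
        System.out.println("controlStream: " + modes.modeFor("controlStream"));
        System.out.println("dataStream: " + modes.modeFor("dataStream"));
    }
}
```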

> Is there any functional difference (or limitation) between "RemoteStreams"
> and "Streams" beyond the naming to recognize inter-cluster vs intra-cluster
> streaming?

Distinguishing inter-cluster from intra-cluster communication is the reason.
It is useful for publishing and for performing bindings.
Also, inter-cluster events usually use generic types, whereas you may have
application-specific typed events for intra-cluster communication.

> 
> I saw there is an ongoing integration with helix project, which has a
> slightly different concept for partition since it can host more than one on
> the same node, but I couldn't find any example. Is there any work on it?
> 
> 
> Regards,
> Sergio Vavassori
