Hi, please find some answers inline. Hope this helps.

On Dec 2, 2013, at 17:47, Sergio Vavassori <svavass...@conwet.com> wrote:

> Good morning,
>
> I have started using Apache S4 for a university project and I wanted to ask
> you some question about its architecture, mainly to be sure to do the
> modifications I need in the right way and to see if there is a cleaner and
> simpler one.
>
> It's my understanding that a cluster is a group of nodes and each node has
> the same application-code copy; this means that if I want to partition the
> ProcessingElements between nodes I need to group them in different clusters.
> So, mapping S4 elements into a "classical" Stream Processing naming (Nodes,
> Operators, Slides...), would be having one application (Operator) per
> cluster and configure the ProcessingElements as singleton (1 Slide per
> Node).

You should use a key to partition your stream. Use the KeyFinder to identify keys in events.

> About inter-cluster streaming:
> Is it possible to have broadcast stream between one cluster and all nodes
> of another cluster? Or should I re-implement RemoteSenders to do that? In
> this last case, is there a way to unbind Module mapping between interface
> and class used to resolve @inject?

Normally, when the key is null, the dispatch mode is broadcast. The exception is inter-cluster communication, where events are sent in round-robin mode by default. To change that, you'd need to define your own senders and a configuration module, and override or replace the existing related modules.

> Is there a way to have that feature as per-stream configuration rather than
> all-stream cluster-wide?

That would be possible, maybe even just by injecting your own implementations.

> Is there any functional difference (or limitation) between "RemoteStreams"
> and "Streams" beyond the naming to recognize inter-cluster vs intra-cluster
> streaming?

Differentiating inter-cluster and intra-cluster communications is the reason. It's useful for publishing and performing bindings. Also, inter-cluster events are usually generic types, whereas you may have application-specific typed events for intra-cluster communications.

> I saw there is an ongoing integration with helix project, which has a
> slightly different concept for partition since it can host more than one on
> the same node, but I couldn't find any example. Is there any work on it?
>
>
> Regards,
> Sergio Vavassori
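
To make the dispatch rules from the answers above more concrete, here is a minimal, self-contained sketch in plain Java. It does not use the S4 API: `DispatchSketch`, `KeyExtractor`, `Scope` and `Dispatcher` are hypothetical names invented for illustration, and hashing the key modulo the partition count is an assumption about how a keyed event could be mapped to a partition, not a statement of what S4 does internally. It only models the three cases described: a keyed event goes to one partition, a keyless event is broadcast within a cluster, and a keyless event is sent round-robin between clusters.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Illustrative model only: none of these types are S4 classes.
final class DispatchSketch {

    /** Hypothetical stand-in for a key finder: returns the partitioning key, or null if there is none. */
    interface KeyExtractor<E> {
        String keyOf(E event);
    }

    /** Whether the stream stays inside one cluster or crosses to another cluster. */
    enum Scope { INTRA_CLUSTER, INTER_CLUSTER }

    static final class Dispatcher<E> {
        private final int partitionCount;
        private final KeyExtractor<E> extractor;
        private final Scope scope;
        private int nextRoundRobin = 0;

        Dispatcher(int partitionCount, KeyExtractor<E> extractor, Scope scope) {
            if (partitionCount <= 0) {
                throw new IllegalArgumentException("partitionCount must be positive");
            }
            this.partitionCount = partitionCount;
            this.extractor = extractor;
            this.scope = scope;
        }

        /** Returns the indexes of the partitions that should receive the event. */
        List<Integer> targets(E event) {
            String key = extractor.keyOf(event);
            List<Integer> result = new ArrayList<>();
            if (key != null) {
                // Keyed event: the same key always lands on the same partition (assumed hash scheme).
                result.add(Math.floorMod(key.hashCode(), partitionCount));
            } else if (scope == Scope.INTRA_CLUSTER) {
                // Keyless event inside a cluster: broadcast to every partition.
                for (int p = 0; p < partitionCount; p++) {
                    result.add(p);
                }
            } else {
                // Keyless event between clusters: one partition at a time, round-robin.
                result.add(nextRoundRobin);
                nextRoundRobin = (nextRoundRobin + 1) % partitionCount;
            }
            return result;
        }
    }

    public static void main(String[] args) {
        KeyExtractor<Map<String, String>> byUser = e -> e.get("user");
        Dispatcher<Map<String, String>> local = new Dispatcher<>(4, byUser, Scope.INTRA_CLUSTER);
        Dispatcher<Map<String, String>> remote = new Dispatcher<>(4, byUser, Scope.INTER_CLUSTER);

        Map<String, String> keyed = Collections.singletonMap("user", "alice");
        Map<String, String> keyless = Collections.emptyMap();

        System.out.println("keyed, intra:   " + local.targets(keyed));
        System.out.println("keyless, intra: " + local.targets(keyless));
        System.out.println("keyless, inter: " + remote.targets(keyless));
        System.out.println("keyless, inter: " + remote.targets(keyless));
    }
}
```

In this model, getting broadcast between clusters would amount to replacing the round-robin branch with the broadcast loop, which is roughly what supplying your own sender implementation would have to achieve; how that is wired into S4 itself depends on its modules and is not shown here.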