Max,
Between two operators the communication depends on stream locality. For
thread local, the tuple is passed via thread stack; for container local
there is a queue in between. For the rest there is a buffer server. This is
effectively a pub-sub mechanism built over tcp-ip. Default is kryo
serialization, but you can add your own. Addressing is via pub-sub; i.e.
sender does not bother about addressing, the receiver connects to the
buffer server showing interest. Routing is kicked off during launch by the
master (Stram) and is a launch time or "re-do physical plan" time decision.
Re-do will happen in outage, or dynamic changes.

Thks
Amol


On Fri, Nov 25, 2016 at 8:33 AM, Max Bridgewater <[email protected]>
wrote:

> Hi Folks,
>
> I was giving an Apex demo the other day and people asked following
> questions:
>
> 1) what is the communication protocol between operators when they are on
> distant nodes. That means, how does Apex transport the tuples from one node
> to the other?
> Is it a custom protocol on top of TCP/IP or is it RPC?
> 2) What is the serialization algorithm used?
> 3) What is the addressing scheme between operators? That means how does
> Apex know where an operator is located and how to route data to it? Is
> there an operator registry? If so, where does it reside?
>
> Thoughts?
>
> Thanks,
> Max.
>
>

Reply via email to